
How AI Predicts Who Will Speak Next in a Conversation

IBM's patent describes a system that uses neural networks to analyze speech patterns and intentions to predict which person will talk next in a conversation.

Granted 2023 · Active · Expires 2040 · Owned by International Business Machines Corp · Invented by Emily Mower Provost, Lazaros Polymenakos, Zakaria Aldeneh + 1 more

Original patent title: “End-of-turn detection in spoken dialogues”

Plain-English explanation by Sahi · Last reviewed June 15, 2026

The patent was granted to International Business Machines Corp in 2023 with 23 claims.

Key facts

Patent number: US 11645473
Status: Active
Field: AI & Machine Learning
Assignee: International Business Machines Corp
Inventors: Emily Mower Provost, Lazaros Polymenakos, Zakaria Aldeneh and 1 other
Filed: 2020
Granted: 2023
Claims: 23
Times cited: 0
Litigation: None on record
Value: $47K to $150K (Minimal)

Coverage

What does this patent actually cover?

This system uses a neural network to monitor a conversation between two people and figure out who is likely to speak next. It does this by simultaneously analyzing two things: the speaker's 'intention' (like whether they are asking a question or making a statement) and the 'turn type' (like whether they are trying to hold the floor or are about to switch speakers). By combining these predictions using a joint loss function, the system can determine if the first person will keep talking or if the second person will jump in. For example, if the system detects a rising pitch at the end of a sentence (an acoustic cue) and identifies the intention as a question, it can predict a turn switch to the other person.
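To make the "two predictions at once" idea concrete, here is a minimal sketch of one shared layer feeding two output heads, one for intention and one for turn type. It is an illustration only, not the patented architecture: the feature names (pitch slope, speaking rate), the label sets, the layer sizes, and the hand-picked weights in `PARAMS` are all assumptions, and a real system would learn its weights from data.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(features, weights, bias):
    """One dense layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]

# Hypothetical label sets, taken from the examples in the text above.
INTENTIONS = ["question", "statement"]
TURN_TYPES = ["hold", "switch"]

# Hand-picked (untrained) weights, chosen only so the toy example behaves
# like the rising-pitch scenario described above.
PARAMS = {
    "shared_w": [[1.0, 0.0], [0.0, 1.0]], "shared_b": [0.0, 0.0],
    "intent_w": [[2.0, 0.0], [-2.0, 0.0]], "intent_b": [0.0, 0.0],
    "turn_w": [[-2.0, 0.0], [2.0, 0.0]], "turn_b": [0.0, 0.0],
}

def predict(features, params):
    """Run the shared layer once, then read both heads from it."""
    hidden = [math.tanh(h)
              for h in linear(features, params["shared_w"], params["shared_b"])]
    intent_probs = softmax(linear(hidden, params["intent_w"], params["intent_b"]))
    turn_probs = softmax(linear(hidden, params["turn_w"], params["turn_b"]))
    intention = INTENTIONS[intent_probs.index(max(intent_probs))]
    turn_type = TURN_TYPES[turn_probs.index(max(turn_probs))]
    next_speaker = "second speaker" if turn_type == "switch" else "first speaker"
    return intention, turn_type, next_speaker

# features = [pitch_slope, speaking_rate]; a positive slope means rising pitch.
rising = predict([0.8, 0.1], PARAMS)
falling = predict([-0.8, 0.1], PARAMS)
```

With these toy weights, a rising pitch yields a question and a switch to the second speaker, while a falling pitch yields a statement and a held turn. The point of the sketch is that both answers come from the same shared representation.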

The gap

What does this patent NOT cover?

  • Does not cover systems that only look at text transcripts without analyzing acoustic cues like pitch or speaking rate.
  • Does not cover simple rule-based systems that rely solely on silence duration to detect turn-taking.
  • Does not cover systems that predict the content of the next sentence; the claims address only who will speak next.
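For contrast, the silence-only approach named in the second exclusion can be sketched in a few lines. This is a generic baseline written for illustration, not something taken from the patent; the frame length, energy floor, and 700 ms threshold are assumed values.

```python
def end_of_turn_by_silence(frame_energies, frame_ms=10,
                           energy_floor=0.01, min_silence_ms=700):
    """Rule-based check: has the speaker been quiet for long enough?

    `frame_energies` is a list of per-frame audio energies, oldest first.
    All thresholds here are illustrative assumptions.
    """
    trailing_quiet = 0
    # Count quiet frames backwards from the most recent one.
    for energy in reversed(frame_energies):
        if energy >= energy_floor:
            break
        trailing_quiet += 1
    return trailing_quiet * frame_ms >= min_silence_ms

long_pause = end_of_turn_by_silence([0.5] * 50 + [0.0] * 80)   # 800 ms of silence
short_pause = end_of_turn_by_silence([0.5] * 50 + [0.0] * 30)  # 300 ms of silence
```

A rule like this looks only at how long the pause is. It ignores pitch, speaking rate, and what the speaker intended, which is why it falls outside what the patent describes.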

These exclusions are unique to PatentBrief — derived from the actual claim language, not patent-office boilerplate.

What made this novel

The innovation lies in training a single neural network to jointly optimize two different tasks—intention recognition and turn-type detection—rather than treating them as separate, sequential steps.
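One common way to train a single network on two tasks at once is to add the two per-task losses into one number and minimize that. The sketch below assumes a weighted sum of two cross-entropy terms; the brief only says "joint loss function", so the exact form, the function names, and the `intent_weight` parameter are assumptions for illustration.

```python
import math

def cross_entropy(probs, target_index):
    """Penalty for one prediction: small when the true class gets high probability."""
    return -math.log(probs[target_index])

def joint_loss(intent_probs, intent_target, turn_probs, turn_target,
               intent_weight=0.5):
    """Weighted sum of the two task losses (assumed form of the joint loss)."""
    intent_term = cross_entropy(intent_probs, intent_target)
    turn_term = cross_entropy(turn_probs, turn_target)
    return intent_weight * intent_term + (1.0 - intent_weight) * turn_term

# An undecided model (50/50 on both tasks) versus a confident, correct one.
undecided = joint_loss([0.5, 0.5], 0, [0.5, 0.5], 1)
confident = joint_loss([0.9, 0.1], 0, [0.1, 0.9], 1)
```

Because both terms feed a single number, an update that lowers the joint loss has to account for both tasks together. In a separate, sequential setup, each model would be tuned against its own loss only.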

[Figure: schematic of the patent's claim structure. Primary claim: "End-of-turn detection in spoke…". Tags: AI/ML, telecommunications, consumer electronics.]

Where you've seen this

Real-world examples

01 Smart home voice assistants
02 Automated customer service phone bots
03 Real-time speech-to-text transcription software

Why it matters

The bigger picture

Managing natural conversation flow is a major hurdle for voice assistants like Siri or Alexa, which often interrupt users or stop listening too early. By accurately predicting turn-taking, this technology aims to make human-computer interactions feel more fluid and less robotic. It is part of a broader effort to move beyond simple keyword recognition into true conversational understanding.

Filed: December 23, 2020
Granted: May 9, 2023

Market context

Who's building on this

Companies in this space

IBM remains a primary player in enterprise AI and natural language processing. Other major tech companies like Google, Amazon, and Microsoft are actively researching similar conversational AI models to improve the responsiveness of their voice-enabled products.

Market impact

This technology contributes to the ongoing refinement of conversational AI, aiming to reduce the latency and awkwardness inherent in current voice interfaces. It supports the industry trend toward more human-like, context-aware digital assistants in both consumer and enterprise sectors.

Patent timeline

Filing

Application submitted to the patent office

Publication

Application published, typically 18 months after filing

Grant

Patent officially issued

PatentBrief Score

Impact Score

Early stage

Citation count: 0/40 (No citations yet)
Claim breadth: 15/20 (Broad claims)
Recency: 20/20 (Granted within 5 years)
Assignee scale: 0/20 (Independent or smaller assignee)

PatentBrief Impact Score — based on citation count, claim breadth, recency, and assignee scale. Not a legal assessment.

Heuristic Value Estimate

What this patent might be worth

Minimal: $47K to $150K

Midpoint $94K · 14.5 yr remaining · industry ×1.6

Heuristic only — blends forward/backward citation counts, claim scope, time remaining, litigation history, and CPC-derived industry baseline. Real valuations need a professional appraisal.
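The "14.5 yr remaining" figure is consistent with a standard 20-year US term counted from the December 23, 2020 filing date and measured from the June 15, 2026 review date. The short calculation below shows the arithmetic under that assumption. It ignores any patent term adjustment or terminal disclaimer, and the function names are made up for this example.

```python
from datetime import date

def expected_expiry(filing_date, term_years=20):
    """Filing date plus the term. Assumes no term adjustment or disclaimer.

    Note: date.replace would raise for a Feb 29 filing date landing in a
    non-leap year; that case is not handled in this sketch.
    """
    return filing_date.replace(year=filing_date.year + term_years)

def years_remaining(as_of, expiry):
    """Approximate years between two dates, using 365.25 days per year."""
    return (expiry - as_of).days / 365.25

filed = date(2020, 12, 23)     # filing date from this page
reviewed = date(2026, 6, 15)   # "last reviewed" date from this page
expiry = expected_expiry(filed)
remaining = years_remaining(reviewed, expiry)
print(expiry.isoformat(), round(remaining, 1))
```

Under these assumptions the expected expiry falls in December 2040, which matches the "Expires 2040" label at the top of the page.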

The original legal language

Original claims

23 claims as filed with the patent office.

Concepts involved

Claim, Prior art, Non-obviousness, Novelty, Specification, Assignee, Patent term

Citations

Patent lineage

Cites 25 earlier patents as foundations.

Cite this patent

Provost, E. M., Polymenakos, L., Aldeneh, Z., & Dimitriadis, D. B. (2023). How AI Predicts Who Will Speak Next in a Conversation (U.S. Patent No. 11,645,473). U.S. Patent and Trademark Office. https://patentbrief.org/patent/us/11645473/palm-pathways-language-model

Auto-generated from the patent record. Double-check author order and the issue date against the official USPTO document before submitting.


Common Questions

Frequently Asked Questions

Who owns patent US 11645473?

International Business Machines Corp owns this patent, granted in 2023.

When does this patent expire?

This patent is expected to expire in 2040, 20 years after its December 23, 2020 filing date (before any term adjustment), when the invention enters the public domain.

Last reviewed: June 15, 2026 · PatentBrief is not a law firm and this is not legal advice.