How Computers Automatically Label Different Speakers in Audio Recordings
A method for identifying and labeling speakers in audio recordings, even if the system has never heard the person speak before, by grouping similar voices and asking a user to name them.
Original patent title: “Methods and apparatus for unknown speaker labeling using concurrent speech recognition, segmentation, classification and clustering”
Granted to International Business Machines Corp in 2002 with 33 claims and 75 forward citations. The patent has since expired and is now in the public domain.
Key facts
Coverage
What does this patent actually cover?
This system processes audio to identify who is speaking by breaking the audio into segments and comparing them against a database. It uses 'background models' for people not yet in the system, such as a generic 'unenrolled male' or 'unenrolled female' profile. When the system detects a recurring voice that it doesn't recognize, it groups those segments together into a cluster. A user can then provide a name for that cluster, and the system automatically updates its database to recognize that person in future recordings.
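The flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's actual algorithm: it assumes each audio segment has already been reduced to a small feature vector, and the Euclidean distance, the `threshold` value, the greedy clustering, and the function names `label_segments` and `name_cluster` are all hypothetical choices made for the example.

```python
import math


def mean(vectors):
    """Component-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def label_segments(segments, enrolled, threshold=1.0):
    """Label each segment with an enrolled name or an unknown-cluster id.

    segments: list of feature vectors (assumed to come from an upstream
              segmentation step that is not shown here).
    enrolled: dict mapping speaker name -> reference vector.
    Returns (labels, clusters), where clusters holds the unknown segments.
    """
    labels = []
    clusters = []  # each cluster is a list of segment vectors
    for seg in segments:
        # Step 1: compare the segment against the enrolled speakers.
        best = min(enrolled, key=lambda name: math.dist(seg, enrolled[name]),
                   default=None)
        if best is not None and math.dist(seg, enrolled[best]) <= threshold:
            labels.append(best)
            continue
        # Step 2: unrecognized voice, so join a nearby cluster or start one.
        for idx, members in enumerate(clusters):
            if math.dist(seg, mean(members)) <= threshold:
                members.append(seg)
                labels.append(f"unknown-{idx}")
                break
        else:
            clusters.append([seg])
            labels.append(f"unknown-{len(clusters) - 1}")
    return labels, clusters


def name_cluster(enrolled, clusters, idx, name):
    """Enroll a user-named cluster so later recordings can match it."""
    enrolled[name] = mean(clusters[idx])


if __name__ == "__main__":
    db = {"alice": (0.0, 0.0)}
    segs = [(0.1, 0.0), (5.0, 5.0), (5.2, 4.9), (0.0, 0.2)]
    labels, clusters = label_segments(segs, db)
    print(labels)
    name_cluster(db, clusters, 0, "bob")  # the user supplies the name
    print(label_segments([(5.1, 5.0)], db)[0])
```

Once a cluster is named, its averaged vector is stored alongside the enrolled speakers, so the same voice is matched directly the next time it appears.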
The gap
What does this patent NOT cover?
- Does not cover real-time voice synthesis or voice modification.
- Does not cover hardware-based microphones or physical audio capture devices.
- Does not cover methods that require every speaker to be pre-enrolled before identification can begin.
- Does not cover the specific linguistic analysis of the words being spoken, only the identification of the speaker.
What made this novel
The system uses a 'background model' for unknown speakers, allowing it to track and cluster voices it hasn't learned yet, rather than simply failing to identify them or labeling them as noise.
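One way to picture the background-model idea is as an extra set of candidates in the scoring step. The sketch below is an assumption-heavy toy: it models each speaker as a one-dimensional Gaussian over a single made-up feature (something like average pitch in Hz), which is not taken from the patent. Only the 'unenrolled male' and 'unenrolled female' profile names come from the description; `log_likelihood`, `classify`, and all numeric values are hypothetical.

```python
import math


def log_likelihood(x, mean, std):
    """Log density of a 1-D Gaussian with the given mean and std at x."""
    return (-0.5 * math.log(2 * math.pi * std ** 2)
            - (x - mean) ** 2 / (2 * std ** 2))


def classify(x, enrolled, background):
    """Score x against every model and return (label, is_enrolled).

    enrolled and background both map a label to (mean, std).
    If a background model scores highest, the segment is treated as an
    unknown speaker and can be set aside for clustering.
    """
    candidates = [(name, params, True) for name, params in enrolled.items()]
    candidates += [(name, params, False) for name, params in background.items()]
    name, _, is_enrolled = max(
        candidates, key=lambda c: log_likelihood(x, *c[1])
    )
    return name, is_enrolled


if __name__ == "__main__":
    enrolled = {"alice": (210.0, 8.0)}  # tight model for a known speaker
    background = {
        "unenrolled male": (120.0, 25.0),    # broad generic profiles
        "unenrolled female": (200.0, 30.0),
    }
    for feature in (208.0, 180.0, 115.0):
        print(feature, classify(feature, enrolled, background))
```

Because the background profiles are broad, they win whenever no enrolled model fits well, which gives the system a defined "someone new" outcome instead of a forced match to the wrong person.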
Where you've seen this
Real-world examples
- Automated meeting transcription services like Otter.ai
- Zoom and Microsoft Teams speaker identification features
- Legal and medical transcription software
- Call center analytics platforms
Why it matters
The bigger picture
This patent laid the groundwork for modern transcription services that distinguish between multiple participants in a meeting. It solved the 'cold start' problem in voice recognition, where a system could not label a speaker until they were manually registered. This is now a standard feature in enterprise meeting software and legal transcription tools.
Filed: November 5, 1999
Granted: July 23, 2002
Market context
Who's building on this
Companies in this space
IBM remains a significant player in enterprise AI and natural language processing. Modern cloud providers like Amazon (Amazon Transcribe), Google (Cloud Speech-to-Text), and Microsoft (Azure Speech) have expanded these foundational clustering techniques into massive, scalable services.
Market impact
This technology enabled the transition from simple speech-to-text to 'speaker diarization,' which is the ability to create a 'who spoke when' log. It turned raw transcripts into structured data, which is essential for modern business productivity tools and compliance monitoring in regulated industries.
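A 'who spoke when' log is simple to picture as data. The sketch below assumes speaker labels have already been assigned to timed segments by some upstream step, and merges back-to-back segments from the same speaker into turns. The `build_log` function and the `(start, end, speaker)` tuple layout are hypothetical and only illustrate the output format, not how any particular product builds it.

```python
def build_log(segments):
    """Merge consecutive same-speaker segments into speaker turns.

    segments: list of (start_seconds, end_seconds, speaker) tuples,
              assumed to be sorted by start time.
    Returns a list of (start, end, speaker) turns.
    """
    turns = []
    for start, end, speaker in segments:
        if turns and turns[-1][2] == speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1] = (turns[-1][0], end, speaker)
        else:
            turns.append((start, end, speaker))
    return turns


if __name__ == "__main__":
    labeled = [
        (0.0, 2.5, "alice"),
        (2.5, 4.0, "alice"),
        (4.0, 7.0, "unknown-0"),
        (7.0, 8.0, "alice"),
    ]
    for start, end, speaker in build_log(labeled):
        print(f"{start:6.1f}s - {end:6.1f}s  {speaker}")
```

Attaching transcript text to each turn is what turns a flat transcript into the structured record described above.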
PatentBrief Score
Impact Score: Moderate
- Citation count: 38/40 (highly cited)
- Claim breadth: 20/20 (very broad protection)
- Recency: 0/20 (older than 20 years)
- Assignee scale: 0/20 (independent or smaller assignee)
PatentBrief Impact Score — based on citation count, claim breadth, recency, and assignee scale. Not a legal assessment.
Heuristic Value Estimate
What this patent might be worth
$58K – $184K
Midpoint $115K · expired or expiring · industry ×1.6
Heuristic only — blends forward/backward citation counts, claim scope, time remaining, litigation history, and CPC-derived industry baseline. Real valuations need a professional appraisal.
The original legal language
Original claims
33 claims as filed with the patent office.
Cite this patent
Tritschler, A. C. L., & Viswanathan, M. (2002). How Computers Automatically Label Different Speakers in Audio Recordings (U.S. Patent No. 6,424,946). U.S. Patent and Trademark Office. https://patentbrief.org/patent/us/6424946/amazon-personalized-recommendations
Auto-generated from the patent record. Double-check author order and the issue date against the official USPTO document before submitting.
Frequently Asked Questions
Who owns patent US 6424946?
International Business Machines Corp owns this patent, granted in 2002.
When does this patent expire?
This patent has expired and is now in the public domain — anyone can use the invention freely.
What is patent US 6424946 cited by?
This patent has been cited by 75 later patents that build on its ideas.