PatentBrief — patentbrief.org

US 6424946

How Computers Automatically Label Different Speakers in Audio Recordings

A method for identifying and labeling speakers in audio recordings, even if the system has never heard the person speak before, by grouping similar voices and asking a user to name them.

Granted 2002 · Expired 2019 · Owned by International Business Machines Corp · Invented by Alain Charles Louis Tritschler, Mahesh Viswanathan

Original patent title: “Methods and apparatus for unknown speaker labeling using concurrent speech recognition, segmentation, classification and clustering”

Plain-English explanation by Sahi · Last reviewed June 13, 2026

The patent was granted to International Business Machines Corp in 2002 with 33 claims and 75 forward citations. It expired in 2019 and is now in the public domain.

Coverage

What does this patent actually cover?

This system processes audio to identify who is speaking by breaking the audio into segments and comparing them against a database. It uses 'background models' for people not yet in the system, such as a generic 'unenrolled male' or 'unenrolled female' profile. When the system detects a recurring voice that it doesn't recognize, it groups those segments together into a cluster. A user can then provide a name for that cluster, and the system automatically updates its database to recognize that person in future recordings.
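The flow described above (classify each segment, fall back to a background model, cluster the unrecognized voices, then enroll a cluster once a user names it) can be sketched in a few lines. This is a toy illustration, not the patent's actual method: the `Segment` and `SpeakerDatabase` classes, the two-number feature vectors, the Euclidean distance, and the greedy threshold clustering are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class Segment:
    """Hypothetical audio segment summarized by a small feature vector."""
    start: float
    end: float
    features: tuple
    label: str = ""


@dataclass
class SpeakerDatabase:
    """Toy store: one reference vector per enrolled speaker and per background model."""
    enrolled: dict = field(default_factory=dict)
    background: dict = field(default_factory=dict)

    def classify(self, features):
        """Return (name, is_enrolled) for the closest model of either kind."""
        candidates = [(dist(features, m), n, True) for n, m in self.enrolled.items()]
        candidates += [(dist(features, m), n, False) for n, m in self.background.items()]
        _, name, is_enrolled = min(candidates)
        return name, is_enrolled


def mean(vectors):
    """Component-wise average of equal-length vectors."""
    return tuple(sum(axis) / len(vectors) for axis in zip(*vectors))


def label_segments(segments, db, threshold=1.0):
    """Label enrolled voices by name; group the rest into 'unknown-N' clusters."""
    clusters = []
    for seg in segments:
        name, is_enrolled = db.classify(seg.features)
        if is_enrolled:
            seg.label = name
            continue
        # Best match was a background model: join a nearby cluster or start a new one.
        for i, members in enumerate(clusters):
            if dist(seg.features, mean([m.features for m in members])) <= threshold:
                members.append(seg)
                seg.label = f"unknown-{i + 1}"
                break
        else:
            clusters.append([seg])
            seg.label = f"unknown-{len(clusters)}"
    return clusters


def name_cluster(db, members, name):
    """Enroll a user-named cluster so later audio resolves to that name."""
    db.enrolled[name] = mean([m.features for m in members])
    for seg in members:
        seg.label = name


db = SpeakerDatabase(
    enrolled={"Alice": (0.0, 0.0)},
    background={"unenrolled male": (6.0, 6.0), "unenrolled female": (6.0, -6.0)},
)
segments = [
    Segment(0.0, 4.0, (0.2, -0.1)),   # close to Alice
    Segment(4.0, 9.0, (4.8, 5.1)),    # closest to a background model
    Segment(9.0, 12.0, (0.1, 0.3)),   # close to Alice
    Segment(12.0, 15.0, (5.2, 4.9)),  # the same unrecognized voice again
]
clusters = label_segments(segments, db)
name_cluster(db, clusters[0], "Bob")  # the user supplies the name
labels = [s.label for s in segments]  # ["Alice", "Bob", "Alice", "Bob"]
```

After `name_cluster` runs, a new segment near the former unknown voice resolves to "Bob" rather than to a background model, which mirrors the database update described above. A real system would compare statistical voice models rather than two-number vectors.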

The gap

What does this patent NOT cover?

Does not cover real-time voice synthesis or voice modification.
Does not cover hardware-based microphones or physical audio capture devices.
Does not cover methods that require every speaker to be pre-enrolled before identification can begin.
Does not cover the specific linguistic analysis of the words being spoken, only the identification of the speaker.

These exclusions are unique to PatentBrief — derived from the actual claim language, not patent-office boilerplate.

Key facts

Patent number	US 6424946
Status	Expired
Field	Software & Internet
Assignee	International Business Machines Corp
Inventors	Alain Charles Louis Tritschler, Mahesh Viswanathan
Filed	1999
Granted	2002
Expires	2019 (expired)
Claims	33
Times cited	75
Litigation	None on record

Value · $58K–$184K · Modest

What made this novel

The system uses a 'background model' for unknown speakers, allowing it to track and cluster voices it hasn't learned yet, rather than simply failing to identify them or labeling them as noise.

The Patent Drawing

[Figure: representative patent drawing for Methods and apparatus for unknown speaker labeling using concurrent speech recognition, segmentation, classification and clustering (US 6424946)]

Where you've seen this

Real-world examples

Automated meeting transcription services like Otter.ai

Zoom and Microsoft Teams speaker identification features

Legal and medical transcription software

Call center analytics platforms

Why it matters

The bigger picture

This patent laid the groundwork for modern transcription services that distinguish between multiple participants in a meeting. It solved the 'cold start' problem in voice recognition, where a system could not label a speaker until they were manually registered. This is now a standard feature in enterprise meeting software and legal transcription tools.

Filed: November 5, 1999

Granted: July 23, 2002

Market context

Who's building on this

Companies in this space

IBM remains a significant player in enterprise AI and natural language processing. Cloud providers including Amazon (Amazon Transcribe), Google (Cloud Speech-to-Text), and Microsoft (Azure Speech) now offer speaker diarization as part of large-scale speech services.

Market impact

This technology enabled the transition from simple speech-to-text to 'speaker diarization,' which is the ability to create a 'who spoke when' log. It turned raw transcripts into structured data, which is essential for modern business productivity tools and compliance monitoring in regulated industries.
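To make the 'who spoke when' log concrete, here is a minimal sketch that merges consecutive segments from the same speaker into turns. The `(start_sec, end_sec, speaker)` tuple layout, the function names, and the sample data are assumptions for illustration, not a format taken from the patent or from any product.

```python
from itertools import groupby


def who_spoke_when(labeled_segments):
    """Merge back-to-back segments by the same speaker into single turns.

    Expects (start_sec, end_sec, speaker) tuples sorted by start time.
    """
    turns = []
    for speaker, run in groupby(labeled_segments, key=lambda seg: seg[2]):
        run = list(run)
        # A turn spans from the first segment's start to the last segment's end.
        turns.append((run[0][0], run[-1][1], speaker))
    return turns


def format_log(turns):
    """Render each turn as one human-readable log line."""
    return [f"{start:.1f}s-{end:.1f}s {speaker}" for start, end, speaker in turns]


segments = [
    (0.0, 4.0, "Alice"),
    (4.0, 9.0, "Speaker 1"),
    (9.0, 12.0, "Speaker 1"),
    (12.0, 15.0, "Alice"),
]
turns = who_spoke_when(segments)
log = format_log(turns)
# log == ["0.0s-4.0s Alice", "4.0s-12.0s Speaker 1", "12.0s-15.0s Alice"]
```

Each turn can then be paired with the transcript text for the same time span, which is what turns a raw transcript into structured data.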

Patent timeline

Filing · Nov 5, 1999

Application submitted to the patent office

Publication · Jul 23, 2002

Published on the same date as the grant

Grant · Jul 23, 2002

Patent officially issued

Expiration · Nov 5, 2019

Patent enters public domain
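The expiration date in the timeline is consistent with the general rule for US utility patents filed on or after June 8, 1995: the term ends 20 years after the filing date. The sketch below reproduces that arithmetic. The function name is hypothetical, and the calculation deliberately ignores term adjustments, extensions, terminal disclaimers, and lapses for unpaid maintenance fees, any of which can move the real date.

```python
from datetime import date


def expiration_from_filing(filing: date, term_years: int = 20) -> date:
    """Add a fixed term in years to a filing date (simplified, see caveats above)."""
    try:
        return filing.replace(year=filing.year + term_years)
    except ValueError:
        # Feb 29 filing whose anniversary year is not a leap year: fall back to Feb 28.
        return filing.replace(year=filing.year + term_years, day=28)


filing = date(1999, 11, 5)
expiration = expiration_from_filing(filing)
# expiration == date(2019, 11, 5)
```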

This patent is in the public domain

PatentBrief Score

Impact Score · Moderate

Citation count · 38/40 · Highly cited
Claim breadth · 20/20 · Very broad protection
Recency · 0/20 · Older than 20 years
Assignee scale · 0/20 · Independent or smaller assignee

PatentBrief Impact Score — based on citation count, claim breadth, recency, and assignee scale. Not a legal assessment.
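The page does not say how the four sub-scores combine into the overall rating. Assuming they are simply added (an assumption, not a documented formula), the arithmetic looks like this:

```python
# Sub-scores as shown on the page: (points earned, points available).
components = {
    "Citation count": (38, 40),
    "Claim breadth": (20, 20),
    "Recency": (0, 20),
    "Assignee scale": (0, 20),
}

# Assumed combination rule: a plain sum of the four parts.
earned = sum(points for points, _ in components.values())
available = sum(maximum for _, maximum in components.values())
summary = f"{earned}/{available}"
# summary == "58/100"
```

Under that assumption the patent scores 58 out of 100. The cut-offs that map a total to a label such as 'Moderate' are not given, so they are left out of the sketch.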

Heuristic Value Estimate

What this patent might be worth

Modest

$58K – $184K

Midpoint $115K · expired or expiring · industry ×1.6

Heuristic only — blends forward/backward citation counts, claim scope, time remaining, litigation history, and CPC-derived industry baseline. Real valuations need a professional appraisal.

Cite this patent

Tritschler, A. C. L., & Viswanathan, M. (2002). Methods and apparatus for unknown speaker labeling using concurrent speech recognition, segmentation, classification and clustering (U.S. Patent No. 6,424,946). U.S. Patent and Trademark Office. https://patentbrief.org/patent/us/6424946/amazon-personalized-recommendations

Auto-generated from the patent record. Double-check author order and the issue date against the official USPTO document before submitting.

Embed

Add this patent to your site

Drop this plain-English patent card into any blog post or article — free, no signup. It always links back to the full breakdown here.

<div data-patentlens-widget data-patent-number="US6424946"></div>
<script src="https://patentbrief.org/embed.js" async></script>

Semantically similar

You might also find these interesting

US 11645473 · 2023 · International Business Machines Corp · AI/ML

How AI Predicts Who Will Speak Next in a Conversation

US 11062228 · 2021 · Microsoft Technology Licensing LLC · AI/ML

How AI Learns New Tasks Using Old Data Labels

US 11521601 · 2022 · Invoca Inc · AI/ML

How AI Cleans Up Irrelevant Topics in Recorded Phone Calls

US 9965247 · 2018 · Sonos Inc · Consumer electronics

How Sonos Speakers Use Personalized Wake Words to Recognize Different Users

More to explore

Frequently Asked Questions

What does "How Computers Automatically Label Different Speakers in Audio Recordings" cover?

A method for identifying and labeling speakers in audio recordings, even if the system has never heard the person speak before, by grouping similar voices and asking a user to name them.

Who owns patent US 6424946?

International Business Machines Corp owns this patent, granted in 2002.

When does this patent expire?

This patent expired in 2019 and is now in the public domain, so anyone can use the invention freely.

What is patent US 6424946 cited by?

This patent has been cited by 75 later patents that build on its ideas.

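What problem does this patent solve?

It addresses the 'cold start' problem in voice recognition, where a system could not label a speaker until that person had been manually registered.

What does this patent NOT cover?

Does not cover real-time voice synthesis or voice modification.
Does not cover hardware-based microphones or physical audio capture devices.
Does not cover methods that require every speaker to be pre-enrolled before identification can begin.
Does not cover the specific linguistic analysis of the words being spoken, only the identification of the speaker.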

Last reviewed: June 13, 2026 · PatentBrief is not a law firm and this is not legal advice.
