How Voice Assistants Change Their Speech Based on How You Talk
This patent describes a system where a voice-controlled device adjusts its text-to-speech output characteristics, like speed or tone, based on the user's speaking habits or current situation, making responses feel more natural and personalized.
Patent Number
US 10276149
Status
Active
Filing Date
December 21, 2016
Grant Date
April 30, 2019
Expiration
December 21, 2036
Claims
23
Assignee
Amazon Technologies
Inventors
Aaron Takayanagi Barnet, Nancy Yi Liang
Citations
26 forward · 20 backward
What it covers
This patent details a method by which a computer system dynamically adjusts how it speaks to a user. The system receives a spoken command, the "first utterance," and determines how many words it contains (claim 1). It compares that count against the user's typical speaking pattern stored in "user profile data," specifically an "average number of words of a user command." If the current command's word count "deviates from the average," the system uses the deviation to select "template text data," which acts as a speech-style template. It then applies that template to the response text and generates audio with a "non-default characteristic of speech" (claim 1). For example, if a user who usually gives long commands suddenly gives a very short one, the system might interpret that as urgency and respond at a faster rate. The patent also covers adjusting speech rate when the user's calendar shows they will be busy soon (claim 2) or based on their geographic location (claim 6).
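The claim 1 flow described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `UserProfile` class, the `TEMPLATES` table, the 50% deviation threshold, and the use of SSML `<prosody>` markup as the "template text data" are all assumptions made for illustration. Only the short-command case from the example above is modeled.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class UserProfile:
    # Hypothetical stand-in for the patent's "user profile data".
    average_command_words: float


# Hypothetical templates. SSML prosody markup is an assumption used here
# to show what a speech-style template could look like.
TEMPLATES = {
    "default": "<speak>{text}</speak>",
    "fast": '<speak><prosody rate="fast">{text}</prosody></speak>',
}


def count_words(utterance_text: str) -> int:
    """Count whitespace-separated words in the transcribed utterance."""
    return len(utterance_text.split())


def select_template(word_count: int, profile: UserProfile,
                    threshold: float = 0.5) -> str:
    """Pick a template key from the relative deviation below the average.

    The threshold value is an assumption; the summary only says the
    word count "deviates from the average".
    """
    avg = profile.average_command_words
    if avg <= 0:
        return "default"
    deviation = (word_count - avg) / avg
    # A much shorter command than usual is treated as a sign of urgency.
    if deviation <= -threshold:
        return "fast"
    return "default"


def render_response(utterance_text: str, response_text: str,
                    profile: UserProfile) -> str:
    """Apply the selected template to the (XML-escaped) response text."""
    key = select_template(count_words(utterance_text), profile)
    return TEMPLATES[key].format(text=escape(response_text))
```

With a profile averaging 10 words per command, `render_response("stop music", "Okay", profile)` selects the "fast" template, while a 9-word command falls inside the threshold and gets the default one.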
What it doesn't cover
- Does not cover a system that always responds with a default speech characteristic, regardless of user input or context.
- Does not cover adjusting speech characteristics solely based on the content of the response, without considering user input characteristics or contextual data.
- Does not cover a system that changes its speech without first comparing the current user input to an "average" characteristic from the user's profile.
- Does not cover adjusting speech based on biometric data like heart rate or facial expression, unless those directly influence a speech characteristic like "number of words".
- Does not cover a system that adjusts its output purely based on the server's processing load or network conditions.
The clever bit
The clever part is not just responding to a command, but dynamically altering the *way* the response is delivered based on the user's current speaking pattern (specifically, how it deviates from their average) or relevant contextual information like their calendar or location. This allows for a more adaptive and human-like interaction.
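The contextual path mentioned above, where a calendar indicates the user will be busy soon, might look something like the following sketch. The function name, the 15-minute window, and the choice of a faster rate are assumptions; the summary only says that the speech rate is adjusted.

```python
from datetime import datetime, timedelta
from typing import Iterable


def speech_rate_for_calendar(now: datetime,
                             event_starts: Iterable[datetime],
                             window: timedelta = timedelta(minutes=15)) -> str:
    """Return "fast" if any event starts within `window` after `now`.

    Hypothetical helper: the window length and the "fast" label are
    illustrative assumptions, not values taken from the patent.
    """
    for start in event_starts:
        # Ignore events that have already started; only upcoming ones count.
        if timedelta(0) <= start - now <= window:
            return "fast"
    return "default"
```

The returned label could then feed the same kind of template selection used for word-count deviation, so that both signals end up choosing a non-default speech characteristic.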
Why it matters
This patent is significant because it allows voice assistants to provide more personalized and context-aware responses, moving beyond generic, one-size-fits-all interactions. By adapting speech characteristics like speed or tone, it can make interactions feel more natural and efficient for the user. This kind of personalization is crucial for improving user experience in devices like Amazon's Alexa, Google Assistant, and Apple's Siri.
Real-world examples
1. Amazon Alexa
2. Google Assistant
3. Apple Siri
4. Microsoft Cortana
5. Most modern smartphone voice assistants
Generated by PatentBrief · Not legal advice · patentbrief.org