{
  "patent_number": "US 11361763",
  "country": "US",
  "title": "How Smart Speakers Know You're Talking to Them After a Command",
  "original_title": "Detecting system-directed speech",
  "summary": "This patent describes how a smart speaker system can tell if follow-up speech is meant for it, even without a \"wake word,\" by analyzing voice activity and partial speech recognition results using an AI model.",
  "what_it_does": "The patent details a method for a system, such as a smart speaker, to identify whether incoming speech is directed at it, especially after an initial interaction. The system first receives an initial command, processes it, and responds. Crucially, it then instructs the device to send subsequent audio without requiring a wake word (Claim 1). When this second audio arrives, the system first checks for voice activity, then performs automatic speech recognition (ASR) on the audio. While ASR is still running, it builds a \"feature vector\" from the partial ASR results and feeds this into a deep neural network (DNN). The DNN calculates a score indicating how likely the speech is intended for the system. If the score meets a threshold, the system proceeds to understand the full speech and act on it. For example, after you say \"Alexa, set a timer for 10 minutes,\" the system might then listen for a follow-up like \"and add a reminder for my meeting\" without needing you to say \"Alexa\" again.",
  "what_it_does_not_cover": [
    "Does not cover systems that always require a wake word for every interaction.",
    "Does not cover systems that rely solely on voice activity detection to determine system-directed speech.",
    "Does not cover determining system-directed speech without using a deep neural network on a feature vector derived from partial ASR results.",
    "Does not cover systems where the device itself determines the presence of a wake word in the second input audio data (Claim 1 explicitly states \"without the device determining a presence of a wakeword\").",
    "Does not cover systems that only use full ASR results, rather than partial ASR results, to create the feature vector for the DNN."
  ],
  "filed": "2017-09-01",
  "granted": "2022-06-14",
  "expires": "2037-09-01",
  "status": "active",
  "holder": "Amazon Technologies",
  "holder_url": "https://patentbrief.org/company/amazon-technologies",
  "inventors": [
    {
      "name": "Spyridon Matsoukas",
      "url": "https://patentbrief.org/inventor/spyridon-matsoukas"
    },
    {
      "name": "Sri Harish Reddy Mallidi",
      "url": "https://patentbrief.org/inventor/sri-harish-reddy-mallidi"
    },
    {
      "name": "Bjorn Hoffmeister",
      "url": "https://patentbrief.org/inventor/bjorn-hoffmeister"
    },
    {
      "name": "Roland Maximilian Rolf Maas",
      "url": "https://patentbrief.org/inventor/roland-maximilian-rolf-maas"
    }
  ],
  "times_cited": 64,
  "tags": [
    "consumer_electronics",
    "software",
    "ai_ml",
    "telecommunications"
  ],
  "abstract": "A speech-processing system capable of receiving and processing audio data to determine if the audio data includes speech that was intended for the system. Non-system directed speech may be filtered out while system-directed speech may be selected for further processing. A system-directed speech detector may use a trained machine learning model (such as a deep neural network or the like) to process a feature vector representing a variety of characteristics of the incoming audio data, including the results of automatic speech recognition and/or other data. Using the feature vector the model may output an indicator as to whether the speech is system-directed. The system may also incorporate other filters such as voice activity detection prior to speech recognition, or the like.",
  "url": "https://patentbrief.org/patent/us/11361763/detecting-system-directed-speech",
  "markdown_url": "https://patentbrief.org/patent/us/11361763/detecting-system-directed-speech/md",
  "google_patents_url": "https://patents.google.com/patent/US11361763",
  "relatedPatents": [
    {
      "patentNumber": "9548050",
      "countryCode": "US",
      "title": "How a Digital Assistant Launches Apps Using Your Voice",
      "url": "https://patentbrief.org/patent/us/9548050/continuity-handoff"
    },
    {
      "patentNumber": "11204787",
      "countryCode": "US",
      "title": "How Digital Assistants Control Apps and Ask for More Information",
      "url": "https://patentbrief.org/patent/us/11204787/github-copilot-code-generation-ai"
    },
    {
      "patentNumber": "9965247",
      "countryCode": "US",
      "title": "How Sonos Speakers Use Personalized Wake Words to Recognize Different Users",
      "url": "https://patentbrief.org/patent/us/9965247/icloud-drive"
    },
    {
      "patentNumber": "8521818",
      "countryCode": "US",
      "title": "How Software Detects What You Want Based on Your Social Media Posts",
      "url": "https://patentbrief.org/patent/us/8521818/facebook-share-button"
    },
    {
      "patentNumber": "6424946",
      "countryCode": "US",
      "title": "How Computers Automatically Label Different Speakers in Audio Recordings",
      "url": "https://patentbrief.org/patent/us/6424946/amazon-personalized-recommendations"
    }
  ]
}