PatentBrief — patentbrief.org

US 10289962

How to Shrink Large AI Models Using Knowledge Distillation

A method for teaching small, efficient AI models to mimic the complex decision-making patterns of much larger, more powerful neural networks.

Granted 2019 · Active · Expires 2035 · Owned by Google LLC · Invented by Oriol Vinyals, Geoffrey E. Hinton, Jeffrey A. Dean

Original patent title: “Training distilled machine learning models”

Plain-English explanation by Sahi · Last reviewed June 15, 2026

Granted to Google LLC in 2019 with 23 claims and 4 forward citations.

Coverage

What does this patent actually cover?

This patent describes a process called knowledge distillation. First, a large 'cumbersome' model is trained on a dataset to learn complex patterns. Then a smaller 'distilled' model is trained not just to predict the correct answer, but to match the probability distribution (the 'soft outputs') that the large model assigns across all possible answers. The soft outputs are computed with a 'temperature constant' higher than 1, which flattens the distribution so that the relative probabilities of the incorrect answers become visible. Those relationships carry more information than a simple right-or-wrong label. As a result, the smaller model can reach accuracy close to the large model's while being much faster and lighter to run on mobile devices.
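The training objective described above can be sketched in a few lines of Python. This is an illustrative sketch, not the patent's claimed implementation: the function names, the temperature of 4.0, and the 0.9 weighting between the soft and hard terms are all assumptions chosen for the example, and real training code would compute this loss inside a deep-learning framework over batches of data.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a temperature above 1 flattens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy between a target distribution and a prediction, in nats."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0)

def distillation_loss(teacher_logits, student_logits, true_class,
                      temperature=4.0, soft_weight=0.9):
    """Hypothetical combined loss; temperature and soft_weight are assumed values."""
    # Soft term: the student imitates the teacher's softened distribution.
    soft_targets = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_term = cross_entropy(soft_targets, student_soft)
    # Hard term: ordinary cross-entropy against the correct label at temperature 1.
    student_hard = softmax(student_logits, 1.0)
    hard_term = -math.log(student_hard[true_class])
    return soft_weight * soft_term + (1.0 - soft_weight) * hard_term

# Made-up logits for a three-class problem where class 0 is correct.
teacher = [5.0, 2.0, 0.5]
print(distillation_loss(teacher, [4.0, 2.5, 0.0], true_class=0))  # student close to teacher
print(distillation_loss(teacher, [0.5, 2.0, 5.0], true_class=0))  # student far from teacher
```

A student whose scores resemble the teacher's gets a lower loss than one that ranks the classes differently, which is the signal that pulls the small model toward the large model's behavior.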

The gap

What does this patent NOT cover?

Does not cover training models from scratch without a pre-existing cumbersome model.
Does not cover hardware-specific optimization techniques like model quantization or pruning.
Does not cover methods where the distilled model is trained using only hard labels (e.g., just the correct class) instead of soft outputs.
Does not cover architectures where the distilled model has more parameters than the cumbersome model.

These exclusions are unique to PatentBrief — derived from the actual claim language, not patent-office boilerplate.

Key facts

Patent number	US 10289962
Status	Active
Field	AI & Machine Learning
Assignee	Google LLC
Inventors	Oriol Vinyals, Geoffrey E. Hinton, Jeffrey A. Dean
Filed	2015
Granted	2019
Claims	23
Times cited	4
Litigation	None on record

Value · $52K–$166K · Modest

What made this novel

The innovation is using a 'temperature' parameter to soften the output distribution, which reveals the 'dark knowledge'—the subtle hints about how the big model views the similarities between different categories.
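To make the effect of temperature concrete, here is a small sketch in Python. The class labels and logit values are invented for illustration, and the temperature of 5.0 is an arbitrary choice; the point is only to show how dividing the scores by a temperature above 1 makes the large model's view of the wrong answers easier to see.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores from a large model looking at a photo of a cat.
labels = ["cat", "dog", "truck"]
teacher_logits = [9.0, 5.0, -2.0]

sharp = softmax_with_temperature(teacher_logits, 1.0)  # ordinary softmax
soft = softmax_with_temperature(teacher_logits, 5.0)   # softened outputs

for name, p_sharp, p_soft in zip(labels, sharp, soft):
    print(f"{name:>5}: T=1 -> {p_sharp:.4f}   T=5 -> {p_soft:.4f}")
```

At temperature 1 nearly all of the probability sits on "cat", so the fact that the model considers "dog" far more plausible than "truck" is almost invisible. At the higher temperature the top answer is unchanged, but the gap between the two wrong answers becomes large enough for a smaller model to learn from.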

Where you've seen this

Real-world examples

Mobile versions of Google Translate

On-device voice recognition on Android phones

Lightweight image classification models for mobile apps

Why it matters

The bigger picture

This technique is fundamental to modern AI deployment. It allows companies like Google to run sophisticated language models and image classifiers on smartphones and edge devices that lack the massive computing power required by the original, cumbersome models. It effectively bridges the gap between research-grade supercomputing and consumer-grade hardware.

Filed

June 4, 2015

Granted

May 14, 2019

Market context

Who's building on this

Companies in this space

Google remains a primary driver of this research, but the technique is now a standard practice across the AI industry. Companies like Meta, Microsoft, and various startups building on-device AI models use distillation to optimize their LLMs and computer vision systems for local execution.

Market impact

This patent formalized a practice that enabled the 'edge AI' revolution. By providing a reliable way to compress intelligence without losing significant accuracy, it allowed developers to move AI processing from expensive cloud servers directly onto consumer devices, reducing latency and improving user privacy.

Patent timeline

Filing · Jun 4, 2015

Application submitted to the patent office

Publication · May 14, 2019

Application published, typically 18 months after filing

Grant · May 14, 2019

Patent officially issued

PatentBrief Score

Impact Score

Moderate

Citation count

14/40

Early citations

Claim breadth

15/20

Broad claims

Recency

10/20

Granted 5–10 years ago

Assignee scale

20/20

Major company or institution

PatentBrief Impact Score — based on citation count, claim breadth, recency, and assignee scale. Not a legal assessment.

Heuristic Value Estimate

What this patent might be worth

Modest

$52K – $166K

Midpoint $104K · 8.8 yr remaining · industry ×1.6

Heuristic only — blends forward/backward citation counts, claim scope, time remaining, litigation history, and CPC-derived industry baseline. Real valuations need a professional appraisal.

The original legal language

Original claims

23 claims as filed with the patent office.

Cite this patent

Vinyals, O., Hinton, G. E., & Dean, J. A. (2019). How to Shrink Large AI Models Using Knowledge Distillation (U.S. Patent No. 10,289,962). U.S. Patent and Trademark Office. https://patentbrief.org/patent/us/10289962/deep-q-networks-dqn

Auto-generated from the patent record. Double-check author order and the issue date against the official USPTO document before submitting.

Semantically similar

You might also find these interesting

US 20220383078 · Huawei Technologies Co

Adapting AI Models to Fit Device Resources

US 12443890 · 2025 · Google

How Devices Train Shared AI Models While Keeping Your Data Private

US 20220012637 · Nokia Technologies Oy

Training AI Models Together with Unlabeled Data Using a Teacher

US 11544573 · 2023 · Google LLC

How Projection Neural Networks Speed Up AI Predictions

More to explore

Frequently Asked Questions

What does "How to Shrink Large AI Models Using Knowledge Distillation" cover?

A method for teaching small, efficient AI models to mimic the complex decision-making patterns of much larger, more powerful neural networks.

Who owns patent US 10289962?

Google LLC owns this patent, granted in 2019.

When does this patent expire?

This patent is expected to expire in 2035, about 20 years after its June 4, 2015 filing date, when the invention enters the public domain.

How many later patents cite US 10289962?

This patent has been cited by 4 later patents that build on its ideas.

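What problem does this patent solve?

Large neural networks need more computing power than smartphones and edge devices can provide. Distillation trains a smaller model to approach the large model's accuracy so that it can run quickly on that lighter hardware.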

What does this patent NOT cover?

Does not cover training models from scratch without a pre-existing cumbersome model.

Same assignee

More from Google LLC

US 11544573·2023

How Projection Neural Networks Speed Up AI Predictions

US 10452978·2019

How AI Models Understand Language Using 'Attention'

US 10318574·2019

How Google Automatically Groups Photos and Contextual Data into Moments

US 9998475·2018

How Smart Home Devices Automatically Connect to Utility Energy Programs

Last reviewed: June 15, 2026 · PatentBrief is not a law firm and this is not legal advice.