PatentBrief — patentbrief.org

US 10452978

How AI Models Understand Language Using 'Attention'

Granted 2019 · Active · Expires 2038 · Owned by Google LLC · Invented by Lukasz Mieczyslaw Kaiser, Illia Polosukhin, Llion Owen Jones + 5 more

Original patent title: “Attention-based sequence transduction neural networks”

Plain-English explanation by Sahi · Last reviewed June 13, 2026

This patent describes a neural network architecture, known as a Transformer, that uses a "self-attention" mechanism to process sequences of information, like words in a sentence, by weighing the importance of different parts of the input. It was granted to Google LLC in 2019 with 33 claims, has 45 forward citations, and is expected to expire in 2038.

Coverage

What does this patent actually cover?

The system described takes an input sequence, such as a sentence, and converts it into an output sequence, like a translation. It uses an "encoder neural network" (Claim 1) to process the input, which is made up of multiple "encoder subnetworks." Each subnetwork contains an "encoder self-attention sub-layer" (Claim 1). This sub-layer processes each part of the input by applying a self-attention mechanism, which involves determining a "query" from the current input position, and "keys" and "values" from all input positions (Claim 1). These are then used to generate a specific output for that input position, allowing the network to understand how different parts of the sequence relate to each other. For example, when translating a sentence, this mechanism helps the AI determine which words are most relevant to understanding the meaning of a particular word.
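As a rough illustration of the query, key, and value step described above, the Python sketch below runs self-attention over a tiny sequence. The scaled dot-product scoring and softmax weighting follow the commonly published Transformer formulation and are an assumption here; the claim summary only says that a query, keys, and values are determined and used to produce an output for each position. The projection matrices `w_q`, `w_k`, `w_v` and the toy inputs are hypothetical.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(inputs, w_q, w_k, w_v):
    """Return (outputs, weights) for one self-attention pass.

    Assumption: scaled dot-product scoring, which the summary does not spell out.
    """
    # Queries, keys, and values are all derived from the same inputs.
    queries = [matvec(w_q, x) for x in inputs]
    keys = [matvec(w_k, x) for x in inputs]
    values = [matvec(w_v, x) for x in inputs]
    scale = math.sqrt(len(keys[0]))

    outputs, all_weights = [], []
    for q in queries:
        # Compare this position's query against the key of every position.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # The output is a weighted average of all value vectors.
        out = [sum(w * v[d] for w, v in zip(weights, values))
               for d in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy data: three 2-dimensional "word" vectors, identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, identity, identity, identity)
```

In this toy run the first token weighs itself and the third token equally, and both more heavily than the second token, because their keys line up with its query. Every position is compared with every other position in one pass, which is what lets the network relate distant words.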

The gap

What does this patent NOT cover?

Does not cover neural networks that process sequences without using a self-attention mechanism.
Does not cover attention mechanisms where the "query," "keys," and "values" are not derived from the *same* subnetwork inputs for self-attention.
Does not cover systems that lack either an "encoder neural network" or a "decoder neural network" as part of the overall sequence transduction network (Claim 1).
Does not cover models that do not explicitly determine and use "query," "keys," and "values" from the subnetwork inputs as described in Claim 1.
Does not cover systems where the initial encoder subnetwork inputs are not formed by combining embedded representations with positional embeddings, as described in Claim 2.

These exclusions are unique to PatentBrief — derived from the actual claim language, not patent-office boilerplate.
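The last exclusion refers to combining embedded representations with positional embeddings (Claim 2). The sketch below shows one way that combination could look. Both the sinusoidal formula and the element-wise addition are assumptions taken from the commonly published Transformer formulation; the summary above says only that the two are combined. The function names and the toy embeddings are hypothetical.

```python
import math

def positional_embedding(position, dim):
    """Sinusoidal embedding for one position (assumed form, not quoted from the claims)."""
    vec = []
    for i in range(dim):
        # Pairs of dimensions share a wavelength; even slots use sin, odd slots use cos.
        angle = position / (10000 ** ((2 * (i // 2)) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec

def encoder_inputs(token_embeddings):
    """Add a positional embedding to each token embedding, element by element."""
    dim = len(token_embeddings[0])
    return [
        [t + p for t, p in zip(tok, positional_embedding(pos, dim))]
        for pos, tok in enumerate(token_embeddings)
    ]

# Hypothetical example: the same token embedding appearing at positions 0 and 1.
embedded = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]
combined = encoder_inputs(embedded)
```

Although both tokens start with identical embeddings, the combined vectors differ, so the self-attention layers can tell the first occurrence from the second.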

Key facts

Patent number	US 10452978
Status	Active
Field	AI & Machine Learning
Assignee	Google LLC
Inventors	Lukasz Mieczyslaw Kaiser, Illia Polosukhin, Llion Owen Jones and 5 others
Filed	2018
Granted	2019
Expires	2038
Claims	33
Times cited	45
Litigation	None on record

Value · $288K–$922K · Substantial

What made this novel

The truly novel aspect is the "self-attention mechanism." Instead of processing sequences one step at a time, it allows the model to simultaneously weigh the importance of every other element in a sequence when processing a single element, efficiently capturing long-range dependencies.

The Patent Drawing

[Figure: representative patent drawing for "Attention-based sequence transduction neural networks" (US 10452978), a schematic visualization of the patent's claim structure. All figures are available on Google Patents.]

Where you've seen this

Real-world examples

Google Translate

ChatGPT (OpenAI)

Bard (Google)

BERT (Google)

DALL-E (OpenAI)

Most modern large language models (LLMs)

Why it matters

The bigger picture

This patent covers the core ideas behind the Transformer architecture, which fundamentally changed how artificial intelligence processes sequential data, especially in natural language processing (NLP). It enabled the development of large language models (LLMs) like GPT and BERT, significantly improving tasks such as machine translation, text summarization, and question answering. The architecture allowed for more efficient parallel processing of sequences, making training much faster than previous sequential models.

Filed

June 28, 2018

Granted

October 22, 2019

Market context

Who's building on this

Companies in this space

Google, as the assignee, continues to develop and deploy Transformer-based models like BERT, LaMDA, and PaLM across its products. Other major players include OpenAI (ChatGPT, GPT-3/4), Meta (LLaMA), and Microsoft (which integrates these models into products like Bing Chat and Copilot), along with numerous startups in the AI space.

Market impact

This patent's underlying invention, the Transformer architecture, fundamentally reshaped the AI landscape, particularly in natural language processing. It enabled the creation of large language models, leading to a surge in AI research and commercial applications, and establishing a new standard for sequence modeling. It also spurred significant investment and competition in the AI industry.

Patent timeline

Filing	Jun 28, 2018	Application submitted to the patent office
Publication	Oct 22, 2019	Application published, typically 18 months after filing
Grant	Oct 22, 2019	Patent officially issued
Expiration	Jun 28, 2038	Patent enters public domain
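The dates in this timeline are consistent with a simple rule: expiration falls twenty years after the listed filing date. The short Python sketch below reproduces that arithmetic and the time remaining as of the review date. It assumes a flat twenty-year term with no adjustments, extensions, or disclaimers, which an official record may show differently.

```python
from datetime import date

filing = date(2018, 6, 28)    # filing date listed on this page
reviewed = date(2026, 6, 13)  # "last reviewed" date listed on this page

# Assumption: flat 20-year term counted from the filing date.
expiration = filing.replace(year=filing.year + 20)

# Approximate years left, using an average year length of 365.25 days.
years_remaining = (expiration - reviewed).days / 365.25

print(expiration.isoformat())     # 2038-06-28
print(round(years_remaining, 1))  # 12.0
```

The result of about 12.0 years matches the remaining term quoted in the value estimate on this page.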

PatentBrief Score

Impact Score

High impact

Citation count	33/40	Moderately cited
Claim breadth	20/20	Very broad protection
Recency	10/20	Granted 5–10 years ago
Assignee scale	20/20	Major company or institution

PatentBrief Impact Score — based on citation count, claim breadth, recency, and assignee scale. Not a legal assessment.
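The page lists the four component scores but not how they roll up into a total. As a minimal sketch, assuming the components are simply added (an assumption suggested by the maximums summing to 100, not something stated here), the arithmetic looks like this:

```python
# Component scores as (points earned, maximum points), taken from the list above.
components = {
    "Citation count": (33, 40),
    "Claim breadth": (20, 20),
    "Recency": (10, 20),
    "Assignee scale": (20, 20),
}

# Assumption: the overall score is a plain sum of the components.
total = sum(points for points, _ in components.values())
maximum = sum(cap for _, cap in components.values())

print(f"{total}/{maximum}")  # 83/100
```

Under that assumption, recency is the only component pulling the score well below its ceiling.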

Heuristic Value Estimate

What this patent might be worth

Substantial

$288K – $922K

Midpoint $576K · 12.0 yr remaining · industry ×1.6

Heuristic only — blends forward/backward citation counts, claim scope, time remaining, litigation history, and CPC-derived industry baseline. Real valuations need a professional appraisal.

Claim text not yet imported for this patent

The original legal language

Original claims

33 claims as filed with the patent office.

Concepts involved

Claim, Prior art, Non-obviousness, Novelty, Specification, Assignee, Patent term

Cite this patent

Kaiser, L. M., Polosukhin, I., Jones, L. O., Shazeer, N. M., Parmar, N. J., Vaswani, A. T., Uszkoreit, J. D., & Gomez, A. N. (2019). How AI Models Understand Language Using 'Attention' (U.S. Patent No. 10,452,978). U.S. Patent and Trademark Office. https://patentbrief.org/patent/us/10452978/transformer-attention-mechanism

Auto-generated from the patent record. Double-check author order and the issue date against the official USPTO document before submitting.

Embed

Add this patent to your site

Drop this plain-English patent card into any blog post or article — free, no signup. It always links back to the full breakdown here.

<div data-patentlens-widget data-patent-number="US10452978"></div>
<script src="https://patentbrief.org/embed.js" async></script>

More to explore

Frequently Asked Questions

Who owns patent US 10452978?

Google LLC owns this patent, granted in 2019.

When does this patent expire?

This patent is expected to expire on June 28, 2038, when the invention enters the public domain.

How many times has patent US 10452978 been cited?

This patent has been cited by 45 later patents that build on its ideas.

Last reviewed: June 13, 2026 · PatentBrief is not a law firm and this is not legal advice.
