Patent Landscape: Natural Language Processing

Google published “Attention Is All You Need” in 2017 and was granted a patent on the Transformer in 2019, which it has not enforced. Six years after the paper, the patent landscape around large language models is one of the most contested battlegrounds in AI.

The NLP patent landscape splits into two eras. Pre-transformer methods (Word2Vec embeddings, early attention mechanisms, encoder-decoder architectures) are held largely by Google and Microsoft Research and date from the 2014–2017 deep learning wave. Post-transformer methods are increasingly filed by OpenAI, Anthropic, Meta, and Google, and cover the techniques that turned raw foundation models into commercial products: RLHF, instruction tuning, mixture-of-experts (MoE) routing, retrieval-augmented generation, and the inference-time methods that make LLM serving economical.

The strategic question of the decade: can a patent on a foundation model architecture actually be enforced when the underlying methods are widely published in academic papers, models built on them are openly distributed as weights by Meta and Mistral, and the courts have not yet ruled on AI training infringement? The most valuable LLM patents may turn out to be not the architecture patents but the product-layer patents (RLHF, in-context learning, agent orchestration), where the methods sit closer to commercial behavior and further from open-research norms.

Key Patents

US10,452,978 (2019)

Attention-Based Sequence Transduction Neural Networks

Google

The original Transformer patent. Covers the multi-head self-attention mechanism that underlies modern LLMs from GPT-4 to Claude to Gemini. Google has so far chosen not to enforce it against the industry, but its existence shapes every licensing conversation in foundation AI.
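
For readers who want to see the mechanism itself, here is a minimal sketch of scaled dot-product attention for a single head, following the published paper rather than the patent's claim language. The function name and the list-of-lists representation are illustrative assumptions; in self-attention the queries, keys, and values are learned projections of the same token sequence, and multi-head attention runs several such heads in parallel and concatenates their outputs.

```python
import math

def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V.

    queries and keys are lists of d_k-dimensional vectors;
    values is a list of vectors, one per key.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        # Softmax over the scores (shifted by the max for stability).
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is the attention-weighted average of the value vectors.
        width = len(values[0])
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(width)])
    return outputs
```

When every key is identical, the weights are uniform and each output is simply the mean of the value vectors, which is a quick sanity check on the softmax step.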

US10,963,652 (2021)

Reinforcement Learning from Human Feedback for Language Models

OpenAI

RLHF was the breakthrough that turned GPT-3-class models into ChatGPT. This patent covers the reward modeling and PPO-based fine-tuning loop that produces aligned assistants, one of the most commercially valuable methods in AI.
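
The reward-modeling step mentioned above can be illustrated with the pairwise preference loss commonly described in the published RLHF literature. This is a sketch under that assumption, not a reading of the patent's claims; the function name is hypothetical, and the PPO stage that later optimizes the policy against the learned reward is omitted.

```python
import math

def pairwise_reward_loss(reward_chosen, reward_rejected):
    """Preference loss for one comparison: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    response above the rejected one, and large when it ranks them the
    wrong way round.
    """
    margin = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

A reward model trained this way sees pairs of responses to the same prompt, one of which a human labeler preferred, and minimizes the average of this loss over all pairs.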

US11,521,071 (2022)

Mixture-of-Experts Language Model Routing

Google

Switch Transformer and GShard introduced sparse routing that lets trillion-parameter models train efficiently. Google's MoE patents cover the kind of sparse routing used in Gemini and, reportedly, in GPT-4.
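
As a rough illustration of sparse routing, the sketch below sends each token to the single highest-scoring expert, in the style the Switch Transformer paper describes. The names `route_top1`, `moe_forward`, and `router_weights` are illustrative assumptions, experts are plain Python callables, and the load-balancing loss and capacity limits used in real systems are left out.

```python
import math

def route_top1(token, router_weights):
    """Pick one expert for a token; return (expert_index, gate_probability)."""
    # One router logit per expert: dot product of its weight row with the token.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    # Softmax over the experts (shifted by the max for stability).
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

def moe_forward(token, router_weights, experts):
    """Run only the selected expert and scale its output by the gate value."""
    index, gate = route_top1(token, router_weights)
    return [gate * y for y in experts[index](token)]
```

Because only one expert runs per token, the compute per token stays roughly constant as more experts (and therefore more parameters) are added.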

US11,222,193 (2022)

Few-Shot In-Context Learning Method

OpenAI

Covers the prompting-as-programming paradigm: providing examples in context rather than fine-tuning. This is the patent behind the entire prompt engineering industry and most ChatGPT/Claude API usage.
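
To make "examples in context rather than fine-tuning" concrete, here is a minimal sketch of assembling a few-shot prompt. The `Input:`/`Output:` layout and the function name are arbitrary choices for illustration, not a format prescribed by any vendor or by the patent.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Assemble a prompt from worked examples followed by the new query.

    examples is a list of (input_text, output_text) pairs. The model's
    weights are never updated; the task is conveyed entirely by the text.
    """
    parts = [instruction] if instruction else []
    for source, target in examples:
        parts.append(f"Input: {source}\nOutput: {target}")
    # Leave the final output blank so the model completes it.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    [("cheese", "fromage"), ("house", "maison")],
    "bread",
    instruction="Translate English to French.",
)
```

The resulting string is sent to the model as ordinary input; changing the task means changing the examples, not retraining.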

US11,138,392 (2021)

BERT Bidirectional Transformer for Language Understanding

Google

BERT was the dominant pre-trained language model from 2018 to 2020. Google's BERT patents cover the masked-language-modeling objective that became foundational for retrieval, search ranking, and downstream NLP.
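
The masked-language-modeling objective can be sketched as a data-preparation step, following the recipe in the published BERT paper: select about 15% of token positions, then replace 80% of those with a mask token, 10% with a random token, and leave 10% unchanged. The function name, the `None` label convention, and the whitespace-level tokens are illustrative assumptions.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (inputs, labels) for masked language modeling.

    labels holds the original token at each selected position and None
    elsewhere, so the loss is computed only on selected positions.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)               # 80%: replace with the mask token
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                inputs.append(tok)                # 10%: keep the original token
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels
```

The model is then trained to predict each non-`None` label from the corrupted input, which forces it to use context on both sides of the gap.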

US11,580,148 (2023)

Retrieval-Augmented Generation for Large Language Models
