Patent Landscape: Machine Learning

Google left the transformer architecture unpatented. IBM filed 9,000 ML patents. The gap between what gets published and what gets patented defines the power structure of modern AI.

Machine learning patent strategy is defined by a fundamental tension: the most powerful results come from publishing openly and building a research community, while the most durable competitive advantages come from quietly patenting the implementations. Google published the transformer paper in 2017 without filing a patent — betting that a vibrant ML ecosystem would benefit Google more than a royalty stream. IBM took the opposite approach and built the world's largest ML patent portfolio by systematically filing on every enterprise application of neural networks.

The ML patent landscape has matured significantly since the early deep learning patents of the 1980s. Today the most active filing areas are not foundational algorithms — those are considered mathematical methods and are generally unpatentable — but specific architectural implementations, training optimization methods, and hardware-software integration. Understanding which organizations hold these implementation patents is essential context for any company building ML-powered products.

Key Patents

US4,760,604 (1988)

Neural Network Learning by Back Propagation

Trustees of Boston University

One of the earliest patents on backpropagation — the algorithm that made multi-layer neural networks trainable at scale. Backpropagation is the foundational mechanism behind every deep learning model trained today, from image classifiers to large language models.
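As an illustration of the general technique (not of this patent's claims), backpropagation can be sketched on a two-layer scalar network. The function names, the tanh activation, and the squared-error loss are assumptions chosen for brevity.

```python
import math

def forward(w1, w2, x):
    """Two-layer scalar network: h = tanh(w1 * x), y = w2 * h."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y

def loss(w1, w2, x, target):
    """Squared error 0.5 * (y - target)^2."""
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2

def backward(w1, w2, x, target):
    """Gradients of the loss with respect to w1 and w2 via the chain rule."""
    h, y = forward(w1, w2, x)
    dy = y - target                 # dL/dy
    dw2 = dy * h                    # dL/dw2
    dh = dy * w2                    # error propagated back through the output layer
    dw1 = dh * (1.0 - h * h) * x    # tanh'(z) = 1 - tanh(z)^2
    return dw1, dw2

def train(w1, w2, x, target, lr=0.1, steps=200):
    """Plain gradient descent using the backpropagated gradients."""
    for _ in range(steps):
        dw1, dw2 = backward(w1, w2, x, target)
        w1 -= lr * dw1
        w2 -= lr * dw2
    return w1, w2
```

The key step is `dh = dy * w2`: the output error is passed backward through the second layer before it reaches the first layer's weight, and the same pattern repeats for every additional layer in a deeper network.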

US9,406,017 (2016)

Convolutional Neural Network for Image and Video Analysis

Google

Google's CNN architecture patent covers the specific layer configuration and weight-sharing methods used in its image classification systems. Convolutional networks are the backbone of computer vision — this patent's claims define how filters are applied across spatial dimensions to extract hierarchical features.
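To make "filters applied across spatial dimensions" concrete, here is a minimal sketch of a single shared filter sliding over a 2D input. It illustrates weight sharing in general and is not drawn from the patent's claims; `conv2d_valid` is a hypothetical name, and the sketch assumes no padding and a stride of one.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over every position of a 2D image (no padding, stride 1).

    The same kernel weights are reused at every position, which is the
    weight sharing that distinguishes a convolutional layer from a dense one.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output

# A horizontal-gradient kernel responds only where the image has a vertical edge.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edges = conv2d_valid(image, [[-1, 1]])
```

Stacking several such layers, each consuming the previous layer's output, is what produces the hierarchy of features the entry describes.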

US10,339,448 (2019)

Generative Adversarial Network Training Method

NVIDIA

NVIDIA's GAN training patent covers the adversarial optimization process in which a generator network and a discriminator network compete until the generator produces synthetic data that is hard to distinguish from real samples. GANs are used in image synthesis, deepfake generation and detection research, drug discovery, and data augmentation, which makes this one of the more commercially significant ML patents filed in the 2010s.

US10,452,974 (2019)

Recurrent Neural Network with Attention Mechanism

Google Brain

Covers the attention mechanism applied to recurrent architectures — the precursor insight to the full transformer model. Attention allows models to selectively weight different parts of an input sequence, solving the bottleneck problem that limited earlier sequence-to-sequence models in translation and summarization tasks.
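The idea of selectively weighting parts of an input sequence can be sketched in a few lines. This is a generic illustration, not the patented method: dot-product scoring is an assumption (other scoring functions exist), and `attend` and `softmax` are hypothetical helper names.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Weight each input position by how well its key matches the query.

    Returns the attention weights and the weighted sum of the values.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context

# Three input positions; the first key is most aligned with the query.
weights, context = attend(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    [[10.0, 0.0], [0.0, 10.0], [-10.0, 0.0]],
)
```

Because the context vector is recomputed for each query, a decoder is not forced to squeeze the whole input into one fixed-size summary, which is the bottleneck the entry refers to.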

US10,832,136 (2020)

Federated Machine Learning with Privacy-Preserving Aggregation

Google

Federated learning allows ML models to be trained across distributed devices without raw data ever leaving those devices. Google developed this approach for training Gboard's prediction models on users' phones. This patent covers the aggregation protocol that combines model updates while preserving differential privacy, an approach now being adopted in healthcare and finance ML applications.
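A rough sketch of what combining model updates on a server can look like follows. It is illustrative only and does not reproduce the patented protocol: the clip-then-average-then-add-noise recipe is a common pattern in the differential privacy literature, the function names and parameters are hypothetical, and the code makes no formal privacy guarantee.

```python
import math
import random

def clip(update, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]

def aggregate(updates, max_norm=1.0, noise_std=0.0, rng=None):
    """Average clipped client updates and add Gaussian noise to the result.

    Only model updates reach this function; raw training data stays on the
    clients. Clipping bounds any single client's influence, and the noise
    masks individual contributions (noise_std is an assumed tuning knob).
    """
    rng = rng or random.Random(0)
    clipped = [clip(u, max_norm) for u in updates]
    count = len(clipped)
    dim = len(clipped[0])
    mean = [sum(u[d] for u in clipped) / count for d in range(dim)]
    return [m + rng.gauss(0.0, noise_std) for m in mean]

# Two clients send weight deltas; the server sees only these vectors.
global_delta = aggregate([[3.0, 4.0], [0.0, 0.0]], max_norm=1.0, noise_std=0.01)
```

Production systems typically add secure aggregation and privacy accounting on top of this, neither of which is shown here.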

US11,120,295 (2021)

Sparse Mixture-of-Experts Neural Network Architecture

Google Brain

The Mixture-of-Experts (MoE) architecture activates only a subset of model parameters for each input, enabling dramatically larger models without proportional compute costs. MoE is widely reported, though not officially confirmed, to underlie frontier models such as GPT-4 and Gemini Ultra, making this patent directly relevant to the frontier model race.
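Sparse activation can be sketched as top-k routing: a gate scores every expert, but only the k best are actually run. This is a generic illustration rather than the claimed architecture. The names are hypothetical, and the gate scores are passed in directly, whereas a real model would compute them from the input with a learned gating network.

```python
import math

def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    peak = max(gate_scores[i] for i in chosen)
    exps = {i: math.exp(gate_scores[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and mix their outputs.

    Experts that are not selected are never called, so compute grows with k
    rather than with the total number of experts.
    """
    routing = top_k_route(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routing.items())
    return output, routing

# Four toy experts; only two of them execute for this input.
experts = [lambda x: x, lambda x: 2 * x, lambda x: 100 * x, lambda x: -x]
output, routing = moe_forward(1.0, experts, [2.0, 2.0, -5.0, 0.0], k=2)
```

Adding more experts increases total parameter count while per-input cost stays tied to k, which is the trade-off the entry describes.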

Key Players

Google / DeepMind

Holds the deepest ML architecture patent portfolio across reinforcement learning, transformer variants, and large-scale training methods. Google deliberately left the original transformer paper (Vaswani et al., 2017) unpatented — but filed aggressively on surrounding applications. DeepMind's AlphaFold protein structure prediction patents represent a separate IP moat in scientific ML.

NVIDIA

The hardware-software integration of CUDA with ML frameworks is NVIDIA's primary IP moat. Beyond GPU architecture patents, NVIDIA holds significant IP in training optimization methods, inference acceleration, and distributed training across GPU clusters. The move into DGX cloud infrastructure is protected by an expanding systems patent portfolio.

IBM Research

IBM's ML patent portfolio exceeds 9,000 filings with emphasis on enterprise applications: explainable AI, bias detection, and automated machine learning (AutoML). IBM's strategy is to license into regulated industries — healthcare, finance, legal — where explainability requirements create demand for its specific IP.

Microsoft

Uses OpenAI's open publication strategy as a complement — OpenAI publishes research, Microsoft patents the commercial implementations. Azure ML infrastructure, GitHub Copilot's code completion pipeline, and Office 365 AI integration are all covered by distinct Microsoft ML patent filings.

What to Watch

Foundation Model Training Process Patents

The specific methods used to pre-train large foundation models, including data mixture recipes, learning rate schedules, and RLHF pipelines, are being actively patented by major AI labs. These process patents could determine which organizations can train competitive models without licensing fees, which would make them among the most strategically significant ML filings of the 2020s.

Model Compression and Efficient Inference IP

Quantization, pruning, knowledge distillation, and speculative decoding are all active patent areas as the industry races to run large models on constrained hardware. Apple's on-device ML inference patents, Qualcomm's neural processing unit IP, and startup patents from companies like Together AI represent a growing portfolio of deployment-side ML IP.
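Of the techniques listed, quantization is the simplest to show in code. The sketch below uses symmetric per-tensor int8 quantization, one common scheme among many. It is not taken from any of the patents mentioned, and the function names are hypothetical.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

Each weight now fits in one byte instead of four, at the cost of a rounding error of at most half the scale per weight. That memory saving is why the technique matters on constrained hardware.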

Synthetic Data Generation for Model Training

As high-quality training data becomes scarcer, synthetic data generation (using existing models to create training data for new models) is emerging as a core technique and a contested IP area. The methods used to generate, filter, and validate synthetic training datasets are being kept as trade secrets or filed as patents by frontier labs.
