How Google Distributed Machine Learning Across Many Computers
A Google patent, filed in 2003, describing a way to build machine learning models by splitting the work across a large network of computers rather than running it on a single machine.
Patent Number
US 7222127
Status
Active
Filing Date
December 15, 2003
Grant Date
May 22, 2007
Expiration
~December 2023 (estimated)
Claims
45
Assignee
Google LLC
Inventors
Georges R. Harik, Simon Tong, Noam Shazeer, Jeremy Bem, Joshua L. Levenberg
Citations
72 forward · 28 backward
What it covers
This patent describes a distributed system in which multiple computer nodes work together to build a large machine learning model. Instead of one computer processing all the data, one node selects a 'candidate condition' (a potential rule for the model) and asks the other nodes for statistics about that condition in their own slice of the data. Each node computes those statistics locally, such as derivatives of the log-likelihood or histograms, to help determine whether the rule is useful. The system then aggregates the results and decides whether to add the rule to the final model, so the model can grow in complexity using the combined capacity of the whole network.
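The request-and-aggregate flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented method: the `Node` class, the `evaluate_candidate` function, the logistic model, and the `min_abs_grad` threshold are all hypothetical names and choices introduced here. A condition is modeled as a set of features that must all be present, and each node returns a match count plus the derivative of the log-likelihood with respect to the candidate's weight.

```python
import math
from dataclasses import dataclass


@dataclass
class Instance:
    features: frozenset  # names of the features present in this instance
    label: int           # 1 = positive outcome, 0 = negative outcome


class Node:
    """One worker holding a slice of the training data (hypothetical)."""

    def __init__(self, instances):
        self.instances = instances

    def stats_for(self, condition, weights):
        """Return (count, gradient) over local instances matching the condition.

        Assumes a logistic model. The gradient is the derivative of the
        log-likelihood with respect to the candidate's weight, taken at 0.
        """
        count, grad = 0, 0.0
        for inst in self.instances:
            if condition <= inst.features:  # every feature in the condition is present
                score = sum(w for cond, w in weights.items() if cond <= inst.features)
                p = 1.0 / (1.0 + math.exp(-score))  # current predicted probability
                grad += inst.label - p
                count += 1
        return count, grad


def evaluate_candidate(nodes, condition, weights, min_abs_grad=1.0):
    """Coordinator: gather per-node statistics, sum them, apply a threshold."""
    results = [node.stats_for(condition, weights) for node in nodes]
    total_count = sum(c for c, _ in results)
    total_grad = sum(g for _, g in results)
    return total_count, total_grad, abs(total_grad) >= min_abs_grad


# Two nodes, each holding its own slice of the data; the model starts empty.
nodes = [
    Node([Instance(frozenset({"a", "b"}), 1), Instance(frozenset({"a"}), 0)]),
    Node([Instance(frozenset({"a", "b"}), 1), Instance(frozenset({"b"}), 0)]),
]
count, grad, accept = evaluate_candidate(nodes, frozenset({"a", "b"}), weights={})
```

Only the small `(count, gradient)` pairs cross the network in this sketch; the instances themselves stay on the node that holds them.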
What it doesn't cover
- Does not cover machine learning models that run entirely on a single processor or single computer node.
- Does not cover specific neural network architectures like Transformers or CNNs, as the claims focus on rule-based model generation.
- Does not cover real-time inference or prediction methods, only the process of generating the model itself.
The clever bit
The system uses a 'feature-to-instance index' to quickly identify which data points satisfy a condition, then offloads the heavy mathematical lifting (calculating derivatives) to the nodes that actually hold the data, minimizing the need to move large datasets across the network.
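One plausible reading of a 'feature-to-instance index' is an inverted index from each feature to the IDs of the instances that contain it. The sketch below assumes that reading and assumes a condition is a conjunction of features; `build_feature_index` and `matching_instances` are hypothetical names, and the patent's actual data structure may differ.

```python
from collections import defaultdict


def build_feature_index(instances):
    """Map each feature to the set of local instance IDs that contain it."""
    index = defaultdict(set)
    for instance_id, features in enumerate(instances):
        for feature in features:
            index[feature].add(instance_id)
    return index


def matching_instances(index, condition):
    """Return IDs of instances that contain every feature in the condition."""
    postings = [index.get(feature, set()) for feature in condition]
    if not postings:
        return set()
    postings.sort(key=len)     # start from the rarest feature to keep sets small
    result = set(postings[0])  # copy so the index itself is never mutated
    for posting in postings[1:]:
        result &= posting
    return result


# One node's local slice of data, with each instance given as a set of features.
local_instances = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b", "c"}]
index = build_feature_index(local_instances)
matches = matching_instances(index, {"a", "b"})
```

With an index like this, a node finds the instances that satisfy a condition by intersecting a few sets instead of scanning every instance, and then computes its statistics over only those matches.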
Why it matters
This patent is an early architectural blueprint for the large-scale distributed computing now associated with Google. By enabling models to be trained across clusters of machines, it made it possible to process datasets far too large for any single machine of the early 2000s, laying groundwork for the company's search ranking and ad-targeting algorithms.
Real-world examples
1. Google Search ranking algorithms
2. Large-scale ad-click prediction systems
3. Distributed training clusters in data centers
Generated by PatentBrief · Not legal advice · patentbrief.org
US 7222127 · 2026