Adapting AI Models to Fit Device Resources
This patent describes how a computer system can automatically shrink a large artificial intelligence model, specifically a "transformer" type, to fit the available computing power of a phone or other device.
Patent Number: US 20220383078
Status: Active
Filing Date: August 8, 2022
Grant Date: none listed
Expiration: August 8, 2042
Claims: 22
Assignee: Huawei Technologies Co
Inventors: Lu HOU, Xin Jiang, Lifeng Shang
Citations: 6 forward · 3 backward
What it covers
The patent describes a method in which a processing device adapts an AI model, called the "first neural network model," to a "terminal device" such as a smartphone. The processing device first obtains the terminal device's "available resource state" (for example, how much memory or processing power it has) or a "performance requirement." Based on that information, it derives a "second neural network model" by shrinking parts of the first model. Under claim 1, the second model can have fewer "attention heads" in its "transformer layers," fewer "neurons" in its "intermediate layers," or fewer "transformer layers" overall. For example, if a phone has limited memory, the system might remove some attention heads from the original model to produce a smaller, faster version that still performs acceptably on that phone.
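The selection step described above can be illustrated with a minimal Python sketch. It is not the patent's procedure. It assumes the resource state has already been translated into a budget on weight count, and every name in it (`TransformerConfig`, `estimate_params`, `shrink`, `select_config`) and every width or depth multiplier is hypothetical. The weight estimate is the standard rough count for a transformer layer and ignores embeddings, biases, and layer norms.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional, Sequence


@dataclass(frozen=True)
class TransformerConfig:
    hidden_size: int        # width of each token representation
    head_dim: int           # size of one attention head
    num_heads: int          # attention heads per transformer layer
    intermediate_size: int  # neurons in each feed-forward intermediate layer
    num_layers: int         # number of transformer layers


def estimate_params(cfg: TransformerConfig) -> int:
    """Rough weight count; ignores embeddings, biases, and layer norms."""
    attention = 4 * cfg.hidden_size * cfg.num_heads * cfg.head_dim  # Q, K, V, output
    feed_forward = 2 * cfg.hidden_size * cfg.intermediate_size      # up and down projections
    return cfg.num_layers * (attention + feed_forward)


def shrink(full: TransformerConfig, width: float, depth: float) -> TransformerConfig:
    """Reduce heads and intermediate neurons by `width`, layers by `depth`."""
    return TransformerConfig(
        hidden_size=full.hidden_size,
        head_dim=full.head_dim,
        num_heads=max(1, int(full.num_heads * width)),
        intermediate_size=max(1, int(full.intermediate_size * width)),
        num_layers=max(1, int(full.num_layers * depth)),
    )


def select_config(
    full: TransformerConfig,
    param_budget: int,
    widths: Sequence[float] = (1.0, 0.75, 0.5, 0.25),  # assumed multipliers
    depths: Sequence[float] = (1.0, 0.75, 0.5),        # assumed multipliers
) -> Optional[TransformerConfig]:
    """Return the largest candidate that fits the budget, or None if none fits."""
    candidates = [shrink(full, w, d) for w, d in product(widths, depths)]
    fitting = [c for c in candidates if estimate_params(c) <= param_budget]
    return max(fitting, key=estimate_params) if fitting else None


# A model with 12 layers, 12 heads of size 64, and 3072 intermediate neurons.
full = TransformerConfig(hidden_size=768, head_dim=64, num_heads=12,
                         intermediate_size=3072, num_layers=12)
small = select_config(full, param_budget=45_000_000)
```

A real system would also have to account for latency, activation memory, and accuracy, none of which this sketch models.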
What it doesn't cover
- Does not cover increasing the size of a neural network model based on available resources.
- Does not cover adapting non-transformer neural network architectures.
- Does not cover model adaptation methods that change the type of layers or neurons, only their quantity.
- Does not cover methods where the selection of components to remove is random, as claim 5 suggests a capability-based selection.
- Does not cover adapting models by changing their data types (e.g., from 32-bit to 16-bit floating point).
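To make the contrast with random removal concrete, the sketch below keeps the attention heads with the highest importance scores. This summary does not say how the patent measures a component's capability, so the scores and the function name `heads_to_keep` are hypothetical placeholders for whatever measure is actually used.

```python
from typing import List, Sequence


def heads_to_keep(importance: Sequence[float], keep: int) -> List[int]:
    """Return indices of the `keep` highest-scoring heads, in original order."""
    if not 0 <= keep <= len(importance):
        raise ValueError("keep must be between 0 and the number of heads")
    # Rank head indices from most to least important.
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    # Restore original ordering so the surviving heads stay in layout order.
    return sorted(ranked[:keep])


# Four heads with assumed scores; keep the two most important ones.
kept = heads_to_keep([0.1, 0.9, 0.4, 0.7], keep=2)
```

The same ranking idea would apply to neurons or layers, given a score for each.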
The clever bit
The novelty lies in systematically creating a family of smaller, efficient neural network models from a single larger model by selectively reducing specific structural components (attention heads, neurons, layers) based on a device's real-time resource availability. This allows for dynamic, on-the-fly adaptation rather than needing to pre-train and store many different model sizes.
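One way to picture several model sizes coming from a single stored model is to slice its weight matrices. The sketch below does this for one feed-forward block using plain Python lists. It assumes the neurons are already ordered so that the most useful ones come first, and it uses ReLU for brevity. Both choices, and the names `slice_ffn` and `ffn_forward`, are illustrative assumptions rather than details taken from the patent.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def slice_ffn(w_in: Matrix, w_out: Matrix, keep: int) -> Tuple[Matrix, Matrix]:
    """Keep the first `keep` intermediate neurons of a feed-forward block.

    w_in has one row per intermediate neuron; w_out has one column per neuron.
    """
    return w_in[:keep], [row[:keep] for row in w_out]


def ffn_forward(x: List[float], w_in: Matrix, w_out: Matrix) -> List[float]:
    """Apply the block: project up, apply ReLU, project back down."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


# A toy block with hidden size 2 and 3 intermediate neurons.
w_in = [[1, 0], [0, 1], [1, 1]]
w_out = [[1, 1, 1], [1, 0, -1]]

full_output = ffn_forward([1, 2], w_in, w_out)
small_w_in, small_w_out = slice_ffn(w_in, w_out, keep=2)
small_output = ffn_forward([1, 2], small_w_in, small_w_out)
```

Because the smaller block reuses a subset of the larger block's weights, only one set of weights needs to be stored to serve both sizes.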
Why it matters
This approach targets the deployment of complex AI models, especially large language models, on devices with limited computing power, such as smartphones, smart home devices, and IoT sensors. It lets these devices run advanced AI tasks locally instead of relying constantly on powerful cloud servers, which can improve speed, privacy, and offline functionality. Huawei, as a major device manufacturer, would benefit from efficient on-device AI.
Real-world examples
1. On-device AI assistants (e.g., voice recognition on a smartphone)
2. Image processing on mobile cameras
3. Personalized recommendations on edge devices
4. AI-powered features in smartwatches
5. Federated learning clients
Generated by PatentBrief · Not legal advice · patentbrief.org