PatentBrief · Patent Brief · US 20220383078

Adapting AI Models to Fit Device Resources

This patent describes how a computer system can automatically shrink a large artificial intelligence model, specifically a "transformer" type, to fit the available computing power of a phone or other device.

Patent Number

US 20220383078

Status

Active

Filing Date

August 8, 2022

Grant Date

—

Expiration

August 8, 2042

Claims

Assignee

Huawei Technologies Co

Inventors

Lu Hou, Xin Jiang, Lifeng Shang

Citations

6 forward · 3 backward

What it covers

The patent describes a method in which a processing device adapts an AI model, the "first neural network model," to a "terminal device" such as a smartphone. The method first obtains the terminal device's "available resource state" (for example, how much memory or processing power is free) or a "performance requirement." From that, it derives a smaller "second neural network model" from the first. Under claim 1, the reduction can take three forms: fewer "attention heads" in the "transformer layers," fewer "neurons" in the "intermediate layers," or fewer "transformer layers" overall. For example, if a phone has limited memory, the system might remove some attention heads from the original model to create a smaller, faster version that still works well on that phone.
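The three reductions described above can be sketched as a transformation on a model's structural settings. This is a minimal illustration, not the patent's implementation: the `TransformerConfig` class, the `shrink` function, and the `width` and `depth` scaling factors are hypothetical names, and the layer, head, and neuron counts are example values chosen here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TransformerConfig:
    """Hypothetical structural description of a transformer model."""
    num_layers: int          # number of transformer layers
    num_heads: int           # attention heads per transformer layer
    intermediate_size: int   # neurons in each intermediate (feed-forward) layer


def shrink(first: TransformerConfig, width: float, depth: float) -> TransformerConfig:
    """Derive a smaller "second model" config from the "first model" config.

    width scales attention heads and intermediate neurons; depth scales layers.
    Both factors are assumed to lie in (0, 1]. Every count stays at 1 or more.
    """
    if not (0 < width <= 1 and 0 < depth <= 1):
        raise ValueError("width and depth must be in (0, 1]")
    return replace(
        first,
        num_layers=max(1, int(first.num_layers * depth)),
        num_heads=max(1, int(first.num_heads * width)),
        intermediate_size=max(1, int(first.intermediate_size * width)),
    )


# Example values only; the patent brief does not give concrete sizes.
first_model = TransformerConfig(num_layers=12, num_heads=12, intermediate_size=3072)
second_model = shrink(first_model, width=0.5, depth=0.75)
print(second_model)
```

The sketch changes only the counts. Deciding which heads, neurons, or layers to drop, and keeping accuracy acceptable afterwards, is a separate step that this example does not attempt.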

What it doesn't cover

- Does not cover increasing the size of a neural network model based on available resources.
- Does not cover adapting non-transformer neural network architectures.
- Does not cover model adaptation methods that change the type of layers or neurons, only their quantity.
- Does not cover methods where the selection of components to remove is random, as claim 5 suggests a capability-based selection.
- Does not cover adapting models by changing their data types (e.g., from 32-bit to 16-bit floating point).
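The list above contrasts random removal with the capability-based selection that claim 5 suggests. One way to picture the latter is to rank components by a score and keep the best ones. This is a sketch under stated assumptions: the brief does not say how capability is measured, so the per-head importance scores below are made up, and `heads_to_keep` is a hypothetical helper.

```python
def heads_to_keep(importance, keep):
    """Return indices of the `keep` highest-scoring heads, in original order."""
    if not 0 < keep <= len(importance):
        raise ValueError("keep must be between 1 and the number of heads")
    # Rank head indices from most to least important.
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    # Keep the top ones and restore their original ordering.
    return sorted(ranked[:keep])


# Made-up importance scores for six attention heads in one layer.
scores = [0.12, 0.80, 0.05, 0.47, 0.33, 0.61]
kept = heads_to_keep(scores, keep=3)
print(kept)
```

A random method would pick the three indices without looking at `scores` at all, which is the case the list item excludes.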

The clever bit

The novelty lies in systematically creating a family of smaller, efficient neural network models from a single larger model by selectively reducing specific structural components (attention heads, neurons, layers) based on a device's real-time resource availability. This allows on-the-fly adaptation, rather than pre-training and storing a separate model for each size.
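To make the "family of models" idea concrete, the sketch below picks the largest member of a family that fits a memory budget reported at run time. Everything in it is an assumption made for illustration: the model dimensions, the 32-bit weight size, the candidate `(width, depth)` pairs, and the function names `estimate_bytes` and `pick_subnetwork` do not come from the patent.

```python
# Hypothetical dimensions of the full "first model"; not taken from the patent.
HIDDEN = 768
LAYERS = 12
HEADS = 12
HEAD_DIM = 64
INTERMEDIATE = 3072
BYTES_PER_PARAM = 4  # assumes 32-bit weights


def estimate_bytes(width: float, depth: float) -> int:
    """Rough weight memory of a sub-network (biases and embeddings ignored)."""
    layers = max(1, int(LAYERS * depth))
    heads = max(1, int(HEADS * width))
    neurons = max(1, int(INTERMEDIATE * width))
    attention = 4 * HIDDEN * heads * HEAD_DIM   # Q, K, V and output projections
    feed_forward = 2 * HIDDEN * neurons         # up and down projections
    return layers * (attention + feed_forward) * BYTES_PER_PARAM


def pick_subnetwork(available_bytes: int, candidates):
    """Return the largest (width, depth) pair that fits, or None if none fits."""
    fitting = [c for c in candidates if estimate_bytes(*c) <= available_bytes]
    return max(fitting, key=lambda c: estimate_bytes(*c), default=None)


# The family: every combination of a few width and depth factors.
CANDIDATES = [(w, d) for w in (0.25, 0.5, 0.75, 1.0) for d in (0.5, 0.75, 1.0)]

# Example call with a 200 MiB budget reported by the device.
choice = pick_subnetwork(200 * 1024**2, CANDIDATES)
```

Because every candidate is derived from the same larger model, only the selection changes when the budget changes. A real system would also weigh latency and accuracy, which this size-only estimate ignores.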

Why it matters

This technology matters for deploying complex AI models, especially large language models, on devices with limited computing power, such as smartphones, smart home devices, and IoT sensors. It lets these devices run advanced AI tasks locally instead of relying on cloud servers for every request, which improves speed, privacy, and offline functionality. Huawei, as a major device manufacturer, stands to benefit from efficient on-device AI.

Real-world examples

1. On-device AI assistants (e.g., voice recognition on a smartphone)
2. Image processing on mobile cameras
3. Personalized recommendations on edge devices
4. AI-powered features in smartwatches
5. Federated learning clients

Generated by PatentBrief · Not legal advice · patentbrief.org
