# Adapting AI Models to Fit Device Resources

> This patent describes how a computer system can automatically shrink a large artificial intelligence model, specifically a "transformer" type, to fit the available computing power of a phone or other device.

- **Patent:** US 20220383078
- **Original title:** Data processing method and related device
- **Owner:** Huawei Technologies Co
- **Status:** Active
- **Times cited:** 6
- **Field:** AI/ML, consumer electronics, telecommunications, software

## What it does

The patent describes a method by which a processing device adapts an AI model, called the "first neural network model," to a "terminal device" such as a smartphone. The method first obtains the terminal device's "available resource state" (for example, how much memory or processing power is free) or a "performance requirement." Based on that, it derives a "second neural network model" by reducing parts of the first model. Per claim 1, the second model can have fewer "attention heads" in its "transformer layers," fewer "neurons" in its "intermediate layers," or fewer "transformer layers" overall. For example, if a phone has limited memory, the system might remove some attention heads from the original model to produce a smaller, faster version that still works well on that phone.
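To make the three reductions concrete, the sketch below estimates how many weights a transformer has and how each reduction changes that count. The `TransformerShape` class, the `approx_params` helper, and all of the sizes are illustrative assumptions, not taken from the patent. The estimate ignores biases, embeddings, and layer normalization.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TransformerShape:
    """Hypothetical description of a transformer's size (not from the patent)."""
    num_layers: int
    num_heads: int
    head_dim: int
    hidden_size: int
    intermediate_size: int


def approx_params(shape: TransformerShape) -> int:
    """Rough weight count, ignoring biases, embeddings, and layer norms."""
    attn_width = shape.num_heads * shape.head_dim
    # Q, K, V projections (hidden -> attn_width) plus the output projection back.
    attention = 4 * shape.hidden_size * attn_width
    # Two feed-forward matrices: hidden -> intermediate -> hidden.
    feed_forward = 2 * shape.hidden_size * shape.intermediate_size
    return shape.num_layers * (attention + feed_forward)


# Assumed sizes for the "first neural network model".
first_model = TransformerShape(num_layers=12, num_heads=12, head_dim=64,
                               hidden_size=768, intermediate_size=3072)

# Each candidate "second model" applies one of the three reductions.
fewer_heads = replace(first_model, num_heads=6)
fewer_neurons = replace(first_model, intermediate_size=1536)
fewer_layers = replace(first_model, num_layers=6)

for name, shape in [("first model", first_model),
                    ("fewer heads", fewer_heads),
                    ("fewer neurons", fewer_neurons),
                    ("fewer layers", fewer_layers)]:
    print(f"{name:>14}: {approx_params(shape):,} weights")
```

With these assumed sizes, halving the heads, the intermediate neurons, or the layers removes roughly 17%, 33%, and 50% of the counted weights, respectively.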

## What it does NOT cover

- Does not cover increasing the size of a neural network model based on available resources.
- Does not cover adapting non-transformer neural network architectures.
- Does not cover model adaptation methods that change the type of layers or neurons, only their quantity.
- Does not cover methods that pick the components to remove at random; claim 5 suggests a capability-based selection instead.
- Does not cover adapting models by changing their data types (e.g., from 32-bit to 16-bit floating point).
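As an illustration of what a capability-based (rather than random) selection could look like, the sketch below keeps the attention heads with the highest scores. The `select_heads` function and the per-head scores are hypothetical; this summary does not say how the patent measures capability.

```python
def select_heads(importance: list[float], keep: int) -> list[int]:
    """Return the indices of the `keep` highest-scoring heads, in original order."""
    if not 0 < keep <= len(importance):
        raise ValueError("keep must be between 1 and the number of heads")
    # Rank head indices from most to least important.
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    # Keep the top `keep`, restoring their original positions in the layer.
    return sorted(ranked[:keep])


# Hypothetical scores for one transformer layer with 8 attention heads.
scores = [0.91, 0.12, 0.55, 0.08, 0.73, 0.30, 0.66, 0.02]
kept = select_heads(scores, keep=4)
print(kept)  # [0, 2, 4, 6]
```

A random scheme would ignore `scores` entirely. Here the four lowest-scoring heads are the ones dropped.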

## The clever bit

The novelty lies in systematically creating a family of smaller, efficient neural network models from a single larger model by selectively reducing specific structural components (attention heads, neurons, layers) based on a device's real-time resource availability. This allows for dynamic, on-the-fly adaptation rather than needing to pre-train and store many different model sizes.
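The on-the-fly choice can be pictured as a lookup over a family of sub-models that all derive from one parent. The following is a minimal sketch under stated assumptions: the `SubModel` entries, their memory footprints, and the `pick_sub_model` function are invented for illustration and do not come from the patent.

```python
from typing import NamedTuple, Optional


class SubModel(NamedTuple):
    """Hypothetical entry in a family of sub-models cut from one parent model."""
    name: str
    num_layers: int
    num_heads: int
    intermediate_size: int
    memory_mb: int  # assumed footprint, estimated ahead of time


# Ordered from largest to smallest; all numbers are made up for illustration.
FAMILY = [
    SubModel("full", 12, 12, 3072, 420),
    SubModel("medium", 12, 8, 2048, 290),
    SubModel("small", 6, 8, 2048, 160),
    SubModel("tiny", 6, 4, 1024, 90),
]


def pick_sub_model(available_mb: int) -> Optional[SubModel]:
    """Return the largest sub-model that fits in memory, or None if none does."""
    for candidate in FAMILY:
        if candidate.memory_mb <= available_mb:
            return candidate
    return None


# The device reports its available resource state; the choice follows from it.
print(pick_sub_model(300))
print(pick_sub_model(50))
```

In this sketch, a device reporting 300 MB free gets the "medium" entry, and one reporting 50 MB gets no model at all. A real system would also have to weigh the "performance requirement" mentioned above, which this lookup leaves out.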

## Real-world examples

1. On-device AI assistants (e.g., voice recognition on a smartphone)
2. Image processing on mobile cameras
3. Personalized recommendations on edge devices
4. AI-powered features in smartwatches
5. Federated learning clients

## Why it matters

This technology is crucial for deploying complex AI models, especially large language models, on devices with limited computing power, such as smartphones, smart home devices, and IoT sensors. It allows these devices to perform advanced AI tasks locally without constantly relying on powerful cloud servers, improving speed, privacy, and offline functionality. Huawei, as a major device manufacturer, would benefit from efficient on-device AI.

## Frequently asked questions

### What does Adapting AI Models to Fit Device Resources cover?

It covers a method for automatically shrinking a transformer-type AI model so that it fits the computing resources of a phone or other device. The shrinking works by reducing the number of attention heads, intermediate-layer neurons, or transformer layers.

### Who owns patent US 20220383078?

This patent is owned by Huawei Technologies Co.

### When does this patent expire?

This patent is expected to expire on August 8, 2042, when the invention enters the public domain.

### What is patent US 20220383078 cited by?

This patent has been cited by 6 later patents that build on its ideas.

### What problem does this patent solve?

Complex AI models, especially large language models, often need more computing power than smartphones, smart home devices, and IoT sensors can provide. The patent addresses this by cutting the model down to match the device, so that advanced AI tasks can run locally without constantly relying on powerful cloud servers. That can improve speed, privacy, and offline functionality.

### What does this patent NOT cover?

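It does not cover increasing the size of a neural network model based on available resources, adapting non-transformer architectures, changing the type of layers or neurons rather than their quantity, selecting components to remove at random, or changing a model's data types (for example, from 32-bit to 16-bit floating point).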

**Full plain-English explainer:** https://patentbrief.org/patent/us/20220383078/data-processing-method-and-related-device

**Original patent:** https://patents.google.com/patent/US20220383078

---

_Source: PatentBrief — https://patentbrief.org. Patent facts are from public records; the plain-English explanation is PatentBrief's._


## Related patents

Semantically similar inventions in the PatentBrief corpus:

- [How to Update AI on Small Devices with Slow Internet](https://patentbrief.org/patent/us/20250363357/systems-and-methods-for-deploying-and-updating-neural-networks-at-the-edge-of-a-) — This patent describes a method for efficiently updating artificial intelligence models on small, internet-connected devices, like smart cameras, by sending only the changes, or 'patches,' instead of the entire updated model, which saves bandwidth.
- [How to Automatically Expand Neural Networks by Adding New Nodes](https://patentbrief.org/patent/us/10832138/gpt-language-model-pre-training) — A method for growing artificial intelligence models by identifying underperforming parts of a network and adding new nodes based on the behavior of existing ones.
- [How to Shrink Large AI Models Using Knowledge Distillation](https://patentbrief.org/patent/us/10289962/deep-q-networks-dqn) — A method for teaching small, efficient AI models to mimic the complex decision-making patterns of much larger, more powerful neural networks.
- [How to Make AI Run Faster on Smaller Computer Chips](https://patentbrief.org/patent/us/10373050/tensor-processing-unit-tpu) — A method to shrink complex AI models by converting their high-precision math into simpler, faster formats that run efficiently on mobile devices.
- [How Projection Neural Networks Speed Up AI Predictions](https://patentbrief.org/patent/us/11544573/llama-large-language-model-architecture) — A method for making artificial intelligence models faster and more efficient by using fixed, non-trainable projections to simplify complex data before processing.