# How to Make AI Run Faster on Smaller Computer Chips

> A method to shrink complex AI models by converting their high-precision math into simpler, faster formats that run efficiently on mobile devices.

- **Patent:** US 10373050
- **Original title:** Fixed point neural network based on floating point neural network quantization
- **Owner:** Qualcomm Inc
- **Granted:** 2019
- **Status:** Active
- **Times cited:** 17
- **Field:** semiconductors, AI/ML, consumer electronics, telecommunications

## What it does

This patent describes a way to compress neural networks so they can run on hardware with limited compute and power budgets, such as smartphones. It converts 'floating point' numbers, which offer fine precision over a wide range but are costly to compute with, into 'fixed point' numbers, which a processor can handle with simpler integer arithmetic. The core mechanism analyzes statistical 'moments' (such as the mean or the spread) of the data flowing through the network and uses them to set the quantization parameters. By shifting those values to create a 'zero-mean distribution' and adjusting the network's functions to compensate, the method aims to keep the model accurate after it has been simplified.
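The idea of moment-based quantization can be illustrated with a minimal sketch. This is not the patent's claimed procedure: the function names, the 8-bit width, and the choice of a clipping range of three standard deviations (`num_std`) are all assumptions made for illustration.

```python
import statistics

def quantize_with_moments(values, bits=8, num_std=3.0):
    """Quantize floats to signed fixed-point codes using the mean and standard deviation.

    bits and num_std are illustrative assumptions, not values from the patent.
    Returns (codes, mean, step), where step is the real value of one code unit.
    """
    mean = statistics.fmean(values)      # first moment: centre of the distribution
    std = statistics.pstdev(values)      # spread, derived from the second moment
    qmax = 2 ** (bits - 1) - 1           # largest signed code, 127 for 8 bits
    clip = num_std * std or 1.0          # representable range is [-clip, +clip]
    step = clip / qmax
    codes = []
    for v in values:
        q = round((v - mean) / step)     # shift to zero mean, then scale and round
        codes.append(max(-qmax, min(qmax, q)))  # saturate values outside the range
    return codes, mean, step

def dequantize(codes, mean, step):
    """Map fixed-point codes back to approximate floating point values."""
    return [q * step + mean for q in codes]

activations = [4.0, 4.5, 5.0, 5.5, 6.0]
codes, mean, step = quantize_with_moments(activations)
restored = dequantize(codes, mean, step)
max_error = max(abs(a - r) for a, r in zip(activations, restored))
print(codes, mean, step, max_error)
```

In this sketch the stored integers describe only the deviation from the mean, so the mean has to be carried separately (here as a return value) and added back, which loosely mirrors the "adjusting the functions" step described above.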

## What it does NOT cover

- Does not cover general neural network training methods that do not involve quantization.
- Does not cover hardware-specific circuit designs for performing multiplication or addition.
- Does not cover techniques that use retraining or fine-tuning to recover accuracy after quantization.
- Does not cover non-neural network machine learning models like support vector machines.

## The clever bit

Instead of just rounding numbers, the patent shifts the entire distribution of data so it is centered on zero. A fixed number of bits then only has to cover the spread of the data rather than its offset, which reduces the error introduced when switching from high-precision to low-precision math.
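A small numeric comparison suggests why centering helps. The sketch below is an assumption-laden illustration, not the patent's method: it uses a hypothetical helper `max_roundtrip_error`, an 8-bit symmetric grid, and made-up sample values that sit far from zero.

```python
def max_roundtrip_error(values, offset, bits=8):
    """Worst-case error after quantizing (value - offset) on a symmetric signed grid."""
    qmax = 2 ** (bits - 1) - 1
    shifted = [v - offset for v in values]
    # The grid must reach the largest magnitude, so a big offset forces a coarse step.
    step = (max(abs(s) for s in shifted) or 1.0) / qmax
    restored = [round(s / step) * step + offset for s in shifted]
    return max(abs(v - r) for v, r in zip(values, restored))

values = [100.0, 100.25, 100.5, 100.75, 101.0]
mean = sum(values) / len(values)

raw_error = max_roundtrip_error(values, offset=0.0)        # grid spans [-101, 101]
centered_error = max_roundtrip_error(values, offset=mean)  # grid spans [-0.5, 0.5]
print(raw_error, centered_error)
```

With the same bit budget, the centered version spends all of its codes on the narrow band where the data actually lies, so its worst-case error comes out much smaller than the uncentered one for this sample.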

## Real-world examples

1. Qualcomm Snapdragon mobile processors
2. On-device AI image processing
3. Mobile voice assistants
4. Edge computing AI accelerators

## Why it matters

As AI models grew larger, they became too demanding for mobile processors to run in real time. This patent describes a systematic way to convert those models so that mobile chips, such as Qualcomm's, can handle tasks like image recognition or voice processing with less battery drain and heat.

## Frequently asked questions

### What does patent US 10373050 cover?

A method to shrink complex AI models by converting their high-precision math into simpler, faster formats that run efficiently on mobile devices.

### Who owns patent US 10373050?

Qualcomm Inc owns this patent, granted in 2019.

### When does this patent expire?

This patent is expected to expire on August 6, 2039, after which the invention enters the public domain.

### How many later patents cite US 10373050?

This patent has been cited by 17 later patents that build on its ideas.

### What problem does this patent solve?

As AI models grew larger, they became too demanding for mobile processors to run in real time. This patent describes a systematic way to convert those models so that mobile chips, such as Qualcomm's, can handle tasks like image recognition or voice processing with less battery drain and heat.

### What does this patent NOT cover?

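It does not cover general neural network training methods that do not involve quantization, hardware-specific circuit designs for multiplication or addition, techniques that rely on retraining or fine-tuning to recover accuracy after quantization, or machine learning models that are not neural networks, such as support vector machines.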

**Full plain-English explainer:** https://patentbrief.org/patent/us/10373050/tensor-processing-unit-tpu

**Original patent:** https://patents.google.com/patent/US10373050

---

_Source: PatentBrief — https://patentbrief.org. Patent facts are from public records; the plain-English explanation is PatentBrief's._
