# Making Neural Networks Faster by Skipping Unnecessary Calculations

> A method to speed up AI training by keeping data sparse, meaning it ignores zeros to save memory and processing power during both forward and backward passes.

- **Patent:** US 11429864
- **Original title:** System and method for bank-balanced sparse activation and joint-activation-weight-sparse training of neural networks
- **Owner:** Moffett International Co Ltd Hong Kong
- **Granted:** 2022
- **Status:** Active
- **Times cited:** 1
- **Field:** AI/ML, semiconductors, software

## What it does

This patent describes a way to train neural networks more efficiently by keeping tensors (multidimensional arrays of numbers) sparse. A conventional training step computes every value in the network, including many zeros that do not change the outcome; this system skips them. During the forward pass, it computes a dense output and immediately sparsifies it. During the backward pass, it uses the sparse tensors to calculate gradients, so training focuses on the most significant values. A 'top-K' activation function decides which values are important enough to keep, which effectively prunes the network during training itself.
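The forward and backward passes described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the patent's claimed implementation: the function names (`topk_mask`, `forward`, `backward`), the single linear layer, the choice to rank values by magnitude, and the masking of weight gradients are all hypothetical.

```python
import numpy as np

def topk_mask(x, k):
    """Boolean mask that keeps the k largest-magnitude entries in each row of x."""
    # argpartition leaves the indices of the k largest |x| values in the last k slots.
    idx = np.argpartition(np.abs(x), -k, axis=1)[:, -k:]
    mask = np.zeros(x.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return mask

def forward(a_sparse, w_sparse, k):
    """Sparse inputs times sparse weights give a dense output, which is sparsified at once."""
    dense_out = a_sparse @ w_sparse
    mask = topk_mask(dense_out, k)
    return dense_out * mask, mask

def backward(grad_out, mask, a_sparse, w_sparse, w_mask):
    """Gradients computed from the sparse tensors saved during the forward pass."""
    grad_out = grad_out * mask                  # only kept outputs receive gradient
    grad_w = (a_sparse.T @ grad_out) * w_mask   # assumption: pruned weights stay pruned
    grad_a = grad_out @ w_sparse.T
    return grad_a, grad_w

# Usage with random data (shapes and k values are arbitrary).
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 8))
a = a * topk_mask(a, 3)                         # sparse input activations
w = rng.standard_normal((8, 6))
w_mask = topk_mask(w, 2)
w = w * w_mask                                  # sparse weights
out, mask = forward(a, w, k=2)
grad_a, grad_w = backward(np.ones_like(out), mask, a, w, w_mask)
```

The sketch uses dense arrays with zeros for readability; any real savings would come from storage formats and kernels that skip the zeroed entries, which are not shown here.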

## What it does NOT cover

- Does not cover training methods that do not use sparse tensors or sparsity-based pruning.
- Does not cover standard dense neural network training where all weights are updated equally.
- Does not cover hardware-specific implementations that do not utilize the described bank-based top-K selection logic.
- Does not cover non-neural network machine learning models like linear regression or decision trees.

## The clever bit

The patent cleverly applies sparsity not just to the forward pass, but also to the backward pass (gradient calculation), using a bank-based top-K selection to maintain efficiency across the entire training loop.
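One way to picture bank-based top-K selection: split each row into equal-sized banks and keep the same number of values in every bank, instead of taking the largest values across the whole row. The sketch below assumes magnitude-based ranking and row-wise banks; the function name `bank_balanced_topk` and its parameters are hypothetical and are not taken from the patent text.

```python
import numpy as np

def bank_balanced_topk(x, num_banks, k_per_bank):
    """Keep the k_per_bank largest-magnitude values in each bank of each row; zero the rest."""
    rows, cols = x.shape
    if cols % num_banks != 0:
        raise ValueError("row length must be divisible by num_banks")
    bank_size = cols // num_banks
    if not 1 <= k_per_bank <= bank_size:
        raise ValueError("k_per_bank must be between 1 and the bank size")
    # View each row as num_banks consecutive banks of bank_size values.
    banks = x.reshape(rows, num_banks, bank_size)
    idx = np.argpartition(np.abs(banks), -k_per_bank, axis=2)[:, :, -k_per_bank:]
    mask = np.zeros(banks.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=2)
    return (banks * mask).reshape(rows, cols)

x = np.array([[1.0, -5.0, 2.0, 0.5, 9.0, 8.0, 7.0, -10.0]])
balanced = bank_balanced_topk(x, num_banks=2, k_per_bank=1)
# balanced keeps -5 from the first bank and -10 from the second:
# [[0, -5, 0, 0, 0, 0, 0, -10]]
```

A global top-2 over the same row would keep 9 and -10, both in the second bank. The balanced version gives every bank the same number of non-zeros, which is what lets work be divided evenly across parallel hardware units.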

## Real-world examples

1. Efficient training of deep learning models on edge AI hardware
2. Pruning-aware neural network training frameworks
3. Optimized GPU-based model training pipelines

## Why it matters

Training large AI models is expensive and energy-intensive. By keeping tensors sparse so that fewer values have to be computed and stored, this method aims for faster training cycles and lower memory requirements, which matters for hardware with limited resources such as mobile devices or edge servers.

## Frequently asked questions

### What does Making Neural Networks Faster by Skipping Unnecessary Calculations cover?

A method to speed up AI training by keeping data sparse, meaning it ignores zeros to save memory and processing power during both forward and backward passes.

### Who owns patent US 11429864?

Moffett International Co Ltd Hong Kong owns this patent, granted in 2022.

### When does this patent expire?

This patent is expected to expire on August 30, 2042, when the invention enters the public domain.

### What is patent US 11429864 cited by?

This patent has been cited by 1 later patent that builds on its ideas.

### What problem does this patent solve?

Training large AI models is expensive and energy-intensive. By keeping tensors sparse so that fewer values have to be computed and stored, this method aims for faster training cycles and lower memory requirements, which matters for hardware with limited resources such as mobile devices or edge servers.

### What does this patent NOT cover?

Does not cover training methods that do not use sparse tensors or sparsity-based pruning.

**Full plain-English explainer:** https://patentbrief.org/patent/us/11429864/dlss-deep-learning-super-sampling

**Original patent:** https://patents.google.com/patent/US11429864

---

_Source: PatentBrief — https://patentbrief.org. Patent facts are from public records; the plain-English explanation is PatentBrief's._


## Related patents

Semantically similar inventions in the PatentBrief corpus:

- [How to Fix Faulty Memory Cells in AI Chips](https://patentbrief.org/patent/us/10956815/killing-asymmetric-resistive-processing-units-for-neural-network-training) — This patent describes a system that tests individual memory cells in AI chips for uneven behavior and then permanently disables the faulty ones before the chip starts learning, making AI training more efficient.
- [How Projection Neural Networks Speed Up AI Predictions](https://patentbrief.org/patent/us/11544573/llama-large-language-model-architecture) — A method for making artificial intelligence models faster and more efficient by using fixed, non-trainable projections to simplify complex data before processing.
- [Making AI Smarter by Focusing on Unsure 'Nodes'](https://patentbrief.org/patent/us/12423586/training-nodes-of-a-neural-network-to-be-decisive) — This 2025 patent from D5AI LLC describes a way to train AI models more effectively by boosting the learning signal for 'nodes' that aren't making clear decisions on data.
- [How to Speed Up Neural Network Training Using Momentum](https://patentbrief.org/patent/us/4914603/training-neural-networks) — A 1988 method for training artificial neural networks that uses an 'activating variable' to speed up how quickly the network learns from its mistakes.
- [How to Automatically Expand Neural Networks by Adding New Nodes](https://patentbrief.org/patent/us/10832138/gpt-language-model-pre-training) — A method for growing artificial intelligence models by identifying underperforming parts of a network and adding new nodes based on the behavior of existing ones.