# Making Neural Networks Faster by Skipping Unnecessary Calculations

> A method to speed up neural network training by keeping tensors sparse, so that zero values are skipped and memory and processing power are saved during both the forward and backward passes.

- **Patent:** US 11429864
- **Original title:** System and method for bank-balanced sparse activation and joint-activation-weight-sparse training of neural networks
- **Owner:** Moffett International Co Ltd Hong Kong
- **Granted:** 2022
- **Status:** Active
- **Times cited:** 1
- **Field:** AI/ML, semiconductors, software

## What it does

This patent describes a way to train neural networks more efficiently by keeping tensors (multidimensional arrays of numbers) sparse. A conventional training step computes every value in the network, including many zeros that do not change the outcome; this system skips those zeros instead. During the forward pass, it generates a dense output and immediately sparsifies it. During the backward pass, it uses the resulting sparse tensors to calculate gradients, so training concentrates on the most significant values. To decide which values are important enough to keep, it uses a bank-based 'top-K' activation function, which effectively prunes the network during training itself.
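As a rough illustration of what a bank-based top-K selection could look like, the sketch below splits a flat list of values into equal-sized banks and keeps only the K largest-magnitude values in each bank. The function name `bank_topk`, the parameters `bank_size` and `k`, and the choice of magnitude as the measure of importance are assumptions made for this example; they are not taken from the patent text.

```python
def bank_topk(values, bank_size, k):
    """Keep the k largest-magnitude entries in each bank; zero the rest.

    Hypothetical helper for illustration only, not the patented procedure.
    """
    if bank_size <= 0 or len(values) % bank_size != 0:
        raise ValueError("len(values) must be a positive multiple of bank_size")
    if not 0 <= k <= bank_size:
        raise ValueError("k must be between 0 and bank_size")

    out = [0.0] * len(values)
    for start in range(0, len(values), bank_size):
        bank = range(start, start + bank_size)
        # Indices of the k largest |value| in this bank.
        # sorted() is stable, so ties go to the earlier position.
        keep = sorted(bank, key=lambda i: abs(values[i]), reverse=True)[:k]
        for i in keep:
            out[i] = values[i]
    return out


dense = [0.1, -2.0, 0.3, 1.5, 0.0, 0.7, -0.2, 0.05]
sparse = bank_topk(dense, bank_size=4, k=2)
print(sparse)  # [0.0, -2.0, 0.0, 1.5, 0.0, 0.7, -0.2, 0.0]
```

Because every bank ends up with the same number of non-zero values, the work per bank is uniform, which is presumably what "bank-balanced" in the original title refers to.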

## What it does NOT cover

- Does not cover training methods that do not use sparse tensors or sparsity-based pruning.
- Does not cover standard dense neural network training where all weights are updated equally.
- Does not cover hardware-specific implementations that do not utilize the described bank-based top-K selection logic.
- Does not cover non-neural network machine learning models like linear regression or decision trees.

## The clever bit

The patent applies sparsity not just to the forward pass but also to the backward pass (gradient calculation), using a bank-based top-K selection so that the entire training loop, not only inference, benefits from skipping zeros.
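To make the idea of sparsity in both directions concrete, here is a minimal sketch for a single linear layer. It assumes a hypothetical `bank_topk` helper that keeps the K largest-magnitude values per bank, passes gradients only through the outputs that survived the forward sparsification, and then sparsifies the input gradient the same way. All names and the exact order of steps are assumptions for illustration, not the claimed method, and weight-gradient updates are left out for brevity.

```python
def bank_topk(values, bank_size, k):
    """Hypothetical helper: keep the k largest-magnitude entries per bank."""
    out = [0.0] * len(values)
    for start in range(0, len(values), bank_size):
        bank = range(start, start + bank_size)
        for i in sorted(bank, key=lambda i: abs(values[i]), reverse=True)[:k]:
            out[i] = values[i]
    return out


def matvec(matrix, vec):
    """Plain dense matrix-vector product on nested lists."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def forward(weights, x, bank_size, k):
    """Compute a dense output, then immediately sparsify it."""
    return bank_topk(matvec(weights, x), bank_size, k)


def backward(weights, y_sparse, grad_y, bank_size, k):
    """Assumed backward step: mask, backpropagate, then sparsify again."""
    # Only outputs that survived the forward top-K receive gradient.
    masked = [g if y != 0.0 else 0.0 for y, g in zip(y_sparse, grad_y)]
    transposed = [list(col) for col in zip(*weights)]
    return bank_topk(matvec(transposed, masked), bank_size, k)


W = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, -1.0, 0.0, 0.5],
    [0.5, 0.0, 0.0, 0.0],
    [0.25, 3.0, 0.0, 0.0],
]
x = [1.0, 2.0, 0.0, 0.0]

y = forward(W, x, bank_size=4, k=2)
grad_x = backward(W, y, [1.0, 1.0, 1.0, 1.0], bank_size=4, k=2)
print(y)       # [0.0, -2.0, 0.0, 6.25]
print(grad_x)  # [0.0, 2.0, 0.0, 0.5]
```

In this toy run the dense input gradient would be `[0.25, 2.0, 0.0, 0.5]`; the second sparsification drops the smallest entry, so the tensor passed further back is as sparse as the forward activations.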

## Real-world examples

1. Efficient training of deep learning models on edge AI hardware
2. Pruning-aware neural network training frameworks
3. Optimized GPU-based model training pipelines

## Why it matters

Training large AI models is incredibly expensive and energy-intensive. By reducing the number of active parameters through sparsity, this method allows for faster training cycles and lower memory requirements, which is essential for deploying large models on hardware with limited resources like mobile devices or edge servers.

## Frequently asked questions

### What does Making Neural Networks Faster by Skipping Unnecessary Calculations cover?

A method to speed up neural network training by keeping tensors sparse, so that zero values are skipped and memory and processing power are saved during both the forward and backward passes.

### Who owns patent US 11429864?

Moffett International Co Ltd Hong Kong owns this patent, granted in 2022.

### When does this patent expire?

This patent is expected to expire on August 30, 2042, when the invention enters the public domain.

### What is patent US 11429864 cited by?

This patent has been cited by 1 later patent that builds on its ideas.

### What problem does this patent solve?

Training large AI models is incredibly expensive and energy-intensive. By reducing the number of active parameters through sparsity, this method allows for faster training cycles and lower memory requirements, which is essential for deploying large models on hardware with limited resources like mobile devices or edge servers.

### What does this patent NOT cover?

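It does not cover training methods that avoid sparse tensors or sparsity-based pruning, standard dense neural network training where all weights are updated equally, hardware-specific implementations that do not use the described bank-based top-K selection logic, or non-neural-network models such as linear regression or decision trees.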

**Full plain-English explainer:** https://patentbrief.org/patent/us/11429864/dlss-deep-learning-super-sampling

**Original patent:** https://patents.google.com/patent/US11429864

---

_Source: PatentBrief — https://patentbrief.org. Patent facts are from public records; the plain-English explanation is PatentBrief's._
