Making Neural Networks Faster by Skipping Unnecessary Calculations
A method that speeds up AI training by keeping tensors sparse: it skips zero values to save memory and computation during both the forward and backward passes.
Patent Number
US 11429864
Status
Active
Filing Date
August 16, 2021
Grant Date
August 30, 2022
Expiration
~August 2041 (estimated)
Claims
20
Assignee
Moffett International Co Ltd Hong Kong
Inventors
Enxu Yan
Citations
1 forward · 2 backward
What it covers
This patent describes a way to train neural networks more efficiently by using sparse data structures. Many values in a neural network are zero and contribute nothing to the result, so the system keeps its tensors (multidimensional arrays of numbers) sparse and skips the computation on those zeros. During the forward pass, it generates a dense output and immediately sparsifies it. During the backward pass, it uses the resulting sparse tensors to calculate gradients, so training work is spent only on the values that were kept. A 'top-K' activation function decides which values are important enough to keep, which prunes the network during training itself.
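The forward and backward steps described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the patented implementation: the function names (`top_k_sparsify`, `forward`, `backward`) are hypothetical, "importance" is assumed to mean largest absolute value per row, and ordinary dense arrays containing zeros stand in for real sparse storage formats.

```python
import numpy as np

def top_k_sparsify(x, k):
    """Keep the k largest-magnitude entries in each row and zero the rest.

    Assumption: 'top-K' ranks entries by absolute value, row by row.
    """
    # Column indices of the k largest |values| in every row.
    idx = np.argpartition(np.abs(x), -k, axis=1)[:, -k:]
    mask = np.zeros_like(x, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return x * mask, mask

def forward(x_sparse, w_sparse, k):
    """Compute a dense output, then sparsify it immediately."""
    dense_out = x_sparse @ w_sparse
    return top_k_sparsify(dense_out, k)

def backward(grad_out, mask, x_sparse, w_sparse):
    """Propagate gradients only through the entries kept in the forward pass."""
    grad_sparse = grad_out * mask        # dropped entries receive no gradient
    grad_w = x_sparse.T @ grad_sparse    # gradient with respect to the weights
    grad_x = grad_sparse @ w_sparse.T    # gradient with respect to the inputs
    return grad_x, grad_w
```

For example, calling `forward(x, w, k=1)` on a batch keeps one value per output row, and passing the returned mask to `backward` ensures the gradient is nonzero only where the forward pass kept a value.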
What it doesn't cover
- Does not cover training methods that do not use sparse tensors or sparsity-based pruning.
- Does not cover standard dense neural network training where all weights are updated equally.
- Does not cover hardware-specific implementations that do not utilize the described bank-based top-K selection logic.
- Does not cover non-neural network machine learning models like linear regression or decision trees.
The clever bit
The patent applies sparsity to the backward pass (gradient calculation) as well as the forward pass, using a bank-based top-K selection to keep the entire training loop efficient.
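One plausible reading of "bank-based top-K selection" is sketched below. It assumes that each row is split into equal-sized banks and that the K largest-magnitude values are kept inside every bank, so that each bank ends up with the same number of nonzeros. The function name `bank_top_k` and the parameters `bank_size` and `k` are hypothetical, and the patent's actual bank layout and selection logic may differ.

```python
import numpy as np

def bank_top_k(x, bank_size, k):
    """Keep the k largest-magnitude values inside each bank of each row.

    Assumption: a 'bank' is a contiguous group of `bank_size` columns.
    """
    rows, cols = x.shape
    if cols % bank_size != 0:
        raise ValueError("number of columns must be a multiple of bank_size")
    if not 0 < k <= bank_size:
        raise ValueError("k must be between 1 and bank_size")

    # View each row as (number of banks, bank_size).
    banks = x.reshape(rows, cols // bank_size, bank_size)

    # Indices of the k largest |values| within every bank.
    idx = np.argpartition(np.abs(banks), -k, axis=2)[..., -k:]
    mask = np.zeros(banks.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=2)

    # Zero everything outside the selection and restore the original shape.
    return (banks * mask).reshape(rows, cols)
```

Selecting within fixed-size banks, rather than across a whole row, gives every bank the same number of nonzeros. Under this assumed layout, the work per bank is uniform, which tends to suit parallel hardware.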
Why it matters
Training large AI models is incredibly expensive and energy-intensive. By reducing the number of active parameters through sparsity, this method allows for faster training cycles and lower memory requirements, which is essential for deploying large models on hardware with limited resources like mobile devices or edge servers.
Real-world examples
1. Efficient training of deep learning models on edge AI hardware
2. Pruning-aware neural network training frameworks
3. Optimized GPU-based model training pipelines
Generated by PatentBrief · Not legal advice · patentbrief.org