How to Fix Faulty Memory Cells in AI Chips

This patent describes a system that tests individual memory cells in AI chips for asymmetric responses to positive and negative electrical pulses, then permanently disables the faulty ones before training begins, so that only well-behaved cells take part in learning.

Patent Number

US 10956815

Status

Active

Filing Date

May 31, 2017

Grant Date

March 23, 2021

Expiration

May 31, 2037

Claims

Assignee

International Business Machines

Inventors

Tayfun Gokmen

Citations

3 forward · 50 backward

What it covers

The patent describes a system for improving neural network training on specialized hardware called Resistive Processing Unit (RPU) arrays. An RPU array has many tiny memory cells (RPUs) arranged in a grid, where each RPU's electrical state (its "conduction state") stores one of the neural network's weights. Before training begins, a controller measures each RPU's "asymmetry value" by sending a positive electrical pulse and a negative electrical pulse and comparing how much the RPU's conduction state changes for each (Claim 1). If an RPU's asymmetry is too high, meaning its conduction state changes by noticeably different amounts depending on the pulse polarity, the system "burns" it by applying a high voltage to permanently disable it (Claim 2, Claim 6). This ensures only reliable RPUs are used for training.
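The screening step described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the `SimulatedRPU` class, the normalized asymmetry formula, the 0.2 threshold, and the `burned` flag (standing in for the high-voltage burn) are all hypothetical choices made for illustration, since this summary does not specify how the asymmetry value is computed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SimulatedRPU:
    """Hypothetical stand-in for one RPU cell (not taken from the patent)."""
    conductance: float
    step_up: float    # conductance gained per positive pulse
    step_down: float  # conductance lost per negative pulse
    burned: bool = False

    def pulse(self, polarity: int) -> None:
        """Apply one positive (+1) or negative (-1) pulse."""
        if self.burned:
            return  # a burned cell no longer responds
        if polarity > 0:
            self.conductance += self.step_up
        else:
            self.conductance -= self.step_down


def measure_asymmetry(rpu: SimulatedRPU) -> float:
    """Send one positive and one negative pulse and compare the changes.

    Assumed metric: |up - down| / (up + down), so 0.0 means symmetric.
    """
    start = rpu.conductance
    rpu.pulse(+1)
    after_up = rpu.conductance
    rpu.pulse(-1)
    delta_up = after_up - start
    delta_down = after_up - rpu.conductance
    total = delta_up + delta_down
    if total == 0:
        return 0.0  # assumption: a non-responsive cell is not flagged here
    return abs(delta_up - delta_down) / total


def screen_array(array: List[List[SimulatedRPU]],
                 threshold: float) -> List[Tuple[int, int]]:
    """Disable every cell whose asymmetry exceeds the threshold.

    Returns the (row, column) positions of the disabled cells.
    """
    burned = []
    for r, row in enumerate(array):
        for c, rpu in enumerate(row):
            if measure_asymmetry(rpu) > threshold:
                rpu.burned = True  # stands in for the high-voltage burn
                burned.append((r, c))
    return burned


# A 2x2 array in which only the cell at row 0, column 1 is strongly asymmetric.
array = [
    [SimulatedRPU(1.0, 0.125, 0.125), SimulatedRPU(1.0, 0.25, 0.0625)],
    [SimulatedRPU(1.0, 0.125, 0.1), SimulatedRPU(1.0, 0.125, 0.125)],
]
burned = screen_array(array, threshold=0.2)
print(burned)  # [(0, 1)]
```

In this sketch the screening pass runs once, before any training step, which mirrors the ordering the patent emphasizes. A real controller would measure conductance electrically and would likely need to average over several pulses to separate true asymmetry from noise; that detail is an assumption, not something stated in this summary.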

What it doesn't cover

- Does not cover systems that train neural networks on traditional silicon chips like CPUs or GPUs, as it specifically targets RPU arrays.
- Does not cover methods of improving RPU array performance that don't involve measuring and disabling asymmetric RPUs.
- Does not cover RPU arrays where faulty units are simply ignored or remapped instead of being physically "burned" by a high voltage.
- Does not cover identifying faulty RPUs *during* or *after* the neural network training process.
- Does not cover RPUs that store information without also locally performing data processing operations (Claim 9).

The clever bit

The novelty lies in proactively identifying and permanently disabling individual, unevenly behaving memory cells (RPUs) *before* the main AI training process even starts. This prevents faulty components from corrupting the learning process or requiring complex software workarounds later.

Why it matters

Training large neural networks is computationally intensive and energy-hungry. This patent addresses a fundamental challenge in building specialized AI hardware: the inherent imperfections of analog memory components like RPUs. By identifying and disabling faulty RPUs early, it aims to make these AI accelerators more reliable and efficient, potentially speeding up AI development and reducing power consumption for complex models.

Real-world examples

1. IBM's AI hardware accelerators
2. Neuromorphic computing chips
3. In-memory computing architectures
4. Specialized AI training hardware

Generated by PatentBrief · Not legal advice · patentbrief.org