How Computers Shrink Data by Finding Repeated Patterns
This patent describes a method for compressing data by finding the longest repeating sequences of characters, assigning them short codes, and building a dictionary of these sequences for both compression and decompression.
Patent Number
US 4558302
Status
Active
Filing Date
June 20, 1983
Grant Date
December 10, 1985
Expiration
~June 2003 (estimated)
Claims
183
Assignee
Sperry Corp
Inventors
Terry A. Welch
Citations
347 forward · 1 backward
What it covers
The patent details a system for compressing "data character signals" (Claim 1). First, it builds a "string table" (Claim 1) that stores sequences of characters it has seen before. When processing new data, it searches for the "longest match" (Claim 1) between the incoming data and the strings already in its table. Once the longest match is determined, the system sends a short "code signal" (Claim 1) representing that matched string instead of the actual characters. It then adds a new, slightly longer string to its table, formed by taking the "longest match" and adding the very next character from the input data (Claim 1). For example, if the input is "banana" and "ban" is the longest match, it sends the code for "ban" and then adds "bana" to its table. Decompression works by receiving these codes and rebuilding a similar string table to translate the codes back into the original character sequences.
What it doesn't cover
- —Compression methods that do not use a "string table" to store encountered data character signals (Claim 1).
- —Compression methods that do not search for the "longest match" between input data and stored strings (Claim 1).
- —Compression methods that do not assign a unique "code signal" to each stored string for transmission (Claim 1).
- —Compression methods that do not build an "extended string" by adding the next character to the longest match (Claim 1).
- —Compression methods that do not involve "data character signals" (Claim 1), implying it's not for raw binary data without character interpretation.
The clever bit
The core innovation is the dynamic creation of a dictionary (the "string table") on the fly, where frequently occurring sequences of characters are identified and replaced with shorter codes. This adaptive approach, combined with efficiently searching and updating the dictionary using a "limited search hashing procedure" (Abstract, Claim 5), allows for effective compression without needing a pre-defined dictionary.
Why it matters
This patent is a foundational work in dictionary-based data compression, specifically relating to the LZW (Lempel-Ziv-Welch) algorithm. LZW became widely adopted for its efficiency and simplicity, influencing many common file formats and protocols. Its commercial impact was significant, leading to widespread use in image compression, archive formats, and network protocols.
Real-world examples
- 1.GIF image format
- 2.TIFF image format
- 3.Some forms of ZIP compression
- 4.Early modem data compression
Generated by PatentBrief · Not legal advice · patentbrief.org
US 4558302 · 2026