Stony Brook Algorithm Repository


Text Compression

Input
Output

Input Description: A text string \(S\).
Problem: A shortest text string \(S'\) such that \(S\) can be reconstructed from \(S'\).

Excerpt from The Algorithm Design Manual: Secondary storage devices fill up quickly on every computer system, even though their capacity doubles each year. Decreasing storage prices have only increased interest in data compression, since there is now more data to compress than ever before. Data compression is the algorithmic problem of finding alternative, space-efficient encodings for a given data file. With the rise of computer networks, a new mission for data compression has arisen, that of increasing the effective bandwidth of networks by reducing the number of bits before transmission.

Data compression is a problem for which practical people like to invent ad hoc methods, designed for their particular applications. Sometimes these outperform general methods, but often they do not.


Implementations

GZIP (rating 10)
FiniteStateEntropy (rating 8)
pigz (rating 8)
LZMA-JS (rating 7)
brieflz (rating 5)
huffman (rating 5)
JPEG (rating 3)
Netlib (rating 2)


Recommended Books

Introduction to Algorithms by T. Cormen and C. Leiserson and R. Rivest and C. Stein Text Compression by Timothy C. Bell, Ian H. Witten, and John Cleary Introduction to Algorithms by U. Manber
Data compression: methods and theory by J. Storer Data Structures and Algorithms by A. Aho and J. Hopcroft and J. Ullman Graph Algorithms by S. Even

Related Problems


Cryptography

Discrete Fourier Transform

Shortest Common Superstring

Go To Main Page