Footnotes

A variant of one of our algorithms has been used successfully for several years in the virtual memory system of the Apple Newton, a personal digital assistant with no disk [SW91] (Walter Smith, personal communication 1994, 1997). While we have not previously published this algorithm, we sketched it for Smith and he used it in the Newton, with good results--it achieved slightly less compression than a Ziv-Lempel algorithm Apple had used previously, but was much faster. Unfortunately, we do not have any detailed performance comparisons.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

... false,²

It is worth stressing this again, because there is widespread confusion about the ``optimality'' of some compression algorithms. In general, an encoding scheme (such as Huffman coding or arithmetic coding) can be provably optimal within some small factor, but a compressor cannot, unless the regularities in the data are known in advance and in detail. Sometimes compression algorithms are proven optimal based on the simplifying assumption that the source is a stochastic (randomized, typically Markov) source, but real data sources in programs are generally not stochastic[WJNB95], so the proof does not hold for real data.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

... different.³

The 22/10 split was arrived at experimentally, using an early data set that partially overlaps the one used in this study. The effectiveness of the algorithm is not very sensitive to this parameter, however, and varying the split by 2 bits does not seem to make much difference--using more high bits means that matches are encoded more compactly, but somewhat fewer things match.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

...[Nel95].⁴

``Delta coding'' is something of a misnomer because it's really a modeling technique with an obvious encoding strategy.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

Scott F. Kaplan
1999-04-27