Unbounded Length Contexts for PPM

Abstract
The PPM data compression scheme has set the performance standard in lossless compression of text throughout the past decade. PPM is a finite-context statistical modelling technique that can be viewed as blending together several fixed-order context models to predict the next character in the input sequence. This paper gives a brief introduction to PPM, and describes a variant of the algorithm, called PPM*, which exploits contexts of unbounded length. Although requiring considerably greater computational resources (in both time and space), PPM* reliably achieves compression superior to the benchmark PPMC version. Its major contribution is to show that the full information available from considering all substrings of the input string can be used effectively to generate high-quality predictions. Hence, it provides a useful tool for exploring the bounds of compression.
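To make the idea of "blending together several fixed-order context models" concrete, the following is a minimal Python sketch of a PPM-style predictor. It is not the authors' algorithm: it uses a simplified PPMC-style escape (escape weight equal to the number of distinct symbols seen in a context) and omits the exclusion mechanism used in production PPM implementations. The class name SimplePPM and all parameter choices are illustrative.

```python
from collections import defaultdict

class SimplePPM:
    """Toy PPM-style predictor blending fixed-order context models
    (orders 0..max_order) via an escape mechanism (no exclusions)."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        # counts[order][context][symbol] -> frequency of symbol after context
        self.counts = [defaultdict(lambda: defaultdict(int))
                       for _ in range(max_order + 1)]

    def update(self, history, symbol):
        """Record that `symbol` followed `history` in every order's model."""
        for order in range(min(self.max_order, len(history)) + 1):
            context = history[len(history) - order:]
            self.counts[order][context][symbol] += 1

    def predict(self, history, symbol, alphabet_size=256):
        """Estimate P(symbol | history): start at the longest matching
        context and escape to successively shorter ones, ending at a
        uniform order -1 model. The escape probability follows the PPMC
        convention: weight equal to the number of distinct symbols seen."""
        p = 1.0
        for order in range(min(self.max_order, len(history)), -1, -1):
            context = history[len(history) - order:]
            model = self.counts[order].get(context)
            if not model:
                continue  # unseen context: drop to a shorter one at no cost
            total = sum(model.values())
            distinct = len(model)
            if symbol in model:
                return p * model[symbol] / (total + distinct)
            p *= distinct / (total + distinct)  # pay the escape probability
        return p / alphabet_size  # order -1: uniform over the alphabet


# Train on a short string, then query a conditional probability.
model = SimplePPM(max_order=2)
text = "abracadabra"
for i, ch in enumerate(text):
    model.update(text[:i], ch)
print(model.predict("ab", "r"))  # P('r' | ..."ab")
```

In an actual compressor, the probabilities produced by predict() would drive an arithmetic coder. The sketch fixes max_order in advance; the point of PPM*, as described above, is to remove that bound and exploit contexts of unbounded length.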