Abstract
A recent development in text compression is a ‘block sorting’ algorithm that permutes the input text according to a special sort procedure and then processes the permuted text with Move-To-Front (MTF) and a final statistical compressor. The technique combines good speed with excellent compression performance. This paper investigates the fundamental operation of the algorithm and presents some improvements based on that analysis. Although block sorting is clearly related to previous compression techniques, it appears to be best described by methods derived from Shannon's work on the prediction and entropy of English text. A simple model is developed which relates the compression to the proportion of zeros after the MTF stage.
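The pipeline summarised above can be illustrated with a minimal sketch. It is not the implementation studied in this paper: the permutation is computed naively by sorting all rotations of the block, the initial MTF list is assumed to be the sorted set of symbols in the text, and the final statistical compressor is omitted. The function names `block_sort`, `move_to_front` and `zero_fraction` are hypothetical, chosen only for this illustration.

```python
from typing import List, Tuple


def block_sort(text: str) -> Tuple[str, int]:
    """Naive block sort: order all rotations of the text lexicographically.

    Returns the last column of the sorted rotation matrix and the row index
    of the original text (needed by a decoder to invert the permutation).
    Practical implementations use far faster sorting; this is a sketch.
    """
    n = len(text)
    # Each rotation is identified by its starting offset in the text.
    order = sorted(range(n), key=lambda i: text[i:] + text[:i])
    # The last symbol of the rotation starting at i is the symbol before i.
    last_column = "".join(text[(i - 1) % n] for i in order)
    return last_column, order.index(0)


def move_to_front(text: str) -> List[int]:
    """Replace each symbol by its current rank, then move it to the front.

    Assumption: the list starts as the sorted set of symbols in the text.
    """
    symbols = sorted(set(text))
    ranks = []
    for ch in text:
        rank = symbols.index(ch)
        ranks.append(rank)
        symbols.insert(0, symbols.pop(rank))
    return ranks


def zero_fraction(ranks: List[int]) -> float:
    """Proportion of zeros in the MTF output (0.0 for empty input)."""
    return ranks.count(0) / len(ranks) if ranks else 0.0


if __name__ == "__main__":
    permuted, index = block_sort("banana")
    ranks = move_to_front(permuted)
    print(permuted, index, ranks, zero_fraction(ranks))
```

For the block "banana" the sketch gives the permuted text "nnbaaa", MTF ranks [2, 0, 2, 2, 0, 0], and a zero proportion of 0.5. Runs of identical symbols in the permuted text become runs of zeros after MTF, which is the quantity the simple model relates to compression.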