Comparison of Text Models for BWT
- 1 March 2007
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
The Burrows-Wheeler Transform (BWT) is a compression method that reorders an input string into a form better suited to a subsequent compression stage. Usually, the Move-To-Front transform and then Huffman coding are applied to the permuted string. This work compares the single file parsing methods used on input text files with the Burrows-Wheeler Transform for different languages (English, Czech, and German). Because present BWT-based methods use different block sizes and are each oriented to the compression of one element type, which makes mutual comparison harder, we modified the method so that it can compress using all required elements and uses a block size of 5 MB, which is larger than any test input file.
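The pipeline named in the abstract (BWT followed by Move-To-Front) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `bwt`, `move_to_front`, and `parse_words` are hypothetical, the rotation sort is the naive quadratic version rather than a suffix-array construction, the word parser is only one assumed example of an "element type", and the final Huffman stage is omitted.

```python
import re


def _sort_key(symbol):
    # The end marker (None) sorts before every real symbol.
    return (0, "") if symbol is None else (1, symbol)


def bwt(symbols):
    """Naive BWT over a sequence of elements (characters, words, ...).

    Appends a unique end marker (None), sorts all rotations, and returns
    the last column of the sorted rotation matrix.
    """
    seq = list(symbols) + [None]
    n = len(seq)
    order = sorted(
        range(n),
        key=lambda i: [_sort_key(s) for s in seq[i:] + seq[:i]],
    )
    # The last element of the rotation starting at i is seq[i - 1].
    return [seq[(i - 1) % n] for i in order]


def move_to_front(symbols):
    """Replace each symbol by its index in a self-organising list."""
    alphabet = sorted(set(symbols), key=_sort_key)
    ranks = []
    for s in symbols:
        idx = alphabet.index(s)
        ranks.append(idx)
        alphabet.insert(0, alphabet.pop(idx))
    return ranks


def parse_words(text):
    """Assumed word-level parsing: alternate word and non-word tokens."""
    return re.findall(r"\w+|\W+", text)


# Character-level model
last_column = bwt("banana")
ranks = move_to_front(last_column)

# Word-level model on the same pipeline
word_ranks = move_to_front(bwt(parse_words("to be or not to be")))
```

Because `bwt` accepts any sequence of hashable, comparable elements, the same code runs on characters or on words; only the parsing step changes. Runs of equal symbols in the BWT output turn into small Move-To-Front ranks, which is what makes the result easier for an entropy coder such as Huffman coding to compress.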