Maximum-likelihood syntactic decoding

Abstract
A model of a linguistic information source is proposed as a grammar that generates a language over some finite alphabet. It is pointed out that grammatical sentences generated by the source grammar contain intrinsic "redundancy" that can be exploited for error correction. Symbols occurring in the sentences are composed according to syntactic rules determined by the source grammar, and hence are different in nature from the lexicographical source symbols assumed in information theory and algebraic coding theory. Almost all programming languages and some simple natural languages can be described by the linguistic source model proposed in this paper. In order to combat excessive errors over very noisy channels, a conventional encoding-decoding scheme that does not utilize the source structure is introduced into the communication system. Decoded strings coming out of the lexicographical decoder may not be grammatical, which indicates that some uncorrected errors remain in the individual sentences; such strings are reprocessed by a syntactic decoder that converts ungrammatical strings into legal sentences of the source language according to the maximum-likelihood criterion. Thus, using syntactic analysis, the syntactic decoder can correct more of the errors introduced by the noisy channel than the lexicographical decoder is capable of correcting or even of detecting. To design the syntactic decoder we use parsing techniques from the study of compilers and formal languages.
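The core idea in the abstract can be sketched in miniature. Under a memoryless symmetric channel, the maximum-likelihood grammatical sentence is the one at minimum edit distance from the received string, so a toy syntactic decoder can simply search the source language for the nearest legal sentence. The grammar below (generating a^n b^n) and the function names are illustrative assumptions, not the paper's construction, which uses proper parsing techniques rather than enumeration:

```python
# Toy maximum-likelihood (minimum-distance) syntactic decoder.
# Assumption: a hypothetical source grammar S -> a S b | ab, i.e. the
# language { a^n b^n : n >= 1 }, and a symmetric memoryless channel,
# under which maximum likelihood reduces to minimum edit distance.

def edit_distance(a, b):
    # Standard Levenshtein dynamic program.
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return d[m][n]

def sentences(max_len):
    # Enumerate all grammatical sentences a^n b^n up to a length bound.
    return ["a" * n + "b" * n for n in range(1, max_len // 2 + 1)]

def syntactic_decode(received, max_len=8):
    # Return the grammatical sentence nearest to the received string.
    return min(sentences(max_len), key=lambda s: edit_distance(s, received))
```

For example, `syntactic_decode("aabx")` repairs the ungrammatical received string to `"aabb"`, the legal sentence one substitution away. A practical decoder replaces the enumeration with an error-correcting parser, since the language of a realistic grammar is infinite.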
