Amino acid translation program for full-length cDNA sequences with frameshift errors.
- 8 March 2001
- journal article
- research article
- Published by American Physiological Society in Physiological Genomics
- Vol. 5 (2), 81-87
- https://doi.org/10.1152/physiolgenomics.2001.5.2.81
Abstract
Here we present an amino acid translation program designed to suggest the position of experimental frameshift errors and predict amino acid sequences for full-length cDNA sequences that have phred scores. Our program makes artificial insertions into, and artificial deletions from, low-accuracy positions of the original sequence, thereby generating many candidate sequences. The validity of the most probable sequence (the likelihood that it represents the actual protein) is evaluated by using a score (Va) that is calculated in light of the Kozak consensus, preferred codon usage, and position of the initiation codon. To evaluate the software, we used a database in which, out of 612 cDNA sequences, 524 (86%) carried 773 frameshift errors in the coding sequence. Our software detected and corrected 48% of the total frameshift errors in 62% of the total cDNA sequences with frameshift errors. The false positive rate of frameshift correction was 9%; that is, 91% of the suggested frameshifts were true.
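The candidate-generation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program: the function names, the phred threshold of 20, the use of "N" as the inserted base, and the restriction to a single edit per candidate are all assumptions made for illustration. In particular, candidates are ranked here by the length of the longest complete open reading frame (ATG through an in-frame stop codon), a simple stand-in for the Va score, whose exact formula the abstract does not give.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}


def candidates(seq, phred, threshold=20):
    """Yield (label, sequence) pairs: the original read plus one single-base
    deletion and one single-base insertion at every low-quality position.

    phred[i] is the quality of seq[i]; the threshold is an assumed value.
    """
    yield ("original", seq)
    for i, q in enumerate(phred):
        if q < threshold:
            yield (f"del@{i}", seq[:i] + seq[i + 1:])      # drop a suspect base
            yield (f"ins@{i}", seq[:i] + "N" + seq[i:])    # restore a missing base


def longest_orf(seq):
    """Return the longest protein that starts at an ATG and ends at an
    in-frame stop codon; ORFs that run off the end are ignored."""
    best = ""
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        protein = []
        for j in range(start, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[j:j + 3], "X")  # "X" for codons with N
            if aa == "*":
                if len(protein) > len(best):
                    best = "".join(protein)
                break
            protein.append(aa)
    return best


def best_candidate(seq, phred, threshold=20):
    """Pick the candidate with the longest complete ORF (stand-in for Va).
    Ties keep the earliest candidate, so the original read is preferred."""
    return max(candidates(seq, phred, threshold),
               key=lambda c: len(longest_orf(c[1])))


if __name__ == "__main__":
    # Hypothetical read: a spurious T at index 6 causes a premature stop.
    read = "ATGGCTTGATAAAGGTCCTTAA"
    quality = [40] * len(read)
    quality[6] = 8  # the only low-accuracy position
    label, fixed = best_candidate(read, quality)
    print(label, longest_orf(read), "->", longest_orf(fixed))
```

Restricting edits to low-phred positions keeps the number of candidates small; a fuller treatment would combine several edits per sequence and replace the ORF-length ranking with a score that reflects the Kozak consensus, codon usage, and initiation codon position, as the abstract describes.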