MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform
Open Access
- 15 July 2002
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 30 (14), 3059-3066
- https://doi.org/10.1093/nar/gkf436
Abstract
A multiple sequence alignment program, MAFFT, has been developed. Its CPU time is drastically reduced compared with existing methods. MAFFT includes two novel techniques. (i) Homologous regions are rapidly identified by the fast Fourier transform (FFT), in which an amino acid sequence is converted to a sequence composed of the volume and polarity values of each amino acid residue. (ii) We propose a simplified scoring system that performs well in reducing CPU time and increasing the accuracy of alignments, even for sequences having large insertions or extensions as well as for distantly related sequences of similar length. Two different heuristics, the progressive method (FFT-NS-2) and the iterative refinement method (FFT-NS-i), are implemented in MAFFT. The performance of FFT-NS-2 and FFT-NS-i was compared with that of other methods by computer simulations and benchmark tests; the CPU time of FFT-NS-2 is drastically reduced compared with CLUSTALW, with comparable accuracy. FFT-NS-i is over 100 times faster than T-COFFEE when the number of input sequences exceeds 60, without sacrificing accuracy.
This publication has 33 references indexed in Scilit.
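Technique (i) in the abstract can be illustrated with a minimal Python sketch. It is not MAFFT's implementation. It assumes that each residue is mapped to normalized volume and polarity values, that the two property correlations are simply added, and that a peak in the summed correlation marks a candidate offset between homologous regions. The property tables hold approximate values in the style of published amino acid volume and polarity scales and are included for illustration only. The function names (`normalize`, `fft`, `correlate`, `lag_scores`, `best_lag`) are hypothetical.

```python
import cmath

# Approximate per-residue volume and polarity values (assumed, illustrative only).
VOLUME = {
    "A": 31.0, "R": 124.0, "N": 56.0, "D": 54.0, "C": 55.0,
    "Q": 85.0, "E": 83.0, "G": 3.0, "H": 96.0, "I": 111.0,
    "L": 111.0, "K": 119.0, "M": 105.0, "F": 132.0, "P": 32.5,
    "S": 32.0, "T": 61.0, "W": 170.0, "Y": 136.0, "V": 84.0,
}
POLARITY = {
    "A": 8.1, "R": 10.5, "N": 11.6, "D": 13.0, "C": 5.5,
    "Q": 10.5, "E": 12.3, "G": 9.0, "H": 10.4, "I": 5.2,
    "L": 4.9, "K": 11.3, "M": 5.7, "F": 5.2, "P": 8.0,
    "S": 9.2, "T": 8.6, "W": 5.4, "Y": 6.2, "V": 5.9,
}


def normalize(table):
    """Rescale a property table to zero mean and unit standard deviation."""
    values = list(table.values())
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {aa: (v - mean) / sd for aa, v in table.items()}


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def correlate(x, y):
    """Return {lag k: sum over n of x[n] * y[n + k]} for every overlapping lag."""
    size = 1
    while size < len(x) + len(y):  # zero-pad so circular wrap-around cannot alias
        size *= 2
    fx = fft([complex(v) for v in x] + [0j] * (size - len(x)))
    fy = fft([complex(v) for v in y] + [0j] * (size - len(y)))
    prod = [a.conjugate() * b for a, b in zip(fx, fy)]
    c = fft(prod, inverse=True)
    # Index k holds lag k; a negative lag k is stored at index size + k.
    return {k: c[k % size].real / size for k in range(-(len(x) - 1), len(y))}


def lag_scores(seq1, seq2):
    """Sum of the volume and polarity correlations for each lag (assumed scoring)."""
    vol, pol = normalize(VOLUME), normalize(POLARITY)
    cv = correlate([vol[r] for r in seq1], [vol[r] for r in seq2])
    cp = correlate([pol[r] for r in seq1], [pol[r] for r in seq2])
    return {k: cv[k] + cp[k] for k in cv}


def best_lag(seq1, seq2):
    """Lag with the highest summed correlation."""
    scores = lag_scores(seq1, seq2)
    return max(scores, key=scores.get)
```

A lag of `k` means that position `n` of the first sequence is compared with position `n + k` of the second. The FFT evaluates every lag in O(N log N) time instead of the O(N^2) needed by a direct sum, which is the source of the speed-up the abstract describes. A high-scoring lag is only a candidate: locating the actual homologous segments along that diagonal would need a further step that this sketch does not attempt.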