The Impact of rRNA Secondary Structure Consideration in Alignment and Tree Reconstruction: Simulated Data and a Case Study on the Phylogeny of Hexapods
Open Access
- 7 June 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 27 (11), 2507-2521
- https://doi.org/10.1093/molbev/msq140
Abstract
The use of secondary structures has been advocated to improve both the alignment and the tree reconstruction processes of ribosomal RNA (rRNA) data sets. We used simulated and empirical rRNA data to test the impact of secondary structure consideration in both steps of molecular phylogenetic analyses. A simulation approach was used to generate realistic rRNA data sets based on real 16S, 18S, and 28S sequences and structures in combination with different branch lengths and topologies. Alignment and tree reconstruction performance of four recent structural alignment methods was compared with exclusively sequence-based approaches. As empirical data, we used a hexapod rRNA data set to study the influence of nucleotide interdependencies in sequence alignment and tree reconstruction. Structural alignment methods delivered significantly better sequence alignments compared with pure sequence-based methods. Also, structural alignment methods delivered better trees judged by topological congruence to simulation base trees. However, the advantage of structural alignments was less pronounced and even vanished in several instances. For simulated data, application of mixed RNA/DNA models to stems and loops, respectively, led to significantly shorter branches. The application of mixed RNA/DNA models in the hexapod analyses delivered partly implausible relationships. This can be interpreted as a stronger sensitivity of mixed model setups to nonphylogenetic signal. Secondary structure consideration clearly influenced sequence alignment and tree reconstruction of ribosomal genes. Although sequence alignment quality can be improved considerably by the use of secondary structure information, the application of mixed models in tree reconstructions needs further studies to understand the observed effects.
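The abstract reports that trees were judged by topological congruence to the simulation base trees, but it does not name the measure used. As a minimal sketch, assuming the Robinson-Foulds (symmetric difference) distance as one common choice, the comparison could look like the following. The nested-tuple tree representation and the function names `leaf_set`, `splits`, and `rf_distance` are illustrative assumptions, not taken from the article.

```python
# Sketch only: Robinson-Foulds distance between two unrooted tree topologies.
# Trees are nested tuples of leaf names (an assumed, illustrative encoding).

def leaf_set(tree):
    """Return the set of leaf names below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions (splits) induced by internal edges."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)  # each split is stored as the side without this leaf
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if anchor in clade else clade
        # Skip trivial splits that separate fewer than two leaves on either side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            found.add(side)
        return clade

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaf names")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-taxon example: a base tree and two reconstructed trees.
base_tree = (("A", "B"), ("C", ("D", "E")))
same_topology = (("B", "A"), (("E", "D"), "C"))
other_topology = (("A", "C"), ("B", ("D", "E")))

print(rf_distance(base_tree, same_topology))   # 0
print(rf_distance(base_tree, other_topology))  # 2
```

A distance of 0 means the reconstructed tree recovers every internal split of the base tree; larger values count the splits that differ between the two topologies.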