Finding regulatory DNA motifs using alignment-free evolutionary conservation information
Open Access
- 4 January 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 38 (6), e90
- https://doi.org/10.1093/nar/gkp1166
Abstract
As an increasing number of eukaryotic genomes are being sequenced, comparative studies aimed at detecting regulatory elements in intergenic sequences are becoming more prevalent. Most comparative methods for transcription factor (TF) binding site discovery make use of global or local alignments of orthologous regulatory regions to assess whether a particular DNA site is conserved across related organisms, and thus more likely to be functional. Since binding sites are usually short, sometimes degenerate, and often independent of orientation, alignment algorithms may not align them correctly. Here, we present a novel, alignment-free approach for using conservation information for TF binding site discovery. We relax the definition of conserved sites: we consider a DNA site within a regulatory region to be conserved in an orthologous sequence if it occurs anywhere in that sequence, irrespective of orientation. We use this definition to derive informative priors over DNA sequence positions, and incorporate these priors into a Gibbs sampling algorithm for motif discovery. Our approach is simple and fast. It requires neither sequence alignments nor the phylogenetic relationships between the orthologous sequences, yet it is more effective on real biological data than methods that do.
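The relaxed conservation definition in the abstract can be illustrated with a minimal Python sketch. It checks, for every window of a reference regulatory region, whether that site or its reverse complement occurs anywhere in each orthologous sequence. The function names, the score (fraction of orthologs containing the site), and the pseudocount normalization are assumptions made for illustration; the abstract does not give the formula used to derive the priors, so this should not be read as the published method.

```python
from typing import List

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(site: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return site.translate(_COMPLEMENT)[::-1]


def conservation_scores(reference: str, orthologs: List[str], width: int) -> List[float]:
    """Score each start position in `reference` (hypothetical helper).

    A site of length `width` counts as conserved in an ortholog if the site
    or its reverse complement occurs anywhere in that ortholog. The score is
    the fraction of orthologs in which the site is conserved.
    """
    scores = []
    for start in range(len(reference) - width + 1):
        site = reference[start:start + width]
        rc = reverse_complement(site)
        hits = sum(1 for seq in orthologs if site in seq or rc in seq)
        scores.append(hits / len(orthologs) if orthologs else 0.0)
    return scores


def positional_prior(scores: List[float], pseudocount: float = 0.1) -> List[float]:
    """Normalize scores into a distribution over start positions.

    The pseudocount (an assumed value) keeps non-conserved positions from
    receiving zero probability.
    """
    weights = [s + pseudocount for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


reference = "TTACGTGA"
orthologs = ["GGACGTCC", "CCTCACGG"]
scores = conservation_scores(reference, orthologs, width=4)
prior = positional_prior(scores)
```

In this toy input, the site `CGTG` has no forward match in either ortholog, but its reverse complement `CACG` occurs in the second one, so it still counts as conserved there. In a Gibbs sampler, a prior of this kind would plausibly be used to weight the sampling distribution over candidate motif start positions, so that conserved positions are visited more often.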