Rapid detection and curation of conserved DNA via enhanced-BLAT and EvoPrinterHD analysis

Abstract
Multi-genome comparative analysis has yielded important insights into the molecular details of gene regulation. We have developed EvoPrinter, a web-accessed genomics tool that provides a single uninterrupted view of conserved sequences as they appear in a species of interest. An EvoPrint reveals with near base-pair resolution those sequences that are essential for gene function. We describe here EvoPrinterHD, a 2nd-generation comparative genomics tool that automatically generates from a single input sequence an enhanced view of sequence conservation between evolutionarily distant species. Currently available for 5 nematode, 3 mosquito, 12 Drosophila, 20 vertebrate, 17 Staphylococcus and 20 enteric bacteria genomes, EvoPrinterHD employs a modified BLAT algorithm [enhanced-BLAT (eBLAT)], which detects up to 75% more conserved bases than identified by the BLAT alignments used in the earlier EvoPrinter program. The new program also identifies conserved sequences within rearranged DNA, highlights repetitive DNA, and detects sequencing gaps. EvoPrinterHD currently holds over 112 billion bp of indexed genomes in memory and has the flexibility of selecting a subset of genomes for analysis. An EvoDifferences profile is also generated to portray conserved sequences that are uniquely lost in any one of the orthologs. Finally, EvoPrinterHD incorporates options that allow for (1) re-initiation of the analysis using a different genome's aligning region as the reference DNA to detect species-specific changes in less-conserved regions, (2) rapid extraction and curation of conserved sequences, and (3) for bacteria, identifies unique or uniquely shared sequences present in subsets of genomes. EvoPrinterHD is a fast, high-resolution comparative genomics tool that automatically generates an uninterrupted species-centric view of sequence conservation and enables the discovery of conserved sequences within rearranged DNA. When combined with cis-Decoder, a program that discovers sequence elements shared among tissue specific enhancers, EvoPrinterHD facilitates the analysis of conserved sequences that are essential for coordinate gene regulation.