A novel bioinformatics pipeline for identification and characterization of fusion transcripts in breast cancer and normal cell lines
Open Access
- 27 May 2011
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 39 (15), e100
- https://doi.org/10.1093/nar/gkr362
Abstract
SnowShoes-FTD, developed for fusion transcript detection in paired-end mRNA-Seq data, employs multiple steps of false positive filtering to nominate fusion transcripts with near 100% confidence. Unique features include: (i) identification of multiple fusion isoforms from two gene partners; (ii) prediction of genomic rearrangements; (iii) identification of exon fusion boundaries; (iv) generation of a 5′–3′ fusion spanning sequence for PCR validation; and (v) prediction of the protein sequences, including frame shift and amino acid insertions. We applied SnowShoes-FTD to identify 50 fusion candidates in 22 breast cancer and 9 non-transformed cell lines. Five additional fusion candidates with two isoforms were confirmed. In all, 30 of 55 fusion candidates had in-frame protein products. No fusion transcripts were detected in non-transformed cells. Consideration of the possible functions of a subset of predicted fusion proteins suggests several potentially important functions in transformation, including a possible new mechanism for overexpression of ERBB2 in a HER-positive cell line. The source code of SnowShoes-FTD is provided in two formats: one configured to run on the Sun Grid Engine for parallelization, and the other formatted to run on a single LINUX node. Executables in PERL are available for download from our web site: http://mayoresearch.mayo.edu/mayo/research/biostat/stand-alone-packages.cfm.
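The abstract distinguishes in-frame fusion products from those with a frame shift. As a rough illustration of that distinction only, the following Python sketch checks whether a junction preserves the reading frame of the 3′ partner and translates the joined coding sequence. It is not the SnowShoes-FTD implementation (which is distributed in Perl); the function names, the inputs, and the toy sequences are all hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def is_in_frame(cds_bases_5p: int, cds_offset_3p: int) -> bool:
    """Return True if the junction keeps the 3' partner in its native frame.

    cds_bases_5p  -- coding bases contributed by the 5' partner,
                     counted from its start codon up to the junction
    cds_offset_3p -- coding bases of the 3' partner that lie upstream
                     of the junction in its own transcript (and are lost)
    """
    return cds_bases_5p % 3 == cds_offset_3p % 3


def translate(cds: str) -> str:
    """Translate a coding sequence until the first stop codon.

    A trailing incomplete codon is ignored.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":  # stop codon ends the product
            break
        protein.append(aa)
    return "".join(protein)


def fusion_protein(cds_5p: str, cds_3p_from_junction: str) -> str:
    """Translate the 5' partner's coding bases joined to the 3' partner's
    sequence downstream of the junction (hypothetical helper)."""
    return translate(cds_5p + cds_3p_from_junction)
```

With toy inputs, `is_in_frame(6, 3)` is true and `fusion_protein("ATGGCC", "GGTTAA")` gives `MAG`. Adding one extra base to the 5′ side (`"ATGGCCA"`, 7 coding bases) shifts the frame, so `is_in_frame(7, 3)` is false and the downstream residues change.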