ISOLATE: a computational strategy for identifying the primary origin of cancers using high-throughput sequencing
Open Access
- 19 June 2009
- journal article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 25 (21), 2882-2889
- https://doi.org/10.1093/bioinformatics/btp378
Abstract
Motivation: One of the most deadly cancer diagnoses is the carcinoma of unknown primary origin. Without the knowledge of the site of origin, treatment regimens are limited in their specificity and result in high mortality rates. Though supervised classification methods have been developed to predict the site of origin based on gene expression data, they require large numbers of previously classified tumors for training, in part because they do not account for sample heterogeneity, which limits their application to well-studied cancers.
Results: We present ISOLATE, a new statistical method that simultaneously predicts the primary site of origin of cancers and addresses sample heterogeneity, while taking advantage of new high-throughput sequencing technology that promises to bring higher accuracy and reproducibility to gene expression profiling experiments. ISOLATE makes predictions de novo, without having seen any training expression profiles of cancers with identified origin. Compared with previous methods, ISOLATE is able to predict the primary site of origin, de-convolve and remove the effect of sample heterogeneity and identify differentially expressed genes with higher accuracy, across both synthetic and clinical datasets. Methods such as ISOLATE are invaluable tools for clinicians faced with carcinomas of unknown primary origin.
Availability: ISOLATE is available for download at: http://morrislab.med.utoronto.ca/software
Contact: gerald.quon@utoronto.ca; quaid.morris@utoronto.ca
Supplementary information: Supplementary data are available at Bioinformatics online.
This publication has 39 references indexed in Scilit:
- An Integrated Genomic Analysis of Human Glioblastoma Multiforme. Science, 2008
- Core Signaling Pathways in Human Pancreatic Cancers Revealed by Global Genomic Analyses. Science, 2008
- RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays. Genome Research, 2008
- Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods, 2008
- Probabilistic Latent Variable Models as Nonnegative Factorizations. Computational Intelligence and Neuroscience, 2008
- How many human genes can be defined as housekeeping with current expression data? BMC Genomics, 2008
- Gene expression profiling may improve diagnosis in patients with carcinoma of unknown primary. British Journal of Cancer, 2008
- Assessing natural variations in gene expression in humans by comparing with monozygotic twins using microarrays. Physiological Genomics, 2005
- A gene atlas of the mouse and human protein-encoding transcriptomes. Proceedings of the National Academy of Sciences of the United States of America, 2004
- Systematic variation in gene expression patterns in human cancer cell lines. Nature Genetics, 2000