Similarity network fusion for aggregating data types on a genomic scale
- 26 January 2014
- journal article
- Published by Springer Science and Business Media LLC in Nature Methods
- Vol. 11 (3), 333-337
- https://doi.org/10.1038/nmeth.2810
Abstract
Recent technologies have made it cost-effective to collect diverse types of genome-wide data. Computational methods are needed to combine these data to create a comprehensive view of a given disease or a biological process. Similarity network fusion (SNF) solves this problem by constructing networks of samples (e.g., patients) for each available data type and then efficiently fusing these into one network that represents the full spectrum of underlying data. For example, to create a comprehensive view of a disease given a cohort of patients, SNF computes and fuses patient similarity networks obtained from each of their data types separately, taking advantage of the complementarity in the data. We used SNF to combine mRNA expression, DNA methylation and microRNA (miRNA) expression data for five cancer data sets. SNF substantially outperforms single data type analysis and established integrative approaches when identifying cancer subtypes and is effective for predicting survival.
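The construct-then-fuse idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a plain Gaussian kernel with a median-heuristic bandwidth, simple row normalization, and a fixed number of cross-diffusion iterations. The function names (`affinity`, `knn_kernel`, `fuse`) and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def affinity(X):
    """Sample-by-sample Gaussian affinity from a (samples x features) matrix."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    sigma2 = np.median(d2[d2 > 0])  # median heuristic: an assumption, not from the paper
    return np.exp(-d2 / (2.0 * sigma2))

def row_normalize(W):
    """Scale each row to sum to 1."""
    return W / W.sum(axis=1, keepdims=True)

def knn_kernel(W, k):
    """Keep each sample's k strongest affinities, zero the rest, row-normalize."""
    S = np.zeros_like(W)
    idx = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S[rows, idx] = W[rows, idx]
    return row_normalize(S)

def fuse(views, k=5, iterations=10):
    """Fuse per-data-type similarity networks into one network (simplified)."""
    if len(views) < 2:
        raise ValueError("fusion needs at least two data types")
    Ws = [affinity(X) for X in views]
    P = [row_normalize(W) for W in Ws]   # global similarity per data type
    S = [knn_kernel(W, k) for W in Ws]   # local (nearest-neighbour) similarity
    for _ in range(iterations):
        P_next = []
        for v in range(len(views)):
            # Diffuse the average of the other networks through this one's
            # local neighbourhood structure.
            others = np.mean([P[u] for u in range(len(views)) if u != v], axis=0)
            P_next.append(row_normalize(S[v] @ others @ S[v].T))
        P = P_next
    fused = np.mean(P, axis=0)
    return (fused + fused.T) / 2.0       # symmetrize the final network

# Toy data: 20 "patients" in two groups, measured by two synthetic data types.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 10)
view_a = rng.normal(size=(20, 10)) + 5.0 * labels[:, None]
view_b = rng.normal(scale=2.0, size=(20, 8)) + 5.0 * labels[:, None]
fused = fuse([view_a, view_b], k=5, iterations=10)
```

The resulting `fused` matrix is a single patient-by-patient network that could be passed to a clustering method such as spectral clustering to look for subtypes. The neighbourhood size `k` and the iteration count are illustrative defaults rather than recommended settings.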