An integrative approach to reveal driver gene fusions from paired-end sequencing data in cancer

Abstract
With a flood of cancer genome sequences expected soon, distinguishing 'driver' from 'passenger' mutations will be an important task. Wang et al. describe a bioinformatic method for identifying cancer-associated fusions and apply it to discover a recurrent rearrangement in lung cancer.

Cancer genomes contain many aberrant gene fusions: a few that drive disease and many more that are nonspecific passengers. We developed an algorithm (the concept signature or 'ConSig' score) that nominates biologically important fusions from high-throughput data by assessing their association with 'molecular concepts' characteristic of cancer genes, including molecular interactions, pathways and functional annotations. Copy number data supported candidate fusions and suggested a breakpoint principle for intragenic copy number aberrations in fusion partners. By analyzing lung cancer transcriptome sequencing and genomic data, we identified a novel R3HDM2-NFE2 fusion in the H1792 cell line. Lung tissue microarrays revealed 2 of 76 lung cancer patients with genomic rearrangement at the NFE2 locus, suggesting recurrence. Knockdown of NFE2 decreased proliferation and invasion of H1792 cells. Together, these results present a systematic analysis of gene fusions in cancer and describe key characteristics that assist in new fusion discovery.
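The abstract does not state the ConSig formula, so the following is only a toy sketch of the general idea of scoring fusion partners by their association with concepts enriched in known cancer genes. Everything in it is an assumption made for illustration: the gene and concept names are hypothetical placeholders, the concept weight (fraction of annotated genes that are known cancer genes), the per-gene mean, and the use of the higher-scoring partner as the fusion score are simplifications and should not be read as the published method.

```python
from collections import Counter


def concept_weights(gene_concepts, cancer_genes):
    """Weight each concept by the fraction of its annotated genes that are known cancer genes."""
    total, in_cancer = Counter(), Counter()
    for gene, concepts in gene_concepts.items():
        for concept in concepts:
            total[concept] += 1
            if gene in cancer_genes:
                in_cancer[concept] += 1
    return {c: in_cancer[c] / total[c] for c in total}


def gene_score(gene, gene_concepts, weights):
    """Mean concept weight for one gene; genes without annotations score 0."""
    concepts = gene_concepts.get(gene, set())
    if not concepts:
        return 0.0
    return sum(weights[c] for c in concepts) / len(concepts)


def rank_fusions(fusions, gene_concepts, cancer_genes):
    """Score each fusion by its higher-scoring partner (an assumption) and sort descending."""
    weights = concept_weights(gene_concepts, cancer_genes)
    scored = [
        (max(gene_score(a, gene_concepts, weights),
             gene_score(b, gene_concepts, weights)), a, b)
        for a, b in fusions
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)


# Hypothetical annotations: gene -> set of molecular concepts.
gene_concepts = {
    "KNOWN1": {"pathway_X", "domain_Y"},
    "KNOWN2": {"pathway_X"},
    "CAND_A": {"pathway_X", "housekeeping"},
    "CAND_B": {"housekeeping"},
    "CAND_C": {"housekeeping", "domain_Z"},
    "CAND_D": {"domain_Z"},
}
cancer_genes = {"KNOWN1", "KNOWN2"}  # hypothetical reference set of known cancer genes
fusions = [("CAND_C", "CAND_D"), ("CAND_A", "CAND_B")]  # hypothetical candidate fusions

for score, five_prime, three_prime in rank_fusions(fusions, gene_concepts, cancer_genes):
    print(f"{five_prime}-{three_prime}: {score:.3f}")
```

In this toy data the CAND_A-CAND_B fusion ranks first because CAND_A shares a concept with both reference cancer genes, while the CAND_C-CAND_D fusion has no such association and scores zero.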