The mutation spectrum revealed by paired genome sequences from a lung cancer patient

Abstract
Complete genome sequencing has already provided insights into the mutation spectra of a number of cancer types, including lung cancer. The latest sequencing technologies make it possible to obtain a genome-wide view of mutation differences, and this has now been done for lung cancer by comparing the complete sequences of a primary lung tumour (an adenocarcinoma in a male who reported smoking an average of 25 cigarettes a day for 15 years) and adjacent normal tissue. The comparison revealed more than 50,000 point mutations, of which 530 were validated, 392 of them in coding regions, including previously known variations such as KRAS proto-oncogene mutation and amplification. The data suggest that genetically complex tumours may contain many partially redundant mutations, and that identifying recurrent cancer-causing driver mutations will require sequencing many more samples.
Complete genome sequencing has already provided insights into the mutation spectra of several cancer types. Here, the first complete sequences of a primary lung tumour and adjacent normal tissue are provided. Comparison of the two reveals a variety of somatic mutations in the cancer genome, including changes in the KRAS proto-oncogene. The results reveal a distinct pattern of selection against mutations within expressed genes compared with non-expressed genes, and selection against mutations in promoter regions.
Lung cancer is the leading cause of cancer-related mortality worldwide, with non-small-cell lung carcinomas in smokers being the predominant form of the disease [1,2]. Although previous studies have identified important common somatic mutations in lung cancers, they have primarily focused on a limited set of genes and have thus provided a constrained view of the mutational spectrum [3,4,5,6,7,8].
Recent cancer sequencing efforts have used next-generation sequencing technologies to provide a genome-wide view of mutations in leukaemia, breast cancer and cancer cell lines [9,10,11,12,13]. Here we present the complete sequences of a primary lung tumour (60× coverage) and adjacent normal tissue (46× coverage). Comparing the two genomes, we identify a wide variety of somatic variations, including >50,000 high-confidence single nucleotide variants. We validated 530 somatic single nucleotide variants in this tumour, including one in the KRAS proto-oncogene and 391 others in coding regions, as well as 43 large-scale structural variations. These constitute a large set of new somatic mutations and yield an estimated genome-wide somatic mutation rate of 17.7 per megabase. Notably, we observe a distinct pattern of selection against mutations within expressed genes compared with non-expressed genes, and against mutations in promoter regions up to 5 kilobases upstream of all protein-coding genes. Furthermore, we observe a higher rate of amino acid-changing mutations in kinase genes. We present a comprehensive view of somatic alterations in a single lung tumour, and provide the first evidence, to our knowledge, of distinct selective pressures present within the tumour environment.
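As a rough consistency check on the figures quoted above, the reported rate of 17.7 somatic mutations per megabase can be related to the count of >50,000 single nucleotide variants. The sketch below is illustrative arithmetic only and is not the authors' method: the 3 Gb genome length is an assumed round number that does not appear in the text, and the function and constant names are hypothetical.

```python
# Illustrative arithmetic relating a per-megabase mutation rate to a mutation count.
# ASSUMPTION: the genome length is a round 3 Gb; the text does not state the
# number of bases actually used in the rate estimate.

BASES_PER_MEGABASE = 1_000_000


def mutations_per_megabase(n_mutations: float, n_bases: int) -> float:
    """Return the mutation rate expressed per 1,000,000 bases."""
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    return n_mutations / (n_bases / BASES_PER_MEGABASE)


def implied_mutation_count(rate_per_mb: float, n_bases: int) -> float:
    """Return the mutation count implied by a per-megabase rate over n_bases."""
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    return rate_per_mb * (n_bases / BASES_PER_MEGABASE)


REPORTED_RATE_PER_MB = 17.7           # quoted in the text
ASSUMED_GENOME_BASES = 3_000_000_000  # assumption, not from the text

implied_count = implied_mutation_count(REPORTED_RATE_PER_MB, ASSUMED_GENOME_BASES)

# Validated coding-region variants quoted in the text: one in KRAS plus 391 others.
validated_coding = 1 + 391

print(f"Implied genome-wide mutation count: {implied_count:,.0f}")
print(f"Validated coding-region variants: {validated_coding}")
```

Under the assumed genome length, the implied count falls just above 50,000, in line with the ">50,000 high-confidence single nucleotide variants" quoted above. A different effective genome length would shift the implied count proportionally.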