Identification of Driver Genes in Primary Liver Cancer by Integrating NGS and TCGA Mutation Data

Abstract
Background: This study aimed to explore mutant genes in patients with primary liver cancer (PLC) using bioinformatics and data mining techniques. Methods: Peripheral blood or paraffin-embedded tissues from 8 patients with PLC were analyzed using a 551 cancer-related gene panel on a NextSeq 500 sequencer (Illumina). In addition, data for 396 PLC cases were downloaded from The Cancer Genome Atlas (TCGA) database. The commonly mutated genes were obtained by integrating the mutation information of the two cohorts, followed by functional enrichment and protein-protein interaction (PPI) analyses. Three well-known resources, namely Vogelstein’s list, the Network of Cancer Genes (NCG), and the Catalogue of Somatic Mutations in Cancer (COSMIC) database, were used to screen driver genes. Furthermore, the chi-square test and logistic regression analysis were performed to assess the association between driver gene mutations and clinicopathological characteristics, and the Kaplan-Meier (KM) method and multivariate Cox analysis were used to evaluate overall survival. Results: In total, 84 mutated genes were identified in the 8 PLC patients by next-generation sequencing (NGS). Data on the 100 most frequently mutated genes in PLC patients were downloaded from the TCGA database. Integrating the two cohorts identified 17 common mutated genes. Next, 11 driver genes were screened out by intersecting these 17 mutated genes with the genes in the three resources. Among them, RB1, TP53, and KRAS mutations were associated with clinicopathological characteristics, whereas none of the 11 gene mutations was associated with overall survival. Conclusion: This study identified mutant genes with significant clinical implications in PLC patients, which may improve understanding of the role of gene mutations in the molecular pathogenesis of PLC.
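The two integration steps described in the Methods can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: the function names and the placeholder gene symbols (G1, G2, ...) are hypothetical, and it assumes that a gene counts as a candidate driver if it appears in at least one of the three reference lists, which the abstract does not specify.

```python
def common_genes(ngs_genes, tcga_genes):
    """Return the genes reported as mutated in both cohorts."""
    return set(ngs_genes) & set(tcga_genes)


def candidate_drivers(common, reference_lists):
    """Keep the common genes found in at least one reference list.

    Assumption: membership in any one list is sufficient. Requiring all
    lists would use set.intersection instead of the union below.
    """
    known = set().union(*reference_lists)
    return set(common) & known


# Hypothetical placeholder gene symbols, not real study data.
ngs = {"G1", "G2", "G3", "G4"}    # genes mutated in the sequenced cohort
tcga = {"G1", "G2", "G3", "G5"}   # most frequently mutated genes in TCGA
list_a = {"G1", "G9"}             # stand-ins for the three reference lists
list_b = {"G2"}
list_c = {"G2", "G5"}

common = common_genes(ngs, tcga)
drivers = candidate_drivers(common, [list_a, list_b, list_c])
print(sorted(common))
print(sorted(drivers))
```

With real data, the inputs would be the gene symbols from the panel results, the TCGA mutation summary, and the three reference lists, normalized to one naming scheme before intersecting.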