Identification of a five‐gene signature with prognostic value in colorectal cancer

Abstract
Colorectal cancer (CRC) is one of the most commonly diagnosed malignancies worldwide. Although mortality rates have been decreasing, the prognosis of CRC patients still varies widely between individuals. Therefore, identifying and understanding novel biomarkers for CRC prognosis remains crucial. We first downloaded the gene expression profiles of five CRC data sets from the Gene Expression Omnibus (GEO). A total of 352 consistent differentially expressed genes (DEGs) were identified between CRC and paired normal tissues. Functional analysis, including Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment, revealed that these DEGs were related to metabolic pathways, tight junctions, and the cell cycle. Ten hub DEGs were identified from protein–protein interaction networks built with the Search Tool for the Retrieval of Interacting Genes (STRING) database. Using univariate Cox proportional hazards regression analysis, we found 11 survival‐related genes among these DEGs. We then established a five‐gene signature (kinesin family member 15, N‐acetyltransferase 2, glutathione peroxidase 3, secretogranin II, and chloride channel accessory 1) with prognostic value in CRC by stepwise multivariate Cox regression analysis. Based on this risk scoring system, patients in the high‐risk group had significantly poorer survival than those in the low‐risk group (log‐rank test, p < 0.0001). Finally, we validated our gene signature scoring system in two independent GEO cohorts (GSE17536 and GSE33113). All five signature genes were also DEGs in The Cancer Genome Atlas database. In conclusion, our findings suggest that this five‐DEG signature is a novel biomarker with useful applications in CRC prognosis.
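
A risk scoring system of this kind is commonly computed as a linear combination of each signature gene's expression weighted by its multivariate Cox coefficient, with patients then split into high‐ and low‐risk groups at a cutoff. The sketch below illustrates that general idea only. The coefficient values are hypothetical placeholders, not values from this study, and the median cutoff is an assumption, since the abstract does not state how the groups were defined. The gene symbols (KIF15, NAT2, GPX3, SCG2, CLCA1) are the standard symbols for the five genes named above.

```python
from statistics import median

# HYPOTHETICAL coefficients for illustration only; these are NOT the
# fitted Cox regression coefficients reported by the study.
EXAMPLE_COEFS = {
    "KIF15": -0.30,
    "NAT2": -0.20,
    "GPX3": 0.25,
    "SCG2": 0.35,
    "CLCA1": -0.15,
}


def risk_score(expression, coefs=EXAMPLE_COEFS):
    """Return the weighted sum of expression values over the signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(patients, coefs=EXAMPLE_COEFS):
    """Score every patient and split the cohort at the median score.

    `patients` maps a patient ID to a dict of gene -> expression value.
    The median cutoff is an assumption made for this sketch.
    """
    scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: "high" if s > cutoff else "low" for pid, s in scores.items()}
    return scores, cutoff, groups
```

In practice the two groups would then be compared with Kaplan–Meier curves and a log‐rank test, as reported above for the training and validation cohorts.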