Predicting Lung Cancer Survival Using Probabilistic Reclassification of TNM Editions With a Bayesian Network

Open Access

1 November 2020

journal article
research article
Published by American Society of Clinical Oncology (ASCO) in JCO Clinical Cancer Informatics

Vol. 4 (4), 436-443
https://doi.org/10.1200/cci.19.00136

Abstract

The TNM classification system is used for prognosis, treatment, and research. Regular updates potentially break backward compatibility. Reclassification between editions is not always possible, is labor intensive, or requires additional data. We developed a Bayesian network (BN) for reclassifying the 5th, 6th, and 7th editions of the TNM and predicting survival for non–small-cell lung cancer (NSCLC) without training data with known classifications in multiple editions.

Data were obtained from the Netherlands Cancer Registry (n = 146,084). The BN was designed with nodes for TNM edition and survival, a group of nodes shared by all TNM editions, and a group of nodes for edition 7 only. Before conditional probabilities were learned, priors for the relations between the groups were manually specified after analysis of the changes between editions. For performance evaluation only, part of the 7th edition test data was manually reclassified. Reclassification performance was evaluated using sensitivity, specificity, and accuracy. Two-year survival prediction was evaluated with the area under the receiver operating characteristic curve (AUC), and model calibration was visualized.

Manual reclassification of 7th to 6th edition stage group as ground truth for testing was impossible in 5.6% of the patients. Predicting 6th edition stage grouping using 7th edition data and vice versa resulted in average accuracies, sensitivities, and specificities between 0.85 and 0.99. The AUC for 2-year survival was 0.81.

We have successfully created a BN for reclassifying TNM stage grouping across TNM editions and predicting survival in NSCLC without knowing the true TNM classification in various editions in the training set. We suggest binary prediction of survival is less relevant than predicted probability and model calibration. For research, probabilities can be used for weighted reclassification.
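
As an illustration of weighted reclassification, the following is a minimal sketch in Python. It assumes that a model such as the BN returns, for each patient, a posterior probability distribution over stage groups in the target edition. The function names, the stage labels used in the example, and all numbers are hypothetical and are not taken from the study. Instead of assigning each patient to the single most probable stage, every patient contributes to each stage in proportion to its probability.

```python
from collections import defaultdict


def weighted_stage_counts(posteriors):
    """Sum per-patient stage probabilities into expected counts per stage."""
    totals = defaultdict(float)
    for dist in posteriors:
        for stage, prob in dist.items():
            totals[stage] += prob
    return dict(totals)


def weighted_survival_rate(posteriors, survived):
    """Probability-weighted proportion of survivors per stage.

    Each patient adds weight `prob` to the denominator of every stage in
    their posterior, and to the numerator only if they survived.
    """
    numer = defaultdict(float)
    denom = defaultdict(float)
    for dist, alive in zip(posteriors, survived):
        for stage, prob in dist.items():
            denom[stage] += prob
            if alive:
                numer[stage] += prob
    return {stage: numer[stage] / denom[stage] for stage in denom if denom[stage] > 0}


# Hypothetical posteriors over target-edition stage groups for four patients.
posteriors = [
    {"IA": 1.0},
    {"IB": 0.5, "IIA": 0.5},
    {"IIA": 0.25, "IIB": 0.75},
    {"IB": 1.0},
]
# Hypothetical 2-year survival outcomes for the same four patients.
survived = [True, True, False, False]

counts = weighted_stage_counts(posteriors)
rates = weighted_survival_rate(posteriors, survived)

for stage in sorted(counts):
    print(f"{stage}: expected n = {counts[stage]:.2f}, weighted survival = {rates[stage]:.2f}")
```

Because the weights of each patient sum to 1, the expected counts add up to the number of patients, so uncertain reclassifications are spread across stages rather than forced into one.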
