Volatile biomarkers in the breath of women with breast cancer

Abstract
We sought biomarkers of breast cancer in the breath because the disease is accompanied by increased oxidative stress and induction of cytochrome P450 enzymes, both of which generate volatile organic compounds (VOCs) that are excreted in breath. We analyzed breath VOCs in 54 women with biopsy-proven breast cancer and 204 cancer-free controls, using gas chromatography/mass spectrometry. Chromatograms were converted into a series of data points by segmenting them into 900 time slices (8 s duration, 4 s overlap) and determining the alveolar gradient of each slice (abundance in breath minus abundance in ambient room air). Monte Carlo simulations were used to exclude random identifiers, so that only time slices with better than random accuracy were treated as candidate biomarkers of breast cancer. Patients were randomly allocated to training sets or test sets in 2:1 data splits. In the training sets, time slices were ranked according to their C-statistic values (area under the receiver operating characteristic curve), and the top ten time slices were combined in multivariate algorithms that were cross-validated in the test sets. Monte Carlo simulations identified an excess of correct over random time slices, consistent with non-random biomarkers of breast cancer in the breath. The outcomes of ten random data splits (mean (standard deviation)) were, in the training sets, sensitivity = 78.5% (6.14), specificity = 88.3% (5.47), C-statistic = 0.89 (0.03), and in the test sets, sensitivity = 75.3% (7.22), specificity = 84.8% (9.97), C-statistic = 0.83 (0.06). A breath test identified women with breast cancer, employing a combination of volatile biomarkers in a multivariate algorithm.
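
The data-reduction steps named in the abstract (alveolar gradient per time slice, a random 2:1 training/test split, and ranking of time slices by C-statistic) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the list-of-lists data layout, and the choice to rank slices by the distance of the C-statistic from 0.5 in either direction are all assumptions made for illustration, and the Monte Carlo step and the multivariate algorithm are not shown.

```python
import random

def alveolar_gradient(breath, room_air):
    """Per-time-slice abundance in breath minus abundance in room air."""
    return [b - a for b, a in zip(breath, room_air)]

def c_statistic(case_values, control_values):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for c in case_values:
        for k in control_values:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))

def rank_time_slices(cases, controls, top_n=10):
    """Return indices of the top_n slices by C-statistic.

    Assumption: a slice is scored by max(AUC, 1 - AUC), so slices that
    are lower in cases rank as highly as slices that are higher.
    """
    n_slices = len(cases[0])
    scored = []
    for j in range(n_slices):
        auc = c_statistic([row[j] for row in cases],
                          [row[j] for row in controls])
        scored.append((max(auc, 1.0 - auc), j))
    # Sort by score (descending), breaking ties by slice index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scored[:top_n]]

def split_2_to_1(items, rng):
    """Randomly allocate items to training and test sets in a 2:1 split."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]
```

In this sketch each subject is one row of alveolar gradients, one value per time slice. Calling `split_2_to_1` separately on the cases and on the controls, then `rank_time_slices` on the two training portions, would yield the slice indices to carry forward into a multivariate model.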