Robustness-Driven Feature Selection in Classification of Fibrotic Interstitial Lung Disease Patterns in Computed Tomography Using 3D Texture Features
- 21 July 2015
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Medical Imaging
- Vol. 35 (1), 144-157
- https://doi.org/10.1109/tmi.2015.2459064
Abstract
Lack of classifier robustness is a barrier to widespread adoption of computer-aided diagnosis systems for computed tomography (CT). We propose a novel Robustness-Driven Feature Selection (RDFS) algorithm that preferentially selects features robust to variations in CT technical factors. We evaluated RDFS in CT classification of fibrotic interstitial lung disease using 3D texture features. CTs were collected for 99 adult subjects separated into three datasets: training, multi-reconstruction, testing. Two thoracic radiologists provided cubic volumes of interest corresponding to six classes: pulmonary fibrosis, ground-glass opacity, honeycombing, normal lung parenchyma, airway, vessel. The multi-reconstruction dataset consisted of CT raw sinogram data reconstructed by systematically varying slice thickness, reconstruction kernel, and tube current (using a synthetic reduced-tube-current algorithm). Two support vector machine classifiers were created, one using RDFS (“with-RDFS”) and one not (“without-RDFS”). Classifier robustness was compared on the multi-reconstruction dataset, using Cohen's kappa to assess classification agreement against a reference reconstruction. Classifier performance was compared on the testing dataset using the extended g-mean (EGM) measure. With-RDFS exhibited superior robustness (kappa 0.899-0.989) compared to without-RDFS (kappa 0.827-0.968). Both classifiers demonstrated similar performance on the testing dataset (EGM 0.778 for with-RDFS; 0.785 for without-RDFS), indicating that RDFS does not compromise classifier performance when discarding nonrobust features. 
RDFS is highly effective at improving classifier robustness against slice thickness, reconstruction kernel, and tube current without sacrificing performance, a result that has implications for multicenter clinical trials that rely on accurate and reproducible quantitative analysis of CT images collected under varied conditions across multiple sites, scanners, and timepoints.
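The abstract names two evaluation measures: Cohen's kappa for agreement with a reference reconstruction, and the extended g-mean (EGM) for performance on the testing dataset. The sketch below is one minimal way to compute both from label lists, not the authors' implementation. It assumes EGM is the geometric mean of per-class recall, the usual multi-class extension of the g-mean; the abstract does not give the paper's exact definition. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from math import prod


def cohens_kappa(reference, other):
    """Cohen's kappa between two equal-length label sequences."""
    if len(reference) != len(other) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(reference)
    # Observed agreement: fraction of positions where the labels match.
    observed = sum(r == o for r, o in zip(reference, other)) / n
    # Chance agreement: product of the two marginal label frequencies, summed over classes.
    ref_counts, other_counts = Counter(reference), Counter(other)
    expected = sum(ref_counts[c] * other_counts[c] for c in ref_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1.0 - expected)


def extended_g_mean(truth, predicted):
    """Geometric mean of per-class recall (assumed definition of EGM)."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("label sequences must be non-empty and of equal length")
    recalls = []
    for c in sorted(set(truth)):
        total = sum(t == c for t in truth)
        hits = sum(t == c and p == c for t, p in zip(truth, predicted))
        recalls.append(hits / total)
    return prod(recalls) ** (1.0 / len(recalls))


# Hypothetical toy labels: classifier output on a reference reconstruction
# versus output on a reconstruction with a different kernel.
reference_labels = ["fibrosis", "fibrosis", "normal", "normal"]
varied_labels = ["fibrosis", "normal", "normal", "normal"]

print(cohens_kappa(reference_labels, varied_labels))
print(extended_g_mean(reference_labels, varied_labels))
```

Because EGM multiplies the per-class recalls, a classifier that misses one class entirely scores zero regardless of how well it does on the others. This makes it a stricter summary than overall accuracy for a six-class problem with unequal class sizes.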