CT texture analysis reliability in pulmonary lesions: the influence of 3D vs. 2D lesion segmentation and volume definition by a Hounsfield-unit threshold

Abstract
Objective: Reproducibility problems are a known limitation of radiomics. The segmentation of the target lesion plays a critical role in texture analysis variability. The aim of this study was to compare the interobserver reliability of manual 2D vs. 3D lung lesion segmentation, with and without pre-definition of the volume using a threshold of −50 HU.
Methods: Seventy-five patients with histopathologically proven lung lesions (15 patients each with adenocarcinoma, squamous cell carcinoma, small cell lung cancer, carcinoid, and organizing pneumonia) who underwent an unenhanced CT scan of the chest were included. Three radiologists independently segmented each lesion manually in 3D and 2D, with and without pre-segmentation volume definition by a HU threshold. Shape parameters and original, Laplacian of Gaussian–filtered, and wavelet-based texture features were derived. To assess interobserver reliability and identify the most robust texture features, intraclass correlation coefficients (ICCs) were calculated for the different segmentation settings.
Results: Shape parameters had high reliability (64–79% had excellent or good ICCs). Texture features had low reliability, with the highest ICCs (38% excellent or good) found for original features in 3D segmentation without the use of a HU threshold. A small proportion (4.3–11.5%) of texture features had excellent or good ICC values at all segmentation settings.
Conclusion: Interobserver reliability of texture features from CT scans of a heterogeneous collection of manually segmented lung lesions was low, with only a small proportion of features demonstrating high reliability independent of the segmentation settings. These results indicate a limited applicability of texture analysis and the need to define robust texture features in patients with lung lesions.
Key Points
  • Our study showed a low reproducibility of texture features when three radiologists independently segmented lung lesions in CT images, which highlights a serious limitation of texture analysis.
  • Interobserver reliability of texture features was low regardless of whether the lesion was segmented in 2D or 3D, with or without a HU threshold.
  • In contrast to texture features, shape parameters showed a high interobserver reliability whether lesions were segmented in 2D or 3D, with or without a HU threshold of −50 HU.
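Interobserver reliability of the kind reported above is quantified per feature with an intraclass correlation coefficient. The abstract does not state which ICC form was used, so the following is only a minimal sketch that assumes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1). The function name `icc_2_1` and the sample values are hypothetical and are not taken from the study.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) for a (n_lesions, n_readers) table of one feature.

    Assumed form: two-way random effects, absolute agreement, single rater.
    The study's actual ICC model may differ.
    """
    x = np.asarray(ratings, dtype=float)
    if x.ndim != 2 or x.shape[0] < 2 or x.shape[1] < 2:
        raise ValueError("need a 2D table with at least 2 lesions and 2 readers")
    n, k = x.shape

    grand_mean = x.mean()
    row_means = x.mean(axis=1)  # per-lesion mean across readers
    col_means = x.mean(axis=0)  # per-reader mean across lesions

    # Mean squares from the two-way ANOVA decomposition
    ms_rows = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand_mean) ** 2) / (k - 1)
    residual = x - row_means[:, None] - col_means[None, :] + grand_mean
    ms_err = np.sum(residual ** 2) / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    return float((ms_rows - ms_err) / denominator)


# Hypothetical values of a single texture feature:
# 5 lesions (rows) segmented by 3 readers (columns).
feature_values = np.array([
    [0.82, 0.79, 0.85],
    [0.41, 0.47, 0.39],
    [0.65, 0.60, 0.70],
    [0.30, 0.35, 0.28],
    [0.91, 0.88, 0.95],
])
print(round(icc_2_1(feature_values), 3))
```

In a full analysis, a function like this would be applied once per feature and per segmentation setting (2D or 3D, with or without the HU threshold), and the resulting ICCs would then be grouped into reliability categories using cut-offs chosen by the analyst.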
Funding Information
  • Medical University of Graz