Automatic Detection and Scoring of Kidney Stones on Noncontrast CT Images Using STONE Nephrolithometry: Combined Deep Learning and Thresholding Methods

Abstract
Purpose: To develop and validate a model, based on deep learning and thresholding, that automatically detects kidney stones and scores them according to S.T.O.N.E. nephrolithometry.

Procedures: Abdominal noncontrast computed tomography (NCCT) images archived from February 2018 to April 2019 were retrospectively collected into three datasets: a segmentation dataset (n = 167), a hydronephrosis classification dataset (n = 282), and a test dataset (n = 117). The model consisted of four steps. First, 3D U-Nets were developed for kidney and renal sinus segmentation. Second, deep 3D dual-path networks were developed for hydronephrosis grading. Third, thresholding methods were used to detect and segment stones in the renal sinus region; stone size, CT attenuation, and tract length were calculated from the segmented stone region. Fourth, the stone's location was determined. Stone detection performance was evaluated with sensitivity and positive predictive value (PPV). Hydronephrosis grading was evaluated with the area under the curve (AUC), and the grading of stone size, tract length, number of involved calyces, and essence was evaluated with linear-weighted kappa statistics.

Results: The stone detection algorithm reached a sensitivity of 95.9 % (236/246) and a PPV of 98.7 % (236/239). The hydronephrosis classification algorithm achieved an AUC of 0.97. The scoring model showed good agreement with the radiologists' results for stone size (kappa = 0.95, 95 % confidence interval [CI]: 0.92, 0.98), tract length (kappa = 0.97, 95 % CI: 0.95, 1.00), number of involved calyces (kappa = 0.95, 95 % CI: 0.92, 0.98), and essence grading (kappa = 0.97, 95 % CI: 0.94, 1.00).

Conclusions: A scoring model was constructed that can automatically detect and score stones on NCCT images.
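
The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the kappa routine is the standard Cohen's kappa with linear weights (the software used in the study is not stated here), and the grade lists at the end are toy values rather than study data. The false-negative and false-positive counts are derived from the reported fractions 236/246 and 236/239.

```python
from typing import Sequence


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference stones that were detected."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of detections that correspond to a reference stone."""
    return true_pos / (true_pos + false_pos)


def linear_weighted_kappa(
    rater_a: Sequence[int], rater_b: Sequence[int], n_grades: int
) -> float:
    """Cohen's kappa with linear disagreement weights |i - j| / (n_grades - 1).

    Grades are assumed to be integers in 0 .. n_grades - 1.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters need the same non-zero number of grades")
    if n_grades < 2:
        raise ValueError("at least two grades are required")

    n = len(rater_a)
    # Observed cross-tabulation of the two raters' grades.
    observed = [[0] * n_grades for _ in range(n_grades)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    row_totals = [sum(row) for row in observed]
    col_totals = [
        sum(observed[i][j] for i in range(n_grades)) for j in range(n_grades)
    ]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_grades):
        for j in range(n_grades):
            weight = abs(i - j) / (n_grades - 1)
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return 1.0 - weighted_observed / weighted_expected


# Counts implied by the abstract: 236 of 246 stones detected, 239 detections.
print(f"sensitivity = {sensitivity(236, 246 - 236):.1%}")
print(f"PPV = {positive_predictive_value(236, 239 - 236):.1%}")

# Hypothetical toy grades (NOT study data), three ordinal grades 0..2.
model_grades = [0, 1, 2, 2]
reader_grades = [0, 1, 2, 1]
print(f"kappa = {linear_weighted_kappa(model_grades, reader_grades, 3):.3f}")
```

Linear weights penalise a disagreement in proportion to how many grades apart the two ratings are, which suits ordinal scores such as those in this study; an unweighted kappa would treat a one-grade miss the same as a miss across the whole scale.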