Fully automated prediction of liver fibrosis using deep learning analysis of gadoxetic acid–enhanced MRI

Abstract
Objectives To (1) develop a fully automated deep learning (DL) algorithm based on gadoxetic acid–enhanced hepatobiliary phase (HBP) MRI and (2) compare the diagnostic performance of DL vs. MR elastography (MRE) for noninvasive staging of liver fibrosis.
Methods This single-center retrospective study included 355 patients (M/F 238/117, mean age 60 years; training, n = 178; validation, n = 123; test, n = 54) who underwent gadoxetic acid–enhanced abdominal MRI, including HBP and MRE, and pathological evaluation of the liver within 1 year of MRI. Cropped liver HBP images from a custom-written fully automated liver segmentation were used as input for DL. A transfer learning approach based on the ImageNet VGG16 model was used. Different DL models were built for the prediction of fibrosis stages F1-4, F2-4, F3-4, and F4. ROC analysis was performed to evaluate the performance of DL in training, validation, and test sets and of MRE liver stiffness in the test set.
Results AUC values of DL were 0.99/0.70/0.77 (F1-4), 0.92/0.71/0.91 (F2-4), 0.91/0.78/0.90 (F3-4), and 0.98/0.83/0.85 (F4) for training/validation/test sets, respectively. The AUCs of MRE liver stiffness in the test set were 0.86 (F1-4), 0.87 (F2-4), 0.92 (F3-4), and 0.86 (F4). AUCs of MRE and DL were not significantly different for any of the fibrosis stages (p > 0.134).
Conclusions The fully automated DL models based on HBP gadoxetic acid MRI showed good-to-excellent diagnostic performance for staging of liver fibrosis, with similar diagnostic performance to MRE. After validation in independent sets, the DL algorithm may allow for noninvasive liver fibrosis assessment without the need for additional MRI hardware.
Key Points
• The developed deep learning algorithm, based on routine standard-of-care gadoxetic acid–enhanced MRI data, showed good-to-excellent diagnostic performance for noninvasive staging of liver fibrosis.
• The diagnostic performance of the deep learning algorithm was equivalent to that of MR elastography in a separate test set.
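
The ROC analysis described in the Methods evaluates one binary question per fibrosis threshold (F1-4, F2-4, F3-4, F4). As an illustration only, the following Python sketch shows how pathological stages could be dichotomized at a threshold and how an AUC could be computed for a continuous score, such as a DL output probability or an MRE liver stiffness value. The function names (`binarize_stage`, `roc_auc`) and the six example cases are hypothetical and are not taken from the study; the AUC is computed with the standard Mann-Whitney formulation, which may differ from the statistical software the authors used.

```python
from typing import List, Sequence


def binarize_stage(stages: Sequence[int], threshold: int) -> List[int]:
    """Return 1 where the fibrosis stage is >= threshold, else 0.

    threshold=2, for example, yields the F2-4 vs. F0-1 comparison.
    """
    return [1 if s >= threshold else 0 for s in stages]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example data (NOT study data): pathological stage F0-F4
# and a continuous score per patient.
stages = [0, 1, 2, 3, 4, 4]
scores = [2.1, 2.6, 2.4, 4.0, 5.2, 6.1]

# One AUC per threshold, mirroring the F1-4, F2-4, F3-4, and F4 models.
for threshold in (1, 2, 3, 4):
    auc = roc_auc(binarize_stage(stages, threshold), scores)
    print(f"F{threshold}-4 vs. lower stages: AUC = {auc:.3f}")
```

With these made-up values the F2-4 threshold gives an AUC of 0.875, because one F2 case scores below one F1 case; the other three thresholds separate perfectly. Comparing two AUCs measured on the same patients (for example DL vs. MRE) would additionally require a paired test, which this sketch does not implement.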