Interrater Agreement and Reliability of PERCIST and Visual Assessment When Using 18F-FDG-PET/CT for Response Monitoring of Metastatic Breast Cancer

Open Access

24 November 2020

journal article
research article
Published by MDPI AG in Diagnostics

Vol. 10 (12), 1001
https://doi.org/10.3390/diagnostics10121001

Abstract

Response evaluation at regular intervals is indicated for treatment of metastatic breast cancer (MBC). FDG-PET/CT has the potential to monitor treatment response accurately. Our purpose was to: (a) compare the interrater agreement and reliability of the semi-quantitative PERCIST criteria with those of qualitative visual assessment in response evaluation of MBC and (b) investigate the intrarater agreement when comparing each rater's visual assessment with their respective PERCIST assessment. We performed a retrospective study on FDG-PET/CT in women who received treatment for MBC. Three specialists in nuclear medicine categorized response by qualitative assessment and by standardized one-lesion PERCIST assessment. The scans were categorized into complete metabolic response, partial metabolic response, stable metabolic disease, and progressive metabolic disease. In total, 37 patients with 179 scans were included. Visual assessment categorization yielded moderate agreement with an overall proportion of agreement (PoA) between raters of 0.52 (95% CI 0.44–0.66) and a Fleiss kappa estimate of 0.54 (95% CI 0.46–0.62). PERCIST response categorization yielded substantial agreement with an overall PoA of 0.65 (95% CI 0.57–0.73) and a Fleiss kappa estimate of 0.68 (95% CI 0.60–0.75). The difference in PoA between the overall estimates for PERCIST and visual assessment was 0.13 (95% CI 0.06–0.21; p = 0.001); the corresponding difference in kappa was 0.14 (95% CI 0.06–0.21; p < 0.001). The overall intrarater PoA was 0.80 (95% CI 0.75–0.84), with substantial agreement indicated by a Fleiss kappa of 0.74 (95% CI 0.69–0.79). Semi-quantitative PERCIST assessment achieved a significantly higher level of overall agreement and reliability than qualitative assessment among three raters. The high levels of intrarater agreement achieved indicated no obvious conflicting elements between the two methods. PERCIST assessment may, therefore, give more consistent interpretations between raters when using FDG-PET/CT for response evaluation in MBC.
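
To illustrate the two summary statistics reported in the abstract, the following Python sketch computes a proportion of agreement and a Fleiss kappa for several raters assigning each scan to one of the four response categories. It is a minimal sketch under stated assumptions, not the authors' analysis code: PoA is taken here to mean the fraction of scans on which all raters chose the same category (the article may define it differently), the category abbreviations and the function names are hypothetical, and the toy ratings are invented for illustration only. Confidence intervals and significance tests are not covered.

```python
from collections import Counter

# Abbreviations for the four response categories named in the abstract
# (the short labels themselves are an assumption made for this sketch).
CATEGORIES = ("CMR", "PMR", "SMD", "PMD")


def proportion_of_agreement(ratings):
    """Fraction of scans on which every rater chose the same category.

    Assumed definition; `ratings` holds one tuple of labels per scan.
    """
    return sum(len(set(row)) == 1 for row in ratings) / len(ratings)


def fleiss_kappa(ratings, categories=CATEGORIES):
    """Fleiss' kappa for a fixed number of raters per scan."""
    n_scans = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(row) for row in ratings]

    # Observed agreement: per-scan share of agreeing rater pairs, averaged.
    p_obs = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_scans

    # Chance agreement from the pooled category proportions.
    p_cat = [
        sum(c[cat] for c in counts) / (n_scans * n_raters)
        for cat in categories
    ]
    p_exp = sum(p ** 2 for p in p_cat)

    return (p_obs - p_exp) / (1 - p_exp)


# Invented toy data: six scans, three raters each (not from the study).
toy_ratings = [
    ("CMR", "CMR", "CMR"),
    ("PMR", "PMR", "SMD"),
    ("SMD", "SMD", "SMD"),
    ("PMD", "PMD", "PMD"),
    ("PMR", "SMD", "PMD"),
    ("PMR", "PMR", "PMR"),
]

if __name__ == "__main__":
    print(f"PoA   = {proportion_of_agreement(toy_ratings):.2f}")
    print(f"kappa = {fleiss_kappa(toy_ratings):.2f}")
```

Under this all-raters-agree definition, PoA can fall below kappa, which is consistent with the visual assessment figures above (PoA 0.52, kappa 0.54); a mean pairwise agreement could never be lower than the kappa derived from it.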

