Interactive digital slides with heat maps: a novel method to improve the reproducibility of Gleason grading

Abstract
Our aims were to analyze reporting of Gleason pattern (GP) 3 and 4 prostate cancer under the 2005 International Society of Urological Pathology (ISUP) Gleason grading and to collect consensus cases for standardization. We scanned 25 prostate biopsy cores diagnosed as Gleason score (GS) 6–7. Fifteen genitourinary pathologists graded the digital slides and circled areas of GP 4 and 5 in a slide viewer. Grading difficulty was scored as 1–3. GP 4 components were classified as type 1 (cribriform), 2 (fused), or 3 (poorly formed glands). A GS of 5–6, 7 (3 + 4), 7 (4 + 3), and 8–9 was given in 29%, 41%, 19%, and 10%, respectively (mean GS 6.84, range 6.44–7.36). In 15 cases, at least 67% of observers agreed on GS groups (consensus cases). Mean interobserver weighted kappa for GS groups was 0.43. Mean difficulty scores in consensus and non-consensus cases were 1.44 and 1.66, respectively (p = 0.003). Pattern 4 types 1, 2, and 3 were seen in 28%, 86%, and 67% of GP 4, respectively. All three coexisted in 16% (11% and 23% in consensus and non-consensus cases, respectively, p = 0.03). Average estimated and calculated %GP 4/5 were 29% and 16%, respectively. After individual review, the experts met to analyze diagnostic difficulties. Areas of GP 4 and 5 were displayed as heat maps, which were helpful for identifying contentious areas. A key problem was agreeing on minimal criteria for small foci of GP 4. In summary, the detection threshold for GP 4 in needle biopsies (NBX) needs to be better defined. This set of consensus cases may be useful for standardization.
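
As an illustration of the two computational ideas in the abstract, the following minimal Python sketch shows how a consensus case could be identified and how observer annotations could be aggregated into a heat map. It is not the authors' implementation. The function names `consensus_group` and `annotation_heat_map`, the group labels, the reading of "at least 67%" as two thirds of observers (10 of 15), and the representation of each observer's circled area as a binary pixel grid are all assumptions made for this example, and the data are toy values.

```python
from collections import Counter
from fractions import Fraction

# Assumption: "at least 67% of observers" is read as two thirds (10 of 15).
MIN_SHARE = Fraction(2, 3)


def consensus_group(grades, min_share=MIN_SHARE):
    """Return the most common GS group if enough observers chose it, else None."""
    group, count = Counter(grades).most_common(1)[0]
    # Exact rational comparison avoids floating-point edge cases at the threshold.
    return group if Fraction(count, len(grades)) >= min_share else None


def annotation_heat_map(masks):
    """Sum binary masks pixel by pixel.

    Each mask is a grid of 0/1 values marking where one observer circled
    GP 4 or 5 (hypothetical representation). Each cell of the result is
    the number of observers who marked that pixel.
    """
    rows, cols = len(masks[0]), len(masks[0][0])
    return [[sum(m[r][c] for m in masks) for c in range(cols)]
            for r in range(rows)]


# Toy data: 15 observers grading one case, with hypothetical group labels.
case_grades = ["7 (3+4)"] * 10 + ["5-6"] * 5
print(consensus_group(case_grades))

# Toy data: three observers annotating a 2 x 2 pixel region.
toy_masks = [
    [[0, 1], [1, 1]],
    [[0, 1], [0, 1]],
    [[0, 0], [0, 1]],
]
print(annotation_heat_map(toy_masks))
```

In such a map, cells marked by all observers or by none indicate agreement, whereas intermediate counts point to the contentious areas that merit discussion.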