Artificial Intelligence Assistance Significantly Improves Gleason Grading of Prostate Biopsies by Pathologists

Aug 1, 2020
Wouter Bulten, Maschenka Balkenhol, Jean-Joël Awoumou Belinga, Américo Brilhante, Aslı Çakır, Lars Egevad, Martin Eklund, Xavier Farré, Katerina Geronatsiou, Vincent Molinié, Guilherme Pereira, Paromita Roy, Günter Saile, Paulo Salles, Ewout Schaafsma, Joëlle Tschui, Anne-Marie Vos, Brett Delahunt, Hemamali Samaratunga, David J. Grignon, Andrew J. Evans, Daniel M. Berney, Chin-Chen Pan, Glen Kristiansen, James G. Kench, Jon Oxley, Katia R. M. Leite, Jesse K. McKenney, Peter A. Humphrey, Samson W. Fine, Toyonori Tsuzuki, Murali Varma, Ming Zhou, Eva Comperat, David G. Bostwick, Kenneth A. Iczkowski, Cristina Magi-Galluzzi, John R. Srigley, Hiroyuki Takahashi, Theo Van Der Kwast, Hester Van Boven, Robert Vink, Jeroen Van Der Laak, Christina Hulsbergen-Van Der Kaa, Geert Litjens
Abstract
The Gleason score is the most important prognostic marker for prostate cancer patients, but it suffers from significant observer variability. Artificial intelligence (AI) systems based on deep learning can achieve pathologist-level performance at Gleason grading. However, the performance of such systems can degrade in the presence of artifacts, foreign tissue, or other anomalies. Pathologists integrating their expertise with feedback from an AI system could result in a synergy that outperforms both the individual pathologist and the system. Despite the hype around AI assistance, existing literature on this topic within the pathology domain is limited. We investigated the value of AI assistance for grading prostate biopsies. A panel of 14 observers graded 160 biopsies with and without AI assistance. With AI assistance, the agreement of the panel with an expert reference standard increased significantly (quadratically weighted Cohen’s kappa, 0.799 vs. 0.872; p = 0.019). On an external validation set of 87 cases, the panel showed a significant increase in agreement with a panel of international experts in prostate pathology (quadratically weighted Cohen’s kappa, 0.733 vs. 0.786; p = 0.003). In both experiments, at the group level, AI-assisted pathologists outperformed the unassisted pathologists and the standalone AI system. Our results show the potential of AI systems for Gleason grading, but more importantly, show the benefits of pathologist-AI synergy.
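The agreement metric reported above, quadratically weighted Cohen's kappa, penalizes a disagreement by the squared distance between the two ordinal grades, so a two-step miss costs four times as much as a one-step miss. The sketch below is a minimal illustration of the standard formula, not the authors' evaluation code. The function name `quadratic_weighted_kappa`, the integer encoding of grades as `0..n_categories-1`, and the toy ratings are all assumptions made for this example.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Quadratically weighted Cohen's kappa for two raters.

    Ratings are assumed to be integers in 0..n_categories-1 on an
    ordinal scale (a hypothetical encoding chosen for this sketch).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("need at least two categories")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts per (a, b) pair
    marginal_a = Counter(rater_a)              # how often rater A used each grade
    marginal_b = Counter(rater_b)              # how often rater B used each grade
    max_sq_dist = (n_categories - 1) ** 2

    observed_penalty = 0.0  # weighted disagreement actually seen
    expected_penalty = 0.0  # weighted disagreement expected by chance
    for i in range(n_categories):
        for j in range(n_categories):
            weight = (i - j) ** 2 / max_sq_dist
            observed_penalty += weight * observed[(i, j)] / n
            expected_penalty += weight * (marginal_a[i] / n) * (marginal_b[j] / n)

    if expected_penalty == 0.0:
        # Both raters used one and the same single category throughout.
        raise ValueError("kappa is undefined for constant identical ratings")
    return 1.0 - observed_penalty / expected_penalty


# Toy data (invented for illustration): six ordinal categories, 0 to 5.
reference = [0, 1, 2, 3, 4, 5, 2, 3]
observer = [0, 1, 2, 4, 4, 5, 1, 3]
kappa = quadratic_weighted_kappa(reference, observer, n_categories=6)
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. Where scikit-learn is available, `sklearn.metrics.cohen_kappa_score(y1, y2, weights="quadratic")` computes the same statistic.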
Type
Publication
Mod Pathol