Artificial Intelligence Assistance Significantly Improves Gleason Grading of Prostate Biopsies by Pathologists

Aug 1, 2020

Wouter Bulten

Maschenka Balkenhol

Jean-Joël Awoumou Belinga

Américo Brilhante

Aslı Çakır

Lars Egevad

Martin Eklund

Xavier Farré

Katerina Geronatsiou

Vincent Molinié

Guilherme Pereira

Paromita Roy

Günter Saile

Paulo Salles

Ewout Schaafsma

Joëlle Tschui

Anne-Marie Vos

Brett Delahunt

Hemamali Samaratunga

David J. Grignon

Andrew J. Evans

Daniel M. Berney

Chin-Chen Pan

Glen Kristiansen

James G. Kench

Jon Oxley

Katia R. M. Leite

Jesse K. McKenney

Peter A. Humphrey

Samson W. Fine

Toyonori Tsuzuki

Murali Varma

Ming Zhou

Eva Comperat

David G. Bostwick

Kenneth A. Iczkowski

Cristina Magi-Galluzzi

John R. Srigley

Hiroyuki Takahashi

Theo Van Der Kwast

Hester Van Boven

Robert Vink

Jeroen Van Der Laak

Christina Hulsbergen-Van Der Kaa

Geert Litjens


Abstract

The Gleason score is the most important prognostic marker for prostate cancer patients, but it suffers from significant observer variability. Artificial intelligence (AI) systems based on deep learning can achieve pathologist-level performance at Gleason grading. However, the performance of such systems can degrade in the presence of artifacts, foreign tissue, or other anomalies. Pathologists who integrate their expertise with feedback from an AI system could achieve a synergy that outperforms both the individual pathologist and the system. Despite the hype around AI assistance, the existing literature on this topic within the pathology domain is limited. We investigated the value of AI assistance for grading prostate biopsies. A panel of 14 observers graded 160 biopsies with and without AI assistance. With AI assistance, the agreement of the panel with an expert reference standard increased significantly (quadratically weighted Cohen’s kappa, 0.799 vs. 0.872; p = 0.019). On an external validation set of 87 cases, the panel showed a significant increase in agreement with a panel of international experts in prostate pathology (quadratically weighted Cohen’s kappa, 0.733 vs. 0.786; p = 0.003). In both experiments, at the group level, AI-assisted pathologists outperformed both the unassisted pathologists and the standalone AI system. Our results show the potential of AI systems for Gleason grading and, more importantly, the benefits of pathologist-AI synergy.
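For readers unfamiliar with the agreement metric quoted in the abstract, the following is a minimal sketch of quadratically weighted Cohen's kappa between two raters. It is not the authors' evaluation code. The function name `quadratic_weighted_kappa`, the integer encoding of grades as `0 .. num_categories - 1`, and the toy ratings are all assumptions made for illustration; the abstract does not specify how grades were encoded or how per-observer scores were aggregated across the panel.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, num_categories):
    """Quadratically weighted Cohen's kappa for two equal-length lists of
    ordinal labels encoded as integers in [0, num_categories).

    Hypothetical helper for illustration only.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if num_categories < 2:
        raise ValueError("need at least two categories")

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts per (a, b) pair
    marginal_a = Counter(rater_a)              # label counts for rater A
    marginal_b = Counter(rater_b)              # label counts for rater B
    max_dist = (num_categories - 1) ** 2

    weighted_observed = 0.0  # weighted disagreement actually observed
    weighted_expected = 0.0  # weighted disagreement expected by chance
    for i in range(num_categories):
        for j in range(num_categories):
            # Quadratic penalty: larger grade differences cost more.
            weight = (i - j) ** 2 / max_dist
            weighted_observed += weight * observed[(i, j)] / n
            weighted_expected += weight * marginal_a[i] * marginal_b[j] / (n * n)

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - weighted_observed / weighted_expected


# Toy example with made-up ratings (not data from the study).
reference = [0, 1, 2, 3, 4, 5]
observer = [0, 1, 2, 3, 5, 4]
kappa = quadratic_weighted_kappa(reference, observer, num_categories=6)
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and the quadratic weights mean that a disagreement of two grade levels is penalized four times as heavily as a disagreement of one.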

Type

Journal article

Publication

Mod Pathol

Last updated on Aug 1, 2020
