Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge
Jan 1, 2022
Wouter Bulten
Kimmo Kartasalo
Po-Hsuan Cameron Chen
Peter Ström
Hans Pinckaers
Kunal Nagpal
Yuannan Cai
David F. Steiner
Hester Van Boven
Robert Vink
Christina Hulsbergen Van De Kaa
Jeroen Van Der Laak
Mahul B. Amin
Andrew J. Evans
Theodorus Van Der Kwast
Robert Allan
Peter A. Humphrey
Henrik Grönberg
Hemamali Samaratunga
Brett Delahunt
Toyonori Tsuzuki
Tomi Häkkinen
Lars Egevad
Maggie Demkin
Sohier Dane
Fraser Tan
Masi Valkonen
Greg S. Corrado
Lily Peng
Craig H. Mermel
Pekka Ruusuvuori
Geert Litjens
Martin Eklund
Américo Brilhante
Aslı Çakır
Xavier Farré
Katerina Geronatsiou
Vincent Molinié
Guilherme Pereira
Paromita Roy
Günter Saile
Paulo G. O. Salles
Ewout Schaafsma
Joëlle Tschui
Jorge Billoch-Lima
Emílio M. Pereira
Ming Zhou
Shujun He
Sejun Song
Qing Sun
Hiroshi Yoshihara
Taiki Yamaguchi
Kosaku Ono
Tao Shen
Jianyi Ji
Arnaud Roussel
Kairong Zhou
Tianrui Chai
Nina Weng
Dmitry Grechka
Maxim V. Shugaev
Raphael Kiminya
Vassili Kovalev
Dmitry Voynov
Valery Malyshev
Elizabeth Lapo
Manuel Campos
Noriaki Ota
Shinsuke Yamaoka
Yusuke Fujimoto
Kentaro Yoshioka
Joni Juvonen
Mikko Tukiainen
Antti Karlsson
Rui Guo
Chia-Lun Hsieh
Igor Zubarev
Habib S. T. Bukhar
Wenyuan Li
Jiayun Li
William Speier
Corey Arnold
Kyungdoc Kim
Byeonguk Bae
Yeong Won Kim
Hong-Seok Lee
Jeonghyuk Park
The PANDA Challenge Consortium
Abstract
Artificial intelligence (AI) has shown promise for diagnosing prostate cancer in biopsies. However, results have been limited to individual studies, lacking validation in multinational settings. Competitions have been shown to be accelerators for medical imaging innovations, but their impact is hindered by lack of reproducibility and independent validation. With this in mind, we organized the PANDA challenge (the largest histopathology competition to date, joined by 1,290 developers) to catalyze development of reproducible AI algorithms for Gleason grading using 10,616 digitized prostate biopsies. We validated that a diverse set of submitted algorithms reached pathologist-level performance on independent cross-continental cohorts, fully blinded to the algorithm developers. On United States and European external validation sets, the algorithms achieved agreements of 0.862 (quadratically weighted κ, 95% confidence interval (CI), 0.840-0.884) and 0.868 (95% CI, 0.835-0.900) with expert uropathologists, respectively. Successful generalization across different patient populations, laboratories and reference standards, achieved by a variety of algorithmic approaches, warrants evaluating AI-based Gleason grading in prospective clinical trials.
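For readers unfamiliar with the agreement metric quoted in the abstract, the following is a minimal sketch of quadratically weighted Cohen's κ in plain Python. It is not the challenge's evaluation code. The function name, the integer label encoding (grades 0 to k-1) and the six-class example are assumptions made here for illustration only.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Quadratically weighted Cohen's kappa for two lists of integer labels.

    Assumption (not from the source): labels are integers in
    [0, num_classes - 1], e.g. ordinal grade categories.
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("rating lists must be non-empty and of equal length")
    if num_classes < 2:
        raise ValueError("need at least two classes")

    # Observed confusion matrix: observed[i][j] counts cases rated i by A and j by B.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal label counts for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_classes))
              for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty: disagreements further apart cost more.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0:
        # Both raters used a single identical class; kappa is undefined.
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings on a six-level ordinal scale (values invented for illustration).
algorithm = [0, 1, 2, 3, 4, 5, 2, 3]
pathologist = [0, 1, 2, 3, 4, 5, 3, 3]
kappa = quadratic_weighted_kappa(algorithm, pathologist, num_classes=6)
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. The quadratic weights penalize a two-grade discrepancy four times as heavily as a one-grade discrepancy, which suits ordinal grading scales.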
Type
Publication
Nat Med