Computational pathology, the application of advanced machine learning (ML) methods to digitized tissue sections, can revolutionize cancer care and research. Specifically, I propose a paradigm shift away from the manual grading systems currently in use and towards ML-supported patient prognostication. However, significant knowledge gaps are hindering the field. We do not know how to: 1) effectively leverage global and local information in whole-slide images (WSIs), 2) identify pan-cancer and cancer-specific prognostic features, and 3) make ML models explainable and interpretable.

Flowchart of the interlocking work packages

This ambitious project will address these critical knowledge gaps by building on the novel stochastic streaming gradient descent (SSGD) method developed in my group. First, I will extend SSGD by integrating hierarchical hyperparameter optimization and separable convolutions. Second, to identify pan-cancer and cancer-specific prognostic biomarkers, I will integrate innovative multi-task and cross-task learning algorithms with SSGD. Third, I will leverage the latest advances in concept learning and natural language processing to endow deep neural networks with unprecedented transparency and explainability. Last, I will validate the developed methodology on the world's largest dataset of oncological WSIs.

By publicly releasing all developed tools and data, the proposed project will have a scientific multiplier effect on the fields of computational pathology, machine learning, and oncology. Specifically, the enhanced SSGD method can open new research areas for ML that require data across scales, such as remote sensing. My novel approach to ML explainability can encourage the adoption of innovative technologies, such as self-driving cars. Last, the derived cancer-specific and pan-cancer biomarkers will have a tremendous impact on the quest to understand cancer development and progression, and ultimately on public health and the economy.