Deep unsupervised learning of cancer tissue representations

Claudio Quirós, Adalberto (2023) Deep unsupervised learning of cancer tissue representations. PhD thesis, University of Glasgow.


Histopathological images of tumors contain abundant information about how tumors grow and how they interact with their micro-environment. Characterizing and improving our understanding of phenotypes can reveal factors related to tumor progression and their underlying biological processes, ultimately improving diagnosis and treatment. In recent years, deep learning applications in pathology have made steady progress, yet most of these applications take a supervised approach, relating tissue to associated sample labels or annotations.

The impact of supervised approaches is limited by three factors. Firstly, high-quality labels are expensive in time and effort, which makes them difficult to scale. Secondly, these methods focus on identifying known histologic patterns in pathology images, fundamentally restricting the discovery of new tissue phenotypes. Thirdly, supervised methods rely on annotations by experts, and there is often significant subjectivity and bias in determining certain histologic patterns. These limitations emphasize the importance of novel methods capable of characterizing tissue by the entire spectrum of features enclosed in the image, without pre-defined annotation or supervision.

This thesis explores the use of generative adversarial networks and self-supervised learning for unsupervised learning of cancer tissue representations, making the following contributions.

We propose PathologyGAN, a generative adversarial network (GAN) for unsupervised representation learning of cancer tissue images. We demonstrate that PathologyGAN generates high-fidelity tissue images through quantitative measures and pathologist analyses. Additionally, we show how PathologyGAN captures key tissue characteristics such as cancer cell, lymphocyte, or stroma density to provide a structured and interpretable latent space.

We further develop PathologyGAN by introducing an encoder that allows us to find representations of real tissue images. We show evidence that these representations capture meaningful tissue information by illustrating their applicability in three different scenarios: we visualize representations with clinical annotations, we quantify how informative they are by training a linear classifier to predict tissue types, and we demonstrate their suitability for a multiple instance learning (MIL) task, predicting the presence of tumor in whole slide images (WSIs).
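
As a rough illustration of the linear-classifier evaluation mentioned above, the following minimal sketch fits a softmax classifier on fixed feature vectors. It is not the thesis code: the synthetic features stand in for encoder outputs, and the function names, the three hypothetical tissue types, the 16-dimensional features, and the training hyperparameters are all assumptions made for the example.

```python
import numpy as np

def train_linear_probe(features, labels, num_classes, lr=0.1, epochs=200):
    """Fit a softmax (multinomial logistic) classifier on frozen features."""
    n, d = features.shape
    weights = np.zeros((d, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ weights + bias
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - one_hot) / n  # gradient of mean cross-entropy w.r.t. logits
        weights -= lr * features.T @ grad
        bias -= lr * grad.sum(axis=0)
    return weights, bias

def predict(features, weights, bias):
    """Return the most likely class index for each feature vector."""
    return np.argmax(features @ weights + bias, axis=1)

# Synthetic stand-in for encoder outputs (assumption): three hypothetical
# tissue types, each a Gaussian blob in a 16-dimensional feature space.
rng = np.random.default_rng(0)
centers = rng.normal(scale=3.0, size=(3, 16))
labels = rng.integers(0, 3, size=300)
features = centers[labels] + rng.normal(size=(300, 16))

# Only the linear layer is trained; the features themselves stay fixed.
weights, bias = train_linear_probe(features[:200], labels[:200], num_classes=3)
accuracy = float(np.mean(predict(features[200:], weights, bias) == labels[200:]))
print(f"held-out accuracy: {accuracy:.2f}")
```

Because only a linear layer is trained, held-out accuracy in a setup like this reflects how linearly separable the classes already are in the representation space, which is the sense in which such a classifier quantifies how informative the representations are.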

Finally, we introduce Histomorphological Phenotype Learning (HPL), a methodology to automatically identify and cluster tissue morphologies through self-supervised learning and community detection. We demonstrate the significance of the clustered tissue morphologies by relating them to cancer types, patient outcomes, cell types, growth patterns, and omics-based immune signatures.
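
The general idea of clustering representations by community detection can be sketched as follows. This is a toy example under stated assumptions, not the HPL pipeline: the abstract does not name the algorithm used, so simple label propagation over a k-nearest-neighbour graph is swapped in here, and the function names, synthetic features, and parameter values are hypothetical.

```python
import numpy as np

def knn_graph(features, k):
    """Build a symmetric k-nearest-neighbour adjacency list (Euclidean)."""
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a node is never its own neighbour
    neighbours = [set() for _ in range(len(features))]
    for i, row in enumerate(dists):
        for j in np.argsort(row)[:k]:
            neighbours[i].add(int(j))
            neighbours[int(j)].add(i)
    return neighbours

def label_propagation(neighbours, max_iters=100, seed=0):
    """Asynchronous label propagation: each node adopts the label most
    common among its neighbours until no label changes."""
    rng = np.random.default_rng(seed)
    labels = list(range(len(neighbours)))  # every node starts in its own community
    for _ in range(max_iters):
        changed = False
        for node in rng.permutation(len(neighbours)):
            counts = {}
            for nb in neighbours[node]:
                counts[labels[nb]] = counts.get(labels[nb], 0) + 1
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[node] not in candidates:
                labels[node] = candidates[0]  # deterministic tie-break
                changed = True
        if not changed:
            break
    return labels

# Synthetic stand-in for tile representations (assumption): two
# well-separated groups of 20 vectors each in an 8-dimensional space.
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 1.0, size=(20, 8))
group_b = rng.normal(10.0, 1.0, size=(20, 8))
features = np.vstack([group_a, group_b])

communities = label_propagation(knn_graph(features, k=5))
print(sorted(set(communities)))
```

In this toy setting the two groups share no graph edges, so no community can span both of them; each group may still split into more than one community.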

Throughout this thesis, we propose GAN-based and self-supervised learning-based methods for unsupervised learning of cancer tissue representations. We evaluate the relevance of the tissue information enclosed in the representations by correlating them with tissue types or cell type density. In addition, we apply the representations in real-world tasks such as tumor presence or cancer type classification over WSIs. Furthermore, we study the general applicability of the self-supervised learning methods by relating representations with clinically relevant annotations such as patient outcomes or omics-based immune signatures. Overall, this thesis provides examples of the potential of unsupervised learning for cancer tissue representations, delivering a tool not only to alleviate the bottleneck of high-quality annotations but also as a means to further study tissue morphologies through growth patterns and associations with molecular phenotypes.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: Yuan, Dr. Ke
Date of Award: 2023
Depositing User: Theses Team
Unique ID: glathesis:2023-83374
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 24 Jan 2023 08:35
Last Modified: 24 Jan 2023 08:39
Thesis DOI: 10.5525/gla.thesis.83374