Integration of physical prior knowledge in machine learning imaging workflows

Aversa, Marco (2024) Integration of physical prior knowledge in machine learning imaging workflows. PhD thesis, University of Glasgow.

Full text available as:
PDF (edited version, chapter 4 removed due to confidentiality issues)
Download (168MB)

Abstract

In this thesis, we introduce several scientific machine learning methods to circumvent models’ data dependencies by bridging the gap between physical insight into the data acquisition process and machine learning paradigms. The aim is to open the neural network black box by combining it with a well-known forward process, the white-box model. Given access to a well-defined white-box model, we can embed its information inside the network to obtain a hybrid model whose response is partially known in advance.

Focusing on medical and aerospace imaging applications, we leverage sensor calibration profiles and prior knowledge of image signal processing to develop three novel validation protocols based on a physically faithful differentiable model. Starting from the object, through the optics, to the sensor, the entire imaging process is integrated into the machine learning workflow to detect model failures and enhance model robustness. These methods extend model generalization beyond classical techniques such as catalogue testing or augmentation, bringing additional freedom to the data and to model explainability.

Guided by the principle of metrologically precise data handling, we designed a data-centric machine learning workflow to emulate expensive satellite imaging payloads using more affordable drone image data. The emulation mimics the pixel distribution and the optical properties of the target acquisition system, allowing in silico model validation before the physical prototype is launched. The experiments identify the lowest resolution and signal-to-noise ratio necessary for conducting a segmentation task on satellite data, and thereby the range of optical parameters in which the model operates effectively.

In medical imaging, the acquisition process plays a key role in making the model more resilient in real-world applications, but it does not address the negative out-of-distribution impact on the downstream model caused by missing data in sparsely annotated datasets. In this thesis, we developed a generative framework based on diffusion models for the synthesis of histological data of lung cancer tissue. The model leverages biological insight into the macroscopic cell arrangement to guide the synthesis using new features from unlabelled data. We evaluated the efficacy and fidelity of the generated content via a comprehensive data assessment, and we explored the potential of synthetic data for training on in-house and out-of-house data.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: Murray-Smith, Professor Roderick
Date of Award: 2024
Depositing User: Theses Team
Unique ID: glathesis:2024-84415
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 27 Jun 2024 14:28
Last Modified: 27 Jun 2024 14:28
Thesis DOI: 10.5525/gla.thesis.84415
URI: https://theses.gla.ac.uk/id/eprint/84415