Liu, Yuan (2025) Sparse intrinsic Gaussian processes in complex constrained domains with application of Bayesian optimisation. PhD thesis, University of Glasgow.
Abstract
As scientific research advances, more and more data are no longer confined to traditional Euclidean space but extend to spaces with more complex geometric structures, such as complex constrained domains and Riemannian manifolds. Riemannian manifolds are increasingly recognized as an important tool in data analysis and machine learning because of their widespread use in multiple scientific fields and in real-world contexts. For example, in environmental studies, lakes can be modeled as manifolds to better understand their geographic structure and dynamics. To model such manifolds in real-world situations, an increasing number of statistical tools have been developed for estimation over a manifold. For regression on manifolds, inspired by the success of Gaussian Processes (GPs) in Euclidean spaces, this thesis aims to provide novel tools for efficiently and accurately estimating surfaces using GPs tailored for manifolds.
Traditional GPs typically use kernels that rely on Euclidean distance to define the covariance between data points on the target surface, such as the radial basis function (RBF) kernel. They cannot be applied directly to manifolds because Euclidean distance fails to capture the underlying structure, especially in the presence of gaps and complex boundaries. The heat kernel describes heat diffusion on the manifold and therefore reflects the manifold’s geometric properties, but closed-form expressions exist only for specific manifolds. Intrinsic GPs, proposed in [114], use the transition density of Brownian motion (BM) on the manifold to approximate the heat kernel, thereby capturing the manifold’s intrinsic geometric characteristics and enabling more accurate regression on manifolds. However, Intrinsic GPs resample BM paths that cross the boundary, which causes inaccurate predictions near the boundary. Under the Neumann boundary condition, a BM path should instead be reflected when it reaches the boundary. This thesis proposes a "reflection" method to address this issue, leading to more accurate predictions at the boundary.
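The reflection idea can be illustrated with a minimal one-dimensional sketch. This is not the thesis's implementation. It assumes the domain is the interval [0, 1], uses standard BM (variance t after time t), and estimates the transition density by simple window counting. The names `reflect`, `simulate_reflected_bm`, and `transition_density` are hypothetical.

```python
import math
import random


def reflect(x, lo=0.0, hi=1.0):
    """Fold x back into [lo, hi] by mirror reflection at both walls."""
    width = hi - lo
    y = (x - lo) % (2.0 * width)  # position within one back-and-forth period
    return lo + (y if y <= width else 2.0 * width - y)


def simulate_reflected_bm(x0, t, n_steps, n_paths, rng):
    """Return the endpoints of n_paths reflected Brownian paths started at x0."""
    sd = math.sqrt(t / n_steps)  # standard deviation of one Euler step
    ends = []
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            # A step that leaves the domain is reflected, not resampled.
            x = reflect(x + rng.gauss(0.0, sd))
        ends.append(x)
    return ends


def transition_density(ends, y, half_width):
    """Estimate p_t(x0, y) as the fraction of endpoints in a window around y,
    divided by the length of the part of the window inside [0, 1]."""
    a, b = max(0.0, y - half_width), min(1.0, y + half_width)
    hits = sum(1 for e in ends if a <= e <= b)
    return hits / (len(ends) * (b - a))


ends = simulate_reflected_bm(x0=0.9, t=0.05, n_steps=20, n_paths=500,
                             rng=random.Random(0))
k_estimate = transition_density(ends, y=0.95, half_width=0.05)
```

Because every path is folded back into the domain, no probability mass is lost or redistributed at the walls, which is the behaviour the Neumann condition asks for. On a curved manifold the step and the reflection would both depend on the local geometry, which this sketch does not attempt.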
Additionally, Intrinsic GPs are constrained by the cost of simulating BM paths, especially on large-scale or highly complex manifolds, which makes them highly computationally intensive. This thesis investigates the feasibility of sparse methods in Intrinsic GPs, which use inducing points as intermediaries to pass information from training points to test points, aiming to reduce the computational cost without sacrificing inference accuracy. This thesis first proposes the Sparse Intrinsic Gaussian Process using a Deterministic Inducing Conditional approach (SI-GPDIC), which is straightforward to implement and computationally efficient; however, when only a small number of inducing points is used, it is sensitive to their locations. The Sparse Intrinsic Gaussian Process using a Deterministic Training Conditional approach (SI-GPDTC) is then proposed, which is less sensitive to the location of inducing points and achieves a balance between computational efficiency and inference precision. By approximating the true posterior distribution with a simpler, more tractable distribution that minimizes a divergence between the two, this thesis then develops the Sparse Intrinsic Gaussian Process with Variational Inference (SI-GPVI), a powerful tool for regression on complex manifolds. Graph GPs, which use the graph Matérn kernel on an undirected graph constructed from the manifold, and Traditional GPs, which directly use the Euclidean distance-based RBF kernel, are employed for comparison with the three Sparse Intrinsic GPs developed in this thesis. The performance of the proposed methods is demonstrated on three examples: the 2D U-shape, the 3D Bitten-torus, and the real-world dataset of the Aral Sea, with SI-GPVI performing particularly well.
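For orientation, the standard Euclidean forms of the DIC and DTC predictive distributions are sketched below. This assumes the thesis follows the usual sparse GP formulation, with the kernel matrices built from the BM-based heat kernel estimate; the notation here is an assumption and may differ from the thesis. The subscripts f, u, and * denote training, inducing, and test points, y is the vector of training responses, and σ² is the noise variance.

```latex
% Shared quantities (m inducing points, n training points)
Q_{ab} = K_{au} K_{uu}^{-1} K_{ub}, \qquad
\Sigma = \left( \sigma^{-2} K_{uf} K_{fu} + K_{uu} \right)^{-1}

% Predictive mean, identical for DIC and DTC
\mu_* = \sigma^{-2} K_{*u} \Sigma K_{uf} \, y

% Predictive covariance under DIC
V_*^{\mathrm{DIC}} = K_{*u} \Sigma K_{u*}

% Predictive covariance under DTC
V_*^{\mathrm{DTC}} = K_{**} - Q_{**} + K_{*u} \Sigma K_{u*}
```

Both approaches share the same mean and cost on the order of n m² rather than n³. They differ only in the covariance: the DIC covariance shrinks toward zero where K_{*u} is small, that is, far from every inducing point, whereas the DTC covariance retains the prior term K_{**} − Q_{**} there.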
Finally, motivated by the success of Bayesian optimisation (BO) in Euclidean space, this thesis proposes novel approaches to construct Intrinsic BO on manifolds, building upon previous research. The GPs proposed earlier in the thesis serve as surrogate models in the BO approach, providing the acquisition function, the probability of improvement (PI), with accurate information about the underlying manifold structure. Because the surrogate models capture the structure of manifolds, two of the proposed BO algorithms, Intrinsic BO with DTC and Intrinsic BO with VI, achieve better results than Graph BO, based on Graph GPs, and Traditional BO, based on Traditional GPs. By contrast, Intrinsic BO with DIC shows unstable performance because its predictive variance gives inaccurate uncertainty estimates at points far from the inducing points, whereas Intrinsic BO with VI demonstrates particularly strong performance, excelling in both accuracy and efficiency.
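The role of the surrogate's predictive mean and variance in PI can be seen in a minimal sketch of the textbook acquisition function. It assumes a maximisation problem and a Gaussian predictive distribution at each candidate; the function names and the exploration margin `xi` are hypothetical and not taken from the thesis.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_improvement(mu, sigma, best, xi=0.0):
    """PI for maximisation: P(f(x) > best + xi) when f(x) ~ N(mu, sigma^2)."""
    if sigma <= 0.0:
        # Degenerate case: the surrogate claims to be certain at this point.
        return 1.0 if mu > best + xi else 0.0
    return normal_cdf((mu - best - xi) / sigma)


def next_query(candidates, best, xi=0.01):
    """Pick the candidate with the highest PI.

    candidates is a list of (point, mu, sigma) triples, where mu and sigma
    are the surrogate's predictive mean and standard deviation at point.
    """
    scored = max(candidates,
                 key=lambda c: probability_of_improvement(c[1], c[2], best, xi))
    return scored[0]


chosen = next_query([("a", 0.9, 0.1), ("b", 1.2, 0.1)], best=1.0)
```

Since PI depends on the ratio of the predicted gain to the predictive standard deviation, a surrogate whose variance is too small away from the inducing points produces PI values close to 0 or 1 there, which is consistent with the instability reported for the DIC-based variant.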
Item Type: | Thesis (PhD) |
---|---|
Qualification Level: | Doctoral |
Additional Information: | Supported by funding from the China Scholarship Council and the University of Glasgow. |
Subjects: | H Social Sciences > HA Statistics; Q Science > QA Mathematics |
Colleges/Schools: | College of Science and Engineering > School of Mathematics and Statistics |
Funder's Name: | China Scholarship Council, University of Glasgow |
Supervisor's Name: | Niu, Dr. Mu and Miller, Professor Claire |
Date of Award: | 2025 |
Depositing User: | Theses Team |
Unique ID: | glathesis:2025-85125 |
Copyright: | Copyright of this thesis is held by the author. |
Date Deposited: | 19 May 2025 14:40 |
Last Modified: | 19 May 2025 14:45 |
Thesis DOI: | 10.5525/gla.thesis.85125 |
URI: | https://theses.gla.ac.uk/id/eprint/85125 |