Wang, Yizhu (2026) Data-driven Riemannian geometry for generative and regression models on manifolds. PhD thesis, University of Glasgow.
Full text available as: PDF (13MB)
Abstract
Many real-world datasets are high-dimensional in their ambient representation but exhibit an intrinsic low-dimensional structure induced by an unknown manifold. When such geometric structure is ignored, probabilistic models built on Euclidean assumptions may induce misleading notions of similarity, produce unreliable uncertainty estimates, and perform poorly in both generative and regression tasks. This thesis develops geometry-aware probabilistic learning methods that explicitly account for intrinsic manifold structure inferred from data.
The first part of the thesis addresses generative modelling on unknown manifolds through a geometry-aware latent diffusion framework. An Intrinsic Hybrid Latent Diffusion Model (ILDM) is introduced, in which the latent space is interpreted as a chart of an unknown manifold endowed with a probabilistic geometry induced by a pretrained decoder. Unlike standard latent diffusion models that impose Euclidean dynamics in the latent space, ILDM defines a hybrid diffusion process that adapts to geometric uncertainty. Riemannian Brownian motion is used in regions where the latent geometry is well supported by data, while Euclidean dynamics are employed in areas of high uncertainty. A corresponding score-based model is learned from simulated hybrid trajectories using an approximate denoising objective. The reverse-time process combines Riemannian and Euclidean Langevin dynamics to generate samples that better respect the underlying manifold structure. Experimental results on image and medical datasets demonstrate improved generation quality compared to conventional diffusion and latent diffusion models.
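The switching rule in the hybrid forward process can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes a two-dimensional latent chart with a conformal metric G(z) = lam(z)^2 I, for which Riemannian Brownian motion has no drift in local coordinates and reduces to dz = dB / lam(z). The names `hybrid_forward_step`, `toy_conformal_factor`, `toy_uncertainty` and the threshold value are hypothetical; a decoder-induced metric would generally be non-conformal and would add a drift term.

```python
import numpy as np

def hybrid_forward_step(z, dt, conformal_factor, uncertainty, threshold, rng):
    """One Euler-Maruyama step of a hybrid forward diffusion in a 2-D latent chart.

    Assumption: the metric is conformal, G(z) = lam(z)^2 * I, so in two
    dimensions Riemannian Brownian motion is driftless: dz = dB / lam(z).
    """
    noise = np.sqrt(dt) * rng.standard_normal(z.shape)
    lam = conformal_factor(z)                   # shape (n,)
    trusted = uncertainty(z) < threshold        # True where the geometry is well supported
    scale = np.where(trusted, 1.0 / lam, 1.0)   # Riemannian step vs Euclidean step
    return z + scale[:, None] * noise

# Hypothetical toy geometry, standing in for quantities derived from a decoder.
def toy_conformal_factor(z):
    return 1.0 + np.sum(z**2, axis=1)

def toy_uncertainty(z):
    return np.linalg.norm(z, axis=1)

rng = np.random.default_rng(0)
z = rng.standard_normal((5, 2))
z_next = hybrid_forward_step(z, 0.01, toy_conformal_factor, toy_uncertainty, 1.5, rng)
```

Points whose uncertainty falls below the threshold take a metric-scaled step, while the remaining points take a plain Euclidean step with the same noise draw.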
The second part of the thesis focuses on regression on unknown manifolds through intrinsic Gaussian process models. A probabilistic latent representation of the data manifold is learned using a probabilistic latent manifold model, yielding both a latent coordinate system and a distribution over Riemannian metrics. The learned metric, together with uncertainty information from the latent mapping, is used to characterise the geometry and boundary of the manifold. Brownian motion is simulated on the inferred manifold using the probabilistic metric, and the corresponding heat kernel, approximated via transition densities, is employed as the covariance function of an intrinsic Gaussian process. The resulting model, referred to as Gaussian Processes on Unknown Manifolds (GPUM), enables regression that respects intrinsic geometry without requiring explicit geodesic distance computations. Its performance is demonstrated on synthetic data and real-world datasets, including point clouds, WiFi signal measurements, and image data, and is compared against Euclidean and graph-based Gaussian process baselines.
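The idea of approximating a heat kernel by the transition density of simulated Brownian motion can be sketched on a manifold where the answer is known. The example below is an illustration only, not the GPUM implementation. It uses the unit circle (angle coordinate, unit metric) as a stand-in for a learned manifold, and the function names, window width and path counts are assumptions chosen for the example.

```python
import numpy as np

def simulate_bm_on_circle(theta0, t, n_paths, n_steps, rng):
    """Euler-Maruyama simulation of Brownian motion on the unit circle."""
    dt = t / n_steps
    theta = np.full(n_paths, theta0, dtype=float)
    for _ in range(n_steps):
        theta += np.sqrt(dt) * rng.standard_normal(n_paths)
    return np.mod(theta, 2.0 * np.pi)  # wrap end points back onto the circle

def heat_kernel_mc(theta0, targets, t, width=0.2, n_paths=20000, n_steps=50, seed=0):
    """Estimate the transition density p_t(theta0, target) as the fraction of
    paths ending within width/2 of each target, divided by the arc length width."""
    rng = np.random.default_rng(seed)
    ends = simulate_bm_on_circle(theta0, t, n_paths, n_steps, rng)
    targets = np.mod(np.asarray(targets, dtype=float), 2.0 * np.pi)
    diff = np.abs(ends[None, :] - targets[:, None])
    dist = np.minimum(diff, 2.0 * np.pi - diff)  # shortest angular distance
    return (dist < width / 2).mean(axis=1) / width

def heat_kernel_gram(points, t, **kwargs):
    """Symmetrised Monte Carlo Gram matrix, usable as a GP covariance estimate."""
    K = np.stack([heat_kernel_mc(p, points, t, **kwargs) for p in points])
    return 0.5 * (K + K.T)

points = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
K = heat_kernel_gram(points, t=0.5)
```

No geodesic distances between pairs of inputs are computed; only end points of simulated paths are counted. A Monte Carlo Gram matrix of this kind is noisy, so in practice a small diagonal jitter is typically added before it is inverted in a Gaussian process.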
Together, these contributions provide a probabilistic framework for learning, inference, and generation on unknown manifolds, highlighting the importance of incorporating geometry and uncertainty into probabilistic models.
| Field | Value |
|---|---|
| Item Type: | Thesis (PhD) |
| Qualification Level: | Doctoral |
| Subjects: | Q Science > QA Mathematics |
| Colleges/Schools: | College of Science and Engineering > School of Mathematics and Statistics |
| Supervisor's Name: | Niu, Dr. Mu and Yang, Dr. Xiaochen |
| Date of Award: | 2026 |
| Depositing User: | Theses Team |
| Unique ID: | glathesis:2026-86027 |
| Copyright: | Copyright of this thesis is held by the author. |
| Date Deposited: | 17 Jun 2026 14:12 |
| Last Modified: | 17 Jun 2026 14:15 |
| Thesis DOI: | 10.5525/gla.thesis.86027 |
| URI: | https://theses.gla.ac.uk/id/eprint/86027 |