Xu, Xiangmin (2026) Real-time 3D scene representations for robotic systems. PhD thesis, University of Glasgow.
Abstract
Real-time 3D scene representation is a fundamental capability for robotic systems operating in dynamic and resource-constrained environments. In this thesis, “real-time” refers to perception and reconstruction pipelines operating under update latencies ranging from milliseconds to seconds, depending on the level of representation fidelity and system interaction requirements. An agent that cannot perceive the three-dimensional structure of its surroundings in a timely and accurate manner inevitably operates with an incomplete and potentially misleading world model. In teleoperation, insufficient timeliness leads to delayed and unsafe control; in autonomy, insufficient fidelity undermines planning and obstacle avoidance. Despite its central importance, existing 3D scene representation pipelines typically prioritise either responsiveness or visual accuracy, rarely treating timeliness and fidelity as jointly coupled design objectives.
This thesis addresses this gap by developing a communication-aware framework for real-time 3D scene representation that explicitly models and optimises the timeliness–fidelity tradeoff. The work begins with the design of an embodied, full-stack reconstruction system deployed on a robotic platform, integrating teleoperation, deterministic pose recovery, and fast neural scene optimisation for interactive scene updates. Building upon this system-level foundation, a multi-camera sensing architecture with distributed edge–cloud computation is introduced to study the temporal structure of networked perception under stochastic communication delays.
A mathematical model is formulated to link Age of Information (AoI) dynamics with 3D reconstruction fidelity, enabling quantitative analysis of how delayed or stale updates affect multi-view fusion and dynamic mapping. The resulting analysis characterises a frontier between freshness and representation quality, demonstrating that both periodic updates and naive AoI minimisation can be suboptimal under realistic communication conditions.
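For readers unfamiliar with AoI, the following minimal sketch illustrates the standard definition only: the age at time t is t minus the generation time of the freshest update delivered by t. It is not the thesis's model and does not link age to reconstruction fidelity. The function names, the discrete time grid, and the sample generation and delivery times are all assumptions made for illustration.

```python
def aoi_trace(updates, horizon):
    """Return the age of information at each integer time step 0..horizon-1.

    updates: list of (generated_at, delivered_at) pairs (hypothetical data).
    Before any update has been delivered, age is measured from time 0.
    """
    trace = []
    for t in range(horizon):
        # Generation times of all updates that have arrived by time t.
        delivered = [g for g, d in updates if d <= t]
        freshest = max(delivered, default=0)
        trace.append(t - freshest)
    return trace


def mean_aoi(trace):
    """Time-average age over the trace."""
    return sum(trace) / len(trace)


# Assumed example: a frame is generated every 4 steps; network delay varies.
updates = [(0, 1), (4, 7), (8, 9)]
trace = aoi_trace(updates, horizon=12)
```

In this example the age grows linearly between deliveries and drops when a fresher frame arrives, so the frame delayed by three steps (generated at 4, delivered at 7) lets the age climb to 6 before it resets.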
Beyond this analysis, a task-oriented communication framework is proposed for the edge, in which the 3D scene representation scheduler is formulated as a reinforcement learning problem. The scheduler optimises task performance by jointly considering update timeliness, 3D scene representation fidelity, and bandwidth constraints, while incorporating image semantic information, camera extrinsics, and the age of information into the scheduling decision. To better capture task relevance, a semantic-aware scheduling mechanism is introduced, enabling the system to prioritise observations based on both visual content and temporal freshness. The framework further investigates two scheduling strategies, namely ω-threshold and ω-waiting policies, revealing their different impacts on multi-view consistency and information timeliness. Experimental evaluations on multi-camera datasets and an outward-facing scene dataset demonstrate consistent improvements over periodic and AoI-driven baselines, particularly in dynamic environments where stale updates degrade mapping stability and excessive transmissions waste resources. Saliency-based analysis further provides interpretability, revealing the spatial and semantic cues influencing scheduling decisions.
Overall, this thesis establishes timeliness-aware 3D scene representation as a first-order system design problem rather than a secondary implementation concern. By integrating embodied acquisition, communication modelling, and task-driven scheduling, the work contributes a principled foundation for scalable, adaptive, and communication-aware robotic perception systems.
| Item Type: | Thesis (PhD) |
|---|---|
| Qualification Level: | Doctoral |
| Subjects: | T Technology > TK Electrical engineering. Electronics. Nuclear engineering |
| Colleges/Schools: | College of Science and Engineering > School of Computing Science |
| Supervisor's Name: | Li, Dr. Emma and Flynn, Professor David |
| Date of Award: | 2026 |
| Depositing User: | Theses Team |
| Unique ID: | glathesis:2026-86028 |
| Copyright: | Copyright of this thesis is held by the author. |
| Date Deposited: | 17 Jun 2026 15:03 |
| Last Modified: | 17 Jun 2026 15:09 |
| Thesis DOI: | 10.5525/gla.thesis.86028 |
| URI: | https://theses.gla.ac.uk/id/eprint/86028 |