Murali, Prajval Kumar (2025) Interactive shared visuo-tactile perception and learning in robotics. PhD thesis, University of Glasgow.
Full text available as: PDF (67MB)
Abstract
Humans possess the ability to seamlessly integrate perceptual information from vision and tactile sensing to maintain a high-level cognitive understanding of the environment. Similarly, leveraging vision and tactile sensing can enable robots to interact with novel objects in unstructured environments. This thesis presents novel approaches for visuo-tactile perception and learning in robotics, focussing on object pose estimation, recognition, and reconstruction.
For robust pose estimation of unknown objects in dense clutter, this thesis proposes a novel recursive filtering formulation, the translation-invariant quaternion filter (TIQF), and its globally optimal variant, the stochastic TIQF (S-TIQF), operating on visuo-tactile point cloud data. Using S-TIQF, a two-robot team with vision and tactile sensing autonomously declutters the scene and retrieves the pose of the target object, which can be opaque or transparent. Moreover, S-TIQF is deployed to correct the hand-eye calibration in situ with arbitrary objects, which is necessary for shared perception. Beyond rigid objects, pose tracking of articulated objects is a challenging task that requires the integration of vision and tactile sensing. For this, a novel manifold unscented Kalman filter on the SE(3) Lie group, termed ArtReg, is presented for tracking object poses. Using ArtReg, a novel framework is designed for detecting, tracking, and manipulating unknown objects (single, multiple, or articulated) in a goal-driven manner, without assuming any prior knowledge of object shape or dynamics.
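The decoupling at the heart of TIQF can be sketched compactly: difference vectors between corresponding points cancel the unknown translation, so the rotation can be estimated on its own and the translation recovered afterwards from the centroids. The Python sketch below is a minimal, batch illustration of this idea only; it substitutes Horn's closed-form quaternion method for the thesis's recursive Kalman-filter update, and all names are illustrative rather than taken from the thesis.

```python
import numpy as np

def quat_to_rotmat(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def translation_invariant_pose(src, tgt):
    """Estimate (R, t) such that tgt ~= R @ src + t from correspondences.

    src, tgt: (N, 3) arrays of corresponding points. Pairwise difference
    vectors are invariant to the translation, so the rotation is solved
    first (here in closed form, standing in for the recursive filter).
    """
    d_src = src[1:] - src[:-1]   # translation-invariant measurements
    d_tgt = tgt[1:] - tgt[:-1]

    S = d_src.T @ d_tgt          # 3x3 cross-covariance of the differences
    N = np.array([
        [S[0,0]+S[1,1]+S[2,2], S[1,2]-S[2,1], S[2,0]-S[0,2], S[0,1]-S[1,0]],
        [S[1,2]-S[2,1], S[0,0]-S[1,1]-S[2,2], S[0,1]+S[1,0], S[2,0]+S[0,2]],
        [S[2,0]-S[0,2], S[0,1]+S[1,0], -S[0,0]+S[1,1]-S[2,2], S[1,2]+S[2,1]],
        [S[0,1]-S[1,0], S[2,0]+S[0,2], S[1,2]+S[2,1], -S[0,0]-S[1,1]+S[2,2]],
    ])
    eigvals, eigvecs = np.linalg.eigh(N)
    q = eigvecs[:, np.argmax(eigvals)]   # optimal rotation as a quaternion
    R = quat_to_rotmat(q)

    t = tgt.mean(axis=0) - R @ src.mean(axis=0)   # translation from centroids
    return R, t
```

In the recursive setting described in the thesis, each new visual or tactile correspondence would instead update the quaternion estimate incrementally, which is what makes the formulation suitable for interactive perception.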
This thesis also presents a new vision-to-tactile cross-modal learning approach for object recognition where the network is trained with dense visual point clouds and tested with sparse point clouds acquired from tactile sensors. A novel unsupervised domain adaptation loss function is proposed to minimise the gap between the visual and tactile domains. Cross-modal adaptation allows the robotic system to switch to tactile sensing in case vision sensing is compromised, thereby increasing the robustness of the system.
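The thesis's specific domain-adaptation loss is not reproduced here; as a stand-in, the sketch below shows one common unsupervised alignment term, the maximum mean discrepancy (MMD) between visual and tactile feature embeddings. The function names, tensor shapes, and kernel bandwidth are assumptions for illustration only.

```python
import torch

def gaussian_mmd(feat_vis, feat_tac, sigma=1.0):
    """MMD between visual and tactile embedding distributions.

    feat_vis: (Nv, D) features from dense visual point clouds.
    feat_tac: (Nt, D) features from sparse tactile point clouds.
    Minimising this term pulls the two feature distributions together.
    """
    def kernel(a, b):
        d2 = torch.cdist(a, b).pow(2)             # pairwise squared distances
        return torch.exp(-d2 / (2 * sigma ** 2))  # Gaussian (RBF) kernel
    return (kernel(feat_vis, feat_vis).mean()
            + kernel(feat_tac, feat_tac).mean()
            - 2 * kernel(feat_vis, feat_tac).mean())

# Illustrative objective: supervised classification on the visual domain
# plus the unsupervised alignment term on unlabelled tactile features, e.g.
#   loss = ce(logits_vis, labels_vis) + lam * gaussian_mmd(fv, ft)
```

A term of this kind lets a recogniser trained on dense visual data generalise to the much sparser tactile point clouds seen at test time.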
Object reconstruction is another fundamental perceptual challenge that enables downstream tasks such as pose estimation and recognition. This thesis introduces a novel deep learning-based 3D object reconstruction approach utilising sparse tactile point cloud data to accurately recover the geometry of category-level unknown transparent objects leveraging only synthetic data for training. Furthermore, it is also demonstrated with visuo-tactile point clouds for opaque objects, wherein the tactile data are used to refine the shape in regions of uncertainty in the visual data.
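As a rough illustration of the visuo-tactile refinement described above, the hypothetical pre-processing step below keeps only confident visual points and lets sparse tactile contacts fill in the uncertain regions before a reconstruction network is applied; the confidence input, threshold, and names are assumptions, not the thesis's pipeline.

```python
import numpy as np

def fuse_visuo_tactile(vis_pts, vis_conf, tac_pts, conf_thresh=0.5):
    """Merge visual and tactile point clouds, trusting touch where
    vision is uncertain (e.g. on transparent or reflective surfaces).

    vis_pts: (Nv, 3) visual points, vis_conf: (Nv,) per-point confidence.
    tac_pts: (Nt, 3) tactile contact points.
    """
    keep = vis_conf >= conf_thresh             # drop unreliable visual points
    return np.vstack([vis_pts[keep], tac_pts])
```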
The proposed methods have been rigorously validated through extensive experiments on standard datasets and on real robots, and have been shown to outperform state-of-the-art approaches.
| Item Type | Thesis (PhD) |
|---|---|
| Qualification Level | Doctoral |
| Additional Information | Supported by funding from the BMW Group. |
| Subjects | T Technology > T Technology (General) |
| Colleges/Schools | College of Science and Engineering > School of Engineering |
| Supervisor's Name | Porr, Dr. Bernd |
| Date of Award | 2025 |
| Depositing User | Theses Team |
| Unique ID | glathesis:2025-85001 |
| Copyright | Copyright of this thesis is held by the author. |
| Date Deposited | 25 Mar 2025 10:27 |
| Last Modified | 26 Mar 2025 10:09 |
| Thesis DOI | 10.5525/gla.thesis.85001 |
| URI | https://theses.gla.ac.uk/id/eprint/85001 |