Semantic depth estimation with monocular camera for autonomous navigation of small unmanned aircraft

Tay, Yong Kiat (2024) Semantic depth estimation with monocular camera for autonomous navigation of small unmanned aircraft. MPhil(R) thesis, University of Glasgow.

Full text available as: PDF (4MB)


Demand for small Unmanned Aircraft (UA) applications in Global Navigation Satellite System (GNSS) denied environments has increased over the years in areas such as internal building infrastructure inspection, indoor security surveillance and stock cycle counting. One of the key challenges in the current development of autonomous UA is localization and pose estimation in the absence of GNSS signals. Various methods using onboard sensors such as Light Detection and Ranging (LiDAR) have been adopted, but at the cost of take-off weight and computing complexity. Off-board sensors such as motion trackers or Radio Frequency (RF) based beacons have also been adopted but are costly and limited to a small area of operations within the sensor's range. With the advancement of computer vision and deep neural networks, and the fact that the majority of consumer and commercial UA come equipped with high-resolution cameras, it is now more feasible than ever to exploit camera images for navigational tasks. To enhance the accuracy of traditional computer vision methods, machine learning can be adopted to model complex image variations for more accurate predictions. In this thesis, a novel approach based on Semantic Depth Prediction (SDP) was proposed for small UA to perform path planning in GNSS-denied environments using an onboard monocular camera. The objective of SDP is to perform 3D scene reconstruction with a deep convolutional neural network from 2D images captured through a single forward-looking onboard camera, thus eliminating the need for expensive and complex sensors. SDP was modeled on open-source image data sets (such as NYU2 and SUN RGB-D) together with real image data sets taken from the actual environments to improve detection accuracy, and was tested in an actual indoor warehouse to validate the performance of the proposed SDP concept.
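The 3D scene reconstruction described above ultimately rests on back-projecting per-pixel depth predictions into camera-frame 3D points via the standard pinhole camera model. The sketch below illustrates only that geometric step, not the thesis's network; the intrinsic values (fx, fy, cx, cy) are illustrative placeholders, not parameters from the thesis.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with predicted depth Z into a 3D point
    in the camera frame using the pinhole model:
        X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  Z = depth.
    """
    z = float(depth)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)

# Example with placeholder intrinsics for a 640x480 image:
# a pixel at the principal point lands on the optical axis.
pt_center = backproject(320.0, 240.0, 2.0, 525.0, 525.0, 320.0, 240.0)
pt_right = backproject(425.0, 240.0, 2.0, 525.0, 525.0, 320.0, 240.0)
```

Applying this to every pixel of a predicted depth map yields the point cloud from which obstacles and free space can be inferred for path planning.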
Our experiments have shown that combining lightweight mobile Convolutional Neural Network (CNN) models allows feature-tracking navigation tasks to be undertaken by an off-the-shelf Tello without the need for additional sensors. However, features of interest need to be kept near the center of each image frame to avoid losing them over time. Missing objects in the SDP output can be linked to partially occluded objects in the input image, as existing networks are not able to handle missing information and thus cannot detect objects under occlusion.
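Keeping a tracked feature near the frame center can be achieved with a simple proportional yaw controller on the feature's horizontal pixel offset. This is a minimal sketch of that idea, not the controller used in the thesis; the gain and dead-band values are illustrative assumptions.

```python
def yaw_command(feature_u, image_width, gain=0.5, deadband_px=10):
    """Return a normalized yaw-rate command in [-1, 1] that turns the
    aircraft toward a tracked feature at horizontal pixel position
    feature_u, so the feature drifts back toward the image center.
    A small dead-band suppresses jitter when the feature is near center.
    """
    error = feature_u - image_width / 2.0     # pixels off-center (+ = right)
    if abs(error) < deadband_px:
        return 0.0
    cmd = gain * error / (image_width / 2.0)  # proportional, normalized
    return max(-1.0, min(1.0, cmd))           # clamp to actuator range

# Feature well right of center on a 640-px-wide frame -> turn right.
cmd_right = yaw_command(480, 640)
# Feature at center -> inside the dead-band, no command.
cmd_hold = yaw_command(320, 640)
```

A command loop would feed this output to the vehicle's yaw-rate channel each frame, re-centering the feature before it leaves the field of view.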

Item Type: Thesis (MPhil(R))
Qualification Level: Masters
Subjects: T Technology > T Technology (General)
Colleges/Schools: College of Science and Engineering
Supervisor's Name: Hesse, Dr. Henrik, Sutthiphong, Dr. Srigrarom and Anderson, Dr. David
Date of Award: 2024
Depositing User: Theses Team
Unique ID: glathesis:2024-84086
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 13 Feb 2024 13:08
Last Modified: 13 Feb 2024 13:33
Thesis DOI: 10.5525/gla.thesis.84086

