Yu, Tianyi (2025) SAMFusion3D: Self-adaptive multi-modality fusion for 3D object detection in autonomous driving. MSc(R) thesis, University of Glasgow.
Abstract
Autonomous vehicles rely on a diverse array of sensors to perceive their surroundings. Consequently, fusing multimodal data so as to exploit the full range of features in each sensor’s Bird’s Eye View (BEV) representation has become a major research focus. Most current work concentrates on improving the accuracy of detection models. However, because the visual perception systems of autonomous vehicles typically run on compact to medium-sized mobile platforms, computational complexity and efficiency are equally important. The environment around an autonomous vehicle can change rapidly, so keeping a static sampling rate across such varied contexts yields suboptimal computational efficiency. Furthermore, when each modality’s features are processed by Vision Transformers, the conventional pipeline incurs high computational complexity and low efficiency, particularly in the self-attention mechanism, where attention values are computed over the features. In the self-adaptive sampling mechanism, we extract depth information from camera features using point cloud data. A fusion rate then acts as a regulatory factor that dynamically adjusts the size of the effective sampling intervals, which significantly affects the computational load of feature fusion. We also adopt the structure of the iTransformer, which inverts the dimensions of the embedding. Experiments on the nuScenes dataset show that our model achieves reduced computational complexity while maintaining results comparable to those of the baseline model.
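The abstract does not give implementation details, so the following is only a toy sketch of why inverting the embedding dimensions can lower the cost of self-attention. It assumes that "inverting" means treating each feature channel as a token instead of each BEV position, so the attention matrix shrinks from positions × positions to channels × channels. The helper names (`self_attention`, `score_multiply_adds`), the identity projections, and the toy feature sizes are all hypothetical and are not taken from the thesis.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(matrix):
    """Swap the token axis and the feature axis of a 2D list."""
    return [list(col) for col in zip(*matrix)]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections), which keeps the sketch short.
    Returns (output, attention_weights).
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    weights = []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in tokens]
        weights.append(softmax(scores))
    output = [
        [sum(w * value[j] for w, value in zip(row, tokens)) for j in range(dim)]
        for row in weights
    ]
    return output, weights


def score_multiply_adds(num_tokens, token_dim):
    """Multiply-adds needed to form the num_tokens x num_tokens score matrix."""
    return num_tokens * num_tokens * token_dim


# Hypothetical toy input: 6 BEV positions, each with 2 feature channels.
features = [[0.1 * (i + j) for j in range(2)] for i in range(6)]

# Conventional layout: one token per position, giving a 6 x 6 attention matrix.
_, standard_weights = self_attention(features)

# Inverted layout: one token per channel, giving a 2 x 2 attention matrix.
_, inverted_weights = self_attention(transpose(features))

standard_cost = score_multiply_adds(6, 2)
inverted_cost = score_multiply_adds(2, 6)
```

With these toy sizes the score matrix takes 72 multiply-adds in the conventional layout and 24 in the inverted one. The saving grows with the ratio of positions to channels, and the comparison would reverse if there were more channels than positions.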
Item Type: | Thesis (MSc(R)) |
---|---|
Qualification Level: | Masters |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Colleges/Schools: | College of Science and Engineering > School of Computing Science |
Supervisor's Name: | Aragon Camarasa, Dr. Gerardo and Dai, Dr. Hang |
Date of Award: | 2025 |
Depositing User: | Theses Team |
Unique ID: | glathesis:2025-84914 |
Copyright: | Copyright of this thesis is held by the author. |
Date Deposited: | 18 Feb 2025 16:37 |
Last Modified: | 20 Feb 2025 09:57 |
Thesis DOI: | 10.5525/gla.thesis.84914 |
URI: | https://theses.gla.ac.uk/id/eprint/84914 |