Sheng, Hongyun (2024) GaitTriViT and GaitVViT: transformer-based methods emphasizing spatial or temporal aspects in Gait Recognition. MSc(R) thesis, University of Glasgow.
Abstract
In image recognition tasks, identifying subjects at long distance and low resolution remains a challenge, whereas Gait Recognition, which identifies subjects by their walking patterns, is considered one of the most promising biometric technologies due to its stability and efficiency. Previous Gait Recognition methods mostly focused on constructing sophisticated model structures to better extract spatial and temporal features from frame sequences, aiming to increase the distinctiveness between different feature representations for better model performance during evaluation. Moreover, these methods were primarily based on traditional Convolutional Neural Networks (CNNs) due to the dominance of CNNs in Computer Vision.
However, since the Vision Transformer, a variant of the Transformer architecture that was originally widely applied in Natural Language Processing (NLP), was introduced into the Computer Vision field, it has attracted strong attention for its outstanding performance in various tasks. Thus, unlike previous methods mainly based on CNNs, this project introduces two Transformer-based methods: GaitTriViT, a gait recognition method based entirely on the Vision Transformer, and GaitVViT, a method based on the Video Vision Transformer. GaitTriViT leverages the Vision Transformer to obtain more fine-grained spatial features, while GaitVViT enhances the capacity for temporal feature extraction. This work evaluates their performance on two of the most popular benchmarks. The results show remaining gaps as well as several encouraging cases where the methods outperform the current State-of-the-Art (SOTA), demonstrating the difficulties and challenges that these Transformer-based methods continue to face. Nevertheless, I still believe in the promising future of Vision Transformers in the field of Gait Recognition.
Item Type: | Thesis (MSc(R)) |
---|---|
Qualification Level: | Masters |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Colleges/Schools: | College of Science and Engineering > School of Computing Science |
Supervisor's Name: | Mahmoud, Dr. Marwa |
Date of Award: | 2024 |
Depositing User: | Theses Team |
Unique ID: | glathesis:2024-84475 |
Copyright: | Copyright of this thesis is held by the author. |
Date Deposited: | 22 Jul 2024 15:46 |
Last Modified: | 22 Jul 2024 15:46 |
Thesis DOI: | 10.5525/gla.thesis.84475 |
URI: | https://theses.gla.ac.uk/id/eprint/84475 |