Li, Shibo (2024) Artificial Intelligence-enabled video inpainting using spatio-temporal correlation. PhD thesis, University of Glasgow.
Abstract
This thesis makes significant contributions to the field of AI-enabled video inpainting by addressing three fundamental challenges: visual and contextual consistency, the handling of complex scenes with multiple layers, and computational efficiency. The proposed approaches (short-long-term propagation, depth-guided modeling, and a hierarchical sparse Transformer) advance the state of the art in video inpainting, achieving superior results across diverse datasets and tasks, including object removal and video restoration. Beyond addressing these technical challenges, the contributions establish a foundation for scalable and robust video inpainting methods that can be adapted for real-world applications in media production, healthcare, scientific research, and beyond.
Looking ahead, video editing tasks, including inpainting, are poised to witness transformative developments over the next 5–10 years. As AI techniques continue to evolve, future research is expected to focus on enhancing user interactivity and customization in video editing. By integrating Natural Language Processing (NLP) and generative models, video inpainting systems could allow users to specify their desired edits through textual prompts, enabling intuitive and precise control over the content generation process. Additionally, advances in multimodal learning may lead to systems capable of simultaneously processing video, audio, and textual data, providing comprehensive tools for complex multimedia editing tasks. However, several challenges must be addressed before video inpainting technologies can be widely adopted in real-world applications, notably real-time processing and the reliability and ethical concerns of AI-generated content in sensitive domains. By addressing real-time performance, ethical considerations, and user-centric design, future research can further enhance the impact of video inpainting and related video editing tasks in media, healthcare, education, and beyond in the coming decade.
Item Type: | Thesis (PhD) |
---|---|
Qualification Level: | Doctoral |
Subjects: | T Technology > T Technology (General) |
Colleges/Schools: | College of Science and Engineering > School of Engineering |
Supervisor's Name: | Cooper, Professor Jonathan, Imran, Professor Muhammad and Abbasi, Professor Qammer |
Date of Award: | 2024 |
Depositing User: | Theses Team |
Unique ID: | glathesis:2024-84828 |
Copyright: | Copyright of this thesis is held by the author. |
Date Deposited: | 21 Jan 2025 16:35 |
Last Modified: | 21 Jan 2025 16:35 |
URI: | https://theses.gla.ac.uk/id/eprint/84828 |