Enlighten Theses

Reinforcement learning at the edge of chaos

Young, Rory (2026) Reinforcement learning at the edge of chaos. PhD thesis, University of Glasgow.

Full text available as: PDF (7MB)

Abstract

Reinforcement learning offers a powerful data-driven framework for solving complex control tasks. However, despite strong empirical performance in simulated domains, reinforcement learning methods remain rarely deployed in real-world systems. We argue that a key reason for this is that the learned control policies do not account for the predictability of the closed-loop control interaction. Consequently, these policies frequently induce chaotic state dynamics, where small changes to the control system can lead to dramatic and unpredictable long-term behaviour. This unpredictability poses a critical barrier to real-world deployment, as chaotic behaviour can lead to unsafe outcomes.

To address this issue, we propose two methods for embedding predictability into the training process. First, we introduce a Lyapunov-based regularisation method that constrains the divergence of predicted trajectories by penalising the predicted Maximal Lyapunov Exponent during policy updates. This regularisation encourages the controller to generate smoother, more predictable closed-loop dynamics and introduces a tuneable trade-off between task performance and predictability. Second, we develop a fine-tuning framework that updates pre-trained policies with a stabilising intrinsic reward function. This approach improves global stability without retraining from scratch and is compatible with any existing reinforcement learning controller.
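
As a rough illustration of the quantity being penalised, the sketch below estimates a Maximal Lyapunov Exponent by tracking how quickly two nearby trajectories separate, then adds a weighted penalty to a task loss. It is not the method from the thesis: the logistic map stands in for closed-loop dynamics, and the names `logistic_step`, `estimate_mle` and `regularised_loss`, the hinge form of the penalty, and the `weight` parameter are all assumptions made for this example.

```python
import math


def logistic_step(x, r):
    """Toy stand-in for closed-loop dynamics (assumption): the logistic map."""
    return r * x * (1.0 - x)


def estimate_mle(step, x0, n_steps=2000, burn_in=200, eps=1e-8):
    """Estimate the maximal Lyapunov exponent of a 1-D map.

    Two trajectories start a distance eps apart. After every step the
    log growth of their separation is recorded and the perturbed
    trajectory is rescaled back to distance eps from the reference one.
    """
    x = x0
    for _ in range(burn_in):  # discard the initial transient
        x = step(x)
    y = x + eps
    total, count = 0.0, 0
    for _ in range(n_steps):
        x, y = step(x), step(y)
        d = abs(y - x)
        if d == 0.0:
            # Trajectories coincided numerically: re-seed and skip this sample.
            y = x + eps
            continue
        total += math.log(d / eps)
        count += 1
        y = x + eps * (y - x) / d  # renormalise the separation to eps
    return total / count if count else float("-inf")


def regularised_loss(task_loss, mle, weight):
    """Hypothetical penalty: only positive (chaotic) exponents are charged."""
    return task_loss + weight * max(0.0, mle)


if __name__ == "__main__":
    for r in (2.5, 4.0):
        mle = estimate_mle(lambda x: logistic_step(x, r), x0=0.3)
        print(f"r={r}: MLE={mle:.3f}, loss={regularised_loss(1.0, mle, 0.1):.3f}")
```

A positive exponent indicates that nearby trajectories diverge exponentially (chaotic behaviour), while a negative one indicates that they converge. In this sketch, `weight` plays the role of the tuneable trade-off between task performance and predictability.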

Together, these contributions tackle a key limitation in current reinforcement learning practice: without guaranteed predictability, even high-performing policies remain unsuitable for real-world deployment. By explicitly shaping the dynamics of closed-loop behaviour, this work advances deep reinforcement learning towards robust and reliable control in real-world systems.

Item Type:	Thesis (PhD)
Qualification Level:	Doctoral
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Colleges/Schools:	College of Science and Engineering > School of Computing Science
Supervisor's Name:	Pugeault, Dr. Nicolas and Aragon Camarasa, Dr. Gerardo
Date of Award:	2026
Depositing User:	Theses Team
Unique ID:	glathesis:2026-86020
Copyright:	Copyright of this thesis is held by the author.
Date Deposited:	17 Jun 2026 10:28
Last Modified:	17 Jun 2026 10:38
Thesis DOI:	10.5525/gla.thesis.86020
URI:	https://theses.gla.ac.uk/id/eprint/86020
Related URLs:	Enlighten Publications Record
