Westwood, Sean (2025) Neural characteristics of reward and punishment learning for optimising decision-making. PhD thesis, University of Glasgow.
Full text available as: PDF (3MB)
Abstract
Avoiding aversive or punishing outcomes is integral to the survival and function of an organism, yet this dimension of learning has received little attention compared with its rewarding counterpart. A number of mechanistic theories and pathways have been implicated in each dimension, but there is as yet limited consensus on a holistic model of punishment processing. The key contributions of this thesis are threefold. First, I offer a novel insight into the role of a motivational salience signal in modulating different behaviours in rewarding and punishing contexts, and I claim that this is compatible with prominent existing mechanistic accounts of punishment learning. Second, I frame this effect with a prominent individual-difference focus, highlighting the importance of considering subject-specific sensitivities to reward and punishment when attempting to model this dichotomy. Third, I use the insights from the first two contributions to show the viability of using this paradigm as a potential means to improve task performance through a closed-loop brain-computer interface (BCI).
These contributions draw primarily on data from a reversal learning task conducted across rewarding and punishing blocks, with EEG and pupillometry as the primary measures of interest. In Chapter 2, I replicate a two-component paradigm from Philiastides et al. (2010) and Fouragnan et al. (2015), extending it with a punishment condition. I show broad similarities in the EEG signatures, with some notable insights from pupillometry into which temporal signals are promising targets for further investigation. In Chapter 3, I use these insights to test a motivational salience hypothesis, specifically targeting the earlier component of the two-component paradigm. I show that individual differences in the sensitivity of this component to context can reliably track task performance. In Chapter 4, I delve further into these individual differences, focusing more specifically on reinforcement learning parameters and psychometric personality measures to better characterise the performance effects from Chapter 3. In Chapter 5, I apply all of these findings in a tentative pseudo-BCI analysis, where I retroactively estimate performance and predict whether dynamic context switching might have led to improved behavioural outcomes.
Together, these findings offer new insights into the spatiotemporal characterisation of reward-punishment differences and propose some tentative yet exciting future directions for applying them to performance optimisation.
Item Type: | Thesis (PhD) |
---|---|
Qualification Level: | Doctoral |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Colleges/Schools: | College of Science and Engineering > School of Computing Science |
Supervisor's Name: | Vinciarelli, Professor Alessandro and Philiastides, Professor Marios |
Date of Award: | 2025 |
Depositing User: | Theses Team |
Unique ID: | glathesis:2025-84992 |
Copyright: | Copyright of this thesis is held by the author. |
Date Deposited: | 07 Apr 2025 08:47 |
Last Modified: | 07 Apr 2025 08:52 |
Thesis DOI: | 10.5525/gla.thesis.84992 |
URI: | https://theses.gla.ac.uk/id/eprint/84992 |