Spatio-temporal areal data modelling: COVID-19 applications and outlier detection for big data

Muegge, Robin (2025) Spatio-temporal areal data modelling: COVID-19 applications and outlier detection for big data. PhD thesis, University of Glasgow.

Abstract

The COVID-19 pandemic has been the greatest challenge to global public health in the 21st century. The novel virus demanded scientific progress in several fields of research, from medical innovations to the development of political strategies that aimed to contain the spread of the virus and protect the most vulnerable, often assisted by statistical analyses. The work presented in this thesis is a timely analysis of important public health aspects of COVID-19 in the UK, detecting overall trends and patterns in mortality risk after the three national lockdowns in England and identifying differences in COVID-19 vaccine attrition rates for the second and third doses by age group, sex, and council area in Scotland. The presented statistical analyses fit spatio-temporal areal data using generalised linear mixed effects models in a Bayesian hierarchical framework, where the correlated spatial random effects are assigned prior distributions from the class of conditional autoregressive (CAR) models. These models typically induce spatial smoothness in the inferred disease risk or prevalence surface, where strength in the estimation is borrowed from neighbouring observations, according to some neighbourhood structure. The spatial smoothness assumption is often attributed to Waldo R. Tobler, who said, “Everything is related to everything else, but near things are more related than distant things”. However, the presented COVID-19 analyses suggest that the spatial smoothness assumption might not always hold for all areas. Hence, this thesis proposes a novel relative density-based outlier score (RDOS) for identifying potential singleton spatial outliers that violate the spatial smoothness assumption and a novel modified spatial smoothing model to remove the potential outliers’ impact on the estimated disease prevalence surface. The following summarises the key findings from this thesis.
The study on the impact of national lockdowns on COVID-19 mortality risk in England shows that the risks increased drastically before the implementation of lockdowns 1 and 3 and decreased to pre-lockdown levels after ten and six weeks, respectively. Further, the study identifies areas with a higher peak risk during these lockdowns, detecting an urban/rural divide for lockdown 1 and an association between higher risk and the early spread of the Alpha variant during lockdown 3. The study on COVID-19 vaccine attrition rates in Scotland identifies a strong association between age and attrition rates, where the odds in favour of attrition decrease smoothly with increasing age. The odds in favour of attrition tend to be higher overall for males than females and higher in the second transition (from dose 2 to dose 3) than the first (from dose 1 to dose 2). Lastly, a simulation study shows that the novel singleton spatial outlier detection method for areal data produces much better detection results than the commonly used local Moran’s I statistic. Similarly, the modified smoothing model is shown to produce overall better prevalence estimates than a conventional smoothing model when the number of outliers is large or at least some outliers have a large magnitude, even when the identified outlier sets are sub-optimal. The proposed methods are combined in a two-stage modelling approach and applied in a motivating study on asthma prevalence at the lower super output area (LSOA) level in England, where potential singleton spatial outliers are identified, and the estimated risk surface obtained from the modified smoothing model is compared to that of a conventional smoothing model. The comparison shows that the prevalence estimates of the identified outliers and their neighbouring inliers differ noticeably between the two models, highlighting the importance of considering such potential singleton spatial outliers in the analysis of areal unit data.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Additional Information: Supported by funding from G-Research, the University of Glasgow's Graduate School, the London Mathematical Society, and the School of Mathematics and Statistics.
Subjects: H Social Sciences > HA Statistics
Q Science > QA Mathematics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics
Supervisor's Name: Dean, Dr. Nema, Jack, Dr. Eilidh and Lee, Professor Duncan
Date of Award: 2025
Depositing User: Theses Team
Unique ID: glathesis:2025-84953
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 17 Mar 2025 09:50
Last Modified: 21 Mar 2025 12:50
Thesis DOI: 10.5525/gla.thesis.84953
URI: https://theses.gla.ac.uk/id/eprint/84953