Sampling designs for the spatiotemporal modelling of groundwater quality monitoring data

Radvanyi, Peter (2025) Sampling designs for the spatiotemporal modelling of groundwater quality monitoring data. PhD thesis, University of Glasgow.

Abstract

Appropriate sampling designs are crucial in the long-term monitoring of groundwater quality, as they ensure that accurate inferences can be made about the spatio-temporal distribution of the concentrations of constituents of potential concern (CoPC). Sampling groundwater monitoring wells and analysing the samples also incurs costs, safety hazards and unintended environmental consequences. An optimal sampling design should therefore minimise sample sizes whilst maximising the value of the information obtained. The problem of finding optimal locations for wells to extend or establish a network has received considerable attention in the literature. In contrast, fewer approaches have been proposed for optimal sample selection within existing networks, especially in a spatio-temporal context, and their application in practice is limited. Current sampling practices often rely on expert judgment and prescriptions by regulatory bodies, which results in datasets that are not well suited to statistical analysis. Despite its statistical advantages, such as generalisability and reduced bias, probability sampling is seldom applied in long-term groundwater quality monitoring. The primary aims of this thesis were to assess common characteristics of long-term groundwater quality data and the use of spatio-temporal models, to explore the optimisation of sampling designs through reducing network size, and to propose approaches based on probability sampling to support the spatio-temporal modelling of CoPC concentrations.

A comparative study is presented that evaluates different approaches to the spatio-temporal modelling of CoPC concentrations via generalised additive models (GAMs) and assesses common characteristics of long-term groundwater quality monitoring data. The study uses synthetic and case study data to evaluate differences in estimating spatio-temporal CoPC concentration surfaces via GAMs with separate and joint smooth terms for space and time. The results highlight the importance of model specification and sampling patterns for obtaining reliable estimates of CoPC concentrations.
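
The two model structures being compared can be written schematically as follows. This is a sketch only: the abstract does not state the exact specifications, so the symbols are assumptions, with $c_i$ the (possibly transformed) concentration observed at well coordinates $(u_i, v_i)$ and time $t_i$.

```latex
% Separate smooth terms for space and time (additive, no space-time interaction)
c_i = \beta_0 + f_{\mathrm{space}}(u_i, v_i) + f_{\mathrm{time}}(t_i) + \varepsilon_i

% Joint smooth term for space and time (the spatial surface may change over time)
c_i = \beta_0 + f(u_i, v_i, t_i) + \varepsilon_i
```

Under the separate form the spatial pattern is the same at every time point, shifted by a common temporal trend, whereas the joint form allows the shape of the surface itself to evolve.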

In practice, wells that provide redundant information with respect to the estimation of CoPC concentrations can often be omitted from sampling designs to reduce strain on resources, whilst ensuring that conclusions about spatial and temporal trends are not affected. The demand for a tool to facilitate well redundancy analysis was identified through feedback shared by users of the groundwater quality modelling software GWSDAT. In this thesis, a computationally efficient, data-driven approach is proposed for ranking monitoring wells based on their influence on spatio-temporal CoPC concentration models. The approach is based on influential observation diagnostics and is shown, through a case study and a simulation study using synthetic groundwater quality data, to provide rankings similar to those of a computationally more demanding cross-validation (CV) based method. The approach has also been implemented in GWSDAT.
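
As a rough illustration of how influence diagnostics can rank wells without refitting a model for each one, the sketch below sums Cook's distances per well. It is not the thesis's method or the GWSDAT implementation: the simple linear regression stands in for a spatio-temporal GAM, Cook's distance is one possible diagnostic chosen here as an assumption, and the function names `cooks_distances` and `rank_wells_by_influence` are hypothetical.

```python
def cooks_distances(x, y):
    """Cook's distance of each observation in a simple linear regression of y on x.

    Assumes more than two observations and a fit that is not perfect.
    """
    n = len(x)
    p = 2  # number of fitted coefficients: intercept and slope
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance estimate
    if s2 == 0:
        raise ValueError("perfect fit: Cook's distance is undefined")
    # Leverage of each observation (diagonal of the hat matrix)
    leverage = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]
    return [
        (e * e / (p * s2)) * h / (1.0 - h) ** 2
        for e, h in zip(resid, leverage)
    ]


def rank_wells_by_influence(wells, x, y):
    """Return well labels ordered from most to least influential on the fit."""
    totals = {}
    for well, d in zip(wells, cooks_distances(x, y)):
        totals[well] = totals.get(well, 0.0) + d
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical data: six observations from three wells
wells = ["A", "A", "B", "B", "C", "C"]
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 15.0]
print(rank_wells_by_influence(wells, x, y))
```

Wells at the end of the ranking change the fitted model least when removed, which makes them natural candidates for a redundancy review. The diagnostic is computed from a single fit, which is where the saving over a refit-per-well CV scheme comes from.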

Omitting redundant wells can be an effective approach for the optimisation of groundwater monitoring networks, but it does not indicate which specific samples to select. The generation of spatio-temporal sampling designs, optimised to support the estimation of CoPC concentrations, can help further improve the sustainability of long-term monitoring. In this thesis, it is shown through a literature review that probability sampling designs that aim to draw spatially and temporally balanced, i.e. evenly spread, samples result in a more precise estimation of CoPC concentration surfaces than simple random designs. Furthermore, it is proposed that by tuning the inclusion probabilities of sample units in balanced designs based on historic data to track the spatial evolution of CoPC concentrations through time, a more precise characterisation of the CoPC plume can be achieved. An original, data-driven methodology is developed for tuning sample inclusion probabilities according to predicted future distances between the wells and the boundaries of the CoPC plume, with a higher probability of selection given to wells that are predicted to be closer to the plume at a given time. Using a case study and a simulation study of synthetic CoPC concentration data, the proposed methodology is shown to provide advantages in characterising plumes given a sufficiently large sample size.
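
A minimal sketch of distance-based inclusion probabilities is given below. It assumes an inverse-distance weighting with an offset `eps`, neither of which is taken from the thesis, and the function name `inclusion_probabilities` is hypothetical. The weights are scaled so that the probabilities sum to the desired sample size, with any value above 1 capped and the remainder rescaled. Drawing the balanced sample itself is not shown.

```python
def inclusion_probabilities(distances, sample_size, eps=1.0):
    """First-order inclusion probabilities that favour wells near the plume.

    distances   : predicted distance from each well to the plume boundary
    sample_size : expected number of wells to sample (sum of the probabilities)
    eps         : offset avoiding division by zero (an assumed tuning constant)
    """
    n = len(distances)
    if not 0 < sample_size <= n:
        raise ValueError("sample_size must be between 1 and the number of wells")
    weights = [1.0 / (d + eps) for d in distances]  # closer wells weigh more
    capped = set()  # wells whose probability has been fixed at 1
    while True:
        free = [i for i in range(n) if i not in capped]
        remaining = sample_size - len(capped)
        total = sum(weights[i] for i in free)
        over = [i for i in free if remaining * weights[i] / total > 1.0 + 1e-12]
        if not over:
            probs = [1.0] * n
            for i in free:
                probs[i] = remaining * weights[i] / total
            return probs
        capped.update(over)


# Hypothetical predicted distances (in metres) for four wells, sampling two of them
for d, p in zip([0.0, 10.0, 50.0, 200.0],
                inclusion_probabilities([0.0, 10.0, 50.0, 200.0], 2)):
    print(f"distance {d:6.1f} -> inclusion probability {p:.3f}")
```

The capping step matters when one well is much closer to the plume than the others: without it, the scaled weight of that well could exceed 1, which is not a valid probability.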

An approach is also proposed for the application of spatio-temporally balanced sampling designs in evaluating the sufficiency of historic sampling intensity. By comparing the CoPC concentration estimates of models using all available historic data with those of models using increasingly smaller subsamples selected via balanced designs, it can be assessed whether the monitoring network has been over- or undersampled. This information can then be used to adjust future sampling intensity accordingly.
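
The comparison workflow can be sketched as follows, under heavy simplifying assumptions. Regular thinning of each well's time series stands in for a balanced design, per-well mean concentrations stand in for model-based estimates, and all function names are hypothetical.

```python
from statistics import mean


def thin_evenly(records, keep_every):
    """Keep every k-th sampling event per well (a crude stand-in for a balanced design).

    records : list of (well, time, concentration) tuples
    """
    by_well = {}
    for rec in sorted(records, key=lambda r: (r[0], r[1])):
        by_well.setdefault(rec[0], []).append(rec)
    return [rec for rows in by_well.values() for rec in rows[::keep_every]]


def well_means(records):
    """Mean concentration per well (a stand-in for a fitted concentration model)."""
    values = {}
    for well, _, conc in records:
        values.setdefault(well, []).append(conc)
    return {well: mean(v) for well, v in values.items()}


def discrepancy(full, subsample):
    """Root mean squared difference between full-data and subsample estimates."""
    f, s = well_means(full), well_means(subsample)
    return mean([(f[w] - s[w]) ** 2 for w in f]) ** 0.5


# Hypothetical history: well A has a rising trend, well B is stable
history = [("A", t, float(t)) for t in range(8)] + [("B", t, 5.0) for t in range(8)]
for k in (1, 2, 4):
    sub = thin_evenly(history, k)
    print(f"keep every {k}: {len(sub)} samples, discrepancy {discrepancy(history, sub):.3f}")
```

If the discrepancy stays small as the subsample shrinks, the historic sampling was likely denser than needed; a rapid increase suggests the opposite.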

Finally, consideration is given to the trade-off between high spatial and high temporal sampling intensity, given a fixed sample size and monitoring period. In practice, it can be logistically advantageous to sample less frequently but to obtain more samples during each campaign. This results in a data set with high spatial but low temporal resolution. Through a simulation study of synthetic CoPC plume data, it is shown that high spatial resolution data are advantageous when estimating the overall concentration surface, but that high temporal resolution data can provide benefits in estimating plume characteristics.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Additional Information: Supported by funding from Shell Research Ltd.
Subjects: G Geography. Anthropology. Recreation > GE Environmental Sciences
H Social Sciences > HA Statistics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Funder's Name: Shell Research Ltd.
Supervisor's Name: Miller, Professor Claire and Alexander, Dr. Craig
Date of Award: 2025
Depositing User: Theses Team
Unique ID: glathesis:2025-85157
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 05 Jun 2025 10:46
Last Modified: 05 Jun 2025 10:48
Thesis DOI: 10.5525/gla.thesis.85157
URI: https://theses.gla.ac.uk/id/eprint/85157
