Flexible regression for river systems

Rushworth, Alastair M (2014) Flexible regression for river systems. PhD thesis, University of Glasgow.

Printed Thesis Information: https://eleanor.lib.gla.ac.uk/record=b3059623


Maintaining river health is of vital importance to the human populations that depend on rivers for drinking water, and for the income generated from industry and leisure activities. The key to a clear understanding of the current state of the river environment lies in assimilating the various data that are available for a particular river catchment. Because extensive data collection programmes are expensive, measurements are often taken at only a handful of monitoring locations, leaving large portions of a river network unmonitored and making it difficult to assess the health of the river as a whole. Interpreting observations of a particular response variable depends on understanding many other variables whose underlying relationships are often highly complex and which may not be routinely measured. Cutting-edge statistical methods can play a crucial role in the interpretation of such data, particularly when faced with small sample sizes and the presence of latent processes. A core theme of this thesis is the development of models for environmental data that relax the assumption of simple linear dependencies between response and covariate, which can enable powerful descriptions of such complex systems. The thesis adopts and promotes modern flexible regression techniques based on penalised splines, which are motivated and summarised in Chapter 2; these permit regression relationships to assume a wide variety of non-linear shapes, without requiring the modeller to impose a priori structure.

This thesis aims to address two related but distinct regression problems for data collected within a river catchment. Firstly, the relationship between rainfall data collected at a rain gauge and subsequent river flow rates collected at a point downstream is tackled in Chapter 3. In this application, it is of particular interest to understand the degree, duration and time-lag of the influence of a rainfall event on a measurable increase in river flow rates at a downstream location. This relationship is complex because it is governed by attributes of the surrounding river environment that may not be readily available, such as soil composition, land use and ground strata. However, rainfall and flow data are frequently collected at a high temporal resolution, and Chapter 3 develops models that exploit this feature and are able to express complex lagged dependence structures between a sequence of flow rates and a rainfall time series. The chapter illustrates how the resulting model enables insight into the sensitivity of the river to additional rainfall, and provides a mechanism for obtaining predictions of future flow rates, without recourse to traditional computationally intensive deterministic modelling.

This thesis also tackles the problem of constructing appropriate models for the spatial structure of variables that are carried by water along the channels of the river network. This problem cannot be approached using traditional spatial modelling tools due to the presence of the different volumes of water that mix at confluence points, often causing sudden changes in the levels of the measured variable. Very little literature is available for this type of spatial problem, and none has been developed that is appropriate for the large data sets that are becoming increasingly common in many environmental settings. Chapters 4 and 5 develop new regression models that can incorporate spatial variation on a stream network that respects the presence of confluences, flow rates and direction, while including non-linear functional representations for the influence of covariates. These different model components are constructed using the same modern flexible regression framework as used in Chapter 3, and the computational benefits of adopting this approach are highlighted. Chapter 4 illustrates the utility of the new models by applying them to a large set of dissolved nitrate concentrations collected over a Scottish river network. The application reveals strong trends in both space and time, and evidence of a subtle interaction between temporal trend and the location in space; both conclusions would have been difficult to reach using other techniques.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Keywords: penalized, splines, semiparametric, regression, stream, network, rainfall, nitrate
Subjects: G Geography. Anthropology. Recreation > GE Environmental Sciences
Q Science > Q Science (General)
Q Science > QA Mathematics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Supervisor's Name: Bowman, Professor Adrian W. and Brewer, Dr. Mark J.
Date of Award: 2014
Depositing User: Mr Alastair M Rushworth
Unique ID: glathesis:2014-5267
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 26 Jun 2014 11:05
Last Modified: 26 Jun 2014 11:11
URI: http://theses.gla.ac.uk/id/eprint/5267
