P-spline additive modeling and partial derivative estimation for environmental data

Vazanellis, George (2020) P-spline additive modeling and partial derivative estimation for environmental data. PhD thesis, University of Glasgow.


Abstract

This thesis addresses the construction of complex additive mixed models for environmental data and the use of those models to estimate partial derivatives for the purpose of detecting impacts of known events.

The methods developed are applied to a data set collected by the Scottish Environment Protection Agency to monitor the dissolved oxygen of the River Clyde. Many metrics are recorded along the river, and exploratory analysis is carried out to pinpoint possible drivers of the dissolved oxygen.

The River Clyde contains processes which are difficult to represent by conventional parametric models. P-splines offer a means of fitting a flexible model to this data set. Interactions between some explanatory covariates may also be present. Because of the sampling regime, a random effects component is appropriate. An additive mixed model with interactions allows all of these components to be included in a representative model for the River Run data. The methodology for fitting such a model is explained in this thesis, along with descriptions of four information criteria intended to aid in smoothing parameter selection. Two options for performing analysis of variance for additive models with interactions are considered: a simple F-test and a quadratic forms approach. The performance and computational expense of each are compared to those of a parametric bootstrap and of various other standard tests.

A simple additive model with no interactions is initially fitted with varying degrees of freedom for each main effect. The four information criteria scores are calculated for every main effect across all degrees of freedom. The information criterion which performs best is then used to select the optimal smoothing parameter for every main effect in an additive model and an additive mixed model, both with no interactions. Before an additive mixed model with interactions is fitted, a simulation study is conducted to see if the order of optimization of the main effect degrees of freedom is of any importance. An additive mixed model with interactions is subsequently fitted and interpreted.
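The selection step described above can be illustrated for a single smooth term. The following is a minimal sketch and not the thesis's own code: it assumes a cubic B-spline basis on equally spaced knots, a difference penalty of order 2, simulated data, and a Gaussian AIC as a stand-in criterion (the abstract does not name the four criteria considered). The function names `bspline_basis`, `pspline_fit` and `gaussian_aic` are hypothetical.

```python
import numpy as np
from scipy.interpolate import BSpline


def bspline_basis(x, lo, hi, n_seg=20, degree=3):
    """B-spline basis on equally spaced knots that extend past [lo, hi]."""
    h = (hi - lo) / n_seg
    knots = lo + h * np.arange(-degree, n_seg + degree + 1)
    n_basis = n_seg + degree
    # With identity coefficients, evaluation returns every basis function at x.
    return BSpline(knots, np.eye(n_basis), degree)(x)


def pspline_fit(x, y, lam, pen_order=2, n_seg=20, degree=3):
    """Penalized least squares fit; returns fitted values and effective df."""
    B = bspline_basis(x, x.min(), x.max(), n_seg, degree)
    # Difference penalty of the chosen order on adjacent coefficients.
    D = np.diff(np.eye(B.shape[1]), n=pen_order, axis=0)
    A = B.T @ B + lam * D.T @ D
    coef = np.linalg.solve(A, B.T @ y)
    # Effective degrees of freedom: trace of the hat matrix.
    edf = np.trace(np.linalg.solve(A, B.T @ B))
    return B @ coef, edf


def gaussian_aic(y, fitted, edf):
    """Illustrative criterion (assumption): AIC for Gaussian errors."""
    n = y.size
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(rss / n) + 2.0 * edf


# Simulated data (assumption); the River Clyde data are not reproduced here.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

lams = 10.0 ** np.arange(-4, 5)  # grid of smoothing parameters
edfs, scores = [], []
for lam in lams:
    fitted, edf = pspline_fit(x, y, lam)
    edfs.append(edf)
    scores.append(gaussian_aic(y, fitted, edf))

best_lam = lams[int(np.argmin(scores))]
print("selected lambda:", best_lam)
```

Each smoothing parameter corresponds to an effective degrees of freedom value, so searching over a grid of smoothing parameters is equivalent to searching over degrees of freedom for that term.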

One aim of this thesis is to determine whether upgrades to two wastewater treatment facilities have had positive impacts on the levels of dissolved oxygen in the river. Partial derivatives with respect to time are discussed as a means of detecting subtle changes in a system which has shown gradual increases in dissolved oxygen over the past four decades. An argument is made for the use of P-splines with penalty orders other than 2 if the main goal is derivative estimation. A simulation study is conducted and the optimal penalty order is then used to construct a derivative additive mixed model with interactions for the River Run data. This model is used to assess whether there is evidence that the wastewater facility upgrades had a positive impact.

This research produced four positive results. First, the quadratic forms method of analysis of variance for additive models with interactions was found to outperform the simple F-test and was less computationally expensive than the parametric bootstrap. Second, a preferred information criterion for smoothing parameter selection was identified, and the resulting optimal degrees of freedom were used to fit a complex additive mixed model with interactions. Third, penalty order three outperformed penalty order two in estimating partial derivatives. Fourth, a derivative model was constructed and used to provide evidence that the wastewater treatment facility upgrades had a positive impact on the dissolved oxygen.
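The idea of estimating a derivative from a P-spline fit under different penalty orders can be sketched in one dimension. This is a hedged illustration only: the simulated curve, the fixed smoothing parameter, the basis size and the function name `estimate_derivative` are all assumptions, and the sketch is not the derivative additive mixed model of the thesis. No conclusion about which penalty order is better should be drawn from this single simulated example.

```python
import numpy as np
from scipy.interpolate import BSpline


def estimate_derivative(x, y, lam, pen_order, n_seg=20, degree=3):
    """Fit a P-spline to (x, y) and return its first derivative at x."""
    lo, hi = x.min(), x.max()
    h = (hi - lo) / n_seg
    knots = lo + h * np.arange(-degree, n_seg + degree + 1)
    n_basis = n_seg + degree
    # Basis matrix: identity coefficients give all basis functions at once.
    B = BSpline(knots, np.eye(n_basis), degree)(x)
    # Difference penalty of order pen_order on the coefficients.
    D = np.diff(np.eye(n_basis), n=pen_order, axis=0)
    coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    # The derivative of a B-spline curve is again a B-spline curve.
    return BSpline(knots, coef, degree).derivative(1)(x)


# Simulated data (assumption): a smooth trend observed with noise.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 300)
true_deriv = 2 * np.pi * np.cos(2 * np.pi * x)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

for pen_order in (2, 3):
    d_hat = estimate_derivative(x, y, lam=1.0, pen_order=pen_order)
    rmse = np.sqrt(np.mean((d_hat - true_deriv) ** 2))
    print(f"penalty order {pen_order}: derivative RMSE = {rmse:.3f}")
```

In a proper comparison the smoothing parameter would be selected separately for each penalty order and the exercise repeated over many simulated data sets, as in the simulation study the abstract describes.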

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Keywords: P-splines, additive mixed model, derivative estimation, penalty order, environmental data.
Subjects: H Social Sciences > HA Statistics
Q Science > QA Mathematics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics
Supervisor's Name: Bowman, Professor Adrian
Date of Award: 2020
Depositing User: Dr George Vazanellis
Unique ID: glathesis:2020-78974
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 04 Feb 2020 17:04
Last Modified: 04 Feb 2020 17:09
URI: http://theses.gla.ac.uk/id/eprint/78974
