Transferable species distribution modelling: comparative performance evaluation and interpretation of novel Generalized Functional Response models

Aldossari, Shaykhah (2023) Transferable species distribution modelling: comparative performance evaluation and interpretation of novel Generalized Functional Response models. PhD thesis, University of Glasgow.

Abstract

Predictive species distribution models (SDMs) are becoming increasingly important in ecology, in the light of rapid environmental change. The predictions of most current SDMs are specific to the habitat composition of the environments in which such models were fitted. However, species respond differently to a given habitat depending on the availability of all habitats in their environment, a phenomenon known as a functional response in resource selection. The Generalised Functional Response (GFR) framework captures this dependence by formulating the SDM coefficients as functions of habitat availability in the broader environment. The original GFR implementation used global polynomial functions of habitat availability to describe functional responses. In the present thesis, I develop several refinements of this approach and compare their explanatory and predictive performance using two simulated and three real datasets.
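The idea of writing SDM coefficients as polynomial functions of habitat availability can be sketched as a feature expansion: each local covariate is multiplied by powers of the environment-wide mean of every covariate, so the effective coefficient of a habitat changes with what is available around it. The following is a minimal illustration under stated assumptions, not the implementation used in the thesis; the function names, the polynomial order and the use of covariate means as the availability summary are all hypothetical choices made for this example.

```python
def gfr_features(x, availability, order=1):
    """Return a GFR-style feature vector for one location.

    x            : local habitat covariate values at the location
    availability : mean value of each covariate across the wider environment
    order        : polynomial degree in availability (assumed; 1 = linear)
    """
    features = [1.0] + list(x)  # intercept plus stationary main effects
    for xi in x:
        for aj in availability:
            for power in range(1, order + 1):
                # Interaction term: lets the coefficient of xi vary with aj.
                features.append(xi * aj ** power)
    return features


def linear_predictor(x, availability, gamma, order=1):
    """Dot product of hypothetical fitted weights gamma with the features."""
    feats = gfr_features(x, availability, order)
    if len(gamma) != len(feats):
        raise ValueError("gamma must have one weight per feature")
    return sum(g * f for g, f in zip(gamma, feats))


# Same local habitat, two environments with different availability.
x_local = [2.0, 3.0]
row_a = gfr_features(x_local, availability=[0.5, 0.25])
row_b = gfr_features(x_local, availability=[0.1, 0.9])
```

The two rows share their first three entries (intercept and main effects) and differ only in the interaction terms, which is what allows a single fitted model to respond differently to the same habitat in different environments.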

First, I use local radial basis functions (RBFs), a more flexible approach than global polynomials, to represent the habitat selection coefficients, together with regularization to balance bias and variance and prevent over-fitting. Second, I use the RBF-GFR and GFR models in combination with classification and regression trees (CART), which offer more flexibility and better predictive power for non-linear modelling. As further extensions, I use the random forest (RF) and extreme gradient boosting (XGBoost) ensemble approaches, which consistently lead to variance reduction in the generalization error.
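To make the RBF representation concrete, the sketch below evaluates a selection coefficient as a weighted sum of Gaussian basis functions of availability and adds an L2 (ridge) penalty on the weights. It is only an illustration: the Gaussian kernel, the squared-error loss, the single covariate, and the names `centres`, `width` and `lam` are assumptions for this example, and the thesis may use a different basis, likelihood or penalty.

```python
import math


def rbf_basis(a, centres, width):
    """Gaussian RBF activations of availability a at each centre."""
    return [math.exp(-((a - c) ** 2) / (2.0 * width ** 2)) for c in centres]


def selection_coefficient(a, weights, centres, width):
    """Coefficient as a weighted sum of local basis functions of availability."""
    return sum(w * phi for w, phi in zip(weights, rbf_basis(a, centres, width)))


def ridge_objective(weights, data, centres, width, lam):
    """Squared error plus an L2 penalty on the basis weights.

    data : list of (availability, local_covariate, response) triples
    lam  : penalty strength; larger values shrink the weights towards zero
    """
    sse = 0.0
    for a, x, y in data:
        prediction = selection_coefficient(a, weights, centres, width) * x
        sse += (y - prediction) ** 2
    return sse + lam * sum(w ** 2 for w in weights)
```

Because each basis function is only active near its centre, the coefficient can change shape locally along the availability axis, while the penalty discourages large weights that would fit noise.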

After applying the original and extended models to four different datasets, I find that the methods perform consistently across the datasets, such that their approximate ranking for out-of-data prediction is preserved. The traditional stationary approach to SDMs, which excludes the GFR model, consistently performs at the bottom of the ranking. The best methods in my list provide non-negligible improvements in predictive performance, in some cases raising the out-of-sample R² score from 0.3 to 0.7 across datasets.
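For reference, an out-of-sample R² score of this kind is commonly computed on held-out data as one minus the ratio of the residual sum of squares to the total sum of squares. The helper below is a sketch of that common definition; whether the thesis uses exactly this form is an assumption, and the function name is hypothetical.

```python
def out_of_sample_r2(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot, evaluated on data not used for fitting."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observations are constant")
    return 1.0 - ss_res / ss_tot
```

Under this definition a model that always predicts the held-out mean scores 0, a perfect model scores 1, and a model that transfers badly to a new environment can score below 0.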

At times of rapid environmental change and spatial non-stationarity, ignoring the effects of functional responses on SDMs results in two different types of prediction bias (under-prediction or mis-positioning of distribution hotspots). However, not all functional response models are created equal: the more volatile GFR models may fall foul of similar biases. My results indicate that there are robust GFR approaches that achieve transferability consistently across very different datasets.

In addition to these improvements in predictive performance resulting from the GFR, RBF-GFR and their extensions, it is also essential to know whether these models can offer insights into the mechanisms mediating species distributions. I use one of the simulated datasets to interpret the two models that provide the best predictive power for this dataset. The resulting selection coefficients from the two models are similar, which accounts for why the two models explain the observed data in similar ways. In addition, the behaviour of the availability-filtered selectivity coefficients is consistent with the known mechanisms generating the data. These findings indicate that, despite their purely statistical nature, these fundamentally different models show convergent and realistic behaviour.

To test the transferability of the improved versions of the GFR model on a large-scale, multi-species dataset, I use the challenging North American Breeding Bird Survey (BBS) dataset. I discuss how the information in the dataset affects the ability to predict each species' abundance. My recent extensions of the GFR model double the biodiversity prediction accuracy compared to the standard generalised linear model (GLM) and the original GFR model.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Subjects: H Social Sciences > HA Statistics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Supervisor's Name: Husmeier, Professor Dirk and Matthiopoulos, Professor Jason
Date of Award: 2023
Depositing User: Theses Team
Unique ID: glathesis:2023-83919
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 08 Nov 2023 08:53
Last Modified: 08 Nov 2023 08:55
Thesis DOI: 10.5525/gla.thesis.83919
URI: https://theses.gla.ac.uk/id/eprint/83919
