A comparative analysis of machine learning methods and spatial statistical methods for areal unit Scottish property price data

MacBride, Cara Margaret (2024) A comparative analysis of machine learning methods and spatial statistical methods for areal unit Scottish property price data. MSc(R) thesis, University of Glasgow.

Abstract

Spatial areal unit data consist of a set of contiguous, non-overlapping areal units in space, one example being Data Zones (DZ) in Scotland. A special feature of these data is that they are spatially correlated: pairs of areal units that are close to each other in space tend to have more similar data values and structure than pairs that are further apart. Spatial data are generally modelled using classical spatial statistical methods that account for this correlation. One widely established spatial method is the conditional autoregressive (CAR) model, in which spatial correlation is modelled through a set of random effects. In recent years, however, the application of machine learning (ML) methods to spatial data in order to generate predictions has risen in popularity. Unlike spatial methods, machine learning methods can account for non-linear effects. This raises two important questions: (i) Are classical spatial statistical methods or a-spatial machine learning methods better for predicting spatial areal unit data? and (ii) Can machine learning methods and spatial methods be combined to improve predictive performance compared with using either in isolation? By partitioning the data into training and test sets and evaluating predictions using prediction metrics, this thesis addresses these questions in the context of property prices at the Data Zone level in Scotland. In general, I found that there was little difference between spatial methods and machine learning methods in terms of prediction, and that the combination of both had very similar predictive performance.
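
The evaluation protocol described in the abstract (hold out a test set, predict it, score the predictions) can be sketched as follows. This is a minimal illustration only, not the thesis's code: the abstract does not name the metrics used, so RMSE and MAE are assumed here as two common choices, and the price values, the 30% test fraction, the seed, and the training-mean predictor are all hypothetical placeholders standing in for a fitted CAR or ML model.

```python
import math
import random


def train_test_split(n_units, test_fraction=0.3, seed=0):
    """Randomly partition unit indices 0..n_units-1 into (train, test) lists."""
    indices = list(range(n_units))
    random.Random(seed).shuffle(indices)
    n_test = round(n_units * test_fraction)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical toy values, one per areal unit (NOT real Data Zone prices).
prices = [120.0, 135.0, 128.0, 210.0, 198.0, 205.0, 90.0, 95.0, 88.0, 150.0]

train_idx, test_idx = train_test_split(len(prices), test_fraction=0.3, seed=1)

# Placeholder predictor: the mean of the training prices. In a real
# comparison this would be replaced by each candidate model's predictions.
train_mean = sum(prices[i] for i in train_idx) / len(train_idx)
observed = [prices[i] for i in test_idx]
predicted = [train_mean] * len(test_idx)

print("test units:", test_idx)
print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
```

Scoring every candidate model on the same held-out units keeps the comparison like for like. Note that a purely random split, as assumed here, leaves test units adjacent to training units, which matters when the data are spatially correlated.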

Item Type: Thesis (MSc(R))
Qualification Level: Masters
Subjects: H Social Sciences > HA Statistics; Q Science > QA Mathematics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Supervisor's Name: Lee, Professor Duncan and Davies, Dr. Vinny
Date of Award: 2024
Depositing User: Theses Team
Unique ID: glathesis:2024-84170
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 28 Mar 2024 10:46
Last Modified: 28 Mar 2024 10:49
Thesis DOI: 10.5525/gla.thesis.84170
URI: https://theses.gla.ac.uk/id/eprint/84170
