Comparing crime hotspots at different areal resolutions in Strathclyde

McKay, Rebecca Miriam (2018) Comparing crime hotspots at different areal resolutions in Strathclyde. MSc(R) thesis, University of Glasgow.

Crime hotspots are used by police and government agencies to target interventions and resources at key high crime areas, so it is of interest to examine how hotspots are identified. One approach is to cluster areas and then pick out the clusters with a high crime level. The modifiable areal unit problem (MAUP) can affect which clusters are identified: if the same data are aggregated to different areal units, the results can differ. This impact was investigated using crime data provided by Strathclyde Police (now Police Scotland), covering all crimes (bar crimes of a sexual nature) over the financial year 2011. The data were clustered at two levels of aggregation, output areas and data zones, where output areas are nested within data zones. Clustering was carried out using four different methods (k-means, finite mixture modelling, Local Moran’s I and Getis Ord Gi*). Maps were produced to visualise the clusterings, and the adjusted Rand index (a measure of similarity between clusterings) was calculated for each method to compare its output area and data zone clusterings. The results showed that there was not much similarity between the clusterings produced at the two areal levels. At the output area level, k-means, finite mixture modelling and Getis Ord Gi* each placed over 90% of the output areas in the lowest crime cluster. However, Local Moran’s I placed less than 7% of output areas in the low crime cluster, which shows that cluster methods can disagree greatly. At the data zone level, there was a distinction between the methods which assume spatial contiguity and those which make no such assumption. Both k-means and finite mixture modelling produced clusterings in which most data zones lay in the low crime cluster, while Local Moran’s I and Getis Ord Gi* placed most data zones in the medium crime cluster (or non-significant cluster).
This shows that at the output area level most output areas are in the low crime cluster, but at the data zone level most data zones are in the medium crime cluster, highlighting the difference in the clusters identified at each areal unit. This illustrates the MAUP and the importance of choosing the correct areal level for the analysis.
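
As a rough illustration of the MAUP described above, the following minimal sketch aggregates toy crime counts from output areas to data zones and flags "high crime" units at each level. Everything in it is an assumption made for illustration: the counts, the unit names, the nesting table `oa_to_dz` and the simple "more than twice the mean" rule in `flag_high` are hypothetical and stand in for the clustering methods used in the thesis.

```python
from collections import defaultdict

# Hypothetical crime counts per output area (not data from the thesis).
oa_crimes = {"OA1": 1, "OA2": 2, "OA3": 30, "OA4": 1, "OA5": 2, "OA6": 3}

# Hypothetical nesting: each output area belongs to exactly one data zone.
oa_to_dz = {"OA1": "DZ1", "OA2": "DZ1", "OA3": "DZ2",
            "OA4": "DZ2", "OA5": "DZ3", "OA6": "DZ3"}


def aggregate(counts, parent):
    """Sum the counts of the small units within each parent unit."""
    totals = defaultdict(int)
    for unit, count in counts.items():
        totals[parent[unit]] += count
    return dict(totals)


def flag_high(counts, factor=2.0):
    """Flag units whose count exceeds `factor` times the mean count.

    This threshold rule is a stand-in for a real clustering method.
    """
    mean = sum(counts.values()) / len(counts)
    return {unit for unit, count in counts.items() if count > factor * mean}


dz_crimes = aggregate(oa_crimes, oa_to_dz)
high_oa = flag_high(oa_crimes)
high_dz = flag_high(dz_crimes)

# Output areas that sit inside a flagged data zone without being flagged
# themselves: the two areal levels tell different stories about them.
swept_in = {oa for oa, dz in oa_to_dz.items()
            if dz in high_dz and oa not in high_oa}

print("High crime output areas:", sorted(high_oa))
print("High crime data zones:", sorted(high_dz))
print("Output areas flagged only via their data zone:", sorted(swept_in))
```

In this toy setting a single busy output area is enough to flag its whole data zone, so a quiet neighbouring output area is treated as part of a hotspot at the coarser level only.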

Maps were again used to visualise the clustering output for both output areas and data zones at the output area level, and the adjusted Rand index was calculated. The results showed that there were similarities between the k-means and finite mixture modelling clusterings, and also between the clusterings identified by Local Moran’s I and Getis Ord Gi*. This shows the importance of choosing areal units and methods wisely, based on the analysis to be undertaken.
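
The adjusted Rand index mentioned above can be sketched in a few lines. The function below follows the standard pair-counting definition of the index. The comparison set-up is an assumption, not necessarily the procedure used in the thesis: each output area is given the cluster label of its parent data zone so that both clusterings are defined over the same set of output areas. All labels and the nesting table `oa_to_dz` are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pairs of items grouped together in both clusterings (contingency cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together within each clustering separately.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both clusterings have one cluster
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical nesting and cluster labels (not results from the thesis).
oa_to_dz = {"OA1": "DZ1", "OA2": "DZ1", "OA3": "DZ2",
            "OA4": "DZ2", "OA5": "DZ3", "OA6": "DZ3"}
oa_cluster = {"OA1": "low", "OA2": "low", "OA3": "low",
              "OA4": "high", "OA5": "low", "OA6": "low"}
dz_cluster = {"DZ1": "low", "DZ2": "medium", "DZ3": "medium"}

# Map each data zone label down to its output areas, then compare.
areas = sorted(oa_to_dz)
labels_oa = [oa_cluster[oa] for oa in areas]
labels_dz = [dz_cluster[oa_to_dz[oa]] for oa in areas]

ari = adjusted_rand_index(labels_oa, labels_dz)
print(f"Adjusted Rand index: {ari:.3f}")
```

A value of 1 means the two clusterings group the output areas identically, and values near 0 mean the agreement is about what would be expected by chance. The index depends only on the grouping, not on the label names.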

Item Type: Thesis (MSc(R))
Qualification Level: Masters
Keywords: Cluster analysis, crime hotspots, modifiable areal unit problem (MAUP), k-means, finite mixture modelling, Getis Ord Gi*, Local Moran's I.
Subjects: H Social Sciences > HA Statistics; Q Science > QA Mathematics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Supervisor's Name: Dean, Dr. Nema and Burman, Prof. Michele and Kearns, Prof. Ade
Date of Award: 2018
Depositing User: Miss Rebecca M McKay
Unique ID: glathesis:2018-41161
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 28 May 2019 13:09
Last Modified: 09 Jul 2019 08:48
