An investigation of equine injuries in Thoroughbred flat racing in North America

Georgopoulos, Stamatis Panagiotis (2017) An investigation of equine injuries in Thoroughbred flat racing in North America. PhD thesis, University of Glasgow.


The aim of this research work was to investigate and quantify the risk of fatal and fracture injury for Thoroughbreds participating in flat racing in the US and Canada so that horses at particular risk can be identified and the risk of fatal injury reduced. Risk factors associated with fatalities and fractures were identified and predictive models for both fatalities and fractures were developed and their performance was evaluated. Our analysis was based on 188,269 Thoroughbreds that raced on 89 racecourses reporting injuries to the Equine Injury Database (EID) in the US and Canada from 1st January 2009 to 31st December 2015. This included 2,493,957 race starts and 4,592,162 exercise starts. The race starts reported to the EID represented the starts for 90.0% of all official Thoroughbred racing events in the United States and Canada during the 7-year observation period.
The annual average risk of fatal and fracture equine injuries for the period 2009 - 2015 was estimated and a description of the different injury types that resulted in fatalities and fractures was given, based on the cases recorded in the EID.
Possible risk factors were pre-screened using univariable logistic regression models; risk factors with an association indicated by p < 0.20 were then entered into a stepwise logistic regression selection process. The stepwise selection used a forward bidirectional elimination approach based on Akaike's Information Criterion. We identified more than 20 risk factors significantly associated with fatal injury (p < 0.05) and more than 20 risk factors associated with fracture injury across the final multivariable models. The risk factors identified relate to the horse's previous racing history, the trainer, the race and the horse's expected performance.
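As a rough illustration of the pre-screening step only, the sketch below fits a one-predictor logistic regression by Newton-Raphson and keeps factors whose p-value falls below 0.20. It is not the thesis code: the function names, the toy data and the factor names `factor_a` and `factor_b` are hypothetical, and the use of a likelihood-ratio test (rather than, say, a Wald test) for the p-value is an assumption.

```python
import math

def fit_univariable_logit(x, y, max_iter=50, tol=1e-10):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1, log_likelihood). Assumes x is not constant and
    the outcome is not perfectly separated by x.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        # Gradient (g) and information matrix (h) of the log-likelihood.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Newton step: solve the 2x2 system h * delta = g.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    log_lik = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        log_lik += math.log(p) if yi else math.log(1.0 - p)
    return b0, b1, log_lik

def null_log_likelihood(y):
    """Log-likelihood of the intercept-only model."""
    n, events = len(y), sum(y)
    p = events / n
    return events * math.log(p) + (n - events) * math.log(1.0 - p)

def lr_test_p_value(x, y):
    """Likelihood-ratio test p-value (chi-square, 1 degree of freedom)."""
    _, _, log_lik = fit_univariable_logit(x, y)
    lr = max(0.0, 2.0 * (log_lik - null_log_likelihood(y)))
    # For 1 degree of freedom, P(chi2 > lr) = erfc(sqrt(lr / 2)).
    return math.erfc(math.sqrt(lr / 2.0))

def prescreen(factors, y, threshold=0.20):
    """Keep the names of factors whose univariable p-value is below threshold."""
    return [name for name, x in factors.items()
            if lr_test_p_value(x, y) < threshold]

# Hypothetical toy data: 200 starts, 40 of them with the outcome.
outcome = [1] * 40 + [0] * 160
toy_factors = {
    # factor_a: outcome in 30/100 exposed starts vs 10/100 unexposed starts.
    "factor_a": [1] * 30 + [0] * 10 + [1] * 70 + [0] * 90,
    # factor_b: outcome in 20/100 starts in both groups (no association).
    "factor_b": [1] * 20 + [0] * 20 + [1] * 80 + [0] * 80,
}
print(prescreen(toy_factors, outcome))
```

Factors that pass a screen like this would then go forward to the AIC-based stepwise selection, which is not shown here.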
Five different algorithms were used to develop predictive models for both fatal and fracture injuries, based on the data available from the period 2009 - 2014. Firstly, we used Multivariable Logistic Regression, which is commonly used in risk factor analysis. Secondly, we developed Improved Balanced Random Forests, a machine learning algorithm based on a modification of the random forests algorithm. Because fatal injuries are extremely rare events, with fewer than 2 instances per 1000 starts on average, balanced samples were used to develop the Random Forest model to deal with the class-imbalance problem. Furthermore, we trained an Artificial Neural Network with a single layer and two networks with deep architecture, a Deep Belief Network and a Stacked Denoising Autoencoder. As artificial neural networks and deep learning models have been used successfully to solve complex problems in a diverse range of domains, we wanted to explore whether they could also predict equine injuries. The performance of each classifier was evaluated by calculating the Area Under the Receiver Operating Characteristic Curve (AUC), using the data available from 2015 for validation. AUC results ranged from 0.62 to 0.64 for the best performing algorithm and similar predictive results were obtained from the wide array of different models created.
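Two of the ideas in this paragraph can be sketched in a few lines: drawing a class-balanced bootstrap sample, and computing the AUC as the probability that a randomly chosen injured start is scored higher than a randomly chosen uninjured start. This is a minimal sketch under stated assumptions, not the Improved Balanced Random Forest used in the thesis: the function names are hypothetical, the sampling shown is the generic balanced-bootstrap idea (equal draws from each class, with replacement), and the toy labels use 2 events per 1000 starts purely for illustration.

```python
import random

def balanced_bootstrap(labels, rng):
    """Return indices of a class-balanced bootstrap sample.

    Draws, with replacement, as many starts from each class as there
    are starts in the smaller (minority) class.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    return ([rng.choice(pos) for _ in range(n)]
            + [rng.choice(neg) for _ in range(n)])

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    Quadratic in the sample size, so only suitable for small examples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Illustrative labels only: 2 events in 1000 starts.
toy_labels = [1] * 2 + [0] * 998
sample = balanced_bootstrap(toy_labels, random.Random(0))
print(len(sample), sum(toy_labels[i] for i in sample))

# Three of the four positive/negative pairs are ranked correctly.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # 0.75
```

On this scale, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the 0.62 to 0.64 range reported above.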
This is the first study to make use of the extensive information contained in the EID to identify risk factors associated with equine fatal and fracture injuries in the US and Canada for this period. To our knowledge, this is the largest retrospective observational study in the literature investigating the risk of equine fatal and fracture injuries during flat racing. This is also the first study to train logistic regression and machine learning models to predict equine injuries using such an extensive amount of data and a full year of horse racing events for prediction and evaluation.
We believe the results could help identify horses at high risk of (fatal) injury on entering a race and inform the design and implementation of preventive measures aimed at minimising the number of Thoroughbreds sustaining fatal injuries during racing in North America.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Keywords: Risk factor analysis, logistic regression, horse racing, equine epidemiology.
Subjects: S Agriculture > SF Animal culture > SF600 Veterinary Medicine
Colleges/Schools: College of Medical Veterinary and Life Sciences
Supervisor's Name: Parkin, Dr. T. D. H.
Date of Award: 2017
Depositing User: Mr Stamatis P. Georgopoulos
Unique ID: glathesis:2017-8326
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 09 Aug 2017 11:04
Last Modified: 18 Aug 2017 08:01
