Three applications of machine learning methods in corporate finance

Movaghari, Hadi (2024) Three applications of machine learning methods in corporate finance. PhD thesis, University of Glasgow.



This thesis presents three applications of machine learning methods in corporate finance. The first two (Chapters 2 and 3) apply double (or debiased) machine learning (DML) to corporate cash holdings and merger returns, respectively. The third (Chapter 4) empirically evaluates the heterogeneous impact of the cost of carry on cash holdings using the causal forest (CF) method. I also provide a comprehensive introduction to machine learning techniques and the potential benefits they can bring to data analysis in finance (Chapter 1).

The motivation for using DML is the large number of explanatory variables proposed in the relevant literature. As the number of features grows, strong non-linearities and hidden, complex inter-relationships between covariates become likely. Traditional machine learning methods that rely on the linearity assumption, such as LASSO, cannot handle these conditions. Such methods also suffer from omitted variable bias: variables that are probably relevant in predicting the dependent variable are left out because of model selection mistakes. The DML method allows non-linearities to be modelled by incorporating specialized machine learning methods such as gradient boosting. In addition, it resolves the omitted variable bias of the naïve estimator by using machine learning methods twice when estimating the nuisance functions.

The motivation for using CF is to examine possible heterogeneity at the firm level, instead of estimating the average relationship across all firms. CF is a random-forest-based method for examining heterogeneity at the level of individual units. Although such heterogeneity can be detected by conventional approaches such as subsample analysis, these approaches have two shortcomings: data snooping bias, and a tendency to hinder the development of new theories because the sample is partitioned on the basis of prior knowledge. CF addresses both challenges. In addition, as a nonparametric method, it does not require the linearity assumption, unlike conventional methods.

Chapter 2 compares the relative importance of potential drivers of the increase in cash among US industrial firms using the DML method. The results show that tangible assets and R&D spending have statistically significant and economically important effects on cash holdings. Cross-sectional analysis illustrates that debt maturity and cost of carry have lost their importance over the years, while intangible assets have become more important. The ranking of drivers is not specific to the healthcare and technology sectors, which have recorded the highest increase in cash. The results are robust to alternative machine learners (gradient boosting, LASSO, regression trees), cash proxies and estimation methods. These findings have important implications for policymakers regarding the reasons for the slow recovery from the Great Recession.

Chapter 3 investigates the informational value of the determinants of mergers and acquisitions (M&A) returns within a short window around the announcement date using the DML technique. The results support a predominant role for variables that are assumed to mitigate information asymmetry (e.g., the target’s number of analysts, and investment advisors for bidder and target) in M&A deals. They also provide strong evidence of a significant effect of the high-tech deal indicator, which is closely related to the issue of information asymmetry. The results are robust to different benchmarks (random benchmarks and commonly used ones), alternative machine learners (LASSO, gradient boosting, and regression trees), and windows of different lengths around the announcement date (CAR(−1, 1), CAR(−4, 4)). Overall, the findings affirm the prevalence of irrelevant predictors in the M&A literature, underscoring the need to develop new theories that identify potential predictors of M&A returns.

Chapter 4 examines the heterogeneity in the effect of cost of carry on cash holdings using the causal forest method. Studying the money demand function at the firm level, rather than at the average level, it provides evidence that the distribution of cost of carry effects, which was entirely negative during the 1970s and 1980s, has been moving into positive territory since the 1990s. This suggests that violations of the Baumol-Tobin model’s postulate are more relevant in modern times, with low interest rates. Firm size and net working capital are the most important features driving the heterogeneity in the cost of carry effect. In particular, firm size exhibits a hump-shaped effect on the elasticity of cash to cost of carry rather than a simple linear effect, in contrast to the existing literature. These results remain robust to alternative cash measures and are not driven by omitted variable bias. These findings suggest that policymakers should track the distributional impacts of the opportunity cost of money over time to better evaluate the evolution of monetary policy.
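
As an illustration of the "double usage" of machine learning mentioned in the abstract, the following is a minimal sketch of cross-fitted DML in a partially linear model, Y = θ·D + g(X) + ε with D = m(X) + v. It is not the thesis's implementation: the partially linear form, the simulated data, and every name in the code (`knn_predict`, `cross_fit_residuals`, `dml_plr`, `simulate`, `naive_slope`) are assumptions made for this example. To keep the sketch dependency-free, a simple k-nearest-neighbour average stands in for the learners named in the abstract (gradient boosting, LASSO, regression trees).

```python
import math
import random


def knn_predict(train_x, train_t, query, k):
    """Average the targets of the k training points nearest to `query`."""
    nearest = sorted(
        (math.dist(x, query), t) for x, t in zip(train_x, train_t)
    )[:k]
    return sum(t for _, t in nearest) / k


def cross_fit_residuals(xs, target, n_folds, k):
    """Out-of-fold residuals target - E[target | X]; each fold is predicted
    by a learner fitted on the remaining folds (cross-fitting)."""
    n = len(xs)
    residuals = [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        train_x = [xs[i] for i in train]
        train_t = [target[i] for i in train]
        for i in range(fold, n, n_folds):
            residuals[i] = target[i] - knn_predict(train_x, train_t, xs[i], k)
    return residuals


def dml_plr(xs, d, y, n_folds=5, k=20):
    """Partialling-out estimate of theta: the learner is used twice, once to
    residualize the treatment D and once to residualize the outcome Y."""
    d_res = cross_fit_residuals(xs, d, n_folds, k)
    y_res = cross_fit_residuals(xs, y, n_folds, k)
    return sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)


def naive_slope(d, y):
    """OLS slope of Y on D alone, ignoring the covariates X."""
    mean_d = sum(d) / len(d)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_d) * (b - mean_y) for a, b in zip(d, y))
    var = sum((a - mean_d) ** 2 for a in d)
    return cov / var


def simulate(n=600, theta=0.5, seed=0):
    """Hypothetical data: X confounds D and Y through non-linear functions."""
    rng = random.Random(seed)
    xs, d, y = [], [], []
    for _ in range(n):
        x = (rng.uniform(-2, 2), rng.uniform(-2, 2))
        treat = math.sin(x[0]) + rng.gauss(0, 0.5)            # D = m(X) + v
        outcome = (theta * treat + 2 * math.sin(x[0])          # Y = theta*D + g(X) + eps
                   + 0.5 * x[1] + rng.gauss(0, 0.5))
        xs.append(x)
        d.append(treat)
        y.append(outcome)
    return xs, d, y


xs, d, y = simulate()
theta_naive = naive_slope(d, y)
theta_dml = dml_plr(xs, d, y)
print(f"true theta = 0.5, naive = {theta_naive:.3f}, DML = {theta_dml:.3f}")
```

In this simulated setting the naive slope absorbs the confounding that runs through X, whereas the residual-on-residual estimate should land much closer to the true coefficient. Swapping `knn_predict` for a more flexible learner changes only the nuisance step, not the final regression.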

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Additional Information: Grant number/references: College of Social Sciences (CoSS) PhD Scholarship.
Subjects: H Social Sciences > HG Finance
Colleges/Schools: College of Social Sciences > Adam Smith Business School
Supervisor's Name: Vagenas-Nanos, Professor Evangelos and Sermpinis, Professor Georgios
Date of Award: 2024
Depositing User: Theses Team
Unique ID: glathesis:2024-84298
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 07 May 2024 08:59
Last Modified: 08 May 2024 07:31
Thesis DOI: 10.5525/gla.thesis.84298
