Automatic architecture selection for hierarchical mixture of experts models

Voroneckaja, Ivona (2023) Automatic architecture selection for hierarchical mixture of experts models. PhD thesis, University of Glasgow.



Hierarchical mixture of experts (HME) is a powerful tree-structured modeling technique based on the divide-and-conquer principle. HME model trees consist of two types of nodes: gate nodes, which split a large, complex problem into several smaller subproblems, and expert nodes, which solve the corresponding subproblems. Selecting the number of such nodes, as well as the order in which they are arranged, is, however, a non-trivial task. A commonly used approach involves fitting several architectures and using methods such as cross-validation to pick the best one. As well as being computationally intensive, this approach first requires one to pick the set of architectures to consider. For complex models with a large number of architectural elements, this leads to an unmanageable number of potential options. Pre-setting the model architecture also requires choosing initial parameter values, which becomes progressively more challenging as parameter dimensionality increases. The latter challenges could be addressed by growing trees during the model fitting process instead of selecting the architecture in advance. HME models thus lack a flexible and adaptive way of performing automatic architecture selection.
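
To make the roles of gate and expert nodes concrete, the following is a minimal sketch of how an HME tree could produce a prediction. It assumes binary logistic gates and linear experts, which is only one common choice and is not taken from the thesis; the names `Gate`, `Expert`, `linear` and `sigmoid` are hypothetical and introduced purely for illustration.

```python
import math


def sigmoid(z):
    """Logistic function used by the (assumed) binary gates."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(weights, bias, x):
    """Linear predictor: bias + weights . x."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


class Expert:
    """Leaf node: solves one subproblem, here with a linear model (assumption)."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict(self, x):
        return linear(self.weights, self.bias, x)


class Gate:
    """Internal node: softly splits the input between two child nodes."""

    def __init__(self, weights, bias, left, right):
        self.weights = weights
        self.bias = bias
        self.left = left
        self.right = right

    def predict(self, x):
        # Probability of routing x to the left child.
        p_left = sigmoid(linear(self.weights, self.bias, x))
        # The prediction is the gate-weighted mixture of the children.
        return p_left * self.left.predict(x) + (1.0 - p_left) * self.right.predict(x)


# A small tree: a root gate with one expert and one nested gate as children.
tree = Gate(
    [4.0], 0.0,
    Expert([1.0], 0.0),
    Gate([0.0], 0.0, Expert([0.0], 1.0), Expert([0.0], 3.0)),
)
prediction = tree.predict([0.0])
```

Choosing an architecture in this sketch means deciding how many `Gate` and `Expert` objects there are and how they are nested, which is the selection problem the abstract describes.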

The work presented in this thesis proposes automatic architecture selection methods for HME models, which allow for adding and removing tree nodes as well as adjusting the order in which they are arranged. As part of the development, three Bayesian parameter sampling strategies are proposed and systematically evaluated, resulting in a recommended strategy. An adaptation of the Reversible Jump (RJ) algorithm is then used to grow and prune HME model trees. The main drawback of RJ, its low acceptance rates, is addressed by the addition of a novel reversible jump proposal algorithm. A new Gate Swaps (GS) algorithm is then proposed to tackle the problem of changing the order in which the existing tree nodes are arranged. Both algorithms are evaluated on two real-life problems, with a particular focus on the Glasgow rental property prices data. It is shown that HME models fitted using the proposed RJ GS MCMC yield accurate predictions and provide an exceptionally high level of model interpretability, which is unusual among machine learning methods.
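
For readers unfamiliar with reversible jump moves, the sketch below shows the generic acceptance step of a trans-dimensional Metropolis-Hastings move, such as a proposal to grow or prune a tree. It follows the standard textbook form of the acceptance ratio and is not the adaptation or the novel proposal algorithm developed in the thesis; the function names and arguments are hypothetical, and the proposal terms are assumed to already include move-type probabilities and any auxiliary-variable densities.

```python
import math
import random


def rj_log_acceptance(log_post_current, log_post_proposed,
                      log_q_forward, log_q_reverse, log_abs_jacobian=0.0):
    """Log acceptance probability of a generic reversible jump move.

    log_post_*        unnormalised log posterior of each model and its parameters
    log_q_forward     log density of proposing the move (current -> proposed)
    log_q_reverse     log density of proposing the reverse move
    log_abs_jacobian  log |Jacobian| of the dimension-matching transformation
    """
    log_ratio = (
        (log_post_proposed - log_post_current)
        + (log_q_reverse - log_q_forward)
        + log_abs_jacobian
    )
    return min(0.0, log_ratio)


def accept_move(log_alpha, rng=random):
    """Accept with probability exp(log_alpha), where log_alpha <= 0."""
    return rng.random() < math.exp(log_alpha)


# Example: a proposed move that lowers the log posterior by 2 and is
# slightly harder to reverse than to propose.
log_alpha = rj_log_acceptance(-8.0, -10.0, log_q_forward=-1.0, log_q_reverse=-1.5)
accepted = accept_move(log_alpha)
```

When proposed parameters for a newly added node fit the data poorly, the posterior term dominates and `log_alpha` becomes very negative, which is the low-acceptance behaviour the abstract identifies as the main drawback of RJ.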

Item Type: Thesis (PhD)
Additional Information: Supported by funding from the School for the Maclaurin Scholarship.
Subjects: Q Science > QA Mathematics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics
Supervisor's Name: Dean, Dr. Nema and Evers, Dr. Ludger
Date of Award: 2023
Depositing User: Theses Team
Unique ID: glathesis:2023-83492
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 22 Mar 2023 12:45
Last Modified: 23 Mar 2023 09:55
Thesis DOI: 10.5525/gla.thesis.83492
