Approximate Bayesian inference for educational attainment models

Alghamdi, Shuhrah (2022) Approximate Bayesian inference for educational attainment models. PhD thesis, University of Glasgow.



The rapidly growing volume of educational testing data from online assessments poses a challenge for researchers in modern education. Their main goal is to use this information in a timely and adaptive manner to infer skills mastery, improve learning facilities, and adapt them to individual learners. Over the past few years, several static statistical models have been proposed for extracting knowledge about skills mastery from item response data. However, realistic models typically require complex, computationally expensive fitting methods such as Markov chain Monte Carlo (MCMC). In an extensive comparison study, this thesis showed that MCMC methods are unusable for streaming data: even efficient variants such as Hamiltonian Monte Carlo (HMC) are very slow. Sequential Monte Carlo (SMC) methods, on the other hand, have been widely used to reduce the computation time of dynamic Bayesian analysis. This thesis applied SMC algorithms in two different settings to the item response theory (IRT) model and compared the output with the MCMC results. The results showed, however, that these methods were not fast enough to estimate students’ abilities in real time and provide immediate feedback, even for a small dataset. Moreover, the efficiency of SMC methods depends on user-chosen settings, which may be difficult to tune for real-time inference or for non-expert users.

MCMC and SMC methods therefore do not scale well to streaming data and large-scale real-time systems. The main objective of this thesis is to develop approximate Bayesian inference based on the Laplace approximation (LA), which allows faster inference for item response theory (IRT) models.
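To illustrate the general idea of a Laplace approximation in this setting, the following is a minimal sketch and not the thesis's implementation. It assumes a two-parameter logistic IRT model with known item parameters, a zero-mean normal prior on one student's ability, and plain Newton iterations to find the posterior mode. The function name `laplace_ability` and the arguments `a` (discriminations), `b` (difficulties), and `prior_var` are hypothetical names chosen for this example.

```python
import numpy as np


def laplace_ability(y, a, b, prior_var=1.0, n_iter=50, tol=1e-10):
    """Gaussian (Laplace) approximation to p(theta | y) for one student.

    Assumed model (illustrative only):
        P(y_j = 1 | theta) = sigmoid(a_j * (theta - b_j)),  theta ~ N(0, prior_var)
    Returns the posterior mode and the approximate posterior variance.
    """
    y, a, b = (np.asarray(v, dtype=float) for v in (y, a, b))
    theta = 0.0  # start at the prior mean
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        # First and second derivatives of the log posterior in theta.
        grad = np.sum(a * (y - p)) - theta / prior_var
        hess = -np.sum(a**2 * p * (1.0 - p)) - 1.0 / prior_var
        step = grad / hess
        theta -= step  # Newton update towards the mode
        if abs(step) < tol:
            break
    # Curvature at the mode gives the variance of the Gaussian approximation.
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    hess = -np.sum(a**2 * p * (1.0 - p)) - 1.0 / prior_var
    return theta, -1.0 / hess


# Example: five items of increasing difficulty, three answered correctly.
responses = np.array([1, 1, 0, 1, 0])
discrimination = np.ones(5)
difficulty = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
mode, variance = laplace_ability(responses, discrimination, difficulty)
print(mode, variance)
```

Because the approximation needs only a few Newton steps rather than thousands of posterior draws, it is cheap to recompute when new responses arrive, which is the kind of saving the abstract refers to. A joint model over all students and items would use a Hessian matrix in place of the scalar second derivative used here.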

The performance of the LA estimation method for logistic IRT models was compared with that of MCMC in simulation studies. Based on several comparison criteria, including bias, root mean squared error (RMSE), and Kendall’s τ, the LA performs very well in small, moderate, and relatively large sample size settings. The abilities estimated by the LA are very close to both the true values and the MCMC estimates. In addition, the LA was between 120 and 900 times faster than MCMC, making it a more practical alternative for large educational testing datasets. This thesis also investigated the high-dimensional covariance matrix that arises for massive datasets, which may slow the LA method. Two solutions, a diagonal LA and a block matrix technique, were proposed to reduce the computational cost. In addition, a novel sequential LA approach was proposed and successfully applied to allow the LA to be used for dynamic inference. The results showed that this method is comparable to the full LA method. Finally, an application to a real dataset confirmed that the proposed LA inference method provides estimates similar to those of MCMC with much faster computation.
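The comparison criteria named above can be sketched as follows. This is an illustrative example only: the function names are hypothetical, the numbers are made up for demonstration, and Kendall's τ is computed here as the simple tau-a variant without a correction for ties, which may differ from the variant used in the thesis.

```python
import numpy as np


def bias(est, true):
    """Mean signed error of the estimates."""
    est, true = np.asarray(est, dtype=float), np.asarray(true, dtype=float)
    return float(np.mean(est - true))


def rmse(est, true):
    """Root mean squared error of the estimates."""
    est, true = np.asarray(est, dtype=float), np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean((est - true) ** 2)))


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    i, j = np.triu_indices(n, k=1)  # all index pairs with i < j
    signs = np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return float(np.sum(signs) / (n * (n - 1) / 2))


# Made-up abilities for four students, for demonstration only.
true_theta = np.array([-1.0, 0.0, 1.0, 2.0])
est_theta = np.array([-0.9, 0.2, 0.8, 2.1])
print(bias(est_theta, true_theta))
print(rmse(est_theta, true_theta))
print(kendall_tau_a(est_theta, true_theta))
```

Bias and RMSE measure how far the estimated abilities are from the reference values, while Kendall's τ measures only whether students are ranked in the same order, which is often what matters when abilities are used to compare learners.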

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Subjects: Q Science > QA Mathematics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics
Supervisor's Name: Dean, Dr. Nema and Yang, Dr. Xiaochen
Date of Award: 2022
Depositing User: Theses Team
Unique ID: glathesis:2022-83314
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 20 Dec 2022 14:06
Last Modified: 27 Jun 2023 08:56
Thesis DOI: 10.5525/gla.thesis.83314
