Understanding and improving the applicability of randomised controlled trials: subgroup reporting and the statistical calibration of trials to real-world populations

Wei, Lili (2024) Understanding and improving the applicability of randomised controlled trials: subgroup reporting and the statistical calibration of trials to real-world populations. PhD thesis, University of Glasgow.


Context and objective

Randomised controlled trials (hereafter, trials) are widely regarded as the gold standard for evaluating the efficacy of medical interventions. They employ strict study designs, rigorous eligibility criteria, standardised protocols, and close participant monitoring under controlled conditions, all of which contribute to high internal validity. However, these stringent criteria and procedures may limit the generalisability of trial findings to real-world settings, where patient populations are more diverse and often include people with multimorbidity or frailty. Consequently, there is growing interest in the applicability of trials to real-world clinical practice. In this thesis I will 1) evaluate how well major trials report on variation in treatment effects and 2) examine the use of trial calibration methods to test trial applicability.


Methods

1) The first study provides a comprehensive and consistent description of subgroup reporting, to support the exploration of subgroup effects and treatment-effect heterogeneity and so inform decision-making for specific subgroups in routine practice. It evaluated 2,235 trials from ClinicalTrials.gov involving multiple chronic medical conditions, assessed whether the corresponding publications reported subgroups, and extracted the subgroup terms. These terms were then standardised and summarised using Medical Subject Headings and WHO Anatomical Therapeutic Chemical codes. Logistic and Poisson regression models were used to identify independent predictors of subgroup reporting patterns.

2) Two calibration models were implemented: a regression-based model and inverse odds of sampling weights (IOSW). They were used to apply the findings of two influential heart failure (HF) trials (COMET and DIG) to a real-world HF registry in Scotland comprising 8,012 HF patients, mainly with reduced ejection fraction, using individual participant data (IPD) from both the trials and the registry. As an exploratory analysis, calibration was also performed within subgroups of the registry population (the lowest-risk and highest-risk groups). The study compared baseline characteristics between the trials and the registry, and compared calibrated with uncalibrated results. It also assessed the impact of calibration on the results, focusing on overall effects and precision.
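The idea behind IOSW can be illustrated with a minimal sketch. It is not the thesis implementation: the sampling model here is a saturated one over a single categorical covariate (stratum counts stand in for a fitted logistic regression), and all function names, strata and numbers are hypothetical toy values. Each trial participant is weighted by the odds of belonging to the registry rather than the trial, given their stratum, so that the weighted trial resembles the registry.

```python
from collections import Counter


def iosw_weights(trial_strata, registry_strata):
    """Inverse odds of sampling weights from stratum counts (hypothetical helper).

    With a saturated sampling model, the odds of being in the registry rather
    than the trial for stratum s is n_registry(s) / n_trial(s).
    Registry strata that never occur in the trial violate positivity and
    cannot be represented by any weighting of trial participants.
    """
    n_trial = Counter(trial_strata)
    n_registry = Counter(registry_strata)
    return [n_registry[s] / n_trial[s] for s in trial_strata]


def weighted_risk_difference(treated, outcome, weights):
    """Weighted event risk in the treated arm minus that in the control arm."""
    def weighted_risk(arm):
        num = sum(w * y for t, y, w in zip(treated, outcome, weights) if t == arm)
        den = sum(w for t, w in zip(treated, weights) if t == arm)
        return num / den
    return weighted_risk(1) - weighted_risk(0)


# Toy trial (made-up data): mostly younger participants.
trial_strata = ["younger"] * 8 + ["older"] * 4
treated = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0]
outcome = [1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1]

# Toy registry (made-up data): mostly older patients.
registry_strata = ["younger"] * 4 + ["older"] * 8

weights = iosw_weights(trial_strata, registry_strata)
uncalibrated = weighted_risk_difference(treated, outcome, [1.0] * len(treated))
calibrated = weighted_risk_difference(treated, outcome, weights)
print(f"uncalibrated risk difference: {uncalibrated:.3f}")
print(f"calibrated risk difference:   {calibrated:.3f}")
```

In this toy example the older stratum has the larger treatment effect and is more common in the registry, so the calibrated estimate moves away from the uncalibrated one. In practice the sampling model would include many covariates and be fitted with logistic regression rather than stratum counts.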


Results

The subgroup reporting study showed that, among 2,235 eligible trials, 48% (1,082 trials) reported overall results and 23% (524 trials) reported subgroups. Among the 524 trials, age (51%), gender (45%), racial group (28%) and geographical location (17%) were the most frequently reported subgroups. Characteristics related to the index condition (severity, duration, type, etc.) were somewhat commonly reported. However, reporting on metrics of comorbidity, frailty or mental health was rare. Follow-up time, enrolment size, trial starting year and specific index conditions (e.g., hypercholesterolemia, hypertension, etc.) were significant predictors of any subgroup reporting after adjusting for enrolment size and index conditions, while funding source and number of arms were not associated with subgroup reporting.

The trial calibration study showed that registry patients were, on average, older, had poorer renal function and received higher doses of loop diuretics than trial participants. The key findings of the two HF trials remained consistent after calibration to the registry, with a tolerable decrease in precision (wider confidence intervals) for the effect estimates. Treatment-effect estimates were also similar when the trials were calibrated to high-risk and low-risk registry patients, albeit with a greater reduction in precision.
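One common way to see why weighting costs precision is Kish's approximate effective sample size, (sum of weights)^2 / (sum of squared weights). The sketch below is illustrative only: the thesis abstract does not say that this diagnostic was used, and the function name and weights are hypothetical.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


# Hypothetical weights for 12 trial participants: 8 down-weighted, 4 up-weighted.
example_weights = [0.5] * 8 + [2.0] * 4
ess = effective_sample_size(example_weights)
print(f"participants: {len(example_weights)}, effective sample size: {ess:.1f}")
```

The more unequal the weights, the smaller the effective sample size relative to the number of participants, which is consistent with wider confidence intervals when a trial is calibrated to a population that differs markedly from it, such as a high-risk or low-risk subgroup.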


Conclusions

Variation in subgroup reporting across trials limited the feasibility of evaluating subgroup effects and examining heterogeneity of treatment effects. If IPD, or summarised data that can substitute for IPD, are available from both the trials and the registry, trial applicability can be assessed by performing calibration.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Subjects: R Medicine > R Medicine (General)
Colleges/Schools: College of Medical Veterinary and Life Sciences > School of Health & Wellbeing > General Practice and Primary Care
Supervisor's Name: McAllister, Professor David and Lewsey, Professor Jim
Date of Award: 2024
Depositing User: Theses Team
Unique ID: glathesis:2024-84047
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 23 Jan 2024 16:19
Last Modified: 23 Jan 2024 16:19
Thesis DOI: 10.5525/gla.thesis.84047
URI: https://theses.gla.ac.uk/id/eprint/84047