Statistical analysis of breast cancer: The effects of missing values on survival data

Thomson, Catherine S. (1999) Statistical analysis of breast cancer: The effects of missing values on survival data. MSc(R) thesis, University of Glasgow.

Full text available as: PDF (scanned version of the original print thesis, 8MB)


Worldwide, breast cancer is the most common cancer in women. In Scotland, there are currently over 3,000 women diagnosed with the disease each year and the incidence continues to rise. Despite some major advances in the treatment of breast cancer, with the discovery of tamoxifen and the on-going development of cytotoxic drugs, only 60% of women are still alive after five years, many of whom have a relapse at some later stage. Against that background, the main aims of this thesis are to interpret the findings of a survival analysis of cases of breast cancer in Scotland; to investigate whether the method of including extra categories for unknown values in factors in that analysis is appropriate; and to check whether the assumption of proportional hazards is valid. Chapter 1 provides a general introduction, whilst Chapter 2 examines the burden that breast cancer places on the National Health Service in Scotland and throughout the world. The risk factors for developing the disease and the different strategies available for treating the cancer are also presented. To identify how women with breast cancer in Scotland were managed, Chapter 3 outlines some background to a national retrospective audit of all cases of invasive breast cancer in the years 1987 and 1993. Analyses of a subgroup of the 1987 cohort constitute the majority of this thesis. Chapter 4 examines the associations among the variables included in the survival analysis. The patterns among the missing values in four of the prognostic factors are also investigated, using log-linear modelling. The method employed in analysing this cohort of women was to create extra categories to represent unknown values in each of the factors. Other techniques available for handling missing values in models are discussed, along with a summary of the methods used in other relevant studies of breast cancer survival.
Chapter 5 presents a survival analysis of the cohort, including a discussion of the findings in relation to other relevant studies. Model checking is performed on the best-fitting model to assess the adequacy of its fit and to validate the assumption of proportional hazards. The remainder of the chapter compares the results from fitting Cox models using the additional categories method and the complete cases method, to investigate whether the two approaches would lead to different interpretations. In Chapter 6, simulated datasets are generated using exponential distributions to investigate whether the proportional hazards assumption is valid when additional categories are used to extend two factors at two levels to three levels in an exponential regression model. The extent of any biases in the parameter estimates is examined. Chapter 7 provides a summary of the key conclusions and highlights areas for future research.

Item Type: Thesis (MSc(R))
Qualification Level: Masters
Additional Information: Adviser: Professor Ian Ford.
Keywords: Biostatistics.
Subjects: H Social Sciences > HA Statistics
R Medicine > R Medicine (General)
R Medicine > RC Internal medicine > RC0254 Neoplasms. Tumors. Oncology (including Cancer)
Colleges/Schools: College of Medical Veterinary and Life Sciences
Date of Award: 1999
Depositing User: Enlighten Team
Unique ID: glathesis:1999-71583
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 10 May 2019 14:13
Last Modified: 26 Oct 2022 08:33
Thesis DOI: 10.5525/gla.thesis.71583
