Statistical disclosure control: an interdisciplinary approach to the problem of balancing privacy risks and data utility

Comerford, Michael (2014) Statistical disclosure control: an interdisciplinary approach to the problem of balancing privacy risks and data utility. PhD thesis, University of Glasgow.

The recent increase in the availability of data sources for research has put significant strain on existing data management workflows, especially in the field of statistical disclosure control. New statistical methods for disclosure control are frequently set out in the literature; however, few of these methods become functional implementations for data owners to utilise. Current workflows often provide inconsistent results dependent on ad hoc approaches, and bottlenecks can form around statistical disclosure control checks, preventing research from progressing. These problems contribute to a lack of trust between researchers and data owners and to the under-utilisation of data sources.

This research is an interdisciplinary exploration of the existing methods. It hypothesises that algorithms which invoke a range of statistical disclosure control methods (recoding, suppression, noise addition and synthetic data generation) in a semi-automatic way will enable data owners to release data with a higher level of data utility without any increase in disclosure risk when compared to existing methods. These semi-automatic techniques will be applied in the context of secure data-linkage in the e-Health sphere through projects such as DAMES and SHIP.

This thesis sets out a theoretical framework for statistical disclosure control and draws on qualitative data from data owners, researchers, and analysts. With these contextual frames in place, the existing literature and methods were reviewed, and a tool set for implementing k-anonymity and a range of disclosure control methods was created. This tool set is demonstrated in a standard workflow, and the thesis shows how it could be integrated into existing e-Science projects and governmental settings.
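
As an illustration of the general idea only, and not the tool set developed in the thesis, k-anonymity can be sketched in a few lines of Python. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The sketch below measures k, recodes age into bands, and then suppresses any records that remain in groups smaller than k. The field names, the sample records, the ten-year band width, and all function names are hypothetical and chosen purely for the example.

```python
from collections import Counter


def class_sizes(records, quasi_identifiers):
    """Count records sharing each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_of(records, quasi_identifiers):
    """Return the size of the smallest equivalence class (0 if no records)."""
    sizes = class_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def recode_age(record, width=10):
    """Recoding: replace an exact age with a band such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


def suppress_small_classes(records, quasi_identifiers, k):
    """Suppression: drop records whose equivalence class has fewer than k members."""
    sizes = class_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] >= k
    ]


# Hypothetical sample data; not drawn from any real source.
records = [
    {"age": 34, "postcode": "G12", "diagnosis": "asthma"},
    {"age": 37, "postcode": "G12", "diagnosis": "diabetes"},
    {"age": 41, "postcode": "G12", "diagnosis": "asthma"},
    {"age": 45, "postcode": "G12", "diagnosis": "flu"},
    {"age": 52, "postcode": "G3", "diagnosis": "asthma"},
]
quasi_identifiers = ["age", "postcode"]

recoded = [recode_age(r) for r in records]
released = suppress_small_classes(recoded, quasi_identifiers, k=2)

print(k_of(records, quasi_identifiers))   # k before any treatment
print(k_of(recoded, quasi_identifiers))   # k after recoding age
print(k_of(released, quasi_identifiers))  # k after suppression
print(len(released), "of", len(records), "records released")
```

Recoding alone leaves one record in a group of its own, so a suppression step is still needed to reach k = 2. The cost is one lost record, which is the kind of trade-off between disclosure risk and data utility that the thesis addresses.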

Compared with existing workflows within the Scottish Government and NHS Scotland, this approach allows data owners to process queries from data users in a semi-automatic way and thus provides an enhanced user experience. This utility derives from the consistency and replicability of the approach, combined with faster query processing.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Additional Information: Supported by funding from ESRC.
Subjects: H Social Sciences > H Social Sciences (General)
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: Sventek, Prof. Joseph
Date of Award: 2014
Depositing User: Mr Michael Comerford
Unique ID: glathesis:2014-7044
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 11 Feb 2016 12:35
Last Modified: 09 Mar 2016 10:46
