Query-driven learning for automating exploratory analytics in large-scale data management systems

Savva, Fotis (2021) Query-driven learning for automating exploratory analytics in large-scale data management systems. PhD thesis, University of Glasgow.

Full text available as:
PDF: 2020SavvaFotisPhD.pdf (7MB)

Abstract

As organizations collect petabytes of data, analysts spend most of their time trying to extract insights from it. Although data analytics systems have become extremely efficient and sophisticated, the data exploration phase remains a laborious task with high productivity, monetary and mental costs. This dissertation presents the Query-Driven learning methodology, under which multiple systems and frameworks are introduced to address the need for more efficient methods of analyzing large data sets. Countless queries are executed daily in large deployments and are often left unexploited, yet we believe they are of immense value. This work describes how Machine Learning can be used to expedite the data exploration process by (a) estimating the results of aggregate queries, (b) explaining data spaces through interpretable Machine Learning models, and (c) identifying data space regions that could be of interest to the data analyst. Unlike related work in the associated domains, the proposed solutions do not use any of the underlying data. As a result, they are extremely efficient, decoupled from the underlying infrastructure, and easily adapted. This dissertation is a first account of how the Query-Driven methodology can be used effectively to expedite the data exploration process, focusing solely on extracting knowledge from queries rather than from data.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Keywords: query-driven learning, machine learning, databases, exploratory analytics, data mining, approximate query processing, automatic data exploration.
Subjects: Q Science > Q Science (General); Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: Anagnostopoulos, Dr. Christos and Triantafillou, Prof. Peter
Date of Award: 2021
Depositing User: Mr Fotis Savva
Unique ID: glathesis:2021-81907
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 12 Jan 2021 17:05
Last Modified: 12 Jan 2021 17:10
Thesis DOI: 10.5525/gla.thesis.81907
URI: https://theses.gla.ac.uk/id/eprint/81907
