Supervised extractive summarisation of news events

Mackie, Stuart William (2018) Supervised extractive summarisation of news events. PhD thesis, University of Glasgow.

Printed Thesis Information: https://eleanor.lib.gla.ac.uk/record=b3303979

Abstract

This thesis investigates whether the summarisation of news-worthy events can be improved by using evidence about entities (i.e. people, places, and organisations) involved in the events. More effective event summaries that better assist people with their news-based information access requirements can help to reduce information overload in today's 24-hour news culture.

Summaries are based on sentences extracted verbatim from news articles about the events. Within a supervised machine learning framework, we propose a series of entity-focused event summarisation features. Computed over multiple news articles discussing a given event, such entity-focused evidence estimates: the importance of entities within events; the significance of interactions between entities within events; and the topical relevance of entities to events.

The statement of this research work is that by augmenting supervised summarisation models, which are trained on discriminative multi-document newswire summarisation features, with entity-focused event summarisation features that capture evidence about the named entities involved in the events, we will obtain more effective summaries of news-worthy events.

The proposed entity-focused event summarisation features are thoroughly evaluated over two multi-document newswire summarisation scenarios. The first scenario is used to evaluate the retrospective event summarisation task, where the goal is to summarise an event to date, based on a static set of news articles discussing the event. The second scenario is used to evaluate the temporal event summarisation task, where the goal is to summarise the changes in an ongoing event, based on a time-stamped stream of news articles discussing the event.

The contributions of this thesis are two-fold. First, this thesis investigates the utility of entity-focused event evidence for identifying important and salient event summary sentences, and as a means to perform anti-redundancy filtering to control the volume of content emitted as a summary of an evolving event. Second, this thesis also investigates the validity of automatic summarisation evaluation metrics, the effectiveness of standard summarisation baselines, and the effective training of supervised machine learned summarisation models.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Additional Information: I acknowledge the financial support of EPSRC Doctoral Training Grant 1509226 and the financial support of EC SMART Project FP7-287583.
Keywords: Information retrieval, text summarisation.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: Ounis, Professor Iadh and Macdonald, Dr. Craig
Date of Award: 2018
Depositing User: Mr Stuart William Mackie
Unique ID: glathesis:2018-8865
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 13 Mar 2018 08:47
Last Modified: 06 Apr 2018 13:09
URI: https://theses.gla.ac.uk/id/eprint/8865
