Elsafoury, Fatma (2019) Detecting protest repression incidents from tweets. MSc(R) thesis, University of Glasgow.
Abstract
Protests are considered a threat to governments and political elites, which is why protesters are likely to face repression. To study protest repression, social scientists need protest repression datasets. Currently, they depend on news reports to build protest and political conflict datasets. Although news reports give access to historical and international events, they have limitations, such as limited coverage of small protest events and delays in reporting incidents. This research explores the use of social media posts, especially from Twitter, to build a protest repression dataset and to overcome the limitations of news reports. We use supervised machine learning models with a dataset of tweets sent during the Turkish Gezi Park protest in 2013 to detect tweets that report protest repression events. To accomplish this, we run a crowdsourcing experiment to build a training dataset of tweets and their corresponding labels: protest-related or not, and violent or not. We then use this dataset to train two baseline machine learning models, Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB), with different text representation models: Bag of Words (BOW), TF-IDF and Word Embedding (WE). The empirical results show that crowdsourcing with the right settings and quality measures provides a fast and cheap way to hand-label datasets for training machine learning models. The results also show that the baseline machine learning models perform well on tweet classification tasks, achieving good AUC scores (a high true positive rate and a low false positive rate).
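As a rough illustration of one of the baselines named in the abstract, the sketch below pairs a Bag of Words representation with a Multinomial Naive Bayes classifier and scores it with AUC. It is not the thesis's implementation: it uses only the Python standard library, the example tweets and labels are invented for illustration, and the function names (`tokenize`, `train_mnb`, `score`, `auc`) and the smoothing parameter `alpha` are assumptions of this sketch.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_mnb(docs, labels, alpha=1.0):
    """Fit Multinomial Naive Bayes on bag-of-words counts with add-alpha smoothing."""
    vocab = {tok for doc in docs for tok in tokenize(doc)}
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    for doc, y in zip(docs, labels):
        counts[y].update(tokenize(doc))
    log_prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    log_like = {}
    for c in classes:
        total = sum(counts[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((counts[c][t] + alpha) / total) for t in vocab}
    return log_prior, log_like


def score(model, doc):
    """Log-odds of class 1 versus class 0; tokens outside the vocabulary are ignored."""
    log_prior, log_like = model
    s = log_prior[1] - log_prior[0]
    for tok in tokenize(doc):
        if tok in log_like[1]:
            s += log_like[1][tok] - log_like[0][tok]
    return s


def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = reports repression, 0 = does not.
train_docs = [
    "police fired tear gas at protesters in the park",
    "riot police used water cannon on the crowd",
    "protesters beaten and arrested by police",
    "lovely sunny day in the park today",
    "watching the football match with friends",
    "new coffee shop opened near the square",
]
train_labels = [1, 1, 1, 0, 0, 0]

test_docs = ["police used tear gas on the crowd", "coffee with friends in the square"]
test_labels = [1, 0]

model = train_mnb(train_docs, train_labels)
test_scores = [score(model, d) for d in test_docs]
print("AUC on toy test set:", auc(test_scores, test_labels))
```

The pairwise form of AUC used here is equivalent to the area under the ROC curve: a score near 1 means positives are consistently ranked above negatives, which corresponds to a high true positive rate at a low false positive rate across thresholds.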
Item Type: | Thesis (MSc(R)) |
---|---|
Qualification Level: | Masters |
Keywords: | Protests, Violence, Protest repression, Twitter, Machine learning, Text classification, Support vector machine (SVM), Naive Bayes (NB), Crowdsourcing, Figure-Eight. |
Subjects: | T Technology > T Technology (General) |
Colleges/Schools: | College of Science and Engineering > School of Computing Science |
Supervisor's Name: | Jensen, Dr. Bjorn Sand, Rogers, Dr. Simon and Claassen, Dr. Christopher |
Date of Award: | 2019 |
Depositing User: | Ms Fatma Elsafoury |
Unique ID: | glathesis:2019-75160 |
Copyright: | Copyright of this thesis is held by the author. |
Date Deposited: | 01 Nov 2019 15:00 |
Last Modified: | 05 Mar 2020 22:21 |
Thesis DOI: | 10.5525/gla.thesis.75160 |
URI: | https://theses.gla.ac.uk/id/eprint/75160 |