Detecting protest repression incidents from tweets

Elsafoury, Fatma (2019) Detecting protest repression incidents from tweets. MSc(R) thesis, University of Glasgow.


Protests are considered a threat to governments and political elites, which is why protesters are likely to face repression. To study protest repression, social scientists need protest repression datasets. Currently, they depend on news reports to build protest and political conflict datasets. Although news reports give access to historical and international events, they have limitations, such as poor coverage of small protest events and delays in reporting incidents. This research explores the use of social media posts, especially from Twitter, to build a protest repression dataset and to overcome the limitations of news reports. We use supervised machine learning models with a dataset of tweets sent during the Turkish Gezi Park protest in 2013 to detect tweets that report protest repression events. To accomplish this, we run a crowdsourcing experiment to build a training dataset of tweets labelled as protest-related or not and as violent or not. We then use this dataset to train two baseline machine learning models, Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB), with different text representation models: Bag of Words (BOW), TF-IDF and Word Embedding (WE). The empirical results show that crowdsourcing, with the right settings and quality measures, provides a fast and cheap way to hand-label datasets for training machine learning models. The results also show that the baseline models perform well on tweet classification, achieving good AUC scores (high true positive rate and low false positive rate).
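
As a rough illustration of one of the baselines named in the abstract, the sketch below pairs a Bag of Words representation with a Multinomial Naive Bayes classifier and scores it with AUC, using only the Python standard library. It is not the thesis implementation: the example tweets and labels are invented for illustration, the names `tokenize`, `MultinomialNB` and `auc` are hypothetical, and the SVM, TF-IDF and word embedding variants are left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real tweets need more care)."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes over bag-of-words counts with Laplace smoothing."""

    def fit(self, docs, labels, alpha=1.0):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in tokenize(doc)}
        self.log_prior = {}
        self.log_likelihood = {}
        for c in self.classes:
            class_docs = [d for d, y in zip(docs, labels) if y == c]
            self.log_prior[c] = math.log(len(class_docs) / len(docs))
            # Bag of words: token counts pooled over all documents of the class.
            counts = Counter(tok for d in class_docs for tok in tokenize(d))
            total = sum(counts.values()) + alpha * len(self.vocab)
            self.log_likelihood[c] = {
                tok: math.log((counts[tok] + alpha) / total) for tok in self.vocab
            }
        return self

    def score(self, doc):
        """Log-odds of class 1 versus class 0; unseen tokens are ignored."""
        tokens = [tok for tok in tokenize(doc) if tok in self.vocab]
        log_post = {
            c: self.log_prior[c] + sum(self.log_likelihood[c][t] for t in tokens)
            for c in self.classes
        }
        return log_post[1] - log_post[0]


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = reports repression, 0 = does not.
train_docs = [
    "police fired tear gas at protesters in the park",
    "riot police used water cannon on the crowd",
    "protesters beaten and detained by police",
    "lovely sunny day in the park today",
    "watching football with friends tonight",
    "new coffee shop opened near the square",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_docs = ["police used tear gas on the crowd", "coffee with friends in the park"]
test_labels = [1, 0]

model = MultinomialNB().fit(train_docs, train_labels)
test_scores = [model.score(d) for d in test_docs]
print(test_scores, auc(test_scores, test_labels))
```

AUC is computed here from pairwise rankings of positive and negative examples, which is equivalent to the area under the ROC curve and needs no decision threshold; a high value corresponds to the combination of high true positive rate and low false positive rate mentioned in the abstract.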

Item Type: Thesis (MSc(R))
Qualification Level: Masters
Keywords: Protests, Violence, Protest repression, Twitter, Machine learning, Text classification, Support vector machine (SVM), Naive Bayes (NB), Crowdsourcing, Figure-Eight.
Subjects: T Technology > T Technology (General)
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: Jensen, Dr. Bjorn Sand and Rogers, Dr. Simon and Claassen, Dr. Christopher
Date of Award: 2019
Depositing User: Ms Fatma Elsafoury
Unique ID: glathesis:2019-75160
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 01 Nov 2019 15:00
Last Modified: 05 Mar 2020 22:21
Thesis DOI: 10.5525/gla.thesis.75160
