Salamin, Hugues Eric (2013) Automatic role recognition. PhD thesis, University of Glasgow.
Abstract
The computing community is making significant efforts towards the development of automatic approaches for the analysis of social interactions. The way people interact depends on the context, but there is one aspect that all social interactions seem to have in common: humans behave according to roles. Therefore, recognizing the roles of participants is an essential step towards understanding social interactions and constructing socially aware computers.
This thesis addresses the problem of automatically recognizing the roles of participants in multi-party recordings. The objective is to assign a role to each participant. All the proposed approaches use a similar strategy. They start by segmenting the audio into turns, which are used as the basic units of analysis. The next step is to extract features accounting for the organization of turns. The more sophisticated approaches extend these features with prosodic or semantic features. Finally, the mapping of people or turns to roles is done using statistical models. The goal of this thesis is to gain a better understanding of role recognition by investigating three aspects that can influence the performance of the system:
We investigate the impact of modelling the dependency between the roles.
We investigate the contribution of different modalities to the effectiveness of the role recognition approach.
We investigate the effectiveness of the approach for different scenarios.
Three models are proposed and tested on three different corpora totalling more than 90 hours of audio. The first contribution of this thesis is to investigate the combination of turn-taking features and semantic information for role recognition, improving the accuracy of role recognition from a baseline of 46.4% to 67.9% on the AMI meeting corpus. The second contribution is to use features extracted from the prosody to assign roles. The performance of this model is 89.7% on broadcast news and 87.0% on talk-shows. Finally, the third contribution is the development of a model robust to changes in the social setting. This model achieved an accuracy of 86.7% on a database composed of a mixture of broadcast news and talk-shows.
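The turn-based strategy outlined in the abstract (segment the audio into turns, then derive features from how the turns are organized) can be illustrated with a minimal sketch. It is not the feature set used in the thesis: the `Turn` structure, the `turn_taking_features` function, and the four statistics are hypothetical, chosen only to show what per-speaker turn-organization features might look like once the turns are available.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Turn(NamedTuple):
    """One speaker turn (hypothetical structure, not taken from the thesis)."""
    speaker: str   # speaker label, assumed to come from a prior segmentation step
    start: float   # turn start time in seconds
    end: float     # turn end time in seconds


def turn_taking_features(turns: List[Turn]) -> Dict[str, Dict[str, float]]:
    """Compute simple per-speaker turn statistics (illustrative only)."""
    total_time = sum(t.end - t.start for t in turns)
    counts: Dict[str, int] = defaultdict(int)
    durations: Dict[str, float] = defaultdict(float)

    # Accumulate the number of turns and the speaking time of each speaker.
    for t in turns:
        counts[t.speaker] += 1
        durations[t.speaker] += t.end - t.start

    features: Dict[str, Dict[str, float]] = {}
    for speaker in counts:
        features[speaker] = {
            "num_turns": float(counts[speaker]),
            "speaking_time": durations[speaker],
            # Share of the total speaking time held by this speaker.
            "time_fraction": durations[speaker] / total_time if total_time > 0 else 0.0,
            "mean_turn_length": durations[speaker] / counts[speaker],
        }
    return features


# Toy recording with two speakers and three turns.
turns = [
    Turn("A", 0.0, 10.0),
    Turn("B", 10.0, 15.0),
    Turn("A", 15.0, 20.0),
]
features = turn_taking_features(turns)
```

In a full system, per-speaker vectors such as these would be passed to a statistical model that maps each participant to a role; that step is omitted here because the abstract does not specify the models.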
Item Type: | Thesis (PhD) |
---|---|
Qualification Level: | Doctoral |
Keywords: | Social Signal Processing, Automatic Role Recognition |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Colleges/Schools: | College of Science and Engineering > School of Computing Science |
Supervisor's Name: | Vinciarelli, Dr. Alessandro |
Date of Award: | 2013 |
Depositing User: | Mr Hugues Salamin |
Unique ID: | glathesis:2013-4367 |
Copyright: | Copyright of this thesis is held by the author. |
Date Deposited: | 11 Jun 2013 08:34 |
Last Modified: | 11 Jun 2013 08:34 |
URI: | https://theses.gla.ac.uk/id/eprint/4367 |