The voting model for people search

Macdonald, Craig (2009) The voting model for people search. PhD thesis, University of Glasgow.

Full text available as:
[thumbnail of 2009macdonaldphd.pdf] PDF
Download (6MB)

Abstract

The thesis investigates how persons in an enterprise organisation can be ranked in response to a query, so that those persons with relevant expertise to the query topic are ranked first. The expertise areas of the persons are represented by documentary evidence of expertise, known as candidate profiles. The statement of this research work is that the expert search task in an enterprise setting can be successfully and effectively modelled using a voting paradigm. In the so-called Voting Model, when a document is retrieved for a query, this document represents a vote for every expert associated with the document to have relevant expertise to the query topic. This voting paradigm is manifested by the proposition of various voting techniques that aggregate the votes from documents to candidate experts. Moreover, the research work demonstrates that these voting techniques can be modelled in terms of a Bayesian belief network, providing probabilistic semantics for the proposed voting paradigm.

The proposed voting techniques are thoroughly evaluated on three standard expert search test collections, deriving conclusions concerning each component of the Voting Model, namely the method used to identify the documents that represent each candidate's expertise areas, the weighting models that are used to rank the documents, and the voting techniques which are used to convert the ranking of documents into the ranking of experts. Effective settings are identified and insights about the behaviour of each voting technique are derived. Moreover, the practical aspects of deploying an expert search engine such as its efficiency and how it should be trained are also discussed.

This thesis includes an investigation of the relationship between the quality of the underlying ranking of documents and the resulting effectiveness of the voting techniques. The thesis shows that various effective document retrieval approaches have a positive impact on the performance of the voting techniques. Interestingly, it also shows that a `perfect' ranking of documents does not necessarily translate into an equally perfect ranking of candidates. Insights are provided into the reasons for this, which relate to the complexity of evaluating tasks based on ranking aggregates of documents.

Furthermore, it is shown how query expansion can be adapted and integrated into the expert search process, such that the query expansion successfully acts on a pseudo-relevant set containing only a list of names of persons. Five ways of performing query expansion in the expert search task are proposed, which vary in the extent to which they tackle expert search-specific problems, in particular, the occurrence of topic drift within the expertise evidence for each candidate.

Not all documentary evidence of expertise for a given person are equally useful, nor may there be sufficient expertise evidence for a relevant person within an enterprise. This thesis investigates various approaches to identify the high quality evidence for each person, and shows how the World Wide Web can be mined as a resource to find additional expertise evidence.

This thesis also demonstrates how the proposed model can be applied to other people search tasks such as ranking blog(ger)s in the blogosphere setting, and suggesting reviewers for the submitted papers to an academic conference.

The central contributions of this thesis are the introduction of the Voting Model, and the definition of a number of voting techniques within the model. The thesis draws insights from an extremely large and exhaustive set of experiments, involving many experimental parameters, and using different test collections for several people search tasks. This illustrates the effectiveness and the generality of the Voting Model at tackling various people search tasks and, indeed, the retrieval of aggregates of documents in general.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Keywords: Enterprise Information Retrieval, Expert Search, People Search, Blog Search
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: Ounis, Dr. Iadh
Date of Award: 2009
Depositing User: Dr Craig Macdonald
Unique ID: glathesis:2009-609
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 02 Mar 2009
Last Modified: 10 Dec 2012 13:20
URI: https://theses.gla.ac.uk/id/eprint/609

Actions (login required)

View Item View Item

Downloads

Downloads per month over past year