IfD - information for discrimination

Cai, Di (2004) IfD - information for discrimination. PhD thesis, University of Glasgow.

Printed Thesis Information: https://eleanor.lib.gla.ac.uk/record=b2219455


Term mismatch and ambiguity are long-standing and serious problems in information retrieval (IR). They can cause the system to formulate an incomplete and imprecise query representation, which leads to retrieval failure. Many query reformulation methods have been proposed to address the problem. These methods employ term classes that are considered related to individual query terms. They are hindered by the computational cost of term classification, and by the fact that the terms in a class are generally related to one specific query term belonging to that class rather than relevant to the context of the query as a whole.

In this thesis we propose a series of methods for automatic query reformulation (AQR). The methods constitute a formal model called IfD, standing for Information for Discrimination. In IfD, each discrimination measure is modelled as the information contained in terms that supports one of two opposite hypotheses. The extent to which a term is associated with the query can thus be defined directly in terms of the discrimination. The strength of association of each candidate term with the query can then be computed, and good terms can be selected to enhance the query.
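As a rough illustration of the idea only: the abstract does not specify the discrimination measures used in the thesis, so the sketch below assumes a Kullback-style directed divergence term, P(t|H1) log(P(t|H1)/P(t|H2)), as the score of a term t. Here H1 is taken to be "the term comes from a set of feedback documents for the query" and H2 "the term comes from a background set". The function names, the additive smoothing, and the toy documents are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def term_distribution(docs, vocab, smoothing=0.5):
    """Smoothed relative term frequencies over a list of tokenised documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts[t] for t in vocab) + smoothing * len(vocab)
    return {t: (counts[t] + smoothing) / total for t in vocab}


def discrimination_scores(feedback_docs, background_docs):
    """Score each term by p1 * log(p1 / p2).

    Assumed measure (not necessarily the thesis's): a positive score means
    the term supports H1 (feedback set), a negative score supports H2.
    """
    vocab = {tok for doc in feedback_docs + background_docs for tok in doc}
    p1 = term_distribution(feedback_docs, vocab)
    p2 = term_distribution(background_docs, vocab)
    return {t: p1[t] * math.log(p1[t] / p2[t]) for t in vocab}


def expand_query(query, feedback_docs, background_docs, k=2):
    """Append the k best-scoring new terms that favour H1 to the query."""
    scores = discrimination_scores(feedback_docs, background_docs)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))  # ties broken by name
    candidates = [t for t in ranked if t not in query and scores[t] > 0]
    return list(query) + candidates[:k]


# Toy data: the query term "jaguar" is ambiguous between a car and an animal.
feedback = [["jaguar", "car", "engine"], ["car", "engine", "speed"]]
background = [["jaguar", "cat", "jungle"], ["cat", "jungle", "prey"]]
print(expand_query(["jaguar"], feedback, background, k=2))
```

In this toy setting the ambiguous term "jaguar" is equally likely under both hypotheses and so carries no discriminating information, while terms that occur mainly in the feedback set score highest and are the ones added to the query.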

Justifications for IfD are presented from several aspects: formal interpretations of information for discrimination are introduced to show its soundness; criteria are put forward to show its rationality; properties of discrimination measures are analysed to show its appropriateness; examples are examined to show its usability; extension is discussed to show its potential; implementation is described to show its feasibility; comparisons with other methods are made to show its flexibility; improvements in retrieval performance are exhibited to show its capability. Our conclusion is that the advantages and promise of IfD should make it an indispensable methodology for AQR, which we believe can be an effective technique for improving retrieval performance.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: van Rijsbergen, Prof. Keith and Jose, Prof. Joemon M.
Date of Award: 2004
Depositing User: Angi Shields
Unique ID: glathesis:2004-3972
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 12 Feb 2013 10:00
Last Modified: 12 Feb 2013 10:00
URI: https://theses.gla.ac.uk/id/eprint/3972
