Human information processing based information retrieval

Graf, Erik (2011) Human information processing based information retrieval. PhD thesis, University of Glasgow.



This work investigated how the concept of relevance in Information Retrieval (IR) can be validated. The work is motivated by the persistent difficulty of defining the meaning of the concept, and by advances in the field of cognitive science.
Analytical and empirical investigations were carried out with the aim of devising a principled approach to the validation of the concept. The foundation for this work was set by interpreting relevance as a phenomenon occurring within the context of two systems: an IR system and the cognitive processing system of the user. In light of this cognitive interpretation of relevance, an analysis of the lessons learnt in cognitive science with regard to the validation of cognitive phenomena was conducted. It identified construct validity as the dominant approach to the validation of constructs in cognitive science. Construct validity is a proposal for conducting validation in scenarios where no direct observation of a phenomenon is possible. Given the limitations on direct observation of a construct (i.e. a postulated theoretical concept), it bases validation on the evaluation of the construct's relations to other constructs.
Based on the interpretation of relevance as a product of cognitive processing, it was concluded that the limitations with regard to direct observation also apply to the investigation of relevance.
The evaluation of the applicability of construct validity to an IR context focused on the exploration of the nomological network methodology. A nomological network is an analytically constructed set of constructs and their relations. The construction of such a network forms the basis for establishing construct validity through investigation of the relations between constructs. An analysis of contemporary insights into the nomological network methodology identified two important aspects with regard to its application in IR. The first is the choice of a context and the identification of a pool of candidate constructs for inclusion in the network. The second is the identification of criteria for selecting a set of constructs from the candidate pool.
The identification of the pertinent constructs for the network was based on a review of the principles of cognitive exploration and an analysis of the state of the art in text-based discourse processing and reasoning. On that basis, a listing of known sub-processes contributing to the pertinent cognitive processing was presented. Since this yielded a large number of potential candidates, the next step consisted of inferring criteria for the selection of an initial set of constructs for the network. The investigation of these criteria focused on pragmatic and meta-theoretical aspects. Based on a survey of experimental means in cognitive science and IR, five pragmatic criteria for the selection of constructs were presented. The consideration of meta-theoretically motivated criteria required investigating the specific challenges involved in validating highly abstract constructs. This question was explored based on the underlying considerations of the Information Processing paradigm and Newell’s (1994) cognitive bands. This led to the identification of a set of three meta-theoretical criteria for the selection of constructs.
Based on the criteria and the demarcated candidate pool, an IR-focused nomological network was defined. The network consists of the constructs of relevance and of the type and grade of word relatedness.
A necessary prerequisite for making inferences based on a nomological network is the availability of validated measurement instruments for the constructs. To that end, two validation studies targeting the measurement of the type and grade of relations between words were conducted. Clarifying the validity of the measurement instruments enabled the application of the nomological network. A first step of the application consisted of testing whether the constructs in the network are related to each other. Based on the alignment of measurements of relevance and the word-related constructs, it was concluded that they are. The relation between the constructs was characterized by varying the word-related constructs over a large parameter space and observing the effect of this variation on relevance. Three hypotheses relating to different aspects of the relations between the word-related constructs and relevance were investigated. It was concluded that the conclusive confirmation of the hypotheses requires an extension of the experimental means underlying the study. Based on converging observations from the empirical investigation of the three hypotheses, it was concluded that semantic and associative relations distinctly differ with regard to their impact on relevance estimation.
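
As a rough illustration only, and not the implementation used in the thesis, the idea of a nomological network and of testing whether two of its constructs are related can be sketched in Python. The class `NomologicalNetwork`, the helper `pearson`, the use of a Pearson correlation as the test of relatedness, and the toy measurements are all assumptions introduced here; only the three construct names come from the abstract.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class NomologicalNetwork:
    """Hypothetical container: constructs plus the relations assumed between them."""
    constructs: set = field(default_factory=set)
    relations: set = field(default_factory=set)  # unordered pairs of constructs

    def relate(self, a, b):
        """Register both constructs and an undirected relation between them."""
        self.constructs.update((a, b))
        self.relations.add(frozenset((a, b)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# The three constructs named in the abstract.
network = NomologicalNetwork()
network.relate("relevance", "type of word relatedness")
network.relate("relevance", "grade of word relatedness")

# Toy measurements invented purely for illustration (not data from the thesis).
grade_of_relatedness = [0.1, 0.3, 0.5, 0.7, 0.9]
relevance_judgement = [1.0, 2.0, 2.0, 4.0, 5.0]

# Vary one construct and observe how the other co-varies.
r = pearson(grade_of_relatedness, relevance_judgement)
print(f"correlation between grade of relatedness and relevance: {r:.2f}")
```

A correlation coefficient is only one possible way to quantify such a relation; the abstract does not state which statistic was used.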

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Keywords: Information Retrieval, Psycholinguistics, Cognitive Processing, Distributed Semantic Spaces, Relevance, Validity, Correlation to Cognition Paradigm
Subjects: B Philosophy. Psychology. Religion > B Philosophy (General)
B Philosophy. Psychology. Religion > BF Psychology
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources
Colleges/Schools: College of Science and Engineering > School of Computing Science
Supervisor's Name: van Rijsbergen, Professor Keith and Jose, Professor Joemon and Lalmas, Professor Mounia
Date of Award: 2011
Depositing User: Erik K. Graf
Unique ID: glathesis:2011-5188
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 08 Jul 2014 14:57
Last Modified: 08 Jul 2014 14:58
