Deriving a Quantum Information Retrieval Basis

MELUCCI, MASSIMO
2012

Abstract

Indexing is a core process of an information retrieval (IR) system (IRS). As indexing can neither be exhaustive nor precise, the decision taken by an IRS about the relevance of the content of a document to an information need is subject to uncertainty. It is our hypothesis that one of the reasons that IRSs are unable to optimally respond to every query is that the document collections and the posting lists are modeled as sets of documents. In contrast, if vector spaces replace sets along the way given by quantum mechanics, it is possible to define a quantum information retrieval basis (QIRB) that at least in principle yields more effective document ranking than the ranking yielded by the current principles, with effectiveness being measured in terms of recall at every level of fallout. To this end, we show that the probability ranking principle and the Neyman–Pearson Lemma (NPL) are equivalent. The rest of the article follows from this result. In particular, we introduce the QIRB, link it to a generalization of the NPL and demonstrate its superiority through a concise mathematical analysis and an empirical study. The challenges posed by this basis and the research directions that would be opened are also discussed.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/2529906