2010
Conference article  Unknown

Efficient dynamic pruning with proximity support

Tonellotto N., Macdonald C., Ounis I.

Information Search and Retrieval; Information Retrieval; Search Engines

Modern retrieval approaches do not rely on single-term weighting models alone when ranking documents: proximity weighting models, which assign high scores to pairs of query terms that co-occur close to each other in a document, are also in common use. Adopting these proximity weighting models adds computational overhead when documents are scored, which reduces the efficiency of the retrieval process. In this paper, we discuss the integration of proximity weighting models into efficient dynamic pruning strategies. In particular, we propose to modify document-at-a-time strategies to include proximity scoring without any modification to pre-existing index structures. The resulting two-stage dynamic pruning strategies consider only single query terms during first-stage pruning, but can terminate the proximity scoring of a document early if it can be shown that the document will never be retrieved. We empirically examine the efficiency benefits of our approach using a large Web test collection of 50 million documents and 10,000 queries from a real query log. Our results show that the proposed two-stage dynamic pruning strategies are considerably more efficient than the original strategies, particularly for queries of 3 or more terms.
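The early-termination idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy in-memory index, the `proximity_score` function, the per-pair upper bound `max_pair_score`, and the name `two_stage_top_k` are all assumptions made for illustration. The first stage is also simplified to exhaustive single-term scoring, with no posting-list skipping; only the second stage, which stops scoring term pairs once a document can no longer enter the top k, is shown.

```python
import heapq
from itertools import combinations


def proximity_score(pos_a, pos_b, max_pair_score=1.0):
    """Toy pair weight (an assumption): the bound divided by the smallest gap."""
    gap = min(abs(a - b) for a in pos_a for b in pos_b)
    return max_pair_score / max(1, gap)


def two_stage_top_k(index, query_terms, k, max_pair_score=1.0):
    """Return (top-k [(doc_id, score)], stats) for a toy in-memory index.

    index maps term -> {doc_id: (single_term_score, [positions])}.
    max_pair_score must be an upper bound on proximity_score.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    terms = sorted({t for t in query_terms if t in index})
    heap = []  # min-heap of (score, doc_id); heap[0] is the current k-th best
    stats = {"pairs_scored": 0, "docs_skipped": 0}
    doc_ids = sorted(set().union(*(index[t] for t in terms)))

    for doc_id in doc_ids:  # document-at-a-time traversal
        present = [t for t in terms if doc_id in index[t]]
        # Stage 1 (simplified): single-term scores only.
        score = sum(index[t][doc_id][0] for t in present)
        threshold = heap[0][0] if len(heap) == k else float("-inf")

        # Stage 2: proximity scoring with early termination.
        pairs = list(combinations(present, 2))
        pruned = False
        for i, (a, b) in enumerate(pairs):
            best_case = score + (len(pairs) - i) * max_pair_score
            if best_case <= threshold:
                pruned = True  # cannot beat the k-th best, stop scoring pairs
                break
            score += proximity_score(index[a][doc_id][1],
                                     index[b][doc_id][1], max_pair_score)
            stats["pairs_scored"] += 1
        if pruned:
            stats["docs_skipped"] += 1
            continue

        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > threshold:
            heapq.heapreplace(heap, (score, doc_id))

    results = sorted(((d, s) for s, d in heap), key=lambda item: -item[1])
    return results, stats


# Hypothetical toy data: term -> {doc_id: (single-term score, positions)}.
toy_index = {
    "dynamic": {1: (1.0, [0, 10]), 2: (0.2, [5]), 3: (0.9, [1])},
    "pruning": {1: (1.0, [1]), 2: (0.1, [50]), 3: (0.8, [2])},
    "proximity": {1: (1.0, [2]), 3: (0.7, [30])},
}
results, stats = two_stage_top_k(
    toy_index, ["dynamic", "pruning", "proximity"], k=1)
```

In this toy run, once document 1 sets the threshold, the other two documents are dropped before any of their term pairs are scored, because their single-term score plus the best possible proximity contribution cannot exceed it. Since each additional query term adds more pairs to score, the savings from stopping early would be expected to grow with query length, which is consistent with the abstract's observation about queries of 3 or more terms.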

Source: SIGIR 2010 - Workshop on Large Scale Distributed Search, pp. 31–35, Geneva, Switzerland, July 2010

Publisher: CEUR-WS.org, Aachen, DEU



BibTeX entry
@inproceedings{oai:it.cnr:prodotti:92088,
	title = {Efficient dynamic pruning with proximity support},
	author = {Tonellotto N. and Macdonald C. and Ounis I.},
	publisher = {CEUR-WS.org, Aachen, DEU},
	booktitle = {SIGIR 2010 - Workshop on Large Scale Distributed Search, Geneva, Switzerland, July 2010},
	pages = {31--35},
	year = {2010}
}