Conference article  Open Access

Efficient & Effective Selective Query Rewriting with Efficiency Predictions

Macdonald C., Tonellotto N., Ounis I.

Information retrieval 

To enhance effectiveness, a user's query can be rewritten internally by the search engine in many ways, for example by applying proximity, or by expanding the query with related terms. However, approaches that benefit effectiveness often have a negative impact on efficiency, which in turn harms user satisfaction if the query is excessively slow. In this paper, we propose a novel framework that uses the predicted execution time of various query rewritings to select between alternatives on a per-query basis, in a manner that ensures both effectiveness and efficiency. In particular, we propose predicting the execution time of ephemeral (e.g., proximity) posting lists generated from uni-gram inverted index posting lists, which are used to establish the permissible query rewriting alternatives that may execute in the allowed time. Experiments examining both the effectiveness and efficiency of the proposed approach demonstrate that a 49% decrease in mean response time (and a 62% decrease in 95th-percentile response time) can be attained without significantly hindering the effectiveness of the search engine.
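The per-query selection described above can be sketched as follows. This is a minimal illustration, not the paper's actual method: the rewriting names, their effectiveness ordering, the toy cost table, and the `predict_time` interface are all assumptions made for the example; a real predictor would be learned from posting-list statistics of the query terms.

```python
# Hypothetical sketch: choose the most effective query rewriting whose
# predicted execution time fits within a per-query latency budget.

def select_rewriting(query, rewritings, predict_time, budget_ms):
    """`rewritings` is assumed ordered from most to least effective;
    `predict_time(query, rewriting)` returns a predicted time in ms."""
    for rewriting in rewritings:
        if predict_time(query, rewriting) <= budget_ms:
            return rewriting
    # No rewriting fits: fall back to the cheapest (plain uni-gram) query.
    return rewritings[-1]


# Toy usage: proximity rewritings are assumed more effective but slower.
REWRITINGS = ["proximity+expansion", "proximity", "unigram"]
COSTS_MS = {"proximity+expansion": 220.0, "proximity": 120.0, "unigram": 40.0}

def toy_predictor(query, rewriting):
    # Stand-in for a learned efficiency predictor.
    return COSTS_MS[rewriting]

print(select_rewriting("web search latency", REWRITINGS, toy_predictor, 150.0))
# → proximity
```

Under this scheme a generous budget admits the richest rewriting, a tight budget degrades gracefully to cheaper alternatives, and the fallback guarantees every query is still answered.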

Source: SIGIR '17 - 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 495–504, Shinjuku, Tokyo, Japan, 7-11 July, 2017



BibTeX entry
@inproceedings{macdonald2017rewriting,
	title = {Efficient \& Effective Selective Query Rewriting with Efficiency Predictions},
	author = {Macdonald, Craig and Tonellotto, Nicola and Ounis, Iadh},
	booktitle = {Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17)},
	pages = {495--504},
	address = {Shinjuku, Tokyo, Japan},
	doi = {10.1145/3077136.3080827},
	year = {2017}
}