2014
Conference article  Open Access

A self-adapting latency/power tradeoff model for replicated search engines

Freire A., Macdonald C., Tonellotto N., Ounis I., Cacheda F.

Search Engines, Power Consumption

For many search settings, distributed/replicated search engines deploy a large number of machines to ensure efficient retrieval. This paper investigates how the power consumption of a replicated search engine can be automatically reduced when the system has low contention, without compromising its efficiency. We propose a novel self-adapting model to analyse the trade-off between latency and power consumption for distributed search engines. When query volumes are high and there is contention for the resources, the model automatically increases the necessary number of active machines in the system to maintain acceptable query response times. On the other hand, when the load of the system is low and the queries can be served easily, the model is able to reduce the number of active machines, leading to power savings. The model bases its decisions on examining the current and historical query loads of the search engine. Our proposal is formulated as a general dynamic decision problem, which can be quickly solved by dynamic programming in response to changing query loads. Thorough experiments are conducted to validate the usefulness of the proposed adaptive model using historical Web search traffic submitted to a commercial search engine. Our results show that our proposed self-adapting model can achieve an energy saving of 33% while only degrading mean query completion time by 10 ms compared to a baseline that provisions replicas based on a previous day's traffic.
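The abstract describes the approach only at a high level, so the following is a minimal illustrative sketch of how a replica-provisioning decision problem of this kind could be solved by dynamic programming. It is not the authors' formulation. The cost terms (a per-replica power cost, a utilisation-based latency proxy, a switching cost, and an overload penalty) and all names such as `plan_replicas` and `stage_cost` are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed penalty for a time slot in which demand exceeds total capacity.
OVERLOAD_PENALTY = 1e6


def stage_cost(load: float, n: int, capacity: float,
               power_cost: float, latency_weight: float) -> float:
    """Cost of running n replicas for one slot under the given load.

    Latency is approximated by rho / (1 - rho), a simple queueing-style
    proxy chosen for this sketch (an assumption, not the paper's model).
    """
    rho = load / (n * capacity)  # utilisation of the active replicas
    if rho >= 1.0:
        return n * power_cost + OVERLOAD_PENALTY
    return n * power_cost + latency_weight * rho / (1.0 - rho)


def plan_replicas(loads: Sequence[float], max_replicas: int, capacity: float,
                  power_cost: float, latency_weight: float,
                  switch_cost: float) -> Tuple[float, List[int]]:
    """Return (total cost, replicas per slot) minimising the summed cost.

    State: number of active replicas at the end of a slot.
    Transition: pay switch_cost for every replica turned on or off.
    """
    counts = range(1, max_replicas + 1)
    # Cheapest cost of reaching each state after the first slot.
    best: Dict[int, float] = {
        n: stage_cost(loads[0], n, capacity, power_cost, latency_weight)
        for n in counts
    }
    back: List[Dict[int, int]] = []  # back-pointers, one dict per later slot

    for load in loads[1:]:
        nxt: Dict[int, float] = {}
        choice: Dict[int, int] = {}
        for n in counts:
            # Best predecessor state for ending this slot with n replicas.
            prev = min(counts, key=lambda m: best[m] + switch_cost * abs(n - m))
            nxt[n] = (best[prev] + switch_cost * abs(n - prev)
                      + stage_cost(load, n, capacity, power_cost, latency_weight))
            choice[n] = prev
        back.append(choice)
        best = nxt

    # Pick the cheapest final state and walk the back-pointers.
    last = min(counts, key=lambda n: best[n])
    plan = [last]
    for choice in reversed(back):
        plan.append(choice[plan[-1]])
    plan.reverse()
    return best[last], plan


if __name__ == "__main__":
    # Hypothetical load forecast in queries per second: quiet, busy, quiet.
    total, plan = plan_replicas([10.0, 80.0, 15.0], max_replicas=4,
                                capacity=30.0, power_cost=1.0,
                                latency_weight=2.0, switch_cost=0.5)
    print(total, plan)
```

In this sketch the plan activates more replicas in the busy slot and fewer in the quiet ones, which mirrors the behaviour the abstract describes. The switching cost discourages turning machines on and off for small load changes; how the paper itself models such effects is not stated here.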

Source: WSDM'14 - 7th ACM International Conference on Web Search and Data Mining, New York, USA, 24-28 February 2014



BibTeX entry
@inproceedings{oai:it.cnr:prodotti:294071,
	title = {A self-adapting latency/power tradeoff model for replicated search engines},
	author = {Freire A. and Macdonald C. and Tonellotto N. and Ounis I. and Cacheda F.},
	doi = {10.1145/2556195.2556246},
	booktitle = {WSDM'14 - 7th ACM International Conference on Web Search and Data Mining, New York, USA, 24-28 February 2014},
	year = {2014}
}

MIDAS
Model and Inference Driven, Automated testing of Services architectures

SMART
Search engine for MultimediA enviRonment generated contenT

