2012
Conference article  Open Access

Scheduling queries across replicas

Freire A., Macdonald C., Tonellotto N., Ounis I., Cacheda F.

Experimentation  Performance  H.3.3 Information Search & Retrieval 

For increased efficiency, an information retrieval system can split its index into multiple shards, and then replicate these shards across many query servers. For each new query, an appropriate replica for each shard must be selected, such that the query is answered as quickly as possible. Typically, the replica with the lowest number of queued queries is selected. However, not every query takes the same time to execute, particularly if a dynamic pruning strategy is applied by each query server. Hence, the replica's queue length is an inaccurate indicator of the workload of a replica, and can result in inefficient usage of the replicas. In this work, we propose that improved replica selection can be obtained by using query efficiency prediction to measure the expected workload of a replica. Experiments are conducted using 2.2k queries, over various numbers of shards and replicas for the large GOV2 collection. Our results show that query waiting and completion times can be markedly reduced, showing that accurate response time predictions can improve scheduling accuracy and attesting to the benefit of the proposed scheduling algorithm.
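The contrast drawn in the abstract between queue-length selection and prediction-based selection can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Replica` class, the function names, and the predicted times in milliseconds are all hypothetical, and the query efficiency predictor itself is assumed to exist elsewhere and is represented only by the numbers it would produce.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Replica:
    """One replica of a shard (hypothetical model for illustration).

    queued_ms holds the predicted execution time, in milliseconds,
    of each query currently waiting on this replica.
    """
    name: str
    queued_ms: List[float] = field(default_factory=list)

    def queue_length(self) -> int:
        return len(self.queued_ms)

    def predicted_workload(self) -> float:
        # Expected time to drain the queue, assuming the predictions hold.
        return sum(self.queued_ms)


def select_by_queue_length(replicas: List[Replica]) -> Replica:
    """Baseline: pick the replica with the fewest queued queries."""
    return min(replicas, key=lambda r: r.queue_length())


def select_by_predicted_workload(replicas: List[Replica]) -> Replica:
    """Prediction-based: pick the replica with the least predicted work."""
    return min(replicas, key=lambda r: r.predicted_workload())


def schedule(replicas: List[Replica], predicted_ms: float,
             select: Callable[[List[Replica]], Replica]) -> Replica:
    """Assign a new query, with its predicted time, to the selected replica."""
    chosen = select(replicas)
    chosen.queued_ms.append(predicted_ms)
    return chosen


# Made-up state: A holds one expensive query, B holds two cheap ones.
replicas = [
    Replica("A", [900.0]),
    Replica("B", [50.0, 60.0]),
]
```

With this made-up state, `select_by_queue_length(replicas)` returns replica A because it has a single queued query, even though that query is predicted to take 900 ms, whereas `select_by_predicted_workload(replicas)` returns replica B, whose two queries are predicted to take 110 ms in total. A new query sent to B would therefore be expected to wait far less, which is the effect the abstract attributes to prediction-based scheduling.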

Source: 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1139–1140, Portland, OR, USA, 12-16 August 2012

Publisher: ACM Press, New York, USA


BibTeX entry
@inproceedings{oai:it.cnr:prodotti:218934,
	title = {Scheduling queries across replicas},
	author = {Freire A. and Macdonald C. and Tonellotto N. and Ounis I. and Cacheda F.},
	publisher = {ACM Press, New York, USA},
	doi = {10.1145/2348283.2348508},
	booktitle = {35th International ACM SIGIR Conference on Research and Development in Information Retrieval, Portland, OR, USA, 12-16 August 2012},
	pages = {1139--1140},
	year = {2012}
}