2022

Journal article Open Access

Caching historical embeddings in conversational search

Frieder O., Mele I., Muntean C., Nardini F. M., Perego R., Tonellotto N.

Keywords: Information Retrieval (cs.IR); FOS: Computer and information sciences; Computer Science - Information Retrieval; Conversational search; Caching; Dense retrieval; Similarity search

Rapid response, namely low latency, is fundamental in search applications; it is particularly so in interactive search sessions, such as those encountered in conversational settings. An observation with the potential to reduce latency is that conversational queries exhibit temporal locality in the lists of documents retrieved. Motivated by this observation, we propose and evaluate a client-side document embedding cache that improves the responsiveness of conversational search systems. By leveraging state-of-the-art dense retrieval models to abstract document and query semantics, we cache the embeddings of documents retrieved for a topic introduced in the conversation, as they are likely relevant to successive queries. Our document embedding cache implements an efficient metric index, answering nearest-neighbor similarity queries by estimating the approximate result sets returned. We demonstrate the efficiency achieved using our cache via reproducible experiments based on TREC CAsT datasets, achieving a hit rate of up to 75% without degrading answer quality. These high cache hit rates significantly improve the responsiveness of conversational systems while also reducing the number of queries managed on the search back-end.
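The caching idea described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the article: the `EmbeddingCache` class, the `toy_backend` function, the tiny 2-D `CORPUS`, the use of Euclidean distance, the brute-force scan standing in for a metric index, and the radius-based hit test (a query counts as a hit when it falls inside the ball covered by an earlier back-end answer). It shows the general pattern only: embeddings fetched on a miss are kept client-side, and nearby follow-up queries are answered from them.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy corpus of 2-D document embeddings (illustrative only).
CORPUS = {
    "d1": (0.0, 0.0), "d2": (1.0, 0.0), "d3": (0.0, 1.0),
    "d4": (5.0, 5.0), "d5": (5.0, 6.0),
}

def toy_backend(query, k):
    """Stand-in for the search back-end: exact top-k by brute force."""
    ranked = sorted(CORPUS.items(), key=lambda item: dist(query, item[1]))
    return ranked[:k]

class EmbeddingCache:
    """Client-side cache of document embeddings (illustrative sketch)."""

    def __init__(self, backend, k_cache, epsilon=0.0):
        self.backend = backend    # callable(query, k) -> [(doc_id, vector)]
        self.k_cache = k_cache    # documents fetched from the back-end per miss
        self.epsilon = epsilon    # safety margin for the assumed hit test
        self.docs = {}            # doc_id -> cached embedding
        self.regions = []         # (past query, radius covered by its answer)
        self.hits = 0
        self.misses = 0

    def _covered(self, query):
        # Assumed hit test: the query lies inside a ball that a previous
        # back-end answer already covers, with a margin larger than epsilon.
        return any(radius - dist(query, center) > self.epsilon
                   for center, radius in self.regions)

    def search(self, query, k):
        if self._covered(query):
            self.hits += 1
        else:
            self.misses += 1
            results = self.backend(query, self.k_cache)
            for doc_id, vector in results:
                self.docs[doc_id] = vector
            if results:
                radius = max(dist(query, vector) for _, vector in results)
                self.regions.append((query, radius))
        # Answer from the cached embeddings; a real system would use an
        # index here instead of sorting every cached document.
        ranked = sorted(self.docs.items(), key=lambda item: dist(query, item[1]))
        return [doc_id for doc_id, _ in ranked[:k]]

cache = EmbeddingCache(toy_backend, k_cache=3)
first = cache.search((0.2, 0.1), k=2)   # cold cache: goes to the back-end
second = cache.search((0.3, 0.1), k=2)  # nearby follow-up: served from the cache
```

In this sketch, `k_cache` is deliberately larger than the `k` requested by the user, so that one back-end round trip stores enough embeddings to serve later queries on the same topic. The `hits` and `misses` counters give the hit rate for a session.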

Source: ACM TRANSACTIONS ON THE WEB, vol. 18 (issue 4)

Cite as

BibTeX entry

@article{oai:iris.cnr.it:20.500.14243/504502,
	title = {Caching historical embeddings in conversational search},
	author = {Frieder O. and Mele I. and Muntean C. and Nardini F. M. and Perego R. and Tonellotto N.},
	doi = {10.1145/3578519},
	note = {arXiv DOI: 10.48550/arxiv.2211.14155},
	year = {2022}
}