2022
Journal article  Open Access

Caching historical embeddings in conversational search

Frieder O., Mele I., Muntean C., Nardini F. M., Perego R., Tonellotto N.

Information Retrieval (cs.IR); FOS: Computer and information sciences; Computer Science - Information Retrieval; Conversational search; Caching; Dense retrieval; Similarity search

Rapid response, that is, low latency, is fundamental in search applications, and particularly so in interactive search sessions such as those encountered in conversational settings. One observation with the potential to reduce latency is that conversational queries exhibit temporal locality in the lists of documents they retrieve. Motivated by this observation, we propose and evaluate a client-side document embedding cache that improves the responsiveness of conversational search systems. Leveraging state-of-the-art dense retrieval models to abstract document and query semantics, we cache the embeddings of documents retrieved for a topic introduced in the conversation, as they are likely to be relevant to successive queries. Our document embedding cache implements an efficient metric index that answers nearest-neighbor similarity queries by estimating the approximate result sets returned. We demonstrate the efficiency of our cache through reproducible experiments on the TREC CAsT datasets, achieving a hit rate of up to 75% without degrading answer quality. These high cache hit rates significantly improve the responsiveness of conversational systems while also reducing the number of queries handled by the search back-end.
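
The caching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not spell out the hit test or the index structure, so the sketch assumes a simple ball-containment rule under Euclidean distance and replaces the metric index with a linear scan. The names `make_backend`, `EmbeddingCache`, `k_cache`, and `epsilon` are hypothetical and chosen only for illustration.

```python
import math
import random


def make_backend(corpus):
    """Hypothetical stand-in for the remote index: exact brute-force search."""
    def search(query, k):
        ranked = sorted(range(len(corpus)),
                        key=lambda i: math.dist(corpus[i], query))
        return [(i, corpus[i]) for i in ranked[:k]]
    return search


class EmbeddingCache:
    """Toy client-side cache of document embeddings (illustrative only)."""

    def __init__(self, backend_search, k_cache=100, epsilon=0.0):
        self.backend_search = backend_search  # callable: (query, k) -> [(id, vec)]
        self.k_cache = k_cache                # documents fetched per back-end call
        self.epsilon = epsilon                # assumed threshold for the hit test
        self.docs = {}                        # doc id -> embedding
        self.past_queries = []                # (query embedding, covered radius)
        self.hits = 0
        self.misses = 0

    def _covered(self, query):
        # Assumed hit test: if the query lies inside the ball of a past
        # back-end query, every document within (radius - distance) of the
        # new query is already cached, by the triangle inequality.
        return any(radius - math.dist(query, q_a) >= self.epsilon
                   for q_a, radius in self.past_queries)

    def _fill(self, query):
        results = self.backend_search(query, self.k_cache)
        for doc_id, vec in results:
            self.docs[doc_id] = vec
        radius = max(math.dist(vec, query) for _, vec in results)
        self.past_queries.append((query, radius))

    def search(self, query, k=3):
        if self._covered(query):
            self.hits += 1
        else:
            self.misses += 1
            self._fill(query)
        # Linear scan over cached embeddings; a metric index would go here.
        ranked = sorted(self.docs, key=lambda i: math.dist(self.docs[i], query))
        return ranked[:k]


random.seed(0)
corpus = [[random.gauss(0, 1) for _ in range(8)] for _ in range(500)]
backend = make_backend(corpus)
cache = EmbeddingCache(backend, k_cache=50)

topic = corpus[0]                       # first utterance on a topic
follow_up = [x + 0.01 for x in topic]   # a nearby follow-up query
first = cache.search(topic)             # cold cache: goes to the back-end
second = cache.search(follow_up)        # close to the first query: served locally
```

In this sketch a larger `epsilon` makes the cache more conservative: fewer queries count as hits, but a larger neighborhood around each hit query is guaranteed to be fully cached. How the published system balances hit rate against answer quality is described in the article itself.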

Source: ACM TRANSACTIONS ON THE WEB, vol. 18 (issue 4)

BibTeX entry
@article{oai:iris.cnr.it:20.500.14243/504502,
	title = {Caching historical embeddings in conversational search},
	author = {Frieder O. and Mele I. and Muntean C. and Nardini F. M. and Perego R. and Tonellotto N.},
	doi = {10.1145/3578519 and 10.48550/arxiv.2211.14155},
	year = {2022}
}