2024
Conference article  Open Access

Improving RAG systems via sentence clustering and reordering

Alessio M., Faggioli G., Ferro N., Nardini F. M., Perego R.

Arrangement Strategy, Conversational Search, Positional Bias, Retrieval Augmented Generation

Large Language Models (LLMs) have gained considerable attention across many domains in recent years. Information Retrieval (IR) is one of the domains they have impacted the most, as witnessed by the recent increase in the number of IR systems incorporating generative models. Specifically, Retrieval Augmented Generation (RAG) is an emerging paradigm that integrates existing knowledge from large-scale document corpora into the generation process, enabling the model to generate more coherent, contextually relevant, and accurate text across tasks such as summarization, question answering, and dialogue systems. Recent studies have highlighted the significant positional dependence exhibited by RAG systems, observing that the placement of information within the LLM input prompt drastically affects the generated output. We ground our study on this property by investigating alternative strategies for ordering sentences within the LLM prompt to improve the average quality of the responses generated in dialogues between users and a conversational system. We propose the architecture of an end-to-end RAG-based conversational assistant and empirically evaluate our strategies using the TREC CAsT 2022 collection. Our experiments highlight significant differences between distinct arrangement strategies. By employing an evaluation methodology based on RankVicuna, we show that our best approach achieves improvements of up to 54% in terms of overall response quality over baseline methods.
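
To make the notion of an arrangement strategy concrete, the sketch below shows three ways of ordering retrieved sentences before they are placed in an LLM prompt. It is an illustration only and is not taken from the paper: the function names, the prompt template, and the assumption that each sentence already carries a relevance score are all hypothetical, and the strategies shown may differ from those the authors evaluate.

```python
from typing import List, Tuple

# Assumption: each retrieved sentence comes with a precomputed relevance score.
Scored = Tuple[str, float]


def descending(sentences: List[Scored]) -> List[str]:
    """Most relevant sentence first."""
    ranked = sorted(sentences, key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked]


def ascending(sentences: List[Scored]) -> List[str]:
    """Most relevant sentence last, closest to the question."""
    ranked = sorted(sentences, key=lambda pair: pair[1])
    return [text for text, _ in ranked]


def edges_first(sentences: List[Scored]) -> List[str]:
    """Alternate ranked sentences between the start and the end of the
    context, so the least relevant ones end up in the middle."""
    ranked = descending(sentences)
    front = ranked[0::2]   # ranks 1, 3, 5, ... fill the start
    back = ranked[1::2]    # ranks 2, 4, 6, ... fill the end
    return front + back[::-1]


def build_prompt(question: str, ordered: List[str]) -> str:
    """Hypothetical prompt template: context sentences, then the question."""
    context = "\n".join(f"- {text}" for text in ordered)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    scored = [("a", 0.9), ("b", 0.1), ("c", 0.5), ("d", 0.7)]
    print(build_prompt("example question?", edges_first(scored)))
```

The same set of sentences yields a different prompt under each strategy, which is the variable a study of positional bias would manipulate while holding the retrieved content fixed.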

Source: CEUR WORKSHOP PROCEEDINGS, vol. 3784, pp. 34-43. Washington DC, USA, 07/07/2024

Publisher: CEUR-WS



BibTeX entry
@inproceedings{oai:iris.cnr.it:20.500.14243/520153,
	title = {Improving RAG systems via sentence clustering and reordering},
	author = {Alessio M. and Faggioli G. and Ferro N. and Nardini F. M. and Perego R.},
	publisher = {CEUR-WS},
	booktitle = {CEUR WORKSHOP PROCEEDINGS, vol. 3784, pp. 34-43. Washington DC, USA, 07/07/2024},
	year = {2024}
}

EFRA
Extreme Food Risk Analytics

