2026
Journal article
Open Access
Getting off the DIME: dimension pruning via dimension importance estimation for dense information retrieval
Faggioli Guglielmo, Ferro Nicola, Perego Raffaele, Tonellotto Nicola
Dense Information Retrieval (IR) systems rely on neural networks to embed documents and queries within a latent low-dimensional space. Among the Dense IR approaches, bi-encoders are particularly popular, as they achieve state-of-the-art performance and allow for efficient encoding of documents and queries. Nevertheless, using this class of systems, by construction, all the documents and queries are represented using the same set of dimensions. In this article, we introduce the Manifold Clustering (MC) hypothesis which states that, for each query, there exists a query-dependent manifold of the original embedding space where the query and documents relevant to it cluster more effectively. We empirically validate the MC hypothesis showing that it is possible to find a query-dependent linear subspace of the original embedding space where high retrieval effectiveness is achieved.
Source: ACM TRANSACTIONS ON INFORMATION SYSTEMS, vol. 44 (issue 1), pp. 1-34
DOI: 10.1145/3765619
See at:
dl.acm.org
| CNR IRIS
| ACM Transactions on Information Systems
2026
Journal article
Open Access
Projection-displacement-based query performance prediction for embedded space of dense retrievers
Datta Suchana, Faggioli Guglielmo, Ferro Nicola, Ganguly Debasis, Muntean Cristina Ioana, Perego Raffaele, Tonellotto Nicola
Recent advances in representation learning have enabled neural Information Retrieval (IR) systems to use learned dense representations for queries and documents to effectively handle semantics, language nuances, and vocabulary mismatch problems. In contrast to traditional IR systems that rely on word matching, dense IR models exploit query/document similarity in dense latent spaces to account for semantics. This requires substantial training data and comes with increased computational demands. Thus, it would be beneficial to predict how a system will perform for a given query to decide whether a dense IR model is the best option or alternatives should be used. Traditional Query Performance Prediction (QPP) models are designed for lexical IR approaches and perform sub-optimally when applied to dense neural IR systems. Therefore, there has been a renewed interest in QPP methods to improve their effectiveness for dense neural IR models. While the results of the new QPP methods are generally encouraging, there is ample room for improvement in absolute performance and stability. We argue that by using features more aligned with the underlying rationale of dense IR models, we can enhance the performance of QPP. In this respect, we propose the Projection-Displacement-Based QPP (PDQPP), which exploits the geometric properties of dense IR models, projects queries and retrieved documents onto subspaces defined by pseudo-relevant documents, and considers changes in retrieval scores within them as a proxy for retrieval coherence. Minor score changes suggest robust and coherent retrieval, while significant alterations indicate semantic divergence and potentially poor performance. Results over a wide range of experimental settings on both traditional (TREC Robust) and neural-oriented (TREC Deep Learning) test collections show that PDQPP mostly outperforms the state-of-the-art QPP baselines.
Source: ACM TRANSACTIONS ON INFORMATION SYSTEMS, vol. 44 (issue 1), pp. 1-30
DOI: 10.1145/3765617
See at:
dl.acm.org
| CNR IRIS
| ACM Transactions on Information Systems
2025
Journal article
Open Access
ChatGPT versus modest large language models: an extensive study on benefits and drawbacks for conversational search
Rocchietti G., Rulli C., Nardini F. M., Muntean Cristina Ioana, Perego R., Frieder O.
Large Language Models (LLMs) are effective in modeling text syntactic and semantic content, making them a strong choice to perform conversational query rewriting. While previous approaches proposed NLP-based custom models, requiring significant engineering effort, our approach is straightforward and conceptually simpler. Not only do we improve effectiveness over the current state-of-the-art, but we also curate the cost and efficiency aspects. We explore the use of pre-trained LLMs fine-tuned to generate quality user query rewrites, aiming to reduce computational costs while maintaining or improving retrieval effectiveness. As a first contribution, we study various prompting approaches - including zero, one, and few-shot methods - with ChatGPT (e.g., gpt-3.5-turbo). We observe an increase in the quality of rewrites leading to improved retrieval. We then fine-tuned smaller open LLMs on the query rewriting task. Our results demonstrate that our fine-tuned models, including the smallest with 780 million parameters, achieve better performance during the retrieval phase than gpt-3.5-turbo. To fine-tune the selected models, we used the QReCC dataset, which is specifically designed for query rewriting tasks. For evaluation, we used the TREC CAsT datasets to assess the retrieval effectiveness of the rewrites of both gpt-3.5-turbo and our fine-tuned models. Our findings show that fine-tuning LLMs on conversational query rewriting datasets can be more effective than relying on generic instruction-tuned models or traditional query reformulation techniques.
Source: IEEE ACCESS, vol. 13, pp. 15253-15271
DOI: 10.1109/access.2025.3529741
See at:
IEEE Access
| CNR IRIS
| ieeexplore.ieee.org
2025
Journal article
Open Access
Explainable, effective, and efficient learning-to-rank models using ILMART
Lucchese C., Nardini F. M., Orlando S., Perego R., Veneri A.
Learning ranking models that are both explainable and effective is an emerging topic within the research area of explainable AI. Several Learning-to-Rank (LtR) algorithms have been recently proposed that build models that are simple to explain and, at the same time, almost as effective as their state-of-the-art, black-box counterparts. In this work, we propose Interpretable LambdaMART (ILMART), a novel framework with different strategies to constrain the state-of-the-art LtR LambdaMART algorithm to generate interpretable models, i.e., ensembles whose trees can use either single features (main effects) or a limited number of interacting features (interaction effects). ILMART facilitates a straightforward tradeoff between model explainability and effectiveness by precisely tuning the quantity of main and interaction effects during the learning phase. We show that slightly increasing their number allows ILMART models to reach ranking performances at par with full-complexity LambdaMART ones. Furthermore, reproducible experiments conducted on publicly available LtR datasets demonstrate that ILMART can improve nDCG@10 by up to 10% compared to state-of-the-art competitors while preserving an explainable structure. Finally, we explore the relationship between model explainability and inference efficiency by introducing a novel and easy-to-implement scoring algorithm for ILMART ranking models, achieving up to a speedup compared to the baseline.
Source: ACM TRANSACTIONS ON INFORMATION SYSTEMS, vol. 43 (issue 4)
DOI: 10.1145/3733232
Project(s): EFRA
See at:
dl.acm.org
| CNR IRIS
| ACM Transactions on Information Systems
2025
Other
Open Access
ISTI-day 2025 Proceedings
Del Corso G., Pedrotti A., Federico G., Gennaro C., Carrara F., Amato G., Di Benedetto M., Gabrielli E., Belli D., Matrullo Z., Miori V., Tolomei G., Waheed T., Marchetti E., Calabrò A., Rossetti G., Stella M., Cazabet R., Abramski K., Cau E., Citraro S., Failla A., Mesina V., Morini V., Pansanella V., Colantonio S., Germanese D., Pascali M. A., Bianchi L., Messina N., Falchi F., Barsellotti L., Pacini G., Cassese M., Puccetti G., Esuli A., Volpi L., Moreo A., Sebastiani F., Sperduti G., Nguyen D., Broccia G., Ter Beek M. H., Ferrari A., Massink M., Belmonte G., Ciancia V., Papini O., Canapa G., Catricalà B., Manca M., Paternò F., Santoro C., Zedda E., Gallo S., Maenza S., Mattioli A., Simeoli L., Rucci D., Carlini E., Dazzi P., Kavalionak H., Mordacchini M., Rulli C., Muntean Cristina Ioana, Nardini F. M., Perego R., Rocchietti G., Lettich F., Renso C., Pugliese C., Casini G., Haldimann J., Meyer T., Assante M., Candela L., Dell'Amico A., Frosini L., Mangiacrapa F., Oliviero A., Pagano P., Panichi G., Peccerillo B., Procaccini M., Mannocci A., Manghi P., Lonetti F., Kang D., Di Giandomenico F., Jee E., Lazzini G., Conti F., Scopigno R., D'Acunto M., Moroni D., Cafiso M., Paradisi P., Callieri M., Pavoni G., Corsini M., De Falco A., Sala F., Saraceni Q., Gattiglia G.
ISTI-Day is an annual information and networking event organized by the Institute of Information Science and Technologies "A. Faedo" (ISTI) of the Italian National Research Council (CNR). This event features an opening talk of the Director of the Dept. DIITET (Emilio F. Campana) as well as an overview of the Institute's activities presented by the ISTI Director (Roberto Scopigno). Those institutional segments are complemented by dedicated presentations and round tables featuring former staff members, as well as internal and external collaborators. To foster a network of knowledge and collaboration among newcomers, the 2025 ISTI Day edition also includes a large poster session that provides a comprehensive overview of current research activities. Each of the 13 laboratories contributes 1–3 posters, highlighting the most innovative work and offering early-career researchers a platform for discussion. Thus these proceedings include the posters selected for ISTI-Day 2025, reflecting the diverse and innovative nature of the Institute's research.
See at:
CNR IRIS
| www.isti.cnr.it
2025
Conference article
Open Access
Efficient re-ranking with cross-encoders via early exit
Busolin F., Lucchese C., Nardini F. M., Orlando S., Perego R., Trani S., Veneri A.
Pre-trained language models based on transformer networks are highly effective for document re-ranking in ad-hoc search. Among these, cross-encoders stand out for their effectiveness, as they process query-document pairs through the entire transformer network to compute ranking scores. However, this traversal is computationally expensive. To address this, prior work has explored early-exit strategies, enabling the model to terminate the traversal of query-document pairs. These techniques rely on learned classifiers, placed after each transformer block, that decide if a query-document pair can be dropped. Diverging from previous approaches, we propose Similarity-based Early Exit (SEE), a novel, non-learned strategy that exploits the similarities between query and document token embeddings to early-terminate the inference of documents that will most likely be non-relevant to the query. Even though SEE can be used after every transformer block, we show that the best advantage is achieved when applied before the first transformer block, thus saving most of the inference cost for the query-document pairs. Reproducible experiments on 17 public datasets covering in-domain and out-of-domain evaluation show that SEE can be effectively applied to four different cross-encoders, achieving speedups of up to 3.5× with a limited loss in ranking effectiveness.
DOI: 10.1145/3726302.3729962
See at:
dl.acm.org
| doi.org
| CNR IRIS
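The abstract of "Efficient re-ranking with cross-encoders via early exit" describes dropping query-document pairs before the first transformer block by comparing token embeddings. A minimal sketch of that idea follows. The scoring rule (mean over query tokens of the best cosine match among document tokens), the fixed `threshold`, and all function names are assumptions made for illustration; the abstract does not specify the actual SEE criterion.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def token_similarity(query_tokens, doc_tokens):
    """Hypothetical score: mean over query tokens of the best-matching document token."""
    best = [max(cosine(q, d) for d in doc_tokens) for q in query_tokens]
    return sum(best) / len(best)


def early_exit_filter(query_tokens, candidates, threshold):
    """Split candidate ids into (kept, dropped) before any transformer block runs.

    Only the kept documents would be passed on to the cross-encoder.
    """
    kept, dropped = [], []
    for doc_id, doc_tokens in candidates.items():
        score = token_similarity(query_tokens, doc_tokens)
        (kept if score >= threshold else dropped).append(doc_id)
    return kept, dropped


# Toy token embeddings: d1 aligns with the query tokens, d2 points the opposite way.
query = [[1.0, 0.0], [0.0, 1.0]]
candidates = {
    "d1": [[1.0, 0.0], [0.0, 1.0]],
    "d2": [[-1.0, 0.0], [0.0, -1.0]],
}
kept, dropped = early_exit_filter(query, candidates, threshold=0.5)
```

Because the filter only needs the input token embeddings, its cost is small compared with a full cross-encoder pass, which is the source of the savings the abstract reports.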
2025
Conference article
Open Access
A spatially-grounded conversational planner for personalized urban itineraries
Pugliese C., Amendola M., Perego R., Renso C.
We present a demo of RAGTrip, a modular conversational system that integrates Large Language Models (LLMs), spatial reasoning, and information retrieval to generate personalized walking itineraries in urban environments. Unlike traditional route planners or closed-book LLMs, RAGTrip interprets nuanced user preferences, avoids hallucinations, and grounds its suggestions in real-world geographic and factual data. The system features an interactive conversational interface that engages users in refining both the itinerary and the attractions to visit. Through dynamic map visualizations and contextual responses, users can explore and iteratively customize their routes. The demo includes a toggle to enable or disable Retrieval-Augmented Generation (RAG), allowing direct comparison between RAG-enhanced and closed-book LLM responses. This highlights the value of combining spatial and semantic grounding in conversational itinerary recommendation.
DOI: 10.1145/3748636.3762795
Project(s): Italian National Recovery and Resilience Plan (NRRP) of NextGenerationEU, partnership on “Telecommunications of the Future” ( - program “RESTART”).
See at:
dl.acm.org
| CNR IRIS
2025
Conference article
Restricted
Query performance prediction using dimension importance estimators
Faggioli G., Ferro N., Perego R., Tonellotto N.
Query Performance Prediction (QPP) tends to fall short when predicting the performance of dense Information Retrieval (IR) systems. Therefore, the research community is investigating QPP approaches designed to synergize with this class of state-of-the-art IR models. At the same time, recent advances concerning dense IR have shown that we can improve the retrieval performance by projecting embeddings in a (query-wise) optimal linear subspace of the dense representation space. The Dimension IMportance Estimation (DIME) framework was proposed to identify such optimal subspaces on a query-by-query basis. In this paper, we illustrate how to design QPP models that rely on measuring the alignment between the query and document representations and the optimal DIME dimensions, based on the hypothesis that good alignment indicates better retrieval performance. We experimentally evaluate the proposed QPPs, showing that our approach outperforms the state-of-the-art when predicting the performance of two commonly used dense encoders, Contriever and TAS-B, on two popular TREC collections, Deep Learning 2019 and 2020.
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 15573, pp. 202-217. Lucca, Italy, 06-10/04/2025
DOI: 10.1007/978-3-031-88711-6_13
See at:
doi.org
| Padua research Archive (Archivio istituzionale della ricerca - Università di Padova)
| CNR IRIS
2025
Conference article
Open Access
CoDIME: counterfactual approach for dimension importance estimation through click logs
Faggioli G., Ferro N., Perego R., Tonellotto N.
Contextual dense representation models for text marked a shift in text processing, enabling a richer semantic understanding of the text and more effective Information Retrieval. These models project pieces of text into a latent space, describing them in terms of shared latent concepts, which are not explicitly tied to the text’s content. Previous work has shown that certain dimensions of such dense text representations can be irrelevant and detrimental to retrieval effectiveness depending on the information need specified in the query. Higher effectiveness can be achieved by performing retrieval within a linear subspace that excludes these dimensions. Dimension IMportance Estimators (DIMEs) are models designed to identify such harmful dimensions, refining the representations of queries and documents to retain only the useful ones. Current DIMEs rely either on pseudo-relevance feedback, which often delivers inconsistent effectiveness, or on explicit relevance feedback, which is challenging to collect. Inspired by counterfactual modelling, we introduce Counterfactual DIMEs (CoDIMEs), designed to leverage noisy implicit feedback to assess the importance of each dimension. The CoDIME framework presented here approximates the relationship between a document’s click frequency and its interaction with a given query dimension through a linear model. Empirical evaluations demonstrate that CoDIME outperforms traditional pseudo-relevance feedback-based DIMEs and surpasses other unsupervised counterfactual methods that utilize implicit feedback.
DOI: 10.1145/3726302.3729926
Project(s): EFRA
See at:
dl.acm.org
| CNR IRIS
| doi.org
2025
Conference article
Open Access
CoSRec: a joint conversational search and recommendation dataset
Alessio M., Merlo S., Di Noia T., Faggioli G., Ferrante M., Ferro N., Muntean Cristina Ioana, Nardini F. M., Narducci F., Perego R., Santucci G., Viterbo N.
Conversational Information Access systems have experienced widespread diffusion thanks to the natural and effortless interactions they enable with the user. In particular, they represent an effective interaction interface for conversational search (CS) and conversational recommendation (CR) scenarios. Despite their commonalities, CR and CS systems are often devised, developed, and evaluated as isolated components. Integrating these two elements would allow for handling complex information access scenarios, such as exploring unfamiliar recommended product aspects, enabling richer dialogues, and improving user satisfaction. As of today, the scarce availability of integrated datasets, focused exclusively on either of the tasks, limits the possibilities for evaluating by-design integrated CS and CR systems. To address this gap, we propose CoSRec, the first dataset for joint Conversational Search and Recommendation (CSR) evaluation. The CoSRec test set includes 20 high-quality conversations, with human-made annotations for the quality of conversations, and manually crafted relevance judgments for products and documents. Additionally, we provide supplementary training data comprising partially annotated dialogues and raw conversations to support diverse learning paradigms. CoSRec is the first resource to model CR and CS tasks in a unified framework, enabling the training and evaluation of systems that must shift between answering queries and making suggestions dynamically.
DOI: 10.1145/3726302.3730319
See at:
dl.acm.org
| CNR IRIS
| Padua research Archive (Archivio istituzionale della ricerca - Università di Padova)
2025
Journal article
Open Access
PARK: Personalized Academic Retrieval with Knowledge-graphs
Kasela P., Pasi G., Perego R.
Academic Search is a search task aimed to manage and retrieve scientific documents like journal articles and conference papers. Personalization in this context meets individual researchers’ needs by leveraging, through user profiles, the user related information (e.g. documents authored by a researcher), to improve search effectiveness and to reduce the information overload. While citation graphs are a valuable means to support the outcome of recommender systems, their use in personalized academic search (with, e.g. nodes as papers and edges as citations) is still under-explored. Existing personalized models for academic search often struggle to fully capture users’ academic interests. To address this, we propose a two-step approach: first, training a neural language model for retrieval, then converting the academic graph into a knowledge graph and embedding it into a shared semantic space with the language model using translational embedding techniques. This allows user models to capture both explicit relationships and hidden structures in citation graphs and paper content. We evaluate our approach in four academic search domains, outperforming traditional graph-based and personalized models in three out of four, with up to a 10% improvement in MAP@100 over the second-best model. This highlights the potential of knowledge graph-based user models to enhance retrieval effectiveness.
Source: INFORMATION SYSTEMS, vol. 134
DOI: 10.1016/j.is.2025.102574
Project(s): EFRA
See at:
CNR IRIS
| www.sciencedirect.com
2025
Conference article
Open Access
Efficient conversational search via topical locality in dense retrieval
Muntean Cristina Ioana, Nardini F. M., Perego R., Rocchietti G., Rulli C.
Pre-trained language models have been widely exploited to learn dense representations of documents and queries for information retrieval. While previous efforts have primarily focused on improving effectiveness and user satisfaction, response time remains a critical bottleneck of conversational search systems. To address this, we exploit the topical locality inherent in conversational queries, i.e., the tendency of queries within a conversation to focus on related topics. By leveraging query embedding similarities, we dynamically restrict the search space to semantically relevant document clusters, reducing computational complexity without compromising retrieval quality. We evaluate our approach on the TREC CAsT 2019 and 2020 datasets using multiple embedding models and vector indexes, achieving improvements in processing speed of up to 10.3X with little loss in performance (4.3X without any loss). Our results show that the proposed system effectively handles complex, multi-turn queries with high precision and efficiency, offering a practical solution for real-time conversational search.
DOI: 10.1145/3726302.3730186
DOI: 10.48550/arxiv.2504.21507
Project(s): EFRA, “Future Artificial Intelligence Research” - Spoke 1 “Human-centered AI”
See at:
arXiv.org e-Print Archive
| dl.acm.org
| doi.org
| Archivio della Ricerca - Università di Pisa
| CNR IRIS
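The abstract of "Efficient conversational search via topical locality in dense retrieval" describes restricting the search space to document clusters selected through query embedding similarities. The sketch below illustrates one way this could work. The class names, the reuse rule (keep the current clusters while the new query stays close to the query that selected them), and the `reuse_threshold` value are assumptions for illustration, not the paper's algorithm. Vectors are assumed unit-normalised, so a dot product acts as cosine similarity.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


class ClusteredIndex:
    """Toy clustered index: one centroid per cluster, documents stored per cluster."""

    def __init__(self, centroids, clusters):
        self.centroids = centroids  # list of centroid vectors
        self.clusters = clusters    # list of {doc_id: vector}, aligned with centroids

    def nearest_clusters(self, query, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda i: dot(query, self.centroids[i]), reverse=True)
        return order[:n]

    def search(self, query, cluster_ids, k):
        scored = [(dot(query, vec), doc_id)
                  for c in cluster_ids
                  for doc_id, vec in self.clusters[c].items()]
        scored.sort(reverse=True)
        return [doc_id for _, doc_id in scored[:k]]


class ConversationSearcher:
    """Reuses the selected clusters while the conversation stays on topic."""

    def __init__(self, index, n_clusters, reuse_threshold):
        self.index = index
        self.n_clusters = n_clusters
        self.reuse_threshold = reuse_threshold
        self.anchor = None       # query that selected the active clusters
        self.active = []         # currently selected cluster ids
        self.reselections = 0    # how often the full centroid scan was needed

    def search(self, query, k):
        off_topic = self.anchor is None or dot(query, self.anchor) < self.reuse_threshold
        if off_topic:
            self.active = self.index.nearest_clusters(query, self.n_clusters)
            self.anchor = query
            self.reselections += 1
        return self.index.search(query, self.active, k)


index = ClusteredIndex(
    centroids=[[1.0, 0.0], [0.0, 1.0]],
    clusters=[{"a1": [1.0, 0.0], "a2": [0.8, 0.6]}, {"b1": [0.0, 1.0]}],
)
searcher = ConversationSearcher(index, n_clusters=1, reuse_threshold=0.7)
r1 = searcher.search([1.0, 0.0], k=2)  # first utterance: selects cluster 0
r2 = searcher.search([0.8, 0.6], k=2)  # related follow-up: reuses cluster 0
r3 = searcher.search([0.0, 1.0], k=2)  # topic shift: reselects clusters
```

In this toy conversation only two of the three utterances trigger a cluster selection; the middle one searches the already-selected cluster directly.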
2025
Conference article
Open Access
Power- and fragmentation-aware online scheduling for GPU datacenters
Lettich F., Carlini E., Nardini F. M., Perego R., Trani S.
The rise of Artificial Intelligence and Large Language Models is driving increased GPU usage in data centers for complex training and inference tasks, impacting operational costs, energy demands, and the environmental footprint of large-scale computing infrastructures. This work addresses the online scheduling problem in GPU datacenters, which involves scheduling tasks without knowledge of their future arrivals. We focus on two objectives: minimizing GPU fragmentation and reducing power consumption. GPU fragmentation occurs when partial GPU allocations hinder the efficient use of remaining resources, especially as the datacenter nears full capacity. A recent scheduling policy, Fragmentation Gradient Descent (FGD), leverages a fragmentation metric to address this issue. Reducing power consumption is also crucial due to the significant power demands of GPUs. To this end, we propose PWR, a novel scheduling policy to minimize power usage by selecting power-efficient GPU and CPU combinations. This involves a simplified model for measuring power consumption integrated into a Kubernetes score plugin. Through an extensive experimental evaluation in a simulated cluster, we show how PWR, when combined with FGD, achieves a balanced trade-off between reducing power consumption and minimizing GPU fragmentation.
DOI: 10.1109/ccgrid64434.2025.00015
Project(s): EFRA, Spoke 1 “Human-centered AI” of the M4C2 - Investimento 1.3, Partenariato Esteso PE00000013 - “FAIR - Future Artificial Intelligence Research”
See at:
CNR IRIS
| ieeexplore.ieee.org
2024
Patent
Restricted
Caching historical embeddings in conversational search
Frieder O., Mele I., Muntean C., Nardini F. M., Perego R., Tonellotto N.
A method and system are described for improving the speed and efficiency of obtaining conversational search results. A user may speak a phrase to perform a conversational search or a series of phrases to perform a series of searches. These spoken phrases may be enriched by context and then converted into a query embedding. A similarity between the query embedding and document embeddings is used to determine the search results including a query cutoff number of documents and a cache cutoff number of documents. A second search phrase may use the cache of documents along with comparisons of the returned documents and the first query embedding to determine the quality of the cache for responding to the second search query. If the results are high-quality then the search may proceed much more rapidly by applying the second query only to the cached documents rather than to the server.
See at:
CNR IRIS
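The patent abstract "Caching historical embeddings in conversational search" describes a first query that fetches a cache-cutoff number of documents, and later queries that are answered from the cache when it is judged good enough. A minimal sketch of one such scheme follows. The class name, the use of Euclidean distance, and the specific quality test (a triangle-inequality check against the cache radius) are assumptions for illustration; the abstract does not state the exact criterion.

```python
import math


class ConversationalCache:
    """Answers follow-up queries from cached documents when that is provably safe."""

    def __init__(self, corpus, query_cutoff, cache_cutoff):
        self.corpus = corpus              # {doc_id: embedding}; stands in for the server
        self.query_cutoff = query_cutoff  # documents returned per query
        self.cache_cutoff = cache_cutoff  # documents fetched into the cache on a miss
        self.cached = []                  # list of (doc_id, embedding)
        self.anchor = None                # query embedding that filled the cache
        self.radius = 0.0                 # distance from anchor to furthest cached doc
        self.hits = 0
        self.misses = 0

    def _server_search(self, query, k):
        ranked = sorted(self.corpus.items(), key=lambda item: math.dist(query, item[1]))
        return ranked[:k]

    def _cache_is_sufficient(self, query):
        if len(self.cached) < self.query_cutoff:
            return False
        distances = sorted(math.dist(query, vec) for _, vec in self.cached)
        kth = distances[self.query_cutoff - 1]
        # Every uncached document lies at least `radius` from the anchor, hence at
        # least radius - dist(query, anchor) from the query (triangle inequality).
        return math.dist(query, self.anchor) + kth <= self.radius

    def search(self, query):
        if self._cache_is_sufficient(query):
            self.hits += 1
        else:
            self.misses += 1
            self.cached = self._server_search(query, self.cache_cutoff)
            self.anchor = query
            self.radius = max(math.dist(query, vec) for _, vec in self.cached)
        ranked = sorted(self.cached, key=lambda item: math.dist(query, item[1]))
        return [doc_id for doc_id, _ in ranked[: self.query_cutoff]]


corpus = {"d0": (0.0, 0.0), "d1": (1.0, 0.0), "d2": (2.0, 0.0),
          "d3": (3.0, 0.0), "d9": (9.0, 0.0)}
cache = ConversationalCache(corpus, query_cutoff=2, cache_cutoff=4)
first = cache.search((1.2, 0.0))   # miss: fills the cache from the server
second = cache.search((1.6, 0.0))  # hit: nearby query, answered from the cache
third = cache.search((8.0, 0.0))   # miss: topic moved outside the cached region
```

Under this test a cache hit returns exactly the documents a full search would, because no uncached document can be closer to the new query than the k-th cached one.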
2024
Conference article
Open Access
MQTT-Chain: an MQTT approach using blockchain and smart contracts to achieve a new Quality of Service level
Agostinho B. M., Chessa S., Perego R., Ribeiro Dantas M. A., Sandro Roschildt Pinto A.
The proliferation of Internet of Things (IoT) applications has surged in recent years, necessitating efficient communication protocols. The Message Queuing Telemetry Transport (MQTT) protocol, designed specifically for IoT devices, has gained prominence due to its lightweight publisher/subscriber model. However, inherent concerns arise within the conventional MQTT architecture, particularly pertaining to broker-side vulnerabilities. In addressing these concerns and enhancing data security, we advocate the utilization of blockchains and smart contracts for storing and transmitting broker messages. We designed and compared two different approaches, bringing a detailed latency analysis for each step and validating its functional viability, establishing a robust environment for IoT applications.
DOI: 10.1109/ficloud62933.2024.00056
See at:
CNR IRIS
| doi.org
| Archivio della Ricerca - Università di Pisa
2024
Conference article
Open Access
CAMEO: Fostering Joint Conversational Search and Recommendation
Di Noia T., Faggioli G., Ferrante M., Ferro N., Narducci F., Perego R., Santucci G.
The rising popularity of conversational agents for accessing information stems from their natural language dialogue-based interaction, facilitating usability for a broad spectrum of users, including the elderly, children, and visually impaired individuals. Among others, two tasks that benefit the most conversational agents are search and recommendation: in the former, the user receives factual information by asking the agent; in the latter, the system refines its knowledge of the user's needs by posing them a sequence of questions. This work discusses the observations and findings of the first CAMEO (Conversational Agents: Mastering, Evaluating, Optimizing) project retreat. The retreat focused on similarities and differences of conversational search and recommendation to identify the path to construct a joint conversational search and recommendation system. Our observations highlight how all the conversational search/recommendation systems can be categorized using two axes: “exploration-disambiguation” and “search-recommendation”. The first axis describes whether the question aims to gain knowledge over something unknown or allows to refine already available knowledge. The second axis describes if the user's interest is in gaining knowledge or obtaining a recommendation. Additionally, we provide insights on obtaining a dataset that can be used to train/test such a joint system. Finally, we describe how the CAMEO project will address the product search task, which we believe is the scenario where the joint conversational search and recommendation system would be the most effective.
Source: CEUR WORKSHOP PROCEEDINGS, vol. 3741, pp. 290-301. Villasimius, Italy, 23-26/06/2024
See at:
ceur-ws.org
| CNR IRIS
2024
Conference article
Open Access
Dimension importance estimation for dense information retrieval
Faggioli G., Ferro N., Perego R., Tonellotto N.
Recent advances in Information Retrieval have shown the effectiveness of embedding queries and documents in a latent high-dimensional space to compute their similarity. While operating on such high-dimensional spaces is effective, in this paper, we hypothesize that we can improve the retrieval performance by adequately moving to a query-dependent subspace. More in detail, we formulate the Manifold Clustering (MC) Hypothesis: projecting queries and documents onto a subspace of the original representation space can improve retrieval effectiveness. To empirically validate our hypothesis, we define a novel class of Dimension IMportance Estimators (DIME). Such models aim to determine how much each dimension of a high-dimensional representation contributes to the quality of the final ranking and provide an empirical method to select a subset of dimensions where to project the query and the documents. To support our hypothesis, we propose an oracle DIME, capable of effectively selecting dimensions and almost doubling the retrieval performance. To show the practical applicability of our approach, we then propose a set of DIMEs that do not require any oracular piece of information to estimate the importance of dimensions. These estimators allow us to carry out a dimensionality selection that enables performance improvements of up to +11.5% (moving from 0.675 to 0.752 nDCG@10) compared to the baseline methods using all dimensions. Finally, we show that, with simple and realistic active feedback, such as the user's interaction with a single relevant document, we can design a highly effective DIME, allowing us to outperform the baseline by up to +0.224 nDCG@10 points (+58.6%, moving from 0.384 to 0.608).
DOI: 10.1145/3626772.3657691
Project(s): EFRA
See at:
IRIS Cnr
| Archivio istituzionale della ricerca - Università di Padova
| Archivio della Ricerca - Università di Pisa
| CNR IRIS
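The abstract of "Dimension importance estimation for dense information retrieval" describes estimating how much each embedding dimension contributes to the ranking and then retrieving within the selected subset of dimensions. The sketch below illustrates the general idea with feedback documents. The importance rule (element-wise product of the query with the centroid of the feedback documents), the function names, and the toy vectors are assumptions for illustration and are not necessarily the estimators defined in the paper.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def dime_importance(query, feedback_docs):
    """Hypothetical estimator: query value times feedback-centroid value, per dimension."""
    n = len(feedback_docs)
    centroid = [sum(doc[i] for doc in feedback_docs) / n for i in range(len(query))]
    return [q * c for q, c in zip(query, centroid)]


def prune_query(query, importance, keep):
    """Zero out every query dimension except the `keep` most important ones."""
    order = sorted(range(len(query)), key=lambda i: importance[i], reverse=True)
    top = set(order[:keep])
    return [q if i in top else 0.0 for i, q in enumerate(query)]


def rank(query, docs):
    """Rank document ids by inner product with the (possibly pruned) query."""
    return sorted(docs, key=lambda doc_id: dot(query, docs[doc_id]), reverse=True)


# Toy 3-dimensional embeddings; the third dimension carries off-topic signal.
docs = {"a": [0.9, 0.8, 0.1], "b": [0.7, 0.9, 0.0], "c": [0.1, 0.0, 1.5]}
query = [0.6, 0.6, 0.9]

baseline = rank(query, docs)

# Feedback here is a single document the user interacted with (document "a").
importance = dime_importance(query, [docs["a"]])
pruned_query = prune_query(query, importance, keep=2)
pruned = rank(pruned_query, docs)
```

Zeroing a query dimension removes its contribution to every inner product, which is equivalent to scoring query and documents in the subspace spanned by the retained dimensions. In this toy case the off-topic document "c" drops from first to last place.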
2024
Conference article
Open Access
Improving RAG systems via sentence clustering and reordering
Alessio M., Faggioli G., Ferro N., Nardini F. M., Perego R.
Large Language Models (LLMs) have gained noteworthy importance and attention across different domains and fields in recent years. Information Retrieval (IR) is one of the domains they impacted the most, as witnessed by the recent increase in the number of IR systems incorporating generative models. Specifically, Retrieval Augmented Generation (RAG) is the emerging paradigm that integrates existing knowledge from large-scale document corpora into the generation process, enabling the model to generate more coherent, contextually relevant, and accurate text across various tasks. Such tasks include summarization, question answering, and dialogue systems. Recent studies have highlighted the significant positional dependence exhibited by RAG systems. Such studies observed how the placement of information within the LLM input prompt drastically affects the generated output. We ground our study on this property by investigating alternative strategies for ordering sentences within the LLM prompt to improve the average quality of the generated responses in the user and conversational system dialogues. We propose the architecture of an end-to-end RAG-based conversational assistant and empirically evaluate our strategies using the TREC CAsT 2022 collection. Our experiments highlight significant differences between distinct arrangement strategies. By employing an evaluation methodology based on RankVicuna, we show that our best approach achieves improvements up to 54% in terms of overall response quality over baseline methods.
Source: CEUR WORKSHOP PROCEEDINGS, vol. 3784, pp. 34-43. Washington DC, USA, 07/07/2024
Project(s): EFRA 
See at:
ceur-ws.org
| CNR IRIS
2024
Conference article
Open Access
LongDoc summarization using instruction-tuned large language models for food safety regulations
Rocchietti G., Rulli C., Randl K., Muntean C., Nardini F. M., Perego R., Trani S., Karvounis M., Janostik J.
We design and implement a summarization pipeline for regulatory documents, focusing on two main objectives: creating two silver standard datasets using instruction-tuned large language models (LLMs) and finetuning smaller LLMs to perform summarization of regulatory text. In the first task, we employ state-of-the-art models, Cohere C4AI Command-R-4bit and Llama-3-8B, to generate summaries of regulatory documents. These generated summaries serve as ground-truth data for the second task, where we finetune three general-purpose LLMs to specialize in high-quality summary generation for specific documents while reducing the computational requirements. Specifically, we finetune two Google Flan-T5 models using datasets generated by Llama-3-8B and Cohere C4AI, and we create a quantized (4-bit) version of Google Gemma 2-B based on summaries from Cohere C4AI. Additionally, we initiated a pilot activity involving legal experts from SGS-Digicomply to validate the effectiveness of our summarization pipeline.
Source: CEUR WORKSHOP PROCEEDINGS, vol. 3802, pp. 33-42. Udine, Italy, 5-6/09/2024
Project(s): EFRA 
See at:
ceur-ws.org
| CNR IRIS