2026
Journal article
Open Access
Projection-displacement-based query performance prediction for embedded space of dense retrievers
Datta Suchana, Faggioli Guglielmo, Ferro Nicola, Ganguly Debasis, Muntean Cristina Ioana, Perego Raffaele, Tonellotto Nicola
Recent advances in representation learning have enabled neural Information Retrieval (IR) systems to use learned dense representations for queries and documents to effectively handle semantics, language nuances, and vocabulary mismatch problems. In contrast to traditional IR systems that rely on word matching, dense IR models exploit query/document similarity in dense latent spaces to account for semantics. This requires substantial training data and comes with increased computational demands. Thus, it would be beneficial to predict how a system will perform for a given query to decide whether a dense IR model is the best option or alternatives should be used. Traditional Query Performance Prediction (QPP) models are designed for lexical IR approaches and perform sub-optimally when applied to dense neural IR systems. Therefore, there has been a renewed interest in QPP methods to improve their effectiveness for dense neural IR models. While the results of the new QPP methods are generally encouraging, there is ample room for improvement in absolute performance and stability. We argue that by using features more aligned with the underlying rationale of dense IR models, we can enhance the performance of QPP. In this respect, we propose the Projection-Displacement-Based QPP (PDQPP), which exploits the geometric properties of dense IR models, projects queries and retrieved documents onto subspaces defined by pseudo-relevant documents, and considers changes in retrieval scores within them as a proxy for retrieval coherence. Minor score changes suggest robust and coherent retrieval, while significant alterations indicate semantic divergence and potentially poor performance. Results over a wide range of experimental settings on both traditional (TREC Robust) and neural-oriented (TREC Deep Learning) test collections show that PDQPP mostly outperforms the state-of-the-art QPP baselines.
Source: ACM TRANSACTIONS ON INFORMATION SYSTEMS, vol. 44 (issue 1), pp. 1-30
DOI: 10.1145/3765617
See at:
dl.acm.org
| ACM Transactions on Information Systems
| CNR IRIS
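As a rough illustration of the projection-and-displacement idea described in the abstract above, the Python sketch below builds an orthonormal basis from the top-k retrieved document embeddings, projects the query and every retrieved document onto that subspace, and averages the absolute change in retrieval score. It is not the authors' PDQPP formulation: the use of Gram-Schmidt, dot-product scoring, the cutoff k, and every function name are assumptions made for illustration only.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt orthonormalisation; near-dependent vectors are skipped."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > eps:
            basis.append([wi / norm for wi in w])
    return basis


def project(v, basis):
    """Orthogonal projection of v onto the span of an orthonormal basis."""
    out = [0.0] * len(v)
    for b in basis:
        c = dot(v, b)
        out = [o + c * bi for o, bi in zip(out, b)]
    return out


def displacement_predictor(query, docs, k):
    """Hypothetical predictor: negated mean score change after projection.

    The first k documents play the role of pseudo-relevant documents
    (an assumption). Values closer to zero mean smaller score changes.
    """
    basis = orthonormal_basis(docs[:k])
    q_proj = project(query, basis)
    changes = [
        abs(dot(query, d) - dot(q_proj, project(d, basis))) for d in docs
    ]
    return -sum(changes) / len(changes)


# Toy 3-dimensional embeddings (invented values).
query = [1.0, 1.0, 0.0]
coherent = displacement_predictor(query, [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0]], k=1)
divergent = displacement_predictor(query, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], k=1)
```

In this toy setting the second ranked list contains a document orthogonal to the pseudo-relevant subspace, so its score changes more under projection and the predictor value is lower, in line with the reading that large score changes hint at poor retrieval.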
2025
Conference article
Open Access
Maybe you are looking for CroQS Cross-Modal Query Suggestion for text-to-image retrieval
Pacini G., Carrara F., Messina N., Tonellotto N., Amato G., Falchi F.
Query suggestion, a technique widely adopted in information retrieval, enhances system interactivity and the browsing experience of document collections. In cross-modal retrieval, many works have focused on retrieving relevant items from natural language queries, while few have explored query suggestion solutions. In this work, we address query suggestion in cross-modal retrieval, introducing a novel task that focuses on suggesting minimal textual modifications needed to explore visually consistent subsets of the collection, following the premise of “Maybe you are looking for”. To facilitate the evaluation and development of methods, we present a tailored benchmark named CroQS. This dataset comprises initial queries, grouped result sets, and human-defined suggested queries for each group. We establish dedicated metrics to rigorously evaluate the performance of various methods on this task, measuring representativeness, cluster specificity, and similarity of the suggested queries to the original ones. Baseline methods from related fields, such as image captioning and content summarization, are adapted for this task to provide reference performance scores. Although relatively far from human performance, our experiments reveal that both LLM-based and captioning-based methods achieve competitive results on CroQS, improving the recall on cluster specificity by more than 115% and representativeness mAP by more than 52% with respect to the initial query. The dataset, the implementation of the baseline methods and the notebooks containing our experiments are available here: paciosoft.com/CroQS-benchmark/.
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 15573, pp. 138-152. Lucca, Italy, April 6–10, 2025
DOI: 10.1007/978-3-031-88711-6_9
Project(s): Future Artificial Intelligence Research, a MUltimedia platform for Content Enrichment and Search in audiovisual archives
See at:
CNR IRIS
| link.springer.com
2025
Book
Open Access
Early-exit graph neural networks
Di Francesco A. G., Bucarelli M. S., Nardini F. M., Perego R., Tonellotto N., Silvestri F.
Early-exit mechanisms allow deep neural networks to stop inference once prediction confidence is high, reducing latency and energy on easy inputs while retaining full-depth accuracy on harder ones. Similarly, adding early-exit mechanisms to Graph Neural Networks (GNNs), the go-to models for graph-structured data, allows for dynamically trading depth for confidence on simple graphs while maintaining full-depth accuracy on harder ones to capture intricate relationships. Yet, their potential in deep GNNs, where over-smoothing, over-squashing or, more generally, vanishing gradients prevent these models from learning properly, remains largely unexplored. To address this, we introduce Symmetric-Anti-Symmetric GNNs (SAS-GNN), whose symmetry-based inductive biases yield stable intermediate representations that support safe early exits. Building on this backbone, we propose Early-Exit GNNs (EEGNNs), which attach confidence-aware neural exit heads that are trainable end-to-end based on the task objective, enabling on-the-fly termination at node or graph level. Experiments show that EEGNNs learn task-driven exit strategies, while achieving competitive results on heterophilic graphs and long-range tasks. Even when not outperforming the strongest baselines, EEGNNs consistently deliver favorable accuracy-efficiency trade-offs thanks to their adaptive and parameter-efficient design. We plan to release the code to reproduce our experiments.
DOI: 10.48550/arxiv.2505.18088
See at:
arXiv.org e-Print Archive
| CNR IRIS
| doi.org
2025
Conference article
Open Access
Breaking the 2D dependency: what limits 3D-only open-vocabulary scene understanding
D’orsi D., Carrara F., Falchi F., Tonellotto N.
Open-vocabulary 3D scene understanding, i.e., recognizing and classifying objects in 3D scenes without being limited to a predefined set of classes, is a foundational task for robotics and extended reality applications. Current leading methods often rely on 2D foundation models to extract semantics, which are then projected into 3D. This paper investigates the viability of a purely 3D-native pipeline, thereby eliminating dependencies on 2D models and reprojections. We systematically explored various architectural combinations using established 3D components. However, our extensive experiments on benchmark datasets reveal significant performance limitations with this direct 3D-native approach, with performance metrics falling short of expectations. Rather than a simple failure, these outcomes provide critical insights into the current deficiencies of existing 3D models when cascaded for complex open-vocabulary tasks. We highlight the lessons learned, identify the pipeline's limitations (e.g., segmenter-encoder domain gap, robustness to imperfect segmentations), and posit future research directions. We argue that a fundamental rethinking of model design and interplay is necessary to realize the potential of truly 3D-native open-vocabulary understanding.
Source: PROCEEDINGS INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA. Dublin, Ireland, 22-24 October 2025
DOI: 10.1109/cbmi66578.2025.11339286
Project(s): Social and Human Centered XR
See at:
CNR IRIS
| ieeexplore.ieee.org
2024
Patent
Restricted
Caching historical embeddings in conversational search
Frieder O., Mele I., Muntean C., Nardini F. M., Perego R., Tonellotto N.
A method and system are described for improving the speed and efficiency of obtaining conversational search results. A user may speak a phrase to perform a conversational search or a series of phrases to perform a series of searches. These spoken phrases may be enriched by context and then converted into a query embedding. A similarity between the query embedding and document embeddings is used to determine the search results including a query cutoff number of documents and a cache cutoff number of documents. A second search phrase may use the cache of documents along with comparisons of the returned documents and the first query embedding to determine the quality of the cache for responding to the second search query. If the results are high-quality then the search may proceed much more rapidly by applying the second query only to the cached documents rather than to the server.
See at:
CNR IRIS
2024
Conference article
Open Access
Learning to rank for non independent and identically distributed datasets
Cecchetti J., Tonellotto N., Perego R.
With the growing data privacy concerns, federated machine learning algorithms capable of preserving the confidentiality of sensitive information while enabling collaborative model training across decentralized data sources are attracting increasing interest. In this paper, we address the problem of collaboratively learning effective ranking models from non-independently and identically distributed (non-IID) training data owned by distinct search clients. We assume that the learning agents cannot access each other's data, and that the models learned from local datasets might be biased or underperforming due to a skewed distribution of certain document features or query topics in the learning-to-rank training data. Thus, we aim to instill in the local ranking model learned from local data the knowledge from other models to obtain a more robust ranker capable of effectively handling documents and queries underrepresented in the local collection. To achieve this, we explore different methods for merging the ranking models, thus obtaining in each client a model that excels in ranking documents from the local data distribution but also performs well on queries retrieving documents having distributions typical of a partner's node. In particular, our findings suggest that by relying on a linear combination of the local models, we can improve the effectiveness of IR models by up to +17.92% in NDCG@10 (moving from 0.619 to 0.730), and by up to +19.64% in MAP (moving from 0.713 to 0.853).
DOI: 10.1145/3664190.3672513
Project(s): EFRA, Future Artificial Intelligence Research
See at:
IRIS Cnr
| Archivio della Ricerca - Università di Pisa
| CNR IRIS
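The abstract above reports that a linear combination of local ranking models works well. A minimal Python sketch of what such a merge could look like follows; the interpolation weight alpha, the toy scorers, and all names are hypothetical, and the paper's actual merging procedure may differ.

```python
def combine_linear(local, partner, alpha):
    """Return a scorer computing alpha * local + (1 - alpha) * partner."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    def combined(features):
        return alpha * local(features) + (1.0 - alpha) * partner(features)

    return combined


def rank(scorer, docs):
    """Sort (doc_id, features) pairs by descending score; return the ids."""
    ordered = sorted(docs, key=lambda d: scorer(d[1]), reverse=True)
    return [doc_id for doc_id, _ in ordered]


# Toy models: each client only values the feature it saw during training.
def local_model(f):
    return 2.0 * f[0]


def partner_model(f):
    return 3.0 * f[1]


docs = [("d1", (1.0, 0.0)), ("d2", (0.0, 1.0)), ("d3", (0.6, 0.7))]
merged = combine_linear(local_model, partner_model, alpha=0.5)
local_ranking = rank(local_model, docs)
merged_ranking = rank(merged, docs)
```

With these invented numbers the local model alone ranks d2 last, whereas the merged scorer promotes documents that the partner's model considers relevant.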
2024
Conference article
Open Access
DESIRE-ME: Domain-Enhanced Supervised Information Retrieval Using Mixture-of-Experts
Kasela P., Pasi G., Perego R., Tonellotto N.
Open-domain question answering requires retrieval systems able to cope with the diverse and varied nature of questions, providing accurate answers across a broad spectrum of query types and topics. To deal with such topic heterogeneity through a unique model, we propose DESIRE-ME, a neural information retrieval model that leverages the Mixture-of-Experts framework to combine multiple specialized neural models. We rely on Wikipedia data to train an effective neural gating mechanism that classifies the incoming query and that weighs the predictions of the different domain-specific experts correspondingly. This allows DESIRE-ME to specialize adaptively in multiple domains. Through extensive experiments on publicly available datasets, we show that our proposal can effectively generalize domain-enhanced neural models. DESIRE-ME excels in handling open-domain questions adaptively, boosting the underlying state-of-the-art dense retrieval model by up to 12% in NDCG@10 and 22% in P@1.
Source: LECTURE NOTES IN COMPUTER SCIENCE, vol. 14609, pp. 111-125. Glasgow, UK, March 24–28, 2024
DOI: 10.1007/978-3-031-56060-6_8
DOI: 10.48550/arxiv.2403.13468
Project(s): EFRA
See at:
arXiv.org e-Print Archive
| IRIS Cnr
| doi.org
| BOA - Bicocca Open Archive
| Archivio della Ricerca - Università di Pisa
| CNR IRIS
2023
Conference article
Restricted
A geometric framework for query performance prediction in conversational search
Faggioli G., Ferro N., Muntean C., Perego R., Tonellotto N.
Thanks to recent advances in IR and NLP, the way users interact with search engines is evolving rapidly, with multi-turn conversations replacing traditional one-shot textual queries. Given its interactive nature, Conversational Search (CS) is one of the scenarios that can benefit the most from Query Performance Prediction (QPP) techniques. QPP for the CS domain is a relatively new field and lacks proper framing. In this study, we address this gap by proposing a framework for the application of QPP in the CS domain and use it to evaluate the performance of predictors. We characterize what it means to predict the performance in the CS scenario, where information needs are not independent queries but a series of closely related utterances. We identify three main ways to use QPP models in the CS domain: as a diagnostic tool, as a way to adjust the system's behaviour during a conversation, or as a way to predict the system's performance on the next utterance. Due to the lack of established evaluation procedures for QPP in the CS domain, we propose a protocol to evaluate QPPs for each of the use cases. Additionally, we introduce a set of spatial-based QPP models designed to work best in the conversational search domain, where dense neural retrieval models are the most common approaches and query cutoffs are typically small. We show how the proposed QPP approaches significantly improve predictive performance over the state-of-the-art in different scenarios and collections.
DOI: 10.1145/3539618.3591625
Project(s): SoBigData-PlusPlus
See at:
dl.acm.org
| CNR IRIS
2023
Journal article
Open Access
Artificial intelligence of things at the edge: scalable and efficient distributed learning for massive scenarios
Bano S., Tonellotto N., Cassarà P., Gotta A.
Federated Learning (FL) is a distributed optimization method in which multiple client nodes collaborate to train a machine learning model without sharing data with a central server. However, communication between numerous clients and the central aggregation server to share model parameters can cause several problems, including latency and network congestion. To address these issues, we propose a scalable communication infrastructure based on Information-Centric Networking built and tested on Apache Kafka®. The proposed architecture consists of a two-tier communication model. In the first layer, client updates are cached at the edge between clients and the server, while in the second layer, the server computes global model updates by aggregating the cached models. The data stored in the intermediate nodes at the edge enables reliable and effective data transmission and solves the problem of intermittent connectivity of mobile nodes. While many local model updates provided by clients can result in a more accurate global model in FL, they can also result in massive data traffic that negatively impacts congestion at the edge. For this reason, we couple a client selection procedure based on a congestion control mechanism at the edge for the given architecture of FL. The proposed algorithm selects a subset of clients based on their resources through a time-based backoff system to account for the time-averaged accuracy of FL while limiting the traffic load. Experiments show that our proposed architecture achieves an improvement of over 40% over the network-centric FL architecture, i.e., Flower. The architecture also provides scalability and reliability in the case of mobile nodes. It also improves client resource utilization, avoids overflow, and ensures fairness in client selection. The experiments show that the proposed algorithm leads to the desired client selection patterns and is adaptable to changing network environments.
Source: COMPUTER COMMUNICATIONS, vol. 205, pp. 45-57
DOI: 10.1016/j.comcom.2023.04.010
Project(s): TEACHING
See at:
CNR IRIS
| ISTI Repository
| Computer Communications
2020
Journal article
Open Access
Topical result caching in web search engines
Mele I, Tonellotto N, Frieder O, Perego R
Caching search results is employed in information retrieval systems to expedite query processing and reduce back-end server workload. Motivated by the observation that queries belonging to different topics have different temporal-locality patterns, we investigate a novel caching model called STD (Static-Topic-Dynamic cache), a refinement of the traditional SDC (Static-Dynamic Cache) that stores in a static cache the results of popular queries and manages the dynamic cache with a replacement policy for intercepting the temporal variations in the query stream. Our proposed caching scheme includes another layer for topic-based caching, where the entries are allocated to different topics (e.g., weather, education). The results of queries characterized by a topic are kept in the fraction of the cache dedicated to it. This makes it possible to adapt the cache-space utilization to the temporal locality of the various topics and reduces cache misses due to those queries that are neither sufficiently popular to be in the static portion nor requested within short-time intervals to be in the dynamic portion. We simulate different configurations for STD using two real-world query streams. Experiments demonstrate that our approach outperforms SDC with an increase of up to 3% in terms of hit rates, and up to 36% of gap reduction w.r.t. SDC from the theoretical optimal caching algorithm.
Source: INFORMATION PROCESSING & MANAGEMENT, vol. 57 (issue 3), pp. 1-21
DOI: 10.1016/j.ipm.2019.102193
DOI: 10.48550/arxiv.2001.03010
Project(s): BigDataGrapes
See at:
arXiv.org e-Print Archive
| Information Processing & Management
| ISTI Repository
| www.sciencedirect.com
| doi.org
| CNR IRIS
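The three-layer organisation described in the STD abstract above can be sketched in Python as follows. This is only an illustration under stated assumptions: LRU is assumed as the replacement policy for both the topic and dynamic layers, topics are assumed to be supplied by the caller, and the class names, capacities, and example queries are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache evicting the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry


class STDCache:
    """Static layer, per-topic LRU layers, and a dynamic LRU layer."""

    def __init__(self, static_results, topic_capacities, dynamic_capacity):
        self.static = dict(static_results)  # read-only popular queries
        self.topics = {t: LRUCache(c) for t, c in topic_capacities.items()}
        self.dynamic = LRUCache(dynamic_capacity)

    def _layer(self, topic):
        # Queries without a known topic fall back to the dynamic layer.
        return self.topics.get(topic, self.dynamic)

    def lookup(self, query, topic=None):
        if query in self.static:
            return self.static[query]
        return self._layer(topic).get(query)

    def store(self, query, results, topic=None):
        if query not in self.static:
            self._layer(topic).put(query, results)


cache = STDCache({"facebook": ["fb.com"]}, {"weather": 1}, dynamic_capacity=1)
cache.store("rain pisa", ["r1"], topic="weather")
cache.store("jaguar speed", ["r2"])
cache.store("tonellotto", ["r3"])  # evicts "jaguar speed" from the dynamic layer
```

Because the weather entry lives in its own partition, traffic on unrelated queries in the dynamic layer does not evict it, which is the behaviour the abstract attributes to topic-based caching.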
2019
Journal article
Open Access
Parallel Traversal of Large Ensembles of Decision Trees
Lettich F, Lucchese C, Nardini Fm, Orlando S, Perego R, Tonellotto N, Venturini R
Machine-learnt models based on additive ensembles of regression trees are currently deemed the best solution to address complex classification, regression, and ranking tasks. The deployment of such models is computationally demanding: to compute the final prediction, the whole ensemble must be traversed by accumulating the contributions of all its trees. In particular, traversal cost impacts applications where the number of candidate items is large, the time budget available to apply the learnt model to them is limited, and the users' expectations in terms of quality-of-service are high. Document ranking in web search, where sub-optimal ranking models are deployed to find a proper trade-off between efficiency and effectiveness of query answering, is probably the most typical example of this challenging issue. This paper investigates multi/many-core parallelization strategies for speeding up the traversal of large ensembles of regression trees, thus obtaining machine-learnt models that are, at the same time, effective, fast, and scalable. Our best results are obtained by the GPU-based parallelization of the state-of-the-art algorithm, with speedups of up to 102.6x.
Source: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (PRINT), vol. 30 (issue 9), pp. 2075-2089
DOI: 10.1109/tpds.2018.2860982
DOI: 10.5281/zenodo.2668378
DOI: 10.5281/zenodo.2668379
Project(s): BigDataGrapes
See at:
IEEE Transactions on Parallel and Distributed Systems
| ZENODO
| Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari
| ieeexplore.ieee.org
| ISTI Repository
| CNR IRIS
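To make concrete what "traversing the whole ensemble by accumulating the contributions of all its trees" means, here is a naive sequential Python baseline. It is not the parallel algorithm studied in the paper; the tuple-based tree encoding and the toy ensemble are assumptions chosen for brevity.

```python
def traverse(node, features):
    """Walk one regression tree down to a leaf and return the leaf value.

    Internal nodes are tuples (feature_index, threshold, left, right);
    leaves are plain floats. This encoding is an illustrative assumption.
    """
    while isinstance(node, tuple):
        feature_index, threshold, left, right = node
        node = left if features[feature_index] <= threshold else right
    return node


def predict(ensemble, features):
    """Additive ensemble: sum the leaf values reached in every tree."""
    return sum(traverse(tree, features) for tree in ensemble)


ensemble = [
    (0, 0.5, 1.0, 2.0),                      # a single split on feature 0
    (1, 3.0, (0, 0.2, -0.5, 0.5), 0.25),     # a depth-2 tree
]
score = predict(ensemble, [0.7, 1.0])
```

The cost of `predict` grows with the number of trees and candidate items, which is the bottleneck that parallel traversal strategies aim to reduce.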
2019
Conference article
Open Access
Enhanced news retrieval: passages lead the way!
Catena M., Nardini F. M., Frieder O., Perego R., Muntean Cristina Ioana, Tonellotto N.
We observe that most relevant terms in unstructured news articles are primarily concentrated towards the beginning and the end of the document. Exploiting this observation, we propose a novel version of the classical BM25 weighting model, called BM25 Passage (BM25P), which scores query results by computing a linear combination of term statistics in the different portions of news articles. Our experimentation, conducted using three publicly available news datasets, demonstrates that BM25P markedly outperforms BM25 in terms of effectiveness by up to 17.44% in NDCG@5 and 85% in NDCG@1.
DOI: 10.1145/3331184.3331373
Project(s): BIGDATAGRAPES
See at:
dl.acm.org
| doi.org
| CNR IRIS
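One way to read "a linear combination of term statistics in the different portions" is sketched below in Python. The sketch assumes equal-length passages, a weighted sum of per-passage term frequencies substituted for the term frequency in a standard BM25 formula, and invented passage weights; the exact BM25P definition is given in the paper and may differ.

```python
import math


def passage_tfs(tokens, term, num_passages):
    """Count occurrences of term in each of num_passages near-equal slices."""
    counts = [0] * num_passages
    n = len(tokens)
    for pos, tok in enumerate(tokens):
        if tok == term:
            counts[pos * num_passages // n] += 1
    return counts


def bm25p_score(query, tokens, weights, doc_freq, num_docs, avg_len,
                k1=1.2, b=0.75):
    """BM25 with a passage-weighted term frequency (illustrative variant)."""
    norm = k1 * (1.0 - b + b * len(tokens) / avg_len)
    score = 0.0
    for term in query:
        tfs = passage_tfs(tokens, term, len(weights))
        tf = sum(w * c for w, c in zip(weights, tfs))  # linear combination
        if tf == 0:
            continue
        df = doc_freq.get(term, 0)
        idf = math.log(1.0 + (num_docs - df + 0.5) / (df + 0.5))
        score += idf * tf * (k1 + 1.0) / (tf + norm)
    return score


# Two toy 10-token documents: the query term appears first or mid-document.
lead = ["election"] + ["filler"] * 9
middle = ["filler"] * 4 + ["election"] + ["filler"] * 5
stats = dict(doc_freq={"election": 10}, num_docs=1000, avg_len=10.0)
edge_weights = [2.0, 1.0, 1.0, 1.0, 2.0]  # hypothetical: favour start and end
lead_score = bm25p_score(["election"], lead, edge_weights, **stats)
middle_score = bm25p_score(["election"], middle, edge_weights, **stats)
```

With uniform weights the two documents score identically, as in plain BM25; weighting the first and last passages more heavily rewards the document that mentions the term in its lead.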
2018
Journal article
Open Access
Dataset popularity prediction for caching of CMS big data
Meoni M, Perego R, Tonellotto N
The Compact Muon Solenoid (CMS) experiment at the European Organization for Nuclear Research (CERN) deploys its data collections, simulation and analysis activities on a distributed computing infrastructure involving more than 70 sites worldwide. The historical usage data recorded by this large infrastructure is a rich source of information for system tuning and capacity planning. In this paper we investigate how to leverage machine learning on this huge amount of data in order to discover patterns and correlations useful to enhance the overall efficiency of the distributed infrastructure in terms of CPU utilization and task completion time. In particular we propose a scalable pipeline of components built on top of the Spark engine for large-scale data processing, whose goal is collecting from different sites the dataset access logs, organizing them into weekly snapshots, and training, on these snapshots, predictive models able to forecast which datasets will become popular over time. The high accuracy achieved indicates the ability of the learned model to correctly separate popular datasets from unpopular ones. Dataset popularity predictions are then exploited within a novel data caching policy, called PPC (Popularity Prediction Caching). We evaluate the performance of PPC against popular caching policy baselines like LRU (Least Recently Used). The experiments conducted on large traces of real dataset accesses show that PPC outperforms LRU, reducing the number of cache misses by up to 20% in some sites.
Source: JOURNAL OF GRID COMPUTING, vol. 16 (issue 2), pp. 211-228
DOI: 10.1007/s10723-018-9436-4
See at:
CNR IRIS
| link.springer.com
| ISTI Repository
| Journal of Grid Computing
2018
Conference article
Open Access
Efficient query processing infrastructures: A half-day tutorial at SIGIR 2018
Tonellotto N, Macdonald C
Typically, techniques that benefit effectiveness of information retrieval (IR) systems have a negative impact on efficiency. Yet, with the large scale of Web search engines, there is a need to deploy efficient query processing techniques to reduce the cost of the infrastructure required. This tutorial aims to provide a detailed overview of the infrastructure of an IR system devoted to the efficient yet effective processing of user queries. This tutorial guides the attendees through the main ideas, approaches and algorithms developed in the last 30 years in query processing. In particular, we illustrate, with detailed examples and simplified pseudo-code, the most important query processing strategies adopted in major search engines, with a particular focus on dynamic pruning techniques. Moreover, we present and discuss the state-of-the-art innovations in query processing, such as impact-sorted and blockmax indexes. We also describe how modern search engines exploit such algorithms with learning-to-rank (LtR) models to produce effective results, exploiting new approaches in LtR query processing. Finally, this tutorial introduces query efficiency predictors for dynamic pruning, and discusses their main applications to scheduling, routing, selective processing and parallelisation of query processing, as deployed by a major search engine.
DOI: 10.1145/3209978.3210191
See at:
dl.acm.org
| CNR IRIS
| ISTI Repository
| doi.org
2018
Journal article
Open Access
Efficient query processing for scalable web search
Tonellotto N, Macdonald C, Ounis I
Search engines are exceptionally important tools for accessing information in today's world. In satisfying the information needs of millions of users, the effectiveness (the quality of the search results) and the efficiency (the speed at which the results are returned to the users) of a search engine are two goals that form a natural trade-off, as techniques that improve the effectiveness of the search engine can also make it less efficient. Meanwhile, search engines continue to rapidly evolve, with larger indexes, more complex retrieval strategies and growing query volumes. Hence, there is a need for the development of efficient query processing infrastructures that make appropriate sacrifices in effectiveness in order to make gains in efficiency. This survey comprehensively reviews the foundations of search engines, from index layouts to basic term-at-a-time (TAAT) and document-at-a-time (DAAT) query processing strategies, while also providing the latest trends in the literature in efficient query processing, including coherent and systematic reviews of techniques such as dynamic pruning and impact-sorted posting lists as well as their variants and optimisations. Our explanations of query processing strategies, for instance the WAND and BMW dynamic pruning algorithms, are presented with illustrative figures showing how the processing state changes as the algorithms progress. Moreover, acknowledging the recent trends in applying a cascading infrastructure within search systems, this survey describes techniques for efficiently integrating effective learned models, such as those obtained from learning-to-rank techniques. The survey also covers the selective application of query processing techniques, often achieved by predicting the response times of the search engine (known as query efficiency prediction), and making per-query tradeoffs between efficiency and effectiveness to ensure that the required retrieval speed targets can be met. Finally, the survey concludes with a summary of open directions in efficient search infrastructures, namely the use of signatures, real-time, energy-efficient and modern hardware and software architectures.
Source: FOUNDATIONS AND TRENDS IN INFORMATION RETRIEVAL, vol. 12, pp. 319-500
DOI: 10.1561/1500000057
DOI: 10.1561/9781680835434
DOI: 10.5281/zenodo.3268358
DOI: 10.5281/zenodo.3268359
Project(s): BigDataGrapes
See at:
Enlighten
| Archivio della Ricerca - Università di Pisa
| ZENODO
| Foundations and Trends® in Information Retrieval
| CNR IRIS
| www.nowpublishers.com
2018
Contribution to book
Metadata Only Access
Popularity-based caching of CMS datasets
Meoni M, Perego R, Tonellotto N
The distributed monitoring infrastructure of the Compact Muon Solenoid (CMS) experiment at the European Organization for Nuclear Research (CERN) records on a Hadoop infrastructure a broad variety of computing and storage logs. They represent a valuable source of information for system tuning and capacity planning. In this paper we analyze machine learning (ML) techniques on a large amount of traces to discover patterns and correlations useful to classify the popularity of experiment-related datasets. We implement a scalable pipeline of Spark components which collect the dataset access logs from heterogeneous monitoring sources and group them into weekly snapshots organized by CMS sites. Predictive models are trained on these snapshots and forecast which dataset will become popular over time. Dataset popularity predictions are then used to experiment with a novel strategy of data caching, called Popularity Prediction Caching (PPC). We compare the hit rates of PPC with those produced by well-known caching policies. We demonstrate how the performance improvement is as high as 20% in some sites.
DOI: 10.3233/978-1-61499-843-3-221
See at:
ebooks.iospress.nl
| CNR IRIS