2025
Conference article  Open Access

Efficient re-ranking with cross-encoders via early exit

Busolin F., Lucchese C., Nardini F. M., Orlando S., Perego R., Trani S., Veneri A.

Efficiency  Early exit  LLM-based rankers 

Pre-trained language models based on transformer networks are highly effective for document re-ranking in ad-hoc search. Among these, cross-encoders stand out for their effectiveness, as they process query-document pairs through the entire transformer network to compute ranking scores. However, this traversal is computationally expensive. To address this, prior work has explored early-exit strategies, enabling the model to terminate the traversal of query-document pairs. These techniques rely on learned classifiers, placed after each transformer block, that decide if a query-document pair can be dropped. Diverging from previous approaches, we propose Similarity-based Early Exit (SEE), a novel, non-learned strategy that exploits the similarities between query and document token embeddings to early-terminate the inference of documents that will most likely be non-relevant to the query. Even though SEE can be used after every transformer block, we show that the best advantage is achieved when applied before the first transformer block, thus saving most of the inference cost for the query-document pairs. Reproducible experiments on 17 public datasets covering in-domain and out-of-domain evaluation show that SEE can be effectively applied to four different cross-encoders, achieving speedups of up to 3.5× with a limited loss in ranking effectiveness.
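To illustrate the idea described in the abstract, here is a minimal, hypothetical sketch of a similarity-based early-exit pre-filter applied before the first transformer block. It is not the authors' implementation: the function name `see_prefilter`, the MaxSim-style aggregation, and the `keep_ratio` parameter are all assumptions for illustration. Documents whose query-document token-embedding similarity score falls outside the top fraction are dropped, and only the survivors would be passed through the full cross-encoder.

```python
import numpy as np

def see_prefilter(query_emb, doc_embs, keep_ratio=0.5):
    """Hypothetical sketch of a similarity-based early-exit pre-filter.

    query_emb: (q, d) array of query token embeddings (embedding-layer output).
    doc_embs:  list of (t_i, d) token-embedding arrays, one per candidate doc.
    Returns the indices of the documents to keep (i.e., to re-rank with the
    full cross-encoder); the rest are early-terminated.
    """
    # L2-normalize so dot products become cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    scores = []
    for d in doc_embs:
        dn = d / np.linalg.norm(d, axis=1, keepdims=True)
        sim = q @ dn.T                        # (q, t) cosine similarity matrix
        scores.append(sim.max(axis=1).sum())  # MaxSim-style aggregation (assumed)
    # Keep the top keep_ratio fraction of documents by similarity score.
    k = max(1, int(len(doc_embs) * keep_ratio))
    order = np.argsort(scores)[::-1]
    return sorted(order[:k].tolist())
```

A real system would obtain the token embeddings from the cross-encoder's own embedding layer, so the filter adds almost no cost relative to a full forward pass.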

Publisher: Association for Computing Machinery


BibTeX entry
@inproceedings{oai:iris.cnr.it:20.500.14243/562499,
	title = {Efficient re-ranking with cross-encoders via early exit},
	author = {Busolin F. and Lucchese C. and Nardini F. M. and Orlando S. and Perego R. and Trani S. and Veneri A.},
	publisher = {Association for Computing Machinery},
	doi = {10.1145/3726302.3729962},
	year = {2025}
}