Catena M., Nardini F. M., Frieder O., Perego R., Muntean C. I., Tonellotto N.
Information Retrieval
We observe that the most relevant terms in unstructured news articles are concentrated towards the beginning and the end of the document. Exploiting this observation, we propose a novel version of the classical BM25 weighting model, called BM25 Passage (BM25P), which scores query results by computing a linear combination of term statistics in the different portions of news articles. Our experimentation, conducted using three publicly available news datasets, demonstrates that BM25P markedly outperforms BM25 in terms of effectiveness, by up to 17.44% in NDCG@5 and 85% in NDCG@1.
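The idea of a "linear combination of term statistics in the different portions" can be illustrated with a minimal Python sketch. This is one plausible reading of the abstract and not the authors' implementation: the function names, the equal-length passage split, the example weights, and the `k1` and `b` defaults are all assumptions made for illustration. The sketch replaces the usual term frequency in a standard BM25 term score with a weighted sum of per-passage term counts.

```python
import math


def split_passages(tokens, n_passages):
    """Split a token list into n_passages contiguous, near-equal chunks (assumed scheme)."""
    size = math.ceil(len(tokens) / n_passages) if tokens else 1
    return [tokens[i * size:(i + 1) * size] for i in range(n_passages)]


def bm25p_term_score(term, tokens, weights, idf, avg_len, k1=1.2, b=0.75):
    """Score one query term against one document (hypothetical helper).

    weights holds one assumed weight per passage; idf and avg_len are
    collection statistics supplied by the caller.
    """
    passages = split_passages(tokens, len(weights))
    # Weighted term frequency: a linear combination of per-passage counts.
    weighted_tf = sum(w * p.count(term) for w, p in zip(weights, passages))
    # Standard BM25 length normalisation, computed on the whole document.
    norm = k1 * (1 - b + b * len(tokens) / avg_len)
    return idf * weighted_tf * (k1 + 1) / (weighted_tf + norm)


# Illustrative weights that favour the first and last passages.
doc = ["a", "b", "c", "d", "e", "f"]
edge_score = bm25p_term_score("a", doc, [2.0, 1.0, 2.0], idf=1.0, avg_len=6)
middle_score = bm25p_term_score("c", doc, [2.0, 1.0, 2.0], idf=1.0, avg_len=6)
```

With all weights set to 1 the weighted term frequency equals the ordinary term frequency, so the sketch reduces to a plain BM25 term score. With larger weights on the outer passages, a term occurring at the start or end of the document contributes more than the same term occurring in the middle.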
Source: 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1269–1272, Paris, France, 21-25 July 2019
@inproceedings{oai:it.cnr:prodotti:415603,
  title = {Enhanced news retrieval: passages lead the way!},
  author = {Catena, M. and Nardini, F. M. and Frieder, O. and Perego, R. and Muntean, C. I. and Tonellotto, N.},
  doi = {10.1145/3331184.3331373},
  booktitle = {42nd International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages = {1269--1272},
  address = {Paris, France},
  month = jul,
  year = {2019}
}