2022
Conference article  Open Access

A preliminary assessment of the article deduplication algorithm used for the OpenAIRE Research Graph

Vichos K., De Bonis M., Kanellos I., Chatzopoulos S., Atzori C., Manola N., Manghi P., Vergoulis T.

Deduplication, Open Science, Scholarly data, Knowledge graphs

In recent years, a large number of Scholarly Knowledge Graphs (SKGs) have been introduced in the literature. The communities behind these graphs strive to gather, clean, and integrate scholarly metadata from various sources to produce clean and easy-to-process knowledge graphs. In this context, deduplication is a very important task in the respective cleaning and integration workflows. In this paper, we briefly describe the deduplication algorithm used for the OpenAIRE Research Graph and evaluate its accuracy. Our experiments show that the algorithm performs adequately, producing a small number of false positives and an even smaller number of false negatives.

Source: IRCDL 2022 - 18th Italian Research Conference on Digital Libraries, Padua, Italy, 24-25/02/2022



BibTeX entry
@inproceedings{oai:it.cnr:prodotti:468963,
	title = {A preliminary assessment of the article deduplication algorithm used for the OpenAIRE Research Graph},
	author = {Vichos K. and De Bonis M. and Kanellos I. and Chatzopoulos S. and Atzori C. and Manola N. and Manghi P. and Vergoulis T.},
	booktitle = {IRCDL 2022 - 18th Italian Research Conference on Digital Libraries, Padua, Italy, 24-25/02/2022},
	year = {2022}
}
CNR ExploRA

Bibliographic record

ISTI Repository

Published version Open Access

Also available from

ceur-ws.org Open Access

OpenAIRE Nexus
OpenAIRE-Nexus Scholarly Communication Services for EOSC users

