2018
Conference article  Open Access

De-duplicating the OpenAIRE scholarly communication big graph

Atzori C., Manghi P., Bardi A.

open science, Scholarly communication, Graph, entity deduplication, Deduplication, duplicate identification, OpenAIRE, Big data

The OpenAIRE infrastructure populates a scholarly communication big graph interlinking metadata objects of publications, datasets, software, organizations, funders, and projects. To de-duplicate this graph, OpenAIRE has developed GDup, an integrated, scalable, general-purpose system for entity deduplication over big information graphs. GDup offers functionalities to realize a fully-fledged entity deduplication workflow over a generic input graph, including Ground Truth support, end-user feedback, and strategies for identifying and merging duplicates to obtain a disambiguated output graph.

Source: e-science 2018 - 14th IEEE International Conference on e-Science (e-Science), pp. 372–373, Amsterdam, the Netherlands, 29 October - 01 November 2018


BibTeX entry
@inproceedings{oai:it.cnr:prodotti:402402,
	title = {De-duplicating the OpenAIRE scholarly communication big graph},
	author = {Atzori, C. and Manghi, P. and Bardi, A.},
	doi = {10.1109/escience.2018.00104},
	note = {Additional DOIs: 10.5281/zenodo.1489139, 10.5281/zenodo.1489140},
	booktitle = {e-science 2018 - 14th IEEE International Conference on e-Science (e-Science), Amsterdam, the Netherlands, 29 October - 01 November 2018},
	pages = {372--373},
	year = {2018}
}

OpenAIRE2020
Open Access Infrastructure for Research in Europe 2020

OpenAIRE-Advance
OpenAIRE Advancing Open Scholarship

