Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Dell'Amico A., La Bruzzo S. F., Mannocci A., Manghi P.
Deduplication Framework Java Spark Hadoop
The GDup software provides an integrated, scalable, general-purpose system for entity deduplication over big information graphs. It gives practitioners the functionality needed to build a complete entity deduplication workflow over a generic input graph, including ground-truth support, end-user feedback, and strategies for identifying and merging duplicates to produce a disambiguated output graph. GDup is today one of the core components of the production system of the OpenAIRE infrastructure, which monitors Open Science trends on behalf of the European Commission.
@misc{oai:it.cnr:prodotti:478833,
  title  = {dnet-dedup framework},
  author = {Artini M. and Atzori C. and Bardi A. and Baglioni M. and De Bonis M. and Dell'Amico A. and La Bruzzo S. F. and Mannocci A. and Manghi P.},
  year   = {2022}
}
Artini, Michele
0000-0002-4406-428X
Atzori, Claudio
0000-0001-9613-6639
Baglioni, Miriam
0000-0002-2273-9004
Bardi, Alessia
0000-0002-1112-1292
De Bonis, Michele
0000-0003-2347-6012
Dell'Amico, Andrea
0000-0002-0127-2791
La Bruzzo, Sandro Fabrizio
0000-0003-2855-1245
Manghi, Paolo
0000-0001-7291-3210
Infrastructures for Science (2021-ongoing)
Servizio Infrastruttura Informatica ISTI e Supporto ai Servizi (2018-ongoing)
OpenAIRE-Advance
OpenAIRE Advancing Open Scholarship
OpenAIRE Nexus
OpenAIRE-Nexus Scholarly Communication Services for EOSC users