Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Dell'Amico A., La Bruzzo S. F., Mannocci A., Manghi P.
Deduplication Framework Java Spark Hadoop
The GDup software is an integrated, scalable, general-purpose system for entity deduplication over big information graphs. GDup provides practitioners with the functionality needed to build a complete entity deduplication workflow over a generic input graph, including Ground Truth support, end-user feedback, and strategies for identifying and merging duplicates to produce a disambiguated output graph. GDup is today one of the core components of the OpenAIRE infrastructure production system, which monitors Open Science trends on behalf of the European Commission.
@misc{oai:it.cnr:prodotti:478833, title = {dnet-dedup framework}, author = {Artini M. and Atzori C. and Bardi A. and Baglioni M. and De Bonis M. and Dell'Amico A. and La Bruzzo S. F. and Mannocci A. and Manghi P.}, year = {2022} }
Artini, Michele (ORCID: 0000-0002-4406-428X)
Atzori, Claudio (ORCID: 0000-0001-9613-6639)
Baglioni, Miriam (ORCID: 0000-0002-2273-9004)
Bardi, Alessia (ORCID: 0000-0002-1112-1292)
De Bonis, Michele (ORCID: 0000-0003-2347-6012)
Dell'Amico, Andrea (ORCID: 0000-0002-0127-2791)
La Bruzzo, Sandro Fabrizio (ORCID: 0000-0003-2855-1245)
Manghi, Paolo (ORCID: 0000-0001-7291-3210)
Mannocci, Andrea (ORCID: 0000-0002-5193-7851)
Infrastructures for Science (2021-ongoing)
Servizio Infrastruttura Informatica ISTI e Supporto ai Servizi (2018-ongoing)
OpenAIRE-Advance
OpenAIRE Advancing Open Scholarship
OpenAIRE Nexus
OpenAIRE-Nexus Scholarly Communication Services for EOSC users