2015
Software
Metadata Only Access
See at:
CNR IRIS
2022
Conference article
Restricted
Sci-K 2022 - International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment
Manghi P, Mannocci A, Osborne F, Sacharidis D, Salatino A, Vergoulis T
In this paper we present the 2nd edition of the Scientific Knowledge: Representation, Discovery, and Assessment (Sci-K 2022) workshop. Sci-K aims to explore innovative solutions and ideas for the generation of approaches, data models, and infrastructures (e.g., knowledge graphs) for supporting, directing, monitoring and assessing the scientific knowledge and progress. This edition is also a reflection point as the community is seeking alternative solutions to the now-defunct Microsoft Academic Graph (MAG).
DOI: 10.1145/3487553.3524883
See at:
dl.acm.org
| CNR IRIS
2024
Conference article
Open Access
Exploring the Italian research landscape on Digital Library in the Conference IRCDL
Bernasconi E., Mannocci A., Tammaro A. M.
This study aims to explore the structure of knowledge around digital libraries embedded in IRCDL Conference presentations and examine research trends over time. It also analysed the published articles' subject, the authors, their affiliations and provenance and the collaboration network in IRCDL. We applied several bibliometric techniques, including productivity visualisation, authorship network analysis, and subject analysis.
Source: CEUR WORKSHOP PROCEEDINGS, vol. 3643, pp. 230-245. Brixen, 22-23 February 2024.
See at:
ceur-ws.org
| CNR IRIS
2022
Software
Metadata Only Access
dnet-dedup framework
Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Dell'Amico A., La Bruzzo S. F., Mannocci A., Manghi P.
The GDup Software enables an integrated, scalable, general-purpose system for entity deduplication over big information graphs. GDup supports practitioners with the functionalities needed to realize a fully-fledged entity deduplication workflow over a generic input graph, including Ground Truth support, end-user feedback, and strategies for identifying and merging duplicates to obtain an output disambiguated graph. GDup is today one of the core components of the OpenAIRE infrastructure production system, monitoring Open Science trends on behalf of the European Commission.
Project(s): OpenAIRE-Advance, OpenAIRE Nexus
See at:
github.com
| CNR IRIS
2013
Conference article
Restricted
Data searchery: preliminary analysis of data sources interlinking
Manghi P, Mannocci A
The novel e-Science's data-centric paradigm has proved that interlinking publications and research data objects coming from different realms and data sources (e.g. publication repositories, data repositories) makes dissemination, re-use, and validation of research activities more effective. Scholarly Communication Infrastructures are advocated for bridging such data sources, by offering tools for identification, creation, and navigation of relationships. Since realization and maintenance of such infrastructures is expensive, in this demo we propose a lightweight approach for "preliminary analysis of data source interlinking" to help practitioners evaluate whether and to what extent realizing them can be effective. We present Data Searchery, a configurable tool enabling users to easily plug in data sources from different realms with the purpose of cross-relating their objects, be they publications or research data, by identifying relationships between their metadata descriptions.
DOI: 10.1007/978-3-642-40501-3_60
Project(s): RDA EUROPE
See at:
CNR IRIS
| link.springer.com
2014
Contribution to book
Restricted
Preliminary analysis of data sources interlinking
Mannocci A., Manghi P.
The novel e-Science's data-centric paradigm has proved that interlinking publications and research data objects coming from different realms and data sources (e.g. publication repositories, data repositories) makes dissemination, re-use, and validation of research activities more effective. Scholarly Communication Infrastructures (SCIs) are advocated for bridging such data sources by offering an overlay of services for identification, creation, and navigation of relationships among objects of different nature. Since realization and maintenance of such infrastructures is in general very costly, in this paper we propose a lightweight approach for "preliminary analysis of data source interlinking" to help practitioners evaluate whether and to what extent realizing them can be effective. We present Data Searchery, a configurable tool delivering a service for relating objects across data sources, be they publications or research data, by identifying relationships between their metadata descriptions in real-time.
DOI: 10.1007/978-3-319-08425-1_6
DOI: 10.1007/978-3-319-14226-5_6
See at:
biblioproxy.cnr.it
| doi.org
| doi.org
| CNR IRIS
2021
Conference article
Open Access
BIP! DB: a dataset of impact measures for scientific publications
Vergoulis T, Kanellos I, Atzori C, Mannocci A, Chatzopoulos S, La Bruzzo S, Manola N, Manghi P
The growth rate of the number of scientific publications is constantly increasing, creating important challenges in the identification of valuable research and in various scholarly data management applications, in general. In this context, measures which can effectively quantify the scientific impact could be invaluable. In this work, we present BIP! DB, an open dataset that contains a variety of impact measures calculated for a large collection of more than 100 million scientific publications from various disciplines.
DOI: 10.1145/3442442.3451369
DOI: 10.48550/arxiv.2101.12001
Project(s): OpenAIRE-Advance, OpenAIRE Nexus
See at:
arXiv.org e-Print Archive
| arxiv.org
| dl.acm.org
| CNR IRIS
| ISTI Repository
| doi.org
| doi.org
2021
Dataset
Metadata Only Access
OpenAIRE research graph: dumps for research communities and initiatives
Manghi P, Atzori C, Bardi A, Baglioni M, Schirrwagen J, Dimitropoulos H, La Bruzzo S, Foufoulas I, Lohden A, Backer A, Mannocci A, Horst M, Czerniak A, Kiatropoulou K, Kokogiannaki A, De Bonis M, Artini M, Ottonello E, Lempesis A, Ioannidis A, Summan F
This dataset contains dumps of the OpenAIRE Research Graph containing metadata records relevant for the research communities and initiatives collaborating with OpenAIRE. Each dataset is a tar file containing gzip files with one json per line. Each json is compliant to the schema available at
DOI: 10.5281/zenodo.3974226
DOI: 10.5281/zenodo.3974604
Project(s): RISIS 2, BE OPEN, OpenAIRE-Advance
See at:
CNR IRIS
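The dump layout described in the entry above (a tar file containing gzip files with one JSON document per line) can be read with the Python standard library alone. The following is a minimal sketch, not code from the dataset authors: the function name `iter_records` and the assumption that the gzip members end in `.gz` are illustrative only.

```python
import gzip
import json
import tarfile
from typing import Iterator


def iter_records(tar_path: str) -> Iterator[dict]:
    """Yield one dict per JSON line from every gzip member of a tar archive."""
    with tarfile.open(tar_path, mode="r") as archive:
        for member in archive:
            # Assumption: gzip members are regular files whose names end in ".gz".
            if not member.isfile() or not member.name.endswith(".gz"):
                continue
            raw = archive.extractfile(member)
            if raw is None:
                continue
            # Decompress the member on the fly and parse it line by line.
            with gzip.open(raw, mode="rt", encoding="utf-8") as lines:
                for line in lines:
                    line = line.strip()
                    if line:
                        yield json.loads(line)
```

Because the function is a generator, records are parsed one at a time and a large dump does not need to fit in memory.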
2022
Other
Open Access
OpenAIRE Research Graph deduplication workflow
La Bruzzo Sf, Artini M, Atzori C, Bardi A, Baglioni M, De Bonis M, Mannocci A, Manghi P, Pavone G
The OpenAIRE aggregation workflow can collect metadata records from different providers about the same scholarly work. Each metadata record can carry different information because, for example, some providers are not aware of links to projects, keywords, or other details. Another typical case is when OpenAIRE collects one metadata record from a repository about a pre-print and another from a journal about the published article. To provide correct statistics, OpenAIRE must identify those cases and "merge" the two metadata records so that the scholarly work is counted only once in the statistics OpenAIRE produces. This technical Report describes the Deduplication workflow and technique adopted to deduplicate the OpenAIRE Graph.
DOI: 10.32079/isti-tr-2022/032
Project(s): OpenAIRE-Connect, OpenAIRE Nexus
See at:
CNR IRIS
| ISTI Repository
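The idea of merging records that describe the same scholarly work, so that it is counted once, can be illustrated with a toy example. This is only a sketch under strong assumptions and is not the technique described in the report: duplicates are detected here by a normalised title, and the record fields `title`, `source`, and `projects` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def normalise_title(title: str) -> str:
    """Lowercase a title and keep only letters and digits (illustrative key)."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def merge_duplicates(records: List[dict]) -> List[dict]:
    """Group records by normalised title and merge each group into one record."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        groups[normalise_title(record["title"])].append(record)

    merged = []
    for group in groups.values():
        merged.append({
            # Keep the title of the first record seen in the group.
            "title": group[0]["title"],
            # Union of the project links supplied by any provider.
            "projects": sorted({p for r in group for p in r.get("projects", [])}),
            # Remember which providers contributed to the merged record.
            "sources": sorted({r["source"] for r in group}),
        })
    return merged
```

A pre-print record from a repository and a journal record with the same title collapse into one entry that carries the project links known to either provider.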
2023
Conference article
Open Access
Tracing data footprints: formal and informal data citations in the scientific literature
Irrera O, Mannocci A, Manghi P, Silvello G
Data citation has become a prevalent practice within the scientific community, serving the purpose of facilitating data discovery, reproducibility, and credit attribution. Consequently, data has gained significant importance in the scholarly process. Despite its growing prominence, data citation is still at an early stage, with considerable variations in practices observed across scientific domains. Such diversity hampers the ability to consistently analyze, detect, and quantify data citations. We focus on the European Marine Science (MES) community to examine how data is cited in this specific context. We identify four types of data citations: formal, informal, complete, and incomplete. By analyzing the usage of these diverse data citation modalities, we investigate their impact on the widespread adoption of data citation practices.
DOI: 10.1007/978-3-031-43849-3_7
See at:
CNR IRIS
| link.springer.com
| ISTI Repository
| doi.org
not yet published
Conference article
Open Access
Exploring scientometrics with the OpenAIRE Graph: introducing the OpenAIRE Beginner's Kit
Mannocci A., Baglioni M.
The OpenAIRE Graph is an extensive resource housing diverse information on research products, including literature, datasets, and software, alongside research projects and other scholarly outputs and context. It stands as a cornerstone among contemporary research information databases, offering invaluable insights for scientometric investigations. Despite its wealth of data, its sheer size may initially appear daunting, potentially hindering its widespread adoption. To address this challenge, this paper introduces the OpenAIRE Beginner's Kit, a user-friendly solution providing access to a subset of the OpenAIRE Graph within a sandboxed environment coupled with a Jupyter notebook for analysis. The OpenAIRE Beginner's Kit is meticulously designed to democratise research and data exploration, offering accessibility from standard desktop and laptop setups. Within this paper, we provide a brief overview of the included dataset and offer guidance on leveraging the kit through a selection of illustrative queries tailored to address common scientometric inquiries.
See at:
arxiv.org
| CNR IRIS
2024
Conference article
Open Access
The ARIADNEplus Knowledge Base: a Linked Open Data set for archaeological research
Bardi A., Baglioni M., Artini M., Mannocci A., Pavone G.
The ARIADNE infrastructure provides tools and services for researchers to address archaeological grand challenges that require discovery and analysis of information scattered across different thematic and geographically distributed sources. The ARIADNEplus Knowledge Base (KB) is an archaeological Linked Open Data set modelled according to the ARIADNE ontology, based on CIDOC-CRM, and provided by an international network of organisations leaders in different domains of archaeological sciences. In February 2024, the ARIADNEplus KB features about 4 million archaeological resources. Thanks to the ARIADNE infrastructure, data providers increased the level of fairness of their resources and contributed to a unique asset for the archaeology research community, the European Open Science Cloud and society at large.
Source: CEUR WORKSHOP PROCEEDINGS, vol. 3741, pp. 91-100. Villasimius, Italy, 23-26/06/2024
Project(s): ARIADNEplus, ATRIUM
See at:
ceur-ws.org
| CNR IRIS
2019
Other
Open Access
The OpenAIRE research graph: third-party publishing APIs
Atzori C, Baglioni M, Bardi A, Manghi P, La Bruzzo S, De Bonis M, Dell'Amico A, Artini M, Mannocci A, Ottonello E
This work describes the specification of the OpenAIRE publishing APIs that support third-party services in publishing metadata about interlinked and packaged research products into the OpenAIRE Research Graph, in compliance with the OpenAIRE interoperability guidelines (https://guidelines.openaire.eu). Research products generated by researchers using services of research infrastructures are today manually published by researchers in a repository external to their research infrastructure. This phase is often considered an extra burden, because researchers have to fill in metadata forms with information that is already available in the scope of the services they used. By using the OpenAIRE publishing APIs, services of research infrastructures can implement an on-demand publishing workflow for any type of research product, helping their researchers improve the FAIRness of their research products and relieving them of the tedious step of finding a suitable repository and manually depositing the products in it.
See at:
CNR IRIS
| ISTI Repository
2021
Other
Open Access
InfraScience Research Activity Report 2020
Artini M, Assante M, Atzori C, Baglioni M, Bardi A, Candela L, Casini G, Castelli D, Cirillo R, Coro G, Debole F, Dell'Amico A, Frosini L, La Bruzzo S, Lazzeri E, Lelii L, Manghi P, Mangiacrapa F, Mannocci A, Pagano P, Panichi G, Piccioli T, Sinibaldi F, Straccia U
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2020 to highlight the major results. In particular, the InfraScience group confronted with research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2020 InfraScience members contributed to the publishing of 30 papers, to the research and development activities of 12 research projects (11 funded by EU), to the organization of conferences and training events, to several working groups and task forces.
DOI: 10.32079/isti-ar-2021/002
Project(s): ARIADNEplus, Blue Cloud, PerformFISH, EOSC-Pillar, DESIRA, EOSCsecretariat.eu, RISIS 2, TAILOR, I-GENE, MOVING, OpenAIRE-Advance, SoBigData-PlusPlus
See at:
CNR IRIS
| ISTI Repository
2022
Other
Open Access
OpenAIRE Research Graph: aggregation workflow
La Bruzzo Sf, Artini M, Atzori C, Bardi A, Baglioni M, De Bonis M, Dell'Amico A, Mannocci A, Manghi P, Pavone G
The OpenAIRE Graph (formerly the OpenAIRE Research Graph) is one of the largest open scholarly record collections worldwide. It is key in fostering Open Science and establishing its practices in daily research activities. Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back into the hands of the scientific community. OpenAIRE collects metadata records from more than 70K scholarly communication sources worldwide, including Open Access institutional repositories, data archives, and journals. All the metadata records (i.e., descriptions of research products) are put together in a data lake with records from Crossref, Unpaywall, ORCID, ROR, and information about projects provided by national and international funders. This technical Report describes the main Aggregation Workflow to orchestrate the data aggregation and the implemented mapping from some of the main datasources into the OpenAIRE research graph data model.
DOI: 10.32079/isti-tr-2022/033
Project(s): OpenAIRE-Advance, OpenAIRE Nexus
See at:
CNR IRIS
| ISTI Repository
2014
Conference article
Restricted
The Europeana network of Ancient Greek and Latin Epigraphy data infrastructure
Mannocci A, Casarosa V, Manghi P, Zoppi F
Epigraphic archives, containing collections of editions about ancient Greek and Latin inscriptions, have been created in several European countries during the last couple of centuries. Today, the project EAGLE (Europeana network of Ancient Greek and Latin Epigraphy, a Best Practice Network partially funded by the European Commission) aims at providing a single access point for the content of about 15 epigraphic archives, totaling about 1.5M digital objects. This paper illustrates some of the challenges encountered and their solution for the realization of the EAGLE data infrastructure. The challenges mainly concern the harmonization, interoperability and service integration issues caused by the aggregation of metadata from heterogeneous archives (different data models and metadata schemas, and exchange formats). EAGLE has defined a common data model for epigraphic information, into which data models from different archives can be optimally mapped. The data infrastructure is based on the D-NET software toolkit, capable of dealing with data collection, mapping, cleaning, indexing, and access provisioning through web portals or standard access protocols.
Source: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE (PRINT), pp. 286-300. Karlsruhe, Germany, 27-29 November 2014
DOI: 10.1007/978-3-319-13674-5_27
See at:
doi.org
| CNR IRIS
| link.springer.com
| www.scopus.com