69 result(s)
2025 Other Restricted
InfraScience research activity report 2024
Angioni S., Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Bosio C., Bove P., Calanducci A., Candela L., Casini G., Castelli D., Cirillo R., Coro G., De Bonis M., Debole F., Dell'Amico A., Frosini L., Ibrahim Ahmed, La Bruzzo S., Lelii L., Manghi P., Mangiacrapa F., Mangione D., Mannocci A., Molinaro E., Oliviero A., Pagano P., Panichi G., Teresa M. T., Pavone G., Peccerillo B., Piccioli T., Procaccini M., Straccia U., Vannini G. L., Versienti L.
InfraScience is a research group within the Institute of Information Science and Technologies (ISTI) of the National Research Council of Italy (CNR), based in Pisa. This activity report outlines the group's research achievements and initiatives throughout 2024. InfraScience focused its efforts on key challenges in the areas of Data Infrastructures, e-Science, and Intelligent Systems, maintaining a strong synergy between research and development and a firm commitment to open science principles. In 2024, the group played a leading role in the development and evolution of two major Open Science infrastructures: D4Science and OpenAIRE. InfraScience researchers contributed significantly to the scientific community through the publication of peer-reviewed papers, active participation in EU-funded research projects, organization of international conferences and training activities, and engagement in various working groups and task forces. This report highlights these contributions and underscores the group's ongoing dedication to advancing open, collaborative, and impactful science.
DOI: 10.32079/isti-ar-2025/001


See at: CNR IRIS Restricted


2024 Conference article Open Access
FDup framework: a general-purpose solution for efficient entity deduplication of record collections
De Bonis M., Atzori C., La Bruzzo S., Manghi P.
Deduplication is a technique aimed at identifying and resolving duplicate metadata records in a collection with a special focus on the performances of the approach. This paper describes FDup (Flat Collections Deduper), a general-purpose software framework supporting a complete deduplication workflow to manage big data record collections: metadata record data model definition, identification of candidate duplicates, identification of duplicates. FDup brings two main innovations: first, it delivers a full deduplication framework in a single easy-to-use software package based on Apache Spark Hadoop framework, where developers can customize the optimal and parallel workflow steps of blocking, sliding windows, and similarity matching function via an intuitive configuration file; second, it introduces a novel approach to improve performance, beyond the known techniques of “blocking” and “sliding window”, by introducing a smart similarity-matching function T-match. T-match is engineered as a decision tree that drives the comparisons of the fields of two records as branches of predicates and allows for successful or unsuccessful early exit strategies. The efficacy of the approach is proved by experiments performed over big data collections of metadata records in the OpenAIRE Graph, a known open-access knowledge base in Scholarly communication.
Source: CEUR WORKSHOP PROCEEDINGS, vol. 3741, pp. 624-632. Villasimius, Italy, 23-26/06/2024
Project(s): FAIRCORE4EOSC via OpenAIRE

See at: ceur-ws.org Open Access | CNR IRIS Open Access | CNR IRIS Restricted
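
The early-exit idea behind T-match, as described in the abstract above, can be illustrated with a minimal Python sketch. This is not FDup's implementation or its configuration format: the node layout, the field names (`doi`, `title`), the comparators and the thresholds are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Tuple, Union

MATCH, NO_MATCH = "MATCH", "NO_MATCH"


@dataclass
class Node:
    """One predicate over a single field of the two records (hypothetical layout)."""
    field: str
    compare: Callable[[str, str], float]  # returns a similarity in [0, 1]
    threshold: float
    on_true: Union["Node", str]           # next node, or a terminal verdict
    on_false: Union["Node", str]
    on_missing: Union["Node", str]        # taken when either value is absent


def exact(a: str, b: str) -> float:
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def fuzzy(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def t_match(left: dict, right: dict, root: Node) -> Tuple[str, int]:
    """Walk the tree; return the verdict and the number of comparisons run."""
    node: Union[Node, str] = root
    comparisons = 0
    while isinstance(node, Node):
        a, b = left.get(node.field), right.get(node.field)
        if not a or not b:
            node = node.on_missing
            continue
        comparisons += 1
        node = node.on_true if node.compare(a, b) >= node.threshold else node.on_false
    return node, comparisons


# Cheap, decisive check first; the costlier fuzzy title comparison runs only
# when the DOI is missing on one side.
title_node = Node("title", fuzzy, 0.9, MATCH, NO_MATCH, NO_MATCH)
doi_node = Node("doi", exact, 1.0, MATCH, NO_MATCH, title_node)

a = {"doi": "10.1/x", "title": "Entity deduplication at scale"}
b = {"doi": "10.1/X", "title": "A completely different title"}
c = {"title": "Entity Deduplication at Scale"}
d = {"doi": "10.1/y", "title": "Entity deduplication at scale"}

print(t_match(a, b, doi_node))  # equal DOIs: successful early exit
print(t_match(a, c, doi_node))  # DOI missing: decision falls through to the title
print(t_match(a, d, doi_node))  # different DOIs: unsuccessful early exit
```

In this toy tree the title comparison is never executed when both DOIs are present, which is the kind of saving an early-exit strategy aims for.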


2024 Dataset Open Access
OpenAIRE Graph Dataset v8.0.0 (July 2024)
Manghi P., Atzori C., Bardi A., Baglioni M., Dimitropoulos H., La Bruzzo S., Foufoulas I., Mannocci A., Horst M., Iatropoulou K., Kokogiannaki A., De Bonis M., Artini M., Lempesis A., Ioannidis A., Manola N., Principe P., Vergoulis T., Chatzopoulos S.
The OpenAIRE Graph is a large and rich collection of open and linked scholarly records from trusted data sources, such as journals, repositories, and registries. It aims to foster Open Science practices and enable the scientific community to discover, monitor, and evaluate science. The Graph is cleaned, deduplicated, enriched, and full-text mined to generate statistics and insights. The Graph is accessible via various services, such as OpenAIRE MONITOR, EXPLORE, ScholeXplorer (Scholix API for the retrieval of literature-data links), search APIs and snapshots in JSON format updated every six months. The Graph data are openly available under a CC-BY license for third parties to reuse and create added-value services. The documentation is available at: https://graph.openaire.eu
DOI: 10.5281/zenodo.12819872
Project(s): FAIRCORE4EOSC via OpenAIRE, SciLake via OpenAIRE, EOSC Beyond via OpenAIRE, GraspOS via OpenAIRE, OSTrails via OpenAIRE


See at: CNR IRIS Open Access | zenodo.org Open Access | CNR IRIS Restricted


2023 Other Open Access
OpenAIRE, comunità e servizi per praticare la scienza aperta [OpenAIRE: community and services for practicing open science]
Pavone G, Atzori C, Baglioni M, Bardi A, Manghi P, Castelli D
Practicing research according to Open Science principles requires both technologies, with infrastructures that enable and facilitate collaboration and the large-scale exchange of information at an international level, and skills that make it possible to maximise their use and results. In other words, it requires services, exchange of expertise, and training. These are the directions on which the work of OpenAIRE (Open Access Infrastructure for Research in Europe) focuses: it is the European infrastructure for Open Science, offering technological services and a European network of exchange and synergy to foster open science. Launched as a European project in 2009 to monitor Open Access, the initiative has been refinanced over the years and its scope extended to all components of Open Science. In 2018 it was established as a non-profit organisation to guarantee a permanent structure in support of national and European Open Science policies. The OpenAIRE network has more than 40 members, including research centres, universities, foundations, and service-operating bodies distributed across Europe. As a community of practice, OpenAIRE's mission is to establish and operate an infrastructure that supports open and sustainable scholarly communication, providing the services, resources, and coordination of initiatives and experts needed to implement a common European environment for open science. To realise this vision, OpenAIRE offers technological, training, and support services covering the entire research life cycle (the full list of services is available at catalogue.openaire.eu). The technological services range from data management to discovery, and from journal management to the monitoring of research results and of the adoption of Open Science practices.
In addition, the international network of NOADs (National Open Access Desks: openaire.eu/contact-noads) promotes open science by providing assistance and training at various levels. The goal is to enable the various actors involved in scientific activity to adopt open science and open access practices, by organising national workshops and dedicated training. The NOADs also provide expert advice on the infrastructures that support open science workflows and on the definition of policies for its implementation, such as drafting and updating institutional policies and identifying regulatory obligations, funding-related requirements, or tools for the Data Management Plan (DMP). CNR, and in particular its institute ISTI, as the infrastructure's centre for technological development and innovation and as operator of the Italian NOAD, works in accordance with OpenAIRE's mission, contributing significantly to its activities and governing bodies. CNR therefore offers its expertise to ensure the maintenance, operation, and innovation of the infrastructure, taking part in initiatives and projects that contribute to the sustainability and innovation of its services. As a NOAD, it offers training and support on issues such as defining DMPs, complying with the "FAIR" principles for data management, and drafting institutional policies. These activities are carried out in collaboration with the NOADs of other European countries so as to maximise the integration of solutions and policies at the European level.

See at: CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2023 Other Open Access
OpenAIRE Graph: una risorsa aperta per la scienza aperta [OpenAIRE Graph: an open resource for open science]
Atzori C, Bardi A, Baglioni M, Manghi P
OpenAIRE Graph (OAG) is a knowledge graph that aggregates information (metadata, relationships) about different entities in the research world, such as publications, datasets, software, funded projects, repositories, and organisations. These entities are interconnected through semantic relationships, such as citations, supplements, similarity, and participation in projects. OAG is an open resource that can be used by funders, organisations, researchers, research communities, and publishers to gain a better understanding of the research landscape and dynamics at various levels, both local and global. As an open and freely accessible resource, produced in accordance with the fundamental values of Open Science as outlined in the UNESCO Recommendation on Open Science, OAG overcomes the use of proprietary data sources, supporting the reform of research assessment, researchers, and organisations as envisaged by the Coalition for Advancing Research Assessment (CoARA). OAG is built from bibliographic records obtained from well-known sources such as Crossref, open access journals registered in DOAJ (Directory of Open Access Journals), ORCID, Microsoft Academic Graph, Datacite, as well as from over 1,000 institutional repositories. The metadata of research products contained in the graph are disambiguated and enriched through full-text and data mining processes, making OAG usable for a variety of purposes, including research discovery, research assessment, analysis and/or prediction of research collaborations, and support for research policy decision-making. OAG is a freely accessible resource: search and discovery features are available through the explore.openaire.eu portal; programmatic integration is available through the HTTP Search API; the complete dataset, as well as other datasets that offer specialised views, is available on Zenodo.
The monitor.openaire.eu portal hosts several dashboards dedicated to research organisations and funders that include the results of statistical, bibliometrics, and indicator analyses. Additional information is available at https://graph.openaire.eu, where the data models to which the datasets conform, API documentation, as well as the methodological approach used to build and process OAG are described. OAG can play a significant role in research assessment by providing a more comprehensive and accurate view of research output and impact. By aggregating data from a variety of sources, OAG can provide a more holistic picture of a researcher's or organization's research activities. This can help to identify areas of strength and weakness, as well as potential areas for collaboration. OAG can also be used to track the impact of research over time. By tracking citations, downloads, and other forms of engagement, OAG can help to measure the influence of research and the impact it has on society. This information can be used to inform research funding decisions, as well as to promote the dissemination of research findings. In addition to its quantitative measures, OAG can also provide qualitative insights into research. By analyzing the relationships between different research products, OAG can help to identify emerging trends and areas of collaboration. This information can be used to support research policy development and to promote the cross-fertilization of ideas. In conclusion, OAG is a powerful tool that has the potential to revolutionise the way research is assessed. By providing a more comprehensive and accurate view of research output and impact, OAG can help to make research assessment more fair, transparent, and equitable.

See at: CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2023 Other Open Access
InfraScience research activity report 2023
Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Bosio C., Bove P., Calanducci A., Candela L., Casini G., Castelli D., Cirillo R., Coro G., De Bonis M., Debole F., Dell'Amico A., Frosini L., Ibrahim A. S. T., La Bruzzo S., Lelii L., Manghi P., Mangiacrapa F., Mangione D., Mannocci A., Molinaro E., Pagano P., Panichi G., Paratore M. T., Pavone G., Piccioli T., Sinibaldi F., Straccia U., Vannini G. L.
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2023 to highlight the major results. In particular, the InfraScience group engaged in research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large-scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2023 InfraScience members contributed to the publication of several papers, to the research and development activities of several research projects (primarily funded by the EU), to the organization of conferences and training events, and to several working groups and task forces.
DOI: 10.32079/isti-ar-2023/002
Project(s): Blue Cloud via OpenAIRE, EOSC Future via OpenAIRE, TAILOR via OpenAIRE


See at: CNR IRIS Open Access | CNR IRIS Restricted


2023 Contribution to conference Open Access
OpenOrgs: the OpenAIRE tool for bridging registries of research organizations
Atzori C., Pavone G., Končić I, Macan B.
Building a connected open scholarly communication system requires unambiguously identifying research-related entities. It is not a simple task: for example, the same organization can have a range of names (e.g. legal name, a shortened version, an abbreviation, in the national language or in English), as well as varied metadata in other sources. Additionally, persistent IDs might not be helpful when several data sources (e.g. ROR, ISNI, EC PIC) use different PID schemas to identify organizations. Due to this ambiguity, there are efficiency issues with information sharing, discoverability of research outputs, keeping track of activities, and ultimately building an integrated open scholarly communication system and OS services (Artini et al., 2022).
A new tool called OpenOrgs was developed to address this old issue: the disambiguation of organizations engaged in the research process (Pavone, 2021) as well as the parent-child relationships between departments and organizations. OpenOrgs tackles the ambiguity in the data that OpenAIRE collects from several research organization registries (e.g. ROR), as well as other sources including institutional repositories, scientific journals, and CRIS systems, and aggregates them to populate the OpenAIRE Graph (OpenAIRE Graph, n.d.). To make up for the lack of information and increase the organization's discoverability and recognition, OpenOrgs combines automated processes with human curation. Numerous data sources are used to gather and merge information on organizations. Their metadata are automatically compared and combined, then these suggested identities are manually checked by data curators assigned at a national or multi-national level.
These two steps work as follows:
1. The deduplication algorithm (De Bonis, 2022) suggests a similarity between organizations that emerge in various sources by comparing their metadata (e.g. the organization name, URL, country).
2. The automatic procedure is then verified by a manual curation process. By indicating whether two or more entities pertain to the same organization or not, data curators can clear up any ambiguity surrounding duplicates detected using the automated approach. Furthermore, they can themselves suggest new duplicates not identified by the algorithm and edit the metadata description of the organization, compensating for the lack of information from sources and enhancing the completeness and discoverability of the organization records, for example by adding a persistent identifier, an alternative name (OpenAIRE, 2023), or establishing parent-child relationships (e.g. university and departments).
As of now, there are more than 70 registered data curators from over 40 countries, with more than 100,000 curated organizations.
OpenOrgs offers a number of advantages for researchers, Research Performing Organizations (RPOs), Research Funding Organizations (RFOs), and all other stakeholders of Open Science services. It improves the findability of digital objects for academics and provides RPOs with a consistent showcase of the overall scientific production. It offers RFOs consistent data on the impact of resources. Finally, OpenOrgs offers functional and up-to-date services to all parties involved in Open Science.
In the OpenAIRE ecosystem, OpenOrgs plays an important role. For example, OpenAIRE Explore displays the curated metadata from OpenOrgs, giving researchers quick and easy access to details about the organizations involved in the research process (OpenAIRE EXPLORE, n.d.). These data are also used by the OpenAIRE Monitor service, which tracks and monitors research activities and Open Science trends of organizations (OpenAIRE MONITOR, n.d.). This integration improves these organizations' discoverability and recognition even more, fostering a more open and cooperative research environment. Therefore, rather than just a tool, OpenOrgs is a game-changer for the research community, and we believe it will contribute positively to building and maintaining an integrated open scholarly communication system in the years to come.
DOI: 10.15291/pubmet.4281


See at: PUBMET Open Access | CNR IRIS Open Access | PUBMET Restricted | CNR IRIS Restricted
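
The two-step flow described in this abstract (automatic suggestion, then curator verification) can be sketched as follows. This is only an illustration under stated assumptions: the matching rules, the 0.85 name threshold, the record fields and the `"same"` verdict label are hypothetical and do not reproduce the OpenOrgs algorithm or its data model.

```python
from difflib import SequenceMatcher
from itertools import combinations
from urllib.parse import urlparse


def host(url):
    """Return the host of a URL without a leading 'www.' (empty if unparsable)."""
    netloc = urlparse(url or "").netloc.lower()
    return netloc[4:] if netloc.startswith("www.") else netloc


def suggest(orgs, name_threshold=0.85):
    """Step 1: automatically suggest candidate duplicates (illustrative rules)."""
    suggestions = []
    for a, b in combinations(orgs, 2):
        if a["country"] != b["country"]:
            continue
        host_a, host_b = host(a.get("url")), host(b.get("url"))
        same_site = bool(host_a) and host_a == host_b
        name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
        if same_site or name_sim >= name_threshold:
            suggestions.append((a["id"], b["id"]))
    return suggestions


def curate(suggestions, verdicts):
    """Step 2: keep only the suggestions that a curator has confirmed."""
    return [pair for pair in suggestions if verdicts.get(pair) == "same"]


# Fictional organizations, for illustration only.
orgs = [
    {"id": "o1", "name": "University of Example", "url": "https://www.example.edu", "country": "IT"},
    {"id": "o2", "name": "Università di Example", "url": "https://example.edu/en", "country": "IT"},
    {"id": "o3", "name": "University of Example", "url": "https://example.org", "country": "HR"},
]

suggested = suggest(orgs)
confirmed = curate(suggested, {("o1", "o2"): "same"})
print(suggested, confirmed)
```

Keeping the curator verdicts separate from the automatic suggestions means a rejected pair simply drops out, without touching the matching rules.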


2022 Conference article Open Access
A preliminary assessment of the article deduplication algorithm used for the OpenAIRE Research Graph
Vichos K, De Bonis M, Kanellos I, Chatzopoulos S, Atzori C, Manola N, Manghi P, Vergoulis T
In recent years, a large number of Scholarly Knowledge Graphs (SKGs) have been introduced in the literature. The communities behind these graphs strive to gather, clean, and integrate scholarly metadata from various sources to produce clean and easy-to-process knowledge graphs. In this context, a very important task of the respective cleaning and integration workflows is deduplication. In this paper, we briefly describe and evaluate the accuracy of the deduplication algorithm used for the OpenAIRE Research Graph. Our experiments show that the algorithm performs adequately, producing a small number of false positives and an even smaller number of false negatives.
Source: CEUR WORKSHOP PROCEEDINGS. Padua, Italy, 24-25/02/2022
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ceur-ws.org Open Access | CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted
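
False positives and false negatives of a deduplication run are typically counted by comparing predicted duplicate pairs with a ground truth. The sketch below shows that bookkeeping on made-up record identifiers; it is a generic pairwise evaluation and does not reproduce the paper's data, protocol or results.

```python
from itertools import combinations


def pairs_from_groups(groups):
    """Expand groups of equivalent record ids into unordered id pairs."""
    pairs = set()
    for group in groups:
        pairs.update(frozenset(p) for p in combinations(sorted(group), 2))
    return pairs


def evaluate(predicted_groups, truth_groups):
    predicted = pairs_from_groups(predicted_groups)
    truth = pairs_from_groups(truth_groups)
    tp = len(predicted & truth)
    fp = len(predicted - truth)  # pairs merged by mistake
    fn = len(truth - predicted)  # true duplicates that were missed
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Illustrative ids only.
truth = [{"r1", "r2", "r3"}, {"r4", "r5"}]
predicted = [{"r1", "r2"}, {"r4", "r5", "r6"}]
result = evaluate(predicted, truth)
print(result)
```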


2022 Other Open Access
InfraScience research activity report 2021
Artini M, Assante M, Atzori C, Baglioni M, Bardi A, Bove P, Candela L, Casini G, Castelli D, Cirillo R, Coro G, De Bonis M, Debole F, Dell'Amico A, Frosini L, La Bruzzo S, Lazzeri E, Lelii L, Manghi P, Mangiacrapa F, Mangione D, Mannocci A, Ottonello E, Pagano P, Panichi G, Pavone G, Piccioli T, Sinibaldi F, Straccia U
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2021 to highlight the major results. In particular, the InfraScience group addressed research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large-scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2021 InfraScience members contributed to the publication of 25 papers, to the research and development activities of 18 research projects (15 funded by the EU), to the organization of conferences and training events, and to several working groups and task forces.
DOI: 10.32079/isti-ar-2022/001
Project(s): ARIADNEplus via OpenAIRE, Blue Cloud via OpenAIRE, PerformFISH via OpenAIRE, EOSC-Pillar via OpenAIRE, DESIRA via OpenAIRE, EOSC Future via OpenAIRE, EOSCsecretariat.eu via OpenAIRE, EcoScope via OpenAIRE, RISIS 2 via OpenAIRE, OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE, SoBigData-PlusPlus via OpenAIRE


See at: CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2022 Journal article Open Access
FDup: a framework for general-purpose and efficient entity deduplication of record collections
De Bonis M., Manghi P., Atzori C.
Deduplication is a technique aiming at identifying and resolving duplicate metadata records in a collection. This article describes FDup (Flat Collections Deduper), a general-purpose software framework supporting a complete deduplication workflow to manage big data record collections: metadata record data model definition, identification of candidate duplicates, identification of duplicates. FDup brings two main innovations: first, it delivers a full deduplication framework in a single easy-to-use software package based on Apache Spark Hadoop framework, where developers can customize the optimal and parallel workflow steps of blocking, sliding windows, and similarity matching function via an intuitive configuration file; second, it introduces a novel approach to improve performance, beyond the known techniques of "blocking" and "sliding window", by introducing a smart similarity matching function T-match. T-match is engineered as a decision tree that drives the comparisons of the fields of two records as branches of predicates and allows for successful or unsuccessful early-exit strategies. The efficacy of the approach is proved by experiments performed over big data collections of metadata records in the OpenAIRE Research Graph, a known open access knowledge base in Scholarly communication.
Source: PEERJ. COMPUTER SCIENCE., vol. 8 (issue e1058)
DOI: 10.7717/peerj-cs.1058
Project(s): OpenAIRE Nexus via OpenAIRE


See at: OpenAIRE Open Access | CNR IRIS Open Access | ISTI Repository Open Access | peerj.com Open Access | CNR IRIS Restricted
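
Blocking and sliding windows, the two known techniques this abstract refers to, limit how many record pairs are compared at all. A minimal single-machine sketch follows; the blocking key (first three alphanumeric characters of the title), the sort key and the window size are assumptions, and the code uses neither Apache Spark nor FDup's configuration file.

```python
from collections import defaultdict


def blocking_key(record):
    """Hypothetical key: first three alphanumeric characters of the title."""
    normalised = "".join(ch for ch in record["title"].lower() if ch.isalnum())
    return normalised[:3]


def candidate_pairs(records, window=3):
    """Yield id pairs that share a block and lie within the sliding window."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)
    for block in blocks.values():
        block.sort(key=lambda r: r["title"].lower())
        for i, left in enumerate(block):
            # Compare each record only with its next (window - 1) neighbours.
            for right in block[i + 1 : i + window]:
                yield left["id"], right["id"]


records = [
    {"id": 1, "title": "Deduplication of records"},
    {"id": 2, "title": "Deduplication of record collections"},
    {"id": 3, "title": "Dedup frameworks"},
    {"id": 4, "title": "Open science graphs"},
]

pairs = list(candidate_pairs(records, window=2))
print(pairs)
```

With four records an exhaustive comparison would test six pairs; here record 4 falls in its own block and the window of 2 restricts the remaining block to adjacent neighbours only.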


2022 Software Metadata Only Access
dnet-dedup framework
Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Dell'Amico A., La Bruzzo S. F., Mannocci A., Manghi P.
The GDup Software enables an integrated, scalable, general-purpose system for entity deduplication over big information graphs. GDup supports practitioners with the functionalities needed to realize a fully-fledged entity deduplication workflow over a generic input graph, including Ground Truth support, end-user feedback, and strategies for identifying and merging duplicates to obtain an output disambiguated graph. GDup is today one of the core components of the OpenAIRE infrastructure production system, monitoring Open Science trends on behalf of the European Commission.
Project(s): OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: github.com Restricted | CNR IRIS Restricted


2022 Other Open Access
Data model description of the OpenAIRE Research Graph
La Bruzzo Sf, Artini M, Atzori C, Bardi A, Baglioni M, De Bonis M, Mannocci A, Manghi P, Pavone G
The OpenAIRE Graph (formerly known as the OpenAIRE Research Graph) is one of the largest open scholarly record collections worldwide, key to fostering Open Science and establishing its practices in daily research activities. Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back into the hands of the scientific community. Imagine a vast collection of research products all linked together, contextualized, and openly available. For the past years, OpenAIRE has been working to gather this valuable record. It is a massive collection of metadata and links between scientific products such as articles, datasets, software, and other research products, and entities like organizations, funders, funding streams, projects, communities, and data sources. This technical report describes the public data model adopted by the OpenAIRE Graph.
DOI: 10.32079/isti-tr-2022/031


See at: CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2022 Other Open Access
OpenAIRE Research Graph: aggregation workflow
La Bruzzo Sf, Artini M, Atzori C, Bardi A, Baglioni M, De Bonis M, Dell'Amico A, Mannocci A, Manghi P, Pavone G
The OpenAIRE Graph (formerly the OpenAIRE Research Graph) is one of the largest open scholarly record collections worldwide. It is key in fostering Open Science and establishing its practices in daily research activities. Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back into the hands of the scientific community. OpenAIRE collects metadata records from more than 70K scholarly communication sources worldwide, including Open Access institutional repositories, data archives, and journals. All the metadata records (i.e., descriptions of research products) are put together in a data lake with records from Crossref, Unpaywall, ORCID, ROR, and information about projects provided by national and international funders. This technical report describes the main aggregation workflow that orchestrates the data aggregation, and the implemented mapping from some of the main data sources into the OpenAIRE Research Graph data model.
DOI: 10.32079/isti-tr-2022/033
Project(s): OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE


See at: CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2022 Other Open Access
OpenAIRE Research Graph deduplication workflow
La Bruzzo Sf, Artini M, Atzori C, Bardi A, Baglioni M, De Bonis M, Mannocci A, Manghi P, Pavone G
The OpenAIRE aggregation workflow can collect metadata records from different providers about the same scholarly work. Each metadata record can carry different information because, for example, some providers are not aware of links to projects, keywords, or other details. Another typical case is when OpenAIRE collects one metadata record from a repository about a pre-print and another from a journal about the published article. To provide correct statistics, OpenAIRE must identify those cases and "merge" the two metadata records so that the scholarly work is counted only once in the statistics OpenAIRE produces. This technical report describes the deduplication workflow and technique adopted to deduplicate the OpenAIRE Graph.
DOI: 10.32079/isti-tr-2022/032
Project(s): OpenAIRE-Connect via OpenAIRE, OpenAIRE Nexus via OpenAIRE


See at: CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted
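
The "merge" step described in this abstract can be pictured with a small sketch. It assumes hypothetical field names (`title`, `projects`, `keywords`, `collected_from`) and a simple rule (the first non-empty scalar wins, list values are unioned); the merge strategy actually documented in the report is not reproduced here.

```python
def merge_records(records):
    """Merge metadata records describing the same work into one representative."""
    merged = {}
    for record in records:
        for field, value in record.items():
            if isinstance(value, list):
                current = merged.setdefault(field, [])
                for item in value:
                    if item not in current:
                        current.append(item)
            elif value and not merged.get(field):
                merged[field] = value
    return merged


# Illustrative records: a pre-print from a repository and the published article.
preprint = {
    "title": "A study",
    "projects": [],
    "keywords": ["dedup"],
    "collected_from": ["repository"],
}
article = {
    "title": "A study",
    "projects": ["grant-123"],
    "keywords": ["dedup", "graphs"],
    "collected_from": ["journal"],
}

work = merge_records([preprint, article])
print(work)
```

The merged record keeps the project link known only to one provider and lists both providers, so the work is counted once without losing information.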


2022 Other Open Access
InfraScience research activity report 2022
Artini M, Assante M, Atzori C, Baglioni M, Bardi A, Bove P, Candela L, Casini G, Castelli D, Cirillo R, Coro G, De Bonis M, Debole F, Dell'Amico A, Frosini L, La Bruzzo S, Lelii L, Manghi P, Mangiacrapa F, Mangione D, Mannocci A, Ottonello E, Pagano P, Panichi G, Pavone G, Piccioli T, Sinibaldi F, Straccia U, Zoppi F
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2022 to highlight the major results. In particular, the InfraScience group addressed research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large-scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2022 InfraScience members contributed to the publication of several papers, to the research and development activities of 18 research projects (15 funded by the EU), to the organization of conferences and training events, and to several working groups and task forces.
DOI: 10.32079/isti-ar-2022/004
Project(s): ARIADNEplus via OpenAIRE, Blue Cloud via OpenAIRE, EOSC-Pillar via OpenAIRE, DESIRA via OpenAIRE, EOSC Future via OpenAIRE, RISIS 2 via OpenAIRE, TAILOR via OpenAIRE, SoBigData-PlusPlus via OpenAIRE


See at: CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2022 Other Open Access
RISIS tool demonstration event - The OpenAIRE Research Graph: an Open Access resource for research on research
Bardi A, Baglioni M, Atzori C
RISIS embraces the International Open Access Week 2022 with a session on the OpenAIRE Research Graph: an Open Access dataset with metadata about research products (literature, datasets, software, etc.) linked to other entities of the research ecosystem like organisations, project grants, data sources, and services. The session included a presentation of the graph and a guided practical session where participants could learn how to use the OpenAIRE Research Graph for research and policy-related activities. More information about the event is available on the RISIS2 project web site. The practical part was conducted on the RISIS Lab Virtual Research Environment of the D4Science infrastructure operated by CNR - ISTI. The Jupyter notebooks can be run on the JupyterHub integrated in the RISIS Lab or in other JupyterHub instances supporting PySpark. The data analysis was performed on a subset of the OpenAIRE Research Graph composed of 848 H2020 projects related to the Sustainable Development Goal Climate Action (SDG13), their funded research products, and their related organizations (risis_dataset.zip). Details on the subset, the model, and other useful documentation are available in the slides.
Project(s): RISIS 2 via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2021 Conference article Open Access
Reflections on the misuses of ORCID iDs
Baglioni M, Mannocci A, Manghi P, Atzori C, Bardi A, La Bruzzo S
Since 2012, the "Open Researcher and Contributor Identification Initiative" (ORCID) has been successfully running a worldwide registry, with the aim of unequivocally pinpointing researchers and the body of knowledge they contributed to. In practice, ORCID clients, e.g., publishers, repositories, and CRIS systems, make sure their metadata can refer to iDs in the ORCID registry to associate authors and their work unambiguously. However, the ORCID infrastructure still suffers from several "service misuses", which put its very mission at risk and should therefore be identified and tackled. In this paper, we classify and qualitatively document such misuses, originating from both the users (researchers and organisations) of the ORCID registry and the ORCID clients. We conclude by providing an outlook and a few recommendations aimed at improving the exploitation of the ORCID infrastructure.
Source: CEUR WORKSHOP PROCEEDINGS, pp. 117-125. Online conference, 18-19/02/2021
Project(s): OpenAIRE-Advance via OpenAIRE

See at: ceur-ws.org Open Access | CNR IRIS Open Access | ISTI Repository Open Access | CNR IRIS Restricted


2021 Conference article Open Access
BIP! DB: a dataset of impact measures for scientific publications
Vergoulis T, Kanellos I, Atzori C, Mannocci A, Chatzopoulos S, La Bruzzo S, Manola N, Manghi P
The growth rate of the number of scientific publications is constantly increasing, creating important challenges in the identification of valuable research and in various scholarly data management applications, in general. In this context, measures which can effectively quantify the scientific impact could be invaluable. In this work, we present BIP! DB, an open dataset that contains a variety of impact measures calculated for a large collection of more than 100 million scientific publications from various disciplines.
DOI: 10.1145/3442442.3451369
DOI: 10.48550/arxiv.2101.12001
Project(s): OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE


See at: arXiv.org e-Print Archive Open Access | arxiv.org Open Access | dl.acm.org Open Access | CNR IRIS Open Access | ISTI Repository Open Access | doi.org Restricted | CNR IRIS Restricted


2021 Dataset Metadata Only Access
OpenAIRE research graph: dumps for research communities and initiatives
Manghi P, Atzori C, Bardi A, Baglioni M, Schirrwagen J, Dimitropoulos H, La Bruzzo S, Foufoulas I, Lohden A, Backer A, Mannocci A, Horst M, Czerniak A, Kiatropoulou K, Kokogiannaki A, De Bonis M, Artini M, Ottonello E, Lempesis A, Ioannidis A, Summan F
This dataset contains dumps of the OpenAIRE Research Graph containing metadata records relevant for the research communities and initiatives collaborating with OpenAIRE. Each dataset is a tar file containing gzip files with one JSON record per line. Each JSON record complies with the schema available at DOI: 10.5281/zenodo.3974226.
DOI: 10.5281/zenodo.3974604
Project(s): RISIS 2 via OpenAIRE, BE OPEN via OpenAIRE, OpenAIRE-Advance via OpenAIRE


See at: CNR IRIS Restricted
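
The packaging described in this record (a tar file containing gzip files with one JSON record per line) can be read in a streaming fashion. The sketch below is a minimal reader under assumptions: the `.gz` member suffix, the member name and the `id` field in the stand-in sample are hypothetical, and a real dump should be checked against the published schema.

```python
import gzip
import io
import json
import tarfile
import tempfile
from pathlib import Path


def iter_dump_records(tar_path):
    """Yield one parsed JSON object per line from every .gz member of a tar dump."""
    with tarfile.open(tar_path, "r") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".gz"):
                continue
            raw = archive.extractfile(member)
            with gzip.open(raw, "rt", encoding="utf-8") as lines:
                for line in lines:
                    if line.strip():
                        yield json.loads(line)


# Build a tiny stand-in archive so the sketch runs without downloading a dump.
payload = gzip.compress(b'{"id": "a"}\n{"id": "b"}\n')
sample = Path(tempfile.mkdtemp()) / "sample_dump.tar"
with tarfile.open(sample, "w") as archive:
    info = tarfile.TarInfo("part-00000.json.gz")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))

records = list(iter_dump_records(sample))
print(len(records), records[0])
```

Because the reader is a generator, records are parsed one line at a time and a large dump never has to be held in memory at once.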


2021 Dataset Open Access
OpenAIRE Covid-19 publications, datasets, software and projects metadata
Bardi A., Kuchma I., Pavone G., Artini M., Atzori C., Backer A., Baglioni M., Czerniak A., De Bonis M., Dimitropoulos H., Foufoulas I., Horst M., Iatropoulou K., Jacewicz P., Kokogiannaki A., La Bruzzo S., Lazzeri E., Lohden A., Manghi P., Mannocci A., Manola N., Ottonello E., Schirrwagen J.
This dump provides access to the metadata records of publications, research data, software and projects that may be relevant to the Corona Virus Disease (COVID-19) fight. The dump contains records of the OpenAIRE COVID-19 Gateway (https://covid-19.openaire.eu/), identified via full-text mining and inference techniques applied to the OpenAIRE Research Graph (https://explore.openaire.eu/). The Graph is one of the largest Open Access collections of metadata records and links between publications, datasets, software, projects, funders, and organizations, aggregating 12,000+ scientific data sources worldwide, among which are the COVID-19 data sources Zenodo COVID-19 Community, WHO (World Health Organization), BIP! Finder for COVID-19, Protein Data Bank, Dimensions, scienceOpen, and RSNA. The dump consists of a gzip file containing one JSON record per line. Each JSON record complies with the schema available at https://doi.org/10.5281/zenodo.3974226
DOI: 10.5281/zenodo.3980490
Project(s): OpenAIRE-Advance via OpenAIRE


See at: CNR IRIS Open Access | CNR IRIS Restricted
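
Before working with a gzip dump of this kind (one JSON record per line), it can help to check which top-level fields the records actually carry. The sketch below tallies the keys of the first records; the sample file, its field names and the `limit` parameter are assumptions for illustration and say nothing about the real schema.

```python
import gzip
import json
import tempfile
from collections import Counter
from pathlib import Path


def top_level_keys(gz_path, limit=1000):
    """Count top-level JSON keys in the first `limit` lines of a gzip JSON-lines file."""
    counts = Counter()
    with gzip.open(gz_path, "rt", encoding="utf-8") as lines:
        for n, line in enumerate(lines):
            if n >= limit:
                break
            if line.strip():
                counts.update(json.loads(line).keys())
    return counts


# Stand-in file with made-up records, so the sketch runs on its own.
sample = Path(tempfile.mkdtemp()) / "sample.json.gz"
with gzip.open(sample, "wt", encoding="utf-8") as out:
    out.write('{"id": "x", "title": "t"}\n{"id": "y"}\n')

counts = top_level_keys(sample)
print(counts)
```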