(Semi)automated disambiguation of scholarly repositories Baglioni M., Mannocci A., Pavone G., De Bonis M., Manghi P. The full exploitation of scholarly repositories is pivotal in modern Open Science, and scholarly repository registries are kingpins in enabling researchers and research infrastructures to list and search for suitable repositories. However, since multiple registries exist, repository managers are keen on registering the repositories they manage multiple times to maximise their traction and visibility across different research communities, disciplines, and applications. These multiple registrations ultimately lead to information fragmentation and redundancy on the one hand and, on the other, force registries' users to juggle multiple registries, profiles and identifiers describing the same repository. Such problems are known to registries, which claim equivalence between repository profiles whenever possible by cross-referencing their identifiers across different registries. However, as we will see, this "claim set" is far from complete and, therefore, many replicas slip under the radar, possibly creating problems downstream. In this work, we combine such claims to create duplicate sets and extend them with the results of an automated clustering algorithm run over repository metadata descriptions. Then we manually validate our results to produce an "as accurate as possible" de-duplicated dataset of scholarly repositories. Source: IRCDL 2023 - 19th conference on Information and Research Science Connecting to Digital and Library Science, pp. 47–59, Bari, Italy, 23-24/02/2023 Project(s): OpenAIRE Nexus
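The claim-combination step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that each claim is a pair of profile identifiers asserted to describe the same repository and that claims are combined transitively (connected components via union-find). The abstract does not state which procedure the authors used, so this is only one plausible reading. The function name `duplicate_sets` and the identifier strings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def duplicate_sets(claims: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group profile identifiers linked (directly or transitively) by claims.

    Assumption: a claim (a, b) means registry profiles a and b describe
    the same repository. Identifiers are opaque strings.
    """
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        # Register unseen identifiers as their own root.
        parent.setdefault(x, x)
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in claims:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups: Dict[str, Set[str]] = defaultdict(set)
    for identifier in list(parent):
        groups[find(identifier)].add(identifier)

    # Only groups with at least two profiles are duplicate sets.
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical identifiers, for illustration only.
example_claims = [
    ("registryA:r1", "registryB:42"),
    ("registryB:42", "registryC:x"),
    ("registryA:r9", "registryB:7"),
]
example_sets = duplicate_sets(example_claims)
```

In this sketch the first two claims share a profile, so the three identifiers involved end up in one set. Pairs found by a clustering step could be fed through the same function as additional claims.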
Open Science repository platforms Manghi P., Artini M., La Bruzzo S., Ottonello E., Pavone G. Institutional and thematic repositories today play a key role in scholarly communication and more broadly in scientific workflows. Many institutions and communities have set the ambitious goal of providing an open access repository for their community of users. However, given the number of expectations from their users, choosing the right solution is often non-trivial. Some platforms may be served out-of-the-box, to be put in operation after straightforward configurations, but are in general less customizable to adhere to specific functional, non-functional, or contextual needs. Other platforms may instead be extremely customizable and flexible but require skilled personnel for their adaptation and deployment.
This report analyses existing state-of-the-art Open Source repository solutions from the functional, operational, and software perspectives. Based on this analysis, it sets out the pros and cons of such solutions and identifies typical scenarios of adoption. Source: ISTI Technical Report, ISTI-2022-TR/009, 2022 DOI: 10.32079/isti-tr-2022/009 Project(s): OpenAIRE Nexus
Bioschemas data sources aggregation to OpenAIRE Research Graph Ottonello E., Artini M., La Bruzzo S., Pavone G. In this report we propose an extended Hadoop-based aggregator for the harvesting of Bioschemas data sources. In this extended aggregator, the downloaded data will be processed according to the consolidated data flow: the original contents will be mapped onto an internal representation that will make them eligible to be integrated in the OpenAIRE Research Graph. Source: ISTI Technical Report, ISTI-2022-TR/010, 2022 DOI: 10.32079/isti-tr-2022/010 Project(s): EOSC Future, OpenAIRE Nexus
Practical course on FAIR Data Stewardship in Life Science Pavone G., Lazzeri E., Carta C., Chiara M., Quaglia F. The course on FAIR Data Stewardship in Life Science was organised by ELIXIR Italy, the EOSC-Pillar Project, the ICDI Competence Centre, and EOSC-Life, in collaboration with ELIXIR Netherlands. The course focuses on FAIR data management and stewardship. It provides the skills, tools, and standards required to embed Open Science in the research workflow, covering topics such as Open Science in Horizon Europe, Open Access in publishing, Research Data Management (RDM), FAIR principles, Open Data, and the Data Management Plan. The course was structured in 5 online training modules, each built on lectures and several interactive sessions.
This record includes both slides from the lesson modules and additional material (exercises, Mentimeter).
Claudio Carta - ELIXIR-IT, Istituto Superiore di Sanità - Roma, Italy
Matteo Chiara - ELIXIR-IT, Dept. of Biosciences, University of Milano, Italy
Mijke Jetten/Celia van Gelder - ELIXIR-NL, Dutch Techcentre for Life Sciences (DTL), Netherlands
Emma Lazzeri - ICDI, GARR, Roma, Italy
Gina Pavone - ICDI, National Research Council - ISTI, Pisa, Italy
Federica Quaglia - University of Padua, Italy
The course is aimed at Italian life-science researchers, technicians, data stewards, and data managers, especially if they are involved in projects that require planning and writing a Data Management Plan.
Course dates: 2, 4, 7, 9 and 11 March 2022, 14:30 - 17:30 CET
Emma Lazzeri - ICDI, GARR, Roma, Italy
Loredana Le Pera - ELIXIR-IT, Istituto Superiore di Sanità - FAST, Roma, Italy
Ivan Miçetic - ELIXIR-IT, University of Padua, Italy
Gina Pavone - ICDI, National Research Council - ISTI, Pisa, Italy
Allegra Via - ELIXIR-IT, National Research Council - IBPM, Roma, Italy
Course page on ELIXIR-IT website: https://elixir-italy.org/event/practical-course-on-fair-data-stewardship-in-life-science/
Course page on EOSC-Pillar website: https://www.eosc-pillar.eu/events/practical-course-fair-data-stewardship-life-science
Course page on ICDI website: https://www.icdi.it/it/news/123-corso-pratico-sulla-gestione-dei-dati-fair-nella-scienza-della-vita
Project(s): EOSC-Pillar, ELIXIR-EXCELERATE, EOSC-Life
Open Science in research projects Pavone G., Lazzeri E. Series of three seminars on Open Science in research projects held at the University of Siena.
Day 1: Open Science and research assessment
Day 2: Research Data Management
Day 3: Horizon Europe, EOSC and Research Infrastructures
Presentations are in English while the course outline with the detailed description of modules and interactions is in Italian. Project(s): EOSC-Pillar, OpenAIRE-Advance
Research Data Management basics: why it is essential to take care of data Pavone G., Lazzeri E. Managing research data properly is useful for researchers themselves and fundamental to creating an ecosystem where data can be valued by multiple actors and from multiple perspectives. The presentation was given as an introductory and motivational training webinar in which Research Data Management was framed as one of the practices of Open Science. In addition, the steps needed to start designing a shared research data management workflow within CISUP were discussed. CISUP is the Center for Instrument Sharing of the University of Pisa, an interdepartmental laboratory platform offering access to a wide range of analytical instrumentation for life and physical science researchers: https://cisup.unipi.it/ Source: Webinar in collaboration with CISUP, online, 20/01/2022 DOI: 10.5281/zenodo.5887497 Project(s): EOSC-Pillar, OpenAIRE-Advance
FAIR principles and Data Management Plan Pavone G. Three modules of the Open Science course for PhD students, University of Pisa, year 2022.
Agenda of the 3 lectures:
What FAIR principles are and their value for research
Why do we need "fairness", some examples and hands-on session
Data Management Plan: what it is, how to do it, some examples and hands-on session
Project(s): EOSC-Pillar
InfraScience research activity report 2021 Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Bove P., Candela L., Casini G., Castelli D., Cirillo R., Coro G., De Bonis M., Debole F., Dell'Amico A., Frosini L., La Bruzzo S., Lazzeri E., Lelii L., Manghi P., Mangiacrapa F., Mangione D., Mannocci A., Ottonello E., Pagano P., Panichi G., Pavone G., Piccioli T., Sinibaldi F., Straccia U. InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2021 to highlight the major results. In particular, the InfraScience group addressed research challenges characterising Data Infrastructures, eScience, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large-scale infrastructures for Open Science, i.e. D4Science and OpenAIRE.
During 2021 InfraScience members contributed to the publishing of 25 papers, to the research and development activities of 18 research projects (15 funded by the EU), to the organization of conferences and training events, and to several working groups and task forces. Source: ISTI Annual report, 2022 DOI: 10.32079/isti-ar-2022/001 Project(s): ARIADNEplus, Blue Cloud, PerformFISH, EOSC-Pillar, DESIRA, EOSC Future, EOSCsecretariat.eu, EcoScope, RISIS 2, OpenAIRE-Advance, OpenAIRE Nexus, SoBigData-PlusPlus
A primer on open science-driven repository platforms Bardi A., Manghi P., Mannocci A., Ottonello E., Pavone G. Following Open Science mandates, institutions and communities increasingly demand repositories with native support for publishing scientific literature together with research data, software, and other research products. Such repositories may be thematic or general-purpose and are deeply integrated with the scholarly communication ecosystem to ensure versioning, persistent identifiers, data curation, usage stats, and so on. Identifying the most suitable off-the-shelf repository platform is often a non-trivial task as the choice depends on functional requirements, programming and technical skills, and infrastructure resources.
This work analyses four state-of-the-art Open Source repository platforms, namely Dryad, Dataverse, DSpace, and InvenioRDM, from both a functional and a software perspective. This work intends to provide an overview serving as a primer for choosing repository platform solutions in different application scenarios. Moreover, this paper highlights how these platforms reacted to some key Open Science demands, moving away from the original and old-fashioned concept of a repository serving as a static container of files and metadata. Source: MTSR 2022 - International Conference on Metadata and Semantics Research, London, UK, 07-11/11/2022 Project(s): OpenAIRE Nexus
Data model description of the OpenAIRE Research Graph La Bruzzo S. F., Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Mannocci A., Manghi P., Pavone G. The OpenAIRE Graph (formerly known as the OpenAIRE Research Graph) is one of the largest open scholarly record collections worldwide, key to fostering Open Science and establishing its practices in daily research activities. Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back into the hands of the scientific community.
Imagine a vast collection of research products all linked together, contextualized, and openly available. Over the past years, OpenAIRE has been working to gather this valuable record. It is a massive collection of metadata and links between scientific products such as articles, datasets, software, and other research products, and entities like organizations, funders, funding streams, projects, communities, and data sources. This technical report describes the public data model adopted by the OpenAIRE Graph. Source: ISTI Technical Report, ISTI-2022-TR/031, 2022 DOI: 10.32079/isti-tr-2022/031
OpenAIRE Research Graph: aggregation workflow La Bruzzo S. F., Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Dell'Amico A., Mannocci A., Manghi P., Pavone G. The OpenAIRE Graph (formerly the OpenAIRE Research Graph) is one of the largest open scholarly record collections worldwide. It is key in fostering Open Science and establishing its practices in daily research activities. Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back into the hands of the scientific community. OpenAIRE collects metadata records from more than 70K scholarly communication sources worldwide, including Open Access institutional repositories, data archives, and journals. All the metadata records (i.e., descriptions of research products) are put together in a data lake with records from Crossref, Unpaywall, ORCID, ROR, and information about projects provided by national and international funders. This technical report describes the main aggregation workflow that orchestrates the data aggregation, and the mapping implemented from some of the main data sources into the OpenAIRE Research Graph data model. Source: ISTI Technical Report, ISTI-2022-TR/033, 2022 DOI: 10.32079/isti-tr-2022/033 Project(s): OpenAIRE-Advance, OpenAIRE Nexus
OpenAIRE Research Graph deduplication workflow La Bruzzo S. F., Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Mannocci A., Manghi P., Pavone G. The OpenAIRE aggregation workflow can collect metadata records from different providers about the same scholarly work. Each metadata record can carry different information because, for example, some providers are not aware of links to projects, keywords, or other details. Another typical case is when OpenAIRE collects one metadata record from a repository about a pre-print and another from a journal about the published article. To provide correct statistics, OpenAIRE must identify those cases and "merge" the two metadata records so that the scholarly work is counted only once in the statistics OpenAIRE produces. This technical report describes the deduplication workflow and technique adopted to deduplicate the OpenAIRE Graph. Source: ISTI Technical Report, ISTI-2022-TR/032, 2022 DOI: 10.32079/isti-tr-2022/032 Project(s): OpenAIRE-Connect, OpenAIRE Nexus
OpenOrgs: a tool for the disambiguation of organizations Artini M., La Bruzzo S. F., De Bonis M., Pavone G. Organizations appear all over the Research & Innovation ecosystem in different shapes and formats: the same organization may appear with different metadata fields and different names (e.g., full legal name, short or alternative names, acronym). This ambiguity severely hampers the exchange of information, the findability of research products, the monitoring of activities, and ultimately the building of a linked open scholarly communication system. OpenOrgs combines an automated process and human curation to compensate for the lack of available information and improve the discoverability of organizations. Source: ISTI Technical Report, ISTI-2022-TR/034, 2022 DOI: 10.32079/isti-tr-2022/034 Project(s): OpenAIRE-Advance, OpenAIRE Nexus
InfraScience research activity report 2022 Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Bove P., Candela L., Casini G., Castelli D., Cirillo R., Coro G., De Bonis M., Debole F., Dell'Amico A., Frosini L., La Bruzzo S., Lelii L., Manghi P., Mangiacrapa F., Mangione D., Mannocci A., Ottonello E., Pagano P., Panichi G., Pavone G., Piccioli T., Sinibaldi F., Straccia U., Zoppi F. InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2022 to highlight the major results. In particular, the InfraScience group addressed research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large-scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2022 InfraScience members contributed to the publishing of several papers, to the research and development activities of 18 research projects (15 funded by the EU), to the organization of conferences and training events, and to several working groups and task forces. Source: ISTI Annual reports, 2022 Project(s): ARIADNEplus, Blue Cloud, EOSC-Pillar, DESIRA, EOSC Future, RISIS 2, TAILOR, SoBigData-PlusPlus
Designing an online outreach event. The experience of "Fai una domanda su Covid-19, gli esperti rispondono" Pavone G., Lazzeri E., Circella M., De Leo F. This document collects the questions received and the answers given during the webinar "Fai una domanda su Covid-19, gli esperti rispondono" ("Ask a question about Covid-19, the experts answer"), held on 27 November 2020.
The report also describes the main aspects of organising an "Ask Me Anything" webinar, conceived as an online information event on the Covid-19 pandemic aimed at young people.
Several months after the start of the pandemic, many doubts and uncertainties were still widespread among the public, even about aspects that scientific research had already established and clarified.
For the 2020 edition of the European Researchers' Night, an outreach effort was made to foster openness and dialogue with several experts engaged in research on SARS-CoV-2 and Covid-19. Source: ISTI Technical Report, ISTI-2021-TR/002, 2021 DOI: 10.32079/isti-tr-2021/002 Project(s): ELIXIR-EXCELERATE, ERN-Apulia, OpenAIRE-Advance
Practicing Open Science in Earth and Environmental Sciences Lazzeri E., Cocco M., Bailo D., Sarretta A., Locati M., Pavone G. A cycle of four webinars on Open Science and Open Access for Earth and environmental sciences, with discipline-specific tools and practical resources.
- Introduction and motivations
- Open Science in Solid Earth Science
- Research Data Management
- OS in solid Earth sciences: the EPOS research infrastructure experience
- FAIR principles and Open Data
- Implementing FAIR. Considerations from the solid Earth domain
- The Data Management Plan
- The adoption of Open Science Paradigm at INGV
- Practical Tips
Project(s): EOSC-Pillar, EPOS IP, OpenAIRE-Advance
OpenAIRE Covid-19 publications, datasets, software and projects metadata Bardi A., Kuchma I., Pavone G., Artini M., Atzori C., Backer A., Baglioni M., Czerniak A., De Bonis M., Dimitropoulos H., Foufoulas I., Horst M., Iatropoulou K., Jacewicz P., Kokogiannaki A., La Bruzzo S., Lazzeri E., Lohden A., Manghi P., Mannocci A., Manola N., Ottonello E., Schirrwagen J. This dump provides access to the metadata records of publications, research data, software and projects that may be relevant to the fight against Coronavirus Disease (COVID-19). The dump contains records of the OpenAIRE COVID-19 Gateway (https://covid-19.openaire.eu/), identified via full-text mining and inference techniques applied to the OpenAIRE Research Graph (https://explore.openaire.eu/). The Graph is one of the largest Open Access collections of metadata records and links between publications, datasets, software, projects, funders, and organizations, aggregating 12,000+ scientific data sources worldwide, among which are the Covid-19 data sources Zenodo COVID-19 Community, WHO (World Health Organization), BIP! Finder for COVID-19, Protein Data Bank, Dimensions, scienceOpen, and RSNA.
The dump consists of a gzip file containing one JSON record per line. Each record is compliant with the schema available at https://doi.org/10.5281/zenodo.3974226 DOI: 10.5281/zenodo.3980490 Project(s): OpenAIRE-Advance
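A dump in the format described above (a gzip file with one JSON record per line) can be read as a stream, without decompressing it to disk. The following is a minimal sketch using only the Python standard library. The helper name `iter_records` and the file name in the usage comment are hypothetical, and no assumption is made here about the fields of each record, which are defined by the schema linked in the entry.

```python
import gzip
import json
from typing import Any, Dict, Iterator


def iter_records(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of a gzip file."""
    # "rt" opens the compressed stream in text mode, so lines are str.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines defensively
                yield json.loads(line)


# Usage (the file name is a placeholder, not the real dump name):
# for record in iter_records("covid19_dump.json.gz"):
#     print(sorted(record.keys()))
```

Reading lazily keeps memory use flat regardless of dump size, at the cost of a single forward pass over the file.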
Open Science and European funding: how to comply with the obligations in H2020 and Horizon Europe projects Lazzeri E., Giglia E., Pavone G. The European Commission has long embraced Open Science and its participatory, collaborative vision of scientific work. All beneficiaries of European research funding are required to meet the requirements on Open Access to publications, data, and other research outputs. In practice, making research outputs available means making considered choices and being able to tell apart the available alternatives and the tools to use, as well as understanding the reasons behind the choice of an open and transparent model of scholarly communication among peers. The OpenAIRE National Open Access Desk for Italy, in collaboration with the ICDI Competence Centre, organised a course for coordinators of H2020 and Horizon Europe projects, covering all the main Open Science and Open Access issues related to European funding. The course consisted of a cycle of 4 webinars held in February 2021. Each lesson included time for discussion and for answering participants' questions.
Course programme:
Module 1 - Introduction and motivations
- Motivations: how science works today, research assessment, open issues
- The Open Science alternative
- The European Commission and Open Science
- Services and support for researchers in Italy and Europe: ICDI and OpenAIRE
- Questions and discussion (30 minutes)
Module 2 - Open Access and Open Data
- What Open Access to publications is and how to do it
- Open Data Pilot
- The importance of managing research data
- Questions and discussion (30 minutes)
Module 3 - Data Management Plan and FAIR principles
- FAIR principles
- What the Data Management Plan is and what it is for
- Examples of DMPs
- Tools for the DMP
- Questions and discussion (30 minutes)
Module 4 - Practical session
- Using tools and services to comply with the obligations
- Integrating good Open Science practices into daily work
- Questions and discussion (30 minutes)
Organisers: Emma Lazzeri, ISTI-CNR; Gina Pavone, ISTI-CNR; Catherine Bosio, ISTI-CNR. DOI: 10.5281/zenodo.4570624 Project(s): OpenAIRE-Advance
ICDI Competence Centre for Open Science, FAIR and EOSC - Mission, strategy and action plan Lazzeri E., Tanlongo F., Pavone G., Alpi F., Ansuini A., Bertazzon E., Bonaccorsi D., Cappelluti F., Casati S., Castelli D., Cippitani R., Colcelli V., Costantini A., Cozzini S., Degl'Innocenti E., Di Donato F., Di Giorgio S., Fava I., Fiore S., Forni M., Galimberti G., Giglia E., Giorgetti A., Kurapati S., Landoni M., Lavitrano M., Marras C., Niccolucci F., Occioni M., Osmenaj E., Paolini G., Pasquale V., Petrillo C., Pugliese R., Ripepi E., Rivoira G., Rossi G., Salon S., Sarretta A., Sartori A., Spiga D., Tamagno D., Tammaro A. M., Vellico M., Vignocchi M., Zane D. This document presents the mission and strategy of the Italian Competence Centre on Open Science, FAIR, and EOSC. The Competence Centre is an initiative born within the Italian Computing and Data Infrastructure (ICDI), a forum created by representatives of major Italian Research Infrastructures and e-Infrastructures, with the aim of promoting synergies at the national level and optimising Italian participation in European and global challenges in this field, including the European Open Science Cloud (EOSC), the European Data Infrastructure (EDI) and HPC.
This working paper sets out the mission and objectives of the ICDI Competence Centre, a network of experts with various skills and competencies who support national stakeholders on topics related to Open Science, the application of FAIR principles, and participation in the EOSC. The document describes the different actors and roles, the activities and services offered, and the added value each stakeholder can find in the Competence Centre. It also presents the tools and services provided, in particular the concept for the portal through which the Centre will connect to the national landscape and users.
This record is the English translation of the Italian original: Competence Centre ICDI per Open Science, FAIR, ed EOSC - Mission, Strategia e piano d'azione (2021). Zenodo. https://doi.org/10.5281/zenodo.5071055 Source: ISTI Technical Report, ISTI-2021-TR/023, 2021 DOI: 10.32079/isti-tr-2021/023