2018 Software Metadata Only Access

DOIBoost software toolkit 1.0
La Bruzzo S
Research in information science and scholarly communication strongly relies on the availability of openly accessible datasets of metadata and, where possible, their relative payloads. To this end, CrossRef plays a pivotal role by providing free access to its entire metadata collection and allowing other initiatives to link and enrich its information. As a consequence, a number of key pieces of information end up scattered across diverse datasets and resources freely available online. As a result of this fragmentation, researchers in this domain struggle with daily integration problems and produce a plethora of ad-hoc datasets, wasting time and resources and infringing open science best practices. This software package contains the Spark scripts to generate DOIBoost, a metadata collection that enriches CrossRef with inputs from Microsoft Academic Graph, ORCID, and Unpaywall for the purpose of supporting high-quality and robust research.
Project(s): OpenAIRE-Advance via OpenAIRE

See at: CNR IRIS Restricted | zenodo.org

2022 Software Metadata Only Access

Scholexplorer-API
La Bruzzo S
The Scholix API allows clients to run REST queries over the Scholexplorer index in order to fetch links matching given criteria. In the current version, clients can search for: links whose source object has a given PID or PID type; links whose source object has been published by a given data source ("data source as publisher"); and links that were collected from a given data source ("data source as provider").
Project(s): OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: github.com Restricted | CNR IRIS
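The three search criteria listed for the Scholexplorer API can be illustrated with a minimal Python sketch that assembles a query URL. The base URL and the parameter names (`sourcePid`, `sourcePidType`, `sourcePublisher`, `linkProvider`) are hypothetical placeholders chosen for illustration; the record above does not specify the actual endpoint or its parameters.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://example.org/scholix/links"


def build_link_query(source_pid=None, source_pid_type=None,
                     source_publisher=None, link_provider=None):
    """Build a query URL from whichever search criteria are given."""
    params = {
        "sourcePid": source_pid,              # PID of the source object
        "sourcePidType": source_pid_type,     # PID type of the source object
        "sourcePublisher": source_publisher,  # data source as publisher
        "linkProvider": link_provider,        # data source as provider
    }
    # Keep only the criteria the client actually set.
    params = {key: value for key, value in params.items() if value is not None}
    if not params:
        raise ValueError("at least one search criterion is required")
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_link_query(source_pid="10.1234/abc", source_pid_type="doi"))
```

Keeping the criteria optional mirrors the description above, where each criterion can be used to select links on its own.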

2013 Other Open Access

EFG1914 - Documentation of schema mappings and metadata exchange interfaces
Manghi P, La Bruzzo S
The EFG1914 data infrastructure addresses two main requirements identified by the user community: i) a single access point to the European movie archives, and ii) high-quality metadata descriptions. Meeting these requirements is hindered by the highly heterogeneous nature of the archives. The related problems have been solved through the EFG1914 Common Metadata Model, which defines a set of eight interrelated entities. In order to better illustrate the model and the relationships it defines among these entities, this deliverable shows a real-case example about the film "2001: A Space Odyssey" directed by Stanley Kubrick.
Project(s): EFG1914

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2015 Conference article Open Access

On bridging data centers and publishers: the data-literature interlinking service
Burton A, Koers H, Manghi P, La Bruzzo S, Aryani A, Diepenbroek M, Schindler U
Although research data publishing is today widely regarded as crucial for reproducibility and proper assessment of scientific results, several challenges still need to be solved to fully realize its potential. Developing links between the published literature and data sets is one of them. Current solutions are mostly based on bilateral, ad-hoc agreements between publishers and data centers, operating in silos whose content cannot be readily combined to deliver a network connecting research data and literature. The RDA Publishing Data Services Working Group (PDS-WG) aims to address this issue by bringing together different stakeholders to agree on common standards, combine links from disparate sources, and create a universal, open service for collecting and sharing such links: the Data-Literature Interlinking Service. This paper presents the synergic effort of the PDS-WG and the OpenAIRE infrastructure to realize and operate such a service. The Service populates and provides access to a graph of dataset-literature links collected from a variety of major data centers, publishers, and research organizations. At the time of writing, the Service has close to one million links with further contributions expected. Based on feedback from content providers and consumers, PDS-WG will continue to refine the Service data model and exchange format to make it a universal, cross-platform, cross-discipline solution for collecting and sharing dataset-literature links.
Source: COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE (PRINT), pp. 324-335. Manchester, UK, 9-11 September 2015
DOI: 10.1007/978-3-319-24129-6_28
Project(s): OPENAIREPLUS via OpenAIRE, OpenAIRE2020 via OpenAIRE, RDA EUROPE via OpenAIRE, THOR

2017 Journal article Open Access

The data-literature interlinking service: towards a common infrastructure for sharing data-article links
Burton A, Koers H, Manghi P, La Bruzzo S, Aryani A, Diepenbroek M, Schindler U
Purpose: Research data publishing is today widely regarded as crucial for reproducibility, proper assessment of scientific results, and as a way for researchers to get proper credit for sharing their data. However, several challenges need to be solved to fully realize its potential, one of them being the development of a global standard for links between research data and literature. Current linking solutions are mostly based on bilateral, ad hoc agreements between publishers and data centers. These operate in silos so that content cannot be readily combined to deliver a network graph connecting research data and literature in a comprehensive and reliable way. The Research Data Alliance (RDA) Publishing Data Services Working Group (PDS-WG) aims to address this issue of fragmentation by bringing together different stakeholders to agree on a common infrastructure for sharing links between datasets and literature. The paper aims to discuss these issues. Design/methodology/approach: This paper presents the synergic effort of the RDA PDS-WG and the OpenAIRE infrastructure toward enabling a common infrastructure for exchanging data-literature links by realizing and operating the Data-Literature Interlinking (DLI) Service. The DLI Service populates and provides access to a graph of data set-literature links (at the time of writing close to five million, and growing) collected from a variety of major data centers, publishers, and research organizations. Findings: To achieve its objectives, the Service proposes an interoperable exchange data model and format, based on which it collects and publishes links, thereby offering the opportunity to validate such a common approach on real-case scenarios, with real providers and consumers. Feedback from these actors will drive continuous refinement of both the data model and the exchange format, supporting the further development of the Service to become an essential part of a universal, open, cross-platform, cross-discipline solution for collecting and sharing data set-literature links. Originality/value: This realization of the DLI Service is the first technical, cross-community, and collaborative effort in the direction of establishing a common infrastructure for facilitating the exchange of data set-literature links. As a result of its operation and the underlying community effort, a new activity, named Scholix, has been initiated, involving technology-level stakeholders such as DataCite and CrossRef.
Source: PROGRAM, vol. 51 (issue 1), pp. 75-100
DOI: 10.1108/prog-06-2016-0048
Project(s): OpenAIRE2020 via OpenAIRE, RDA EUROPE via OpenAIRE

2019 Dataset Metadata Only Access

OpenAIRE scholeXplorer service: Scholix JSON Dump
La Bruzzo S, Manghi P
This dataset contains the GZ-compressed dump of the Scholix links (schema Version 3) exposed by the OpenAIRE ScholeXplorer service. The dataset has doubled in size since its previous version and consists of 240+ Mi bi-directional links (i.e. 480+ Mi directed links) of type literature-dataset and dataset-dataset, involving 17+ Mi literature objects and 50+ Mi datasets. Links were collected from publishers (CrossRef, EventData), data centers (DataCite), institutional/thematic repositories (OpenAIRE), and life-science databases (EMBL-EBI). The links are organized in ~1000 compressed files, each of at most 50MB, for a total of ~38GB.
Project(s): OpenAIRE-Advance via OpenAIRE, RDA EUROPE via OpenAIRE

See at: CNR IRIS Restricted | zenodo.org
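As a rough illustration of how one file of such a GZ-compressed dump might be consumed, the Python sketch below streams records from a single compressed file. It assumes, as a labeled guess and not something stated in the record above, that each file holds one JSON object per line; the function names are likewise hypothetical. The second helper only restates the arithmetic from the description, where every bi-directional link counts as two directed links.

```python
import gzip
import json


def iter_links(path):
    """Yield one parsed record per non-empty line of a gzip-compressed file.

    Assumption: the file is JSON Lines (one JSON object per line).
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def directed_link_count(bidirectional_links):
    """Each bi-directional link corresponds to two directed links."""
    return 2 * bidirectional_links
```

Streaming line by line keeps memory use flat, which matters when iterating over roughly a thousand files of up to 50MB each.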

2022 Other Open Access

Scholexplorer activity report 2022
La Bruzzo S, Manghi P
Scholexplorer is a service that accepts publication-data or data-data links from validated sources, builds a de-duplicated graph, and provides access to it. Scholexplorer is an implementation of Scholix, an RDA and WDS initiative. This document reports on the operation of the Scholexplorer installations after two years of activity, including a detailed set of indicators.
DOI: 10.32079/isti-tr-2022/035
Project(s): OpenAIRE Nexus via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2011 Other Open Access

HOPE - Heritage of the People's Europe
Bardi A, La Bruzzo S, Zoppi F
The aim of this deliverable is to explain how the D-NET System software, as developed by DRIVER, was customized to meet the requirements of the HOPE System Architecture [HOPEHLD] for the implementation of the HOPE Aggregator. The customization activity consists of: (i) the evaluation and testing of existing D-NET assets, namely the set of services resulting from the DRIVER project, to realize and adapt those needed to implement a federation of repositories compliant with the requirements; (ii) the configuration of existing services for the specific needs of the HOPE project; (iii) the design and implementation of new services. Examples of customization activities include the personalization of the look and feel of the user interface, the implementation of wrappers and harvesting services tailored to specific operational contexts, the customization of the search functionality to serve diverse user needs, and the adaptation of harvesting and aggregator services to deal with the HOPE data model. Section 2 introduces the most relevant characteristics of the D-NET architecture components and services, while Section 3 describes the D-NET adaptation to the HOPE scenario, focusing on those components and services which have been configured, customized or developed to fulfil the HOPE requirements. Section 4 and Section 5 respectively introduce the Aggregator Implementation and Test Plans to formalize the milestones MS24 and MS25 as defined by the HOPE Description of Work. Finally, Appendix A is a Glossary of the technical terms used throughout the document, and Appendix B summarizes the workflow for submitting and curating metadata with the HOPE Aggregator.
Project(s): HOPE - Heritage of the People's Europe

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2012 Other Open Access

OAIzer : customized OAI-ORE and OAI-PMH exports of compound objects for the Fedora repository
Bardi A, La Bruzzo S, Manghi P
Modern Digital Library Systems (DLSs) are based on document models which surpass the traditional payload-metadata document model to incorporate further entities involved in the research life-cycle. Such DLSs manage graphs of interconnected objects, hence offer tools for the creation, visualization and export of such graphs. In particular, objects in the graph are exported via standard OAI-ORE and OAI-PMH protocols, encoded as (XML) packages of "interlinked information objects", also known as compound objects. Fedora is a well-known repository platform, designed to support the realization of DLSs implementing modern document models. To date, Fedora does not provide tools to customize compound object exports from DLS object graphs. This paper presents Fedora-OAIzer, an extension of Fedora which allows DLS developers to customize the structure of compound objects to be exported from a given DLS document model (expressed in terms of Fedora Content Models) and to select the OAI protocol of preference. In order to prove the completeness of the approach, Fedora-OAIzer is compared to other solutions for exporting compound objects from Fedora repositories.
Project(s): OPENAIREPLUS via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2013 Journal article Restricted

See at: CNR IRIS Restricted | CNR IRIS | CNR IRIS

2012 Conference article Open Access

See at: CNR IRIS Open Access | CNR IRIS Restricted | CNR IRIS

2013 Conference article Restricted

OAIzer: configurable OAI exports over relational databases
La Bruzzo S, Manghi P, Bardi A
Modern Digital Library Systems (DLSs) typically support information spaces of interconnected objects, whose graph-like document models surpass the traditional DL payload-metadata document models. Examples are repositories for enhanced publications, CRIS systems, and cultural heritage archives. To enable interoperability, DLSs expose their objects and interlinks with other objects as "export packages", via standard exchange formats (e.g. XML, RDF encodings) and OAI-ORE or OAI-PMH protocols. This paper presents OAIzer, a tool for the easy configuration and automatic deployment of OAI interfaces over an RDBMS-based DLS. Starting from the given relational representation of a document model, OAIzer provides DLS developers with user interfaces for drafting the intended structure of export packages and for the automated deployment of OAI endpoints capable of exporting such packages.
DOI: 10.1007/978-3-319-03437-9_5
Project(s): OPENAIREPLUS via OpenAIRE

See at: doi.org Restricted | CNR IRIS | CNR IRIS | link.springer.com
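To give a sense of what a client sends to an OAI-PMH endpoint such as those the OAIzer record describes, the Python sketch below builds a `ListRecords` request URL. The `verb`, `metadataPrefix`, and `set` arguments are standard OAI-PMH request parameters; the base URL and the helper name are hypothetical, and nothing here reflects how OAIzer itself is implemented.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a given endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # example.org stands in for a real endpoint.
    print(list_records_url("https://example.org/oai", set_spec="packages"))
```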

2013 Other Open Access

OPENAIREPLUS - Functional specification for the data curation services
Manghi P, Artini M, La Bruzzo S
This deliverable presents the functionality and the high-level architecture of the End-User Feedback Services. The services provide end-users with tools for improving the quality of the Information Space by submitting fixes and enrichments, which are validated by OpenAIRE data curators prior to application.
Project(s): OPENAIREPLUS via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2013 Conference article Restricted

See at: CNR IRIS Restricted | CNR IRIS

2016 Other Open Access

A subscription and notification broker for scholarly communication: functionalities and architecture
Artini M, Atzori C, La Bruzzo S
The OpenAIRE infrastructure services populate and provide access to a graph of objects relative to publications, datasets, people, organizations, projects, and funders aggregated from a variety of data sources. Moreover, objects in the graph are harmonized to achieve semantic homogeneity, de-duplicated and merged, and enriched by inference with missing properties and/or relationships. The OpenAIRE Literature Broker Service (OLBS) is designed to offer subscription and notification functionalities for institutional repositories to: (i) learn about publication objects in OpenAIRE that do not appear in their collection but may be pertinent to it, and (ii) learn about extra properties or relationships relative to publication objects in their collection. Due to the high variability of the information space, the following problems may arise: (i) subscriptions may vary over time to adapt to information space evolution, (ii) repository managers need to be able to quickly test their configurations before activating them, (iii) notifications may be redundant, and (iv) notifications may become very large over time. This paper presents the data model and software architecture of the OLBS, specifically designed to address these issues.
Project(s): OpenAIRE2020 via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2014 Other Open Access

The OpenAIRE action manager framework
Artini M, Atzori C, La Bruzzo S
The OpenAIRE infrastructure offers services for collecting records (publications, datasets, persons, organizations, data sources, projects) from external data sources with the purpose of identifying relationships between them. The collected objects and their relationships are stored in HBase according to the OpenAIRE data model. The Action Manager framework has been designed to offer an API, oriented to the OpenAIRE data model, for enriching and fixing the OpenAIRE HBase information space.
Project(s): OPENAIRE via OpenAIRE

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2012 Conference article Open Access

Customized OAI-ORE and OAI-PMH exports of compound objects for the Fedora repository
Bardi A, La Bruzzo S, Manghi P
Modern Digital Library Systems (DLSs) are based on document models which surpass the traditional payload-metadata document model to incorporate further entities involved in the research life-cycle. Such DLSs manage graphs of interconnected objects, hence offer tools for the creation, visualization and export of such graphs. In particular, objects in the graph are exported via standard OAI-ORE and OAI-PMH protocols, encoded as (XML) packages of "interlinked information objects", also known as compound objects. Fedora is a well-known repository platform, designed to support the realization of DLSs implementing modern document models. To date, Fedora does not provide tools to customize compound object exports from DLS object graphs. This paper presents Fedora-OAIzer, an extension of Fedora which allows DLS developers to customize the structure of compound objects to be exported from a given DLS document model (expressed in terms of Fedora Content Models) and to select the OAI protocol of preference. In order to prove the completeness of the approach, Fedora-OAIzer is compared to other solutions for exporting compound objects from Fedora repositories.

See at: CNR IRIS Open Access | ISTI Repository | CNR IRIS Restricted

2019 Conference article Open Access

OpenAIRE's DOIBoost - Boosting Crossref for Research
La Bruzzo S, Manghi P, Mannocci A
Research in information science and scholarly communication strongly relies on the availability of openly accessible datasets of scholarly entities metadata and, where possible, their relative payloads. Since such metadata information is scattered across diverse, freely accessible, online resources (e.g. Crossref, ORCID), researchers in this domain are doomed to struggle with (meta)data integration problems, in order to produce custom datasets of often undocumented and rather obscure provenance. This practice leads to waste of time and duplication of effort, and typically infringes open science best practices of transparency and reproducibility of science. In this article, we describe how to generate DOIBoost, a metadata collection that enriches Crossref with inputs from Microsoft Academic Graph, ORCID, and Unpaywall for the purpose of supporting high-quality and robust research experiments, saving researchers time, and enabling the comparison of such experiments. To this end, we describe the dataset value and its schema, analyse its actual content, and share the software Toolkit and experimental workflow required to reproduce it. The DOIBoost dataset and Software Toolkit are made openly available via Zenodo.org. DOIBoost will become an input source to the OpenAIRE information graph.
DOI: 10.1007/978-3-030-11226-4_11
Project(s): OpenAIRE-Advance via OpenAIRE

2018 Dataset Metadata Only Access

DOIBoost Dataset Dump
La Bruzzo S, Manghi P, Mannocci A
Research in information science and scholarly communication strongly relies on the availability of openly accessible datasets of metadata and, where possible, their relative payloads. To this end, CrossRef plays a pivotal role by providing free access to its entire metadata collection and allowing other initiatives to link and enrich its information. As a consequence, a number of key pieces of information end up scattered across diverse datasets and resources freely available online. As a result of this fragmentation, researchers in this domain struggle with daily integration problems and produce a plethora of ad-hoc datasets, wasting time and resources and infringing open science best practices. DOIBoost is a metadata collection that enriches CrossRef with inputs from Microsoft Academic Graph, ORCID, and Unpaywall for the purpose of supporting high-quality and robust research experiments, saving researchers time, and enabling the comparison of such experiments.
Project(s): OpenAIRE-Advance via OpenAIRE

See at: CNR IRIS Restricted | zenodo.org

2021 Conference article Open Access

BIP! DB: a dataset of impact measures for scientific publications
Vergoulis T, Kanellos I, Atzori C, Mannocci A, Chatzopoulos S, La Bruzzo S, Manola N, Manghi P
The growth rate of the number of scientific publications is constantly increasing, creating important challenges in the identification of valuable research and in various scholarly data management applications, in general. In this context, measures which can effectively quantify the scientific impact could be invaluable. In this work, we present BIP! DB, an open dataset that contains a variety of impact measures calculated for a large collection of more than 100 million scientific publications from various disciplines.
DOI: 10.1145/3442442.3451369
DOI: 10.48550/arxiv.2101.12001
Project(s): OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE
