2023 Contribution to book Open Access

A primer on open science-driven repository platforms
Bardi A., Manghi P., Mannocci A., Ottonello E., Pavone G.
Following Open Science mandates, institutions and communities increasingly demand repositories with native support for publishing scientific literature together with research data, software, and other research products. Such repositories may be thematic or general-purpose and are deeply integrated with the scholarly communication ecosystem to ensure versioning, persistent identifiers, data curation, usage stats, and so on. Identifying the most suitable off-the-shelf repository platform is often a non-trivial task as the choice depends on functional requirements, programming and technical skills, and infrastructure resources. This work analyses four state-of-the-art Open Source repository platforms, namely Dryad, Dataverse, DSpace, and InvenioRDM, from both a functional and a software perspective. This work intends to provide an overview serving as a primer for choosing repository platform solutions in different application scenarios. Moreover, this paper highlights how these platforms reacted to some key Open Science demands, moving away from the original and old-fashioned concept of a repository serving as a static container of files and metadata.
Source: Metadata and Semantic Research, edited by Garoufallou E., Vlachidis A., pp. 222–234, 2023
DOI: 10.1007/978-3-031-39141-5_19
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ISTI Repository Open Access | link.springer.com Restricted | CNR ExploRA

2023 Conference article Open Access

(Semi)automated disambiguation of scholarly repositories
Baglioni M., Mannocci A., Pavone G., De Bonis M., Manghi P.
The full exploitation of scholarly repositories is pivotal in modern Open Science, and scholarly repository registries are kingpins in enabling researchers and research infrastructures to list and search for suitable repositories. However, since multiple registries exist, repository managers are keen on registering the repositories they manage multiple times to maximise their traction and visibility across different research communities, disciplines, and applications. These multiple registrations ultimately lead to information fragmentation and redundancy on the one hand and, on the other, force registries' users to juggle multiple registries, profiles, and identifiers describing the same repository. Such problems are known to registries, which claim equivalence between repository profiles whenever possible by cross-referencing their identifiers across different registries. However, as we will see, this "claim set" is far from complete and, therefore, many replicas slip under the radar, possibly creating problems downstream. In this work, we combine such claims to create duplicate sets and extend them with the results of an automated clustering algorithm run over repository metadata descriptions. Then we manually validate our results to produce an "as accurate as possible" de-duplicated dataset of scholarly repositories.
Source: IRCDL 2023 - 19th conference on Information and Research Science Connecting to Digital and Library Science, pp. 47–59, Bari, Italy, 23-24/02/2023
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository | CNR ExploRA
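
The entry above describes combining registries' equivalence claims into duplicate sets and then extending those sets with pairs found by automated clustering. A minimal sketch of that grouping step follows, assuming each claim is a pair of profile identifiers and using a union-find structure to take the transitive closure. The identifiers (`regA:1` and so on) and the function name `duplicate_sets` are hypothetical, and this is an illustration of the general idea, not the authors' implementation.

```python
from collections import defaultdict


def duplicate_sets(claims):
    """Group profile identifiers that are linked, directly or transitively,
    by equivalence claims. `claims` is an iterable of (id_a, id_b) pairs."""
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in claims:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two sets

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    # Keep only real duplicate sets (two or more profiles), in a stable order.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical cross-registry claims (identifiers are made up for illustration).
registry_claims = [("regA:1", "regB:7"), ("regB:7", "regC:3"), ("regA:9", "regD:2")]
# Hypothetical extra pair suggested by a clustering step over metadata.
clustering_pairs = [("regD:2", "regC:3")]

claimed = duplicate_sets(registry_claims)
extended = duplicate_sets(registry_claims + clustering_pairs)
```

Here `claimed` holds two duplicate sets, while `extended` merges them into one because the extra pair bridges them; this is why pairs produced by clustering would need manual validation before being accepted.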

2023 Journal article Open Access

A novel curated scholarly graph connecting textual and data publications
Irrera O., Mannocci A., Manghi P., Silvello G.
In the last decade, scholarly graphs became fundamental to storing and managing scholarly knowledge in a structured and machine-readable way. Methods and tools for discovery and impact assessment of science rely on such graphs and their quality to serve scientists, policymakers, and publishers. Since research data became very important in scholarly communication, scholarly graphs started including dataset metadata and their relationships to publications. Such graphs are the foundations for Open Science investigations, data-article publishing workflows, discovery, and assessment indicators. However, due to the heterogeneity of practices (FAIRness is indeed in the making), they often lack the complete and reliable metadata necessary to perform accurate data analysis; e.g., dataset metadata is inaccurate, author names are not uniform, and the semantics of the relationships is unknown, ambiguous, or incomplete. This work describes an open and curated scholarly graph we built and published as a training and test set for data discovery, data connection, author disambiguation, and link prediction tasks. Overall, the graph contains 4,047 publications, 5,488 datasets, 22 software items, and 21,561 authors; 9,692 edges interconnect publications to datasets and software and are labeled with semantics that outline whether a publication is citing, referencing, documenting, or supplementing another product. To ensure high-quality metadata and semantics, we relied on the information extracted from the PDFs of the publications and from the dataset and software webpages to curate and enrich node metadata and edge semantics. To the best of our knowledge, this is the first published resource that includes publications and datasets with manually validated and curated metadata.
Source: ACM journal of data and information quality (Online) 15 (2023). doi:10.1145/3597310
DOI: 10.1145/3597310
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ISTI Repository Open Access | dl.acm.org Restricted | CNR ExploRA

2023 Conference article Open Access

A discovery hub for Diamond Open Access publishing
Bardi A., Bargheer M., Manghi P.
Open Access (OA) publishing is the set of practices thanks to which research publications are freely accessible without barriers. With Diamond Open Access, authors can publish free of charge because the institutional sector (universities, research institutions, or libraries) provides the necessary technological infrastructure. However, the Diamond OA landscape continues to be fragmented, is often underfunded, and is not always technically proficient enough to develop its full potential for science and society. The CRAFT-OA project, started in January 2023, aims to consolidate the Diamond OA publishing landscape from both the technical and the organisational points of view. In this paper we describe the context and architecture of the Diamond Discovery Hub that will be released by the project to increase the visibility, discoverability, and recognition of Diamond OA institutional publishers and their content. The Diamond Discovery Hub will facilitate the integration with the wider scholarly communication ecosystem and the European Open Science Cloud to enlarge the visibility, discoverability, and reach of open access publications as part of the emerging Open Science paradigm.
Source: IRCDL 2023 - 19th Conference on Information and Research Science Connecting to Digital and Library Science, pp. 162–166, Bari, Italy, 23-24/02/2023

See at: ceur-ws.org Open Access | ISTI Repository | CNR ExploRA

2023 Journal article Open Access

Data management plans as linked open data: exploiting ARGOS FAIR and machine actionable outputs in the OpenAIRE research graph
Papadopoulou E., Bardi A., Kakaletris G., Tziotzios D., Manghi P., Manola N.
Open Science Graphs (OSGs) are scientific knowledge graphs representing different entities of the research lifecycle (e.g. projects, people, research outcomes, institutions) and the relationships among them. They present a contextualized view of current research that supports discovery, re-use, reproducibility, monitoring, transparency, and omni-comprehensive assessment. A Data Management Plan (DMP) contains information concerning both the research processes and the data collected, generated, and/or re-used during a project's lifetime. Automated solutions and workflows that connect DMPs with the actual data and other contextual information (e.g., publications, funding) are missing from the landscape. Submitting DMPs as deliverables also limits their findability. In an open and FAIR-enabling research ecosystem, information linking between research processes and research outputs is essential. The ARGOS tool for FAIR data management contributes to the OpenAIRE Research Graph (RG) and utilises its underlying services and trusted sources to progressively automate validation and automations of Research Data Management (RDM) practices.
Source: Journal of biomedical semantics 14 (2023). doi:10.1186/s13326-023-00297-5
DOI: 10.1186/s13326-023-00297-5

See at: jbiomedsem.biomedcentral.com Open Access | CNR ExploRA

2023 Contribution to conference Open Access

OpenAIRE, comunità e servizi per praticare la scienza aperta [OpenAIRE: community and services for practising open science]
Pavone G., Atzori C., Baglioni M., Bardi A., Manghi P., Castelli D.
Practising research according to Open Science principles requires both technologies, with infrastructures that enable and facilitate collaboration and the massive exchange of information on an international scale, and skills that make it possible to maximise their use and results. In other words, services, exchange of expertise, and training are needed. The work of OpenAIRE (Open Access Infrastructure for Research in Europe) focuses on these lines: it is the European infrastructure for Open Science, offering technological services and a European network of exchange and synergy to foster open science. Launched as a European project in 2009 to monitor Open Access, the initiative has been refinanced over the years and its scope extended to all components of Open Science. In 2018 it was established as a non-profit organisation to guarantee a permanent structure supporting national and European Open Science policies. The OpenAIRE network counts over 40 members among research centres, universities, foundations, and service providers distributed across Europe. As a community of practice, OpenAIRE's mission is to establish and operate an infrastructure that supports open and sustainable scholarly communication, providing the services, resources, and coordination of initiatives and experts needed to implement a common European environment for open science. To realise this vision, OpenAIRE offers technological, training, and support services covering the entire research lifecycle (the full list of services is available at catalogue.openaire.eu). The technological services range from data management to discovery, and from journal management to monitoring research results and the adoption of Open Science practices.
In addition, the international network of NOADs (National Open Access Desks: openaire.eu/contact-noads) promotes open science by providing assistance and training at various levels. The goal is to enable the various actors involved in scientific activity to adopt open science and open access practices by organising national workshops and dedicated training. The NOADs also provide expert advice on the infrastructures that support open science workflows and on defining policies for its implementation, such as drafting and updating institutional policies and identifying regulatory obligations, funding-related requirements, or tools for Data Management Plans (DMPs). CNR, and in particular its institute ISTI, as the infrastructure's centre for technological development and innovation and as operator of the Italian NOAD, works in line with OpenAIRE's mission, contributing significantly to its activities and governing bodies. CNR therefore offers its expertise to ensure the maintenance, operation, and innovation of the infrastructure, taking part in initiatives and projects that contribute to the sustainability and innovation of its services. As a NOAD, it offers training and support on issues such as defining DMPs, complying with the "FAIR" principles for data management, and drafting institutional policies. These activities are carried out in collaboration with the NOADs of other European countries to maximise the integration of solutions and policies at the European level.
Source: GenoOA Week 2023, Genoa, Italy, 23-27/10/2023

See at: ISTI Repository Open Access | CNR ExploRA

2023 Contribution to conference Open Access

OpenAIRE Graph: una risorsa aperta per la scienza aperta [OpenAIRE Graph: an open resource for open science]
Atzori C., Bardi A., Baglioni M., Manghi P.
The OpenAIRE Graph (OAG) is a knowledge graph built by aggregating information (metadata, relationships) about different entities of the research world, such as publications, datasets, software and other products, funded projects, repositories, and organisations, interconnected through semantic relationships (e.g. citations, supplements, similarity, project participation). The OAG is an open resource that funders, organisations, researchers, research communities, and publishers can use to gain a better understanding of the research landscape and its dynamics at various levels, both local and global. As an open and freely accessible resource, produced in keeping with the core values of Open Science set out in the UNESCO Recommendation on Open Science, the OAG makes it possible to move beyond proprietary data sources, supporting the reform of the assessment of research, researchers, and organisations envisaged by the Coalition for Advancing Research Assessment (CoARA). The OAG is built from bibliographic records obtained from well-known sources such as Crossref, the open access journals registered in DOAJ (Directory of Open Access Journals), ORCID, Microsoft Academic Graph, and DataCite, as well as from over 1000 institutional repositories. The metadata of the research products in the graph are disambiguated and enriched through full-text and data mining processes; this makes the OAG usable for a variety of purposes, including research discovery, research assessment, analysis and/or prediction of research collaborations, and support for research policy decision-making.
The OAG is a freely accessible resource: search and discovery features are available through the explore.openaire.eu portal, programmatic integration is available through the HTTP Search API, and the complete dataset, along with other datasets offering specialised views, is available on Zenodo. The monitor.openaire.eu portal hosts several dashboards dedicated to research organisations and funders, which include the results of statistical and bibliometric analyses and indicators. Further information is available at https://graph.openaire.eu, which describes the data models the datasets conform to, the API documentation, and the methodological approach used to build and process the OAG. As of July 2023, the OAG includes about 170 million publications, 40 million datasets, 110K research software items, and over 3 billion relationships among them. This makes it one of the largest collections of scholarly records in the world. It has the potential to significantly affect how research is conducted and communicated. By making research data easier to find, understand, and use, the OAG can help to: accelerate scientific discovery, improve research collaboration, support research policy decisions, monitor research progress, identify areas where more investment is needed, increase the visibility of research in developing countries, support research reproducibility, and promote open science practices. Thanks to these characteristics, the OAG has the potential to contribute significantly to the progress of science and society.
Source: GenoOA Week 2023, Genoa, Italy and online, 23-27/10/2023

See at: ISTI Repository Open Access | CNR ExploRA

2023 Journal article Open Access

Graph-based methods for author name disambiguation: a survey
De Bonis M., Falchi F., Manghi P.
Scholarly knowledge graphs (SKGs) are knowledge graphs representing research-related information, powering discovery and statistics about research impact and trends. Author name disambiguation (AND) is required to produce high-quality SKGs, as a disambiguated set of authors is fundamental to ensure a coherent view of researchers' activity. Various issues, such as homonymy, scarcity of contextual information, and cardinality of the SKG, make simple name string matching insufficient or computationally complex. Many deep learning methods for AND have been developed, and interesting surveys exist in the literature, comparing the approaches in terms of techniques, complexity, performance, etc. However, none of them specifically addresses AND methods in the context of SKGs, where the entity-relationship structure can be exploited. In this paper, we discuss recent graph-based methods for AND, define a framework through which such methods can be compared, and catalog the most popular datasets and benchmarks used to test such methods. Finally, we outline possible directions for future work on this topic.
Source: PeerJ Computer Science 9 (2023). doi:10.7717/peerj-cs.1536
DOI: 10.7717/peerj-cs.1536
Project(s): EOSC Future via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: PeerJ Computer Science Open Access | ISTI Repository | peerj.com | CNR ExploRA

2023 Conference article Open Access

A graph neural network approach for evaluating correctness of groups of duplicates
De Bonis M., Minutella F., Falchi F., Manghi P.
Unlabeled entity deduplication is a relevant task already studied in the recent literature. Most methods can be traced back to the following workflow: an entity blocking phase, in-block pairwise comparisons between entities to draw similarity relations, closure of the resulting meshes to create groups of duplicate entities, and merging of group entities to remove duplication. Such methods are effective but still not good enough whenever a very low false positive rate is required. In this paper, we present an approach for evaluating the correctness of "groups of duplicates", which can be used to measure a group's accuracy and hence its likelihood of containing false positives. Our novel approach is based on a Graph Neural Network that exploits and combines the concepts of Graph Attention and Long Short Term Memory (LSTM). The accuracy of the proposed approach is verified in the context of Author Name Disambiguation applied to a curated dataset obtained as a subset of the OpenAIRE Graph that includes PubMed publications with at least one ORCID identifier.
Source: TPDL 2023 - 27th International Conference on Theory and Practice of Digital Libraries, pp. 207–219, Zadar, Croatia, 26-29/09/2023
DOI: 10.1007/978-3-031-43849-3_18
Project(s): OpenAIRE Nexus via OpenAIRE

See at: doi.org Open Access | link.springer.com | ISTI Repository | CNR ExploRA
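
The generic deduplication workflow summarised in the entry above (blocking, in-block pairwise comparison, closure into groups) can be sketched as follows. This is only an illustration under stated assumptions: the blocking key (first three letters of the name), the similarity measure (`difflib.SequenceMatcher` with a 0.85 threshold), and the toy records are all hypothetical choices, not taken from the paper. The final merge step and the paper's Graph Neural Network for scoring groups are not shown.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def blocking_key(record):
    # Hypothetical blocking key: lowercased first three letters of the name.
    return record["name"].lower()[:3]


def similar(a, b, threshold=0.85):
    # Hypothetical similarity: character-level ratio between the two names.
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold


def deduplicate(records):
    # 1. Blocking: only records sharing a key are ever compared.
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)

    # 2. In-block pairwise comparisons produce similarity relations.
    adjacency = defaultdict(set)
    for block in blocks.values():
        for a, b in combinations(block, 2):
            if similar(a, b):
                adjacency[a["id"]].add(b["id"])
                adjacency[b["id"]].add(a["id"])

    # 3. Closure: connected components of the similarity graph are the groups.
    seen, groups = set(), []
    for record in records:
        if record["id"] in seen:
            continue
        stack, group = [record["id"]], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Toy records with made-up names, for illustration only.
toy_records = [
    {"id": 1, "name": "Rossi, Maria"},
    {"id": 2, "name": "Rossi, Maria."},
    {"id": 3, "name": "Rossetti, Marco"},
    {"id": 4, "name": "Bianchi, Luca"},
]
toy_groups = deduplicate(toy_records)
```

Because closure is transitive, a single wrong similarity relation can join two otherwise correct groups, which is one reason a separate check on the correctness of each group is useful when false positives must be rare.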

2023 Report Unknown

InfraScience research activity report 2023
Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Bosio C., Bove P., Calanducci A., Candela L., Casini G., Castelli D., Cirillo R., Coro G., De Bonis M., Debole F., Dell'Amico A., Frosini L., Ibrahim A. S. T., La Bruzzo S., Lelii L., Manghi P., Mangiacrapa F., Mangione D., Mannocci A., Molinaro E., Pagano P., Panichi G., Paratore M. T., Pavone G., Piccioli T., Sinibaldi F., Straccia U., Vannini G. L.
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2023 to highlight the major results. In particular, the InfraScience group engaged in research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2023 InfraScience members contributed to the publishing of several papers, to the research and development activities of several research projects (primarily funded by the EU), to the organization of conferences and training events, and to several working groups and task forces.
Source: ISTI Annual Reports, 2023
DOI: 10.32079/isti-ar-2023/002
Project(s): Blue Cloud via OpenAIRE, EOSC Future via OpenAIRE, TAILOR

See at: CNR ExploRA

2023 Conference article Open Access

Tracing data footprints: formal and informal data citations in the scientific literature
Irrera O., Mannocci A., Manghi P., Silvello G.
Data citation has become a prevalent practice within the scientific community, serving the purpose of facilitating data discovery, reproducibility, and credit attribution. Consequently, data has gained significant importance in the scholarly process. Despite its growing prominence, data citation is still at an early stage, with considerable variations in practices observed across scientific domains. Such diversity hampers the ability to consistently analyze, detect, and quantify data citations. We focus on the European Marine Science (MES) community to examine how data is cited in this specific context. We identify four types of data citations: formal, informal, complete, and incomplete. By analyzing the usage of these diverse data citation modalities, we investigate their impact on the widespread adoption of data citation practices.
Source: TPDL 2023 - 27th International Conference on Theory and Practice of Digital Libraries, pp. 79–92, Zadar, Croatia, 26-29/09/2023
DOI: 10.1007/978-3-031-43849-3_7

See at: ISTI Repository Open Access | doi.org Restricted | link.springer.com | CNR ExploRA

2023 Report Open Access

Landscape study on (semi-)automatic publishing workflows/integration between RI and repository services
EOSC Future Working Group on the EOSC Interoperability Framework for Research Product Publishing
Open Science calls for researchers to publish any type of research product as soon as possible, in such a way that their research activity can be transparently assessed, reviewed, reproduced, and rewarded in all its aspects. However, the publishing process has become more and more of a burden for scientists, who must, most of the time, spend time publishing their articles, data, software, and other products in the many institutional or thematic repositories of reference. Scenarios include first-time publishing of new research products or double-publishing of research products to satisfy institutional mandates and community practices. Such tedious work is often incomplete, with some products ending up unpublished and others showing incomplete or imprecise metadata. As a solution to these problems, some communities investigated and realised the integration of their research performing services, from RIs and Clusters, with repositories. The integration ensures that outcomes of such services are deposited by the services, with prior authorization from the users, into a given repository, giving life to an end-to-end scientific workflow, from experimentation to publishing. This document reports the experiences of the WG members, describing solutions at different maturity levels (design, prototype, beta, production) and involving different types of services (repository, analysis/research tool, publisher, other scholarly service) for the (semi-)automatic deposition steps of research assets produced in a research infrastructure to a target service in the scholarly communication ecosystem. It also presents a list of scenarios that would benefit from an interoperability framework for research product deposition.
Source: ISTI Research reports, 2023
DOI: 10.5281/zenodo.8094651
DOI: 10.5281/zenodo.8094650
Project(s): EOSC Future via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: ZENODO Open Access | ZENODO | ISTI Repository | CNR ExploRA

2022 Contribution to journal Open Access

New trends in scientific knowledge graphs and research impact assessment
Manghi P., Mannocci A., Osborne F., Sacharidis D., Salatino A., Vergoulis T.
Source: Quantitative Science Studies 2 (2022): 1296–1300. doi:10.1162/qss_e_00160
DOI: 10.1162/qss_e_00160

2022 Report Open Access

Open Science repository platforms
Manghi P., Artini M., La Bruzzo S., Ottonello E., Pavone G.
Institutional and thematic repositories today play a key role in scholarly communication and more broadly in scientific workflows. Many institutions and communities have set the ambitious goal of providing an open access repository for their community of users. However, given the range of expectations from their users, choosing the right solution is often non-trivial. Some platforms may be served out-of-the-box, to be put in operation after straightforward configurations, but are in general less customizable to adhere to specific functional, non-functional, or contextual needs. Other platforms may instead be extremely customizable and flexible but require skilled personnel for their adaptation and deployment. This report performs an analysis of existing state-of-the-art Open Source repository solutions from the functional, operational, and software perspectives. As a result of the analysis, it will factor out the pros and cons of such solutions and identify typical scenarios of adoption.
Source: ISTI Technical Report, ISTI-2022-TR/009, 2022
DOI: 10.32079/isti-tr-2022/009
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA

2022 Conference article Open Access

Will open science change authorship for good? Towards a quantitative analysis
Mannocci A., Irrera O., Manghi P.
Authorship of scientific articles has profoundly changed from early science until now. While a paper was once written by a handful of authors, scientific collaborations are much more prominent on average nowadays. As authorship (and citation) is essentially the primary reward mechanism according to the traditional research evaluation frameworks, it has turned out to be a rather hot-button topic from which a significant portion of academic disputes stems. However, the novel Open Science practices could be an opportunity to disrupt such dynamics and diversify the credit of the different scientific contributors involved in the diverse phases of the lifecycle of the same research effort. In fact, a paper and research data (or software) contextually published could exhibit different authorship to give credit to the various contributors right where it feels most appropriate. We argue that this can be computationally analysed by taking advantage of the wealth of information in model Open Science Graphs. Such a study can pave the way to better understand the dynamics and patterns of authorship in linked literature, research data, and software, and how they evolved over the years.
Source: IRCDL 2022 - 18th Italian Research Conference on Digital Libraries, Padua, Italy, 24-25/02/2022
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository | CNR ExploRA

2022 Conference article Open Access

Towards unsupervised machine learning approaches for knowledge graphs
Minutella F., Falchi F., Manghi P., De Bonis M., Messina N.
Nowadays, a lot of data is in the form of Knowledge Graphs aiming at representing information as a set of nodes and relationships between them. This paper proposes an efficient framework to create informative embeddings for node classification on large knowledge graphs. Such embeddings capture how a particular node of the graph interacts with its neighborhood and indicate whether it is isolated or part of a bigger clique. Since a homogeneous graph is necessary to perform this kind of analysis, the framework exploits the metapath approach to split the heterogeneous graph into multiple homogeneous graphs. The proposed pipeline includes an unsupervised attentive neural network to merge different metapaths and produce node embeddings suitable for classification. Preliminary experiments on the IMDb dataset demonstrate the validity of the proposed approach, which can outperform current state-of-the-art unsupervised methods.
Source: IRCDL 2022 - 18th Italian Research Conference on Digital Libraries, Padua, Italy, 24-25/02/2022
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository | CNR ExploRA
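
The metapath step mentioned in the entry above, splitting a heterogeneous graph into several homogeneous ones, can be illustrated with a small sketch. It assumes a toy movie graph in which movies link to typed neighbours (actors, directors), and it builds one movie-to-movie graph per metapath (movie-actor-movie and movie-director-movie). The data, the node names, and the function `metapath_projection` are hypothetical; the paper's attentive neural network that merges the metapaths is not shown.

```python
from collections import defaultdict
from itertools import combinations


def metapath_projection(edges, via_type):
    """Return the homogeneous graph induced by the metapath
    target -> (neighbour of type `via_type`) -> target.

    `edges` is an iterable of (target_node, (neighbour_type, neighbour_id)).
    The result is a set of undirected edges (a, b) with a < b.
    """
    # Collect, for each intermediate node of the requested type, its targets.
    members = defaultdict(set)
    for target, (neighbour_type, neighbour_id) in edges:
        if neighbour_type == via_type:
            members[(neighbour_type, neighbour_id)].add(target)

    # Two targets are connected when they share an intermediate node.
    projected = set()
    for targets in members.values():
        for a, b in combinations(sorted(targets), 2):
            projected.add((a, b))
    return projected


# Toy heterogeneous graph (made-up identifiers, for illustration only).
toy_edges = [
    ("m1", ("actor", "a1")),
    ("m2", ("actor", "a1")),
    ("m2", ("director", "d1")),
    ("m3", ("director", "d1")),
    ("m3", ("actor", "a2")),
]
movie_actor_movie = metapath_projection(toy_edges, "actor")
movie_director_movie = metapath_projection(toy_edges, "director")
```

Each projection contains only movie nodes, so methods that expect a homogeneous graph can be applied to it; the two projections generally differ, which is why a later step is needed to combine what each metapath captures.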

2022 Conference article Open Access

A preliminary assessment of the article deduplication algorithm used for the OpenAIRE Research Graph
Vichos K., De Bonis M., Kanellos I., Chatzopoulos S., Atzori C., Manola N., Manghi P., Vergoulis T.
In recent years, a large number of Scholarly Knowledge Graphs (SKGs) have been introduced in the literature. The communities behind these graphs strive to gather, clean, and integrate scholarly metadata from various sources to produce clean and easy-to-process knowledge graphs. In this context, a very important task of the respective cleaning and integration workflows is deduplication. In this paper, we briefly describe and evaluate the accuracy of the deduplication algorithm used for the OpenAIRE Research Graph. Our experiments show that the algorithm performs adequately, producing a small number of false positives and an even smaller number of false negatives.
Source: IRCDL 2022 - 18th Italian Research Conference on Digital Libraries, Padua, Italy, 24-25/02/2022
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository | CNR ExploRA

2022 Conference article Open Access

Data models for an imaging bio-bank for colorectal, prostate and gastric cancer: the NAVIGATOR project
Berti A., Carloni G., Colantonio S., Pascali M. A., Manghi P., Pagano P., Buongiorno R., Pachetti E., Caudai C., Di Gangi D., Carlini E., Falaschi Z., Ciarrocchi E., Neri E., Bertelli E., Miele V., Carpi R., Bagnacci G., Di Meglio N., Mazzei M. A., Barucci A.
Researchers nowadays may take advantage of broad collections of medical data to develop personalized medicine solutions. Imaging bio-banks play a fundamental role in this regard by serving as organized repositories of medical images associated with imaging biomarkers. In this context, the NAVIGATOR Project aims to advance colorectal, prostate, and gastric oncology translational research by leveraging quantitative imaging and multi-omics analyses. As the Project's core, an imaging bio-bank is being designed and implemented in a web-accessible Virtual Research Environment (VRE). The VRE serves to extract the imaging biomarkers and further process them within prediction algorithms. In our work, we present the realization of the data models for the three cancer use-cases of the Project. First, we carried out an extensive requirements analysis to fulfill the necessities of the clinical partners involved in the Project. Then, we designed three separate data models utilizing entity-relationship diagrams. We found the diagram modeling for colorectal and prostate cancers to be more straightforward, while gastric cancer required a higher level of complexity. Future developments of this work would include designing a common data model following the Observational Medical Outcomes Partnership Standards. Indeed, a common data model would standardize the logical infrastructure of data models and make the bio-bank easily interoperable with other bio-banks.
Source: BHI '22 - IEEE-EMBS International Conference on Biomedical and Health Informatics, Ioannina, Greece, 27-30/09/2022
DOI: 10.1109/bhi56158.2022.9926910

See at: ISTI Repository Open Access | ieeexplore.ieee.org Restricted | CNR ExploRA

2022 Report Open Access

InfraScience research activity report 2021
Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Bove P., Candela L., Casini G., Castelli D., Cirillo R., Coro G., De Bonis M., Debole F., Dell'Amico A., Frosini L., La Bruzzo S., Lazzeri E., Lelii L., Manghi P., Mangiacrapa F., Mangione D., Mannocci A., Ottonello E., Pagano P., Panichi G., Pavone G., Piccioli T., Sinibaldi F., Straccia U.
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2021 to highlight the major results. In particular, the InfraScience group engaged in research challenges characterising Data Infrastructures, eScience, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2021 InfraScience members contributed to the publishing of 25 papers, to the research and development activities of 18 research projects (15 funded by the EU), to the organization of conferences and training events, and to several working groups and task forces.
Source: ISTI Annual report, 2022
DOI: 10.32079/isti-ar-2022/001
Project(s): ARIADNEplus via OpenAIRE, Blue Cloud via OpenAIRE, PerformFISH via OpenAIRE, EOSC-Pillar via OpenAIRE, DESIRA, EOSC Future via OpenAIRE, EOSCsecretariat.eu via OpenAIRE, EcoScope via OpenAIRE, RISIS 2, OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA

2022 Conference article Open Access

BIP! scholar: a service to facilitate fair researcher assessment
Vergoulis T., Chatzopoulos S., Vichos K., Kanellos I., Mannocci A., Manola N., Manghi P.
In recent years, assessing the performance of researchers has become a burden due to the extensive volume of the existing research output. As a result, evaluators often end up relying heavily on a selection of performance indicators like the h-index. However, over-reliance on such indicators may result in reinforcing dubious research practices, while overlooking important aspects of a researcher's career, such as their exact role in the production of particular research works or their contribution to other important types of academic or research activities (e.g., production of datasets, peer reviewing). In response, a number of initiatives that attempt to provide guidelines towards fairer research assessment frameworks have been established. In this work, we present BIP! Scholar, a Web-based service that offers researchers the opportunity to set up profiles that summarise their research careers taking into consideration well-established guidelines for fair research assessment, facilitating the work of evaluators who want to be more compliant with the respective practices.
Source: JCDL'22 - 22nd ACM/IEEE Joint Conference on Digital Libraries, Cologne, Germany, 20-24/06/2022
DOI: 10.1145/3529372.3533296
DOI: 10.48550/arxiv.2205.03152
Project(s): OpenAIRE Nexus via OpenAIRE
