AGINFRA PLUS - Open Science Data Analytics Technologies D3.1
Candela L., Cirillo R., Coro G., Lelii L., Pagano P., Panichi G., Scarponi P., Sinibaldi F.
Data Analytics & Processing Layer
Deliverable D3.1 "Open Science Data Analytics Technologies" is a deliverable of type Demonstrator,
meaning that it manifests in artefacts (software releases) rather than reports. In particular, the
deliverable is about the software realising the Data Analytics & Processing Layer of AGINFRA PLUS.
This software is part of a large software system named gCube (www.gcube-system.org). The gCube
system offers a large array of services supporting the entire lifecycle underlying a research activity (data
management and collation, analytics, collaboration, sharing) and the possibility to combine these
services in Virtual Research Environments.
In the context of AGINFRA PLUS the following gCube components have been primarily exploited,
consolidated and enhanced to serve the analytics needs arising in the context of the project use cases.
DataMiner, i.e. a service enabling its users to perform data analytics tasks by relying on an array of
analytics methods and a distributed and heterogeneous computing infrastructure. This service is
available through a web-based GUI as well as via a web-based API based on the OGC WPS standard.
SAI (Statistical Algorithm Importer), i.e. a service enabling its users to make their own analytics
methods available via the DataMiner service.
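As an illustration of what access through an OGC WPS API can look like, the following minimal sketch builds WPS 1.0.0 key-value-pair request URLs. The endpoint URL, the process identifier, the input names and the `gcube-token` query parameter are assumptions made for the example; they are not taken from this deliverable and depend on the actual deployment.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real DataMiner URL depends on the deployment.
WPS_ENDPOINT = "https://dataminer.example.org/wps/WebProcessingService"


def wps_url(request, identifier=None, data_inputs=None, token=None):
    """Build an OGC WPS 1.0.0 key-value-pair (KVP) GET URL."""
    params = {"service": "WPS", "version": "1.0.0", "request": request}
    if identifier is not None:
        # Process identifier, as advertised by the GetCapabilities response.
        params["Identifier"] = identifier
    if data_inputs:
        # WPS KVP encodes inputs as name=value pairs separated by ';'.
        params["DataInputs"] = ";".join(
            f"{name}={value}" for name, value in data_inputs.items()
        )
    if token is not None:
        # Assumption: the deployment accepts an access token as a query parameter.
        params["gcube-token"] = token
    return f"{WPS_ENDPOINT}?{urlencode(params)}"


# List the analytics methods exposed by the (hypothetical) service.
print(wps_url("GetCapabilities"))

# Run a hypothetical method with two literal inputs.
print(wps_url("Execute", identifier="demo.Algorithm",
              data_inputs={"epsilon": 10, "label": "run1"}, token="TOKEN"))
```

The sketch only composes URLs and performs no network call; any HTTP client could then be used to send the request and read the XML response.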
In addition, the entire analytics solution made available for AGINFRA PLUS cases relies on (i) a
shared workspace realising a cloud-based file manager for managing content of interest and sharing this
content with co-workers, (ii) a social networking area enabling users to post messages and have
discussions, and (iii) a flexible catalogue enabling users to publish and discover items of interest, including
"research objects" resulting from an analytics task.
This technology is deployed in its latest version in every Virtual Research Environment supporting
AGINFRA PLUS cases.
The major enhancements to the technology pertaining to AGINFRA PLUS have been included in three
gCube major releases: 4.7 (October 2017), 4.8 (November 2017), and 4.9 (under production). In
particular, with these releases a new "black-box" oriented approach (https://wiki.gcubesystem. …) has been
envisaged and implemented to enable analytics method owners and developers to easily integrate
their solutions into the DataMiner service. Among the supported black-box typologies there is one for
KNIME workflows, i.e. analytics methods implemented by a KNIME workflow. KNIME is among the key
technologies supporting the Food Safety Risk Assessment cases. To enable the execution of
KNIME-based black-boxes, the distributed computing part of the data analytics platform has been
extended to integrate the KNIME execution engine. Other cases rely on the same mechanism to
integrate entire applications (WOFOST) as well as Python-based methods.
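To make the black-box idea concrete for a Python-based method, the sketch below shows a method written so that a host platform can treat it as an opaque step: it receives an input file path and an output file path, and nothing else about its internals needs to be known. The argument layout, the CSV format and the names `summarise` and `main` are assumptions made for illustration; the actual integration contract is defined by the platform, not by this example.

```python
import argparse
import csv
import statistics


def summarise(input_path, output_path, column):
    """Read one numeric CSV column and write its count, mean and std deviation."""
    with open(input_path, newline="") as src:
        values = [float(row[column]) for row in csv.DictReader(src)]
    with open(output_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["count", "mean", "stdev"])
        writer.writerow(
            [len(values), statistics.mean(values), statistics.pstdev(values)]
        )
    return len(values)


def main(argv):
    """Entry point: everything the method needs arrives as arguments."""
    parser = argparse.ArgumentParser(description="Summarise one CSV column.")
    parser.add_argument("input_csv", help="path of the input file")
    parser.add_argument("output_csv", help="path where the result is written")
    parser.add_argument("--column", default="value", help="column to summarise")
    args = parser.parse_args(argv)
    return summarise(args.input_csv, args.output_csv, args.column)
```

A wrapper would call, for example, `main(["in.csv", "out.csv", "--column", "yield"])`; saved as a stand-alone script, the same function could be invoked with `main(sys.argv[1:])`.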
Source: Project report, AGINFRA PLUS, Deliverable D3.1, pp. 1–4, 2017