H2020 Project Project Number: 654182
ENVRIPLUS DATA FOR SCIENCE THEME
DR. ZHIMING ZHAO
Data for science
MOTIVATION: SYSTEM LEVEL OF ENVIRONMENTAL SCIENCES
REUSABLE SOLUTIONS TO COMMON CHALLENGES
Six themes
1. Technical innovation
2. Data for science
3. Access
4. Social relevance
5. Knowledge transfer
6. Communication and dissemination
ENVRIPLUS: EU H2020 project
www.envriplus.eu
4 years, 15M
37 partners, more than 20 research infrastructures
OUTLINE
Common problems
Reusable solutions
Common challenges towards EOSC
Approach: multi-viewpoint modelling, aiming at a common ontological framework
ENVRI RM: Reference model
OIL-e: Open information linking for environmental RIs
1. Chen, Y., et al. (2013) A common reference model for environmental science research infrastructures. In Proceedings of EnviroInfo 2013.
2. Zhao, Z., et al. (2015) Open Information Linking for Environmental Research Infrastructures. In Proceedings of IEEE eScience [doi:10.1109/eScience.2015.66]
Output: ENVRI RM 1.0, OIL-e 1.0, data query and processing prototype
Approach: multi-viewpoint modelling, aiming at a common ontological framework
Public clouds
Challenge 1: support system-level science
Challenge 2: share solutions to common problems
Challenge 3: interface with virtual research environment(s)
Challenge 4: re-use technologies (e.g. from e-Infrastructures)
Approach: RM-guided system design
Zhao, Z., et al. (2015) Reference Model Guided System Design and Implementation for Interoperable Environmental Research Infrastructures. In Proceedings of IEEE eScience [doi:10.1109/eScience.2015.41]
Output: ENVRI RM 1.0, OIL-e 1.0, data query and processing prototype
Approach: multi-viewpoint modelling, aiming at a common ontological framework
Approach: RM-guided RI co-design, agile use-case-driven development
Highlighted achievements
1. ENVRI RM 2.x
2. OIL-e 2.x
3. Knowledge base of the ENVRI RIs
4. Service portfolio
Discover reusable components among research infrastructures.
Design new research infrastructures.
Optimise the evolutionary path.
ICOS, EUFAR, LTER, Euro-Argo, DASSH
Semantic description and reasoning tools
Knowledge base
RI: How did other RIs implement my missing functionality?
RI: How should I upgrade my services?
New RI: What are the best practices for meeting my requirements?
Data for science theme knowledge base
http://oil-e.vlan400.uvalight.net/
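The first question above ("how did other RIs implement my missing functionality?") can be illustrated with a minimal sketch. It assumes RI descriptions are held as subject-predicate-object triples in memory; the RI names (`RI-A`, `RI-B`, `RI-C`), the `implements` predicate and all function names are hypothetical placeholders, not the OIL-e vocabulary or the actual content of the knowledge base.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical placeholder facts, invented for illustration only.
TRIPLES: List[Triple] = [
    Triple("RI-A", "implements", "data curation"),
    Triple("RI-A", "implements", "data cataloguing"),
    Triple("RI-B", "implements", "data cataloguing"),
    Triple("RI-B", "implements", "provenance tracking"),
    Triple("RI-C", "implements", "data curation"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def functions_of(triples: List[Triple], ri: str) -> Set[str]:
    """Functions that the given RI is recorded as implementing."""
    return {t.obj for t in match(triples, subject=ri, predicate="implements")}


def who_implements_my_gaps(triples: List[Triple], ri: str) -> Dict[str, Set[str]]:
    """Map each function that `ri` lacks to the other RIs that implement it."""
    mine = functions_of(triples, ri)
    gaps: Dict[str, Set[str]] = {}
    for t in match(triples, predicate="implements"):
        if t.subject != ri and t.obj not in mine:
            gaps.setdefault(t.obj, set()).add(t.subject)
    return gaps


if __name__ == "__main__":
    for function, providers in sorted(who_implements_my_gaps(TRIPLES, "RI-C").items()):
        print(function, "->", sorted(providers))
```

A real deployment would more likely sit on an RDF store with an ontology-backed query language; the pattern-matching function here only stands in for that layer.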
Data for Science theme service portfolio
A. Reference model related
A1: Reference model training service - CU
A2: Open information linking for ENVRI RIs - UvA
A3: ENVRI knowledge base - UvA
A4: RI architecture design - NERC
B. Theme 2 service pillar
B1: Linked open data ingestion and metadata service - ICOS/LU
B2: D4Science data analytics - CNR
B3: Dynamic real-time infrastructure planner - UvA
B4: Curation - NERC
B5: Flagship cataloguing - IFREMER
B6: Provenance - EAA
C. Reusable solutions from use cases/RIs
C1: Data subscription service - EUDAT
C2: Pipeline for semantic annotation of relational DB - ANAEE/INRA
C3: Data/metadata generation from semantic annotations - ANAEE/INRA
C4: DEMIS - LTER/EAA
D. Software quality checking and testbed
D1: ENVRIPLUS service test bed - EGI
How to use?
Success stories
TRL and support
https://envriplus.manageprojects.com/s/notebook/Og4C2wEWLso0k
EU FP7 ENVRI: Understand common challenges and requirements
Approach: multi-viewpoint modelling, aiming at a common ontological framework
Output: ENVRI RM 1.0, OIL-e 1.0, data query and processing prototype
EU H2020 ENVRIPLUS: Data for science theme. Build reusable solutions to common development challenges.
Approach: RM-guided RI co-design, agile use-case-driven development
Output: ENVRI RM/OIL-e 2.x, knowledge base, service portfolio
Current challenges:
1) Operational challenges: AAA, deployment, maintenance
2) Science challenges: effective VREs, discovery, optimization
3) Sustainability challenges: ENV-RIs in EOSC
Solve complex system-level problems
Collaboration, performing cross-disciplinary research…
Innovation from open science
Current challenges:
1) Operational challenges: AAI, deployment, maintenance
2) Science support challenges: effective VREs, discovery, optimization
3) Sustainability challenges: software upkeep, long-term contracts for service provisioning?
Operational challenges
1. Establishing an effective operational model that exploits the digital ecosystem well: an RI has to balance disruption against assured benefits as it engages to maximize resources and gain interoperability with other infrastructures.
2. Authenticating and authorizing users from different communities to use shared resources, and accounting for the usage of the data, the software services and the underlying e-infrastructure.
3. Coordinating technically across RIs (interoperability) at appropriate interfaces between them, e.g. adopting interfaces for supporting VREs, community catalogues, etc.
4. Assuring the performance, quality of service and user experience required by scientists, in particular when the scale of the user base and data assets increases, and in the case of a cyber attack.
5. Provisioning RI resources effectively, including the data and tools offered by RIs and the services delivering the underlying data infrastructure, to serve a broad range of demands.
6. Handling 'fog' and 'edge' computing: some RIs have extensive sensor networks and technology, and it is not clear how EOSC will deal with them.
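The second challenge (authentication, authorization and accounting across communities) can be sketched as three small steps behind one access call. This is only an illustration under stated assumptions: the `AAAGateway` class, the token table and the community and resource names are hypothetical, and a real RI would delegate these steps to a federated identity service rather than in-memory dictionaries.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class AAAGateway:
    """Hypothetical gateway in front of shared RI resources."""

    tokens: Dict[str, Tuple[str, str]]  # token -> (user, community)
    permissions: Dict[str, Set[str]]  # community -> resources it may use
    usage: Counter = field(default_factory=Counter)  # (community, resource) -> count

    def authenticate(self, token: str) -> Optional[Tuple[str, str]]:
        """Authentication: resolve a token to a known (user, community)."""
        return self.tokens.get(token)

    def authorize(self, community: str, resource: str) -> bool:
        """Authorization: check whether the community may use the resource."""
        return resource in self.permissions.get(community, set())

    def access(self, token: str, resource: str) -> bool:
        """Grant or deny access; only granted requests are accounted."""
        identity = self.authenticate(token)
        if identity is None:
            return False
        _user, community = identity
        if not self.authorize(community, resource):
            return False
        self.usage[(community, resource)] += 1  # accounting
        return True


if __name__ == "__main__":
    # Placeholder tokens, communities and resources, for illustration only.
    gateway = AAAGateway(
        tokens={"token-1": ("user-1", "community-a"), "token-2": ("user-2", "community-b")},
        permissions={"community-a": {"catalogue", "workflow-engine"}, "community-b": {"catalogue"}},
    )
    print(gateway.access("token-1", "workflow-engine"))
    print(gateway.access("token-2", "workflow-engine"))
    print(dict(gateway.usage))
```

Keeping the usage counter keyed by community rather than by user is a design choice made here for brevity; finer-grained accounting would key on the user as well.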
Science-supporting challenges
1. Enabling interdisciplinary research activities to meet environmental research goals: not only sharing research data and software assets from different RIs, but also co-developing and using methodologies and models that draw on expertise from multiple domains.
2. FAIR support for the data required by scientists.
3. User-specified and user-steered data processing and automated workflows are important issues. The generation of workflows from user requests and their optimal deployment will grow in importance for environmental research.
4. The recording and provision of provenance information for (a) user assessment of the relevance and quality of an asset; (b) audit; (c) backup and retry; and (d) reproducibility.
5. Reuse of data and knowledge from different RIs, which requires effective data and knowledge mining tools.
6. With increasing data volumes, exascale computing support is needed for data analysis and simulation. Frequently, complex workflows using such simulations need to interwork between HPC and cloud (HTC) platforms.
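Challenges 3 and 4 (automated workflows and provenance) can be sketched together: a workflow is run in dependency order and each step records what it used and what it generated. This is a minimal sketch, assuming Python 3.9+ for `graphlib`; the step names, the toy input values and the record layout are hypothetical and do not follow any particular provenance standard.

```python
import hashlib
import json
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Set, Tuple


def digest(value: Any) -> str:
    """Content hash of a JSON-serialisable value, used to identify an asset."""
    return hashlib.sha256(json.dumps(value, sort_keys=True).encode("utf-8")).hexdigest()


def run_workflow(
    steps: Dict[str, Callable[[Dict[str, Any]], Any]],
    dependencies: Dict[str, Set[str]],
) -> Tuple[Dict[str, Any], List[Dict[str, Any]]]:
    """Run steps in dependency order and return (results, provenance records)."""
    results: Dict[str, Any] = {}
    provenance: List[Dict[str, Any]] = []
    for name in TopologicalSorter(dependencies).static_order():
        # Each step only sees the results of the steps it depends on.
        inputs = {dep: results[dep] for dep in sorted(dependencies.get(name, ()))}
        results[name] = steps[name](inputs)
        provenance.append(
            {
                "step": name,
                "used": {dep: digest(value) for dep, value in inputs.items()},
                "generated": digest(results[name]),
            }
        )
    return results, provenance


# Toy workflow with made-up values, for illustration only.
STEPS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "load": lambda inputs: [1.0, None, 2.0, 3.0],
    "clean": lambda inputs: [v for v in inputs["load"] if v is not None],
    "mean": lambda inputs: sum(inputs["clean"]) / len(inputs["clean"]),
}
DEPENDENCIES: Dict[str, Set[str]] = {"load": set(), "clean": {"load"}, "mean": {"clean"}}

if __name__ == "__main__":
    results, provenance = run_workflow(STEPS, DEPENDENCIES)
    print(results["mean"])
    for record in provenance:
        print(record["step"], record["generated"][:12])
```

Because each record stores content hashes, a second run over the same inputs yields identical records, which is the property that audit and reproducibility checks rely on.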
Sustainability challenges: governance and engineering
1. How to provide sustainable business models that serve data contributors, service developers, researchers, innovation makers and other payers into EOSC.
2. How to provide sustainable data management and stewardship, including the curation and long-term preservation of assets (information and software) and access to them.
3. How to make sustainable technical decisions, including on standards and interfaces, so that they fit the evolution of the ecosystem and the operational model of the RI services.
4. How to provide sustainable system architecture and engineering that can meet the demands of scaling technical solutions to large numbers of users.
5. How to choose effective underlying infrastructure for provisioning RIs and deploying services, to achieve sustainable service quality and reliability.
B. ACTIVITY REPORT - SITE VISITS
[Timeline figure, Feb. to Dec.: common solution development; use case prototype; site visits to EAA, Ifremer, INGV, ANAEE-INRA and ICOS-LU]
Objectives:
• Disseminate the development results
• Learn the latest development status and updated requirements from RIs
• Establish new use cases, and update R&D agenda
Meeting plan and reports: https://docs.google.com/spreadsheets/d/1iFxvoPzMiqkzP84IQ-tRadZXJsR9LajGw5VK5z6WrdA/edit#gid=395408692
REFERENCES
EU FP7 ENVRI www.envri.eu
EU H2020 ENVRIPLUS www.envriplus.eu
EU H2020 VRE4EIC www.vre4eic.eu
ENVRI RM: http://envri.eu/rm
Service portfolio: https://envriplus.manageprojects.com/s/notebook/Og4C2wEWLso0k
Theme 2 knowledge base: http://oil-e.vlan400.uvalight.net/
