The BlueBRIDGE approach to collaborative research

BlueBRIDGE: Innovative data services for Blue Growth & beyond
BlueBRIDGE receives funding from the European Union’s Horizon 2020
research and innovation programme under grant agreement No. 675680 www.bluebridge-vres.eu
Gianpaolo Coro
CNR, Italy
gianpaolo.coro@isti.cnr.it
Context
Progress in Information Technology has changed
the paradigms of Science
• The large and fast increase in the volume and
complexity of data requires new approaches to
collect, curate and analyse the data
• This requires new tools to guarantee the exchange
and longevity of the data and the reapplication
of the experiments
Big Data
• Large volume
• High generation velocity
• Large variety
• Untrustworthy (veracity)
• High complexity (variability)
Big Data: a dataset with large volume, variety and generation velocity, containing complex and
untrustworthy information, that requires nonconventional methods to extract, manage and
process information within a reasonable time.
New Science Paradigms
• Open Science: make scientific research, data and dissemination
accessible to all levels of an inquiring society, amateur or
professional.
Keywords: Open Access, Open research, Open Notebook Science
• E-Science: computationally intensive science is carried out in highly
distributed network environments that use large data sets and
require distributed computing and collaborative tools.
Keywords: Provenance of the scientific process, Scientific workflows
• Science 2.0: process and publish large data sets using a
collaborative approach. Share from raw data to experimental
results and processes. Support collaborative experiments and
Reproducibility-Repeatability-Reusability (R-R-R) of Science.
Keywords: collaborative and repeatable Science
Requirements for IT systems
• Support collaborative research and experimentation
• Implement Reproducibility-Repeatability-Reusability of
Science
• Allow sharing data, processes and findings
• Grant free access to the produced scientific knowledge
• Tackle Big Data challenges
• Sustainability: low operational and maintenance costs
• Manage heterogeneous access policies for data and processes
• Meet the requirements of industrial processes
e-Infrastructures
e-Infrastructures enable researchers at different locations across the world
to collaborate in the context of their home institutions or in national or multinational
scientific initiatives.
• People can work together having shared access to unique or distributed scientific
facilities (including data, instruments, computing and communications).
Examples:
Belief, http://www.beliefproject.org/
OpenAire, http://www.openaire.eu/
i-Marine, http://www.i-marine.eu/
EU-Brazil OpenBio,
http://www.eubrazilopenbio.eu/
Virtual Research Environments
• Define sub-communities
• Allow temporary dedicated
assignment of computational,
storage, and data resources
• Manage policies
• Support data and information
sharing
[Figure: the e-Infrastructure integrates external e-Infrastructures (WPS) into a unified resource space and enables VREs.]
Virtual Research Environments
Innovative, web-based, community-oriented, comprehensive, flexible, and
secure working environments.
• Communities are provided with applications to interact with the VRE services
• Client services are provided both with APIs (Java, R) and simple HTTP-REST interfaces
VREs Example
The D4Science e-Infrastructure
D4Science supports scientists in several domains
1. More than 25 000 taxonomic studies per month (www.i-marine.eu)
2. More than 60 000 species distribution maps produced and hosted (www.d4science.eu)
3. Used to build a pan-European geothermal energy map (www.egip.d4science.org)
4. Processing and management of heterogeneous environmental and Earth system data (www.envriplus.eu)
5. Enhances communication and exchange in Linguistic Studies, Humanities, Cultural Heritage, History and Archaeology (www.parthenos-project.eu)
BlueBRIDGE VREs
Stock Assessment
Assess the health status of fisheries stocks.
CMSY model
http://www.bluebridge-vres.eu/services/stock-assessment
Marine Protected Areas
Reduce the adverse impact of human activities
(e.g. fishing, aquaculture, tourism) on
ecosystems, and ensure these activities are
properly embedded in policy frameworks.
http://www.bluebridge-vres.eu/services/protected-area-impact-maps
Education VREs
Lecture-style: the emphasis given to the course topics varies
with the audience
Interactive: after each topic is explained, students run
experiments
Experimental: students reproduce the experiment
shown by the teacher and possibly repeat it on their own
data
Social: students communicate via messaging or the VRE
discussion panel
• 1 course/year in Pisa
• 1 course/year in Paris
• 12 courses in Copenhagen (International Council for the Exploration of the Sea)
• 38 courses all over the world
• 1000+ attendees
www.bluebridge-vres.eu
BlueBRIDGE VREs: Social Networking
Social networking is key to sharing information in an e-Infrastructure.
BlueBRIDGE offers a continuously updated list of events and news produced by users
and applications.
[Figure: news feed showing user-shared news, application-shared news and the Share News function.]
BlueBRIDGE VREs: The Workspace – an online file storage system
A free-to-use, folder-based file system allows managing and sharing
information objects.
Information objects can be
• files, datasets, workflows, experiments, etc.
• organized into folders
• shared
• disseminated via public URLs
BlueBRIDGE Facilities: Overview
• Storage: databases, Cloud storage, geospatial data
• Data management: metadata generation and management, harmonisation, sharing
• Processing: Cloud computing, elastic resources assignment, multi-platform (R, Java, Fortran)
Data Processing
• Experiments on Big Data
• Sharing inputs and results
• Saves the provenance of experiments
• Supports R-R-R of experiments
[Figure: the Cloud Computing Platform is accessed via WPS and REST; for each computation it handles input/output, parameters and provenance.]
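As a rough illustration of what invoking a process through WPS over plain HTTP can look like, the sketch below builds an Execute request in the key-value-pair encoding of the OGC WPS 1.0.0 standard. The endpoint, the process identifier and the input names are hypothetical placeholders; the slides do not give the real BlueBRIDGE addresses, and a real call would also need whatever authentication the platform requires.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only (not a BlueBRIDGE address).
WPS_ENDPOINT = "https://example.org/wps/WebProcessingService"


def build_wps_execute_url(endpoint, identifier, inputs):
    """Build a WPS 1.0.0 Execute request URL in KVP encoding."""
    # DataInputs is a semicolon-separated list of name=value pairs.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": identifier,
        "DataInputs": data_inputs,
    })
    return f"{endpoint}?{query}"


url = build_wps_execute_url(
    WPS_ENDPOINT,
    "org.example.StockAssessment",              # hypothetical process name
    {"Species": "Gadus morhua", "Year": 2015},  # hypothetical parameters
)
print(url)
```

Because the whole experiment is described by one URL (process plus parameters), the same string can be stored as part of the provenance and re-sent later to repeat the computation.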
BlueBRIDGE computational capabilities
Project resources:
• 6 Virtual Machines (VMs) with 16 virtual CPU cores, 16GB of RAM and 100GB of storage
• 100 VMs with 2 virtual CPU cores, 8GB of RAM and 20GB of storage
Processes:
• ~200 algorithms hosted in all the VREs
• ~20 contributing institutes
• ~30,000 requests per month
• ~2000 scientists/students in 44 countries using VREs
• Programming languages: R, Java, Python, Fortran, Linux-compiled
External providers (European Grid Infrastructure):
• 6 VMs: 8 virtual CPU cores, 16GB of RAM and 100GB of storage
• 2 VMs: 16 virtual CPU cores, 32GB of RAM and 100GB of storage
• 24 VMs: 2 virtual CPU cores, 8GB of RAM and 50GB of storage
• 5 VMs: 4 virtual CPU cores, 8GB of RAM and 80GB of disk
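To put the figures above in perspective, the following short tally sums the listed machines. The tuples are copied from the slide; the variable and function names are illustrative only.

```python
# (number of VMs, vCPU cores per VM, RAM in GB per VM, storage in GB per VM),
# copied from the resource lists above.
PROJECT_VMS = [(6, 16, 16, 100), (100, 2, 8, 20)]
EGI_VMS = [(6, 8, 16, 100), (2, 16, 32, 100), (24, 2, 8, 50), (5, 4, 8, 80)]


def totals(pools):
    """Sum VMs, cores, RAM and storage over a list of VM pools."""
    return {
        "vms": sum(n for n, _, _, _ in pools),
        "cores": sum(n * cores for n, cores, _, _ in pools),
        "ram_gb": sum(n * ram for n, _, ram, _ in pools),
        "storage_gb": sum(n * disk for n, _, _, disk in pools),
    }


project = totals(PROJECT_VMS)
external = totals(EGI_VMS)
print(project)   # {'vms': 106, 'cores': 296, 'ram_gb': 896, 'storage_gb': 2600}
print(external)  # {'vms': 37, 'cores': 148, 'ram_gb': 392, 'storage_gb': 2400}
```

Taken together, the listed machines amount to 143 VMs and 444 virtual CPU cores.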
Integrating new processes
Integration: putting a script that works offline into the Cloud
computing platform.
Tools:
https://wiki.gcube-system.org/gcube/How-to_Implement_Algorithms_for_the_Statistical_Manager
https://wiki.gcube-system.org/gcube/Statistical_Algorithms_Importer
[Figure: an R script is imported through SAI (the importing tool); the computing platform then automatically provides a Web interface and a Web service.]
Advantages
• The process is available as-a-Service
• Invoked via communication standards
• Higher computational capabilities
• Automatic creation of a Web interface
• Provenance management
• Storage of results on a high-availability system
• Collaboration and sharing
• Re-usability from other software (e.g. QGIS)
Collaborative experiments
[Figure: inputs, outputs and results are exchanged through shared online folders and a Web service (WS) with the computational system, either in the e-Infrastructure or through third-party software.]
Ensemble Model
Implementation of an ensemble model approach to support advice and management in
fisheries.
Thorpe et al. (2015). Evaluation and management implications of uncertainty in a multispecies size structured
model of population and community responses to fishing. Methods in Ecology and Evolution, 6(1), 49-58.
• Diet Information
• Life history diet information
• Historical fishing scenarios
• MSY fishing scenarios
• Initial abundance values
• Life history prior information
• Total Biomass
• Stock Spawning Biomass
• Life history traits
[Figure: Input → Process (Python script) → Output]
EM Integration
1. Download the Python script and the user’s data
2. Execute the script
3. Collect the output
4. Destroy local copies of I/O and script
5. Save the output on the user’s Workspace, with provenance info
[Figure: the EM Interface runs on an infrastructure machine, using the scientist’s provided script and the user’s data from the user’s private Workspace.]
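The integration steps above can be sketched in a few lines. This is a minimal, assumption-laden illustration and not the BlueBRIDGE implementation: "download" is replaced by a local file copy, the Workspace is an ordinary directory, and the function name `run_experiment`, the file names and the provenance fields are all hypothetical.

```python
import json
import shutil
import subprocess
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def run_experiment(script_src, data_src, workspace):
    """Run a provided script on user data in a scratch directory, then clean up."""
    workspace.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as scratch_name:
        scratch = Path(scratch_name)
        # Step 1: "download" the script and the data (here: a local copy).
        script = scratch / "script.py"
        data = scratch / "input.csv"
        shutil.copy(script_src, script)
        shutil.copy(data_src, data)
        output = scratch / "output.txt"
        # Step 2: execute the script as a separate process.
        subprocess.run([sys.executable, str(script), str(data), str(output)], check=True)
        # Steps 3 and 5: collect the output and save it with provenance info.
        saved = workspace / "output.txt"
        shutil.copy(output, saved)
        provenance = {
            "script": script_src.name,
            "input": data_src.name,
            "executed_at": datetime.now(timezone.utc).isoformat(),
        }
        (workspace / "provenance.json").write_text(json.dumps(provenance, indent=2))
    # Step 4: leaving the "with" block deletes the local copies of I/O and script.
    return saved


# Tiny stand-in for a scientist's script: it counts the rows of the input file.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "model.py").write_text(
    "import sys\n"
    "from pathlib import Path\n"
    "rows = Path(sys.argv[1]).read_text().split()\n"
    "Path(sys.argv[2]).write_text(str(len(rows)))\n"
)
(demo_dir / "catch.csv").write_text("10\n20\n30\n")
result = run_experiment(demo_dir / "model.py", demo_dir / "catch.csv", demo_dir / "workspace")
print(result.read_text())  # 3
```

Running the script in a throwaway directory is what makes the "destroy local copies" step automatic: nothing from one user's run is left on the machine for the next.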
Scientific Workflow
• The script provider updates the script on his/her private Workspace
• The service downloads the script on-the-fly
• A user executes an experiment on his/her data
• The output, the input and the parameters can be shared with another user
• This user can execute the experiment again and share the computation with the other user
Limitations and requirements
[Figure: required (input, script and output kept separate) versus provided (a single script with embedded I/O).]
Issues:
• Code is often designed for one specific data set
• Prototype scripts often have code that is not separable from the I/O
In the context of e-Infrastructures and Science 2.0:
• Modularity is necessary for integration
• Scripts should be re-organised so that they can be re-used on other data without
changing the code
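One way to meet this requirement is to keep the computation free of file names and column names, and to pass the data source in as a parameter. The sketch below is only an illustration of that separation; the function names and the sample data are invented for the example.

```python
import csv
import io


def mean_catch(values):
    """Pure computation: knows nothing about files or column names."""
    values = list(values)
    return sum(values) / len(values)


def read_column(stream, column):
    """I/O layer: read one numeric column from any CSV text stream."""
    return [float(row[column]) for row in csv.DictReader(stream)]


def run(stream, column):
    """Glue code: the data source and the column name are parameters."""
    return mean_catch(read_column(stream, column))


# Two differently shaped data sets go through the same, unchanged code.
dataset_a = io.StringIO("year,catch\n2014,10\n2015,30\n")
dataset_b = io.StringIO("stock,tonnes\ncod,4\nhake,6\nsole,8\n")
print(run(dataset_a, "catch"))   # 20.0
print(run(dataset_b, "tonnes"))  # 6.0
```

A prototype that hard-codes its input path and column would need editing for each new data set; here only the arguments change, which is what lets a platform expose them as process parameters.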
[Figure: self-consistent computational products vs. Web services (WS), relating Repeatability (provenance, Prov-O), Reusability (use of standards) and Reproducibility.]
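To make the link between provenance and repeatability concrete, here is a small sketch of a provenance record. It borrows property names from the W3C PROV-O vocabulary (`prov:used`, `prov:generated`, `prov:wasAssociatedWith`, `prov:endedAtTime`) as dictionary keys, but it is not a valid PROV-O serialisation and does not reflect how BlueBRIDGE stores provenance; all function names and sample values are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of(text):
    """Fingerprint a script, input or output by its content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def provenance_record(script_text, input_text, parameters, output_text, user):
    """Describe one run: what was used, what was generated, by whom and when."""
    return {
        "prov:used": {
            "script": sha256_of(script_text),
            "input": sha256_of(input_text),
        },
        "parameters": dict(parameters),
        "prov:generated": sha256_of(output_text),
        "prov:wasAssociatedWith": user,
        "prov:endedAtTime": datetime.now(timezone.utc).isoformat(),
    }


def same_experiment(a, b):
    """Two runs repeat each other if script, input and parameters match."""
    return a["prov:used"] == b["prov:used"] and a["parameters"] == b["parameters"]


first = provenance_record("print(1)", "10,20", {"years": 5}, "30", "alice")
repeat = provenance_record("print(1)", "10,20", {"years": 5}, "30", "bob")
changed = provenance_record("print(1)", "10,20", {"years": 6}, "31", "bob")
print(same_experiment(first, repeat))   # True
print(same_experiment(first, changed))  # False
```

If a repeated run matches on script, input and parameters but the `prov:generated` fingerprints differ, the experiment was not repeatable, which is exactly the kind of check stored provenance makes possible.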
Conclusions
• E-Infrastructures endow processes with several Science
2.0 features
• BlueBRIDGE offers an e-Infrastructure and resources to
host processes and collaborate
• Effort is required from algorithm providers to comply with
service and generalisation requirements
The BlueBRIDGE approach to collaborative research
1 of 30

Recommended

Virtual research environments for implementing long tail open science by
Virtual research environments for implementing long tail open scienceVirtual research environments for implementing long tail open science
Virtual research environments for implementing long tail open scienceBlue BRIDGE
463 views16 slides
The BlueBRIDGE multidisciplinary & multi-sector approach: challenges and user... by
The BlueBRIDGE multidisciplinary & multi-sector approach: challenges and user...The BlueBRIDGE multidisciplinary & multi-sector approach: challenges and user...
The BlueBRIDGE multidisciplinary & multi-sector approach: challenges and user...Blue BRIDGE
367 views10 slides
What can «blue» do for you: overcoming ICES challenges with BlueBRIDGE tools by
What can «blue» do for you: overcoming ICES challenges with BlueBRIDGE toolsWhat can «blue» do for you: overcoming ICES challenges with BlueBRIDGE tools
What can «blue» do for you: overcoming ICES challenges with BlueBRIDGE toolsBlue BRIDGE
864 views12 slides
Changing data management in Blue Growth by
Changing data management in Blue GrowthChanging data management in Blue Growth
Changing data management in Blue GrowthBlue BRIDGE
248 views11 slides
The EOSC and Blue Growth by
The EOSC and Blue Growth The EOSC and Blue Growth
The EOSC and Blue Growth Blue BRIDGE
460 views9 slides
How BlueBRIDGE data management services can support the marine & maritime sector by
How BlueBRIDGE data management services can support the marine & maritime sectorHow BlueBRIDGE data management services can support the marine & maritime sector
How BlueBRIDGE data management services can support the marine & maritime sectorBlue BRIDGE
546 views12 slides

More Related Content

What's hot

Geo Data Technology Workshop, 25 April 2019, Chris Atherton and Andres Steijaert by
Geo Data Technology Workshop, 25 April 2019, Chris Atherton and Andres SteijaertGeo Data Technology Workshop, 25 April 2019, Chris Atherton and Andres Steijaert
Geo Data Technology Workshop, 25 April 2019, Chris Atherton and Andres SteijaertOCRE | Open Clouds for Research Environments
148 views12 slides
e-research_oz by
e-research_oze-research_oz
e-research_ozCraig Bellamy
227 views25 slides
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure by
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructureeROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructuree-ROSA
62 views13 slides
eROSA Stakeholder WS1: The European Open Science Cloud for Research Pilot Pro... by
eROSA Stakeholder WS1: The European Open Science Cloud for Research Pilot Pro...eROSA Stakeholder WS1: The European Open Science Cloud for Research Pilot Pro...
eROSA Stakeholder WS1: The European Open Science Cloud for Research Pilot Pro...e-ROSA
164 views12 slides
eROSA Stakeholder WS1: EOSC Architecture by
eROSA Stakeholder WS1: EOSC ArchitectureeROSA Stakeholder WS1: EOSC Architecture
eROSA Stakeholder WS1: EOSC Architecturee-ROSA
186 views18 slides
Research data spring: DataVault by
Research data spring: DataVaultResearch data spring: DataVault
Research data spring: DataVaultJisc RDM
5.2K views17 slides

What's hot(20)

eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure by e-ROSA
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructureeROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure
eROSA Stakeholder WS1: EUDAT – The pan-European data infrastructure
e-ROSA62 views
eROSA Stakeholder WS1: The European Open Science Cloud for Research Pilot Pro... by e-ROSA
eROSA Stakeholder WS1: The European Open Science Cloud for Research Pilot Pro...eROSA Stakeholder WS1: The European Open Science Cloud for Research Pilot Pro...
eROSA Stakeholder WS1: The European Open Science Cloud for Research Pilot Pro...
e-ROSA164 views
eROSA Stakeholder WS1: EOSC Architecture by e-ROSA
eROSA Stakeholder WS1: EOSC ArchitectureeROSA Stakeholder WS1: EOSC Architecture
eROSA Stakeholder WS1: EOSC Architecture
e-ROSA186 views
Research data spring: DataVault by Jisc RDM
Research data spring: DataVaultResearch data spring: DataVault
Research data spring: DataVault
Jisc RDM5.2K views
Virtual Research Environments as-a-serive by Blue BRIDGE
Virtual Research Environments as-a-seriveVirtual Research Environments as-a-serive
Virtual Research Environments as-a-serive
Blue BRIDGE203 views
Virtual Research Environments supporting tailor-made data management service... by Blue BRIDGE
Virtual Research Environments supporting tailor-made data management service...Virtual Research Environments supporting tailor-made data management service...
Virtual Research Environments supporting tailor-made data management service...
Blue BRIDGE518 views
How ICES uses BlueBRIDGE tools to overcome our challenges by Blue BRIDGE
How ICES uses BlueBRIDGE tools to overcome our challenges How ICES uses BlueBRIDGE tools to overcome our challenges
How ICES uses BlueBRIDGE tools to overcome our challenges
Blue BRIDGE285 views
Leveraging ICT developments for societal challenges: the BlueBRIDGE way towar... by Blue BRIDGE
Leveraging ICT developments for societal challenges: the BlueBRIDGE way towar...Leveraging ICT developments for societal challenges: the BlueBRIDGE way towar...
Leveraging ICT developments for societal challenges: the BlueBRIDGE way towar...
Blue BRIDGE235 views
BlueBRIDGE solutions for Fisheries Data Working Group by Blue BRIDGE
BlueBRIDGE solutions for Fisheries Data Working GroupBlueBRIDGE solutions for Fisheries Data Working Group
BlueBRIDGE solutions for Fisheries Data Working Group
Blue BRIDGE149 views
EGI - Open Data Platform by EUDAT
EGI - Open Data PlatformEGI - Open Data Platform
EGI - Open Data Platform
EUDAT64 views
DMP exercise: linking data management activities to services - EUDAT Summer ... by EUDAT
DMP exercise: linking data management activities to services  - EUDAT Summer ...DMP exercise: linking data management activities to services  - EUDAT Summer ...
DMP exercise: linking data management activities to services - EUDAT Summer ...
EUDAT253 views
Sshoc kick off meeting - 1.4.2 EOSC-Life: an open collaborative space for dig... by SSHOC
Sshoc kick off meeting - 1.4.2 EOSC-Life: an open collaborative space for dig...Sshoc kick off meeting - 1.4.2 EOSC-Life: an open collaborative space for dig...
Sshoc kick off meeting - 1.4.2 EOSC-Life: an open collaborative space for dig...
SSHOC49 views
The European Cluster Observatory: Measuring the performance of clusters by Göran Lindqvist
The European Cluster Observatory: Measuring the performance of clustersThe European Cluster Observatory: Measuring the performance of clusters
The European Cluster Observatory: Measuring the performance of clusters
Göran Lindqvist328 views

Similar to The BlueBRIDGE approach to collaborative research

Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR) by
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Blue BRIDGE
433 views44 slides
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co... by
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...Blue BRIDGE
589 views32 slides
Using e-Infrastructures for Biodiversity Conservation by
Using e-Infrastructures for Biodiversity ConservationUsing e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity ConservationBlue BRIDGE
848 views65 slides
D4Science: An e-Infrastructure for Facilitating Fisheries and Aquaculture Re... by
D4Science:An e-Infrastructure for Facilitating Fisheries and Aquaculture Re...D4Science:An e-Infrastructure for Facilitating Fisheries and Aquaculture Re...
D4Science: An e-Infrastructure for Facilitating Fisheries and Aquaculture Re...FAO
279 views33 slides
D4science-II Codata by
D4science-II CodataD4science-II Codata
D4science-II CodataFAO
275 views33 slides
10th e concertation-brussels-06march2013-v2 by
10th e concertation-brussels-06march2013-v210th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v2Alex Hardisty
544 views11 slides

Similar to The BlueBRIDGE approach to collaborative research(20)

Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR) by Blue BRIDGE
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Blue BRIDGE433 views
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co... by Blue BRIDGE
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Blue BRIDGE589 views
Using e-Infrastructures for Biodiversity Conservation by Blue BRIDGE
Using e-Infrastructures for Biodiversity ConservationUsing e-Infrastructures for Biodiversity Conservation
Using e-Infrastructures for Biodiversity Conservation
Blue BRIDGE848 views
D4Science: An e-Infrastructure for Facilitating Fisheries and Aquaculture Re... by FAO
D4Science:An e-Infrastructure for Facilitating Fisheries and Aquaculture Re...D4Science:An e-Infrastructure for Facilitating Fisheries and Aquaculture Re...
D4Science: An e-Infrastructure for Facilitating Fisheries and Aquaculture Re...
FAO279 views
D4science-II Codata by FAO
D4science-II CodataD4science-II Codata
D4science-II Codata
FAO275 views
10th e concertation-brussels-06march2013-v2 by Alex Hardisty
10th e concertation-brussels-06march2013-v210th e concertation-brussels-06march2013-v2
10th e concertation-brussels-06march2013-v2
Alex Hardisty544 views
IDB-Cloud Providing Bioinformatics Services on Cloud by stratuslab
IDB-Cloud Providing Bioinformatics Services on CloudIDB-Cloud Providing Bioinformatics Services on Cloud
IDB-Cloud Providing Bioinformatics Services on Cloud
stratuslab877 views
Internet2 Bio IT 2016 v2 by Dan Taylor
Internet2 Bio IT 2016 v2Internet2 Bio IT 2016 v2
Internet2 Bio IT 2016 v2
Dan Taylor342 views
Supporting open science oriented skills building by virtual research environm... by Blue BRIDGE
Supporting open science oriented skills building by virtual research environm...Supporting open science oriented skills building by virtual research environm...
Supporting open science oriented skills building by virtual research environm...
Blue BRIDGE285 views
RDMkit, a Research Data Management Toolkit. Built by the Community for the ... by Carole Goble
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
Carole Goble712 views
Archiver at CS3 - Cloud Storage Synchronization and Sharing Services by Archiver
Archiver at CS3 - Cloud Storage Synchronization and Sharing ServicesArchiver at CS3 - Cloud Storage Synchronization and Sharing Services
Archiver at CS3 - Cloud Storage Synchronization and Sharing Services
Archiver 167 views
Hybrid Cloud storage deployment models: ARCHIVER presentation at the CS3 Work... by Archiver
Hybrid Cloud storage deployment models: ARCHIVER presentation at the CS3 Work...Hybrid Cloud storage deployment models: ARCHIVER presentation at the CS3 Work...
Hybrid Cloud storage deployment models: ARCHIVER presentation at the CS3 Work...
Archiver 126 views
The AGINFRA+ Virtual Research Environment (VRE) by AGINFRA
The AGINFRA+ Virtual Research Environment (VRE)The AGINFRA+ Virtual Research Environment (VRE)
The AGINFRA+ Virtual Research Environment (VRE)
AGINFRA101 views
FAIR Computational Workflows by Carole Goble
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
Carole Goble493 views
Data-intensive applications on cloud computing resources: Applications in lif... by Ola Spjuth
Data-intensive applications on cloud computing resources: Applications in lif...Data-intensive applications on cloud computing resources: Applications in lif...
Data-intensive applications on cloud computing resources: Applications in lif...
Ola Spjuth433 views
WEBINAR: "How to manage your data to make them open and fair" by OpenAIRE
WEBINAR:  "How to manage your data to make them open and fair"  WEBINAR:  "How to manage your data to make them open and fair"
WEBINAR: "How to manage your data to make them open and fair"
OpenAIRE1.7K views
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A... by BigData_Europe
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
BigData_Europe423 views

More from Blue BRIDGE

PerformFISH: Consumer Driven Production - Integrating Innovative Approaches f... by
PerformFISH: Consumer Driven Production - Integrating Innovative Approaches f...PerformFISH: Consumer Driven Production - Integrating Innovative Approaches f...
PerformFISH: Consumer Driven Production - Integrating Innovative Approaches f...Blue BRIDGE
649 views16 slides
BlueBRIDGE supporting education by
BlueBRIDGE supporting educationBlueBRIDGE supporting education
BlueBRIDGE supporting educationBlue BRIDGE
287 views29 slides
LME: LEARN & IOC Capacity Building Activities by
LME: LEARN & IOC Capacity Building ActivitiesLME: LEARN & IOC Capacity Building Activities
LME: LEARN & IOC Capacity Building ActivitiesBlue BRIDGE
251 views14 slides
Machine Learning methods to estimate the performance of aquafarms by
Machine Learning methods to estimate the performance of aquafarms Machine Learning methods to estimate the performance of aquafarms
Machine Learning methods to estimate the performance of aquafarms Blue BRIDGE
396 views26 slides
Environmental observation data to detect aquaculture structures: merging Cope... by
Environmental observation data to detect aquaculture structures: merging Cope...Environmental observation data to detect aquaculture structures: merging Cope...
Environmental observation data to detect aquaculture structures: merging Cope...Blue BRIDGE
564 views16 slides
Application of Earth Observation (EO) Data for Detection, Characterization an... by
Application of Earth Observation (EO) Data for Detection, Characterization an...Application of Earth Observation (EO) Data for Detection, Characterization an...
Application of Earth Observation (EO) Data for Detection, Characterization an...Blue BRIDGE
349 views16 slides

More from Blue BRIDGE(20)

PerformFISH: Consumer Driven Production - Integrating Innovative Approaches f... by Blue BRIDGE
PerformFISH: Consumer Driven Production - Integrating Innovative Approaches f...PerformFISH: Consumer Driven Production - Integrating Innovative Approaches f...
PerformFISH: Consumer Driven Production - Integrating Innovative Approaches f...
Blue BRIDGE649 views
BlueBRIDGE supporting education by Blue BRIDGE
BlueBRIDGE supporting educationBlueBRIDGE supporting education
BlueBRIDGE supporting education
Blue BRIDGE287 views
LME: LEARN & IOC Capacity Building Activities by Blue BRIDGE
LME: LEARN & IOC Capacity Building ActivitiesLME: LEARN & IOC Capacity Building Activities
LME: LEARN & IOC Capacity Building Activities
Blue BRIDGE251 views
Machine Learning methods to estimate the performance of aquafarms by Blue BRIDGE
Machine Learning methods to estimate the performance of aquafarms Machine Learning methods to estimate the performance of aquafarms
Machine Learning methods to estimate the performance of aquafarms
Blue BRIDGE396 views
Environmental observation data to detect aquaculture structures: merging Cope... by Blue BRIDGE
Environmental observation data to detect aquaculture structures: merging Cope...Environmental observation data to detect aquaculture structures: merging Cope...
Environmental observation data to detect aquaculture structures: merging Cope...
Blue BRIDGE564 views
Application of Earth Observation (EO) Data for Detection, Characterization an... by Blue BRIDGE
Application of Earth Observation (EO) Data for Detection, Characterization an...Application of Earth Observation (EO) Data for Detection, Characterization an...
Application of Earth Observation (EO) Data for Detection, Characterization an...
Blue BRIDGE349 views
Capacity building, validation and repeatability by Blue BRIDGE
Capacity building, validation and repeatabilityCapacity building, validation and repeatability
Capacity building, validation and repeatability
Blue BRIDGE251 views
Fostering global data management with public tuna fisheries data by Blue BRIDGE
Fostering global data management with public tuna fisheries dataFostering global data management with public tuna fisheries data
Fostering global data management with public tuna fisheries data
Blue BRIDGE281 views
Understanding biodiversity features in marine protected areas by Blue BRIDGE
Understanding biodiversity features in marine protected areasUnderstanding biodiversity features in marine protected areas
Understanding biodiversity features in marine protected areas
Blue BRIDGE228 views
Panel discussion on Global Repositories of Merged Public Data by Blue BRIDGE
Panel discussion on Global Repositories of Merged Public DataPanel discussion on Global Repositories of Merged Public Data
Panel discussion on Global Repositories of Merged Public Data
Blue BRIDGE192 views
Invasive species and climate change by Blue BRIDGE
Invasive species and climate changeInvasive species and climate change
Invasive species and climate change
Blue BRIDGE479 views
The BIG picture - Advanced data visualization for SDG, basic stock assessment... by Blue BRIDGE
The BIG picture - Advanced data visualization for SDG, basic stock assessment...The BIG picture - Advanced data visualization for SDG, basic stock assessment...
The BIG picture - Advanced data visualization for SDG, basic stock assessment...
Blue BRIDGE174 views
Global Record of Stocks and Fisheries (GRFS) by Blue BRIDGE
Global Record of Stocks and Fisheries (GRFS)Global Record of Stocks and Fisheries (GRFS)
Global Record of Stocks and Fisheries (GRFS)
Blue BRIDGE213 views
Projecting global fish stocks and catches up to 2100 by Blue BRIDGE
Projecting global fish stocks and catches up to 2100Projecting global fish stocks and catches up to 2100
Projecting global fish stocks and catches up to 2100
Blue BRIDGE184 views
BlueBRIDGE: Major Achievements & future vision by Blue BRIDGE
BlueBRIDGE: Major Achievements & future visionBlueBRIDGE: Major Achievements & future vision
BlueBRIDGE: Major Achievements & future vision
Blue BRIDGE216 views
Managing tuna fisheries data at a global scale: the Tuna Atlas VRE by Blue BRIDGE
Managing tuna fisheries data at a global scale: the Tuna Atlas VREManaging tuna fisheries data at a global scale: the Tuna Atlas VRE
Managing tuna fisheries data at a global scale: the Tuna Atlas VRE
Blue BRIDGE807 views
SeaDataCloud – further developing the pan-European SeaDataNet infrastructure ... by Blue BRIDGE
SeaDataCloud – further developing the pan-European SeaDataNet infrastructure ...SeaDataCloud – further developing the pan-European SeaDataNet infrastructure ...
SeaDataCloud – further developing the pan-European SeaDataNet infrastructure ...
Blue BRIDGE208 views
The BlueBRIDGE Project - Pasquale Pagano by Blue BRIDGE
The BlueBRIDGE Project - Pasquale PaganoThe BlueBRIDGE Project - Pasquale Pagano
The BlueBRIDGE Project - Pasquale Pagano
Blue BRIDGE163 views
Thematic clouds for EOSC : The Food Cloud and the Blue Cloud by Blue BRIDGE
Thematic clouds for EOSC: The Food Cloud and the Blue Cloud�Thematic clouds for EOSC: The Food Cloud and the Blue Cloud�
Thematic clouds for EOSC : The Food Cloud and the Blue Cloud
Blue BRIDGE403 views

Recently uploaded

Inawisdom Quick Sight by
Inawisdom Quick SightInawisdom Quick Sight
Inawisdom Quick SightPhilipBasford
8 views27 slides
Customer Data Cleansing Project.pptx by

The BlueBRIDGE approach to collaborative research

  • 1. BlueBRIDGE receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 675680 www.bluebridge-vres.eu The BlueBRIDGE approach to collaborative research Gianpaolo Coro CNR, Italy gianpaolo.coro@isti.cnr.it
  • 2. Context: Progress in Information Technology has changed the paradigms of Science • The large and fast increase of volume and complexity of data requires new approaches to collect-curate-analyse the data • This requires new tools to guarantee exchange and longevity of the data and of the reapplication of the experiments
  • 3. Big Data • Large volume • High generation velocity • Large variety • Untrustworthy (veracity) • High complexity (variability) Big Data: a dataset with large volume, variety, generation velocity, containing complex and untrustworthy information that requires nonconventional methods to extract, manage and process information within a reasonable time.
  • 4. New Science Paradigms • Open Science: make scientific research, data and dissemination accessible to all levels of an inquiring society, amateur or professional. Keywords: Open Access, Open research, Open Notebook Science • E-Science: computationally intensive science is carried out in highly distributed network environments that use large data sets and require distributed computing and collaborative tools. Keywords: Provenance of the scientific process, Scientific workflows • Science 2.0: process and publish large data sets using a collaborative approach. Share from raw data to experimental results and processes. Support collaborative experiments and Reproducibility-Repeatability-Reusability (R-R-R) of Science. Keywords: collaborative and repeatable Science
  • 5. Requirements for IT systems • Support collaborative research and experimentation • Implement Reproducibility-Repeatability-Reusability of Science • Allow sharing data, processes and findings • Grant free access to the produced scientific knowledge • Tackle Big Data challenges • Sustainability: low operational costs, low maintenance prices • Manage heterogeneous data/processes access policies • Meet industrial processes requirements
  • 6. e-Infrastructures e-Infrastructures enable researchers at different locations across the world to collaborate in the context of their home institutions or in national or multinational scientific initiatives. • People can work together having shared access to unique or distributed scientific facilities (including data, instruments, computing and communications). Examples: Belief, http://www.beliefproject.org/ OpenAire, http://www.openaire.eu/ i-Marine, http://www.i-marine.eu/ EU-Brazil OpenBio, http://www.eubrazilopenbio.eu/
  • 7. Virtual Research Environments • Define sub-communities • Allow temporary dedicated assignment of computational, storage, and data resources • Manage policies • Support data and information sharing [Diagram labels: e-Infrastructure, Unified Resource Space, VREs, WPS, External e-Infrastructures; arrows: Integrates, Enables]
  • 8. Virtual Research Environments Innovative, web-based, community-oriented, comprehensive, flexible, and secure working environments. • Communities are provided with applications to interact with the VRE services • Client services are provided both with APIs (Java, R) and simple HTTP-REST interfaces
  • 9. VREs Example: The D4Science e-Infrastructure. D4Science supports scientists in several domains: 1. More than 25 000 taxonomic studies per month (www.i-marine.eu) 2. More than 60 000 species distribution maps produced and hosted (www.d4science.eu) 3. Used to build a pan-European geothermal energy map (www.egip.d4science.org) 4. Processing and management of heterogeneous environmental and Earth system data (www.envriplus.eu) 5. Enhances communication and exchange in Linguistic Studies, Humanities, Cultural Heritage, History and Archaeology (www.parthenos-project.eu)
  • 10. BlueBRIDGE VREs. Stock Assessment (CMSY model): assess the health status of fisheries stocks. http://www.bluebridge-vres.eu/services/stock-assessment Marine Protected Areas: reduce adverse impact of human activities (e.g. fishing, aquaculture, tourism) on ecosystems, and ensure these activities are properly embedded in policy frameworks. http://www.bluebridge-vres.eu/services/protected-area-impact-maps
  • 11. Education VREs. Lecture-style: the emphasis on course topics varies with the audience. Interactive: after each explained topic, students do experiments. Experimental: students reproduce the experiment shown by the teacher and possibly repeat it on their own data. Social: students communicate via messaging or the VRE discussion panel. • 1 course/year in Pisa • 1 course/year in Paris • 12 courses in Copenhagen, International Council for the Exploration of the Sea • 38 courses all over the world • 1000+ attendees. www.bluebridge-vres.eu
  • 12. BlueBRIDGE VREs: Social Networking. Social networking is key to sharing information in an e-Infrastructure. BlueBRIDGE offers a continuously updated list of events / news produced by users and applications [Diagram labels: User-shared News, Application-shared News, Share News]
  • 13. BlueBRIDGE VREs: The Workspace, an online file storage system. A free-to-use, folder-based file system allows managing and sharing information objects. Information objects can be • files, datasets, workflows, experiments, etc. • organized into folders • shared • disseminated via public URLs
  • 14. BlueBRIDGE Facilities: Overview • Storage: databases, cloud storage, geospatial data • Data management: metadata generation and management, harmonisation, sharing • Processing: cloud computing, elastic resources assignment, multi-platform (R, Java, Fortran)
  • 16. Cloud Computing Platform • Experiments on Big Data • Sharing inputs and results • Saves the provenance of experiments • Supports R-R-R of experiments [Diagram labels: WPS, REST; Input/Output, Parameters, Provenance]
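The slide names WPS and REST as the access routes to the platform. As a minimal sketch, assuming the platform accepts OGC WPS 1.0.0 key-value Execute requests, the following Python builds such a request URL without sending it. The endpoint, the process identifier and the input names are hypothetical placeholders, not values taken from BlueBRIDGE.

```python
from urllib.parse import urlencode


def build_execute_url(endpoint, process_id, inputs):
    """Build a WPS 1.0.0 key-value Execute URL (nothing is sent over the network)."""
    # WPS 1.0.0 packs all inputs into one DataInputs value: name=value pairs joined by ';'.
    # This simple join assumes names and values contain neither ';' nor '='.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode(
        {
            "service": "WPS",
            "version": "1.0.0",
            "request": "Execute",
            "identifier": process_id,
            "DataInputs": data_inputs,
        }
    )
    return f"{endpoint}?{query}"


# Hypothetical endpoint, process identifier and parameters, for illustration only.
url = build_execute_url(
    "https://example.org/wps",
    "org.example.ENSEMBLE_MODEL",
    {"Scenario": "MSY", "Years": 10},
)
print(url)
```

Because the inputs and parameters travel in the request itself, the same URL can be stored alongside the results as a simple record of how a computation was invoked.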
  • 17. BlueBRIDGE computational capabilities. Project resources: • 6 Virtual Machines (VMs) with 16 virtual CPU cores, 16 GB of RAM and 100 GB of storage • 100 VMs with 2 virtual CPU cores, 8 GB of RAM and 20 GB of storage. Processes: • ~200 algorithms hosted in all the VREs • ~20 contributing institutes • ~30,000 requests per month • ~2000 scientists/students in 44 countries using VREs • Programming languages: R, Java, Python, Fortran, Linux-compiled. External providers (European Grid Infrastructure): • 6 VMs: 8 virtual CPU cores, 16 GB of RAM and 100 GB of storage • 2 VMs: 16 virtual CPU cores, 32 GB of RAM and 100 GB of storage • 24 VMs: 2 virtual CPU cores, 8 GB of RAM and 50 GB of storage • 5 VMs: 4 virtual CPU cores, 8 GB of RAM and 80 GB of disk
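The per-group figures on slide 17 can be totalled with a few lines of Python. The sketch below only adds up the numbers stated on the slide; the `VmGroup` type and the `totals` helper are illustrative names, not part of any BlueBRIDGE software.

```python
from typing import NamedTuple


class VmGroup(NamedTuple):
    count: int       # number of identical VMs in the group
    cores: int       # virtual CPU cores per VM
    ram_gb: int      # RAM per VM, in GB
    storage_gb: int  # storage per VM, in GB


# Figures copied from slide 17.
PROJECT = [VmGroup(6, 16, 16, 100), VmGroup(100, 2, 8, 20)]
EGI = [
    VmGroup(6, 8, 16, 100),
    VmGroup(2, 16, 32, 100),
    VmGroup(24, 2, 8, 50),
    VmGroup(5, 4, 8, 80),
]


def totals(groups):
    """Sum VM counts and per-VM resources multiplied by the VM count."""
    return {
        "vms": sum(g.count for g in groups),
        "cores": sum(g.count * g.cores for g in groups),
        "ram_gb": sum(g.count * g.ram_gb for g in groups),
        "storage_gb": sum(g.count * g.storage_gb for g in groups),
    }


print(totals(PROJECT))  # {'vms': 106, 'cores': 296, 'ram_gb': 896, 'storage_gb': 2600}
print(totals(EGI))      # {'vms': 37, 'cores': 148, 'ram_gb': 392, 'storage_gb': 2400}
```

Taken together, the slide describes 143 VMs with 444 virtual CPU cores across project and external resources.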
  • 18. Integrating new processes. Integration: putting a script that works offline into the Cloud computing platform. Tools: https://wiki.gcube-system.org/gcube/How-to_Implement_Algorithms_for_the_Statistical_Manager https://wiki.gcube-system.org/gcube/Statistical_Algorithms_Importer [Diagram labels: R script, SAI - Importing tool, Computing platform, Web interface and Web service, Automatic]
  • 19. Advantages • The process is available as-a-Service • Invoked via communication standards • Higher computational capabilities • Automatic creation of a Web interface • Provenance management • Storage of results on a high-availability system • Collaboration and sharing • Re-usability, e.g. from other software (e.g. QGIS)
  • 20. Collaborative experiments [Diagram labels: WS, Shared online folders, Inputs, Outputs, Results, Computational system, In the e-Infrastructure, Through third party software]
  • 21. Ensemble Model. Implementation of an ensemble model approach to support advice and management in fisheries. Thorpe et al. (2015). Evaluation and management implications of uncertainty in a multispecies size structured model of population and community responses to fishing. Methods in Ecology and Evolution, 6(1), 49-58. • Diet Information • Life history diet information • Historical fishing scenarios • MSY fishing scenarios • Initial abundance values • Life history prior information • Total Biomass • Stock Spawning Biomass • Life history traits [Diagram labels: Input, Output, Process, Python script]
  • 22. EM Integration: (1) Download the Python script and the user's data (2) Execute the script (3) Collect the output (4) Destroy local copies of I/O and script (5) Save the output on the user's Workspace, with provenance info [Diagram labels: Scientist's provided script, User's data, Infrastructure machine]
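The cycle on slide 22 can be sketched in Python as follows. This is an illustration under stated assumptions, not the BlueBRIDGE implementation: local file copies stand in for the downloads, a local folder stands in for the user's Workspace, the script is assumed to be called as `script <input> <output>`, and the provenance fields are hypothetical.

```python
import hashlib
import json
import shutil
import subprocess
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def run_experiment(script: Path, data: Path, workspace: Path) -> Path:
    """Run `script` on `data` in a throwaway directory and keep only the results."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        # 1. "Download" the script and the user's data (plain copies in this sketch).
        local_script = Path(shutil.copy(script, tmp_dir / "script.py"))
        local_data = Path(shutil.copy(data, tmp_dir / "input.dat"))
        local_output = tmp_dir / "output.dat"
        # 2. Execute the script (assumed convention: script <input> <output>).
        subprocess.run(
            [sys.executable, str(local_script), str(local_data), str(local_output)],
            check=True,
        )
        # 3. Collect the output and describe how it was produced.
        output_bytes = local_output.read_bytes()
        provenance = {  # hypothetical fields, not a Prov-O document
            "script": script.name,
            "script_sha256": hashlib.sha256(local_script.read_bytes()).hexdigest(),
            "input": data.name,
            "input_sha256": hashlib.sha256(local_data.read_bytes()).hexdigest(),
            "executed_at": datetime.now(timezone.utc).isoformat(),
        }
    # 4. Leaving the `with` block has destroyed the local copies of I/O and script.
    # 5. Save the output, with provenance info, in the stand-in Workspace folder.
    workspace.mkdir(parents=True, exist_ok=True)
    saved = workspace / "output.dat"
    saved.write_bytes(output_bytes)
    (workspace / "provenance.json").write_text(json.dumps(provenance, indent=2))
    return saved
```

Hashing the script and the input at execution time lets a later user check that a repeated run really used the same code and data.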
  • 27. Scientific Workflow • The script provider updates the script on their private Workspace • The service downloads the script on-the-fly • A user executes an experiment on his/her data • The output, the input and the parameters can be shared with another user • This user can execute the experiment again and share the computation with the other user
  • 28. Limitations and requirements [Diagram labels: Required: Input, Script, Output; vs Provided: Script] Issues: • Code is often designed for one precise data set • Often, prototype scripts have code that is not separable from the I/O. In the context of e-Infrastructures and Science 2.0: • Modularity is necessary for integration • Scripts should be re-organised so that they can be re-used on other data without changing the code
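The modularity requirement on slide 28 can be illustrated with a small Python sketch in which the computation is kept apart from the I/O. The statistic computed, the CSV layout and the default column name `value` are all hypothetical; the point is only that the file paths arrive as parameters instead of being written into the code.

```python
import argparse
import csv
import statistics
from pathlib import Path


def summarise(values):
    """Pure computation: knows nothing about files, paths or a specific data set."""
    return {"n": len(values), "mean": statistics.fmean(values), "max": max(values)}


def read_column(path, column):
    """I/O only: read one numeric column from a CSV file with a header row."""
    with open(path, newline="") as handle:
        return [float(row[column]) for row in csv.DictReader(handle)]


def main(argv=None):
    """Glue: input, output and column name are parameters, never hard-coded."""
    parser = argparse.ArgumentParser(description="Summarise one CSV column.")
    parser.add_argument("input", type=Path)
    parser.add_argument("output", type=Path)
    parser.add_argument("--column", default="value")
    args = parser.parse_args(argv)

    result = summarise(read_column(args.input, args.column))
    with open(args.output, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(result))
        writer.writeheader()
        writer.writerow(result)
    return result
```

Calling `main()` under the usual `if __name__ == "__main__":` guard would make the script runnable from a command line, and the same file could then be re-used on another data set by changing only its arguments.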
  • 29. Conclusions • E-Infrastructures endow processes with several Science 2.0 features • BlueBRIDGE offers an e-Infrastructure and resources to host processes and collaborate • Effort is required from algorithm providers to comply with service and generalisation requirements [Diagram labels: WS, Self-consistent comp. products, Repeatability, Provenance, Prov-O, Reusability, Use of standards, Reproducibility]