SlideShare a Scribd company logo
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
SSHOC
Dataverse in the
European Open Science Cloud
Automated CI/CD Testing, Installation and Deployment
Slava Tykhonov (DANS-KNAW)
lead software engineer, DANS R&D
Dataverse Community Meeting 2021
14 June 2021
Harvard University
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Overview
● SSHOC in the European Open Science Cloud (EOSC)
● CESSDA Maturity Model
● Docker
● Kubernetes
● Serverless
● Community-driven QA plan
● CI/CD and test automation with pyDataverse
● Quality Assurance as a Service, EOSC Synergy
● Q&A
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Duration: 40 months
(January 2019 – 30 April 2022)
Partners: 47
(20 beneficiaries + 27 LTPs)
SSH ESFRI Landmarks and Projects
& international SSH data infrastructures
Project budget:
€ 14,455,594.08
Type of action & funding:
Research and Innovation action
(INFRAEOSC-04-2018)
Project website:
www.SSHOpenCloud.eu
Objectives:
• creating the social sciences and humanities (SSH) part of European Open Science Cloud (EOSC)
• maximising re-use through Open Science and FAIR principles (standards, common catalogue, access control, semantic techniques, training)
• interconnecting existing and new infrastructures (clustered cloud infrastructure)
• establishing appropriate governance model for SSH-EOSC
Project:
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
EOSC Synergy
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
DANS Data Stations - Future Data Services
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Interoperability in EOSC
● Technical interoperability defined as the “ability of different information technology
systems and software applications to communicate and exchange data”. It should allow
“to accept data from each other and perform a given task in an appropriate and
satisfactory manner without the need for extra operator intervention”.
● Semantic interoperability is “the ability of computer systems to transmit data with
unambiguous, shared meaning. Semantic interoperability is a requirement to enable
machine computable logic, inferencing, knowledge discovery, and data”.
● Organisational interoperability refers to the “way in which organisations align their
business processes, responsibilities and expectations to achieve commonly agreed and
mutually beneficial goals. Focus on the requirements of the user community by making
services available, easily identifiable, accessible and user-focused”.
● Legal interoperability covers “the broader environment of laws, policies, procedures
and cooperation agreements”
Source: EOSC Interoperability Framework v1.0
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
SSHOC task 5.2 Hosting and sharing data repositories
Objective
Development of a research data repository service on EOSC, for
SSH institutions currently without such a facility for their
designated communities
Deliverables
After 38 months: Data repository service running on EOSC
After 40 months: Report on principles of governance and
sustainability of the data repository service
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Services in European Open Science Cloud (EOSC)
● EOSC requires the level 8 of maturity (at
least)
● we need the highest quality of software to
be accepted as a service
● clear and transparent evaluation of
services is essential
● the evidence of technical maturity is the
key to success
● the limited warranty will allow to stop out-
of-warranty services
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Dataverse deployment in CESSDA Cloud
Docker Compose for the local development and testing
Kubernetes (K8s) for the Cloud deployment of services
(CI/CD pipeline with Jenkins and Helm)
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Dataverse Docker module (CESSDA Dataverse, 2018)
Source: https://github.com/IQSS/dataverse-docker
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Docker introduction
• Extremely powerful configuration tool
• Allows to install software on any platform (Linux, Mac, Windows)
• Any software can be installed from Docker as standalone container or container
delivering Microservices (database, search engine, core service)
• Docker allows to host unlimited amount of the same software tools on different
ports
• Docker can be used to organise multilingual interfaces, for example
• Deployments with orchestrator tools: Docker Swarm, Kubernetes
• Faster development and deployments
• Isolation of running containers allows to scale up apps
• Portability saves time to run the same image on the local computer or in the cloud
• Snapshotting allows to archive Docker images state
• Resource limitation can be adjusted
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Dataverse Kubernetes
Project maintained by Oliver Bertuch (FZ Julich) and available in Global Dataverse
Community Consortium github (GDCC)
Google Cloud, Amazon AWS, Microsoft Azure platforms supported
Open Source, community pull requests are welcome
http://github.com/IQSS/dataverse-kubernetes
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
How to deploy Dataverse on Google Cloud Platform
Resources needed to deploy Dataverse:
» GCP VM-instance(s)
» Kubernetes (K8S) cluster
» Docker images
» K8S Services
» K8S Deployments (workloads)
» Load Balancer or Ingress (SSL certificates)
» Mail relay (default port 25 is blocked)
» Persistent volumes (storage) for the Solr, Postgres, Dataverse and SSL services
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Integration with Google Cloud
Google Cloud Platform
» GCloud
a command-line (CLI) tool for running commands against Google Cloud Platform.
GCloud is part of the Google Cloud SDK
Kubernetes cluster
» Kubectl
A command line interface (CLI) for running commands against Kubernetes clusters
» Use of .yaml configuration files to describe the desired resources
Note: Kubernetes software is used by all major Cloud providers, the different is the storage
specification!
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Dataverse Cloud Architecture
Ingress
HTTP(S) Load Balancer
Kubernetes Engine
Dataverse Service
Kubernetes Cluster
K8S Cluster Node
Dataverse Deployment Dataverse Service
Solr Deployment Solr Service
PostgreSQL
Service
PostgreSQL Deployment
Users
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
How to scale up Kubernetes horizontally
Kubernetes Engine
Compute Engine
Dataverse Service
Kubernetes Cluster
K8S Cluster Node1
Users
K8S Cluster Node2
Docker Hub
Container Registry
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
The importance of Persistent Storage
Docker containers write files to disk (I/O) for state or storage, both in /data and
/docroot folders. If a Docker container is restarted for some reason, all data will be
lost.
Solution: mount Persistent storage into the container on external disk hosted in the
Cloud.
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
SQA process with Selenium tests for Dataverse
Selenium IDE allows
to create and replay
all UI tests in your
browser
Shared tests can be
reused by community
to increase
reproducibility
SQA for the service maturity = unit tests + integration tests
http://github.com/GlobalDataverseCommunityConsortium/dataverse-test-suite
18
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
CI/CD pipeline with SQAaaS (S)
1
2
3
git
push
Push GCP
container
registry
webhook
Create
docker
image
Kubernetes
Deployment
git clone
Jenkins pipeline (Jenkinsfile)
9
7
Run SQA
S 8
1. Developer pushes code to GitHub
2. Jenkins receives notification - build trigger
3. Jenkins clones the workspace
4. (S) Runs SQA tests and does FAIRness check
5. (S) Issuing digital badge according to the results
6. (S) SQAaaS API triggers appropriate workflow
7. Creates docker image if success
8. Pushes new docker image to container registry
9. Updates the kubernetes deployment
19
Source: EOSC Synergy project
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Dataverse App Store
Data preview: DDI Explorer, Spreadsheet/CSV, PDF, Text files, HTML, Images, video render, audio,
JSON, GeoJSON/Shapefiles/Map, XML
Interoperability: external controlled vocabularies (SKOSMOS, CESSDA CV Manager)
Data processing: NESSTAR DDI migration tool
Linked Data: RDF compliance including SPARQL endpoint (FDP)
Multilingual support: Weblate as a Service http://translation.cessda.eu
Federated login: eduGAIN, EGI Check-in, PIONIER ID
CLARIN Switchboard integration: Natural Language Processing tools
Visualization tools (maps, charts, timelines), Apache Superset integration
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
GUI translation with Weblate as a service
Source: SSHOC Weblate
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Dataverse and CLARIN tools integration
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Dataverse Apps in Serverless - Function as a Service
Infrastructure maintenance = human resources and money.
Are you sure you need a server to provide all of your services?
The Serverless application framework allows to upload your code to the Cloud
space and it will provision and run it without any extra efforts! Scaling is done
automatically based on the workfload.
Costs - only pay for the time when your application or computing process is running.
AWS Lambda can be used to spin up application based on triggers/events.
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Serverless support by leading Cloud Providers
Function as a Service (FaaS) model:
Main costs:
1. Number of requests: typically billed for every million requests per month
2. Compute time: this is a measurement of the duration of the life of the function and the
amount of memory provisioned, measured in GB-seconds (higher-memory execution will
cost more than lower-memory execution of the same duration)
Source: Serverless Showdown
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Serverless Use Cases for Dataverse
Remember: inactive Serverless application will require some additional time to
respond to a request. This initial delay encountered is known as a “cold start”.
● Can we run Dataverse instance as Serverless app?
Yes, but just for demo, Kubernetes setup is for Dataverse as a service
● Data Previewers in Serverless?
Ideal case, cheap and efficient!
● Data Computing tools?
Yes, will fit for services running on demand
● GIS tools integration with Dataverse?
Function as a Service (FaaS) is the best
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Practical example - Caching function for CVV
Linked Data Serverless Resolver implemented as Lambda Function and deployed
on Harvard AWS cloud
https://github.com/Dans-labs/ld-serverless-resolver
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Microsoft Visual Studio Code
Features:
● source-code editor made by Microsoft
for Windows, Linux and macOS
● The Integrated CLI (Command Line
Interface)
● support for debugging, syntax
highlighting, intelligent code
completion, snippets, code refactoring,
and embedded Git
● cloud-powered development
environments for any activity
Try it here: https://code.visualstudio.com
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Maturity of Dataverse core and apps in EOSC
Testing process follows the CESSDA maturity model
https://zenodo.org/record/2591055#.XKR6ny2B2u5
Important: every change of Dataverse functionality should be supplied with
unit tests, changes of external functionality should get Selenium scenarios.
Goal: automate all tests, workflows and integrations, follow the Software
Quality as a Service (SQaaS) framework and score as high as possible
according to the CESSDA maturity model.
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Our speakers
Stefan Kasberger is DevOp and Python Developer at AUSSDA (Austria) and involved
in the Dataverse community during last 3 years. He contributes Open Source tools in
Python to support Dataverse. He has done several data migrations into Dataverse.
Lately, he started to focus on creating standardized tests and test data for Dataverse
installations, to support automation/cloud/devops activities in the international
community. Stefan is creator of pyDataverse module.
Samuel Bernardo is Software Engineer at LIP in Portugal. Graduated in Lisbon
University with a degree in Computer Science and Engineering and finishing the
master degree dissertation. Worked on several projects around distributed systems,
data science and computing, Currently working in BigHPC and SQAaaS platforms
development in the EOSC Synergy and other European projects, such as Opencoasts
and WORSICA. Also a open source contributor for several years supporting the
community and free software.
This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782
Questions?
vyacheslav.tykhonov@dans.knaw.nl
https://www.sshopencloud.eu
info@sshopencloud.eu
@SSHOpenCloud
/in/sshopencloud
Join our community

More Related Content

What's hot

Technical integration of data repositories status and challenges
Technical integration of data repositories status and challengesTechnical integration of data repositories status and challenges
Technical integration of data repositories status and challenges
vty
 
External controlled vocabularies support in Dataverse
External controlled vocabularies support in DataverseExternal controlled vocabularies support in Dataverse
External controlled vocabularies support in Dataverse
vty
 
Controlled vocabularies and ontologies in Dataverse data repository
Controlled vocabularies and ontologies in Dataverse data repositoryControlled vocabularies and ontologies in Dataverse data repository
Controlled vocabularies and ontologies in Dataverse data repository
vty
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
vty
 
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC,  Service QA and DataverseIntegration of WORSICA’s thematic service in EOSC,  Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
vty
 
Setting up Dataverse repository for research data
Setting up Dataverse repository for research dataSetting up Dataverse repository for research data
Setting up Dataverse repository for research data
vty
 
External CV support in Dataverse 5.7
External CV support in Dataverse 5.7External CV support in Dataverse 5.7
External CV support in Dataverse 5.7
vty
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataverse
vty
 
Dataverse SSHOC enrichment of DDI support at EDDI'19 2
Dataverse SSHOC enrichment of DDI support at EDDI'19 2Dataverse SSHOC enrichment of DDI support at EDDI'19 2
Dataverse SSHOC enrichment of DDI support at EDDI'19 2
vty
 
CLARIAH CMDI use case and flexible metadata schemes
CLARIAH CMDI use case and flexible metadata schemesCLARIAH CMDI use case and flexible metadata schemes
CLARIAH CMDI use case and flexible metadata schemes
Vyacheslav Tykhonov
 
Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)
vty
 
Building an electronic repository and archives on Dataverse in the European O...
Building an electronic repository and archives on Dataverse in the European O...Building an electronic repository and archives on Dataverse in the European O...
Building an electronic repository and archives on Dataverse in the European O...
vty
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Andrea Scharnhorst
 
Flexible metadata schemes for research data repositories - Clarin Conference...
Flexible metadata schemes for research data repositories  - Clarin Conference...Flexible metadata schemes for research data repositories  - Clarin Conference...
Flexible metadata schemes for research data repositories - Clarin Conference...
Vyacheslav Tykhonov
 
Fighting COVID-19 with Artificial Intelligence
Fighting COVID-19 with Artificial IntelligenceFighting COVID-19 with Artificial Intelligence
Fighting COVID-19 with Artificial Intelligence
vty
 
Dataverse in the European Open Science Cloud
Dataverse in the European Open Science CloudDataverse in the European Open Science Cloud
Dataverse in the European Open Science Cloud
vty
 
SSHOC Dataverse in the European Open Science Cloud
SSHOC Dataverse in the European Open Science CloudSSHOC Dataverse in the European Open Science Cloud
SSHOC Dataverse in the European Open Science Cloud
vty
 
Review on Big Data Security in Hadoop
Review on Big Data Security in HadoopReview on Big Data Security in Hadoop
Review on Big Data Security in Hadoop
IRJET Journal
 
DataverseEU as multilingual repository
DataverseEU as multilingual repositoryDataverseEU as multilingual repository
DataverseEU as multilingual repository
vty
 
The path to an hybrid open source paradigm
The path to an hybrid open source paradigmThe path to an hybrid open source paradigm
The path to an hybrid open source paradigm
Jonathan Challener
 

What's hot (20)

Technical integration of data repositories status and challenges
Technical integration of data repositories status and challengesTechnical integration of data repositories status and challenges
Technical integration of data repositories status and challenges
 
External controlled vocabularies support in Dataverse
External controlled vocabularies support in DataverseExternal controlled vocabularies support in Dataverse
External controlled vocabularies support in Dataverse
 
Controlled vocabularies and ontologies in Dataverse data repository
Controlled vocabularies and ontologies in Dataverse data repositoryControlled vocabularies and ontologies in Dataverse data repository
Controlled vocabularies and ontologies in Dataverse data repository
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
 
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC,  Service QA and DataverseIntegration of WORSICA’s thematic service in EOSC,  Service QA and Dataverse
Integration of WORSICA’s thematic service in EOSC, Service QA and Dataverse
 
Setting up Dataverse repository for research data
Setting up Dataverse repository for research dataSetting up Dataverse repository for research data
Setting up Dataverse repository for research data
 
External CV support in Dataverse 5.7
External CV support in Dataverse 5.7External CV support in Dataverse 5.7
External CV support in Dataverse 5.7
 
Metaverse for Dataverse
Metaverse for DataverseMetaverse for Dataverse
Metaverse for Dataverse
 
Dataverse SSHOC enrichment of DDI support at EDDI'19 2
Dataverse SSHOC enrichment of DDI support at EDDI'19 2Dataverse SSHOC enrichment of DDI support at EDDI'19 2
Dataverse SSHOC enrichment of DDI support at EDDI'19 2
 
CLARIAH CMDI use case and flexible metadata schemes
CLARIAH CMDI use case and flexible metadata schemesCLARIAH CMDI use case and flexible metadata schemes
CLARIAH CMDI use case and flexible metadata schemes
 
Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)Running Dataverse repository in the European Open Science Cloud (EOSC)
Running Dataverse repository in the European Open Science Cloud (EOSC)
 
Building an electronic repository and archives on Dataverse in the European O...
Building an electronic repository and archives on Dataverse in the European O...Building an electronic repository and archives on Dataverse in the European O...
Building an electronic repository and archives on Dataverse in the European O...
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
 
Flexible metadata schemes for research data repositories - Clarin Conference...
Flexible metadata schemes for research data repositories  - Clarin Conference...Flexible metadata schemes for research data repositories  - Clarin Conference...
Flexible metadata schemes for research data repositories - Clarin Conference...
 
Fighting COVID-19 with Artificial Intelligence
Fighting COVID-19 with Artificial IntelligenceFighting COVID-19 with Artificial Intelligence
Fighting COVID-19 with Artificial Intelligence
 
Dataverse in the European Open Science Cloud
Dataverse in the European Open Science CloudDataverse in the European Open Science Cloud
Dataverse in the European Open Science Cloud
 
SSHOC Dataverse in the European Open Science Cloud
SSHOC Dataverse in the European Open Science CloudSSHOC Dataverse in the European Open Science Cloud
SSHOC Dataverse in the European Open Science Cloud
 
Review on Big Data Security in Hadoop
Review on Big Data Security in HadoopReview on Big Data Security in Hadoop
Review on Big Data Security in Hadoop
 
DataverseEU as multilingual repository
DataverseEU as multilingual repositoryDataverseEU as multilingual repository
DataverseEU as multilingual repository
 
The path to an hybrid open source paradigm
The path to an hybrid open source paradigmThe path to an hybrid open source paradigm
The path to an hybrid open source paradigm
 

Similar to Automated CI/CD testing, installation and deployment of Dataverse infrastructure on a Cloud

Dataverse repository for research data in the COVID-19 Museum
Dataverse repository for research data  in the COVID-19 MuseumDataverse repository for research data  in the COVID-19 Museum
Dataverse repository for research data in the COVID-19 Museum
vty
 
Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)
Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)
Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)
Péter Király
 
Data Processing and Analysis
Data Processing and AnalysisData Processing and Analysis
Data Processing and Analysis
EUDAT
 
DataverseEU: Building Multilingual infrastructure for the Social Sciences in...
DataverseEU: Building Multilingual infrastructure  for the Social Sciences in...DataverseEU: Building Multilingual infrastructure  for the Social Sciences in...
DataverseEU: Building Multilingual infrastructure for the Social Sciences in...
vty
 
2. EOSC-hub (Daan Broeder, CLARIN ERIC)
2. EOSC-hub (Daan Broeder, CLARIN ERIC)2. EOSC-hub (Daan Broeder, CLARIN ERIC)
2. EOSC-hub (Daan Broeder, CLARIN ERIC)
SSHOC
 
Deep Hybrid DataCloud
Deep Hybrid DataCloudDeep Hybrid DataCloud
Deep Hybrid DataCloud
EOSC-hub project
 
Docker Bday #5, SF Edition: Introduction to Docker
Docker Bday #5, SF Edition: Introduction to DockerDocker Bday #5, SF Edition: Introduction to Docker
Docker Bday #5, SF Edition: Introduction to Docker
Docker, Inc.
 
Docker and containerization
Docker and containerizationDocker and containerization
Docker and containerization
Amulya Saxena
 
DEEP: a user success story
DEEP: a user success storyDEEP: a user success story
DEEP: a user success story
EOSC-hub project
 
Docker Enterprise Edition Overview by Steven Thwaites, Technical Solutions En...
Docker Enterprise Edition Overview by Steven Thwaites, Technical Solutions En...Docker Enterprise Edition Overview by Steven Thwaites, Technical Solutions En...
Docker Enterprise Edition Overview by Steven Thwaites, Technical Solutions En...
Ashnikbiz
 
Devcon3 : iExec Allowing Scalable, Efficient, and Virtualized Off-chain Execu...
Devcon3 : iExec Allowing Scalable, Efficient, and Virtualized Off-chain Execu...Devcon3 : iExec Allowing Scalable, Efficient, and Virtualized Off-chain Execu...
Devcon3 : iExec Allowing Scalable, Efficient, and Virtualized Off-chain Execu...
Gilles Fedak
 
#OSSPARIS17 - Développeurs, urbanisez la consommation de vos Clouds et APIs a...
#OSSPARIS17 - Développeurs, urbanisez la consommation de vos Clouds et APIs a...#OSSPARIS17 - Développeurs, urbanisez la consommation de vos Clouds et APIs a...
#OSSPARIS17 - Développeurs, urbanisez la consommation de vos Clouds et APIs a...
Paris Open Source Summit
 
EUDAT Generic Execution Framework
EUDAT Generic Execution FrameworkEUDAT Generic Execution Framework
EUDAT Generic Execution Framework
EUDAT
 
citus™ iot ecosystem
citus™ iot ecosystemcitus™ iot ecosystem
citus™ iot ecosystem
DUONG Dinh Cuong
 
Docker Application to Scientific Computing
Docker Application to Scientific ComputingDocker Application to Scientific Computing
Docker Application to Scientific Computing
Peter Bryzgalov
 
Docker EE 2.0 choice security agility by Erik Tan,Tech Insights Singapore - 2...
Docker EE 2.0 choice security agility by Erik Tan,Tech Insights Singapore - 2...Docker EE 2.0 choice security agility by Erik Tan,Tech Insights Singapore - 2...
Docker EE 2.0 choice security agility by Erik Tan,Tech Insights Singapore - 2...
Ashnikbiz
 
Presentation of OCCIware, a standard, extensible Cloud consumer platform at P...
Presentation of OCCIware, a standard, extensible Cloud consumer platform at P...Presentation of OCCIware, a standard, extensible Cloud consumer platform at P...
Presentation of OCCIware, a standard, extensible Cloud consumer platform at P...
OCCIware
 
OCCIware @ Paris Open Source Summit 2017 - a standard, extensible Cloud consu...
OCCIware @ Paris Open Source Summit 2017 - a standard, extensible Cloud consu...OCCIware @ Paris Open Source Summit 2017 - a standard, extensible Cloud consu...
OCCIware @ Paris Open Source Summit 2017 - a standard, extensible Cloud consu...
Marc Dutoo
 
OCCIware @ Cloud Computing World 2016 - year 1 milestone & Linked Data demo
OCCIware @ Cloud Computing World 2016 - year 1 milestone & Linked Data demoOCCIware @ Cloud Computing World 2016 - year 1 milestone & Linked Data demo
OCCIware @ Cloud Computing World 2016 - year 1 milestone & Linked Data demo
Marc Dutoo
 
Containers - Portable, repeatable user-oriented application delivery. Build, ...
Containers - Portable, repeatable user-oriented application delivery. Build, ...Containers - Portable, repeatable user-oriented application delivery. Build, ...
Containers - Portable, repeatable user-oriented application delivery. Build, ...
Walid Shaari
 

Similar to Automated CI/CD testing, installation and deployment of Dataverse infrastructure on a Cloud (20)

Dataverse repository for research data in the COVID-19 Museum
Dataverse repository for research data  in the COVID-19 MuseumDataverse repository for research data  in the COVID-19 Museum
Dataverse repository for research data in the COVID-19 Museum
 
Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)
Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)
Requirements of DARIAH community for a Dataverse repository (SSHOC 2020)
 
Data Processing and Analysis
Data Processing and AnalysisData Processing and Analysis
Data Processing and Analysis
 
DataverseEU: Building Multilingual infrastructure for the Social Sciences in...
DataverseEU: Building Multilingual infrastructure  for the Social Sciences in...DataverseEU: Building Multilingual infrastructure  for the Social Sciences in...
DataverseEU: Building Multilingual infrastructure for the Social Sciences in...
 
2. EOSC-hub (Daan Broeder, CLARIN ERIC)
2. EOSC-hub (Daan Broeder, CLARIN ERIC)2. EOSC-hub (Daan Broeder, CLARIN ERIC)
2. EOSC-hub (Daan Broeder, CLARIN ERIC)
 
Deep Hybrid DataCloud
Deep Hybrid DataCloudDeep Hybrid DataCloud
Deep Hybrid DataCloud
 
Docker Bday #5, SF Edition: Introduction to Docker
Docker Bday #5, SF Edition: Introduction to DockerDocker Bday #5, SF Edition: Introduction to Docker
Docker Bday #5, SF Edition: Introduction to Docker
 
Docker and containerization
Docker and containerizationDocker and containerization
Docker and containerization
 
DEEP: a user success story
DEEP: a user success storyDEEP: a user success story
DEEP: a user success story
 
Docker Enterprise Edition Overview by Steven Thwaites, Technical Solutions En...
Docker Enterprise Edition Overview by Steven Thwaites, Technical Solutions En...Docker Enterprise Edition Overview by Steven Thwaites, Technical Solutions En...
Docker Enterprise Edition Overview by Steven Thwaites, Technical Solutions En...
 
Devcon3 : iExec Allowing Scalable, Efficient, and Virtualized Off-chain Execu...
Devcon3 : iExec Allowing Scalable, Efficient, and Virtualized Off-chain Execu...Devcon3 : iExec Allowing Scalable, Efficient, and Virtualized Off-chain Execu...
Devcon3 : iExec Allowing Scalable, Efficient, and Virtualized Off-chain Execu...
 
#OSSPARIS17 - Développeurs, urbanisez la consommation de vos Clouds et APIs a...
#OSSPARIS17 - Développeurs, urbanisez la consommation de vos Clouds et APIs a...#OSSPARIS17 - Développeurs, urbanisez la consommation de vos Clouds et APIs a...
#OSSPARIS17 - Développeurs, urbanisez la consommation de vos Clouds et APIs a...
 
EUDAT Generic Execution Framework
EUDAT Generic Execution FrameworkEUDAT Generic Execution Framework
EUDAT Generic Execution Framework
 
citus™ iot ecosystem
citus™ iot ecosystemcitus™ iot ecosystem
citus™ iot ecosystem
 
Docker Application to Scientific Computing
Docker Application to Scientific ComputingDocker Application to Scientific Computing
Docker Application to Scientific Computing
 
Docker EE 2.0 choice security agility by Erik Tan,Tech Insights Singapore - 2...
Docker EE 2.0 choice security agility by Erik Tan,Tech Insights Singapore - 2...Docker EE 2.0 choice security agility by Erik Tan,Tech Insights Singapore - 2...
Docker EE 2.0 choice security agility by Erik Tan,Tech Insights Singapore - 2...
 
Presentation of OCCIware, a standard, extensible Cloud consumer platform at P...
Presentation of OCCIware, a standard, extensible Cloud consumer platform at P...Presentation of OCCIware, a standard, extensible Cloud consumer platform at P...
Presentation of OCCIware, a standard, extensible Cloud consumer platform at P...
 
OCCIware @ Paris Open Source Summit 2017 - a standard, extensible Cloud consu...
OCCIware @ Paris Open Source Summit 2017 - a standard, extensible Cloud consu...OCCIware @ Paris Open Source Summit 2017 - a standard, extensible Cloud consu...
OCCIware @ Paris Open Source Summit 2017 - a standard, extensible Cloud consu...
 
OCCIware @ Cloud Computing World 2016 - year 1 milestone & Linked Data demo
OCCIware @ Cloud Computing World 2016 - year 1 milestone & Linked Data demoOCCIware @ Cloud Computing World 2016 - year 1 milestone & Linked Data demo
OCCIware @ Cloud Computing World 2016 - year 1 milestone & Linked Data demo
 
Containers - Portable, repeatable user-oriented application delivery. Build, ...
Containers - Portable, repeatable user-oriented application delivery. Build, ...Containers - Portable, repeatable user-oriented application delivery. Build, ...
Containers - Portable, repeatable user-oriented application delivery. Build, ...
 

More from vty

Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs
vty
 
Decentralisation and knowledge graphs
Decentralisation and knowledge graphs Decentralisation and knowledge graphs
Decentralisation and knowledge graphs
vty
 
Decentralised identifiers for CLARIAH infrastructure
Decentralised identifiers for CLARIAH infrastructure Decentralised identifiers for CLARIAH infrastructure
Decentralised identifiers for CLARIAH infrastructure
vty
 
Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21
vty
 
Data standardization process for social sciences and humanities
Data standardization process for social sciences and humanitiesData standardization process for social sciences and humanities
Data standardization process for social sciences and humanities
vty
 
Development in Dataverse SSHOC project
Development in Dataverse SSHOC projectDevelopment in Dataverse SSHOC project
Development in Dataverse SSHOC project
vty
 

More from vty (6)

Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs Decentralised identifiers and knowledge graphs
Decentralised identifiers and knowledge graphs
 
Decentralisation and knowledge graphs
Decentralisation and knowledge graphs Decentralisation and knowledge graphs
Decentralisation and knowledge graphs
 
Decentralised identifiers for CLARIAH infrastructure
Decentralised identifiers for CLARIAH infrastructure Decentralised identifiers for CLARIAH infrastructure
Decentralised identifiers for CLARIAH infrastructure
 
Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21Flexible metadata schemes for research data repositories - CLARIN Conference'21
Flexible metadata schemes for research data repositories - CLARIN Conference'21
 
Data standardization process for social sciences and humanities
Data standardization process for social sciences and humanitiesData standardization process for social sciences and humanities
Data standardization process for social sciences and humanities
 
Development in Dataverse SSHOC project
Development in Dataverse SSHOC projectDevelopment in Dataverse SSHOC project
Development in Dataverse SSHOC project
 

Recently uploaded

THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
Abdul Wali Khan University Mardan,kP,Pakistan
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
Lokesh Patil
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
muralinath2
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
AlaminAfendy1
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
University of Maribor
 
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
Wasswaderrick3
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Ana Luísa Pinho
 
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdfDMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
fafyfskhan251kmf
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
PRIYANKA PATEL
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
MAGOTI ERNEST
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
SAMIR PANDA
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
kejapriya1
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
University of Maribor
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
moosaasad1975
 
nodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptxnodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptx
alishadewangan1
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
KrushnaDarade1
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
Richard Gill
 
Toxic effects of heavy metals : Lead and Arsenic
Toxic effects of heavy metals : Lead and ArsenicToxic effects of heavy metals : Lead and Arsenic
Toxic effects of heavy metals : Lead and Arsenic
sanjana502982
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
Nistarini College, Purulia (W.B) India
 

Recently uploaded (20)

THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
 
Nutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technologyNutraceutical market, scope and growth: Herbal drug technology
Nutraceutical market, scope and growth: Herbal drug technology
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 
platelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptxplatelets_clotting_biogenesis.clot retractionpptx
platelets_clotting_biogenesis.clot retractionpptx
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
DERIVATION OF MODIFIED BERNOULLI EQUATION WITH VISCOUS EFFECTS AND TERMINAL V...
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdfDMARDs Pharmacolgy Pharm D 5th Semester.pdf
DMARDs Pharmacolgy Pharm D 5th Semester.pdf
 
ESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptxESR spectroscopy in liquid food and beverages.pptx
ESR spectroscopy in liquid food and beverages.pptx
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
 
Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
bordetella pertussis.................................ppt
bordetella pertussis.................................pptbordetella pertussis.................................ppt
bordetella pertussis.................................ppt
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
nodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptxnodule formation by alisha dewangan.pptx
nodule formation by alisha dewangan.pptx
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 
Richard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlandsRichard's aventures in two entangled wonderlands
Richard's aventures in two entangled wonderlands
 
Toxic effects of heavy metals : Lead and Arsenic
Toxic effects of heavy metals : Lead and ArsenicToxic effects of heavy metals : Lead and Arsenic
Toxic effects of heavy metals : Lead and Arsenic
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 

Automated CI/CD testing, installation and deployment of Dataverse infrastructure on a Cloud

  • 1. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 SSHOC Dataverse in the European Open Science Cloud Automated CI/CD Testing, Installation and Deployment Slava Tykhonov (DANS-KNAW) lead software engineer, DANS R&D Dataverse Community Meeting 2021 14 June 2021 Harvard University
  • 2. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Overview ● SSHOC in the European Open Science Cloud (EOSC) ● CESSDA Maturity Model ● Docker ● Kubernetes ● Serverless ● Community-driven QA plan ● CI/CD and test automation with pyDataverse ● Quality Assurance as a Service, EOSC Synergy ● Q&A
  • 3. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Duration: 40 months (January 2019 – 30 April 2022) Partners: 47 (20 beneficiaries + 27 LTPs) SSH ESFRI Landmarks and Projects & international SSH data infrastructures Project budget: € 14,455,594.08 Type of action & funding: Research and Innovation action (INFRAEOSC-04-2018) Project website: www.SSHOpenCloud.eu Objectives: • creating the social sciences and humanities (SSH) part of European Open Science Cloud (EOSC) • maximising re-use through Open Science and FAIR principles (standards, common catalogue, access control, semantic techniques, training) • interconnecting existing and new infrastructures (clustered cloud infrastructure) • establishing appropriate governance model for SSH-EOSC Project:
  • 4. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 EOSC Synergy
  • 5. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 DANS Data Stations - Future Data Services
  • 6. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Interoperability in EOSC ● Technical interoperability defined as the “ability of different information technology systems and software applications to communicate and exchange data”. It should allow “to accept data from each other and perform a given task in an appropriate and satisfactory manner without the need for extra operator intervention”. ● Semantic interoperability is “the ability of computer systems to transmit data with unambiguous, shared meaning. Semantic interoperability is a requirement to enable machine computable logic, inferencing, knowledge discovery, and data”. ● Organisational interoperability refers to the “way in which organisations align their business processes, responsibilities and expectations to achieve commonly agreed and mutually beneficial goals. Focus on the requirements of the user community by making services available, easily identifiable, accessible and user-focused”. ● Legal interoperability covers “the broader environment of laws, policies, procedures and cooperation agreements” Source: EOSC Interoperability Framework v1.0
  • 7. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 SSHOC task 5.2 Hosting and sharing data repositories Objective Development of a research data repository service on EOSC, for SSH institutions currently without such a facility for their designated communities Deliverables After 38 months: Data repository service running on EOSC After 40 months: Report on principles of governance and sustainability of the data repository service
  • 8. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Services in European Open Science Cloud (EOSC) ● EOSC requires the level 8 of maturity (at least) ● we need the highest quality of software to be accepted as a service ● clear and transparent evaluation of services is essential ● the evidence of technical maturity is the key to success ● the limited warranty will allow to stop out- of-warranty services
  • 9. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Dataverse deployment in CESSDA Cloud Docker Compose for the local development and testing Kubernetes (K8s) for the Cloud deployment of services (CI/CD pipeline with Jenkins and Helm)
  • 10. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Dataverse Docker module (CESSDA Dataverse, 2018) Source: https://github.com/IQSS/dataverse-docker
  • 11. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Docker introduction • Extremely powerful configuration tool • Allows to install software on any platform (Linux, Mac, Windows) • Any software can be installed from Docker as standalone container or container delivering Microservices (database, search engine, core service) • Docker allows to host unlimited amount of the same software tools on different ports • Docker can be used to organise multilingual interfaces, for example • Deployments with orchestrator tools: Docker Swarm, Kubernetes • Faster development and deployments • Isolation of running containers allows to scale up apps • Portability saves time to run the same image on the local computer or in the cloud • Snapshotting allows to archive Docker images state • Resource limitation can be adjusted
  • 12. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Dataverse Kubernetes Project maintained by Oliver Bertuch (FZ Julich) and available in Global Dataverse Community Consortium github (GDCC) Google Cloud, Amazon AWS, Microsoft Azure platforms supported Open Source, community pull requests are welcome http://github.com/IQSS/dataverse-kubernetes
  • 13. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 How to deploy Dataverse on Google Cloud Platform Resources needed to deploy Dataverse: » GCP VM-instance(s) » Kubernetes (K8S) cluster » Docker images » K8S Services » K8S Deployments (workloads) » Load Balancer or Ingress (SSL certificates) » Mail relay (default port 25 is blocked) » Persistent volumes (storage) for the Solr, Postgres, Dataverse and SSL services
  • 14. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Integration with Google Cloud Google Cloud Platform » GCloud a command-line (CLI) tool for running commands against Google Cloud Platform. GCloud is part of the Google Cloud SDK Kubernetes cluster » Kubectl A command line interface (CLI) for running commands against Kubernetes clusters » Use of .yaml configuration files to describe the desired resources Note: Kubernetes software is used by all major Cloud providers, the different is the storage specification!
  • 15. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Dataverse Cloud Architecture Ingress HTTP(S) Load Balancer Kubernetes Engine Dataverse Service Kubernetes Cluster K8S Cluster Node Dataverse Deployment Dataverse Service Solr Deployment Solr Service PostgreSQL Service PostgreSQL Deployment Users
  • 16. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 How to scale up Kubernetes horizontally Kubernetes Engine Compute Engine Dataverse Service Kubernetes Cluster K8S Cluster Node1 Users K8S Cluster Node2 Docker Hub Container Registry
  • 17. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 The importance of Persistent Storage Docker containers write files to disk (I/O) for state or storage, both in /data and /docroot folders. If a Docker container is restarted for some reason, all data will be lost. Solution: mount Persistent storage into the container on external disk hosted in the Cloud.
  • 18. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 SQA process with Selenium tests for Dataverse Selenium IDE allows to create and replay all UI tests in your browser Shared tests can be reused by community to increase reproducibility SQA for the service maturity = unit tests + integration tests http://github.com/GlobalDataverseCommunityConsortium/dataverse-test-suite 18
  • 19. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 CI/CD pipeline with SQAaaS (S) 1 2 3 git push Push GCP container registry webhook Create docker image Kubernetes Deployment git clone Jenkins pipeline (Jenkinsfile) 9 7 Run SQA S 8 1. Developer pushes code to GitHub 2. Jenkins receives notification - build trigger 3. Jenkins clones the workspace 4. (S) Runs SQA tests and does FAIRness check 5. (S) Issuing digital badge according to the results 6. (S) SQAaaS API triggers appropriate workflow 7. Creates docker image if success 8. Pushes new docker image to container registry 9. Updates the kubernetes deployment 19 Source: EOSC Synergy project
  • 20. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Dataverse App Store Data preview: DDI Explorer, Spreadsheet/CSV, PDF, Text files, HTML, Images, video render, audio, JSON, GeoJSON/Shapefiles/Map, XML Interoperability: external controlled vocabularies (SKOSMOS, CESSDA CV Manager) Data processing: NESSTAR DDI migration tool Linked Data: RDF compliance including SPARQL endpoint (FDP) Multilingual support: Weblate as a Service http://translation.cessda.eu Federated login: eduGAIN, EGI Check-in, PIONIER ID CLARIN Switchboard integration: Natural Language Processing tools Visualization tools (maps, charts, timelines), Apache Superset integration
  • 21. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 GUI translation with Weblate as a service Source: SSHOC Weblate
  • 22. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Dataverse and CLARIN tools integration
  • 23. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Dataverse Apps in Serverless - Function as a Service Infrastructure maintenance = human resources and money. Are you sure you need a server to provide all of your services? The Serverless application framework allows to upload your code to the Cloud space and it will provision and run it without any extra efforts! Scaling is done automatically based on the workfload. Costs - only pay for the time when your application or computing process is running. AWS Lambda can be used to spin up application based on triggers/events.
  • 24. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Serverless support by leading Cloud Providers Function as a Service (FaaS) model: Main costs: 1. Number of requests: typically billed for every million requests per month 2. Compute time: this is a measurement of the duration of the life of the function and the amount of memory provisioned, measured in GB-seconds (higher-memory execution will cost more than lower-memory execution of the same duration) Source: Serverless Showdown
  • 25. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Serverless Use Cases for Dataverse Remember: inactive Serverless application will require some additional time to respond to a request. This initial delay encountered is known as a “cold start”. ● Can we run Dataverse instance as Serverless app? Yes, but just for demo, Kubernetes setup is for Dataverse as a service ● Data Previewers in Serverless? Ideal case, cheap and efficient! ● Data Computing tools? Yes, will fit for services running on demand ● GIS tools integration with Dataverse? Function as a Service (FaaS) is the best
  • 26. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Practical example - Caching function for CVV Linked Data Serverless Resolver implemented as Lambda Function and deployed on Harvard AWS cloud https://github.com/Dans-labs/ld-serverless-resolver
  • 27. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Microsoft Visual Studio Code Features: ● source-code editor made by Microsoft for Windows, Linux and macOS ● The Integrated CLI (Command Line Interface) ● support for debugging, syntax highlighting, intelligent code completion, snippets, code refactoring, and embedded Git ● cloud-powered development environments for any activity Try it here: https://code.visualstudio.com
  • 28. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Maturity of Dataverse core and apps in EOSC Testing process follows the CESSDA maturity model https://zenodo.org/record/2591055#.XKR6ny2B2u5 Important: every change of Dataverse functionality should be supplied with unit tests, changes of external functionality should get Selenium scenarios. Goal: automate all tests, workflows and integrations, follow the Software Quality as a Service (SQaaS) framework and score as high as possible according to the CESSDA maturity model.
  • 29. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Our speakers Stefan Kasberger is DevOp and Python Developer at AUSSDA (Austria) and involved in the Dataverse community during last 3 years. He contributes Open Source tools in Python to support Dataverse. He has done several data migrations into Dataverse. Lately, he started to focus on creating standardized tests and test data for Dataverse installations, to support automation/cloud/devops activities in the international community. Stefan is creator of pyDataverse module. Samuel Bernardo is Software Engineer at LIP in Portugal. Graduated in Lisbon University with a degree in Computer Science and Engineering and finishing the master degree dissertation. Worked on several projects around distributed systems, data science and computing, Currently working in BigHPC and SQAaaS platforms development in the EOSC Synergy and other European projects, such as Opencoasts and WORSICA. Also a open source contributor for several years supporting the community and free software.
  • 30. This project is funded from the EU Horizon 2020 Research and Innovation Programme (2014-2020) under Grant Agreement No. 823782 Questions? vyacheslav.tykhonov@dans.knaw.nl https://www.sshopencloud.eu info@sshopencloud.eu @SSHOpenCloud /in/sshopencloud Join our community