Dockerizing a multi-component Open Data app
Athens Docker Meetup, June 2016
Dimitris Negkas, Stergios Tsiafoulis
dimneg@gmail.com, s.tsiafoulis@gmail.com
Description and Scope
LinkedEconomy (http://linkedeconomy.org/)
• is a publicly available web platform and linked data repository.
• Its scope is to transform, curate, aggregate, interlink and publish economic data in machine-readable format, to enable:
  • citizen awareness
  • research with unprecedented data
  • evidence-based policy
Data Sources
• Sources currently used:
  • Transparency - DIAVGEIA
  • Central Electronic Registry of Public Procurement - E-Procurement
  • National Strategic Reference Framework (NSRF)
  • Central Market of Thessaloniki (CMT)
  • e-Prices
  • Fuel Prices
  • Municipality of Athens, Municipality of Thessaloniki
  • Government of Australia
Data growth
• We use OpenLink Virtuoso to store nearly 1 billion triples from 15 different sources.
• We host 27 datasets in CKAN from 15 organizations.
• The data grows accordingly every month.
Data processing
• Each data source is handled and processed separately, because the sources do not provide their data uniformly or in a machine-readable format.
• Diavgeia, NSRF and the observatories for product and fuel prices provide a rich API that can easily be queried for machine-readable data in JSON format.
• For E-Procurement, CMT and the Municipalities of Athens and Thessaloniki no API is available. We have therefore developed a software module that gathers the online information automatically and stores it in a machine-readable format.
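
For the sources without an API, the harvesting module has to turn published web pages into machine-readable records. A minimal sketch of that idea, assuming the data sits in an HTML table. The names TableHarvester and table_to_json and the sample markup are hypothetical and are not taken from the LinkedEconomy code:

```python
import json
from html.parser import HTMLParser


class TableHarvester(HTMLParser):
    """Collects the cell text of every table row in an HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_json(html):
    """Use the first row as field names and return the remaining rows as JSON records."""
    parser = TableHarvester()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return "[]"
    header, *body = parser.rows
    records = [dict(zip(header, row)) for row in body]
    return json.dumps(records, ensure_ascii=False)


# Made-up sample page, only for illustration
SAMPLE = """
<table>
  <tr><th>product</th><th>price</th></tr>
  <tr><td>Tomatoes</td><td>1.20</td></tr>
  <tr><td>Oranges</td><td>0.85</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_json(SAMPLE))
```

Values are kept as strings here; typing and cleaning them would be a separate step in the pipeline.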
General Architecture
• Process model
  • Open economic data on public budgeting, spending and prices are characterized by high volume, velocity, variety and veracity.
  • We have to build custom components under the common logic of transforming static data into linked open data streams.
Process model: Nucleus
• The nucleus of our approach is semantic modelling, data enrichment and interconnections.
• Data are stored in raw form (as harvested from the sources), in RDF and in JSON.
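
To illustrate how one harvested record can exist both as JSON and as RDF, here is a minimal sketch that writes a flat JSON record as N-Triples lines. The example.org namespaces, the field names and the function names are assumptions made for illustration; they are not the LinkedEconomy vocabulary:

```python
import json

# Hypothetical namespaces, used only for illustration
BASE = "http://example.org/resource/"
VOCAB = "http://example.org/vocab/"


def escape_literal(value):
    """Escape the characters that N-Triples does not allow raw inside a literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record, id_field="id"):
    """Emit one N-Triples line per field of a flat JSON record.

    Assumes the id and the field names are already safe to use inside an IRI.
    """
    subject = f"<{BASE}{record[id_field]}>"
    lines = []
    for key, value in sorted(record.items()):
        if key == id_field:
            continue
        literal = escape_literal(str(value))
        lines.append(f'{subject} <{VOCAB}{key}> "{literal}" .')
    return lines


if __name__ == "__main__":
    raw = '{"id": "decision-1", "amount": 1500, "title": "Road repair"}'
    for line in record_to_ntriples(json.loads(raw)):
        print(line)
```

Every value becomes a plain string literal in this sketch; a real model would add datatypes and link values to other resources, which is where the enrichment and interconnection happen.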
Process model: Data distribution
• Enriched data are distributed through five channels:
  1. Data dumps (CKAN)
  2. SPARQL queries
  3. Web
  4. Social media
  5. Structured inputs to Business Intelligence (BI) systems
• Additionally, data can be further analysed and exchanged with relevant platforms (e.g. SPARQL to R).
Process model: Validation and messenger
• The validation component runs throughout the whole process to safeguard high data quality by detecting errors.
• The messaging component works as an internal messaging and alert system for all components.
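
The interplay of the two components can be sketched as follows: a validator checks each record and reports problems to a messenger. This is only an illustration under assumed names (Messenger, validate_records, the fields id and amount); the slides do not describe the actual checks or the alert transport:

```python
from dataclasses import dataclass, field


@dataclass
class Messenger:
    """Collects alerts in memory; a real system might e-mail or log them instead."""

    alerts: list = field(default_factory=list)

    def alert(self, component, message):
        self.alerts.append(f"[{component}] {message}")


def validate_records(records, required, numeric, messenger):
    """Return the records that pass all checks and report the others to the messenger."""
    valid = []
    for index, record in enumerate(records):
        # Required fields must be present and non-empty
        problems = [
            f"missing {name}" for name in required if record.get(name) in (None, "")
        ]
        # Numeric fields must be convertible to a number
        for name in numeric:
            try:
                float(record.get(name))
            except (TypeError, ValueError):
                problems.append(f"{name} is not numeric")
        if problems:
            messenger.alert("validation", f"record {index}: " + "; ".join(problems))
        else:
            valid.append(record)
    return valid


if __name__ == "__main__":
    messenger = Messenger()
    rows = [{"id": "a", "amount": "10.5"}, {"id": "", "amount": "x"}]
    print(validate_records(rows, ["id"], ["amount"], messenger))
    print(messenger.alerts)
```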
Process flow
Infrastructure   Functionalities / Components      Services / Data sources
VM1              linkedeconomy.org                 apache, php, mysql, drupal
VM2              SPARQL endpoint, demo site        OLV, apache, php, mysql, drupal
VM3              Harvester                         CouchDB, Lucene, apache, mysql / CKAN (Greek Datasets)
VM4              Harvester, Messenger              mysql, LinkedEconomy dropbox
VM5              Storage - Secondary triplestore   CouchDB, OLV, CouchDB-Lucene, docker
VM6              Harvester                         apache, php, mysql, drupal / CKAN (Foreign Datasets)
VM7              SPARQL endpoint                   OLV (Foreign graphs)
VM8              Management                        JIRA, mysql, tomcat
VM9              Dashboard                         front-end, CMS, INSPINIA
VM10             System administration             VPN, firewalls, etc.
Physical         Storage - Core triplestore        OLV (Greek graphs)
As core infrastructure we use ~okeanos, which is an established cloud-based
service provided for the Greek research and academic community.
LinkedEconomy
CKAN
“Hottest” Prices per municipality
Supermarkets Geoinformation
Application System
[Diagram labels: Small Applications (Java, PHP and UNIX scripts), Di@vgeia, KHMDHS, ePrices, fuelPrices, Virtuoso, CouchDB, Drupal, MySQL, CKAN, QGIS]
Dockerize the System
[Diagram labels: Di@vgeia, KHMDHS, ePrices, Virtuoso, Drupal, MySQL, QGIS Desktop, CouchDB, QGIS Server, Small Applications, CKAN]
With Compose 2
Docker MySQL
version: '2'
services:
  mysql:
    # Builds the image from your local directory
    build: ./mysql-docker/5.6
    container_name: eLodDrupalmySQL
    volumes:
      # Save your data: the database files live on the host
      - /mysql_drupal:/var/lib/mysql
    environment:
      - MYSQL_DATABASE=drupalelod
      - MYSQL_ROOT_PASSWORD=eLodmysqlpass
    # Do not use "always" in your development environment
    restart: on-failure
Docker Drupal
  drupal:
    build: ./docker-drupal
    command:
      - /start.sh
    # Starts this service only after the mysql service has been started
    depends_on:
      - mysql
    container_name: eLodDrupal
    #image: eLodDrupal
    ports:
      - "8081:80"
    volumes:
      - "/data_drupal:/var/www/html"
    # Links this container with the MySQL container
    links:
      - "mysql"
    environment:
      - MYSQL_DATABASE=drupalelod
      - MYSQL_USER=root
      - MYSQL_PASSWORD=eLodmysqlpass
      - DRUPAL_ADMIN_PW=eLODDR
      - DRUPAL_ADMIN=admin
      - MYSQL_HOST=eLodDrupalmySQL
      - DRUPAL_ADMIN_EMAIL=stetsiafoulis@gmail.com
    restart: on-failure
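
Note that in Compose file format 2, depends_on only controls the start order; it does not wait until MySQL accepts connections. A start script may therefore want to wait for the database port first. The following is a minimal sketch of such a wait, not the content of the authors' /start.sh; the function name wait_for_port and the timeout values are assumptions:

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet: give up after the deadline, otherwise retry
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # "mysql" is the service name from the compose file; 3306 is MySQL's default port
    ready = wait_for_port("mysql", 3306)
    print("MySQL reachable" if ready else "MySQL not reachable, giving up")
```

An open port is only a rough readiness signal; the server may still be initialising its data directory, so the first real query can still fail and may need a retry.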
Docker Virtuoso
  virtuoso:
    build: ./docker-virtuoso
    container_name: eLodVirtuoso
    ports:
      - "8890:8890"
    volumes:
      - /virtuoso/db:/var/lib/virtuoso/db
    environment:
      - DBA_PASSWORD=eLodVir
      - SPARQL_UPDATE=true
      - DEFAULT_GRAPH=http://localhost:8890/DAV
    restart: on-failure
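
Once the Virtuoso container is up, its SPARQL endpoint can be queried over HTTP. A minimal sketch that prepares such a request with the standard library, assuming the endpoint is reachable at http://localhost:8890/sparql (port 8890 comes from the port mapping above, /sparql is Virtuoso's default endpoint path). The function name and the sample query are illustrative:

```python
import urllib.parse
import urllib.request

# Assumed endpoint: port 8890 from the compose file, /sparql as Virtuoso's default path
ENDPOINT = "http://localhost:8890/sparql"


def build_sparql_request(query, endpoint=ENDPOINT):
    """Prepare (but do not send) a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


if __name__ == "__main__":
    request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
    print(request.full_url)
    # Sending the request needs a running Virtuoso container:
    # with urllib.request.urlopen(request) as response:
    #     print(response.read().decode("utf-8"))
```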
Docker QGIS
  qgisdesktop:
    #image: kartoza/qgis-desktop:2.14
    build: ./qgis-desktop/2.14
    hostname: qgis-server
    volumes:
      # Wherever you want to mount your data from
      - ./gis:/gis
      # Unix socket for X11
      - "/tmp/.X11-unix:/tmp/.X11-unix"
    links:
      - db:db
    environment:
      - DISPLAY=unix:1
    command: /usr/bin/qgis
Build the system
• Clone the repository from GitHub: https://github.com/stetsiafoulis/eLOD
• Create the directories where you are going to link your data.
• Run docker-compose up -d, and that's it!
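
The directory step can be scripted. A minimal sketch, assuming the absolute host paths that appear in the volume mappings of the compose snippets (/mysql_drupal, /data_drupal, /virtuoso/db). The function name and the root parameter, which allows a dry run in another location, are hypothetical:

```python
import os

# Host paths taken from the volume mappings in the compose file
HOST_DIRS = ["/mysql_drupal", "/data_drupal", "/virtuoso/db"]


def create_data_dirs(paths=HOST_DIRS, root="/"):
    """Create each directory under root (no error if it exists) and return the paths."""
    created = []
    for path in paths:
        target = os.path.join(root, path.lstrip("/"))
        os.makedirs(target, exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    # Writing directly under / usually needs root privileges
    for directory in create_data_dirs():
        print("ready:", directory)
```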
Why Docker?
o Portable
o Lightweight
o Move to different cloud infrastructures and to physical servers
o Run on virtual machines for development and testing
o Easily scale
o Easy delivery and deployment
o Run anywhere (regardless of host distro, physical or cloud)
o Run anything
What's Next?
Scaling per Source
[Diagram labels: Di@vgeia, KHMDHS; two copies of the stack: Virtuoso, Drupal, MySQL, QGIS Desktop, CouchDB, QGIS Server, Small Applications]
Run Small Apps through Docker API
[Diagram label: Small Applications]
Next Steps - Swarm
[Diagram labels: Virtuoso, Drupal, MySQL, CouchDB, QGIS Server]
Cluster management
Scaling
State reconciliation
Multi-host networking
Service discovery
Load balancing
Next Steps - Consul
Health Checking
Service Discovery
Multi Datacenter support
Any Questions?
Appendix - Data Sources links
• LinkedEconomy (http://linkedeconomy.org/)
• linkedeconomy@gmail.com
• Sources currently used:
  • Transparency - DIAVGEIA: https://diavgeia.gov.gr
  • Central Electronic Registry of Public Procurement - E-Procurement (KHMDHS): http://www.eprocurement.gov.gr
  • National Strategic Reference Framework (NSRF): https://www.espa.gr/en
  • Central Market of Thessaloniki (CMT): http://www.kath.gr/
  • e-Prices: http://www.e-prices.gr/
  • Fuel Prices: http://www.fuelprices.gr/
  • Municipality of Athens: https://www.cityofathens.gr/khe/proypologismos
  • Municipality of Thessaloniki: http://www.thessaloniki.gr/portal/page/portal/DioikitikesYpiresies/GenDnsiDioikOikonYpiresion/DnsiDiafanEksipirDimoton/TmimaDiafaneias/AnoiktiDdiathesiDedomenon/DimosiefsiEktelesisProipologismou/ektelesi-proypologismou
  • Government of Australia: http://data.gov.au/

More Related Content

What's hot

Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22Ajeet Singh Raina
 
Orchestrating Least Privilege by Diogo Monica
Orchestrating Least Privilege by Diogo Monica Orchestrating Least Privilege by Diogo Monica
Orchestrating Least Privilege by Diogo Monica Docker, Inc.
 
Docker 1.5
Docker 1.5Docker 1.5
Docker 1.5rajdeep
 
Docker Networking Tip - Load balancing options
Docker Networking Tip - Load balancing optionsDocker Networking Tip - Load balancing options
Docker Networking Tip - Load balancing optionsSreenivas Makam
 
DCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep diveDCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep diveMadhu Venugopal
 
Cloning Running Servers with Docker and CRIU by Ross Boucher
Cloning Running Servers with Docker and CRIU by Ross BoucherCloning Running Servers with Docker and CRIU by Ross Boucher
Cloning Running Servers with Docker and CRIU by Ross BoucherDocker, Inc.
 
Docker Networking Overview
Docker Networking OverviewDocker Networking Overview
Docker Networking OverviewSreenivas Makam
 
Docker summit : Docker Networking Control-plane & Data-Plane
Docker summit : Docker Networking Control-plane & Data-PlaneDocker summit : Docker Networking Control-plane & Data-Plane
Docker summit : Docker Networking Control-plane & Data-PlaneMadhu Venugopal
 
Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)DoiT International
 
Consul and docker swarm cluster
Consul and docker swarm clusterConsul and docker swarm cluster
Consul and docker swarm clusterEueung Mulyana
 
Driving containerd operations with gRPC
Driving containerd operations with gRPCDriving containerd operations with gRPC
Driving containerd operations with gRPCDocker, Inc.
 
Docker network Present in VietNam DockerDay 2015
Docker network Present in VietNam DockerDay 2015Docker network Present in VietNam DockerDay 2015
Docker network Present in VietNam DockerDay 2015Van Phuc
 
Kubernetes Networking
Kubernetes NetworkingKubernetes Networking
Kubernetes NetworkingCJ Cullen
 
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...KubeAcademy
 
Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016Chris Tankersley
 
Docker Multi Host Networking, Rachit Arora, IBM
Docker Multi Host Networking, Rachit Arora, IBMDocker Multi Host Networking, Rachit Arora, IBM
Docker Multi Host Networking, Rachit Arora, IBMNeependra Khare
 
Load Balancing 101
Load Balancing 101Load Balancing 101
Load Balancing 101HungWei Chiu
 
Load Balancing Applications with NGINX in a CoreOS Cluster
Load Balancing Applications with NGINX in a CoreOS ClusterLoad Balancing Applications with NGINX in a CoreOS Cluster
Load Balancing Applications with NGINX in a CoreOS ClusterKevin Jones
 

What's hot (19)

Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
Service Discovery & Load-Balancing under Docker 1.12.0 @ Docker Meetup #22
 
Docker Intro
Docker IntroDocker Intro
Docker Intro
 
Orchestrating Least Privilege by Diogo Monica
Orchestrating Least Privilege by Diogo Monica Orchestrating Least Privilege by Diogo Monica
Orchestrating Least Privilege by Diogo Monica
 
Docker 1.5
Docker 1.5Docker 1.5
Docker 1.5
 
Docker Networking Tip - Load balancing options
Docker Networking Tip - Load balancing optionsDocker Networking Tip - Load balancing options
Docker Networking Tip - Load balancing options
 
DCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep diveDCUS17 : Docker networking deep dive
DCUS17 : Docker networking deep dive
 
Cloning Running Servers with Docker and CRIU by Ross Boucher
Cloning Running Servers with Docker and CRIU by Ross BoucherCloning Running Servers with Docker and CRIU by Ross Boucher
Cloning Running Servers with Docker and CRIU by Ross Boucher
 
Docker Networking Overview
Docker Networking OverviewDocker Networking Overview
Docker Networking Overview
 
Docker summit : Docker Networking Control-plane & Data-Plane
Docker summit : Docker Networking Control-plane & Data-PlaneDocker summit : Docker Networking Control-plane & Data-Plane
Docker summit : Docker Networking Control-plane & Data-Plane
 
Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)Kubernetes - State of the Union (Q1-2016)
Kubernetes - State of the Union (Q1-2016)
 
Consul and docker swarm cluster
Consul and docker swarm clusterConsul and docker swarm cluster
Consul and docker swarm cluster
 
Driving containerd operations with gRPC
Driving containerd operations with gRPCDriving containerd operations with gRPC
Driving containerd operations with gRPC
 
Docker network Present in VietNam DockerDay 2015
Docker network Present in VietNam DockerDay 2015Docker network Present in VietNam DockerDay 2015
Docker network Present in VietNam DockerDay 2015
 
Kubernetes Networking
Kubernetes NetworkingKubernetes Networking
Kubernetes Networking
 
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
KubeCon EU 2016: Creating an Advanced Load Balancing Solution for Kubernetes ...
 
Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016Docker for PHP Developers - ZendCon 2016
Docker for PHP Developers - ZendCon 2016
 
Docker Multi Host Networking, Rachit Arora, IBM
Docker Multi Host Networking, Rachit Arora, IBMDocker Multi Host Networking, Rachit Arora, IBM
Docker Multi Host Networking, Rachit Arora, IBM
 
Load Balancing 101
Load Balancing 101Load Balancing 101
Load Balancing 101
 
Load Balancing Applications with NGINX in a CoreOS Cluster
Load Balancing Applications with NGINX in a CoreOS ClusterLoad Balancing Applications with NGINX in a CoreOS Cluster
Load Balancing Applications with NGINX in a CoreOS Cluster
 

Similar to Dockerizing a multi-component Open Data app

OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OW2
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware
 
OCCIware@OW2con 2016
OCCIware@OW2con 2016OCCIware@OW2con 2016
OCCIware@OW2con 2016Marc Dutoo
 
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...Amazon Web Services
 
Dataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayDataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayJosef Adersberger
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioCHAKER ALLAOUI
 
CargoChain Brochure - Technology
CargoChain Brochure - TechnologyCargoChain Brochure - Technology
CargoChain Brochure - TechnologyCargoChain
 
Getting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache BahirGetting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache BahirLuciano Resende
 
Technology Overview
Technology OverviewTechnology Overview
Technology OverviewLiran Zelkha
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dataconomy Media
 
Data integration
Data integrationData integration
Data integrationBallerina
 
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016StampedeCon
 
OSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
OSDC 2019 | Democratizing Data at Go-JEK by Maulik SonejiOSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
OSDC 2019 | Democratizing Data at Go-JEK by Maulik SonejiNETWAYS
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...confluent
 
Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy snehal parikh
 

Similar to Dockerizing a multi-component Open Data app (20)

Linked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter HaaseLinked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter Haase
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...
 
OCCIware@OW2con 2016
OCCIware@OW2con 2016OCCIware@OW2con 2016
OCCIware@OW2con 2016
 
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
(BDT302) Big Data Beyond Hadoop: Running Mahout, Giraph, and R on Amazon EMR ...
 
Dataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayDataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice Way
 
Sdmx9 webservices
Sdmx9 webservicesSdmx9 webservices
Sdmx9 webservices
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process Scenario
 
Jacob Keecheril
Jacob KeecherilJacob Keecheril
Jacob Keecheril
 
CargoChain Brochure - Technology
CargoChain Brochure - TechnologyCargoChain Brochure - Technology
CargoChain Brochure - Technology
 
Getting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache BahirGetting insights from IoT data with Apache Spark and Apache Bahir
Getting insights from IoT data with Apache Spark and Apache Bahir
 
Technology Overview
Technology OverviewTechnology Overview
Technology Overview
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
Data integration
Data integrationData integration
Data integration
 
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
 
OSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
OSDC 2019 | Democratizing Data at Go-JEK by Maulik SonejiOSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
OSDC 2019 | Democratizing Data at Go-JEK by Maulik Soneji
 
Ss eb29
Ss eb29Ss eb29
Ss eb29
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
 
Intro to web dev
Intro to web devIntro to web dev
Intro to web dev
 
Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy Hadoop Integration with Microstrategy
Hadoop Integration with Microstrategy
 

Recently uploaded

What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 

Recently uploaded (20)

What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 

Dockerizing a multi-component Open Data app

  • 1. Dockerizing a multi-component Open Data app
      Athens Docker Meetup, June 2016
      Dimitris Negkas, Stergios Tsiafoulis
      dimneg@gmail.com, s.tsiafoulis@gmail.com
  • 2. Description and Scope
      LinkedEconomy (http://linkedeconomy.org/)
      • is a publicly available web platform and linked data repository.
      • Its scope is to transform, curate, aggregate, interlink and publish economic data in machine-readable format, to enable
          • citizens' awareness
          • research with unprecedented data
          • evidence-based policy
  • 3. Data Sources
      Sources currently used:
      • Transparency - DIAVGEIA
      • Central Electronic Registry of Public Procurement - E-Procurement
      • National Strategic Reference Framework (NSRF)
      • Central Market of Thessaloniki (CMT)
      • e-Prices
      • Fuel Prices
      • Municipality of Athens, Municipality of Thessaloniki
      • Government of Australia
  • 4. Data growth
      • We use OpenLink Virtuoso to store nearly 1B triples from 15 different sources.
      • We host 27 datasets in CKAN from 15 organizations.
      • The data volume increases correspondingly each month.
  • 5. Data processing
      • Each data source is handled and processed separately, because the available data are neither provided uniformly nor always in a machine-readable format.
      • Diavgeia, NSRF and the observatories for product and fuel prices provide rich APIs that can easily be queried for machine-readable data in JSON format.
      • For E-Procurement, CMT and the Municipalities of Athens and Thessaloniki there is no API available. We have therefore developed a software module that gathers the online information automatically and stores it in a machine-readable format.
  • 6. General Architecture
      Process model
      • Open economic data related to public budgeting, spending and prices are characterized by high volume, velocity, variety and veracity.
      • We have to build custom components under the common logic of transforming static data to linked open data streams.
  • 7. Process model: Nucleus
      • The nucleus of our approach is semantic modelling, data enrichment and interconnections.
      • Data are stored raw (as harvested from the sources), in RDF and in JSON formats.
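The step from harvested JSON to RDF can be illustrated with a minimal sketch. It is not the LinkedEconomy pipeline: the `http://example.org/` namespace, the record fields (`id`, `subject`, `amount`) and the function names are all hypothetical, and every value is emitted as a plain string literal, whereas real semantic modelling would map fields to an ontology.

```python
import json

# Hypothetical namespace for illustration only; not LinkedEconomy's vocabulary.
BASE = "http://example.org/"


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record: dict, id_field: str = "id") -> list[str]:
    """Turn one flat JSON record into N-Triples lines, one per non-null field."""
    subject = f"<{BASE}decision/{record[id_field]}>"
    triples = []
    for key, value in sorted(record.items()):
        if key == id_field or value is None:
            continue  # the id is used for the subject IRI, nulls carry no data
        predicate = f"<{BASE}property/{key}>"
        triples.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return triples


# A made-up record standing in for one item returned by a source API.
raw = '{"id": "A1", "subject": "Road works", "amount": 1200.5}'
lines = record_to_ntriples(json.loads(raw))
print("\n".join(lines))
```

Keeping the raw JSON next to the generated triples, as the slide describes, means the RDF can be regenerated whenever the mapping changes.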
  • 8. Process model: Data distribution
      • Enriched data are distributed through five channels:
          1. Data dumps (CKAN)
          2. SPARQL queries
          3. Web
          4. Social media
          5. Structured inputs to Business Intelligence (BI) systems
      • Additionally, data can be further analysed and exchanged with relevant platforms (e.g. SPARQL to R).
  • 9. Process model: Validation and messenger
      • The validation component runs throughout the whole process in order to safeguard high data quality by detecting errors.
      • The messaging component works as an internal messaging and alert system for all components.
  • 11. Infrastructure

      Infrastructure   Functionalities / Components      Services / Data sources
      VM1              linkedeconomy.org                 apache, php, mysql, drupal
      VM2              SPARQL endpoint, demo site        OLV, apache, php, mysql, drupal
      VM3              Harvester                         CouchDB, Lucene, apache, mysql / CKAN (Greek Datasets)
      VM4              Harvester, Messenger              mysql, LinkedEconomy dropbox
      VM5              Storage - Secondary triplestore   CouchDB, OLV, CouchDB-Lucene, docker
      VM6              Harvester                         apache, php, mysql, drupal / CKAN (Foreign Datasets)
      VM7              SPARQL endpoint                   OLV (Foreign graphs)
      VM8              Management                        JIRA, mysql, tomcat
      VM9              Dashboard                         front-end, CMS, INSPINIA
      VM10             System administration             VPN, firewalls, etc.
      Physical         Storage - Core triplestore        OLV (Greek graphs)

      As core infrastructure we use ~okeanos, which is an established cloud-based service provided for the Greek research and academic community.
  • 13. CKAN
  • 14. “Hottest” Prices per municipality
  • 16. Application System (diagram)
      Components: Small Applications (Java, PHP and UNIX scripts), Di@vgeia, KHMDHS, Virtuoso, CouchDB, Drupal, MySql, ePrices, CKAN, fuelPrices, QGIS
  • 17. Dockerize the System (diagram)
      Components: Di@vgeia, KHMDHS, ePrices, Virtuoso, Drupal, MySql, QGIS Desktop, CouchDB, QGIS Server, Small Applications, CKAN
  • 19. Docker MySQL
      version: '2'
      services:
        mysql:
          # Will build the image from your directory
          build: ./mysql-docker/5.6
          container_name: eLodDrupalmySQL
          # Save your data !!
          volumes:
            - /mysql_drupal:/var/lib/mysql
          environment:
            - MYSQL_DATABASE=drupalelod
            - MYSQL_ROOT_PASSWORD=eLodmysqlpass
          # Do not use flag "always" in your development environment!
          restart: on-failure
  • 20. Docker Drupal
        drupal:
          build: ./docker-drupal
          command:
            - /start.sh
          # Will start the service only after MySQL service
          depends_on:
            - mysql
          container_name: eLodDrupal
          #image: eLodDrupal
          ports:
            - "8081:80"
          volumes:
            - "/data_drupal:/var/www/html"
          # Will link the container with MySQL container
          links:
            - "mysql"
          environment:
            - MYSQL_DATABASE=drupalelod
            - MYSQL_USER=root
            - MYSQL_PASSWORD=eLodmysqlpass
            - DRUPAL_ADMIN_PW=eLODDR
            - DRUPAL_ADMIN=admin
            - MYSQL_HOST=eLodDrupalmySQL
            - DRUPAL_ADMIN_EMAIL=stetsiafoulis@gmail.com
          restart: on-failure
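The effect of depends_on on start order can be sketched as a topological sort over the services. This is only an illustration of the ordering idea, not how Compose is implemented; the `depends_on` dictionary below is transcribed by hand from the slides and `start_order` is a hypothetical helper. Note that in the version 2 file format depends_on only orders container start-up; it does not wait until MySQL is ready to accept connections.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Service -> services it depends on, copied by hand from the Compose slides.
depends_on = {
    "mysql": [],
    "drupal": ["mysql"],
    "virtuoso": [],
}


def start_order(dependencies: dict[str, list[str]]) -> list[str]:
    """Return an order in which every service comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = start_order(depends_on)
print(order)
```

Because the Drupal container may come up before the database accepts connections, a start script such as /start.sh typically has to retry the database connection itself.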
  • 21. Docker Virtuoso
        virtuoso:
          build: ./docker-virtuoso
          container_name: eLodVirtuoso
          ports:
            - "8890:8890"
          volumes:
            - /virtuoso/db:/var/lib/virtuoso/db
          environment:
            - DBA_PASSWORD=eLodVir
            - SPARQL_UPDATE=true
            - DEFAULT_GRAPH=http://localhost:8890/DAV
          restart: on-failure
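With port 8890 published as in the Virtuoso service definition, a SPARQL query can be sent as an HTTP GET with a `query` parameter, as defined by the SPARQL protocol. The sketch below only builds the request URL and makes no network call; the endpoint path `/sparql` is an assumption (it is Virtuoso's usual default, but the slides do not state it), and the query and the helper name `sparql_get_url` are illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint: host and port from the Compose slide, path assumed to be /sparql.
ENDPOINT = "http://localhost:8890/sparql"

# Illustrative query: any ten triples from the store.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL protocol GET URL with the query percent-encoded."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL can be opened with any HTTP client; a JSON result set is normally requested by sending an `Accept: application/sparql-results+json` header.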
  • 22. Docker QGIS
        qgisdesktop:
          #image: kartoza/qgis-desktop:2.14
          build: ./qgis-desktop/2.14
          hostname: qgis-server
          volumes:
            #Wherever you want to mount your data from
            - ./gis:/gis
            #Unix socket for X11
            - "/tmp/.X11-unix:/tmp/.X11-unix"
          links:
            - db:db
          environment:
            - DISPLAY=unix:1
          command: /usr/bin/qgis
  • 23. Build the system
      • Clone the repository from GitHub: https://github.com/stetsiafoulis/eLOD
      • Create the directories where you are going to link your data.
      • Run docker-compose up -d, and that's it!
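The second step, creating the host directories, can be derived from the volumes entries shown on the Compose slides. The following is a sketch under assumptions: the `VOLUMES` list is copied by hand from the slides (the X11 socket is left out because it already exists on the host), and `host_paths` and `ensure_dirs` are hypothetical helpers, not part of the eLOD repository. The `root` parameter exists so the function can be tried in a scratch directory instead of `/`.

```python
from pathlib import Path

# "host:container" volume entries copied by hand from the Compose slides.
VOLUMES = [
    "/mysql_drupal:/var/lib/mysql",
    "/data_drupal:/var/www/html",
    "/virtuoso/db:/var/lib/virtuoso/db",
    "./gis:/gis",
]


def host_paths(volumes: list[str]) -> list[str]:
    """Return the host side of each "host:container" volume entry."""
    return [entry.split(":", 1)[0] for entry in volumes]


def ensure_dirs(volumes: list[str], root: Path = Path("/")) -> list[Path]:
    """Create each host directory under root and return the created paths.

    Absolute host paths are re-rooted under root; relative ones such as
    ./gis are resolved against root as well.
    """
    created = []
    for host in host_paths(volumes):
        path = Path(host)
        relative = path.relative_to(path.anchor) if path.is_absolute() else path
        target = root / relative
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created


if __name__ == "__main__":
    # Dry run: only list the directories that would be needed.
    for host in host_paths(VOLUMES):
        print(host)
```

Calling `ensure_dirs(VOLUMES)` with the default root creates the directories at the locations the Compose file expects, which usually requires root privileges for the paths directly under `/`.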
  • 24. Why Docker?
      • Portable
      • Lightweight
      • Move to different cloud infrastructures and to physical servers
      • Run on virtual machines for development and testing
      • Easily scale
      • Easy delivery and deployment
      • Run anywhere (regardless of host distro, physical, cloud or not)
      • Run anything
  • 26. Scaling per Source (diagram)
      Sources: Di@vgeia, KHMDHS
      The diagram shows the stack twice, each copy containing: Virtuoso, Drupal, MySql, QGIS Desktop, CouchDB, QGIS Server, Small Applications
  • 27. Run Small Apps through Docker API (diagram: Small Applications)
  • 28. Next Steps - Swarm
      Services in the diagram: Virtuoso, Drupal, MySql, CouchDB, QGIS Server
      • Cluster management
      • Scaling
      • State reconciliation
      • Multi-host networking
      • Service discovery
      • Load balancing
  • 29. Next Steps - Consul
      • Health Checking
      • Service Discovery
      • Multi Datacenter support
  • 31. Appendix - Data Sources links
      LinkedEconomy (http://linkedeconomy.org/), linkedeconomy@gmail.com
      Sources currently used:
      • Transparency - DIAVGEIA: https://diavgeia.gov.gr
      • Central Electronic Registry of Public Procurement - E-Procurement (KHMDHS): http://www.eprocurement.gov.gr
      • National Strategic Reference Framework (NSRF): https://www.espa.gr/en
      • Central Market of Thessaloniki (CMT): http://www.kath.gr/
      • e-Prices: http://www.e-prices.gr/
      • Fuel Prices: http://www.fuelprices.gr/
      • Municipality of Athens: https://www.cityofathens.gr/khe/proypologismos
      • Municipality of Thessaloniki: http://www.thessaloniki.gr/portal/page/portal/DioikitikesYpiresies/GenDnsiDioikOikonYpiresion/DnsiDiafanEksipirDimoton/Tmima Diafaneias/AnoiktiDdiathesiDedomenon/DimosiefsiEktelesisProipologismou/ektelesi-proypologismou
      • Government of Australia: http://data.gov.au/

Editor's Notes

  1. Open economic data related to public budgeting, spending and prices are characterized by high volume, velocity, variety and veracity.
  2. 10 virtual machines with memory and storage capacities that span from 2GB to 8GB RAM and 20GB to 100GB respectively, as well as a non-commodity (physical) server of 12 CPUs, 64GB RAM and a storage capacity of more than 4TB.
  3. This map shows which municipalities are the most expensive for a specific product, e.g. milk, fruit or petrol. The colour scale indicates the price of the product in each municipality: the redder, the more expensive.
  4. We also use QGIS to display geoinformation for supermarkets and other POIs on the map.
  5. The system consists of the CKAN data portal, Drupal, Virtuoso, several MySQL instances, QGIS Server, CouchDB and many scripts of different technologies and scope. We use this system of apps to process information from different data sources. As mentioned before, the system runs on the cloud-based ~okeanos infrastructure. In some cases we need to move the system to, or back it up on, a different cloud or physical infrastructure. This is where Docker helped us achieve that with little effort.
  6. We dockerized the services one by one until we decided to use the new Compose (file format version 2). Compose creates the entire system with a single command: docker-compose up -d. It also creates an internal network and attaches the containers to it automatically.
  7. Restart policies:
      • no: Do not automatically restart the container when it exits. This is the default.
      • on-failure[:max-retries]: Restart only if the container exits with a non-zero exit status. Optionally, limit the number of restart retries the Docker daemon attempts.
      • always: Always restart the container regardless of the exit status. When you specify always, the Docker daemon will try to restart the container indefinitely. The container will also always start on daemon startup, regardless of the current state of the container.
      • unless-stopped: Always restart the container regardless of the exit status, but do not start it on daemon startup if the container has been put to a stopped state before.
     An ever-increasing delay (double the previous delay, starting at 100 milliseconds) is added before each restart to prevent flooding the server. This means the daemon will wait for 100 ms, then 200 ms, 400, 800, 1600, and so on until either the on-failure limit is hit, or you docker stop or docker rm -f the container. If a container is successfully restarted (the container is started and runs for at least 10 seconds), the delay is reset to its default value of 100 ms.
     You can specify the maximum number of times Docker will try to restart the container when using the on-failure policy. The default is that Docker will try forever to restart the container. The number of (attempted) restarts for a container can be obtained via docker inspect. For example, to get the number of restarts for container “my-container”;
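The backoff schedule described in this note can be checked with a small calculation. The sketch models only the arithmetic stated above (start at 100 ms, double each time, reset after a run of at least 10 seconds); it is not Docker's implementation, and the function names are hypothetical.

```python
def restart_delays(attempts: int, initial_ms: int = 100) -> list[int]:
    """Delays in milliseconds before each of the first `attempts` restarts."""
    return [initial_ms * 2**i for i in range(attempts)]


def next_delay_ms(previous_ms: int, ran_for_s: float,
                  initial_ms: int = 100, reset_after_s: float = 10.0) -> int:
    """Delay before the next restart, given the last delay and the last run time.

    A container that stayed up for at least reset_after_s seconds counts as
    successfully restarted, so the delay falls back to the initial value.
    """
    if ran_for_s >= reset_after_s:
        return initial_ms
    return previous_ms * 2


print(restart_delays(5))          # the sequence quoted in the note
print(sum(restart_delays(10)))    # total wait across ten consecutive failures
```

Ten consecutive failures already add up to more than 100 seconds of waiting, which is one reason an unbounded policy can hide a crash loop during development.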
  8. Swarm features:
      • Cluster management integrated with Docker Engine: Use the Docker Engine CLI to create a Swarm of Docker Engines where you can deploy application services. You don't need additional orchestration software to create or manage a Swarm.
      • Decentralized design: Instead of handling differentiation between node roles at deployment time, the Docker Engine handles any specialization at runtime. You can deploy both kinds of nodes, managers and workers, using the Docker Engine. This means you can build an entire Swarm from a single disk image.
      • Declarative service model: Docker Engine uses a declarative approach to let you define the desired state of the various services in your application stack. For example, you might describe an application comprised of a web front end service with message queueing services and a database backend.
      • Scaling: For each service, you can declare the number of tasks you want to run. When you scale up or down, the swarm manager automatically adapts by adding or removing tasks to maintain the desired state.
      • Desired state reconciliation: The swarm manager node constantly monitors the cluster state and reconciles any differences between the actual state and your expressed desired state. For example, if you set up a service to run 10 replicas of a container, and a worker machine hosting two of those replicas crashes, the manager will create two new replicas to replace the ones that crashed. The swarm manager assigns the new replicas to workers that are running and available.
      • Multi-host networking: You can specify an overlay network for your services. The swarm manager automatically assigns addresses to the containers on the overlay network when it initializes or updates the application.
      • Service discovery: Swarm manager nodes assign each service in the swarm a unique DNS name and load balance running containers. You can query every container running in the swarm through a DNS server embedded in the swarm.
      • Load balancing: You can expose the ports for services to an external load balancer. Internally, the swarm lets you specify how to distribute service containers between nodes.
      • Secure by default: Each node in the swarm enforces TLS mutual authentication and encryption to secure communications between itself and all other nodes. You have the option to use self-signed root certificates or certificates from a custom root CA.
      • Rolling updates: At rollout time you can apply service updates to nodes incrementally. The swarm manager lets you control the delay between service deployment to different sets of nodes. If anything goes wrong, you can roll back a task to a previous version of the service.
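The desired state reconciliation example in this note (10 replicas, a worker with two of them crashes) can be traced with a toy model. This is a simplified sketch and not Swarm's scheduler: replicas are represented only by the name of the worker that runs them, the worker names are made up, and the least-loaded placement rule is an assumption chosen for illustration.

```python
def reconcile(desired_replicas: int, running: list[str],
              available_workers: set[str]) -> list[str]:
    """Return the replica placement after one reconciliation pass.

    `running` holds one worker name per replica. Replicas on workers that
    are no longer available are dropped, then replacements are placed on
    the currently least-loaded available worker.
    """
    alive = [worker for worker in running if worker in available_workers]
    missing = desired_replicas - len(alive)
    if missing > 0 and not available_workers:
        raise RuntimeError("no available workers to place replicas on")
    for _ in range(max(missing, 0)):
        # Sorting first makes ties deterministic (alphabetical order).
        target = min(sorted(available_workers), key=alive.count)
        alive.append(target)
    # Scaling down: keep only as many replicas as desired.
    return alive[:desired_replicas]


# Ten replicas spread over three hypothetical workers; worker3 then crashes.
before = ["worker1"] * 4 + ["worker2"] * 4 + ["worker3"] * 2
after = reconcile(10, before, {"worker1", "worker2"})
print(after.count("worker1"), after.count("worker2"), after.count("worker3"))
```

The point of the declarative model is visible in the signature: the caller states only the desired count, and the placement is recomputed from whatever the cluster currently looks like.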
  9. What is Consul? Consul has multiple components, but as a whole, it is a tool for discovering and configuring services in your infrastructure. It provides several key features:
      • Service Discovery: Clients of Consul can provide a service, such as api or mysql, and other clients can use Consul to discover providers of a given service. Using either DNS or HTTP, applications can easily find the services they depend upon.
      • Health Checking: Consul clients can provide any number of health checks, either associated with a given service ("is the webserver returning 200 OK"), or with the local node ("is memory utilization below 90%"). This information can be used by an operator to monitor cluster health, and it is used by the service discovery components to route traffic away from unhealthy hosts.
      • Key/Value Store: Applications can make use of Consul's hierarchical key/value store for any number of purposes, including dynamic configuration, feature flagging, coordination, leader election, and more. The simple HTTP API makes it easy to use.
      • Multi Datacenter: Consul supports multiple datacenters out of the box. This means users of Consul do not have to worry about building additional layers of abstraction to grow to multiple regions.