DICE
Horizon 2020 Research & Innovation Action
Grant Agreement no. 644869
http://www.dice-h2020.eu
Funded by the Horizon 2020
Framework Programme of the European Union
DICE:
quality Big Data made easy
Matej Artač @matej_artac
XLAB Research @xlab_research
Michele Guerriero, Damian A. Tamburri
Politecnico di Milano
Agenda
o Introduction
o About DICE
  Developing Data-Intensive Cloud Applications with Iterative Quality Enhancements
o DICER
o TOSCA
o Cloudify
o DICE delivery tools
2
Introduction
3
Building blocks for DIAs today
4
[Figure: Lambda architecture on cloud infrastructure. Labels: data sources; coordinator (Kafka), data streaming; orchestrator (Hadoop cluster), distributed computation; data store (HDFS), distributed storage; batch layer; speed layer; serving layers.]
What problems do EU SMEs face?
5
Traditional market:
Legacy software systems
Customers with legacy data now ask for Big Data technologies
Growth in sight, but …
Learning curves
Initial prototyping
Risk of failure
(+ others…)
Fast-paced market
Traditional approach to deployment
o Spend time studying documentation
o Use trial and error to set up a working cluster
o Use incompatible public cookbooks
o Repeat for each change
o Keep the Big Data cluster fixed for fear of breaking it
Time scale: days
What is in the box?
o DICER
  model your application
  create a TOSCA blueprint
o Cloudify
  pure-play orchestration & automation
o DICE deployment tool
  alternative front-end for Cloudify
o TOSCA library
  worries about deploying Big Data services so that you don’t have to
7
DICE
Developing Data-Intensive Cloud Applications with
Iterative Quality Enhancements
8
The Rapid Growth of Big Data
9
o Software market rapidly shifting to Big Data
  27% compound annual growth rate through 2017 (IDC)
  Popular technologies such as Spark, Hadoop, and NoSQL boost Big Data adoption and revenues from new services
Business issue: 65% of Big Data projects still fail (CapGemini’15)
[Charts; sources: IDC, Wikibon]
DevOps toolchains in innovation
10
Application Release Automation
Continuous Delivery
DevOps Toolchain
DICE Mission and Partners
 ICT 9 Call/2014 – Software engineering
 9 partners (Academia & SMEs), 7 EU countries
11
Mission: support SMEs in developing high-quality cloud-based data-intensive applications (DIAs)
Partners: IEAT, IMP, PMI, ZAR, NETF, XLAB, ATC, FLEXI, PRO
Ingredients of the DICE approach
o DevOps
o Model-Driven Engineering
12
[Figure labels: Dev, Ops, Analysis, Deployment blueprint]
DICE incremental modeling and analysis
13
[Figure: DICE methodology, model refinement chain]
o DICE Platform Independent Model (DPIM)
  is implemented by the DTSM (M2M transformation)
o DICE Technology Specific Model (DTSM)
  is deployed onto the DDSM (M2M transformation)
o DICE Deployment Specific Model (DDSM)
  becomes a TOSCA blueprint (M2T transformation)
o Analysis accompanies each model level (one step is labeled Analysis & Optimization)
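The DPIM, DTSM, DDSM and TOSCA blueprint chain on this slide can be pictured as three functions: two model-to-model (M2M) steps and one model-to-text (M2T) step. The following is only a minimal sketch under loud assumptions: the dataclasses, their fields, the role-to-technology mapping and the emitted text layout are all hypothetical and do not reflect the real DICE metamodels or the output of DICER.

```python
from dataclasses import dataclass


@dataclass
class DPIM:
    """Platform-independent view: abstract roles only (hypothetical)."""
    components: list[str]


@dataclass
class DTSM:
    """Technology-specific view: each role bound to a technology."""
    technologies: dict[str, str]   # role -> technology name


@dataclass
class DDSM:
    """Deployment-specific view: each technology placed on a host."""
    placements: dict[str, str]     # technology -> host name


def dpim_to_dtsm(model: DPIM, choices: dict[str, str]) -> DTSM:
    # M2M step: bind every abstract component to a chosen technology.
    return DTSM({role: choices[role] for role in model.components})


def dtsm_to_ddsm(model: DTSM) -> DDSM:
    # M2M step: give every technology its own (made-up) host name.
    return DDSM({tech: f"{tech}_vm" for tech in model.technologies.values()})


def ddsm_to_text(model: DDSM) -> str:
    # M2T step: serialise the deployment model as blueprint-like text.
    lines = ["node_templates:"]
    for tech, host in sorted(model.placements.items()):
        lines.append(f"  {host}: {{type: Host}}")
        lines.append(f"  {tech}: {{type: Service, hosted_on: {host}}}")
    return "\n".join(lines)


dpim = DPIM(["batch_processor", "storage"])
dtsm = dpim_to_dtsm(dpim, {"batch_processor": "spark", "storage": "hdfs"})
blueprint_text = ddsm_to_text(dtsm_to_ddsm(dtsm))
print(blueprint_text)
```

Keeping each step a pure function mirrors the idea that analysis can be run on the intermediate models before the final text is generated.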
DICE deployment, monitoring and testing
14
[Figure: DICE methodology, runtime side. A TOSCA blueprint feeds Deployment onto a Testbed holding running DIAs, each drawn as a Comp / MW / VM stack. Surrounding tools: Monitoring, Fault Injection, Quality Testing, Trace Checking, Enhancement, Anomaly Detection, Configuration optimization.]
DICE architecture
15
[Figure: DICE architecture. IDE based on Eclipse, with Profile, Simulation, Optimization, Verification and DICER; Repository & CI; Delivery; Configuration Optimization; Monitoring; Trace Checking; Enhancement; Anomaly Detection; Fault Injection (Resilience); Quality Testing; running DIAs drawn as Comp / MW / VM stacks.]
XLAB delivery tools
16
[Figure: subset of the DICE architecture: Delivery, DICER, and running DIAs drawn as Comp / MW / VM stacks.]
DICER
Create actionable deployment diagrams of
Data Intensive Applications
17
DICER
o Assisted Component-based infrastructure design
o 100% automation of model transformations
18
Bordeaux Plenary
DICER in action: Assisted modelling!
19
1. <<Jerry: I’m modelling this DDSM thingie… I don’t know what I’m doing!>>
2. <<DICE: Hey Jerry! You’re missing this piece here… And there!>>
…
DICER in action: Assisted modelling!
20
3. <<DICE: ok that’s better now… Carry on…>>
OASIS TOSCA
OASIS Topology and Orchestration Specification
for Cloud Applications
21
What is TOSCA?
o Open standard
o Enabling a unique Cloud eco-system
o Supported by a large and growing number of international industry leaders
22
Associated Companies
TOSCA is an Intent Model which is declarative (integration points for imperative)
  TOSCA can work with imperative scripts (e.g., Ansible, Chef, Bash, Ant, etc.)
  TOSCA can include other data models (e.g., JSON, YANG)
TOSCA Domain-Specific Language
o Information Models: typically used to model a constrained domain that can be described by a closed set of entity types, properties, relationships and operations.
  Types, Relationships
  Properties
  Operations
o Data Models: typically describe the structure (format), enabling manipulation (via interfaces) of the data stored in data management systems, assuring integrity.
  Structure
  Format
  Interfaces
o Intent Model adds:
  Topology
  Composition
  Requirements - Capabilities
  State (Nodes, Relationships)
  Lifecycle (Management)
  Policy
Topology – Nodes and Relationships
24
TOSCA is used first and foremost to describe the topology of the deployment view for cloud applications and services
[Figure labels: Tier (Group Type); source_resource (Node_Type_A) with a Requirement; connect_relationship (ConnectsTo); target_resource (Node_Type_B) with a Capability.]
o Nodes are the resources or components that will be materialized or consumed in the deployment topology
o Relationships express the dependencies between the nodes (not the traffic flow)
o Requirement - Capability: relationships can be customized to match specific source requirements to target capabilities
o Groups: create logical, management or policy groups (1 or more nodes)
  Node templates to describe components in the topology structure
  Relationship templates to describe connections, dependencies, deployment ordering
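Because relationship templates express dependencies rather than traffic flow, a deployment order can be derived from them with a topological sort. The sketch below assumes a toy topology whose node names are illustrative only, and it uses Python's standard `graphlib` module rather than any TOSCA orchestrator; a real engine such as Cloudify applies its own workflow logic.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Node templates: name -> node type (names are illustrative only).
node_templates = {
    "app_server": "Compute",
    "web_server": "WebServer",
    "db_server": "Compute",
    "database": "Database",
}

# Relationship templates: (source, relationship type, target).
# The source depends on the target, so the target is deployed first.
relationships = [
    ("web_server", "HostedOn", "app_server"),
    ("database", "HostedOn", "db_server"),
    ("web_server", "ConnectsTo", "database"),
]


def deployment_order(nodes, rels):
    """Return node names so that every target precedes its sources."""
    sorter = TopologicalSorter({name: set() for name in nodes})
    for source, _rel_type, target in rels:
        sorter.add(source, target)   # target is a predecessor of source
    return list(sorter.static_order())


order = deployment_order(node_templates, relationships)
print(order)
```

Several valid orders exist for this topology; the only guarantee is that each target appears before the nodes that depend on it.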
Composition – different service templates can be “wired” together
25
[Figure: an Application Tier (container) and a Logging/Monitoring Tier (ELK). Application side: paypal_pizza_store (WebApplication), nodejs (WebServer), app_server (Compute), mongo_db (Database), mongo_dbms (DBMS), mongo_server (Compute), plus collectd and rsyslog. Logging side: logstash, elasticsearch and kibana (each a SoftwareComponent) HostedOn logstash_server, elasticsearch_server and kibana_server (each Compute). ConnectsTo relationships link requirements to the log_endpoint and search_endpoint capabilities.]
Enabling the description of complex, multi-tier (hybrid) Cloud applications
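Wiring service templates together comes down to matching an open requirement in one template to a capability offered by another. The sketch below shows that matching step in isolation. The endpoint and node names are taken from the slide, but the dictionary layout, and the choice of `collectd` and `rsyslog` as the nodes that need a `log_endpoint`, are assumptions made for illustration.

```python
# Two service templates, reduced to the capabilities they offer and the
# requirements they leave open. The layout is an assumption of this sketch.
logging_tier = {
    "capabilities": {
        "log_endpoint": "logstash",
        "search_endpoint": "elasticsearch",
    },
}
application_tier = {
    "requirements": [
        # (source node, required capability) - assumed pairing
        ("collectd", "log_endpoint"),
        ("rsyslog", "log_endpoint"),
    ],
}


def wire(consumer, provider):
    """Match each open requirement to a provider capability by name."""
    links = []
    for source, needed in consumer["requirements"]:
        target = provider["capabilities"].get(needed)
        if target is None:
            raise LookupError(f"no capability satisfies {needed!r}")
        links.append((source, "ConnectsTo", target))
    return links


links = wire(application_tier, logging_tier)
print(links)
```

Failing loudly on an unmatched requirement is a deliberate choice here: a template that cannot be wired should not be deployed half-connected.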
Cloudify
26
27
Introducing Cloudify
Open | Extensible | Simple
REPEATABLE
28
[Figure: an application blueprint (TOSCA) is handled through IaaS plugins, container plugins and configuration management plugins, with monitoring & alarming.]
● Provision
● Configure
● Monitor
● Manage
29
30
Cloudify Key Aspects
o Open Source: open source is key to driving innovation and creating superb quality software.
o Open Standard: the open standard-based TOSCA spec for application blueprints allows vendor neutrality and enables collaboration.
o Future Proof: try new emerging technologies while still using the stable ones already in place.
31
“The only constant is change”
- Unknown
DICE Delivery Tools
32
Components of Delivery Tools
33
RESTful API
IaaS
Web GUI
Technology
Library
Delivery Service
34
[Figure: the Delivery Service with numbered containers. Container 1: Blueprint A plus platform params. Container 2: Blueprint B and Blueprint B.2 plus platform params. Container 15: Blueprint B.2.]
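The container idea on the Delivery Service slide can be sketched as a small class. This is an illustration under stated assumptions, not the API of the DICE deployment service: the `Container` class, its `deploy` method, the parameter names and the rule that blueprint inputs override platform parameters are all hypothetical. It also assumes that deploying into an occupied container replaces the previous blueprint, which is how the Blueprint B to Blueprint B.2 step on the slide is read here.

```python
class Container:
    """Logical unit of deployment: holds at most one blueprint at a time."""

    def __init__(self, name):
        self.name = name
        self.blueprint = None
        self.inputs = {}

    def deploy(self, blueprint, platform_params):
        # Assumption: a new deployment replaces what the container held.
        previous = self.blueprint
        self.blueprint = blueprint["name"]
        # Platform parameters fill in inputs; explicit blueprint inputs win.
        self.inputs = {**platform_params, **blueprint.get("inputs", {})}
        return previous


# Platform parameters are kept by the service, not by each blueprint.
platform_params = {"os_image_id": "img-123", "flavour_id": "medium"}

container = Container("Container 2")
container.deploy({"name": "Blueprint B"}, platform_params)
replaced = container.deploy(
    {"name": "Blueprint B.2", "inputs": {"workers": 3}}, platform_params
)
print(replaced, "->", container.blueprint)
```

Keeping platform parameters outside the blueprint lets the same blueprint be submitted unchanged to containers on different platforms.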
TOSCA technology library
o A plug-in for Cloudify
o A single import line in the TOSCA blueprint
o Node types + Chef cookbooks for Big Data services
o Unified across supported IaaS vendors
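To make the "single import line" concrete, the sketch below builds a tiny blueprint as a Python dictionary and checks that every relationship points at an existing node template. `cloudify_dsl_1_3` and `cloudify.relationships.contained_in` are identifiers from Cloudify's DSL; the import URL and the `example.*` node type names are placeholders invented for this sketch and are not the names the DICE technology library actually defines.

```python
import json

# Hypothetical location of the technology library; not the real URL.
LIBRARY_IMPORT = "http://example.org/dice-technology-library/plugin.yaml"

blueprint = {
    "tosca_definitions_version": "cloudify_dsl_1_3",
    "imports": [LIBRARY_IMPORT],           # the single import line
    "node_templates": {
        "worker_vm": {"type": "example.hosts.Medium"},
        "spark_worker": {
            "type": "example.components.spark.Worker",
            "relationships": [
                {"type": "cloudify.relationships.contained_in",
                 "target": "worker_vm"},
            ],
        },
    },
}


def dangling_targets(bp):
    """List relationship targets that name no node template."""
    nodes = bp["node_templates"]
    return [
        rel["target"]
        for node in nodes.values()
        for rel in node.get("relationships", [])
        if rel["target"] not in nodes
    ]


assert dangling_targets(blueprint) == []
# JSON is printed for convenience; blueprints are normally written as YAML.
print(json.dumps(blueprint, indent=2))
```

A check of this kind is cheap to run in a CI job before the blueprint is handed to the orchestrator.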
35
Deploy your own Big Data services
DevOps approach:
o Describe your Big Data cluster and application in a blueprint
o Store and maintain the blueprint in your VCS with the application’s code: Infrastructure as Code (IaC)
o Rely on orchestrators and configuration managers for executing deployments
36
Time scale: hours
Demo – DevOps
42
Conclusion
43
Conclusion
o DICE tools remove barriers to Big Data
o DICE technology library simplifies blueprints
o TOSCA blueprints describe infrastructure as code
o Enabled Continuous Integration and Continuous Delivery
44
Links
o Cloudify: http://getcloudify.org
o DICE H2020: http://www.dice-h2020.eu/
o DICE deployment service:
https://github.com/dice-project/DICE-Deployment-Service
o Big Data blueprint examples:
https://github.com/dice-project/DICE-Deployment-Examples
o DICER:
https://github.com/dice-project/DICER
45
Follow us
o DICE project: @diceh2020
o Cloudify:
@CloudifySource – ilanadl@getcloudify.org
o User groups:
https://groups.google.com/forum/#!forum/cloudify-users
o Webinars:
http://getcloudify.org/webinars.html
o Matej Artač: @matej_artac – matej.artac@xlab.si
o XLAB Research: @xlab_research
46
Q & A
47

DICE & Cloudify – Quality Big Data Made Easy


Editor's Notes

  • #13 Models are the basis for analyses. Incremental reasoning on models can lead to the automated creation of deployment blueprints.
  • #25 Node templates describe components in the topology structure. Relationship templates describe connections, dependencies and deployment ordering.
  • #26 Example: Connect a Logging / Monitoring Service composed of ElasticSearch, LogStash and Kibana (ELK)
  • #35 Container is a logical unit of deployment to enable focused continuous delivery. Delivery Service handles platform parameters (e.g., credentials for the IaaS, OS image and flavour IDs, location of a monitoring service, etc.)
  • #40 Orchestration engine and blueprint visualization provided by Cloudify. Blueprint generated by DICER.
  • #41 Orchestration engine and blueprint visualization provided by Cloudify. Blueprint generated by DICER.