www.modeliosoft.com
Model driven engineering for big data management systems
Marcos ALMEIDA marcos.almeida@softeam.fr
Sarah DAHAB sarah.dahab@telecom-sudparis.eu
Andrey SADOVYKH andrey.sadovykh@softeam.fr
Outline
Introduction
Model-driven development
Big Data
Juniper
Sample application
Conclusions
SOFTEAM – a French IT services / Software vendor
• SOFTEAM, a growing company
o 25 years’ experience
o 900 experts
o Regular growth
• Specialist in OO technologies, new architectures, methodologies
• Banking, Defense, Telecom, …
[Figure: growth chart with 17.5 M€ (2005), 20 M€ (2006), 23 M€ (2008), 70 M€ (2013); offices in Paris, Rennes, Nantes, Sophia]
Modelio for Software and System Engineering
• UML editor with 20 years’ history
o CloudML
o SysML
o MARTE
o Code generation
o Documentation
o Teamwork
• Available under open source at Modelio.org
MODEL-DRIVEN DEVELOPMENT
It is all about models … Starting with UML
• Requirements: UML Use Cases
• Architecture: UML Components and Classes
• Design: Refined Classes or Domain Specific Language
• Implementation: Code generation (Java, C++, Frameworks)
Model = Code
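The "Model = Code" idea can be illustrated with a very small generator. This is only a sketch under stated assumptions: `ClassModel`, `Attribute`, and `generate_java` are hypothetical names invented for this example and are not part of Modelio or of any Juniper tool. It shows how a class described as data can be rendered into Java source text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    # Hypothetical model element: one typed attribute of a class.
    name: str
    java_type: str


@dataclass
class ClassModel:
    # Hypothetical model element: a class with a list of attributes.
    name: str
    attributes: List[Attribute] = field(default_factory=list)


def generate_java(model: ClassModel) -> str:
    """Render a Java class with one private field and one getter per attribute."""
    lines = [f"public class {model.name} {{"]
    for attr in model.attributes:
        lines.append(f"    private {attr.java_type} {attr.name};")
    for attr in model.attributes:
        getter = "get" + attr.name[0].upper() + attr.name[1:]
        lines.append("")
        lines.append(f"    public {attr.java_type} {getter}() {{")
        lines.append(f"        return {attr.name};")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


movie = ClassModel("Movie", [Attribute("title", "String"), Attribute("year", "int")])
java_source = generate_java(movie)
print(java_source)
```

A real generator would also handle associations, visibility, and target frameworks; the point here is only that the model, not the generated text, is the artefact being maintained.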
Typical example: Control system for a frigate
• 800+ components
• Developed by 100+ engineers
• 1M+ LOC
• MDD fosters Productivity and Quality with
o Code generation
o Components reuse
o Tracing
o Automation
Curious DSL example: Ruby on Rails
HAML
Haml                       HTML
%br{:clear => 'left'}      <br clear="left"/>
%p.foo Hello               <p class="foo">Hello</p>
%p#foo Hello               <p id="foo">Hello</p>
.foo                       <div class="foo">...</div>
#foo.bar                   <div id="foo" class="bar">...</div>
Cucumber and Capybara
Feature: User can manually add movie
  Scenario: Add a movie
    Given I am on the RottenPotatoes home page
    When I follow "Add new movie"
    Then I should be on the Create New Movie page
    When I fill in "Title" with "Men In Black"
    And I should see "Men In Black"
What do we get from MDD?
Pros
• Design once, deploy everywhere!
• Write your transformation once, transform anything!
Cons
• Transformations are hard to write…
• How to make sure they are CORRECT? i.e.
– Is there any data/semantic loss?
BIG DATA
Volume, variety, velocity
1. E-mails sent every second: 2.9 million
2. Video uploaded to YouTube every minute: 25 hours
3. Data processed by Google every day: 24 petabytes
4. Tweets per day: 50 million
5. Products ordered on Amazon per second: 73 items
Only 0.5% of data is analyzed
• In 2012, 2,837 EB were generated, of which just 0.5% was actually analyzed. That still amounts to about 14 EB (or 14.185 million terabytes).
Source: IDC & EMC
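The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 EB = 1,000,000 TB); the variable names are chosen for this example only.

```python
# Figures quoted on the slide (source: IDC & EMC).
generated_eb = 2837        # exabytes generated in 2012
analyzed_share = 0.005     # 0.5% of the data is analyzed

# Volume actually analyzed, in exabytes.
analyzed_eb = generated_eb * analyzed_share

# Assumption: decimal units, so 1 EB = 1,000,000 TB.
analyzed_million_tb = analyzed_eb * 1_000_000 / 1_000_000

print(round(analyzed_eb, 3), "EB analyzed")
print(round(analyzed_million_tb, 3), "million TB analyzed")
```

With decimal units, a value in exabytes and the same value in millions of terabytes are numerically identical, which is why the slide can quote 14 EB and 14.185 million terabytes side by side.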
The main problem is Heterogeneity!
• Many different database management systems
o Ex:
• MySQL (www.mysql.com/),
• Big Table (http://research.google.com/archive/bigtable.html)
• SimpleDB (http://aws.amazon.com/simpledb/)
• Memcached (http://memcached.org/)
• …
• Many underlying data representation paradigms
o Ex:
• Relational Databases
• Key-value Stores
• Object-oriented Databases
• Big Tables
• …
The basis of our solution is MDE… Why?
• Separating the problem from the solution
o In JUNIPER we model the solution
• Fostering automation
o Analysis
o Code generation
[Figure: Business Objects (abstract models) pass through one transformation per target, HDFS, MySQL, and MongoDB (specific models / code)]
Understanding the problem… Why is it so HARD? (1/2)
• Target technologies are based on different paradigms
• Example:
[Figure: class A associated with class B]
JPA
@Entity
public class A {
    @Basic
    public B getB() {
        …
    }
    …
}
SQL
create table A (…)
create table B (…)
create table A_B (…)
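To make the relational side of this example concrete, here is a small sketch that emits the three tables for an association between two classes and runs them against an in-memory SQLite database. The column names (`id`, `a_id`, `b_id`) and the function `association_ddl` are assumptions made for this illustration; the slide elides the actual columns, and SQLite stands in for whichever relational database is targeted.

```python
import sqlite3


def association_ddl(source: str, target: str) -> list:
    """Return CREATE TABLE statements for two classes and a join table linking them."""
    s, t = source.lower(), target.lower()
    return [
        f"CREATE TABLE {source} (id INTEGER PRIMARY KEY)",
        f"CREATE TABLE {target} (id INTEGER PRIMARY KEY)",
        # The join table holds one row per link between a source and a target row.
        f"CREATE TABLE {source}_{target} ("
        f"{s}_id INTEGER NOT NULL REFERENCES {source}(id), "
        f"{t}_id INTEGER NOT NULL REFERENCES {target}(id), "
        f"PRIMARY KEY ({s}_id, {t}_id))",
    ]


statements = association_ddl("A", "B")

# Execute the generated DDL to confirm that it is valid SQL.
connection = sqlite3.connect(":memory:")
for statement in statements:
    connection.execute(statement)

tables = [row[0] for row in connection.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

The same two-class model therefore yields one annotated class on the object side and three tables on the relational side, which is the paradigm gap the slide points at.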
Understanding the problem… Why is it so HARD? (2/2)
• Target structure is variable
• Example: the same classes A and B
o ER: here A and B are independent entities
o NoSQL: here, for performance reasons, B is embedded in A
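The two target structures can be sketched with plain dictionaries. This is an illustration only: the field names (`id`, `b_id`, `b`) and both functions are hypothetical, and no particular database driver is involved.

```python
import json


def to_relational(a: dict, b: dict) -> dict:
    """Keep A and B as independent rows; A points at B through a foreign key."""
    row_a = dict(a)
    row_a["b_id"] = b["id"]
    return {"A": [row_a], "B": [dict(b)]}


def to_document(a: dict, b: dict) -> dict:
    """Embed B inside A so that a single read returns both."""
    document = dict(a)
    # The embedded copy no longer needs its own identifier.
    document["b"] = {key: value for key, value in b.items() if key != "id"}
    return document


a = {"id": 1, "name": "order"}
b = {"id": 7, "city": "Paris"}

relational = to_relational(a, b)
document = to_document(a, b)

print(json.dumps(relational))
print(json.dumps(document))
```

Which form is preferable depends on the access pattern: embedding avoids a join at read time, while independent entities avoid duplicating B when several A records share it.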
Illustration: comparative features of MongoDB and PostgreSQL
Our solution: a component based approach to NoSQL heterogeneity
• Generic model transformation chain
o Integrated with other Juniper tools
• Audit rules
• Model to model transformations
• Code generators
• Database specific instantiations
o Application architecture modelling
o Data modelling
o Hardware architecture (deployment) modelling
The Juniper FP7 EU project
Website: http://www.juniper-project.org/
Start Date: 2012-12-01
Duration: 36 months
Total cost: 4 M€
JUNIPER integrates Big Data technologies over MPI
[Figure: documents, streams, and databases feed data processing stages 1 to N, which serve business intelligence, analytical DBs, and visualization]
[Figure: Data Processing in JUNIPER: stages S1, S2, and S3 linked over MPI to analytical DBs, running on FPGA-enabled nodes, Hadoop, and HPC]
Modelling in Juniper
[Figure: models (high level architecture: nodes, programs, streams…; real-time constraints) linked to Java code by code generation (+MPI initialization, communication, etc.) and reverse engineering; model export to the Schedulability Analysis Tool; measurements and advice from the Scheduling Advisor; code generation of deployment scripts and configuration]
Mapping Programming Model, UML and MARTE
[Figure: JUNIPER programming model concepts (Program, Channel, Cloud Node) mapped to UML and MARTE]
Modelling the application and real-time constraints
[Figure: JUNIPER programs connected by a Big Data flow, annotated with real-time constraints (response time, bandwidth)]
Modelling the hardware infrastructure at a high level
[Figure: Cloud Node with a 4-core CPU and a hard drive]
MPI code generation
[Figure: code generation from the JUNIPER application model]
Overview of the Juniper programming model concepts
Next step: integrating data modelling to the programming model
Business data modelling in Juniper
• Example
[Figure: annotated data model showing how pieces of data are uniquely identified and how data is partitioned across different nodes]
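The two concerns named on this slide, unique identification and partitioning across nodes, are often combined by hashing the identifying key. The sketch below shows that general technique; it is not taken from Juniper, and `node_for`, `partition`, and the sample records are hypothetical.

```python
import hashlib


def node_for(key: str, node_count: int) -> int:
    """Map a unique key to a node index with a stable hash."""
    # hashlib is used instead of hash() so the result is the same on every run.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def partition(records: list, key_field: str, node_count: int) -> list:
    """Distribute records over node_count buckets according to their key."""
    nodes = [[] for _ in range(node_count)]
    for record in records:
        nodes[node_for(str(record[key_field]), node_count)].append(record)
    return nodes


records = [{"id": f"movie-{n}", "title": f"Title {n}"} for n in range(10)]
nodes = partition(records, "id", 3)

for index, bucket in enumerate(nodes):
    print(index, [record["id"] for record in bucket])
```

Because the node is derived from the key alone, any component that knows the key can locate the record without a central lookup table.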
Business data modelling in Juniper
• Concepts
Approach taken for dealing with heterogeneity in JUNIPER
1. Define a generic template for Modelio modules to provide support for big data management systems
2. Instantiate the template for MongoDB and PostgreSQL
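The two steps above follow a template pattern: shared behaviour lives in a generic base, and each database supplies its own specifics. The sketch below illustrates that pattern only. The class names, the `generate` and `schema_script` methods, and the type mapping are assumptions made for this example and do not reflect the actual Modelio module API.

```python
from abc import ABC, abstractmethod


class DatabaseModule(ABC):
    """Generic template: shared workflow plus a database-specific hook."""

    def generate(self, entities: dict) -> list:
        # Shared step: a simple audit rule applied before any script is produced.
        for name, fields in entities.items():
            if not fields:
                raise ValueError(f"entity {name!r} has no fields")
        return [self.schema_script(name, fields) for name, fields in entities.items()]

    @abstractmethod
    def schema_script(self, name: str, fields: dict) -> str:
        """Return the schema script for one entity."""


class PostgreSQLModule(DatabaseModule):
    # Hypothetical mapping from model types to PostgreSQL column types.
    TYPES = {"string": "text", "int": "integer"}

    def schema_script(self, name: str, fields: dict) -> str:
        columns = ", ".join(f"{field} {self.TYPES[kind]}" for field, kind in fields.items())
        return f"CREATE TABLE {name} ({columns});"


class MongoDBModule(DatabaseModule):
    def schema_script(self, name: str, fields: dict) -> str:
        # Collections need no column list, so only the name is used here.
        return f'db.createCollection("{name}")'


model = {"movie": {"title": "string", "year": "int"}}
postgres_scripts = PostgreSQLModule().generate(model)
mongo_scripts = MongoDBModule().generate(model)

print(postgres_scripts)
print(mongo_scripts)
```

Supporting another database in this sketch would mean adding one more subclass, while the audit step in `generate` stays shared.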
MongoDB modelling module
MongoDB Example (1/2)
MongoDB Example (2/2)
+ DEMO Video
Database schema configuration scripts
Deployment scripts
Configuration scripts
PostgreSQL modeller module
PostgreSQL Example
master installation script
standby installation script
configuration files
DATABASE MIGRATION SAMPLE APPLICATION
[VIDEO]
CONCLUSIONS
In short…
• Challenge:
o Big data applications: how should we handle heterogeneous data?
• Juniper response:
o Model driven solution for designing real-time big data systems
o Component based solution to heterogeneity
• General business objects + big data concepts modelling
• Database specific concepts
– Modelling
– Model transformations
– Code generation
… and Perspectives / Exploitation
• Source code and documentation available on our website
o http://forge.modelio.org/projects/juniper
o http://forge.modelio.org/projects/mongodb-modeler
o http://forge.modelio.org/projects/postgresql-modeler
• Tutorial + Dissemination on our forum
Questions?
Marcos Almeida
SOFTEAM | ModelioSoft
{name.surname}@softeam.fr
SOFTEAM R&D Web Site: http://rd.softeam.com
Modelio Web Site: http://www.modelio.org
http://forge.modelio.org/projects/juniper
JUNIPER Web Site: http://www.juniper-project.org
*
*for your questions

More Related Content

What's hot

CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019Christoph Windheuser
 
Development of 3 d interfaces for mobile BIM applications by João Poças Martins
Development of 3 d interfaces for mobile BIM applications by João Poças MartinsDevelopment of 3 d interfaces for mobile BIM applications by João Poças Martins
Development of 3 d interfaces for mobile BIM applications by João Poças MartinsJoao Rio
 
ACM SIGMOD SBD2016 - Querying and reasoning over large scale building dataset...
ACM SIGMOD SBD2016 - Querying and reasoning over large scale building dataset...ACM SIGMOD SBD2016 - Querying and reasoning over large scale building dataset...
ACM SIGMOD SBD2016 - Querying and reasoning over large scale building dataset...Pieter Pauwels
 
CIB W78 2015 - Keynote "The Web of Construction Data:Pathways and Opportunities"
CIB W78 2015 - Keynote "The Web of Construction Data:Pathways and Opportunities"CIB W78 2015 - Keynote "The Web of Construction Data:Pathways and Opportunities"
CIB W78 2015 - Keynote "The Web of Construction Data:Pathways and Opportunities"Pieter Pauwels
 
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018 Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018 DataBench
 
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...Big Data Value Association
 
Industry 4.0 Assessment Overview
Industry 4.0 Assessment OverviewIndustry 4.0 Assessment Overview
Industry 4.0 Assessment OverviewEd Morrison
 
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...DataBench
 
BIM in France - A journey from standard to dictionary
BIM in France - A journey from standard to dictionaryBIM in France - A journey from standard to dictionary
BIM in France - A journey from standard to dictionary Mariela Daskalova
 
Eclipse RMF - Requirements Modeling Framework - ReqIF in der Open Source
Eclipse RMF - Requirements Modeling Framework - ReqIF in der Open Source Eclipse RMF - Requirements Modeling Framework - ReqIF in der Open Source
Eclipse RMF - Requirements Modeling Framework - ReqIF in der Open Source Mark Brörkens
 
BIMMeeting 2016 - BIM-Infra-GIS: building bridges from single buildings to di...
BIMMeeting 2016 - BIM-Infra-GIS: building bridges from single buildings to di...BIMMeeting 2016 - BIM-Infra-GIS: building bridges from single buildings to di...
BIMMeeting 2016 - BIM-Infra-GIS: building bridges from single buildings to di...Pieter Pauwels
 
CIB W78 Accelerating BIM Workshop 2015 - IFC2RDF tools
CIB W78 Accelerating BIM Workshop 2015 - IFC2RDF toolsCIB W78 Accelerating BIM Workshop 2015 - IFC2RDF tools
CIB W78 Accelerating BIM Workshop 2015 - IFC2RDF toolsPieter Pauwels
 
BuildingSMART Standards Summit 2015 - JBeetz - Product Room - Use Cases for i...
BuildingSMART Standards Summit 2015 - JBeetz - Product Room - Use Cases for i...BuildingSMART Standards Summit 2015 - JBeetz - Product Room - Use Cases for i...
BuildingSMART Standards Summit 2015 - JBeetz - Product Room - Use Cases for i...Pieter Pauwels
 
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...Alok Singh
 
AEC Hackathon -London (06-08/10/2017) Team Conenctivity- BIM and smart city c...
AEC Hackathon -London (06-08/10/2017) Team Conenctivity- BIM and smart city c...AEC Hackathon -London (06-08/10/2017) Team Conenctivity- BIM and smart city c...
AEC Hackathon -London (06-08/10/2017) Team Conenctivity- BIM and smart city c...Ali Ismail
 
1257103560 X Mp Lantand Iso15926 Oct2009
1257103560 X Mp Lantand Iso15926 Oct20091257103560 X Mp Lantand Iso15926 Oct2009
1257103560 X Mp Lantand Iso15926 Oct2009Giorgio Amici
 

What's hot (16)

CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
CD4ML - ThoughtWorks MeetUp Munich Christoph Windheuser May 8th 2019
 
Development of 3 d interfaces for mobile BIM applications by João Poças Martins
Development of 3 d interfaces for mobile BIM applications by João Poças MartinsDevelopment of 3 d interfaces for mobile BIM applications by João Poças Martins
Development of 3 d interfaces for mobile BIM applications by João Poças Martins
 
ACM SIGMOD SBD2016 - Querying and reasoning over large scale building dataset...
ACM SIGMOD SBD2016 - Querying and reasoning over large scale building dataset...ACM SIGMOD SBD2016 - Querying and reasoning over large scale building dataset...
ACM SIGMOD SBD2016 - Querying and reasoning over large scale building dataset...
 
CIB W78 2015 - Keynote "The Web of Construction Data:Pathways and Opportunities"
CIB W78 2015 - Keynote "The Web of Construction Data:Pathways and Opportunities"CIB W78 2015 - Keynote "The Web of Construction Data:Pathways and Opportunities"
CIB W78 2015 - Keynote "The Web of Construction Data:Pathways and Opportunities"
 
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018 Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
Big Data Technical Benchmarking, Arne Berre, BDVe Webinar series, 09/10/2018
 
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ...
 
Industry 4.0 Assessment Overview
Industry 4.0 Assessment OverviewIndustry 4.0 Assessment Overview
Industry 4.0 Assessment Overview
 
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
Benchmarking for Big Data Applications with the DataBench Framework, Arne Ber...
 
BIM in France - A journey from standard to dictionary
BIM in France - A journey from standard to dictionaryBIM in France - A journey from standard to dictionary
BIM in France - A journey from standard to dictionary
 
Eclipse RMF - Requirements Modeling Framework - ReqIF in der Open Source
Eclipse RMF - Requirements Modeling Framework - ReqIF in der Open Source Eclipse RMF - Requirements Modeling Framework - ReqIF in der Open Source
Eclipse RMF - Requirements Modeling Framework - ReqIF in der Open Source
 
BIMMeeting 2016 - BIM-Infra-GIS: building bridges from single buildings to di...
BIMMeeting 2016 - BIM-Infra-GIS: building bridges from single buildings to di...BIMMeeting 2016 - BIM-Infra-GIS: building bridges from single buildings to di...
BIMMeeting 2016 - BIM-Infra-GIS: building bridges from single buildings to di...
 
CIB W78 Accelerating BIM Workshop 2015 - IFC2RDF tools
CIB W78 Accelerating BIM Workshop 2015 - IFC2RDF toolsCIB W78 Accelerating BIM Workshop 2015 - IFC2RDF tools
CIB W78 Accelerating BIM Workshop 2015 - IFC2RDF tools
 
BuildingSMART Standards Summit 2015 - JBeetz - Product Room - Use Cases for i...
BuildingSMART Standards Summit 2015 - JBeetz - Product Room - Use Cases for i...BuildingSMART Standards Summit 2015 - JBeetz - Product Room - Use Cases for i...
BuildingSMART Standards Summit 2015 - JBeetz - Product Room - Use Cases for i...
 
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
ODSC18, London, How to build high performing weighted XGBoost ML Model for Re...
 
AEC Hackathon -London (06-08/10/2017) Team Conenctivity- BIM and smart city c...
AEC Hackathon -London (06-08/10/2017) Team Conenctivity- BIM and smart city c...AEC Hackathon -London (06-08/10/2017) Team Conenctivity- BIM and smart city c...
AEC Hackathon -London (06-08/10/2017) Team Conenctivity- BIM and smart city c...
 
1257103560 X Mp Lantand Iso15926 Oct2009
1257103560 X Mp Lantand Iso15926 Oct20091257103560 X Mp Lantand Iso15926 Oct2009
1257103560 X Mp Lantand Iso15926 Oct2009
 

Similar to Model driven engineering for big data management systems

JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogene...
JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogene...JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogene...
JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogene...Andrey Sadovykh
 
Multi datastores - CLOSER'14
Multi datastores - CLOSER'14Multi datastores - CLOSER'14
Multi datastores - CLOSER'14Marcos Almeida
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareJustin Basilico
 
MODEL-DRIVEN ENGINEERING (MDE) in Practice
MODEL-DRIVEN ENGINEERING (MDE) in PracticeMODEL-DRIVEN ENGINEERING (MDE) in Practice
MODEL-DRIVEN ENGINEERING (MDE) in PracticeHussein Alshkhir
 
Enterprise Trends for MongoDB as a Service
Enterprise Trends for MongoDB as a ServiceEnterprise Trends for MongoDB as a Service
Enterprise Trends for MongoDB as a ServiceMongoDB
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionFlorian Wilhelm
 
DutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive SectorDutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive SectorBigML, Inc
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareJustin Basilico
 
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...MLconf
 
Datenstrategie der Zukunft - Technologietrends, die Sie kennen müssen
Datenstrategie der Zukunft - Technologietrends, die Sie kennen müssenDatenstrategie der Zukunft - Technologietrends, die Sie kennen müssen
Datenstrategie der Zukunft - Technologietrends, die Sie kennen müssenDenodo
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixJustin Basilico
 
The REMICS model-driven process for migrating legacy applications to the cloud
The REMICS model-driven process for migrating legacy applications to the cloudThe REMICS model-driven process for migrating legacy applications to the cloud
The REMICS model-driven process for migrating legacy applications to the cloudMarcos Almeida
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceMongoDB
 
Innovation in model driven software
Innovation in model driven softwareInnovation in model driven software
Innovation in model driven softwareSagi Schliesser
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning InfrastructureSigOpt
 
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?Digipolis Antwerpen
 
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine LearningPaige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine LearningEdunomica
 
Big Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely headingBig Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely headingPaco Nathan
 
2020 09-16-ai-engineering challanges
2020 09-16-ai-engineering challanges2020 09-16-ai-engineering challanges
2020 09-16-ai-engineering challangesIvica Crnkovic
 
A Lightweight MDD Process Applied in Small Projects
A Lightweight MDD Process Applied in Small ProjectsA Lightweight MDD Process Applied in Small Projects
A Lightweight MDD Process Applied in Small ProjectsGabor Guta
 

Similar to Model driven engineering for big data management systems (20)

JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogene...
JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogene...JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogene...
JUNIPER: Towards Modeling Approach Enabling Efficient Platform for Heterogene...
 
Multi datastores - CLOSER'14
Multi datastores - CLOSER'14Multi datastores - CLOSER'14
Multi datastores - CLOSER'14
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
 
MODEL-DRIVEN ENGINEERING (MDE) in Practice
MODEL-DRIVEN ENGINEERING (MDE) in PracticeMODEL-DRIVEN ENGINEERING (MDE) in Practice
MODEL-DRIVEN ENGINEERING (MDE) in Practice
 
Enterprise Trends for MongoDB as a Service
Enterprise Trends for MongoDB as a ServiceEnterprise Trends for MongoDB as a Service
Enterprise Trends for MongoDB as a Service
 
Bridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to ProductionBridging the Gap: from Data Science to Production
Bridging the Gap: from Data Science to Production
 
DutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive SectorDutchMLSchool. ML for Energy Trading and Automotive Sector
DutchMLSchool. ML for Energy Trading and Automotive Sector
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
 
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
 
Datenstrategie der Zukunft - Technologietrends, die Sie kennen müssen
Datenstrategie der Zukunft - Technologietrends, die Sie kennen müssenDatenstrategie der Zukunft - Technologietrends, die Sie kennen müssen
Datenstrategie der Zukunft - Technologietrends, die Sie kennen müssen
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
 
The REMICS model-driven process for migrating legacy applications to the cloud
The REMICS model-driven process for migrating legacy applications to the cloudThe REMICS model-driven process for migrating legacy applications to the cloud
The REMICS model-driven process for migrating legacy applications to the cloud
 
Webinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-ServiceWebinar: Enterprise Trends for Database-as-a-Service
Webinar: Enterprise Trends for Database-as-a-Service
 
Innovation in model driven software
Innovation in model driven softwareInnovation in model driven software
Innovation in model driven software
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning Infrastructure
 
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
Meetup 21/9/2017 - Image Recogonition: onmisbaar voor een slimme stad?
 
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine LearningPaige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
 
Big Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely headingBig Data is changing abruptly, and where it is likely heading
Big Data is changing abruptly, and where it is likely heading
 
2020 09-16-ai-engineering challanges
2020 09-16-ai-engineering challanges2020 09-16-ai-engineering challanges
2020 09-16-ai-engineering challanges
 
A Lightweight MDD Process Applied in Small Projects
A Lightweight MDD Process Applied in Small ProjectsA Lightweight MDD Process Applied in Small Projects
A Lightweight MDD Process Applied in Small Projects
 

Recently uploaded

TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDSTYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDSedrianrheine
 
Presentation2.pptx - JoyPress Wordpress
Presentation2.pptx -  JoyPress WordpressPresentation2.pptx -  JoyPress Wordpress
Presentation2.pptx - JoyPress Wordpressssuser166378
 
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024Jan Löffler
 
Computer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteComputer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteMavein
 
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdfIntroduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdfShreedeep Rayamajhi
 
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSLESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSlesteraporado16
 
Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Shubham Pant
 
Zero-day Vulnerabilities
Zero-day VulnerabilitiesZero-day Vulnerabilities
Zero-day Vulnerabilitiesalihassaah1994
 
Bio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxBio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxnaveenithkrishnan
 
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfLESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfmchristianalwyn
 
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...APNIC
 
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced HorizonsVision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced HorizonsRoxana Stingu
 

Recently uploaded (12)

TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDSTYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
TYPES AND DEFINITION OF ONLINE CRIMES AND HAZARDS
 
Presentation2.pptx - JoyPress Wordpress
Presentation2.pptx -  JoyPress WordpressPresentation2.pptx -  JoyPress Wordpress
Presentation2.pptx - JoyPress Wordpress
 
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
 
Computer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteComputer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a Website
 
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdfIntroduction to ICANN and Fellowship program  by Shreedeep Rayamajhi.pdf
Introduction to ICANN and Fellowship program by Shreedeep Rayamajhi.pdf
 
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSLESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
 
Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024
 
Zero-day Vulnerabilities
Zero-day VulnerabilitiesZero-day Vulnerabilities
Zero-day Vulnerabilities
 
Bio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxBio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptx
 
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfLESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
 
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
 
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced HorizonsVision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
Vision Forward: Tracing Image Search SEO From Its Roots To AI-Enhanced Horizons
 

Model driven engineering for big data management systems

  • 1. www.modeliosoft.com Model driven engineering for big data management systems Marcos ALMEIDA marcos.almeida@softeam.fr Sarah DAHAB sarah.dahab@telecom-sudparis.eu Andrey SADOVYKH andrey.sadovykh@softeam.fr 1
  • 3. SOFTEAM – a French IT services / Software vendor
      • SOFTEAM, a growing company: 25 years' experience, 900 experts, regular growth
      • Specialist in OO technologies, new architectures, methodologies
      • Banking, Defense, Telecom, …
      • Map labels: Paris, Rennes, Nantes, Sophia
      • Chart labels: 17,5 ME (2005), 20 ME (2006), 23 ME (2008), 70 ME (2013)
  • 4. Modelio for Software and System Engineering
      • UML editor with 20 years' history: CloudML, SysML, MARTE, code generation, documentation, teamwork
      • Available under open source at Modelio.org
  • 6. It is all about models … Starting with UML
      • Requirements: UML Use Cases
      • Architecture: UML Components and Classes
      • Design: Refined Classes or Domain Specific Language
      • Implementation: Code generation (Java, C++, Frameworks)
  • 8. Typical example: Control system for a frigate
      • 800+ components
      • Developed by 100+ engineers
      • 1M+ LOC
      • MDD fosters productivity and quality with code generation, component reuse, tracing, and automation
  • 9. Curious DSL example: Ruby on Rails
      • Haml and the HTML it produces:
          o %br{:clear => 'left'} gives <br clear="left"/>
          o %p.foo Hello gives <p class="foo">Hello</p>
          o %p#foo Hello gives <p id="foo">Hello</p>
          o .foo gives <div class="foo">...</div>
          o #foo.bar gives <div id="foo" class="bar">...</div>
      • Cucumber and Capybara:
          Feature: User can manually add movie
          Scenario: Add a movie
            Given I am on the RottenPotatoes home page
            When I follow "Add new movie"
            Then I should be on the Create New Movie page
            When I fill in "Title" with "Men In Black"
            And I should see "Men In Black"
  • 10. What do we get from MDD?
      • Pros
          o Design once, deploy everywhere!
          o Write your transformation once, transform anything!
      • Cons
          o Transformations are hard to write…
          o How to make sure they are CORRECT? That is, is there any data or semantic loss?
  • 12. Volume, variety, velocity
      1. E-mails sent every second: 2,9 million
      2. Video uploaded to YouTube every minute: 25 hours
      3. Data processed by Google every day: 24 petabytes
      4. Tweets per day: 50 million
      5. Products ordered on Amazon per second: 73 items
  • 13. Only 0,5% of data is analyzed
      • In 2012, 2 837 EB of data were generated, and just 0,5% of it was actually analyzed. That still amounts to about 14 EB (14.185 million terabytes). Source: IDC & EMC
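
The figures on slide 13 can be checked with a few lines of arithmetic. This is only a sketch: it assumes decimal prefixes (1 EB = 10^6 TB), which the slide does not state, and the variable names are illustrative.

```python
# Figures quoted on the slide (source: IDC & EMC).
GENERATED_EB = 2837        # exabytes generated in 2012
ANALYZED_SHARE = 0.005     # 0,5 % of the data is analyzed
TB_PER_EB = 1_000_000      # assumption: decimal prefixes, 1 EB = 10**6 TB

# Volume of analyzed data, in exabytes and in millions of terabytes.
analyzed_eb = GENERATED_EB * ANALYZED_SHARE
analyzed_million_tb = analyzed_eb * TB_PER_EB / 1_000_000

print(f"{analyzed_eb:.3f} EB analyzed")
print(f"{analyzed_million_tb:.3f} million TB analyzed")
```

Under that assumption the two numbers on the slide (14 EB and 14.185 million terabytes) are the same quantity, rounded differently.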
  • 14. The main problem is Heterogeneity!
      • Many different database management systems, for example:
          o MySQL (www.mysql.com/)
          o Big Table (http://research.google.com/archive/bigtable.html)
          o SimpleDB (http://aws.amazon.com/simpledb/)
          o Memcached (http://memcached.org/)
          o …
      • Many underlying data representation paradigms, for example:
          o Relational Databases
          o Key-value Stores
          o Object-oriented Databases
          o Big Tables
          o …
  • 15. The basis of our solution is MDE… Why?
      • Separating the problem from the solution
          o In JUNIPER we model the solution
      • Fostering automation
          o Analysis
          o Code generation
      • Diagram: Business Objects (abstract models) pass through one transformation per target (HDFS, MySQL, MongoDB) to produce specific models / code
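
The idea on slide 15, one abstract model and one transformation per target, can be sketched roughly as follows. Everything here is hypothetical and is not the Juniper or Modelio implementation: the `BusinessObject` and `Attribute` classes, the type mappings, and the `Movie` example are invented for illustration, and the document-store output is a plain JSON-Schema-like dictionary rather than anything database specific.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    kind: str  # abstract type, here "string" or "integer"


@dataclass(frozen=True)
class BusinessObject:
    name: str
    attributes: tuple


# Hypothetical mappings from abstract types to target-specific types.
SQL_TYPES = {"string": "VARCHAR(255)", "integer": "INTEGER"}
JSON_TYPES = {"string": "string", "integer": "integer"}


def to_sql(obj):
    """Transformation 1: abstract model to a relational table definition."""
    columns = ", ".join(f"{a.name} {SQL_TYPES[a.kind]}" for a in obj.attributes)
    return f"CREATE TABLE {obj.name} ({columns});"


def to_document_schema(obj):
    """Transformation 2: abstract model to a JSON-Schema-like description."""
    return {
        "title": obj.name,
        "type": "object",
        "properties": {a.name: {"type": JSON_TYPES[a.kind]} for a in obj.attributes},
    }


movie = BusinessObject(
    "Movie", (Attribute("title", "string"), Attribute("year", "integer"))
)
print(to_sql(movie))
print(to_document_schema(movie))
```

The business object is written once; supporting another target means adding another transformation function, not another model.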
  • 16. Understanding the problem… Why is it so HARD? (1/2)
      • Target technologies are based on different paradigms
      • Example, for a model with two related classes A and B:
          o JPA: @Entity public class A { @Basic public B getB(){ … } … }
          o SQL: create table A (…) create table B (…) create table A_B (…)
  • 17. Understanding the problem… Why is it so HARD? (2/2)
      • Target structure is variable
      • Example, for the same classes A and B:
          o ER: here A and B are independent entities
          o NoSQL: here, for performance reasons, B is embedded in A
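
The two target structures on slide 17 can be illustrated with plain data. This sketch uses hypothetical field names (`id`, `name`, `label`) and ordinary dictionaries; it does not call any database API.

```python
def as_relational(a, bs):
    """ER style: A and B stay independent; a join table A_B links them."""
    return {
        "A": [{"id": a["id"], "name": a["name"]}],
        "B": [dict(b) for b in bs],
        "A_B": [{"a_id": a["id"], "b_id": b["id"]} for b in bs],
    }


def as_document(a, bs):
    """Document style: each B is embedded inside the A document."""
    return {
        "_id": a["id"],
        "name": a["name"],
        "bs": [{"label": b["label"]} for b in bs],
    }


a = {"id": 1, "name": "a1"}
bs = [{"id": 10, "label": "b1"}, {"id": 11, "label": "b2"}]

print(as_relational(a, bs))
print(as_document(a, bs))
```

Reading A together with its Bs takes a join across three tables in the first form and a single lookup in the second, which is the performance argument the slide makes for embedding.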
  • 18. Illustration: comparative features of MongoDB and PostgreSQL
  • 19. Our solution: a component based approach to NoSQL heterogeneity
      • Generic model transformation chain
          o Integrated with other Juniper tools
      • Audit rules
      • Model to model transformations
      • Code generators
      • Database specific instantiations
          o Application architecture modelling
          o Data modelling
          o Hardware architecture (deployment) modelling
  • 21. The Juniper FP7 EU project
      • Website: http://www.juniper-project.org/
      • Start Date: 2012-12-01
      • Duration: 36 months
      • Total cost: 4 M€
  • 22. JUNIPER integrates Big Data technologies over MPI
      • Diagram labels: DOCs, Streams, DBs; Data Processing (Stage 1 to Stage N); Analytical DBs; Business Intelligence; Visualization
      • Diagram labels, Data Processing in JUNIPER: stages S1, S2, S3 connected by mpi links; FPGA-enabled nodes; Hadoop; HPC
  • 23. Modelling in Juniper
      • Models: High level Architecture (Nodes, Programs, Streams…), Real-time constraints
      • Java Code: Code Generation (+MPI initialization, communication, etc), Reverse Engineering
      • Schedulability Analysis Tool and Scheduling Advisor: Model Export, Measurements & Advice
      • Deployment Scripts and Configuration: Code Generation
  • 24. Mapping Programming Model, UML and MARTE
      • Diagram labels: JUNIPER Program, Channel, Cloud Node; Programming Model; UML; MARTE
  • 25. Modelling the application and real-time constraints
      • Real-time constraints: response time, bandwidth
      • Diagram labels: Big Data flow, JUNIPER Programs
  • 26. Modelling the hardware infrastructure at a high level
      • Diagram labels: Cloud Node, CPU with 4 cores, Hard drive
  • 27. MPI code generation
      • Diagram labels: JUNIPER Application Model, Code Generation
  • 28. Overview of the Juniper programming model concepts
      • Next step: integrating data modelling to the programming model
  • 29. Business data modelling in Juniper
      • Example (diagram annotations): uniquely identifying pieces of data; partitioning data in different nodes
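
Slide 29 names two concerns: identifying each piece of data uniquely and partitioning data across nodes. The transcript does not say how Juniper partitions data, so the following is only one common way to combine the two, hash partitioning on the identifying key, with hypothetical names throughout.

```python
import hashlib


def node_for(key, node_count):
    """Map an identifying key to a node index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def partition(records, key_field, node_count):
    """Distribute records over node_count buckets by their identifying field."""
    nodes = [[] for _ in range(node_count)]
    for record in records:
        nodes[node_for(str(record[key_field]), node_count)].append(record)
    return nodes


# Hypothetical records; "id" plays the role of the unique identifier.
records = [{"id": f"order-{i}", "amount": i * 10} for i in range(20)]
buckets = partition(records, "id", 4)
for index, bucket in enumerate(buckets):
    print(index, [r["id"] for r in bucket])
```

A cryptographic hash is used instead of Python's built-in `hash` because the latter is salted per process for strings, so it would not send the same key to the same node across runs.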
  • 30. Business data modelling in Juniper
      • Concepts
  • 31. Approach taken for dealing with heterogeneity in JUNIPER
      1. Define a generic template for Modelio modules to provide support for big data management systems
      2. Instantiate the template for MongoDB and PostgreSQL
  • 34. MongoDB Example (2/2) + DEMO Video
      • Diagram labels: database schema configuration scripts, deployment scripts, configuration scripts
  • 36. PostgreSQL Example
      • Diagram labels: master installation script, standby installation script, configuration files
  • 40. In short…
      • Challenge:
          o Big data applications: how should we handle heterogeneous data?
      • Juniper response:
          o Model driven solution for designing real-time big data systems
          o Component based solution to heterogeneity
      • General business objects + big data concepts modelling
      • Database specific concepts: modelling, model transformations, code generation
  • 41. … and Perspectives / Exploitation
      • Source code and documentation available on our website
          o http://forge.modelio.org/projects/juniper
          o http://forge.modelio.org/projects/mongodb-modeler
          o http://forge.modelio.org/projects/postgresql-modeler
      • Tutorial + Dissemination on our forum
  • 42. Questions?
      • Marcos Almeida, SOFTEAM | ModelioSoft, {name.surname}@softeam.fr (for your questions)
      • SOFTEAM R&D Web Site: http://rd.softeam.com
      • Modelio Web Site: http://www.modelio.org and http://forge.modelio.org/projects/juniper
      • JUNIPER Web Site: http://www.juniper-project.org