MAP-BASED TRANSPARENT PERSISTENCE FOR VERY LARGE MODELS
Abel Gómez, Massimo Tisi, Gerson Sunyé and Jordi Cabot
OUTLINE
▌The landscape in MDE
▌Motivation: running example and
current persistence approaches
▌Towards a simple EMF-based
persistence layer
▌NEOEMF/MAP: A transparent
persistence layer for EMF models
▌Our experimental evaluation in a
nutshell
▌Conclusions and future work
© ATLANMOD - atlanmod-contact@mines-nantes.fr
INTRODUCTION
Why another persistence solution?
THE LANDSCAPE IN MDE
▌ Models and code generation are the center of the
software-engineering processes
▌ Modeling tools are built around modeling frameworks (EMF
has become the de facto standard)
▌ The technologies at the core of modeling frameworks were
designed to support simple modeling activities
▌ Since its publication, the XMI standard has been the
preferred format for storing and sharing models and
metamodels
▌ Clear limits arise when current technologies are applied to
very large models (VLMs): XML is not the right technology for
them (verbosity, costly serialization/deserialization…)
▌ Some solutions exist, but problems in managing memory
and persisting data are still under-studied in MDE
MOTIVATION
Running example
Current persistence approaches
RUNNING EXAMPLE
Java Metamodel (excerpt), nsURI: ’http://java’
RUNNING EXAMPLE
Java Metamodel (excerpt), nsURI: ’http://java’
Instance
MOTIVATION
▌ Within a modeling ecosystem, all tools that
need to access or manipulate models have to
pass through a single model management
interface
▌ In some of these ecosystems (e.g. EMF) the
model management interface is
automatically generated from the
metamodel
THE GENERATED MODEL
MANAGEMENT INTERFACE
// Creation of objects
Package p1 = Factory.createPackage();
ClassDeclaration c1 = Factory.createClassDeclaration();
BodyDeclaration b1 = Factory.createBodyDeclaration();
BodyDeclaration b2 = Factory.createBodyDeclaration();
Modifier m1 = Factory.createModifier();
Modifier m2 = Factory.createModifier();
// Initialization of attributes
p1.setName("package1");
c1.setName("class1");
b1.setName("bodyDecl1");
b2.setName("bodyDecl2");
m1.setVisibility(VisibilityKind.PUBLIC);
m2.setVisibility(VisibilityKind.PUBLIC);
// Initialization of references
p1.getOwnedElements().add(c1);
c1.getBodyDeclarations().add(b1);
c1.getBodyDeclarations().add(b2);
b1.setModifier(m1);
b2.setModifier(m2);
MOTIVATION
▌ Without any specific memory-management
solution, the model would need to be fully
contained in memory for any access or
modification
▌ Models that exceed the main memory
would cause a significant performance drop
or crash the application
STANDARD TECHNOLOGIES FOR
PERSISTING MODELS IN EMF
▌XML-based (XMI)
│ Pros: Readability, fast for small models
│ Cons: Needs to load/keep the whole
model in memory.
▌Connected Data Objects (CDO)
│ Pros: on-demand loading, transactions,
versioning, notifications
│ Cons: Only the relational mapping is
regularly maintained, does not scale well
with VLMs
NEW TRENDS IN PERSISTING
MODELS IN EMF
▌ Morsa (document-oriented)
│ On-demand loading, incremental updates, fully compatible with the
EMF API
│ Requires its own query language to get good performance
▌ MongoEMF (document-oriented)
│ Uses the standard EMF API
│ Behaves differently from the standard back-ends
▌ EMF fragments
│ Uses the standard proxy mechanism to partition models into small
chunks
│ Requires modifications to the metamodels to get the benefits of
partitions
▌ NeoEMF/Graph, a.k.a. Neo4EMF (graph-based)
│ Models are a set of highly interconnected elements → graphs are
the most natural way to represent them
│ The generated API only performs one-step navigations → a
significant gain in performance is obtained only when using native
queries on the underlying persistence back-end
MOTIVATION
▌ We need a transparent persistence layer
able to automatically persist, load and
unload model elements with no changes to
the application code
NEOEMF/MAP DESIGN GOALS
Towards a simple EMF-based persistence layer
MODEL-PERSISTENCE LAYER
▌NEOEMF/MAP must…
Interoperability requirements
… be an exact replacement
… use a replaceable underlying engine
… allow different types of caching
Performance requirements
… be memory friendly
… provide on-demand load capabilities
… free unused memory
… outperform current persistence layers
using the standard API
MODEL-PERSISTENCE LAYER
(Architecture diagram, top to bottom)
▌ Client Code: Model-based Tools
│ Model Access API
▌ Model Manager: EMF
│ Persistence API
▌ Persistence Manager: NeoEMF (/Map, /Graph), CDO, XMI Serialization; Caching Strategy
│ Backend API
▌ Persistence Backend: MapDB, GraphDB, RelationalDB, XMI File
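The layering on this slide can be sketched in plain Java. This is only an illustration under stated assumptions: the names `Backend`, `InMemoryBackend`, `CachingBackend` and `PersistenceManager` are hypothetical and do not come from NeoEMF, EMF or MapDB, and a `HashMap` stands in for the real storage engine.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of the layering: client code -> persistence manager -> replaceable backend. */
final class LayeringSketch {

    /** Backend API (hypothetical): the only thing the persistence manager knows about storage. */
    interface Backend {
        Object get(String key);
        void put(String key, Object value);
    }

    /** Stand-in for a real engine; here just a HashMap that counts reads. */
    static final class InMemoryBackend implements Backend {
        private final Map<String, Object> store = new HashMap<>();
        int reads = 0; // how often the engine itself was queried
        public Object get(String key) { reads++; return store.get(key); }
        public void put(String key, Object value) { store.put(key, value); }
    }

    /** One possible caching strategy, written as a decorator so it is independent of the engine. */
    static final class CachingBackend implements Backend {
        private final Backend delegate;
        private final Map<String, Object> cache = new HashMap<>();
        CachingBackend(Backend delegate) { this.delegate = delegate; }
        public Object get(String key) {
            // Falls through to the delegate only when the key is not cached yet.
            return cache.computeIfAbsent(key, delegate::get);
        }
        public void put(String key, Object value) {
            cache.put(key, value);
            delegate.put(key, value);
        }
    }

    /** Persistence API (hypothetical): what the model manager would call. */
    static final class PersistenceManager {
        private final Backend backend;
        PersistenceManager(Backend backend) { this.backend = backend; }
        void setFeature(String oid, String feature, Object value) {
            backend.put(oid + "#" + feature, value);
        }
        Object getFeature(String oid, String feature) {
            return backend.get(oid + "#" + feature);
        }
    }

    public static void main(String[] args) {
        InMemoryBackend engine = new InMemoryBackend();
        PersistenceManager pm = new PersistenceManager(new CachingBackend(engine));
        pm.setFeature("c1", "name", "class1");
        System.out.println(pm.getFeature("c1", "name")); // served from the cache
        System.out.println(engine.reads);                // engine reads so far
    }
}
```

Swapping `InMemoryBackend` for another implementation of `Backend`, or dropping the `CachingBackend` wrapper, leaves the client code untouched, which is the point of the "replaceable underlying engine" and "different types of caching" goals.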
NEOEMF/MAP
A TRANSPARENT PERSISTENCE LAYER FOR EMF MODELS
Memory Management
Map-based data model
Model operations as map operations
MEMORY MANAGEMENT
▌ Decoupling dependencies among objects
by assigning a unique identifier to all
model objects allows:
▌ Lightweight on-demand loading
│ Each live model object has a lightweight
delegate object that is in charge of on-
demand loading the element data and
keeping track of the element’s state
▌ Efficient garbage collection in the JRE
│ No hard Java references are kept among
model objects. Any model object not directly
referenced by the application will be
deallocated
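
The two points above can be illustrated with a small Java sketch. It is not NeoEMF code: `ModelObject`, `resolve`, `STORE` and `LIVE` are hypothetical names, a `HashMap` stands in for the persistent store, and weak references are one assumed way of keeping no hard references among model objects.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

final class OnDemandSketch {

    /** Stand-in for the persistent store: "oid#feature" -> value (references are stored as IDs). */
    static final Map<String, Object> STORE = new HashMap<>();

    /** Live objects by ID, held weakly so the garbage collector can reclaim unused ones. */
    static final Map<String, WeakReference<ModelObject>> LIVE = new HashMap<>();

    /** Counts reads against the store, to show that loading happens on demand. */
    static int storeReads = 0;

    /** Lightweight model object: holds only its ID and fetches data when asked. */
    static final class ModelObject {
        final String oid;
        ModelObject(String oid) { this.oid = oid; }

        Object get(String feature) {
            storeReads++;
            return STORE.get(oid + "#" + feature);
        }

        /** Follows a single-valued reference: the store yields an ID, not a Java reference. */
        ModelObject getReference(String feature) {
            Object target = get(feature);
            return target == null ? null : resolve((String) target);
        }
    }

    /** Returns the live object for an ID, creating a fresh lightweight one if none is reachable. */
    static ModelObject resolve(String oid) {
        WeakReference<ModelObject> ref = LIVE.get(oid);
        ModelObject obj = (ref == null) ? null : ref.get();
        if (obj == null) {
            obj = new ModelObject(oid);
            LIVE.put(oid, new WeakReference<>(obj));
        }
        return obj;
    }

    public static void main(String[] args) {
        STORE.put("b1#name", "bodyDecl1");
        STORE.put("b1#modifier", "m1");
        STORE.put("m1#visibility", "PUBLIC");

        ModelObject b1 = resolve("b1");               // nothing is read from the store yet
        ModelObject m1 = b1.getReference("modifier"); // m1 is materialised only now
        System.out.println(m1.get("visibility"));
    }
}
```

Because `b1` stores the ID of its modifier rather than a Java reference to it, dropping the application's reference to `m1` leaves it reachable only through a `WeakReference`, so the JVM is free to collect it.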
MAP-BASED DATA MODEL
▌ The unique identifier allows flattening the
graph structure into a set of key-value
mappings
▌ Operations on hash-maps have a
constant cost
▌ Three different (hash-)maps are used to
store models’ information:
│ Property map: keeps all objects’ data in a
centralized place
│ Type map: tracks how objects interact with
the meta-level (e.g. instance of)
│ Containment map: defines the models’
structure in terms of containment references
MAP-BASED DATA MODEL
▌Property map
│ Key: OID + EStructuralFeature
│ Value: data
Key                              Value
{ ‘c1’, ‘name’ }                 ‘class1’
{ ‘c1’, ‘bodyDeclarations’ }     { ‘b1’, ‘b2’ }
MAP-BASED DATA MODEL
▌Type map
│ Key: OID
│ Value: nsURI + EObject’s EClass
Key      Value
‘c1’     〈 nsUri=‘http://java’, class=‘ClassDeclaration’ 〉
MAP-BASED DATA MODEL
▌Containment map
│ Key: OID
│ Value: Container’s OID + EStructuralFeature (from parent to child)
Key      Value
‘c1’     〈 container=‘p1’, featureName=‘ownedElements’ 〉
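The three maps and their example rows can be written down as a minimal Java sketch. It assumes plain `java.util.HashMap` instances and uses lists as composite keys and values; the class and field names are hypothetical and the real NeoEMF/Map storage types may differ.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ThreeMapsSketch {

    /** Property map: (OID, feature name) -> value; references are stored as OIDs. */
    static final Map<List<String>, Object> properties = new HashMap<>();

    /** Type map: OID -> (nsURI, EClass name). */
    static final Map<String, List<String>> types = new HashMap<>();

    /** Containment map: OID -> (container OID, containing feature name). */
    static final Map<String, List<String>> containers = new HashMap<>();

    /** Lists implement equals/hashCode by content, so they can serve as composite keys. */
    static List<String> key(String oid, String feature) {
        return Arrays.asList(oid, feature);
    }

    /** Loads the rows shown on the three slides for object 'c1'. */
    static void load() {
        properties.put(key("c1", "name"), "class1");
        properties.put(key("c1", "bodyDeclarations"), Arrays.asList("b1", "b2"));
        types.put("c1", Arrays.asList("http://java", "ClassDeclaration"));
        containers.put("c1", Arrays.asList("p1", "ownedElements"));
    }

    public static void main(String[] args) {
        load();
        System.out.println(properties.get(key("c1", "name")));             // attribute value
        System.out.println(properties.get(key("c1", "bodyDeclarations"))); // referenced OIDs
        System.out.println(types.get("c1"));                               // meta-level information
        System.out.println(containers.get("c1"));                          // position in the containment tree
    }
}
```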
MODEL OPERATIONS AS MAP OPERATIONS
                                   LOOKUPS        INSERTS
METHOD                           MIN.   MAX.    MIN.   MAX.
OPERATIONS ON OBJECTS
getType                            1      1       0      0
getContainer                       1      1       0      0
getContainerFeature                1      1       0      0
OPERATIONS ON PROPERTIES
get*                               1      1       0      0
set*                               0      3       1      3
isSet*                             1      1       0      0
unset*                             1      1       0      1
OPERATIONS ON MULTI-VALUED FEATURES
add                                1      3       1      3
remove                             1      2       1      2
clear                              0      0       1      1
size                               1      1       0      0
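A rough Java sketch shows how some rows of this table come about. It assumes that a multi-valued feature is stored as a single list value in the property map and covers only the simplest (minimum-cost) cases; the extra lookups and inserts needed to keep the containment map up to date are not reproduced. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MapOpsSketch {

    static final Map<List<String>, Object> properties = new HashMap<>();
    static int lookups = 0; // number of map reads performed so far
    static int inserts = 0; // number of map writes performed so far

    static Object lookup(String oid, String feature) {
        lookups++;
        return properties.get(Arrays.asList(oid, feature));
    }

    static void insert(String oid, String feature, Object value) {
        inserts++;
        properties.put(Arrays.asList(oid, feature), value);
    }

    /** get*: one lookup, no insert. */
    static Object get(String oid, String feature) {
        return lookup(oid, feature);
    }

    /** size: one lookup of the stored list, no insert. */
    @SuppressWarnings("unchecked")
    static int size(String oid, String feature) {
        List<String> values = (List<String>) lookup(oid, feature);
        return values == null ? 0 : values.size();
    }

    /** add (simplest case): read the list, then write back an extended copy. */
    @SuppressWarnings("unchecked")
    static void add(String oid, String feature, String targetOid) {
        List<String> old = (List<String>) lookup(oid, feature);
        List<String> updated = (old == null) ? new ArrayList<String>() : new ArrayList<String>(old);
        updated.add(targetOid);
        insert(oid, feature, updated);
    }

    /** clear: overwrite with an empty list, so no lookup is needed. */
    static void clear(String oid, String feature) {
        insert(oid, feature, Collections.emptyList());
    }

    public static void main(String[] args) {
        add("c1", "bodyDeclarations", "b1");
        add("c1", "bodyDeclarations", "b2");
        System.out.println(size("c1", "bodyDeclarations"));
        clear("c1", "bodyDeclarations");
        System.out.println("lookups=" + lookups + ", inserts=" + inserts);
    }
}
```

Under this storage assumption every `add` rewrites the whole list value, so its cost grows with the length of the list even though the number of map operations stays constant.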
EXPERIMENTAL EVALUATION
Conditions of the experiments
Results
Summary
EXPERIMENTAL EVALUATION
▌ Based on our joint experience with industrial
partners:
│ We obtained three models from OSS using
reverse engineering…
│ … that resemble models from real-world
scenarios
│ We defined a set of queries (GraBaTs’09 and
industry-like)
│ Only the standard EMF API is used → Queries
are backend-agnostic
│ Three heap sizes: 8GB, 512MB and 256MB
#  MODEL                          SIZE IN XMI   ELEMENTS
1  org.eclipse.gmt.modisco.java   19.3MB        80,665
2  org.eclipse.jdt.core           420.6MB       1,557,007
3  org.eclipse.jdt.*              984.7MB       3,609,454
EXPERIMENTAL EVALUATION
▌ Selected back-ends:
│ NEOEMF/MAP (MapDB)
│ NEOEMF/GRAPH (Neo4j embedded)
│ CDO (H2 embedded)
▌ Discarded back-ends:
│ MongoEMF → does not strictly comply with
the standard EMF behavior
│ EMF-fragments → requires manual
modifications in the source models or
metamodels
│ Morsa → only a small subset of the
experiments ran successfully
Configuration details: Intel Core i7 3740QM (2.70GHz), 16 GB of DDR3 SDRAM (800MHz), Samsung SM841
SATA3 SSD Hard Disk (6GB/s), Windows 7 Enterprise 64, JRE 1.7.0_40-b43, Eclipse 4.4.0, EMF 2.10.1,
NeoEMF/Map uses MapDB 0.9.10, NeoEMF/Graph uses Neo4j 1.9.2, CDO 4.3.1 runs on top of H2 1.3.168
EXPERIMENT I
Import model from XMI (8GB)
               Model 1   Model 2   Model 3
NeoEMF/Map         9 s     161 s     412 s
NeoEMF/Graph      41 s    1161 s    3767 s
CDO               12 s     120 s     301 s
EXPERIMENT II
Model traversal 8GB (incl. loading & unloading)
               Model 1   Model 2   Model 3
XMI                4 s      35 s      79 s
NeoEMF/Map         3 s      25 s      62 s
NeoEMF/Graph      16 s     201 s     708 s
CDO               14 s     133 s     309 s
EXPERIMENT II
Model traversal 512MB (incl. loading & unloading)
               Model 1   Model 2   Model 3
XMI                4 s         -         -
NeoEMF/Map         3 s      42 s     366 s
NeoEMF/Graph      15 s     235 s     763 s
CDO               13 s     550 s     548 s
EXPERIMENT III
Model queries that do not traverse the model 8GB
               Model 1   Model 2   Model 3
NeoEMF/Map         0 s       0 s       0 s
NeoEMF/Graph       0 s       2 s      19 s
CDO                0 s       0 s       2 s
EXPERIMENT IV
GraBaTs’09 8GB
               Model 1   Model 2   Model 3
NeoEMF/Map         1 s      24 s      61 s
NeoEMF/Graph      11 s     188 s     717 s
CDO                9 s      48 s     367 s
Unused Methods 8GB
               Model 1   Model 2   Model 3
NeoEMF/Map         2 s      36 s     101 s
NeoEMF/Graph      17 s     359 s    1328 s
CDO                9 s     131 s     294 s
EXPERIMENT V
Model modification and saving 8GB
               Model 1   Model 2   Model 3
NeoEMF/Map         1 s      24 s      62 s
NeoEMF/Graph      11 s     191 s     677 s
CDO                9 s     118 s     296 s
EXPERIMENT V
Model modification and saving 256MB
               Model 1   Model 2   Model 3
NeoEMF/Map         1 s     160 s     472 s
NeoEMF/Graph      11 s     224 s         -
CDO                9 s     723 s         -
SUMMARY
▌ NeoEMF/Map performs better than any other
solution when using the standard API
▌ NeoEMF/Map presents import times in the
same order of magnitude as CDO, but it is
about 33% slower for the largest model →
NeoEMF/Map is affected by the overhead
produced by modifications on big lists
(>100,000 elements) that grow monotonically
(caching is needed)
▌ The simple data model with low-cost
operations implemented by NeoEMF/Map
contrasts with the more complex data model
implemented by NeoEMF/Graph (consistently
slower by a factor between 7 and 9)
SUMMARY
▌ Traversal of a very large model is much
faster (up to 9×) with NeoEMF/Map
▌ If load and unload times are considered,
NeoEMF/Map also outperforms XMI
▌ The fast model-traversal ability of
NeoEMF/Map is exploited by the pattern
followed by most of the queries in the
modernization domain
▌ Queries that traverse the model to apply
and persist changes perform significantly
better on NeoEMF/Map (5× faster on big
models, 9× on small models).
CONCLUSIONS
Conclusions
Future work
CONCLUSIONS
▌ Map-based persistence layer to handle VLMs
▌ Comparison against relational-based and
graph-based alternatives
▌ EMF as the implementation technology
▌ We used queries from some of our industrial
partners in the model-driven modernization
domain as experiments
▌ Typical model-access APIs, built from fine-grained
methods that perform one-step navigations, do
not benefit from complex relational or graph-
based data structures.
▌ Low-level data structures, like hash-tables,
with low and constant access times provide
better results
FUTURE WORK
▌ Caching strategies:
│ Element unloading (which element is not
needed anymore?)
│ Element prefetching (which element will be
needed in the future?)
▌ Benefits of other backends depending on
the specific application scenario:
│ Graph-based persistence solutions when
some of our requirements can be dropped
│ Bypassing the model access API by
translating the queries to high performance
native graph-database queries may provide
great benefits
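
For the element-unloading question above, one common candidate policy is least-recently-used (LRU) eviction. The sketch below is only an illustration of that policy using the standard `java.util.LinkedHashMap`; the slides do not say which strategy will be adopted, and the class names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruUnloadSketch {

    /** Keeps at most `capacity` loaded elements and unloads the least recently used one. */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = order entries by access, not by insertion
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Called by LinkedHashMap after each insertion; returning true drops `eldest`.
            return size() > capacity;
        }
    }

    public static void main(String[] args) {
        LruCache<String, String> loaded = new LruCache<>(2);
        loaded.put("b1", "bodyDecl1");
        loaded.put("b2", "bodyDecl2");
        loaded.get("b1");               // b1 becomes the most recently used element
        loaded.put("m1", "modifier1");  // exceeds the capacity, so b2 is unloaded
        System.out.println(loaded.keySet());
    }
}
```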
MAP-BASED TRANSPARENT PERSISTENCE FOR VERY LARGE MODELS
Abel Gómez, Massimo Tisi, Gerson Sunyé and Jordi Cabot

Fase 2015 - Map-based Transparent Persistence for Very Large Models

  • 1. MAP-BASED TRANSPARENT PERSISTENCE FOR VERY LARGE MODELS lin Abel Gómez Massimo Tisi Gerson Sunyé and Jordi Cabot 1
  • 2. OUTLINE ▌The landscape in MDE ▌Motivation: running example and current persistence approaches ▌Towards a simple EMF-based persistence layer ▌NEOEMF/MAP: A transparent persistence layer for EMF models ▌Our experimental evaluation in a nutshell ▌Conclusions and future work © ATLANMOD - atlanmod-contact@mines-nantes.fr 2
  • 4. THE LANDSCAPE IN MDE ▌ Models and code generation are the center of the software-engineering processes ▌ Modeling tools are built around modeling frameworks (EMF has become the de facto standard) ▌ The technologies at the core of modeling frameworks were designed to support simple modeling activities ▌ Since its publication, the XMI standard has been the preferred format for storing and sharing models and metamodels ▌ Clear limits arise when current technologies are applied to VLMs: XML is not the right technology for VLMs (verbosity, costly serialization/deserialization…) ▌ Some solutions exist, but problems in managing memory and persisting data are still under-studied in MDE © ATLANMOD - atlanmod-contact@mines-nantes.fr 4
  • 6. RUNING EXAMPLE © ATLANMOD - atlanmod-contact@mines-nantes.fr Java Metamodel (excerpt) nsURI: ’http://java’ 6
  • 7. RUNING EXAMPLE © ATLANMOD - atlanmod-contact@mines-nantes.fr Java Metamodel (excerpt) nsURI: ’http://java’ Instance 7
  • 8. MOTIVATION ▌ Within a modeling ecosystem, all tools that need to access or manipulate models have to pass through a single model management interface ▌ In some of these ecosystems (e.g. EMF) the model management interface is automatically generated from the metamodel © ATLANMOD - atlanmod-contact@mines-nantes.fr 8
  • 9. THE GENERATED MODEL MANAGEMENT INTERFACE ▌ // Creation of objects ▌ Package p1 := Factory.createPackage(); ▌ ClassDeclaration c1 := Factory.createClassDeclaration(); ▌ BodyDeclaration b1 := Factory.createBodyDeclaration(); ▌ BodyDeclaration b2 := Factory.createBodyDeclaration(); ▌ Modifier m1 := Factory.createModifier(); ▌ Modifier m2 := Factory.createModifier(); ▌ // Initialization of attributes ▌ p1.setName("package1"); ▌ c1.setName("class1"); ▌ b1.setName("bodyDecl1"); ▌ b2.setName("bodyDecl2"); ▌ m1.setVisibility(VisibilityKind.PUBLIC); ▌ m2.setVisibility(VisibilityKind.PUBLIC); ▌ // Initialization of references ▌ p1.getOwnedElements().add(c1); ▌ c1.getBodyDeclarations().add(b1); ▌ c1.getBodyDeclarations().add(b2); ▌ b1.setModifier(m1); ▌ b2.setModifier(m2) © ATLANMOD - atlanmod-contact@mines-nantes.fr 9
  • 10. MOTIVATION ▌ Without any specific memory-management solution, the model would need to be fully contained in memory for any access or modification ▌ Models that exceed the main memory would cause a significant performance drop or the application crash © ATLANMOD - atlanmod-contact@mines-nantes.fr 10
  • 11. STANDARD TECHNOLOGIES FOR PERSISTING MODELS IN EMF ▌XML-based (XMI) │ Pros: Readability, fast for small models │ Cons: Needs to load/keep the whole model in memory. ▌Connected Data Objects (CDO) │ Pros: on-demand loading, transactions, versioning, notifications │ Cons: Only the relational mapping is regularly maintained, does not scale well with VLMs © ATLANMOD - atlanmod-contact@mines-nantes.fr 11
  • 12. NEW TRENDS IN PERSISTING MODELS IN EMF ▌ Morsa (document-oriented) │ On-demand loading, incremental updates, fully compatible with the EMF API │ Requires its own query language to get good performance ▌ MongoEMF (document-oriented) │ Uses the standard EMF API │ It behaves different than the standard back-ends ▌ EMF fragments │ Uses the standard proxy mechanism to partition models in small chunks │ Requires modifications on the metamodels to get the benefits of partitions ▌ NeoEMF/Graph, a.k.a. Neo4EMF (graph-based) │ Models are a set of highly interconnected elements → graphs are the most natural way to represent them │ The generated API only performs one-step navigations → only a significant gain in performance is obtained when using native queries on the underlying persistence back-end © ATLANMOD - atlanmod-contact@mines-nantes.fr 12
  • 13. MOTIVATION ▌ We need a transparent persistence layer able to automatically persist, load and unload model elements with no changes to the application code © ATLANMOD - atlanmod-contact@mines-nantes.fr 13
  • 15. MODEL-PERSISTENCE LAYER ▌NEOEMF/MAP must… … be an exact replacement … use a replaceable underlying engine … allow different types of caching … be memory friendly … provide on-demand load capabilities … free unused memory … outperform current persistence layers using the standard API Interoperability requirements Performance requirements © ATLANMOD - atlanmod-contact@mines-nantes.fr 15
  • 16. MODEL-PERSISTENCE LAYER © ATLANMOD - atlanmod-contact@mines-nantes.fr Model Manager Persistence Manager Persistence Backend NeoEMF /Map EMF /Graph CDO XMI Serialization Model-based Tools XMI File GraphDB MapDB Caching Strategy RelationalDB Model Access API Persistence API Backend API Client Code 16
  • 17. NEOEMF/MAP A TRANSPARENT PERSISTENCE LAYER FOR EMF MODELS Memory Management Map-based data model Model operations as map operations 17
  • 18. MEMORY MANAGEMENT ▌ Decoupling dependencies among objects by assigning a unique identifier to all model objects allows: ▌ Lightweight on-demand loading │ Each live model object has a lightweight delegate object that is in charge of on- demand loading the element data and keeping track of the element’s state ▌ Efficient garbage collection in the JRE │ No hard Java references are kept among model objects. Any model object not directly referenced by the application will be deallocated © ATLANMOD - atlanmod-contact@mines-nantes.fr 18
  • 19. MAP-BASED DATA MODEL ▌ The unique identifier allows flattening the graph structure into a set of key-value mappings ▌ Operations on hash-maps have a constant cost ▌ Three different (hash-)maps are used to store models’ information: │ Property map: keeps all objects’ data in a centralized place │ Type map: tracks how objects interact with the meta-level (e.g. instance of) │ Containment map: defines the models’ structure in terms of containment references © ATLANMOD - atlanmod-contact@mines-nantes.fr 19
  • 20. MAP-BASED DATA MODEL ▌ Property map │ Key: OID + EStructuralFeature │ Value: data
      KEY                             VALUE
      { ‘c1’, ‘name’ }                ‘class1’
      { ‘c1’, ‘bodyDeclarations’ }    { ‘b1’, ‘b2’ }
  • 21. MAP-BASED DATA MODEL ▌ Type map │ Key: OID │ Value: nsURI + EObject’s EClass
      KEY     VALUE
      ‘c1’    〈 nsUri=‘http://java’, class=‘ClassDeclaration’ 〉
  • 22. MAP-BASED DATA MODEL ▌ Containment map │ Key: OID │ Value: Container’s OID + EStructuralFeature (from parent to child)
      KEY     VALUE
      ‘c1’    〈 container=‘p1’, featureName=‘ownedElements’ 〉
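The three maps from slides 19 to 22 can be written down as a short Java sketch populated with the running example. Class and field names (`MapModelStore`, `TypeInfo`, `ContainerInfo`) are hypothetical, and plain `HashMap`s stand in for the persistent maps of the real backend.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Value of the type map: metamodel nsURI plus the name of the EClass.
class TypeInfo {
    final String nsUri;
    final String className;

    TypeInfo(String nsUri, String className) {
        this.nsUri = nsUri;
        this.className = className;
    }
}

// Value of the containment map: container OID plus the containing feature.
class ContainerInfo {
    final String containerId;
    final String featureName;

    ContainerInfo(String containerId, String featureName) {
        this.containerId = containerId;
        this.featureName = featureName;
    }
}

class MapModelStore {
    // Property map: (OID, feature name) -> data
    final Map<SimpleImmutableEntry<String, String>, Object> properties = new HashMap<>();
    // Type map: OID -> type information
    final Map<String, TypeInfo> types = new HashMap<>();
    // Containment map: OID -> container information
    final Map<String, ContainerInfo> containers = new HashMap<>();

    // Composite key; SimpleImmutableEntry already defines equals and hashCode.
    static SimpleImmutableEntry<String, String> key(String oid, String feature) {
        return new SimpleImmutableEntry<>(oid, feature);
    }

    // Entries taken from the example rows shown on the slides.
    static MapModelStore runningExample() {
        MapModelStore store = new MapModelStore();
        store.properties.put(key("c1", "name"), "class1");
        store.properties.put(key("c1", "bodyDeclarations"), Arrays.asList("b1", "b2"));
        store.types.put("c1", new TypeInfo("http://java", "ClassDeclaration"));
        store.containers.put("c1", new ContainerInfo("p1", "ownedElements"));
        return store;
    }
}
```

Note that the multi-valued feature `bodyDeclarations` stores only the identifiers of the referenced objects, which is what allows the object graph to be flattened into key-value pairs.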
  • 23. MODEL OPERATIONS AS MAP OPERATIONS
                                LOOKUPS         INSERTS
      METHOD                    MIN.    MAX.    MIN.    MAX.
      OPERATIONS ON OBJECTS
      getType                   1       1       0       0
      getContainer              1       1       0       0
      getContainerFeature       1       1       0       0
      OPERATIONS ON PROPERTIES
      get*                      1       1       0       0
      set*                      0       3       1       3
      isSet*                    1       1       0       0
      unset*                    1       1       0       1
      OPERATIONS ON MULTI-VALUED FEATURES
      add                       1       3       1       3
      remove                    1       2       1       2
      clear                     0       0       1       1
      size                      1       1       0       0
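The lookup and insert counts in this table can be made concrete with a small instrumented store. The sketch below is an assumption-laden illustration, not the NeoEMF code: `CountedStore` and its methods are hypothetical, and it covers only the simplest cases (one lookup for each read operation, one insert for a plain attribute `set`). The higher maxima in the table, such as the three lookups and inserts of `set*`, are not modelled.

```java
import java.util.HashMap;
import java.util.Map;

class CountedStore {
    int lookups = 0;
    int inserts = 0;

    private final Map<String, Object> properties = new HashMap<>();
    private final Map<String, String> types = new HashMap<>();
    private final Map<String, String[]> containers = new HashMap<>();

    // Every read of any of the three maps goes through here.
    private <K, V> V lookup(Map<K, V> map, K key) {
        lookups++;
        return map.get(key);
    }

    // Every write of any of the three maps goes through here.
    private <K, V> void insert(Map<K, V> map, K key, V value) {
        inserts++;
        map.put(key, value);
    }

    private static String key(String oid, String feature) {
        return oid + "#" + feature;
    }

    // Test fixture: fills the type and containment maps without counting.
    void register(String oid, String type, String containerOid, String containerFeature) {
        types.put(oid, type);
        containers.put(oid, new String[] { containerOid, containerFeature });
    }

    // Operations on objects: one lookup each.
    String getType(String oid) { return lookup(types, oid); }

    String getContainer(String oid) {
        String[] entry = lookup(containers, oid);
        return entry == null ? null : entry[0];
    }

    String getContainerFeature(String oid) {
        String[] entry = lookup(containers, oid);
        return entry == null ? null : entry[1];
    }

    // Operations on properties: get and isSet are one lookup, a plain set is one insert.
    Object get(String oid, String feature) { return lookup(properties, key(oid, feature)); }

    boolean isSet(String oid, String feature) { return lookup(properties, key(oid, feature)) != null; }

    void set(String oid, String feature, Object value) { insert(properties, key(oid, feature), value); }
}
```

Reading `lookups` and `inserts` before and after a call shows the per-operation cost, which stays constant regardless of how many objects the store holds.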
  • 25. EXPERIMENTAL EVALUATION ▌ Based on our joint experience with industrial partners: │ We obtained three models from OSS using reverse engineering… │ … that resemble models from real-world scenarios │ We defined a set of queries (GraBaTs’09 and industry-like) │ Only the standard EMF API is used → Queries are backend-agnostic │ Three heap sizes: 8 GB, 512 MB and 256 MB
      #    MODEL                           SIZE IN XMI    ELEMENTS
      1    org.eclipse.gmt.modisco.java    19.3 MB        80,665
      2    org.eclipse.jdt.core            420.6 MB       1,557,007
      3    org.eclipse.jdt.*               984.7 MB       3,609,454
  • 26. EXPERIMENTAL EVALUATION ▌ Selected back-ends: │ NEOEMF/MAP (MapDB) │ NEOEMF/GRAPH (Neo4j embedded) │ CDO (H2 embedded) ▌ Discarded back-ends: │ MongoEMF → does not strictly comply with the standard EMF behavior │ EMF-fragments → requires manual modifications in the source models or metamodels │ Morsa → only a small subset of the experiments ran successfully ▌ Configuration details: Intel Core i7 3740QM (2.70 GHz), 16 GB of DDR3 SDRAM (800 MHz), Samsung SM841 SATA3 SSD (6 Gb/s), Windows 7 Enterprise 64-bit, JRE 1.7.0_40-b43, Eclipse 4.4.0, EMF 2.10.1; NeoEMF/Map uses MapDB 0.9.10, NeoEMF/Graph uses Neo4j 1.9.2, CDO 4.3.1 runs on top of H2 1.3.168
  • 27. EXPERIMENT I
      Import model from XMI (8 GB)
      BACKEND         MODEL 1    MODEL 2    MODEL 3
      NeoEMF/Map      9 s        161 s      412 s
      NeoEMF/Graph    41 s       1161 s     3767 s
      CDO             12 s       120 s      301 s
  • 28. EXPERIMENT II
      Model traversal, 8 GB (incl. loading & unloading)
      BACKEND         MODEL 1    MODEL 2    MODEL 3
      XMI             4 s        35 s       79 s
      NeoEMF/Map      3 s        25 s       62 s
      NeoEMF/Graph    16 s       201 s      708 s
      CDO             14 s       133 s      309 s
  • 29. EXPERIMENT II
      Model traversal, 512 MB (incl. loading & unloading)
      Series: XMI, NeoEMF/Map, NeoEMF/Graph, CDO; Models 1 to 3
      Data labels in chart order: 4 s, 3 s, 42 s, 366 s, 15 s, 235 s, 763 s, 13 s, 550 s, 548 s
  • 30. EXPERIMENT III
      Model queries that do not traverse the model (8 GB)
      BACKEND         MODEL 1    MODEL 2    MODEL 3
      NeoEMF/Map      0 s        0 s        0 s
      NeoEMF/Graph    0 s        2 s        19 s
      CDO             0 s        0 s        2 s
  • 31. EXPERIMENT IV
      GraBaTs’09 (8 GB)
      BACKEND         MODEL 1    MODEL 2    MODEL 3
      NeoEMF/Map      1 s        24 s       61 s
      NeoEMF/Graph    11 s       188 s      717 s
      CDO             9 s        48 s       367 s
      Unused Methods (8 GB)
      BACKEND         MODEL 1    MODEL 2    MODEL 3
      NeoEMF/Map      2 s        36 s       101 s
      NeoEMF/Graph    17 s       359 s      1328 s
      CDO             9 s        131 s      294 s
  • 32. EXPERIMENT V
      Model modification and saving (8 GB)
      BACKEND         MODEL 1    MODEL 2    MODEL 3
      NeoEMF/Map      1 s        24 s       62 s
      NeoEMF/Graph    11 s       191 s      677 s
      CDO             9 s        118 s      296 s
  • 33. EXPERIMENT V
      Model modification and saving (256 MB)
      Series: NeoEMF/Map, NeoEMF/Graph, CDO; Models 1 to 3
      Data labels in chart order: 1 s, 160 s, 472 s, 11 s, 224 s, 9 s, 723 s
  • 34. SUMMARY ▌ NeoEMF/Map performs better than any other solution when using the standard API ▌ NeoEMF/Map presents import times in the same order of magnitude as CDO, but it is about 33% slower for the largest model → NeoEMF/Map is affected by the overhead produced by modifications on big lists (>100,000 elements) that grow monotonically (caching is needed) ▌ The simple data model with low-cost operations implemented by NeoEMF/Map contrasts with the more complex data model implemented by NeoEMF/Graph (consistently slower by a factor between 7 and 9)
  • 35. SUMMARY ▌ Traversal of a very large model is much faster (up to 9×) with NeoEMF/Map ▌ If load and unload times are considered, NeoEMF/Map also outperforms XMI ▌ The fast model-traversal ability of NeoEMF/Map is exploited by the pattern followed by most of the queries in the modernization domain ▌ Queries that traverse the model to apply and persist changes perform significantly better on NeoEMF/Map (5× faster on big models, 9× on small models)
  • 36. CONCLUSIONS ▌ Conclusions ▌ Future work
  • 37. CONCLUSIONS ▌ Map-based persistence layer to handle VLMs ▌ Comparison against relational-based and graph-based alternatives ▌ EMF as the implementation technology ▌ We used queries from some of our industrial partners in the model-driven modernization domain as experiments ▌ Typical model-access APIs, with fine-grained methods with one-step-navigation queries, do not benefit from complex relational or graph-based data structures ▌ Low-level data structures, like hash-tables, with low and constant access times provide better results
  • 38. FUTURE WORK ▌ Caching strategies: │ Element unloading (which elements are no longer needed?) │ Element prefetching (which elements will be needed in the future?) ▌ Benefits of other backends depending on the specific application scenario: │ Graph-based persistence solutions when some of our requirements can be dropped │ Bypassing the model access API by translating the queries to high-performance native graph-database queries may provide great benefits
  • 39. MAP-BASED TRANSPARENT PERSISTENCE FOR VERY LARGE MODELS Abel Gómez, Massimo Tisi, Gerson Sunyé and Jordi Cabot