FASE 2015 - Map-based Transparent Persistence for Very Large Models

The progressive industrial adoption of Model-Driven Engineering (MDE) is fostering the development of large tool ecosystems like the Eclipse Modeling project. These tools are built on top of a set of base technologies that have been primarily designed for small-scale scenarios, where models are manually developed. In particular, efficient runtime manipulation for large-scale models is an under-studied problem and this is hampering the application of MDE to several industrial scenarios.

In this paper we introduce and evaluate a map-based persistence model for MDE tools. We use this model to build a transparent persistence layer for modeling tools, on top of a map-based database engine. The layer can be plugged into the Eclipse Modeling Framework, and it achieves lower execution times and memory consumption than other existing approaches. Empirical tests are performed on a typical industrial scenario, model-driven reverse engineering, where very large software models originate from the analysis of massive code bases. The layer is freely distributed and can be used immediately to enhance the scalability of any existing Eclipse Modeling tool.

http://www.emn.fr/z-info/atlanmod/index.php/NeoEMF/Map

Published in: Science

  1. MAP-BASED TRANSPARENT PERSISTENCE FOR VERY LARGE MODELS
     Abel Gómez, Massimo Tisi, Gerson Sunyé and Jordi Cabot
  2. OUTLINE
     ▌ The landscape in MDE
     ▌ Motivation: running example and current persistence approaches
     ▌ Towards a simple EMF-based persistence layer
     ▌ NEOEMF/MAP: A transparent persistence layer for EMF models
     ▌ Our experimental evaluation in a nutshell
     ▌ Conclusions and future work
     © ATLANMOD - atlanmod-contact@mines-nantes.fr
  3. INTRODUCTION
     Why another persistence solution?
  4. THE LANDSCAPE IN MDE
     ▌ Models and code generation are at the center of software-engineering processes
     ▌ Modeling tools are built around modeling frameworks (EMF has become the de facto standard)
     ▌ The technologies at the core of modeling frameworks were designed to support simple modeling activities
     ▌ Since its publication, the XMI standard has been the preferred format for storing and sharing models and metamodels
     ▌ Clear limits arise when current technologies are applied to very large models (VLMs): XML is not the right technology for VLMs (verbosity, costly serialization/deserialization…)
     ▌ Some solutions exist, but problems in managing memory and persisting data are still under-studied in MDE
  5. MOTIVATION
     Running example
     Current persistence approaches
  6. RUNNING EXAMPLE
     [Figure: Java metamodel (excerpt), nsURI: ’http://java’]
  7. RUNNING EXAMPLE
     [Figure: Java metamodel (excerpt), nsURI: ’http://java’, shown together with an instance]
  8. MOTIVATION
     ▌ Within a modeling ecosystem, all tools that need to access or manipulate models have to pass through a single model management interface
     ▌ In some of these ecosystems (e.g. EMF) the model management interface is automatically generated from the metamodel
  9. THE GENERATED MODEL MANAGEMENT INTERFACE
     // Creation of objects through the generated factory
     Package p1 = Factory.createPackage();
     ClassDeclaration c1 = Factory.createClassDeclaration();
     BodyDeclaration b1 = Factory.createBodyDeclaration();
     BodyDeclaration b2 = Factory.createBodyDeclaration();
     Modifier m1 = Factory.createModifier();
     Modifier m2 = Factory.createModifier();
     // Initialization of attributes
     p1.setName("package1");
     c1.setName("class1");
     b1.setName("bodyDecl1");
     b2.setName("bodyDecl2");
     m1.setVisibility(VisibilityKind.PUBLIC);
     m2.setVisibility(VisibilityKind.PUBLIC);
     // Initialization of references (containment and modifiers)
     p1.getOwnedElements().add(c1);
     c1.getBodyDeclarations().add(b1);
     c1.getBodyDeclarations().add(b2);
     b1.setModifier(m1);
     b2.setModifier(m2);
  10. MOTIVATION
     ▌ Without any specific memory-management solution, the model would need to be fully contained in memory for any access or modification
     ▌ Models that exceed the main memory cause a significant performance drop or crash the application
  11. STANDARD TECHNOLOGIES FOR PERSISTING MODELS IN EMF
     ▌ XML-based (XMI)
     │ Pros: Readability, fast for small models
     │ Cons: Needs to load/keep the whole model in memory
     ▌ Connected Data Objects (CDO)
     │ Pros: On-demand loading, transactions, versioning, notifications
     │ Cons: Only the relational mapping is regularly maintained; does not scale well with VLMs
  12. NEW TRENDS IN PERSISTING MODELS IN EMF
     ▌ Morsa (document-oriented)
     │ On-demand loading, incremental updates, fully compatible with the EMF API
     │ Requires its own query language to get good performance
     ▌ MongoEMF (document-oriented)
     │ Uses the standard EMF API
     │ Behaves differently from the standard back-ends
     ▌ EMF fragments
     │ Uses the standard proxy mechanism to partition models into small chunks
     │ Requires modifications to the metamodels to get the benefits of partitions
     ▌ NeoEMF/Graph, a.k.a. Neo4EMF (graph-based)
     │ Models are a set of highly interconnected elements → graphs are the most natural way to represent them
     │ The generated API only performs one-step navigations → a significant performance gain is obtained only when using native queries on the underlying persistence back-end
  13. MOTIVATION
     ▌ We need a transparent persistence layer able to automatically persist, load and unload model elements with no changes to the application code
  14. NEOEMF/MAP DESIGN GOALS
     Towards a simple EMF-based persistence layer
  15. MODEL-PERSISTENCE LAYER
     ▌ NEOEMF/MAP must…
     │ … be an exact replacement
     │ … use a replaceable underlying engine
     │ … allow different types of caching
     │ … be memory friendly
     │ … provide on-demand load capabilities
     │ … free unused memory
     │ … outperform current persistence layers using the standard API
     ▌ These goals cover interoperability requirements and performance requirements
  16. MODEL-PERSISTENCE LAYER
     [Architecture diagram, top to bottom]
     │ Client Code: Model-based Tools
     │ Model Access API → Model Manager: EMF
     │ Persistence API → Persistence Manager: XMI Serialization, CDO, NeoEMF/Graph, NeoEMF/Map (with a Caching Strategy)
     │ Backend API → Persistence Backend: XMI File, RelationalDB, GraphDB, MapDB
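The layering on this slide, a persistence manager that reaches a replaceable back-end only through a narrow backend API, can be sketched roughly as below. This is an illustration under stated assumptions, not NeoEMF's actual API: `KeyValueBackend`, `InMemoryBackend` and `PersistenceManagerSketch` are hypothetical names, and a plain `HashMap` stands in for MapDB.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical backend API: the only thing the persistence manager depends on.
interface KeyValueBackend {
    void put(String key, Object value);
    Object get(String key);
}

// Stand-in engine for this sketch; a real deployment would wrap an on-disk map instead.
final class InMemoryBackend implements KeyValueBackend {
    private final Map<String, Object> data = new HashMap<>();
    @Override public void put(String key, Object value) { data.put(key, value); }
    @Override public Object get(String key) { return data.get(key); }
}

// Hypothetical persistence manager: translates feature access into backend calls.
final class PersistenceManagerSketch {
    private final KeyValueBackend backend;

    PersistenceManagerSketch(KeyValueBackend backend) { this.backend = backend; }

    void setFeature(String oid, String feature, Object value) {
        backend.put(oid + "#" + feature, value);
    }

    Object getFeature(String oid, String feature) {
        return backend.get(oid + "#" + feature);
    }
}
```

Because the manager only sees the interface, swapping the engine means passing a different `KeyValueBackend` to the constructor; client code above the manager is unaffected.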
  17. NEOEMF/MAP: A TRANSPARENT PERSISTENCE LAYER FOR EMF MODELS
     Memory management
     Map-based data model
     Model operations as map operations
  18. MEMORY MANAGEMENT
     ▌ Decoupling dependencies among objects by assigning a unique identifier to all model objects allows:
     ▌ Lightweight on-demand loading
     │ Each live model object has a lightweight delegate object that is in charge of loading the element data on demand and keeping track of the element’s state
     ▌ Efficient garbage collection in the JRE
     │ No hard Java references are kept among model objects. Any model object not directly referenced by the application will be deallocated
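A rough sketch of the delegate idea described on this slide: each object holds only its identifier and a handle to the store, and resolves data and references on demand. The names `ElementStore` and `LazyElement` are hypothetical, and the `lookups` counter exists only to make the on-demand behaviour observable; the slides do not show the real implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical store mapping "oid#feature" to a value; references are stored as OIDs.
final class ElementStore {
    private final Map<String, Object> values = new HashMap<>();
    int lookups = 0; // counts reads, to make on-demand loading visible

    void put(String oid, String feature, Object value) {
        values.put(oid + "#" + feature, value);
    }

    Object get(String oid, String feature) {
        lookups++;
        return values.get(oid + "#" + feature);
    }
}

// Lightweight delegate: holds only its OID and the store, never another model object.
final class LazyElement {
    private final String oid;
    private final ElementStore store;

    LazyElement(String oid, ElementStore store) {
        this.oid = oid;
        this.store = store;
    }

    String oid() { return oid; }

    // Attribute data is fetched only when asked for.
    Object attribute(String feature) { return store.get(oid, feature); }

    // A reference is resolved to a fresh delegate on each call, so no hard Java
    // reference links this object to its target between calls.
    LazyElement reference(String feature) {
        Object targetOid = store.get(oid, feature);
        return targetOid == null ? null : new LazyElement((String) targetOid, store);
    }
}
```

Since a `LazyElement` never stores a pointer to another `LazyElement`, any delegate the application stops referencing becomes eligible for garbage collection regardless of how the model graph is connected.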
  19. MAP-BASED DATA MODEL
     ▌ The unique identifier allows flattening the graph structure into a set of key-value mappings
     ▌ Operations on hash-maps have a constant cost
     ▌ Three different (hash-)maps are used to store models’ information:
     │ Property map: keeps all objects’ data in a centralized place
     │ Type map: tracks how objects interact with the meta-level (e.g. instance of)
     │ Containment map: defines the models’ structure in terms of containment references
  20. MAP-BASED DATA MODEL
     ▌ Property map
     │ Key: OID + EStructuralFeature
     │ Value: data
     Key                              Value
     { ‘c1’, ‘name’ }                 ‘class1’
     { ‘c1’, ‘bodyDeclarations’ }     { ‘b1’, ‘b2’ }
  21. MAP-BASED DATA MODEL
     ▌ Type map
     │ Key: OID
     │ Value: nsURI + EObject’s EClass
     Key      Value
     ‘c1’     〈 nsUri=‘http://java’, class=‘ClassDeclaration’ 〉
  22. MAP-BASED DATA MODEL
     ▌ Containment map
     │ Key: OID
     │ Value: Container’s OID + EStructuralFeature (from parent to child)
     Key      Value
     ‘c1’     〈 container=‘p1’, featureName=‘ownedElements’ 〉
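The three maps from the preceding slides can be mocked up with ordinary Java collections. This is only a sketch: `MapModelSketch` and its field names are hypothetical, plain `HashMap`s stand in for the database engine, and composite keys and values are modelled as `List<String>` for brevity. The entries are the ones shown on the slides for object ‘c1’.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder for the three maps; plain HashMaps stand in for the database engine.
final class MapModelSketch {
    // Property map: (OID, feature name) -> data
    final Map<List<String>, Object> properties = new HashMap<>();
    // Type map: OID -> (nsURI, EClass name)
    final Map<String, List<String>> types = new HashMap<>();
    // Containment map: OID -> (container OID, containing feature name)
    final Map<String, List<String>> containment = new HashMap<>();

    // Lists have value-based equals/hashCode, so they work as composite keys.
    static List<String> key(String oid, String feature) {
        return Arrays.asList(oid, feature);
    }

    // Populates the entries shown on the three slides for object 'c1'.
    static MapModelSketch runningExample() {
        MapModelSketch m = new MapModelSketch();
        m.properties.put(key("c1", "name"), "class1");
        m.properties.put(key("c1", "bodyDeclarations"), Arrays.asList("b1", "b2"));
        m.types.put("c1", Arrays.asList("http://java", "ClassDeclaration"));
        m.containment.put("c1", Arrays.asList("p1", "ownedElements"));
        return m;
    }
}
```

Note that the multi-valued reference `bodyDeclarations` stores OIDs, not objects, which is what lets the graph be flattened into independent key-value entries.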
  23. MODEL OPERATIONS AS MAP OPERATIONS
                                 LOOKUPS        INSERTS
     METHOD                    MIN.   MAX.    MIN.   MAX.
     OPERATIONS ON OBJECTS
     getType                    1      1       0      0
     getContainer               1      1       0      0
     getContainerFeature        1      1       0      0
     OPERATIONS ON PROPERTIES
     get*                       1      1       0      0
     set*                       0      3       1      3
     isSet*                     1      1       0      0
     unset*                     1      1       0      1
     OPERATIONS ON MULTI-VALUED FEATURES
     add                        1      3       1      3
     remove                     1      2       1      2
     clear                      0      0       1      1
     size                       1      1       0      0
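To make the lookup and insert counts in the table concrete, the sketch below implements `size`, `clear` and the simplest case of `add` against a map wrapper that counts accesses. `CountingMap` and `FeatureOpsSketch` are hypothetical names. Only the minimum-cost paths are shown; how the real layer reaches the maximum counts (for example when containment has to be updated) is not described on the slide and is not modelled here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical map wrapper that counts lookups and inserts.
final class CountingMap {
    private final Map<String, Object> data = new HashMap<>();
    int lookups = 0;
    int inserts = 0;

    Object get(String key) { lookups++; return data.get(key); }
    void put(String key, Object value) { inserts++; data.put(key, value); }
}

// Hypothetical multi-valued feature operations written against the counting map.
final class FeatureOpsSketch {
    private final CountingMap properties;

    FeatureOpsSketch(CountingMap properties) { this.properties = properties; }

    private static String key(String oid, String feature) { return oid + "#" + feature; }

    // size: one lookup, no insert.
    int size(String oid, String feature) {
        Object stored = properties.get(key(oid, feature));
        return stored == null ? 0 : ((List<?>) stored).size();
    }

    // add (simplest case, non-containment feature): one lookup to read the
    // current list, one insert to write the extended list back.
    void add(String oid, String feature, String valueOid) {
        Object stored = properties.get(key(oid, feature));
        List<String> updated = new ArrayList<>();
        if (stored != null) {
            for (Object o : (List<?>) stored) updated.add((String) o);
        }
        updated.add(valueOid);
        properties.put(key(oid, feature), updated);
    }

    // clear: no lookup, one insert that overwrites the entry with an empty list.
    void clear(String oid, String feature) {
        properties.put(key(oid, feature), new ArrayList<String>());
    }
}
```

In this sketch every `add` rewrites the whole list value, so the number of map operations stays constant while the amount of data written grows with the list.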
  24. EXPERIMENTAL EVALUATION
     Conditions of the experiments
     Results
     Summary
  25. EXPERIMENTAL EVALUATION
     ▌ Based on our joint experience with industrial partners:
     │ We obtained three models from OSS using reverse engineering…
     │ … that resemble models from real-world scenarios
     │ We defined a set of queries (GraBaTs’09 and industry-like)
     │ Only the standard EMF API is used → Queries are backend-agnostic
     │ Three heap sizes: 8GB, 512MB and 256MB
     #  MODEL                          SIZE IN XMI   ELEMENTS
     1  org.eclipse.gmt.modisco.java   19.3MB        80,665
     2  org.eclipse.jdt.core           420.6MB       1,557,007
     3  org.eclipse.jdt.*              984.7MB       3,609,454
  26. EXPERIMENTAL EVALUATION
     ▌ Selected back-ends:
     │ NEOEMF/MAP (MapDB)
     │ NEOEMF/GRAPH (Neo4j embedded)
     │ CDO (H2 embedded)
     ▌ Discarded back-ends:
     │ MongoEMF → does not strictly comply with the standard EMF behavior
     │ EMF-fragments → requires manual modifications in the source models or metamodels
     │ Morsa → only a small subset of the experiments ran successfully
     Configuration details: Intel Core i7 3740QM (2.70GHz), 16 GB of DDR3 SDRAM (800MHz), Samsung SM841 SATA3 SSD (6 Gb/s), Windows 7 Enterprise 64, JRE 1.7.0_40-b43, Eclipse 4.4.0, EMF 2.10.1. NeoEMF/Map uses MapDB 0.9.10, NeoEMF/Graph uses Neo4j 1.9.2, CDO 4.3.1 runs on top of H2 1.3.168
  27. EXPERIMENT I
     Import model from XMI (8GB)
                      Model 1   Model 2   Model 3
     NeoEMF/Map           9 s     161 s     412 s
     NeoEMF/Graph        41 s    1161 s    3767 s
     CDO                 12 s     120 s     301 s
  28. EXPERIMENT II
     Model traversal 8GB (incl. loading & unloading)
                      Model 1   Model 2   Model 3
     XMI                  4 s      35 s      79 s
     NeoEMF/Map           3 s      25 s      62 s
     NeoEMF/Graph        16 s     201 s     708 s
     CDO                 14 s     133 s     309 s
  29. EXPERIMENT II
     Model traversal 512MB (incl. loading & unloading)
                      Model 1   Model 2   Model 3
     XMI                  4 s         -         -
     NeoEMF/Map           3 s      42 s     366 s
     NeoEMF/Graph        15 s     235 s     763 s
     CDO                 13 s     550 s     548 s
     (The chart shows no XMI value for Models 2 and 3.)
  30. EXPERIMENT III
     Model queries that do not traverse the model (8GB)
                      Model 1   Model 2   Model 3
     NeoEMF/Map           0 s       0 s       0 s
     NeoEMF/Graph         0 s       2 s      19 s
     CDO                  0 s       0 s       2 s
  31. EXPERIMENT IV
     GraBaTs’09 (8GB)
                      Model 1   Model 2   Model 3
     NeoEMF/Map           1 s      24 s      61 s
     NeoEMF/Graph        11 s     188 s     717 s
     CDO                  9 s      48 s     367 s
     Unused Methods (8GB)
                      Model 1   Model 2   Model 3
     NeoEMF/Map           2 s      36 s     101 s
     NeoEMF/Graph        17 s     359 s    1328 s
     CDO                  9 s     131 s     294 s
  32. EXPERIMENT V
     Model modification and saving (8GB)
                      Model 1   Model 2   Model 3
     NeoEMF/Map           1 s      24 s      62 s
     NeoEMF/Graph        11 s     191 s     677 s
     CDO                  9 s     118 s     296 s
  33. EXPERIMENT V
     Model modification and saving (256MB)
                      Model 1   Model 2   Model 3
     NeoEMF/Map           1 s     160 s     472 s
     NeoEMF/Graph        11 s     224 s         -
     CDO                  9 s     723 s         -
     (The chart shows no NeoEMF/Graph or CDO value for Model 3.)
  34. SUMMARY
     ▌ NeoEMF/Map performs better than any other solution when using the standard API
     ▌ NeoEMF/Map presents import times in the same order of magnitude as CDO, but it is about 33% slower for the largest model → NeoEMF/Map is affected by the overhead produced by modifications on big lists (>100,000 elements) that grow monotonically (caching is needed)
     ▌ The simple data model with low-cost operations implemented by NeoEMF/Map contrasts with the more complex data model implemented by NeoEMF/Graph (consistently slower by a factor between 7 and 9)
  35. SUMMARY
     ▌ Traversal of a very large model is much faster (up to 9×) with NeoEMF/Map
     ▌ If load and unload times are considered, NeoEMF/Map also outperforms XMI
     ▌ The fast model-traversal ability of NeoEMF/Map is exploited by the pattern followed by most of the queries in the modernization domain
     ▌ Queries that traverse the model to apply and persist changes perform significantly better on NeoEMF/Map (5× faster on big models, 9× on small models)
  36. CONCLUSIONS
     Conclusions
     Future work
  37. CONCLUSIONS
     ▌ Map-based persistence layer to handle VLMs
     ▌ Comparison against relational-based and graph-based alternatives
     ▌ EMF as the implementation technology
     ▌ We used queries from some of our industrial partners in the model-driven modernization domain as experiments
     ▌ Typical model-access APIs, with fine-grained methods and one-step-navigation queries, do not benefit from complex relational or graph-based data structures
     ▌ Low-level data structures, like hash-tables, with low and constant access times provide better results
  38. FUTURE WORK
     ▌ Caching strategies:
     │ Element unloading (which element is not needed anymore?)
     │ Element prefetching (which element will be needed in future?)
     ▌ Benefits of other backends depending on the specific application scenario:
     │ Graph-based persistence solutions when some of our requirements can be dropped
     │ Bypassing the model access API by translating the queries to high performance native graph-database queries may provide great benefits
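As one possible answer to the "which element is not needed anymore?" question, a least-recently-used policy is a common starting point. The sketch below is an assumption for illustration only, not a strategy the authors commit to: `LruElementCache` is a hypothetical name, and it relies on the standard `LinkedHashMap` access-order mode and its `removeEldestEntry` hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache of loaded elements, evicting the least recently used one.
final class LruElementCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruElementCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: iteration order follows access recency
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // "unload" the eldest entry once the bound is exceeded
    }
}
```

With a capacity of 2, loading ‘p1’ and ‘c1’, touching ‘p1’ again and then loading ‘b1’ evicts ‘c1’, the entry that was used least recently. Prefetching would need a separate prediction step that this sketch does not attempt.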
  39. MAP-BASED TRANSPARENT PERSISTENCE FOR VERY LARGE MODELS
     Abel Gómez, Massimo Tisi, Gerson Sunyé and Jordi Cabot