This document discusses parallelizing model management programs to improve scalability for very large models. It proposes making Epsilon, an open-source model management tool, concurrent by executing validation, transformation, querying and other tasks across multiple threads. Previous work on parallelism focused mainly on model-to-model transformations, while this research aims to enable concurrent execution of different Epsilon languages. Preliminary work implemented a parallel validation language with promising speed-ups. Future work includes investigating GPU acceleration and distributed parallelism, while combining parallelism with incremental and lazy techniques.
Parallel Execution of Model Management Programs (STAF 2017)
1. Parallel Execution of Model
Management Programs
Sina Madani
supported through CROSSMINER EC project
supervised by Dr. Dimitris Kolovos and Prof. Richard Paige
2. Introduction
Motivation for the project
Brief literature review
Proposed Solution
Work so far
Expected contributions
Plan for Evaluation
3. Problem and Motivation
Model-Driven Engineering (MDE) used in industrial contexts
e.g. automotive, civil engineering, reverse-engineered code
Current tools not built to handle Very Large Models (VLMs)
“Very Large” meaning millions of elements
Scalability is a widely cited concern with MDE
Execution of programs with VLMs as input is very slow
Almost all mainstream MDE tools are single-threaded in execution
5. Related Work – overview
Increasing interest in improving performance of MDE tools
Main approaches:
Incrementality – only computing the delta of changes (caching)
Laziness – delaying computation until it can no longer be avoided
Reactivity – event-driven computations (incremental and lazy)
Parallelism – splitting the computation across multiple threads
Distribution – using multiple computers to perform the computation
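Two of the approaches above, laziness and incrementality, can be illustrated together in a minimal Java sketch (illustrative only; not how any of the cited tools implement it): a derived value is computed only when first requested (lazy), and the cached result is reused until a model change invalidates it (incremental).

```java
import java.util.function.Supplier;

/** Minimal sketch of laziness + incrementality: the computation runs only
 *  on first access, and the cached result is reused until invalidated. */
class CachedDerivedValue<T> {
    private final Supplier<T> computation;  // the (potentially expensive) query
    private T cached;
    private boolean valid = false;

    CachedDerivedValue(Supplier<T> computation) {
        this.computation = computation;
    }

    T get() {                 // lazy: nothing is computed until this is called
        if (!valid) {
            cached = computation.get();
            valid = true;
        }
        return cached;
    }

    void invalidate() {       // incremental: called when the model changes
        valid = false;
    }
}
```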
7. Lazy and Reactive approaches
Lazy ATL (Tisi et al., 2011)
Navigation (source) and Generation (target)
Lazy OCL (Tisi et al., 2015)
Includes lazy collections based on iterators
Reactive ATL (Pérez et al., 2015)
Automated model transformations
VIATRA-3 (Bergmann et al., 2015)
Event-driven incremental model transformations
Suitable for real-time applications
8. Parallel and Distributed
A number of works on parallel graph transformations
LinTra (Burgueño et al., 2013 / 2015)
Concurrent model transformation by streaming from tuple space
Parallel ATL (Tisi et al., 2013)
Task-parallel approach, 2.2x speedup with 4 cores / threads
Distributed ATL based on MapReduce (Benelallam et al., 2015)
Data-parallel approach, ~3x speedup with 8 nodes
Efficient Model Partitioning (Benelallam et al., 2016)
9. Un(der)-explored areas
Most work focused on model-to-model transformations
Distributed execution on multi-threaded machines
Combining parallelism with incrementality/laziness
Using GPUs for accelerating execution
10. Comparison of related work by task type
Task       | Incremental       | Lazy | Parallel           | Distributed        | GPGPU | Reactive
Querying   | Too many to name! | ATL  | ATL                | ATL-MR, IncQuery-D |       | Reactive-ATL, Viatra3
M2M        | Too many to name! | ATL  | ATL, LinTra        | ATL, LinTra        |       | ATL, Viatra3
Validation | OCL               | OCL  | OCL, EVL (partial) |                    |       |
M2T        | EGL               |      |                    |                    |       |
Comparison
Dark Green: Main focus areas of research
Light Green: Potential interest / contributions of research (if time permits)
Grey: Unlikely to be within research scope
11. Proposed Solution – overview
Concurrent execution of model management tasks
Using Epsilon as implementation testbed
Offers DSLs for various tasks based on a common language
Hybrid (imperative/declarative) DSLs
Convenient for generalising concurrency concerns across tasks
Open-source Eclipse project (intend to merge our changes)
Minimising impact to existing codebase is a concern
13. Proposed Solution – challenges
Concurrent modification
Dependencies pose a problem – require synchronisation or duplication
Parallelisation of imperative constructs
EOL potentially allows execution of any Java program!
Can limit scope to pure functions on collections
GPU acceleration requires a limited programming model
Low-level APIs
Only primitives and one-dimensional arrays as data
No / limited branching logic
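The restriction to pure functions on collections can be shown in plain Java (the `Element` class is a hypothetical stand-in for a model element, not Epsilon's actual API): a side-effect-free select over model elements writes no shared state, so it is safe to evaluate on any number of threads.

```java
import java.util.List;
import java.util.stream.Collectors;

public class PureSelect {
    // Hypothetical stand-in for a model element; not Epsilon's API.
    record Element(String type, int size) {}

    /** A pure select: the predicate reads but never writes shared state,
     *  so the filter can run on many threads without synchronisation. */
    static List<Element> largeElements(List<Element> model) {
        return model.parallelStream()
                    .filter(e -> e.size() > 1000)   // pure predicate
                    .collect(Collectors.toList());
    }
}
```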
14. Preliminary Work
Focused on Epsilon Validation Language (EVL)
Read-only model (except fixes), so simplifies concurrency
Element (data) and constraint (task) parallel
Implemented and tested multiple parallel approaches
ThreadPoolExecutor, concurrent collections... (minimal synchronization)
Equivalence testing with original implementation
Results show promising speed-ups (~3x with 4 threads)
Tests seem to be passing so far
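The element-parallel approach described above can be sketched as follows (all names are illustrative; the actual Parallel EVL implementation lives in Epsilon): element checks are submitted to an ExecutorService and unsatisfied constraints are gathered in a lock-free concurrent collection, so worker threads need no explicit locking.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

public class ParallelValidator {
    /** Checks one constraint against every element in parallel and returns
     *  the elements that violate it. Illustrative sketch only. */
    static <E> Queue<E> validate(List<E> elements, Predicate<E> constraint, int threads)
            throws InterruptedException {
        Queue<E> unsatisfied = new ConcurrentLinkedQueue<>();  // lock-free result sink
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (E element : elements) {
            executor.submit(() -> {
                if (!constraint.test(element)) {
                    unsatisfied.add(element);  // thread-safe add, no locking needed
                }
            });
        }
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        return unsatisfied;
    }
}
```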
15. EVL example – couples in movies
context Couple {
  constraint twoDifferentPeople {
    guard: self.commonMovies.size() > 5
    check {
      if (self.p1.name == self.p2.name) {
        return not self.p1.movies.includesAll(self.p2.movies);
      }
      return true;
    }
    message: "Couple contains the same person!"
  }
}
16. Future Work & Expected Contributions
Concurrent implementation of EVL, ECL, EPL, EGX
and likely ETL for comparing our results and approach to other works
Investigation into GPU acceleration of model management programs
Investigation into distributed parallelism of model management tasks
Combining parallel execution with laziness and incrementality
17. Evaluation Plan
Two primary aspects:
Correctness – does our implementation behave as it should?
Performance – how much faster is our approach/implementation?
Testing on very large models and complex programs
Though finding them is proving to be a challenge!
Equivalence testing concurrent / non-concurrent implementations
Also comparing with other tools for consistency
Requires writing semantically identical scripts in other languages
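The equivalence testing step can be sketched with a small harness (hypothetical; not Epsilon's actual test suite): since the concurrent engine may report violations in any order, both result lists are normalised by sorting before comparison.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class EquivalenceCheck {
    /** Order-insensitive comparison of sequential vs. parallel results.
     *  Duplicates still matter, so a sorted-list comparison is used
     *  rather than a set comparison. Illustrative sketch only. */
    static boolean equivalent(List<String> sequential, List<String> parallel) {
        List<String> a = new ArrayList<>(sequential);
        List<String> b = new ArrayList<>(parallel);
        a.sort(Comparator.naturalOrder());
        b.sort(Comparator.naturalOrder());
        return a.equals(b);
    }
}
```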
18. Current Status
Started in January
Mostly literature review
Finalising tests for Parallel EVL v1.0
More scripts and models which can be re-used for evaluating other tasks
Testing with lots of cores / threads on computing cluster
Looking into distributed processing (Spark, Hadoop, Kafka...)
Next tasks: Pattern-matching (EPL), Model-to-Text (EGX)
Intend to see how much of Parallel EVL approach can be re-used
19. Summary
Current model management execution engines are inefficient
Mostly single-threaded, sequential
Laziness and incrementality are desirable but insufficient
Concurrency is hard!
20. Questions?
Thank you for listening!
Contact:
Sina Madani
sm1748@york.ac.uk
[Diagram: an ExecutorService submits batch jobs to Thread 1...Thread N; each thread's results are merged into shared concurrent collections.]
21. Proposed Solution – justification
Why parallelism?
Single-threaded CPU performance relatively stagnant
All general-purpose CPUs now multi-core (increasing thread counts)
Distributed / cloud computing resources more ubiquitous
Data-parallel approach (SIMD)
Scalability concerns are with # of model elements
Allows for partial distribution of data per processing unit
Minimises synchronisation / contention -> better performance
Suitable for stream processing (e.g. GPUs, various frameworks)
Avoids rule dependencies
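The data-parallel splitting argued for above can be sketched as chunking the element list into per-worker batches (illustrative only; batch sizing in a real engine would account for workload balance):

```java
import java.util.ArrayList;
import java.util.List;

public class Batches {
    /** Splits model elements into roughly equal batches, one per worker.
     *  Each batch can then be processed independently, minimising
     *  synchronisation and contention between threads. */
    static <E> List<List<E>> partition(List<E> elements, int workers) {
        List<List<E>> batches = new ArrayList<>();
        int size = (elements.size() + workers - 1) / workers;  // ceiling division
        for (int i = 0; i < elements.size(); i += size) {
            batches.add(elements.subList(i, Math.min(i + size, elements.size())));
        }
        return batches;
    }
}
```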
Editor's Notes
Models allow for appropriate level of abstraction
Would prefer to perform as many operations at model-level as possible, as opposed to lower-level artifacts
Parallel ATL
Implicit (task-level) parallelism
Parallelisation made possible by declarative, independent nature of ATL language constraints:
Outputs of transformations to targets are immediate when a rule is matched – cannot be used as intermediate data
OCL expressions cannot navigate the target model
Single-valued properties are “final”
Multi-valued properties can only be added to.
OCL expressions in guard or bindings don’t have side-effects.
Parallelisation of ATL rules and OCL expressions are completely independent/orthogonal
i.e. can have one without the other.
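The write discipline behind these guarantees (single-valued properties are final, multi-valued ones add-only, targets never navigated) can be sketched with a hypothetical target-element class:

```python
class TargetElement:
    """Illustrative sketch of ATL's target-side write discipline:
    single-valued properties are write-once ("final") and
    multi-valued properties can only be appended to."""
    def __init__(self):
        self._single = {}
        self._multi = {}

    def set(self, name, value):
        # Single-valued: a second assignment is an error.
        if name in self._single:
            raise ValueError(f"single-valued property {name!r} already set")
        self._single[name] = value

    def add(self, name, value):
        # Multi-valued: append-only, no removal or overwrite.
        self._multi.setdefault(name, []).append(value)

    def get(self, name):
        return self._single.get(name, self._multi.get(name))

elem = TargetElement()
elem.set("name", "A")
elem.add("columns", "id")
elem.add("columns", "value")
```

Under these rules, two rule applications can never observe or undo each other's writes, which is what makes lock-free parallel application plausible.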
Two sub-problems: decomposition and synchronization.
Need to deal with concurrent access of shared memory.
Note: a “Match” is a set of expression evaluations over source model elements
Decomposition
Simply executing rule applications for each match has too much overhead for VLMs.
Task Parallelism – MISD (https://en.wikipedia.org/wiki/MISD)
Each task executes a different rule
Works over full source and target models
Each task can be further sub-divided into matching and rule application
but these sub-tasks are not independent – can’t apply the rule without having something to apply it to!
Synchronization is needed in/between:
CRUD on target model elements (and properties of the elements)
CRUD on trace link operations and target model
Match and apply
Jobs and trace links
Apply phase only happens when all matchers have finished executing
Other runtime data and/or engine implementation internals
Further optimizations for reducing synchronization need to be done on the framework (i.e. EMF) side.
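A toy sketch of the match/apply barrier: all matching completes before any rule application begins (illustrative names, not Parallel ATL's implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def transform_two_phase(elements, guard, apply_rule, n_threads=4):
    """Two-phase execution sketch: matching runs in parallel, and the
    apply phase only starts once every matcher has finished."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # Phase 1: matching. Consuming pool.map blocks until all match
        # tasks are done, acting as the barrier between the phases.
        matches = [e for e, ok in zip(elements, pool.map(guard, elements))
                   if ok]
    # Phase 2: rule application, over the completed match set only.
    return [apply_rule(e) for e in matches]

result = transform_two_phase([1, 2, 3, 4],
                             lambda x: x % 2 == 0,   # matcher
                             lambda x: x * 10)       # rule application
```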
Speedup of 1.5 – 2.5 times
Performance gains larger for smaller models?!
Future work: static analysis of rule dependencies to prevent re-introducing locks on data access
Parallel LinTra
One idea is to use a high-level MT language and a lower-level language like Java for handling the distribution and concurrency, based on Linda.
Stream model elements from tuple space (could be distributed)
Perform transformation
Write to output tuple space
All of this done using multiple threads concurrently
Need to find a way to estimate workload on each thread
Some threads could get computationally “light” rules
Distribution of model elements across threads
Focused on evolving models (in-place)
Source and target conform to same metamodel
Particularly applicable to migration
E.g. reverse-engineering a system to create a model from it
Extending LinTra to be in-place
Uses XAP Elastic Caching Edition from GigaSpaces Technologies
Allows for multiple distributed tuple spaces that can hold (serializable) Java objects
XAP internally deals with concurrency (transparent to the user)
Multiple threads can access tuple space(s)
Can query tuple space using SQL-like syntax
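A minimal Linda-style tuple space can be sketched as follows; this illustrates the concept only and is not XAP's actual API:

```python
import threading

class TupleSpace:
    """Minimal Linda-style tuple space sketch: threads write tuples
    and take the next tuple matching a predicate. A condition variable
    stands in for XAP's internal (transparent) concurrency handling."""
    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def take(self, match):
        """Remove and return the first tuple satisfying `match`,
        blocking until one becomes available."""
        with self._cond:
            while True:
                for i, t in enumerate(self._tuples):
                    if match(t):
                        return self._tuples.pop(i)
                self._cond.wait()

space = TupleSpace()
space.write(("Class", 1, "Person"))
space.write(("Attribute", 2, "name"))
```

In the LinTra setting, worker threads would stream source elements out of one space, transform them, and write results into an output space.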
Meta-modelled in Java
Entities have unique identifiers
Relationships established by storing target entity IDs in source entities
Out-place MT means building target model from scratch using the transformation rules
In-place MT means evolving the input (source) model to get to the output (target) model
Recursive in-place will apply rules one-by-one to target model. It is stateful.
Non-recursive makes a “leap” from input to output model, without considering intermediate steps (similar to out-place transformations)
LinTra uses non-recursive in-place MT
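The difference between the two in-place styles can be illustrated with a toy rule (hypothetical guard/rule functions; LinTra's engine is far more involved):

```python
def nonrecursive_inplace(model, guard, rule):
    """Non-recursive in-place sketch: every rule application reads the
    ORIGINAL model state, and the output replaces it in one 'leap'
    with no intermediate states."""
    return [rule(e) if guard(e) else e for e in model]

def recursive_inplace(model, guard, rule):
    """Recursive (stateful) in-place sketch: rules are applied
    one-by-one to the evolving model until no element matches."""
    model = list(model)
    changed = True
    while changed:
        changed = False
        for i, e in enumerate(model):
            if guard(e):
                model[i] = rule(e)
                changed = True
    return model

# Toy rule: increment any element below 3.
source = [1, 2, 3]
leap = nonrecursive_inplace(source, lambda e: e < 3, lambda e: e + 1)
fixpoint = recursive_inplace(source, lambda e: e < 3, lambda e: e + 1)
```

The non-recursive form is what makes the transformation embarrassingly parallel: since rules never see intermediate states, elements can be processed in any order on any thread.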
Out-place models may have dependencies between transformation rules
LinTra uses Master-Slave pattern
Master co-ordinates transformation and creates slaves
Slaves run transformation rules on sub-models (partitions) of the source model, as if they were independent
Dependencies can be retrieved when needed – see Blackboard paradigm
Handling of relationships after CUD events
Handling of rule conflicts - confluence (i.e. when multiple rules alter the same part of model – so order of execution matters)
“Encapsulates all the concurrent mechanisms needed for parallel execution of model transformations”
Achieved 2-3 times speed-up (average 2.57) on a 16-core system compared to ATL
Experiment only on classical Class2Relational transformation
In-place MUCH faster than out-place (speed-up of up to 955x compared to sequential ATL!)
Future research:
Use a higher-level concurrent language or framework on top of Java for implementing transformations
Higher-order transformations automatically generating parallelisable code?
Optimising number of threads and the work each thread does based on hardware configuration (perhaps at runtime)
E.g. executing same transformation over subset of model (i.e. SIMD)?
Could be good for distributed GPGPU
Distributed ATL
“Nice” properties of ATL:
Locality
Single assignment on target properties
Non-recursive rule application (single match)
Target model cannot be navigated
Rules are not as entangled in ATL, so more amenable to parallelisation
Each map worker runs the full transformation on a subset of model elements (“Local match-apply”)
Intelligent assignment not considered – could increase data locality for further gains in performance
but requires static analysis
Upon completion, each map worker sends the Set<ModelElement> it created and tracing information to the reduce function.
Trace information used to resolve exact binding to target elements
“Global resolve” phase brings together the partial models and updates properties of unresolved bindings
At the beginning of this phase, all target elements are created and local bindings are resolved
Sometimes source and target elements may not be transformed in the same node during the mapping phase
thus, trace links used to defer this to reduce (“Global resolve”) phase.
Trace metamodel extended to include additional properties required for resolving bindings in the reduce phase.
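The local match-apply / global resolve split might be sketched like this, with trace links deferring cross-partition references to the reduce phase (a dictionary-based toy model, not the Distributed ATL implementation):

```python
def local_match_apply(partition, trace):
    """Map phase sketch: each worker transforms its subset of source
    elements, recording a trace link source_id -> target element.
    References are left unresolved as ('ref', source_id) placeholders
    because the referenced element may live on another worker."""
    targets = []
    for elem in partition:
        tgt = {"id": elem["id"],
               "ref": ("ref", elem["ref"]) if elem.get("ref") else None}
        trace[elem["id"]] = tgt
        targets.append(tgt)
    return targets

def global_resolve(all_targets, trace):
    """Reduce phase sketch: use the merged trace information to bind
    unresolved placeholders to the actual target elements."""
    for tgt in all_targets:
        if isinstance(tgt["ref"], tuple):
            tgt["ref"] = trace[tgt["ref"][1]]
    return all_targets

# Two "workers", each holding one source element; "a" references "b".
trace = {}
part1 = [{"id": "a", "ref": "b"}]
part2 = [{"id": "b", "ref": None}]
targets = local_match_apply(part1, trace) + local_match_apply(part2, trace)
resolved = global_resolve(targets, trace)
```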
ATL VM on top of Hadoop
Each node runs its own VM but handles either the map or reduce phase.
The number of "splits" (model elements per worker) should be set to the number of elements divided by the number of workers – ideally a one-to-one mapping.
XMI is not thread-safe and has to be fully loaded into memory
Evaluation:
Two nodes minimum to get same speed as sequential (non-distributed)
8 nodes results in 2.5 to 3 times speedup over sequential ATL
Speedup of up to 6x with 8 nodes
Speedup improves with model size
Future work:
Parallelise global resolve to reduce I/O bottlenecks
Efficient load balancing using static analysis
Pipelining transformations on MapReduce
Efficient Model Partitioning
Model transformations are not “flat” structures (which would be optimal for MapReduce)
Computational complexity in pattern matching / exploring structure
Data access is critical
Inefficient distribution of data can lead to severe (I/O or network-bound) bottlenecks
In a declarative relation transformation language like ATL, efficient distribution can be found using static analysis
“Transformation footprints” used to compute dependencies
Cost of computing efficient distribution for models with millions of elements can outweigh the benefits – need a fast heuristic!
Computing full dependency graph and solving linear programming optimization problem would take too long
Model is divided up into “splits”; equal to the number of machines
Each machine has a set of elements assigned to it
Each model element (per split) is assigned to one (and only one) set
Need to balance making use of all machines whilst minimising dependencies
For example, if all rules depend on a single element, then it wouldn’t be efficient to have the whole transformation assigned to one machine
Want to minimize elements per machine and
Maximize dependency overlap in each machine’s load.
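A toy greedy heuristic for this trade-off: favour the machine with the most dependency overlap, breaking ties towards the least-loaded one (illustrative only, not the paper's actual algorithm):

```python
def assign(element_deps, machines):
    """Greedy streaming assignment sketch: each arriving element (with
    its approximate dependency set) goes to the machine maximising
    dependency overlap with that machine's current load; ties go to
    the machine with fewer elements, to balance utilisation."""
    loads = {m: set() for m in machines}   # deps already on each machine
    counts = {m: 0 for m in machines}      # elements per machine
    placement = {}
    for elem, deps in element_deps.items():
        best = max(machines,
                   key=lambda m: (len(loads[m] & deps), -counts[m]))
        placement[elem] = best
        loads[best] |= deps
        counts[best] += 1
    return placement

# e1 and e2 share dependency "b", so they should co-locate;
# e3 is independent and goes to the emptier machine.
placement = assign({"e1": {"a", "b"}, "e2": {"b", "c"}, "e3": {"x"}},
                   ["m1", "m2"])
```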
“Footprints” represent an abstract view of a rule application’s navigation
Constructed from OCL guards and bindings
AST recursively traversed to build the dependency graph
Solution uses a stream of model elements which are assigned by an algorithm to each machine
Order of arrival can affect performance
Uses a buffer to (partially) alleviate this
High-priority elements (which can affect dependency graph) are assigned first
Order of arrival not optimal, so can get lots of low-priority elements in one lump
Dependency graph and assignment happen on-the-fly
Efficient partitioning depends almost entirely on quality of dependency graph approximation
In future, can exploit meta-model / typical model topology to estimate dependencies
Solution assumes nice properties of underlying framework
Thread-safety (concurrent read/write)
On-demand partial loading of models
Fast look-up of elements (cached/index)
Distributed Pattern Matching
Models are usually labelled, directed graphs, so transformation can be done using graph rewriting
Rules consist of LHS and RHS
Rule application tries to find pattern in LHS and replace it with pattern in RHS
This approach requires finding isomorphic subgraphs – an NP-complete problem
i.e. given two graphs G and H, does G contain a subgraph isomorphic to H?
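A brute-force check makes the cost visible; it is exponential in the pattern size, which is why real engines rely on search-plan heuristics:

```python
from itertools import permutations

def has_subgraph(g_nodes, g_edges, h_nodes, h_edges):
    """Brute-force subgraph isomorphism sketch for directed graphs:
    does G contain a subgraph isomorphic to H? Tries every injective
    mapping of H's nodes onto G's nodes."""
    for mapping in permutations(g_nodes, len(h_nodes)):
        m = dict(zip(h_nodes, mapping))
        # The mapping is a match if every pattern edge has an image in G.
        if all((m[u], m[v]) in g_edges for (u, v) in h_edges):
            return True
    return False
```

For example, a directed triangle contains a single directed edge as a subgraph, but not a 2-cycle (an edge in both directions).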
Transformation-level parallelism applies the rewriting in parallel
Inefficient if there are lots of dependencies
Rule-based parallelism searches for the patterns in parallel
Difficult to implement
Hard to tell which rules are in conflict (i.e. affect each other’s output), as patterns are defined by metamodel elements
Order of execution of rules which are not in metaconflict doesn’t matter
Transformations executed in “Independence blocks”
Uses heuristics to minimise conflicts
Unclear how conflicts are actually resolved
With rule-level parallelism, the best-case is O(1) and worst-case is O(n^k) where n is elements in host and k is elements in target
(lots of math/algorithm)…
Composed approach:
“Master” co-ordinates execution of transformations
“Primary Workers” are responsible for applying rewriting rules
“Secondary Workers” compute a match for a rewriting rule using a pseudo-random function
Master can have several Primary Workers
Primary Worker can have several Secondary Workers
but each secondary worker has one primary worker parent
Transformation-level parallelism handled by Master
with Primary Workers as clients
Rule-level parallelism handled by Primary Workers
with Secondary Workers as clients
Master and Workers are different computers connected over network
Communicate using a modified UDP
Guaranteed delivery
Preserves order
IncQuery-D
Uses incremental (graph) pattern matching by applying RETE algorithm
https://en.wikipedia.org/wiki/Rete_algorithm
Graph patterns represent conditions (or constraints) that have to be fulfilled by a part of the model space in order to execute some manipulation steps on the model. A model (i.e. part of the model space) satisfies a graph pattern, if the pattern can be matched to a subgraph of the model using a generalized graph pattern matching technique.
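The incremental idea can be sketched in miniature: keep the match set materialised and update it on every model change, instead of re-running the query (illustrative only, not IncQuery-D's Rete node network):

```python
class IncrementalMatcher:
    """Tiny Rete-flavoured sketch: the match set for a pattern is kept
    materialised and maintained on each change notification, so query
    evaluation after the initial build is a constant-time lookup."""
    def __init__(self, predicate):
        self.predicate = predicate
        self.matches = set()

    def on_add(self, elem):
        if self.predicate(elem):
            self.matches.add(elem)

    def on_remove(self, elem):
        self.matches.discard(elem)

# Pattern: "even elements". Build once, then maintain incrementally.
matcher = IncrementalMatcher(lambda e: e % 2 == 0)
for e in [1, 2, 3, 4]:
    matcher.on_add(e)      # initial (expensive) build of the match set
matcher.on_remove(2)       # cheap incremental maintenance on change
matcher.on_add(6)
```

In IncQuery-D this cache is itself distributed across nodes, which is why the memory-intensive Rete network can scale out.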
Distributed, scalable incremental model querying
Database technologies are not well-adapted to handle complex queries as needed in MDE
Designed to scale-out memory intensive incremental queries
Storage and indexing solutions are decoupled
Can use various different persistence back-ends
Model Access Adapter
Provides a mechanism for uniquely identifying model elements across the entire distributed repository
Provides a graph-like API to the user, translating user operations to the back-end query language and forwarding them to the underlying storage
Provides a façade for propagating change notifications (in the models) to the underlying storage
Distributed indexer
Common queries like “MyType.allInstances()” are cached automatically
The cache itself is distributed
Adapted Rete algorithm for distributed environment
Input, Worker and Productions nodes handle the processing
Co-ordinator node used to keep Rete nodes updated and to start operations
Uses acknowledgement messages as its termination protocol for retrieving query results in a consistent state
Compared prototype implementation to state-of-the-art non-incremental distributed query engine
Overhead of constructing Rete network makes it less efficient than non-incremental engine for smaller models
Cost outweighs benefits for medium-size models
Near-instantaneous query evaluation (after caching) even for models with well over 10 million elements
Comparing methodologies / implementations using M2M
Want to fulfil tasks which are (mostly) ignored by the literature
All these approaches are orthogonal and could complement each other well
Give examples of concurrency issues?
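One example worth giving is the lost-update race on a shared counter. It is replayed here deterministically rather than with real threads, so the problematic interleaving is explicit (a hypothetical sketch, not Epsilon code):

```python
def lost_update_demo():
    """Deterministic replay of the classic lost-update race:
    'counter += 1' is really read-modify-write, so if two threads both
    read before either writes back, one increment is lost. With a lock,
    or by merging thread-local results as in the batch/merge design,
    the answer would be 2."""
    counter = 0
    read_a = counter        # thread A reads 0
    read_b = counter        # thread B also reads 0, before A writes back
    counter = read_a + 1    # A writes 1
    counter = read_b + 1    # B overwrites with 1; A's update is lost
    return counter
```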
Explain the algorithm briefly:
Context applies to element?
Constraint applies to element (constraint guard)?
Constraint is satisfied?
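These three checks can be sketched as a simple validation loop (illustrative structure, not the EVL engine's API):

```python
def validate(model, contexts):
    """EVL-style check loop sketch: for each element, find the contexts
    that apply to it, then for each constraint run its guard and, if
    the guard holds, its check. Returns (element, constraint) pairs
    for every unsatisfied constraint."""
    failures = []
    for elem in model:
        for ctx in contexts:
            if not ctx["applies"](elem):       # context applies to element?
                continue
            for con in ctx["constraints"]:
                if not con["guard"](elem):     # constraint guard holds?
                    continue
                if not con["check"](elem):     # constraint satisfied?
                    failures.append((elem, con["name"]))
    return failures

# Toy context over integers: non-zero numbers must be positive.
contexts = [{
    "applies": lambda e: isinstance(e, int),
    "constraints": [{
        "name": "positive",
        "guard": lambda e: e != 0,   # zero is exempt via the guard
        "check": lambda e: e > 0,
    }],
}]
failures = validate([3, 0, -2, "x"], contexts)
```

Since each element's checks are independent of the others, the outer loop over elements is the natural unit to farm out across threads.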
Test on Java models
e.g. “code smell” warnings for EVL
ISBN validation for complex logic (in DBLP models)
IMDB for simple logic