Incremental pattern matching in the VIATRA2 model transformation framework


Presented at an invited talk session for the University of Leicester in spring 2009.

Published in: Technology, Economy & Finance

Incremental pattern matching in the VIATRA2 model transformation framework

  1. 1. Incremental pattern matching in the VIATRA2 model transformation framework
     István Ráth (rath@mit.bme.hu)
     Budapest University of Technology and Economics
     Department of Measurement and Information Systems

  2. 2. Overview
     • Introduction to GT
        o from the VIATRA2 perspective
     • Incremental pattern matching with RETE
     • Initial performance analysis
        o Incremental vs. local search
     • Fine tuning
        o Parallelization
        o Hybrid pattern matching
     • Ongoing research / future work
     • Summary

  3. 3. INTRODUCTION
     Graph transformations from the VIATRA2 perspective…

  4. 4. MDA, DSM in practice
     [Figure: model-driven development workflow. Labels: Modeling (re-engineering), PIM, domain-specific models, platform-specific models (Embedded platform model, CORBA model, J2EE model), applications (Embedded, CORBA, J2EE), legacy code.]
     (Slide footer: BME-MIT Miniszimpózium)

  5. 5. MDA, DSM in practice
     [Figure: the same model-driven development workflow (Modeling (re-engineering), PIM, domain-specific models, platform-specific models, Embedded/CORBA/J2EE applications, legacy code), now with the added labels "Domain-specific views" and "Applications of VIATRA2".]

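  6. 6. Metamodeling
     [Figure: Petri net metamodel and an instance model.
      Metamodel: classes Place, Token and Transition; associations tokens, inarc and outarc; multiplicity constraints ("at most one", "arbitrary"); legend entries Class, Association, Multiplicity constraint, Slot.
      Instance model: objects t1:Transition, t2:Transition, p1:Place, p3:Place, to1:Token; links a1:inarc, a2:outarc, a3:inarc, a4:outarc, tkn3:tokens; legend entries Object, Link.]
     (Slide footer: ICGT 08)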

  7. 7. Graph Transformation
     [Figure: a graph transformation rule. Labels: LHS, RHS, Place, Tran., Token, a1:inarc, a2:outarc, tkn1:tokens, tkn2:tokens, "matching", "updating".]
     • Phases of GT matching
        o Pattern matching phase
        o Updating phase: delete + create
     • Pattern matching is the most critical issue from a performance viewpoint.

  8. 8. Incremental model transformations
     • Key usage scenarios for MT:
        o Mapping between languages
        o Intra-domain model manipulation
           - Model execution
           - Validity checking (constraint evaluation)
     • They work with evolving models.
        o Users are constantly changing/modifying them.
        o Users usually work with large models.
     • Problem: transformations are slow
        o To execute… (large models)
        o and to re-execute again and again (always starting from scratch).
     • Solution: incrementality
        o Take the source model, and its mapped counterpart;
        o Use the information about how the source model was changed;
        o Map and apply the changes (but ONLY the changes) to the target model.

  9. 9. Towards incrementality
     • How to achieve incrementality?
        o Incremental updates: avoid re-generation.
           - Don't recreate what is already there.
           - Use reference (correspondence) models.
        o Incremental execution: avoid re-computation.
           - Don't recalculate what was already computed.
           - How?

  10. 10. INCREMENTAL GRAPH PATTERN MATCHING
     With the RETE algorithm, as implemented in VIATRA2

  11. 11. Incremental graph pattern matching
     • Graph transformations require pattern matching
        o Most expensive phase
     • Goal: retrieve the matching set quickly
     • How?
        o Store (cache) matches
        o Update them as the model changes
           - Update precisely
     • Expected results: good, if…
        o There is enough memory (*)
        o Queries are dominant
        o Model changes are relatively sparse (**)
        o e.g. synchronization, constraint evaluation, …

  12. 12. Operational overview
     [Figure: interaction between three components. Boxes: XForm interpreter, Incremental pattern matcher, VIATRA Model space. Arrow labels: pattern matching, model manipulation, event notification, updates.]

  13. 13. Architecture
     [Figure: VIATRA2 Framework architecture. Components: Model parser, XForm parser, XML serializer, XForm interpreter, LS pattern matcher, Incremental pattern matcher, Native importer & loader interface, Core interfaces, VIATRA Model space, Program model store.]

  14. 14. Core idea: use RETE nets
     • RETE network
        o node: (partial) matches of a (sub)pattern
        o edge: update propagation
     • Demonstrating the principle
        o input: Petri net
        o pattern: fireable transition
        o Model change: new transition (t3)
     [Figure: a model space feeding a RETE network.
      Model space: places p1, p2, p3; tokens k1, k2; transitions t1, t2, t3.
      Input nodes: Place {p1, p2, p3}, Token {k1, k2}, Transition {t1, t2, t3}.
      Intermediate nodes: partial matches such as (p1, k1), (p2, k2), (p1, k1, t1), (p2, k2, t3).
      Production node: (p1, k1, t1, p3), (p2, k2, t2, p3).
      The new transition t3 is shown propagating through the network.]

  15. 15. RETE network construction
     • Key: pattern decomposition
        o Pattern = set of constraints (defined over pattern variables)
        o Types of constraints: type, topology (source/target), hierarchy (containment), attribute value, generics (instanceOf/supertypeOf), injectivity, [negative] pattern calls, …
     • Construction algorithm (roughly)
        o 1. Decompose the pattern into elementary constraints (*)
        o 2. Process the elementary constraints and connect them with appropriate intermediate nodes (JOIN, MINUS-JOIN, UNION, …)
        o 3. Create terminator production node
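
The three construction steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VIATRA2's construction algorithm: the `Constraint` class, the `build_network` function, the left-to-right join order and the direction of the `outarc` relation are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """One elementary constraint of a pattern (hypothetical representation)."""
    kind: str         # e.g. "type" or "edge"
    name: str         # model type or relation name
    variables: tuple  # pattern variables bound by this constraint


def build_network(pattern_name, constraints):
    """Return a flat description of the network: inputs, joins, production."""
    nodes = []
    # Step 1: the pattern is already decomposed; create one input per constraint.
    for c in constraints:
        nodes.append(("INPUT", c.name, c.variables))
    # Step 2: fold the inputs together with JOIN nodes, left to right.
    current_vars = constraints[0].variables
    for c in constraints[1:]:
        shared = tuple(v for v in c.variables if v in current_vars)
        new_vars = tuple(v for v in c.variables if v not in current_vars)
        current_vars = current_vars + new_vars
        nodes.append(("JOIN", shared, current_vars))
    # Step 3: terminate the network with a production node.
    nodes.append(("PRODUCTION", pattern_name, current_vars))
    return nodes


# Assumed decomposition of a "sourcePlace"-like pattern: a Place P with an
# outgoing arc to a Transition T (the arc direction is an assumption).
source_place = [
    Constraint("type", "Place", ("P",)),
    Constraint("edge", "outarc", ("P", "T")),
    Constraint("type", "Transition", ("T",)),
]
network = build_network("sourcePlace", source_place)
for node in network:
    print(node)
```

A real construction algorithm would also choose a join order and share nodes between patterns; the sketch ignores both.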

  16. 16. Key RETE components
     • JOIN node
        o ~relational algebra: natural join
     • MINUS-JOIN
        o Negative existence (NACs)
     [Figure: example network in which three INPUT nodes feed two JOIN nodes that end in a PRODUCTION node labelled sourcePlace.]
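
The two node types correspond to a natural join and an antijoin over sets of partial matches. The sketch below is a batch (non-incremental) illustration in Python; the function names and the dictionary representation of partial matches are assumptions made for the example, not VIATRA2 APIs.

```python
def compatible(left, right):
    """Two partial matches are compatible if they agree on shared variables."""
    return all(left[v] == right[v] for v in left.keys() & right.keys())


def natural_join(left_matches, right_matches):
    """JOIN: merge every compatible pair of partial matches."""
    return [
        {**l, **r}
        for l in left_matches
        for r in right_matches
        if compatible(l, r)
    ]


def minus_join(left_matches, right_matches):
    """MINUS-JOIN: keep left matches that have no compatible partner."""
    return [
        l for l in left_matches
        if not any(compatible(l, r) for r in right_matches)
    ]


places_with_token = [{"P": "p1", "K": "k1"}, {"P": "p2", "K": "k2"}]
out_arcs = [{"P": "p1", "T": "t1"}, {"P": "p3", "T": "t2"}]
all_places = [{"P": "p1"}, {"P": "p2"}, {"P": "p3"}]

# Places that hold a token and have an outgoing arc.
joined = natural_join(places_with_token, out_arcs)
# Places for which no token exists (a negative condition).
empty_places = minus_join(all_places, places_with_token)
print(joined)
print(empty_places)
```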

  17. 17. Supporting a rich pattern language
     • Pattern calls
        o Simply connect the production nodes
        o Pattern recursion is fully supported
     • OR-patterns
        o UNION intermediate nodes
     • Check conditions
        o check (value(X) % 5 == 3)
        o check (length(name(X)) < 4)
        o check (myFunction(name(X)) != 'myException')
        o Filter and term evaluator nodes
     • Result: full VIATRA transformation language support; any pattern can be matched incrementally.
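
A check condition can be pictured as a filter node that forwards only the partial matches satisfying a predicate. The Python sketch below is a hedged illustration: `FilterNode` and the dictionary-based matches are hypothetical, and the two predicates merely mirror the first two check expressions above.

```python
class FilterNode:
    """Stores only the partial matches that satisfy a check condition."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.matches = []

    def insert(self, match):
        # A "+" update: keep the match only if the check holds.
        if self.predicate(match):
            self.matches.append(match)

    def remove(self, match):
        # A "-" update: drop the match if it was stored.
        if match in self.matches:
            self.matches.remove(match)


# check (value(X) % 5 == 3)
value_check = FilterNode(lambda m: m["value"] % 5 == 3)
# check (length(name(X)) < 4)
name_check = FilterNode(lambda m: len(m["name"]) < 4)

candidates = [
    {"name": "abc", "value": 8},
    {"name": "long_name", "value": 13},
    {"name": "xy", "value": 7},
]
for candidate in candidates:
    value_check.insert(candidate)
    name_check.insert(candidate)

print([m["name"] for m in value_check.matches])
print([m["name"] for m in name_check.matches])
```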

  18. 18. Updates
     • Needed when the model space changes
     • VIATRA notification mechanism (EMF is also possible)
        o Transparent: user modification, model imports, results of a transformation, external modification, … → RETE is always updated!
     • Input nodes receive elementary modifications and release an update token
        o Represents a change in the partial matching (+/-)
     • Nodes process updates and propagate them if needed
        o PRECISE update mechanism
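
The update mechanism can be illustrated with a single join node that keeps two memories and forwards signed update tokens to a production node. This is a minimal Python sketch under simplifying assumptions (one join, set semantics, hypothetical class names), not the VIATRA2 implementation.

```python
class ProductionNode:
    """Holds the complete matches of the pattern."""

    def __init__(self):
        self.matches = set()

    def update(self, sign, match):
        if sign == "+":
            self.matches.add(match)
        else:
            self.matches.discard(match)


class JoinNode:
    """Joins (place, token) pairs with (place, transition) pairs on the place."""

    def __init__(self, child):
        self.left = set()   # memory of (place, token) partial matches
        self.right = set()  # memory of (place, transition) partial matches
        self.child = child

    def update_left(self, sign, pair):
        (self.left.add if sign == "+" else self.left.discard)(pair)
        place, token = pair
        # Propagate only the consequences of this one change.
        for other_place, transition in self.right:
            if other_place == place:
                self.child.update(sign, (place, token, transition))

    def update_right(self, sign, pair):
        (self.right.add if sign == "+" else self.right.discard)(pair)
        place, transition = pair
        for other_place, token in self.left:
            if other_place == place:
                self.child.update(sign, (place, token, transition))


production = ProductionNode()
join = JoinNode(production)

join.update_left("+", ("p1", "k1"))
join.update_right("+", ("p1", "t1"))
after_insert = set(production.matches)

join.update_right("+", ("p1", "t3"))  # model change: new transition t3
after_new_transition = set(production.matches)

join.update_left("-", ("p1", "k1"))   # the token is removed
after_removal = set(production.matches)
print(after_insert, after_new_transition, after_removal)
```

Each call touches only the matches affected by the change, which is the sense in which the update is "precise".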

  19. 19. PERFORMANCE ANALYSIS

  20. 20. Performance
     • In theory…
        o Building phase is expensive ("warm-up")
           - How expensive?
        o Once the network is built, pattern matching is an "instantaneous" operation.
           - Excluding the linear cost of reading the result set.
        o But… there is a performance penalty on model manipulation.
           - How much?
     • Dependencies?
        o Pattern size
        o Matching set size
        o Model size
        o …?

  21. 21. Benchmarking
     • Aim:
        o systematic and reproducible measurements
        o on performance
        o under different and precisely defined circumstances
     • Overall goal:
        o help transformation engineers in selecting tools, fine tuning options
        o Serve as reference for future research
     • Popular approach in different fields
        o AI
        o relational databases
        o rule-based expert systems

  22. 22. Benchmarking in graph transformation
     • Specification examples for GT
        o Goal: assessing expressiveness
        o UML-to-XMI, object-relational mapping, UML-to-EJB, etc.
     • "Generic" performance benchmarks for GT
        o Varró benchmark
        o R. Geiß and M. Kroll: On Improvements of the Varró Benchmark for Graph Transformation Tools
        o (AGTIVE Tool Contest, GraBaTs '08, …)
     • Our ICGT'08 paper: Benchmarks for graph transformation with incremental pattern matching
        o simulation
        o synchronization

  23. 23. Petri net simulation benchmark
     • Example transformation: Petri net simulation
        o One complex pattern for the enabledness condition
        o Two graph transformation rules for firing (tokens are re-created)
        o As-long-as-possible (ALAP) style execution ("fire at will")
        o Model graphs:
           - A "large" Petri net actually used in a research project (~60 places, ~70 transitions, ~300 arcs)
           - Scaling up: automatic generation preserving liveness (up to 100000 places, 100000 transitions, 500000 arcs)
     • Analysis
        o Measure execution time (average multiple runs)
        o Take "warm-up" runs into consideration
     • Profiling
        o Measure overhead, network construction time
        o "Normalize" results
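
For orientation, the simulated behaviour can be sketched as a naive Python loop. This is not the benchmark's VIATRA2 transformation: markings are plain token counters rather than re-created token objects, the enabledness check is recomputed from scratch at every step (the cost an incremental matcher avoids), and all names are assumptions made for the example.

```python
import random


def enabled(marking, transition):
    """A transition is enabled if every input place holds a token."""
    return all(marking[place] > 0 for place in transition["inputs"])


def fire(marking, transition):
    """Remove one token from each input place, add one to each output place."""
    for place in transition["inputs"]:
        marking[place] -= 1
    for place in transition["outputs"]:
        marking[place] += 1


def simulate(marking, transitions, max_firings, rng):
    """Fire enabled transitions as long as possible, up to max_firings."""
    fired = 0
    while fired < max_firings:
        # Naive matching: scan all transitions on every iteration.
        candidates = [
            name for name, t in transitions.items() if enabled(marking, t)
        ]
        if not candidates:
            break
        fire(marking, transitions[rng.choice(candidates)])
        fired += 1
    return fired


# A two-place cycle: the single token moves back and forth forever.
marking = {"p1": 1, "p2": 0}
transitions = {
    "t1": {"inputs": ["p1"], "outputs": ["p2"]},
    "t2": {"inputs": ["p2"], "outputs": ["p1"]},
}
fired = simulate(marking, transitions, max_firings=10, rng=random.Random(0))
print(fired, marking)
```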

  24. 24. Profiling results
     • Model manipulation overhead: ~15% (of overall CPU time)
        o Depends largely on the transformation!
     • Memory overhead
        o Petri nets (with RETE networks) up to ~100000 fit into 1-1.5 GB RAM (VIATRA model space limitations)
        o Grows linearly with model size (as expected)
        o Nature of growth is pattern-dependent
     • Network construction overhead
        o Similar to memory; pattern-dependent.
        o PN: in the same order as VIATRA's LS heuristics initialization.

  25. 25. Execution times for Petri net simulation
     [Chart: Sparse Petri net benchmark. Vertical axis: execution time (ms), ticks from 10 to 1000000. Horizontal axis: Petri net size, ticks from 100 to 100000. Series: Viatra/RETE (x1k), Viatra/LS (x1k), GrGen.NET (x1k), Viatra/RETE (x1M), GrGen.NET (x1M).]
     • Matches/outperforms GrGen.NET for large models and high iteration counts.
     • Three orders of magnitude and growing…

  26. 26. Object-Relational Mapping: Synchronization
     [Figure: reference model between source and target elements. Labels: Package, Schema, schemaRef; Class, Table, tableRef; Attribute, Column, columnRef. A second diagram shows the orphanTable pattern with T: Table marked {DEL} and a NEG condition over tableRef links to C: Class OR A: Association.]
     Sync order:
        1. Orphan schema: delete
        2. Orphan table (class): delete
        3. Orphan table (assoc): delete
        4. Orphan column (attr.): delete
        5. Renaming
        6. New classes/assocs/columns created

  27. 27. Object-Relational Mapping: Synchronization
     • Test case generation
        - Fully connected graph
        - N classes
        - N(N-1) directed associations
        - K attributes
     • Source modification
        - 1/3 classes deleted
        - 1/5 associations deleted
        - 1/2 attributes renamed
        - 1 new class added and the fully connected graph is rebuilt
     • Execution
        - Phases: 1. Generation, 2. Build, 3. Modification, 4. Synchronization
        - Measured: Synchronization phase
     • Characteristic

        Paradigm features                ORM Syn.
        LHS size                         large
        fan-out                          medium
        matchings                        PD
        transformation sequence length   PD

  28. 28. ORM Synchronization benchmark results
     • Execution time (w.r.t. number of affected elements):
        o RETE: linear
        o LS: exponential

  29. 29. Varró: STS Benchmark
     • Non-reusable model elements
     • Better suited to LS

  30. 30. Initial benchmarking: summary
     • Predictable near-linear growth
        o As long as there is enough memory
        o Certain problem classes: constant execution time
     • Benchmark example descriptions, specifications, and measurement results available at:
        o http://wiki.eclipse.org/VIATRA2/Benchmarks

  31. 31. FINE TUNING PERFORMANCE

  32. 32. Improving performance
     • Strategies
        o Improve the construction algorithm
           - Memory efficiency (node sharing)
           - Heuristics-driven constraint enumeration (based on pattern [and model space] content)
        o Parallelism
           - Update the RETE network in parallel with the transformation
           - Support parallel transformation execution
           - Paper at GT-VMT'09: Parallelization of Graph Transformation Based on Incremental Pattern Matching
        o Hybrid pattern matching
           - "Mix" pattern matching strategies, use each at its best
           - Paper at ICMT'09: Efficient model transformations by combining pattern matching strategies

  33. 33. Concurrent pattern matching
     • Asynchronous update propagation in RETE
     • Concurrent to the transformation
        o Wait for update propagation only when querying
     [Figure: sequence of interactions between the transformation and RETE:
        query sourcePlace (instantaneous)
        token removed, notification (asynchronous)
        token removed, notification (asynchronous)
        query targetPlace (synchronizing)
        token added, notification (asynchronous)]
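
The idea of asynchronous propagation with synchronization only at query time can be sketched with a queue and a worker thread. This is a hedged Python illustration using the standard `queue` and `threading` modules; the `ConcurrentMatcher` class is hypothetical and collapses the whole RETE network into a single cached match set.

```python
import queue
import threading


class ConcurrentMatcher:
    """Applies model change notifications on a background thread."""

    def __init__(self):
        self._updates = queue.Queue()
        self._matches = set()
        worker = threading.Thread(target=self._propagate, daemon=True)
        worker.start()

    def notify(self, sign, match):
        """Called by the transformation; returns without waiting."""
        self._updates.put((sign, match))

    def _propagate(self):
        # Update propagation runs concurrently with the transformation.
        while True:
            sign, match = self._updates.get()
            if sign == "+":
                self._matches.add(match)
            else:
                self._matches.discard(match)
            self._updates.task_done()

    def query(self):
        """Synchronize: wait for pending updates, then read the cache."""
        self._updates.join()
        return set(self._matches)


matcher = ConcurrentMatcher()
matcher.notify("+", ("p1", "k1"))   # token added
matcher.notify("+", ("p2", "k2"))   # token added
matcher.notify("-", ("p1", "k1"))   # token removed
result = matcher.query()            # blocks only if updates are pending
print(result)
```

The transformation pays for propagation only when it queries before the worker has caught up, which corresponds to the "potential wait" in the comparison that follows.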

  34. 34. Concurrent PM Performance
     • Theoretical expectations

        Pattern matcher    Update overhead   Pattern query
        LS                 0                 slow
        RETE               large             instantaneous
        concurrent RETE    smaller           instantaneous, potential wait

     • Initial measurements

        Pattern matcher    N=7     N=8     N=9     N=10
        LS                 0.8s    3.2s    37s     390s
        RETE               0.8s    2.6s    8.3s    26.2s
        concurrent RETE    0.8s    2.2s    6.9s    22.8s

        20% speed increase

  35. 35. Multi-threaded pattern matching
     • Multi-threaded RETE propagation
        o Node set partitioned into containers
           - Detecting the fixpoint is nontrivial
           - Timestamp-based synchronization protocol
           - Pattern query: wait for termination!
        o Dedicated update propagation thread per container
     • High connectivity is bad for performance
     [Figure: the Petri net RETE network (input nodes Place {p1, p2, p3}, Token {k1, k2}, Transition {t1, t2, t3}; intermediate and production nodes holding partial matches) partitioned among thread 1, thread 2 and thread 3, with "OK!" and "Whoops…" annotations.]

  36. 36. Multi-threaded PM performance
     • High connectivity is bad for performance
        o Too much synchronization between threads…
        o Partition granularity: i.e. pattern-wise
        o Transformation parts using independent patterns?
        o Intelligent arrangement?

  37. 37. Parallel rule execution
     • Rule execution is dominant (if PM is fast)
        o There is more to gain there!
     • Problem: parallel rules must not conflict
        o Core conflicts: see critical pair analysis
        o Pattern composition, recursion…
        o Parallel rules require thread-safe PM, model
     • Exclusion / locks are required
     [Figure: thread 1 and thread 2 each send a "token removed" notification to RETE; the annotations read "lock by thread 2" and "lock by thread 1".]

  38. 38. Evaluating parallel performance
     • Performance evaluation
        o Code generation (read only): 50% speed increase

           PNML generator   Single   Double
           Serial           2.9s     5.8s
           Parallel         3.7s     3.9s

        o GraBaTs '08 AntWorld benchmark: 4 CPUs, was slower
     • Observation: locks can become bottlenecks
     • Proposed solutions
        o Partial locking? Rule-level locking?
     • Read-intensive transformation is preferable

  39. 39. Parallelism: lessons learned
     • Incremental PM might be good for parallelization
        o No search, shorter locking!
     • Incremental PM might be bad for parallelization!
        o Update propagation keeps locks longer
        o Concurrent matcher solves this
     • Future work
        o Enhancing model locking
        o Clever multi-threading of RETE
        o Very large models: distributed MT, distributed PM

  40. 40. Hybrid pattern matching
     • Idea: combine local search-based and incremental pattern matching
     • Motivation
        o Incremental PM is better for most cases, but…
           - Has memory overhead!
           - Has update overhead
        o LS might be better in certain cases

  41. 41. Where LS is better…
     • Memory consumption
        o RETE sizes grow roughly linearly with the model space
        o Constrained memory → thrashing
     • Cache construction time penalty
        o RETE networks take time to construct
        o "Navigation patterns" can be matched quicker by LS
     • Expensive updates
        o Certain patterns' matching set is HUGE
        o Depends largely on the transformation

  42. 42. Hybrid PM in the source code
     Assign a PM implementation on a per-pattern (per-rule) basis → ability to fine tune performance on a very fine grained level.
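
Per-pattern strategy assignment can be pictured as a table mapping each pattern to a matcher implementation. The Python sketch below is not VIATRA2 syntax (the source-code listing from the slide is not recoverable); the class names, the `"ls"`/`"incremental"` keys and the toy integer model are all assumptions made for illustration.

```python
class LocalSearchMatcher:
    """Recomputes the matches from the model on every query; no cache."""

    def __init__(self, predicate):
        self.predicate = predicate

    def on_change(self, sign, element):
        pass  # nothing is cached, so model changes cost nothing here

    def matches(self, model):
        return {e for e in model if self.predicate(e)}


class IncrementalMatcher:
    """Keeps a cache of matches and updates it on every model change."""

    def __init__(self, predicate):
        self.predicate = predicate
        self.cache = set()

    def on_change(self, sign, element):
        if not self.predicate(element):
            return
        if sign == "+":
            self.cache.add(element)
        else:
            self.cache.discard(element)

    def matches(self, model):
        return set(self.cache)  # no search needed at query time


STRATEGIES = {"ls": LocalSearchMatcher, "incremental": IncrementalMatcher}

# Hypothetical per-pattern assignment: pattern name -> (strategy, condition).
patterns = {
    "evenNumbers": ("incremental", lambda e: e % 2 == 0),
    "bigNumbers": ("ls", lambda e: e > 10),
}
matchers = {
    name: STRATEGIES[strategy](condition)
    for name, (strategy, condition) in patterns.items()
}

model = set()
for element in [2, 11, 14]:
    model.add(element)
    for matcher in matchers.values():
        matcher.on_change("+", element)

results = {name: m.matches(model) for name, m in matchers.items()}
print(results)
```

Both matchers return the same answers; the assignment only decides where the cost is paid (memory and update time versus query time).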

  43. 43. Modified ORM Synchronization benchmark
     Hybrid scales better with increasing models!

  44. 44. AntWorld benchmark
     • Paper for a special issue in STTT'09: Experimental assessment of combining pattern matching strategies with VIATRA2
     • In-detail investigation
        o hybrid approach
        o Transformation language-level optimizations

  45. 45. AntWorld Results
     Linear characteristic retained, slower by only a constant multiplier

  46. 46. Memory footprint
     Linear reduction in memory overhead

  47. 47. Language-level optimizations

  48. 48. Model management optimizations

  49. 49. Optimization summary
     • In short: you may get a linear decrease in memory for a linear increase in execution time → retains complexity class characteristics


  50. 50. ONGOING RESEARCH & FUTURE WORK

  51. 51. Event-driven live transformations
     • Problem: MT is mostly batch-like
        o But models are constantly evolving → frequent re-transformations are needed for
           - mapping
           - synchronization
           - constraint checking
           - …
     • An incremental PM can solve the performance problem, but a formalism is needed
        o to specify when to (re)act
        o and how.
     • Ideally, the formalism should be MT-like.

  52. 52. Event-driven live transformations (cont'd)
     • An idea: represent events as model elements.
     • Our take: represent events as changes in the matching set of a pattern.
        o ~generalization
     • Live transformations
        o maintain the context (variable values, global variables, …);
        o run as a "daemon", react whenever necessary;
        o as the models change, the system can react instantly, since everything needed is there in the RETE network: no re-computation is necessary.
     • Paper at ICMT2008: Live model transformations driven by incremental pattern matching.
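
Treating events as changes in a pattern's matching set can be sketched as a match set that fires callbacks when a match appears or disappears. This Python sketch is illustrative only: `LiveMatchSet`, the callback lists and the orphan-table example are assumptions, not the trigger formalism of the cited paper.

```python
class LiveMatchSet:
    """Matching set of one pattern that reports its own changes as events."""

    def __init__(self):
        self.matches = set()
        self.on_appear = []     # reactions to a new match
        self.on_disappear = []  # reactions to a lost match

    def update(self, sign, match):
        # Only a real change in the matching set counts as an event.
        if sign == "+" and match not in self.matches:
            self.matches.add(match)
            for action in self.on_appear:
                action(match)
        elif sign == "-" and match in self.matches:
            self.matches.remove(match)
            for action in self.on_disappear:
                action(match)


log = []
orphan_tables = LiveMatchSet()
orphan_tables.on_appear.append(lambda m: log.append(("delete table", m)))
orphan_tables.on_disappear.append(lambda m: log.append(("resolved", m)))

orphan_tables.update("+", "T1")  # a table loses its class: react
orphan_tables.update("+", "T1")  # duplicate notification: no event
orphan_tables.update("-", "T1")  # the match disappears: react again
print(log)
```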

  53. 53. Future work
     • Parallelization
        o A lot of research going on (GrGen.NET, FUJABA, …)
        o We have some tricks left to be explored, too
     • New applications of incremental PM
        o Live transformations → model synchronization, constraint evaluation, trace model generation
        o Stochastic model simulation → collaboration with Reiko's group
     • Adaptation of our technology to other platforms
        o ViatraEMF

  54. 54. Summary
     • Incremental pattern matching in VIATRA2
        o A leap forward in performance
        o Applicable to other tools
        o In-depth investigations revealed interesting details
     • Future
        o Performance will be further improved
        o We're working on new applications
     • Thank you for your attention.
        http://eclipse.org/gmt/VIATRA2
        http://wiki.eclipse.org/VIATRA2

