Integrating Efficient Model Queries in State-of-the-art EMF Tools

Gábor Bergmann, Ábel Hegedüs, Ákos Horváth, István Ráth, Zoltán Ujhelyi and Dániel Varró



Budapest University of Technology and Economics
Fault Tolerant Systems Research Group
TOOLS Europe 2012, Prague, Czech Republic, 2012. 05. 30.
Overview
   Introduction
   EMF-INCQUERY overview
   Transparent query integration
   Performance considerations
   Conclusion
MDE today
 [Figure: modeling techniques (integrated model-driven tools) and the IDE across the development stages REQ, SD, IMP, TST]
    o Model-based early analysis
    o Architecture modeling, DSMLs
    o Code generation, model-assisted development
    o Model-based test generation
MDE today
 [Figure: the stages REQ, SD, IMP, TST, with the Eclipse Modeling Framework as the modeling technique and Eclipse as the IDE]
A key problem of MDE
 Scalability vs. modeling tools
   o Issues encountered by several industrial partners using
     tools based on the Eclipse Modeling Framework (EMF)
   o Modeling scenarios can get really complex really quickly
      • Instance models of 1-2M elements and beyond
      • Performance issues with model transformations and code
        generators have an adverse effect on everyday development tasks
 Scalability? (Recognized e.g. by AUTOSAR tool vendors: http://wiki.eclipse.org/Auto_IWG_WP2)
   o Complex (meta)models
   o Large instance models
   o Complex query and manipulation scenarios
Focus: model queries
 Model queries: “a piece of code that retrieves a given set of the
  model”
 Queries are at the heart of MDE
   o Every model access/read is a (simple) query
   o More complex: Views, content providers
   o Most complex: Model transformations, code generators, …
 Query performance (= the speed of content retrieval) is crucial
   o First evaluation vs. consecutive re-evaluations vs. throughput
   o Query result vs. query contents
 Core problems
   1. Slow query engines
   2. Considerable programming effort necessary to get complex
      queries right
EMF-INCQUERY
Overview
 EMF-INCQUERY: a model query engine
  o Supports batch queries
  o Optimized for incremental queries!
 Incremental evaluation
  o Based on the RETE algorithm
  o Compute once, update afterwards
  o Gain: Instant re-evaluation
  o Price: Uses some more memory
     • Manageable with proper life cycles
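The "compute once, update afterwards" idea can be illustrated with a minimal Java sketch. It is not EMF-INCQUERY code and does not use RETE: the class `IncrementalQuerySketch`, its threshold query and its `added`/`removed` callbacks are hypothetical stand-ins for a query and for model change notifications. The cached result set is what costs the extra memory.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Toy incremental query: "all values above a threshold" (hypothetical example). */
final class IncrementalQuerySketch {
    private final int threshold;
    private final Set<Integer> model = new HashSet<>();   // stand-in for the model contents
    private final Set<Integer> matches = new HashSet<>(); // cached result set (extra memory)

    IncrementalQuerySketch(int threshold) {
        this.threshold = threshold;
    }

    /** Change notification: a value was added. Only the affected match is touched. */
    void added(int value) {
        if (model.add(value) && value > threshold) {
            matches.add(value);
        }
    }

    /** Change notification: a value was removed. */
    void removed(int value) {
        if (model.remove(value)) {
            matches.remove(value);
        }
    }

    /** Retrieval reads the cache and does not traverse the model. */
    Set<Integer> matches() {
        return Collections.unmodifiableSet(matches);
    }

    /** Batch evaluation for comparison: traverses the whole model every time. */
    Set<Integer> evaluateBatch() {
        Set<Integer> result = new HashSet<>();
        for (int value : model) {
            if (value > threshold) {
                result.add(value);
            }
        }
        return result;
    }
}
```

Both access paths return the same set; the difference is that `matches()` costs the same regardless of model size, while `evaluateBatch()` grows with it.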
[Figure: architecture of the query engine]
    o Input = model contents + changes (EMF notifications), coming from the EMF instance model
    o Query engine: generated query components, derived from the query definition
    o RETE network: input nodes, intermediate nodes, output nodes
    o Delta monitor on the output nodes
    o Output = query results + (subsequent) query result deltas
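The dataflow described on this slide (input nodes fed by change notifications, an intermediate node, output plus deltas) can be sketched as a single join node in Java. This is a simplified illustration, not the engine's implementation: the class `JoinNodeSketch`, the teacher/course/student tuples and the string-based delta log are all assumptions made for the example, and a real RETE network would index its memories instead of scanning them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** One join node with two input memories (hypothetical teacher/course/student tuples). */
final class JoinNodeSketch {
    private final Set<List<String>> left = new HashSet<>();   // input memory: (teacher, course)
    private final Set<List<String>> right = new HashSet<>();  // input memory: (course, student)
    private final Set<List<String>> output = new HashSet<>(); // output: (teacher, course, student)
    private final List<String> deltas = new ArrayList<>();    // simple delta monitor log

    /** A (teacher, course) tuple appeared or disappeared in the model. */
    void updateLeft(String teacher, String course, boolean insert) {
        List<String> tuple = Arrays.asList(teacher, course);
        if (insert ? !left.add(tuple) : !left.remove(tuple)) {
            return; // memory unchanged, nothing to propagate
        }
        for (List<String> r : right) { // linear scan; a real network would use an index
            if (r.get(0).equals(course)) {
                emit(Arrays.asList(teacher, course, r.get(1)), insert);
            }
        }
    }

    /** A (course, student) tuple appeared or disappeared in the model. */
    void updateRight(String course, String student, boolean insert) {
        List<String> tuple = Arrays.asList(course, student);
        if (insert ? !right.add(tuple) : !right.remove(tuple)) {
            return;
        }
        for (List<String> l : left) {
            if (l.get(1).equals(course)) {
                emit(Arrays.asList(l.get(0), course, student), insert);
            }
        }
    }

    /** Updates the output memory and records the delta. */
    private void emit(List<String> match, boolean insert) {
        if (insert) {
            output.add(match);
        } else {
            output.remove(match);
        }
        deltas.add((insert ? "+" : "-") + match);
    }

    Set<List<String>> output() {
        return new HashSet<>(output);
    }

    List<String> deltas() {
        return new ArrayList<>(deltas);
    }
}
```

Each update touches only the tuples that join with the changed one, which is why results and result deltas are available right after a model change.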
Benefits of EMF-INCQUERY
 Makes on-the-fly well-formedness validation, view
  maintenance, … feasible over really large instance
  models
 Simplifies writing really complex queries
  o Graph pattern language
  o Highly reusable: query libraries
 Easy-to-integrate into existing apps
  o Works with any EMF domain metamodel
  o Integrates through Eclipse standards: works with
    many EMF-based apps out-of-the-box
DEMO: INCQUERY Pattern Language
 School metamodel in EMF
INCQUERY Pattern Language
 Expressive declarative query language based on graph patterns
    o Captures local + global queries
    o Compositionality + reusability
    o "Arbitrary" recursion, negation
New features with version 0.6
 Tools
  o Xtext2-based language
     • Content assist (code completion), validation, outline, …
  o Unique query language features
     • Unlimited recursion and transitive closure
     • Aggregate functions
     • Match/exceed the expressive power of OCL, while providing more
       flexible re-use
 Runtime
  o Build and execute queries on-the-fly
  o Strong generative bindings to Java
  o Efficient incremental transitive closure
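What "incremental transitive closure" means can be shown with a small Java sketch. It only handles edge insertions and is not the algorithm used by EMF-INCQUERY, which the slides do not describe; the class `IncrementalClosureSketch` and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Maintains reachability under edge insertions only (deletions are not handled). */
final class IncrementalClosureSketch {
    // reach.get(x) = all nodes reachable from x through one or more edges
    private final Map<String, Set<String>> reach = new HashMap<>();

    void addEdge(String from, String to) {
        // Sources: 'from' plus every node that already reaches 'from'.
        Set<String> sources = new HashSet<>();
        sources.add(from);
        for (Map.Entry<String, Set<String>> entry : reach.entrySet()) {
            if (entry.getValue().contains(from)) {
                sources.add(entry.getKey());
            }
        }
        // Targets: 'to' plus everything 'to' already reaches (copied to avoid aliasing).
        Set<String> targets = new HashSet<>();
        targets.add(to);
        Set<String> known = reach.get(to);
        if (known != null) {
            targets.addAll(known);
        }
        // Only the new source/target pairs are added; the rest of the closure is untouched.
        for (String source : sources) {
            reach.computeIfAbsent(source, key -> new HashSet<>()).addAll(targets);
        }
    }

    boolean reaches(String from, String to) {
        Set<String> targets = reach.get(from);
        return targets != null && targets.contains(to);
    }
}
```

After `addEdge("a", "b")`, `addEdge("c", "d")` and `addEdge("b", "c")`, the call `reaches("a", "d")` holds without recomputing the closure from scratch. Supporting edge deletions is considerably harder and is left out of this sketch.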
INCQUERY Development Tools
[Figure: Pattern Editor and Query Explorer]
• Works with most EMF-based editors out-of-the-box
• Reveals matches as selection
• Queries are applied and updated on-the-fly
INCQUERY Model Validation
[Figure: standard Eclipse BPMN Editor, with markers in the Problems View]
• Works with most EMF-based tools out-of-the-box
• Manages error-warning markers on-the-fly as the user is editing the model
  = Instantaneous feedback
TRANSPARENT QUERY
   INTEGRATION

  … beyond the development tools
Derived features in EMF
[Figure: metamodel diagram with three features marked <<derived>>]
Derived features:
• Supported by auxiliary mechanisms (e.g. Java, OCL)
• Supported by EMF-based tools out-of-the-box
Derived attributes:
• Typed scalar values
• Supported by INCQUERY derivatives
Derived edges:
• Binary relationships
• Supported by binary INCQUERY patterns
INCQUERY derivatives
You can combine IQPL with (side-effect free, deterministic) Xbase
Key benefits:
• Automatic dependency-aware change notifications for derived features
• Instantaneous result retrieval
Example: derived attribute “hasPrimeWeight” is true iff pattern “coursePrimeWeight” matches for host C
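The two key benefits can be illustrated with a plain Java sketch of a derived attribute. Only the name `hasPrimeWeight` comes from the slide; the class `CourseSketch`, its `weight` attribute and the listener mechanism are assumptions, and the code uses neither generated EMF classes nor the EMF notification API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a Course model element with a derived attribute. */
final class CourseSketch {
    private int weight;              // assumed base attribute
    private boolean hasPrimeWeight;  // cached value of the derived attribute
    private final List<Consumer<Boolean>> listeners = new ArrayList<>();

    CourseSketch(int weight) {
        this.weight = weight;
        this.hasPrimeWeight = isPrime(weight);
    }

    /** Registers a listener for changes of the derived attribute. */
    void addDerivedListener(Consumer<Boolean> listener) {
        listeners.add(listener);
    }

    /** Changing the base attribute re-derives the value and notifies only on a real change. */
    void setWeight(int newWeight) {
        weight = newWeight;
        boolean now = isPrime(newWeight);
        if (now != hasPrimeWeight) {
            hasPrimeWeight = now;
            for (Consumer<Boolean> listener : listeners) {
                listener.accept(now);
            }
        }
    }

    int getWeight() {
        return weight;
    }

    /** Retrieval returns the cached value without recomputation. */
    boolean hasPrimeWeight() {
        return hasPrimeWeight;
    }

    /** Side-effect free, deterministic check, in the spirit of the Xbase restriction. */
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }
}
```

Changing the weight from 7 to 11 leaves the derived value at true, so no notification is sent; changing it to 9 flips the value and notifies the listeners once.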
PERFORMANCE BENCHMARK

     Instantaneous result retrieval?
Benchmarking scenario
 In-memory models
   o Embedded railways control software system design domain
   o Instance models up to 2.7M elements
 Scenario: well-formedness validation
   o 5 rules, from simple to very complex
   o Batch validation: loading models + executing queries
   o On-the-fly revalidation: modifying models + re-executing
     queries instantaneously
 Tools
   o Eclipse tools (OCL, INCQUERY, EMF-Java)
   o RDF/SPARQL engines (for comparison)
 http://viatra.inf.mit.bme.hu/publications/trainbenchmark
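The difference between the two measured phases can be sketched in Java by counting how many elements each phase inspects. This is not one of the five benchmark rules: the rule "every segment has a positive length", the class `RevalidationSketch` and its methods are hypothetical, and a counter replaces wall-clock timing so that the example stays deterministic.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical well-formedness rule: every segment must have a positive length. */
final class RevalidationSketch {
    private final Map<String, Integer> segmentLength = new HashMap<>();
    private final Set<String> violations = new HashSet<>();
    private long checks = 0; // number of elements inspected so far

    private boolean violates(int length) {
        checks++;
        return length <= 0;
    }

    /** Loads an element without validating it. */
    void load(String id, int length) {
        segmentLength.put(id, length);
    }

    /** Batch validation: inspects every element of the loaded model. */
    Set<String> validateBatch() {
        violations.clear();
        for (Map.Entry<String, Integer> entry : segmentLength.entrySet()) {
            if (violates(entry.getValue())) {
                violations.add(entry.getKey());
            }
        }
        return new HashSet<>(violations);
    }

    /** On-the-fly revalidation: a modification re-checks only the changed element. */
    void setLength(String id, int length) {
        segmentLength.put(id, length);
        if (violates(length)) {
            violations.add(id);
        } else {
            violations.remove(id);
        }
    }

    Set<String> violations() {
        return new HashSet<>(violations);
    }

    long checks() {
        return checks;
    }
}
```

With 1000 loaded segments, one batch validation performs 1000 checks, while each later modification adds exactly one more check.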
Revalidation on-the-fly
• SPARQL tools are generally 1-2 orders of magnitude (OM) slower than top EMF tools
• Incremental EMF engines typically 5-10x faster
• Sub-100ms response times for up to 1.5M elements
A closer look at the top
• Performance advantage of OCL-IA over OCL is not significant with complex queries
• INCQUERY is 1-1.5 orders of magnitude faster than OCL-IA
Memory usage
• Incremental engines
  impose a linear memory
  consumption overhead
• INCQUERY overhead is
  larger than that of OCL-IA
• Even the largest models
  fit into 1GB of RAM
Final points
 EMF-INCQUERY has been proposed as an
  Eclipse.org project under EMFT
  http://www.eclipse.org/proposals/modeling.emf.incquery/
 EMF-INCQUERY 0.6 preview release is available
  immediately
  o Pointers:
    http://viatra.inf.mit.bme.hu/incquery
    http://viatra.inf.mit.bme.hu/incquery/getting_started
    http://viatra.inf.mit.bme.hu/performance
 Release 0.6 is scheduled shortly after the Eclipse
  Juno Launch (beginning of July)
Acknowledgements
 Additional development team members
  o Márk Czotter, Tamás Szabó
 Performance benchmarking
  o Benedek Izsó, Zoltán Szatmári
 Testing, examples
  o Attila Csicsely, Tamás Csurgó, Dániel Kávássy, Gábor
    Szárnyas, Tamás Tóth



 Thank you very much!
