OCL'16 slides: Models from Code or Code as a Model?

In these slides, we talk about a model driver for Epsilon that allows exposing Java projects as regular models through the Java Development Tools DOM.

Published in: Software

  1. Models from Code or Code as a Model?
     Antonio García-Domínguez, Dimitris Kolovos
     Aston University, University of York
     OCL’16, October 2nd, 2016
     Outline: Intro, Proposal, Evaluation, Conclusion
  2. Using codebases to drive software engineering tasks
     Usual sequence of events
       • We had a business need
       • We invested in developing a program that covered it
       • Now we may need to:
           • Extract architecture or underlying business process?
           • Find bugs before they hit us?
           • Improve its design?
           • Migrate it to a new technology?
       • Code is the most accurate description: let’s use it
     How do we extract knowledge?
       • Regexps do not scale to complex tasks
       • We need something that understands the language
       • The “extractor” is embedded within a process
  3. Existing approaches
     Some well-known reverse engineering tools (“extractors”)
       • Eclipse MoDisco: EMF-based, implements KDM/ASTM
       • JaMoPP: EMF-based, uses custom metamodel
       • Moose: FAMIX-based tool
       • Rascal: extracts partial JDT representation into a model
     Issue: extractors are one-off processes
       • They produce standalone models: the code is no longer needed
       • However, current tools are not incremental: if the code is changed, the extraction has to be redone from scratch
     Issue: extractors are not query-aware
       • Some tasks will only access a small part of the codebase
       • Extracting the rest only adds overhead
  4. Epsilon EMC JDT driver: use IDE indices as models
     IDEs already extract representations all the time
       • Eclipse indexes Java projects in the background
       • Keeps fast pointers to classes / methods
       • Extra info available on demand through parsing
     JDT indices are under active improvement:
       • See Stefan Xenos’ talk at EclipseCon NA’16
       • Cross-pollination from CDT project (C++)
       • Faster, more thorough indexing coming in future releases
     Our proposal: Epsilon EMC JDT driver
       • Expose code as seen by IDE (JDT) as a model
       • On-demand loading + direct access to Java classes
       • Sources available on GitHub (epsilonlabs/emc-jdt)
  5. Making it possible: Epsilon architecture
     Core
       • Epsilon Object Language (EOL) = JavaScript + OCL
       • Epsilon Model Connectivity (EMC)
     Task-specific languages (extend EOL)
       • Model Validation (EVL)
       • Code Generation (EGL)
       • Model-to-model Transformation (ETL)
       • ...
     Technology-specific drivers (implement EMC)
       • Eclipse Modeling Framework (EMF)
       • Schema-less XML
       • Eclipse Java Development Tools (JDT)
       • CSV
       • ...
     Key points
       • All Epsilon languages are based on EOL
       • EOL accesses models through EMC interfaces
       • By implementing the EMC interfaces, all Epsilon languages can use JDT indices as models
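The layering on this slide can be illustrated with a small Python analogy. This is only a sketch of the general "one interface, many drivers" pattern: `Model`, `all_of_type`, `CsvModel`, `DictModel` and `names_of` are hypothetical names invented for the example and are not Epsilon's actual EMC API.

```python
from abc import ABC, abstractmethod
import csv
import io


class Model(ABC):
    """Hypothetical stand-in for a model connectivity interface (not Epsilon's API)."""

    @abstractmethod
    def all_of_type(self, type_name):
        """Return every element of the given type as a list of dicts."""


class CsvModel(Model):
    """Toy driver: every row of a CSV document is an element of type 'Row'."""

    def __init__(self, text):
        self._rows = list(csv.DictReader(io.StringIO(text)))

    def all_of_type(self, type_name):
        return list(self._rows) if type_name == "Row" else []


class DictModel(Model):
    """Toy driver: elements are grouped by type name in a plain dict."""

    def __init__(self, elements_by_type):
        self._elements = elements_by_type

    def all_of_type(self, type_name):
        return list(self._elements.get(type_name, []))


def names_of(model, type_name):
    """A 'language-level' query: it only ever sees the Model interface."""
    return sorted(element["name"] for element in model.all_of_type(type_name))


# Made-up sample data for both drivers.
csv_model = CsvModel("name,kind\nChart,class\nPlot,class\n")
dict_model = DictModel({"TypeDeclaration": [{"name": "Plot"}, {"name": "Axis"}]})
```

Because `names_of` depends only on the abstract interface, adding a new driver requires no change to the query code, which is the property the slide attributes to the EMC layer.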
  6. Configuration dialog for an EMC JDT model
     [Figure: screenshot of the configuration dialog]
  7. allInstances() in EMC JDT
     Reflection-based X.allInstances
       1. Parse Java sources in the projects on the fly
       2. Traverse the JDT Document Object Model with an ASTVisitor
       3. Use Java reflection to fetch instances of X
       4. Cache for later executions, if desired
     Special case: TypeDeclaration.allInstances
       • Searchable, lazy (no parsing unless looping over it)
       • Supports two new operations:
           • c.select(it|it.name = expr) uses JDT to search for the relevant compilation unit, parses it and returns the right DOM node
           • c.search(it|it.name = expr) works the same way, but returns the raw index entry (a simpler JDT SourceType)
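The difference between the generic traversal and the lazy, index-backed lookups can be sketched with a toy model in Python. This is an analogy under stated assumptions, not the driver's implementation: the node classes, `ToyCodeModel`, `search_type`, `select_type` and the "parser" functions are all hypothetical, and a name-to-class lookup plus `isinstance` stands in for Java reflection.

```python
class Node:
    """Base class of a toy syntax tree (a stand-in for JDT DOM nodes)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class TypeDeclaration(Node):
    pass


class MethodDeclaration(Node):
    pass


# Name-to-class lookup, standing in for reflection on the node type name.
NODE_TYPES = {cls.__name__: cls for cls in (Node, TypeDeclaration, MethodDeclaration)}


class ToyCodeModel:
    """Toy model whose `sources` dict doubles as the index: type name -> parser."""

    def __init__(self, sources):
        self._sources = sources   # type name -> zero-argument "parser" function
        self._cache = {}          # node class name -> list of instances
        self.parse_count = 0      # number of units parsed so far

    def _parse(self, unit_name):
        self.parse_count += 1
        return self._sources[unit_name]()

    def _walk(self, node):
        yield node
        for child in node.children:
            yield from self._walk(child)

    def all_instances(self, type_name):
        """Parse every unit, traverse it, filter by class, and cache per type."""
        if type_name not in self._cache:
            wanted = NODE_TYPES[type_name]
            found = []
            for unit_name in self._sources:
                for node in self._walk(self._parse(unit_name)):
                    if isinstance(node, wanted):
                        found.append(node)
            self._cache[type_name] = found
        return self._cache[type_name]

    def search_type(self, name):
        """Index-only lookup: returns the index key and parses nothing."""
        return name if name in self._sources else None

    def select_type(self, name):
        """Index lookup first, then parse only the matching unit."""
        if self.search_type(name) is None:
            return None
        return self._parse(name)


def chart_unit():
    return TypeDeclaration("Chart", [MethodDeclaration("draw"), MethodDeclaration("title")])


def plot_unit():
    return TypeDeclaration("Plot", [MethodDeclaration("render")])


model = ToyCodeModel({"Chart": chart_unit, "Plot": plot_unit})
```

In this toy, `search_type` never increments `parse_count`, `select_type` parses exactly one unit, and `all_instances` parses every unit once per requested type before serving later calls from the cache.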
  8. Case study: validate code against UML models
     Overview
       • We are maintaining a library, and we need to check its compliance with a UML model as it changes
       • Here, “compliance” means “must have all the classes and methods in the UML model” (code may have more)
       • Which is faster/more convenient: extracting a model with MoDisco first, or checking it directly with the EMC JDT driver?
  9. Experiment setup
     Inputs
       • Source code: JFreeChart 1.0.17, 1.0.18 and 1.0.19
       • UML model for 1.0.17 extracted by Modelio
     Tools
       • MoDisco 0.13.2 discoverer extracted 1 .xmi per version
       • Epsilon interim (3d4408), emc-jdt interim (5b5ea)
     Validation task: implemented in the Epsilon Validation Language
       • 1 version for MoDisco, 2 for EMC JDT (select/search)
       • Rules:
           1. Each UML class has its corresponding Java class
           2. Each UML method is implemented in that Java class
           3. Each nested class obeys the two above rules
       • 4 validation errors found in 1.0.18 and 1.0.19
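The three rules can be written down as a short recursive check. The following Python sketch is not the EVL script used in the experiment: `check_compliance`, the dictionary layout and the sample data are hypothetical, and methods are compared by name only (signatures and overloading are ignored).

```python
def check_compliance(uml_classes, java_classes, path=""):
    """Return a list of error messages; an empty list means the code complies.

    Both arguments map a class name to a dict with optional "methods"
    (list of names) and "nested" (dict of nested classes, same layout).
    """
    errors = []
    for name, uml in uml_classes.items():
        qualified = f"{path}{name}"
        java = java_classes.get(name)
        if java is None:
            # Rule 1: each UML class has its corresponding Java class.
            errors.append(f"missing class {qualified}")
            continue
        for method in uml.get("methods", []):
            # Rule 2: each UML method is implemented in that Java class.
            if method not in java.get("methods", []):
                errors.append(f"missing method {qualified}.{method}")
        # Rule 3: nested classes obey the same two rules.
        errors.extend(
            check_compliance(uml.get("nested", {}), java.get("nested", {}), f"{qualified}.")
        )
    return errors


# Made-up sample data: the code may have more than the UML model requires.
uml_model = {
    "Chart": {"methods": ["draw", "getTitle"], "nested": {"Builder": {"methods": ["build"]}}},
    "Legend": {"methods": []},
}
java_code = {
    "Chart": {"methods": ["draw", "getTitle", "extra"], "nested": {"Builder": {"methods": []}}},
}

errors = check_compliance(uml_model, java_code)
```

The extra `extra` method in the code is not reported, matching the slide's definition of compliance as "code may have more".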
  10. Performance results
      MoDisco validates all versions in 112.2s, JDT/select in 71.92s, JDT/search in 36.52s
      [Figure: stacked bar chart of execution time (s), on a 0 to 40 scale, for JFreeChart 1.0.17, 1.0.18 and 1.0.19 under MoDisco, JDT w/select and JDT w/search; series: Extract, Load, Validation]
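As a quick worked check of what the three totals imply, the relative speedups can be computed directly from the figures quoted on the slide. Only the three totals come from the slides; the helper name `speedup_over` is made up for this example.

```python
# Total time (seconds) to validate all three versions, as quoted on the slide.
totals = {"MoDisco": 112.2, "JDT/select": 71.92, "JDT/search": 36.52}


def speedup_over(baseline, other, times=totals):
    """How many times faster `other` is than `baseline`."""
    return times[baseline] / times[other]


select_vs_modisco = round(speedup_over("MoDisco", "JDT/select"), 2)
search_vs_modisco = round(speedup_over("MoDisco", "JDT/search"), 2)
search_vs_select = round(speedup_over("JDT/select", "JDT/search"), 2)
```

With these totals, JDT/select comes out at roughly 1.56 times faster than MoDisco, and JDT/search at roughly 3.07 times faster.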
  11. Performance discussion
      MoDisco: slower loading for faster validation
        • When using .xmi files, we load the entire model into memory
        • Pro: with everything in memory, validation is faster
        • Con: huge models won’t fit in memory
            • We’ll need a store with on-demand loading (e.g. CDO)
            • On-demand loading can change the performance profile
      Amortisation of extraction costs depends on the codebase
        • Frozen codebase (e.g. legacy systems):
            • Full extraction is quickly amortised
            • MoDisco is a better choice
        • Quickly changing codebase (e.g. actively developed systems):
            • Extracting on demand is usually better (models don’t live long)
            • EMC JDT is a better choice
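The amortisation argument can be made concrete with a simple cost model. This is a sketch under assumptions that are not in the slides: extraction is paid once per code change, loading and validation are paid on every run, and all timings below are invented numbers chosen only to show the crossover, not measurements.

```python
def extract_first_cost(changes, runs, extract, load, validate):
    """Total time when a standalone model is re-extracted after every change."""
    return changes * extract + runs * (load + validate)


def on_demand_cost(runs, validate):
    """Total time when the code is queried directly on every run."""
    return runs * validate


# Hypothetical per-step timings in seconds (not taken from the experiment).
EXTRACT, LOAD, FAST_VALIDATE, SLOW_VALIDATE = 20, 10, 2, 15

# Frozen codebase: one extraction, many validation runs.
frozen = (
    extract_first_cost(changes=1, runs=50, extract=EXTRACT, load=LOAD, validate=FAST_VALIDATE),
    on_demand_cost(runs=50, validate=SLOW_VALIDATE),
)

# Quickly changing codebase: every run follows a change.
changing = (
    extract_first_cost(changes=50, runs=50, extract=EXTRACT, load=LOAD, validate=FAST_VALIDATE),
    on_demand_cost(runs=50, validate=SLOW_VALIDATE),
)
```

Under these made-up timings the extract-first approach wins for the frozen codebase and loses for the changing one, which is the shape of the trade-off the slide describes.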
  12. Conclusion and future lines of work
      Summary
        • Codebases are a valuable input for many SE tasks
        • Two options to query codebases:
            • Extract standalone models (MoDisco)
            • Use code directly as a model (EMC JDT)
        • EMC JDT is faster for changing codebases
      Future work
        • Further optimisations to improve performance
        • Evaluate impact of future JDT versions
        • More filtering fields for searchable collections
        • More shorthand properties for common scenarios
        • Port approach to other languages (e.g. C++ through CDT)
  13. End of the presentation
      Questions?
      @antoniogado
  14. Extra features in EMC JDT
      Shorthand properties
        • Quick access to commonly needed information
        • EMC PropertyGetter computes value on demand
            • FieldDeclaration: “name”
            • BodyDeclaration: “public”, “static”...
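The idea of a getter that computes shorthand values on demand can be sketched in Python as follows. This is not the driver's PropertyGetter: `ToyFieldDeclaration`, `get_property` and the modifier set are hypothetical, and resolving "name" to the first declared variable is an assumption made for the example.

```python
class ToyFieldDeclaration:
    """Toy node that stores raw data only, the way a DOM node would."""

    def __init__(self, modifiers, fragments):
        self.modifiers = list(modifiers)   # e.g. ["public", "static"]
        self.fragments = list(fragments)   # declared variable names


# Illustrative set of modifier shorthands; the slide only lists "public" and "static".
MODIFIER_SHORTHANDS = {"public", "private", "protected", "static", "final", "abstract"}


def get_property(node, prop):
    """Resolve shorthand properties on demand, else fall back to real attributes."""
    if prop in MODIFIER_SHORTHANDS:
        return prop in node.modifiers
    if prop == "name":
        # Assumption: the shorthand name is that of the first declared variable.
        return node.fragments[0]
    return getattr(node, prop)


field = ToyFieldDeclaration(["public", "static"], ["DEFAULT_WIDTH"])
```

Nothing is precomputed or stored on the node: each shorthand is derived from the raw data at the moment it is requested.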
