Model driven retrieval of model repositories
 

Model-Driven Development (MDD) is a software development methodology that focuses on the creation and maintenance of domain models as the primary form of expression in the development cycle. One of the fundamental characteristics of this approach is the reuse of software artifacts through their model representation. However, software reuse is impaired because current systems lack an efficient way to search model repositories: many existing solutions do not account for the relationships between model artifacts. These relationships matter for satisfying the user's information need in a model-driven development scenario.

This thesis defines a model-driven methodology for creating model search engines. Unlike many related works, this methodology is metamodel-independent and exploits the metamodel of the searched project models to obtain more precise results. A prototype has been implemented to support the methodology.
We address two case studies that deal with indexing and retrieving models from two collections, of UML and WebML projects respectively. Each case study involves several experiments adopting different indexing strategies. Finally, after manually building the ground truth for each repository, we evaluated the results using established Information Retrieval measures such as DCG, MRR, MAP, Precision and Recall.
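
The evaluation measures named above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the thesis. All function names are hypothetical, DCG uses the common log2(rank + 1) discount (the thesis may use a different variant), and the ideal DCG is obtained by re-sorting the retrieved gains rather than from the full ground truth.

```python
import math

def dcg(gains, k=None):
    """Discounted cumulative gain of a ranked list of graded relevance values."""
    ranked = gains[:k] if k is not None else gains
    # Rank 1 has discount log2(2) = 1, rank 2 has log2(3), and so on.
    return sum(g / math.log2(i + 2) for i, g in enumerate(ranked))

def ndcg(gains, k=None):
    """DCG divided by the DCG of the same gains in ideal (descending) order."""
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0

def reciprocal_rank(relevant_flags):
    """1 / rank of the first relevant result, or 0 if none is relevant."""
    for rank, rel in enumerate(relevant_flags, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def average_precision(relevant_flags, total_relevant):
    """Mean of precision@rank taken at each relevant result."""
    hits, score = 0, 0.0
    for rank, rel in enumerate(relevant_flags, start=1):
        if rel:
            hits += 1
            score += hits / rank
    return score / total_relevant if total_relevant else 0.0

def precision_recall(relevant_flags, total_relevant):
    """Precision and recall of one result list."""
    hits = sum(1 for rel in relevant_flags if rel)
    precision = hits / len(relevant_flags) if relevant_flags else 0.0
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall

def mean(values):
    """MRR and MAP are the means of the per-query RR and AP values."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0

# Two made-up queries: each list flags whether the result at that rank is relevant.
runs = [[False, True, False], [True, False, True]]
mrr = mean(reciprocal_rank(r) for r in runs)
map_score = mean(average_precision(r, total_relevant=2) for r in runs)
```

The per-query functions take one ranked result list at a time; `mean` aggregates them over the query set, which is how MRR and MAP are usually reported.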

  • The tests for the UML case study involve different types of keyword-based query. Each type, called a “meta-query” in the following, has different characteristics in terms of the document that is searched by the query (e.g., project, class) and in terms of the information need that is expressed through the query (e.g., the user may want to search for a specific project or for all the projects related to a topic). We first outlined a set of five meta-queries, then we chose two of them. For each of these, we built a set of ten instances that we used to test the UML case. The tests for the WebML case involve a set of ten queries.
  • An attempt to take the structure of the model into account within a text-based approach. It does not improve retrieval in terms of relevance, but it is useful in “exploratory” cases. Besides retrieving the classes relevant to a query, Experiment D also retrieves their neighboring classes, which are not necessarily relevant to that query. These neighboring classes appear among the results because they have imported terms that are part of the query string. Since their “content” field is larger due to the imported terms, those neighboring classes are penalized by the FieldNorm and, at the same time, the truly relevant classes are ranked higher, so the results are better. To conclude, the FieldNorm helps when it penalizes classes that are retrieved only because they are neighbors of relevant classes, but it gives misleading results when it penalizes the relevant classes because of the larger size of their “content” field after the import algorithm.
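
The FieldNorm effect described in the note above can be illustrated with a small sketch. It assumes the FieldNorm behaves like the length normalization of Lucene's classic TF-IDF similarity, 1 / sqrt(number of terms in the field), and ignores index-time boosts and the lossy encoding Lucene applies when storing norms. The function name is hypothetical, and the split between own and imported terms is illustrative, loosely based on the Entry class shown in the slides.

```python
import math

def length_norm(num_terms):
    """Classic length normalization: longer fields get a smaller norm."""
    return 1.0 / math.sqrt(num_terms) if num_terms > 0 else 0.0

# Terms a class contributes on its own (illustrative).
own_terms = ["name", "type", "allFields", "fields", "predicate"]
# Terms imported from neighboring classes by the import algorithm (illustrative).
imported_terms = ["location", "commentsBefore", "commentsAfter"]

norm_before = length_norm(len(own_terms))
norm_after = length_norm(len(own_terms) + len(imported_terms))

# The same matching term scores lower after the import, because the norm
# multiplies the term's contribution to the document score.
print(f"norm before import: {norm_before:.4f}")
print(f"norm after import:  {norm_after:.4f}")
```

Under this assumption, every class that imports terms sees its norm shrink, which is why the penalty can fall on relevant classes as well as on neighbors that match only through imported terms.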

Presentation Transcript

  • Politecnico di Milano, POLO TERRITORIALE DI COMO
    Master of Science in Computer Engineering
    Model-Driven Retrieval of Model Repositories
    Master graduation thesis by: Stefano Celentano (ID: 755287), Lorenzo Furrer (ID: 750213)
    Supervisor: Prof. Marco Brambilla
    Assistant Supervisor: Prof. Alessandro Bozzon
  • Introduction
    • Software model retrieval is essential for the paradigm of Model-Driven Development (MDD)
    • Current systems lack efficient and standardized methodologies
      • The metamodel is not taken into account
    • Our contributions:
      • A methodology for model-driven retrieval of model repositories that takes the metamodels into account
      • The development of a prototype for this methodology
      • Two case studies
      • Evaluation of different test configurations
  • Outline
    • Model retrieval approaches
    • MDD and Metamodeling
    • Our Approach: Introduction & Methodology
      • Abstract Solution
      • Design Dimensions
      • Indexing Strategies
    • Prototype Architecture
    • Case Studies
      • UML Case Study
      • WebML Case Study
    • Tests and evaluation (Prototype & Case studies)
    • Future works
  • Model Retrieval Approaches
    • Text-based
      • Model representation: unstructured document (bag of words) (e.g., Vector Space Model, Tf-idf)
      • Query type: keyword-based
      • Matching algorithm: standard IR similarity measures (e.g., cosine similarity)
    • Content-based
      • Model representation: model structure is taken into account (e.g., graph-based)
      • Query type: search by example
      • Matching algorithm: ad-hoc algorithms (depends on the model representation)
  • Model-Driven Development and Metamodeling
    • A fundamental concept: Metamodel
    • MOF (Meta-Object Facility)
    • [Figure: metamodeling layers, labeled "Meta-metamodel", "«metamodel»", "Instance"]
  • Our Approach (1/3): Abstract Solution
  • Our Approach (2/3): Design Dimensions
    • Segmentation Granularity
      • Whole project
      • Subproject
      • Project concept
    • Index structure
      • Flat
      • Weighted
      • Multi-field
      • Hybrid (e.g., multi-field index containing weighted terms)
    • Query type
      • Keyword-based search
      • Search by example
    • Result presentation
      • Snippet visualization
      • Faceted search
  • Our Approach (3/3): Indexing Strategies
    • Experiment A: segmentation granularity: whole project; index: flat; index term weights: no
    • Experiment B: segmentation granularity: metamodel concept; index: multi-field; index term weights: no
    • Experiment C: segmentation granularity: metamodel concept; index: multi-field; index term weights: assigned according to the metamodel concept
    • Experiment D*: segmentation granularity: metamodel concept; index: multi-field; index term weights: assigned according to the metamodel concept
    * The indexing phase includes a graph-based algorithm that enriches the document representation of a model element with information extracted from its neighboring elements.
  • Prototype Architecture
    • Based on SMILA: an extensible framework for building search solutions to access unstructured information.
    • Uses Apache Solr: a scalable search platform featuring full-text search.
    • [Figure: prototype architecture, labeled Configurator, Data Source, Crawler, Router, Queue, Listener, BPEL pipeline, Processor, Analyzers, Index, Apache Solr]
  • Case Studies
    • UML Class Diagram
      • 84 meta-models from AtlanMod
      • Small size
      • General purpose
    • WebML
      • 12 real-life industrial projects
      • Large size
      • Large quantity of concepts
      • Domain specific
  • UML Case Study: Experiment A
    • Granularity: Project
    • Index: Flat
    • Content Field: location commentsBefore commentsAfter entries predicates name type allFields fields predicate name expression field value LocatedElement Query Entry Field Predicate Expression
  • UML Case Study: Experiment B
    • Granularity: Class
    • Index: Multi-Field
    • ProjectName Field: BQL
    • ClassName Field: Entry
    • AttributeNames Field: name type allFields fields predicate
  • UML Case Study: Experiment C
    • Granularity: Class
    • Index: Multi-Field, Weighted
    • ProjectName Field: BQL|1.0
    • ClassName Field: Entry|1.7
    • AttributeNames Field: name|1.0 type|1.0 allFields|1.0 fields|1.5 predicate|1.6
  • UML Case Study: Experiment D
    • Granularity: Class
    • Index: Multi-Field, Weighted
    • #HOP = 1
    • ProjectName Field: BQL|1.0
    • ClassName Field: Entry|1.7
    • AttributeNames Field: name|0.75 location|0.9 commentsBefore|0.9 commentsAfter|0.9 name|1.0 type|1.0 allFields|1.0 predicate|1.6 fields|1.3 Predicate|0.765 Query|0.816 Field|0.85 LocatedElement|0.9
  • WebML Case Study: Experiment B
    • Granularity: Area
    • Index: Multi-Field
    • AreaName Field: Book requests
    • Content Field: Book requests Create book ConnectUserToBook New book request New book User Book request list
  • WebML Case Study: Experiment C
    • Granularity: Area
    • Index: Multi-Field, Weighted
    • AreaName Field: Book|1.2 requests|1.2
    • Content Field: Create|1.0 book|1.0 ConnectUserToBook|1.0 New|1.1 book|1.1 request|1.1 New|1.0 Book|1.0 User|1.0 Book|1.1 request|1.1 list|1.1
  • Tests and Evaluation: Meta-queries
    • Meta-query 1: type of searched document: Project; information need: all projects related to one specific topic
    • Meta-query 2: type of searched document: Project; information need: all projects related to one general topic
    • Meta-query 3: type of searched document: Pattern; information need: searches for a pattern by using as query string the terms belonging to different classes connected by some relation
    • Meta-query 4: type of searched document: Class; information need: searching for a class using as query string all (or some) of the terms belonging to a class
    • Meta-query 5: type of searched document: Class; information need: searching for a class using as query string some of the terms belonging to a class and some terms related to the project
  • UML Experiment A (Project Granularity, Flat Index)
    • DCG and iDCG are very close in the first 3 positions.
    • Always able to retrieve the most relevant document in the first position.
  • Other UML Experiments
    • The weighted experiment is always better than the non-weighted one.
    • Both Experiments B and C are close to the ideal curve in the first positions.
    • Experiment D is meant to answer a different user need than the one captured by the ground truth that was used.
  • WebML Experiments
    • Experiments B and C perform identically up to the third position.
    • After that, the experiment using weights always performs slightly better than the non-weighted one.
  • Conclusions
    • The system has been tested with both a general-purpose and a domain-specific modeling language.
    • Good performance in the first rank positions.
    • The weighted case always performs better than or equal to the others, albeit slightly.
    • The prototype has shown good results in retrieving documents that are relevant in terms of conceptual and terminological similarity.
    • Structural similarity is difficult to capture in a text-based search.
  • Future Directions
    • Integrating a content-based solution
    • Metamodel integration
    • Testing more configurations
    • Weight training
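
The graph-based enrichment used in Experiment D (importing terms from neighboring model elements within a fixed number of hops) can be sketched as a bounded breadth-first traversal. This is only an illustration under stated assumptions. The function name, the toy graph, and the per-hop decay factor are hypothetical; the thesis assigns weights according to the metamodel concept, which this sketch does not reproduce.

```python
from collections import deque

def enrich(graph, terms, start, max_hops=1, decay=0.75):
    """Return {term: weight} for `start`, importing terms from elements
    reachable within `max_hops`. Own terms get weight 1.0; imported terms
    get decay ** hops (an assumed weighting scheme)."""
    weights = {t: 1.0 for t in terms.get(start, ())}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            w = decay ** (hops + 1)
            for t in terms.get(neighbor, ()):
                # Keep the highest weight if a term is seen more than once.
                weights[t] = max(weights.get(t, 0.0), w)
            queue.append((neighbor, hops + 1))
    return weights

# Toy model: class names echo the slides, but the relations are made up.
graph = {"Entry": ["LocatedElement", "Predicate"], "Predicate": ["Expression"]}
terms = {
    "Entry": ["name", "type"],
    "LocatedElement": ["location", "commentsBefore"],
    "Predicate": ["expression"],
    "Expression": ["value"],
}

one_hop = enrich(graph, terms, "Entry", max_hops=1)
two_hops = enrich(graph, terms, "Entry", max_hops=2)
```

With `max_hops=1` the terms of `Expression` are not imported, because that class is two hops away from `Entry`; raising the limit to 2 brings them in at a lower weight.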