Model driven retrieval of model repositories

Politecnico di Milano
POLO TERRITORIALE DI COMO
Master of Science in Computer Engineering

Model-Driven Retrieval of Model Repositories
Master graduation thesis by:
Stefano Celentano, ID: 755287
Lorenzo Furrer, ID: 750213
Supervisor: Prof. Marco Brambilla
Assistant Supervisor: Prof. Alessandro Bozzon


Introduction
• Software model retrieval is essential for the Model-Driven Development (MDD) paradigm
• Current systems lack efficient and standardized methodologies
• The metamodel is not taken into account
• Our contributions:
  • A methodology for model-driven retrieval of model repositories that takes the metamodels into account
  • The development of a prototype for this methodology
  • Two case studies
  • Evaluation of different test configurations


Outline
Introduction & Methodology
• Model retrieval approaches
• MDD and Metamodeling
• Our Approach:
  • Abstract Solution
  • Design Dimensions
  • Indexing Strategies
Prototype & Case studies
• Prototype Architecture
• Case Studies
  • UML Case Study
  • WebML Case Study
• Tests and evaluation

• Future works


Model Retrieval Approaches
• Text-based
  • Model representation: unstructured document (bag of words) (e.g., Vector Space Model, Tf-idf)
  • Query type: keyword-based
  • Matching algorithm: standard IR similarity measures (e.g., cosine similarity)

• Content-based
  • Model representation: model structure is taken into account (e.g., graph-based)
  • Query type: search by example
  • Matching algorithm: ad-hoc algorithms (depends on the model representation)
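
The text-based approach above can be illustrated with a minimal sketch: each model is treated as a bag of words, weighted with Tf-idf, and matched against a keyword query by cosine similarity. The toy repository, the whitespace tokenizer, and the smoothed idf formula are assumptions made for this example, not details taken from the thesis.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a real analyzer would do more."""
    return text.lower().split()


def build_tfidf(docs):
    """Return (vectors, idf) for a dict mapping doc_id -> text."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))
    # Smoothed idf (an assumption): terms found in every document stay positive.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = {
        doc_id: {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for doc_id, tokens in tokenized.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse term -> weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, vectors, idf):
    """Rank all documents against a keyword query, best match first."""
    q_tf = Counter(tokenize(query))
    q_vec = {t: tf * idf[t] for t, tf in q_tf.items() if t in idf}
    scored = [(doc_id, cosine(q_vec, vec)) for doc_id, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy repository: one bag of words per model.
MODELS = {
    "library": "book author loan member library",
    "shop": "product order customer cart payment",
    "school": "student course teacher enrollment",
}
VECTORS, IDF = build_tfidf(MODELS)
RESULTS = search("book loan", VECTORS, IDF)
```

In this sketch only the "library" model shares terms with the query, so it is ranked first and the other two models score zero. Nothing about the model structure is used, which is the limitation the content-based approaches address.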


Model-Driven Development and Metamodeling

• A fundamental concept: Metamodel
• MOF (Meta-Object Facility)

[Figure: metamodeling levels: Meta-metamodel, «metamodel», Model, Instance]


Our Approach (1/3): Abstract Solution


Our Approach (2/3): Design Dimensions
• Segmentation Granularity
  • Whole project
  • Subproject
  • Project concept
• Index structure
  • Flat
  • Weighted
  • Multi-field
  • Hybrid (e.g., multi-field index containing weighted terms)
• Query type
  • Keyword-based search
  • Search by example
• Result presentation
  • Snippet visualization
  • Faceted search


Our Approach (3/3): Indexing Strategies
Experiment      Segmentation granularity   Index         Index term weights
Experiment A    Whole project              Flat          No
Experiment B    Metamodel concept          Multi-field   No
Experiment C    Metamodel concept          Multi-field   Assigned according to the metamodel concept
Experiment D*   Metamodel concept          Multi-field   Assigned according to the metamodel concept

* The indexing phase includes a graph-based algorithm that enriches the document representation of a model element with information extracted from its neighboring elements.
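
The enrichment step used in Experiment D can be pictured with a minimal sketch: every model element keeps its own terms at full weight and borrows the terms of its direct neighbors at a reduced weight. The one-hop neighborhood, the `neighbor_weight` discount, and the sample classes are assumptions for illustration; the slides do not specify the actual algorithm.

```python
from collections import defaultdict


def enrich(element_terms, edges, neighbor_weight=0.5):
    """Return element -> {term: weight}, adding the terms of direct neighbors.

    element_terms: dict mapping an element name to its list of terms
    edges: iterable of (a, b) pairs, treated as undirected relations
    neighbor_weight: hypothetical discount applied to borrowed terms
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    enriched = {}
    for element, terms in element_terms.items():
        weights = defaultdict(float)
        # The element's own terms count fully.
        for term in terms:
            weights[term] += 1.0
        # Terms of one-hop neighbors are added with a reduced weight.
        for other in sorted(neighbors[element]):
            for term in element_terms.get(other, []):
                weights[term] += neighbor_weight
        enriched[element] = dict(weights)
    return enriched


# Hypothetical class diagram: Book is related to Author and to Loan.
CLASSES = {
    "Book": ["title", "isbn"],
    "Author": ["name"],
    "Loan": ["date"],
}
RELATIONS = [("Book", "Author"), ("Book", "Loan")]
ENRICHED = enrich(CLASSES, RELATIONS)
```

With this input, the document for Book also carries "name" and "date" at weight 0.5, so a query mixing terms from related classes can still match a single element.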


Prototype Architecture
• Based on SMILA: an extensible framework for building search solutions to access unstructured information.
• Uses Apache Solr: a scalable search platform featuring full-text search.

[Figure: prototype architecture. Components shown: Configurator, Data Source, Crawler, Router, Queue, Listener, BPEL pipeline, BPEL Processor, Analyzers, Index (Apache Solr)]

Case Studies
• UML Class Diagram
  • 84 metamodels from AtlanMod
  • Small size
  • General purpose

• WebML
  • 12 real-life industrial projects
  • Large size
  • Large quantity of concepts
  • Domain specific


UML Case Study: Experiment A
• Granularity: Project
• Index: Flat
Content Field:
location commentsBefore
commentsAfter entries
predicates name type
allFields fields predicate
name expression field
value LocatedElement
Query Entry Field
Predicate Expression


UML Case Study: Experiment B
• Granularity: Class
• Index: Multi-Field

ProjectName Field:
BQL

ClassName Field:
Entry

AttributeNames Field:
name type allFields fields
predicate
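
The difference between the flat index of Experiment A and the multi-field index of Experiment B can be sketched as two ways of turning the same project into index documents. The field names ProjectName, ClassName and AttributeNames come from the slide; the `id` and `content` keys, and the attributes assigned to the LocatedElement class, are assumptions for illustration.

```python
def flat_document(project_name, classes):
    """Experiment A style: one document per project, a single content field."""
    terms = []
    for class_name, attributes in classes.items():
        terms.append(class_name)
        terms.extend(attributes)
    return {"id": project_name, "content": " ".join(terms)}


def class_documents(project_name, classes):
    """Experiment B style: one multi-field document per class."""
    return [
        {
            "id": f"{project_name}/{class_name}",
            "ProjectName": project_name,
            "ClassName": class_name,
            "AttributeNames": " ".join(attributes),
        }
        for class_name, attributes in classes.items()
    ]


# Entry follows the slide; LocatedElement's attributes are assumed.
BQL = {
    "Entry": ["name", "type", "allFields", "fields", "predicate"],
    "LocatedElement": ["location", "commentsBefore", "commentsAfter"],
}
FLAT = flat_document("BQL", BQL)
DOCS = class_documents("BQL", BQL)
```

The flat document can only answer "which project mentions these terms", while the per-class documents let a query target a class name or its attributes separately.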


WebML Case Study: Experiment B
• Granularity: Area
• Index: Multi-Field
AreaName Field:
Book requests

Content Field:
Book requests Create
book ConnectUserToBook
New book request New
book User Book request
list


Tests and Evaluation: Meta-queries
Meta-query   Type of searched document   Information need
1            Project                     All projects related to one specific topic
2            Project                     All projects related to one general topic
3            Pattern                     Searches for a pattern by using as query string the terms belonging to different classes connected by some relation
4            Class                       Searching for a class using as query string all (or some) of the terms belonging to a class
5            Class                       Searching for a class using as query string some of the terms belonging to a class and some terms related to the project

UML Experiment A (Project Granularity, Flat Index)

• DCG and iDCG are very
close in the first 3 positions.

• ALWAYS able to retrieve
the most relevant document
in the first position.
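
For readers unfamiliar with the metric, here is a minimal sketch of DCG and iDCG at a cut-off position. It assumes the common formulation in which the relevance at rank i is divided by log2(i + 1); the slides do not state which DCG variant was used, and the relevance grades below are made up.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain over the first k results (all if k is None)."""
    rels = relevances if k is None else relevances[:k]
    return sum(
        rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1)
    )


def ideal_dcg(relevances, k=None):
    """DCG of the best possible ordering of the same relevance grades."""
    return dcg(sorted(relevances, reverse=True), k)


# Hypothetical graded relevance of a returned ranking, best grade = 3.
RANKING = [3, 2, 3, 0, 1]
DCG_AT_3 = dcg(RANKING, 3)
IDCG_AT_3 = ideal_dcg(RANKING, 3)
```

A DCG curve close to the iDCG curve means the system returns documents in nearly the best possible order for those positions.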


Other UML Experiments

• Weighted experiment is always
better than the non-weighted one.

• Both Experiments B and C are
close to the ideal curve in the first
positions.

• Experiment D is meant to
answer a different user need than
the one captured by the ground
truth used here.


WebML Experiments

• Experiments B and C perform
identically up to the third position.

• After that, the experiment using
weights always performs slightly
better than the non-weighted one.


Conclusions
• The system has been tested with both a general purpose and a
domain specific modeling language.
• Good performance in the top rank positions.
• The weighted configuration always performs as well as or slightly
better than the others.
• The prototype has shown good results in retrieving documents that
are relevant in terms of conceptual and terminological similarity.
• Structural similarity is difficult to capture in a text-based search.

Future Directions
• Integrating a content-based solution
• Metamodel integration
• Testing more configurations
• Weight training
