Politecnico di Milano
POLO TERRITORIALE DI COMO
Master of Science in Computer Engineering

Model-Driven Retrieval of Model Repositories

Master graduation thesis by:
Stefano Celentano, ID: 755287
Lorenzo Furrer, ID: 750213

Supervisor: Prof. Marco Brambilla
Assistant Supervisor: Prof. Alessandro Bozzon
Model-Driven Retrieval of Model Repositories   2




Introduction
• Software model retrieval is essential to the
  Model-Driven Development (MDD) paradigm
• Current systems lack efficient and standardized
  methodologies
  • The metamodel is not taken into account
• Our contributions:
  • A methodology for model-driven retrieval of model repositories that
    takes the metamodels into account
  • A prototype implementing the methodology
  • Two case studies
  • An evaluation of different test configurations




Outline
Introduction & Methodology
• Model retrieval approaches
• MDD and Metamodeling
• Our Approach:
   • Abstract Solution
   • Design Dimensions
   • Indexing Strategies

Prototype & Case studies
• Prototype Architecture
• Case Studies
   • UML Case Study
   • WebML Case Study
• Tests and evaluation

• Future work




 Model Retrieval Approaches
• Text-based
   • Model representation: unstructured document
     (bag of words) (e.g., Vector Space Model, Tf-idf)
   • Query type: keyword-based
   • Matching algorithm: standard IR similarity
     measures (e.g., cosine similarity)

• Content-based
   • Model representation: model structure is
     taken into account (e.g., graph-based)
   • Query type: search by example
   • Matching algorithm: ad-hoc algorithms
     (depends on the model representation)
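
As a rough illustration of the text-based approach, the sketch below treats each model as a bag of words, weights terms with tf-idf, and ranks documents by cosine similarity against a keyword query. It is a minimal sketch and not the prototype's code: the function names, the whitespace tokenizer, the smoothed idf formula, and the toy documents are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return text.lower().split()


def build_tfidf(docs):
    """Return (vectors, idf); each vector maps a term to tf * idf."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed idf, so terms present in every document keep a non-zero weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs):
    """Rank documents for a keyword query; returns (score, doc_index) pairs."""
    vectors, idf = build_tfidf(docs)
    # Query terms never seen in the collection get weight 0.
    q = {term: tf * idf.get(term, 0.0) for term, tf in Counter(tokenize(query)).items()}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Toy "models" flattened to text (hypothetical data).
DOCS = [
    "entry name type fields predicate",
    "book request user list",
    "query predicate expression field",
]
```

Calling `search("book request", DOCS)` ranks the second document first, since it is the only one sharing terms with the query. Because the representation is a flat bag of words, any structure the models had is lost, which is the limitation the content-based approach addresses.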


Model-Driven Development and Metamodeling

• A fundamental concept: «metamodel»
• MOF (Meta-Object Facility)

[Figure: modeling layers, from top to bottom: Meta-metamodel, Metamodel, Model, Instance]




Our Approach (1/3): Abstract Solution




Our Approach (2/3): Design Dimensions
• Segmentation Granularity
   • Whole project
   • Subproject
   • Project concept
• Index structure
   • Flat
   • Weighted
   • Multi-field
   • Hybrid (e.g., multi-field index
     containing weighted terms)
• Query type
   • Keyword-based search
   • Search by example
• Result presentation
   • Snippet visualization
   • Faceted search




Our Approach (3/3): Indexing Strategies
Experiment      Segmentation granularity   Index         Index term weights
Experiment A    Whole project              Flat          No
Experiment B    Metamodel concept          Multi-field   No
Experiment C    Metamodel concept          Multi-field   Assigned according to the metamodel concept
Experiment D*   Metamodel concept          Multi-field   Assigned according to the metamodel concept

* The indexing phase includes a graph-based algorithm that enriches the document
representation of a model element with information extracted from its
neighboring elements.


Prototype Architecture
• Based on SMILA: an extensible framework for building search solutions
  to access unstructured information.
• Uses Apache Solr: a scalable search platform featuring full-text search.

[Figure: prototype architecture. Components shown: Data Source, Crawler, Router, Queue, Listener, Configurator, BPEL Processor, BPEL pipeline, Analyzers, Index (Apache Solr)]


Case Studies
• UML Class Diagram
 • 84 meta-models from AtlanMod
 • Small size
 • General purpose


• WebML
 • 12 real-life industrial projects
 • Large size
 • Large quantity of concepts
 • Domain specific



UML Case Study: Experiment A
• Granularity: Project
• Index: Flat

Content Field:
 location commentsBefore commentsAfter entries
 predicates name type allFields fields predicate
 name expression field value LocatedElement
 Query Entry Field Predicate Expression



UML Case Study: Experiment B
• Granularity: Class
• Index: Multi-Field

ProjectName Field:
 BQL

ClassName Field:
 Entry

AttributeNames Field:
 name type allFields fields predicate



UML Case Study: Experiment C
• Granularity: Class
• Index: Multi-Field, Weighted

ProjectName Field:
 BQL|1.0

ClassName Field:
 Entry|1.7

AttributeNames Field:
 name|1.0 type|1.0 allFields|1.0 fields|1.5 predicate|1.6
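
The multi-field documents of Experiments B and C could be produced along the following lines. This is only a sketch under stated assumptions: the `ModelClass` type, the `to_document` helper, and the `CONCEPT_WEIGHTS` values are hypothetical, and the sketch uses a single weight per metamodel concept, whereas the slide shows weights that also vary between terms of the same field (e.g., fields|1.5 and predicate|1.6).

```python
from dataclasses import dataclass, field

# Hypothetical weights, one per metamodel concept (not taken from the thesis).
CONCEPT_WEIGHTS = {"project": 1.0, "class": 1.7, "attribute": 1.0}


@dataclass
class ModelClass:
    """A UML class reduced to the names that get indexed."""
    project: str
    name: str
    attributes: list = field(default_factory=list)


def to_document(cls, weights=None):
    """Build a multi-field document for one class.

    With weights=None the fields hold plain terms (Experiment B style);
    otherwise each term is written as "term|weight" (Experiment C style).
    """
    def fmt(term, concept):
        return term if weights is None else f"{term}|{weights[concept]}"

    return {
        "ProjectName": fmt(cls.project, "project"),
        "ClassName": fmt(cls.name, "class"),
        "AttributeNames": " ".join(fmt(a, "attribute") for a in cls.attributes),
    }


entry = ModelClass("BQL", "Entry", ["name", "type", "allFields", "fields", "predicate"])
unweighted = to_document(entry)
weighted = to_document(entry, CONCEPT_WEIGHTS)
```

Keeping the concept names in separate fields lets a query target, say, only class names, while the "term|weight" notation leaves it to the indexing layer to turn the weight into a ranking boost.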



UML Case Study: Experiment D
• Granularity: Class
• Index: Multi-Field, Weighted
• #HOP = 1

ProjectName Field:
 BQL|1.0

ClassName Field:
 Entry|1.7

AttributeNames Field:
 name|0.75 location|0.9 commentsBefore|0.9 commentsAfter|0.9
 name|1.0 type|1.0 allFields|1.0 predicate|1.6 fields|1.3
 Predicate|0.765 Query|0.816 Field|0.85 LocatedElement|0.9
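
One way to picture the neighbor-based enrichment used in Experiment D is a breadth-first walk that copies the terms of elements within a given number of hops, scaling their weights down at each hop. The sketch below is an assumption-laden illustration, not the thesis algorithm: the `enrich` function, the constant `decay` factor, the keep-the-maximum merge rule, and the toy graph are all hypothetical, and the sketch does not reproduce the exact weights shown on the slide.

```python
def enrich(terms, graph, element, hops=1, decay=0.9):
    """Return the weighted terms of `element`, enriched with neighbor terms.

    terms: {element: {term: weight}}   own terms of each model element
    graph: {element: [neighbor, ...]}  relations between model elements
    Terms taken from a neighbor k hops away are scaled by decay ** k
    (an assumed penalty); on a clash the larger weight is kept.
    """
    enriched = dict(terms[element])
    visited = {element}
    frontier = {element}
    factor = 1.0
    for _ in range(hops):
        factor *= decay
        next_frontier = set()
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor in visited:
                    continue
                visited.add(neighbor)
                next_frontier.add(neighbor)
                for term, weight in terms.get(neighbor, {}).items():
                    enriched[term] = max(enriched.get(term, 0.0), weight * factor)
        frontier = next_frontier
    return enriched


# Toy data (hypothetical): Entry is related to LocatedElement, which is related to Predicate.
TERMS = {
    "Entry": {"Entry": 1.7, "predicate": 1.6},
    "LocatedElement": {"LocatedElement": 1.0, "location": 1.0},
    "Predicate": {"Predicate": 1.0},
}
GRAPH = {"Entry": ["LocatedElement"], "LocatedElement": ["Predicate"]}

one_hop = enrich(TERMS, GRAPH, "Entry", hops=1)
```

With `hops=1`, the document for Entry gains the terms of LocatedElement at a reduced weight but nothing from Predicate, which is two hops away. Raising `hops` widens the context at the cost of adding terms that are less specific to the element.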




WebML Case Study: Experiment B
• Granularity: Area
• Index: Multi-Field

AreaName Field:
 Book requests

Content Field:
 Book requests Create book ConnectUserToBook
 New book request New book User Book request list




WebML Case Study: Experiment C
• Granularity: Area
• Index: Multi-Field, Weighted

AreaName Field:
 Book|1.2 requests|1.2

Content Field:
 Create|1.0 book|1.0 ConnectUserToBook|1.0
 New|1.1 book|1.1 request|1.1
 New|1.0 Book|1.0 User|1.0
 Book|1.1 request|1.1 list|1.1




Tests and Evaluation: Meta-queries
Meta-query   Type of searched document   Information need
1            Project                     All projects related to one specific topic
2            Project                     All projects related to one general topic
3            Pattern                     Searches for a pattern by using as query string the terms belonging to different classes connected by some relation
4            Class                       Searching for a class using as query string all (or some) of the terms belonging to a class
5            Class                       Searching for a class using as query string some of the terms belonging to a class and some terms related to the project


UML Experiment A (Project Granularity, Flat Index)




• DCG (Discounted Cumulative Gain) and iDCG (ideal DCG) are very
  close in the first 3 positions.

• The most relevant document is always retrieved
  in the first position.
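
For readers unfamiliar with the metric, the sketch below computes a DCG curve and the matching ideal curve (iDCG) from graded relevance judgments. It assumes the common rel_i / log2(i + 1) discount; the slides do not say which DCG variant was used, and the relevance grades in the example are made up.

```python
import math


def dcg_curve(relevances):
    """Cumulative DCG at each rank position (1-based positions).

    Uses the rel_i / log2(i + 1) discount, which is an assumption here.
    """
    total = 0.0
    curve = []
    for position, rel in enumerate(relevances, start=1):
        total += rel / math.log2(position + 1)
        curve.append(total)
    return curve


def idcg_curve(relevances):
    """DCG curve of the ideal ranking: same judgments, sorted by relevance."""
    return dcg_curve(sorted(relevances, reverse=True))


# Hypothetical relevance grades of the top 5 results returned for one query.
ranked = [3, 2, 3, 0, 1]
actual = dcg_curve(ranked)
ideal = idcg_curve(ranked)
```

When the most relevant document is returned first, the two curves coincide at position 1, as they do for `ranked` above; the gap that opens at later positions shows how far the ranking is from the ideal ordering.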


Other UML Experiments




• The weighted experiment always performs
  better than the non-weighted one.

• Both Experiments B and C are
  close to the ideal curve in the first
  positions.

• Experiment D is meant to answer
  a different user need from the one
  captured by the ground truth used.


WebML Experiments




• Experiments B and C perform
  identically up to the third position.

• After that, the experiment using
  weights always performs slightly
  better than the non-weighted one.




Conclusions
• The system has been tested with both a general-purpose and a
  domain-specific modeling language.
• Good performance in the first rank positions.
• The weighted configuration always performs as well as or better than
  the others, albeit slightly.
• The prototype has shown good results in retrieving documents that
  are relevant in terms of conceptual and terminological similarity.
• Structural similarity is difficult to capture in a text-based search.


Future Directions
•   Integrating a content-based solution
•   Metamodel integration
•   Testing more configurations
•   Weight training

Model-driven Development of User Interfaces for IoT via Domain-specific Comp...
 
A Model-Based Method for Seamless Web and Mobile Experience. Splash 2016 conf.
A Model-Based Method for  Seamless Web and Mobile Experience. Splash 2016 conf.A Model-Based Method for  Seamless Web and Mobile Experience. Splash 2016 conf.
A Model-Based Method for Seamless Web and Mobile Experience. Splash 2016 conf.
 
Big Data and Stream Data Analysis at Politecnico di Milano
Big Data and Stream Data Analysis at Politecnico di MilanoBig Data and Stream Data Analysis at Politecnico di Milano
Big Data and Stream Data Analysis at Politecnico di Milano
 
Web Science. An introduction
Web Science. An introductionWeb Science. An introduction
Web Science. An introduction
 

Recently uploaded

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 

Recently uploaded (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 

Model driven retrieval of model repositories

  • 1. Politecnico di Milano, POLO TERRITORIALE DI COMO. Master of Science in Computer Engineering. Model-Driven Retrieval of Model Repositories. Master graduation thesis by: Stefano Celentano, ID: 755287; Lorenzo Furrer, ID: 750213. Supervisor: Prof. Marco Brambilla. Assistant Supervisor: Prof. Alessandro Bozzon.
  • 2. Introduction • Software models retrieval is essential for the paradigm of Model-Driven Development (MDD) • Current systems lack efficient and standardized methodologies • The metamodel is not taken into account • Our contributions: • A methodology for model-driven retrieval of model repositories that takes into account the metamodels • The development of a prototype for such methodology • Two case studies • Evaluation of different test configurations
  • 3. Outline • Model retrieval approaches • MDD and Metamodeling • Our Approach: • Abstract Solution • Design Dimensions • Indexing Strategies • Prototype Architecture • Case Studies • UML Case Study • WebML Case Study • Tests and evaluation • Future works [Side labels on the slide: Introduction & Methodology; Prototype & Case studies]
  • 4. Model Retrieval Approaches • Text-based • Model representation: unstructured document (bag of words) (e.g., Vector Space Model, Tf-idf) • Query type: keyword-based • Matching algorithm: standard IR similarity measures (e.g., cosine similarity) • Content-based • Model representation: model structure is taken into account (e.g., graph-based) • Query type: search by example • Matching algorithm: ad-hoc algorithms (depends on the model representation)
  • 5. Model-Driven Development and Metamodeling • A fundamental concept: Metamodel • MOF (Meta-Object Facility) [Figure: metamodeling layers, labeled Meta-metamodel, «metamodel», Model, Instance]
  • 6. Our Approach (1/3): Abstract Solution
  • 7. Our Approach (2/3): Design Dimensions • Segmentation Granularity • Whole project • Subproject • Project concept • Index structure • Flat • Weighted • Multi-field • Hybrid (e.g., multi-field index containing weighted terms) • Query type • Keyword-based search • Search by example • Result presentation • Snippet visualization • Faceted search
  • 8. Our Approach (3/3): Indexing Strategies
      Experiment   Segmentation granularity   Index         Index term weights
      A            Whole project              Flat          No
      B            Metamodel concept          Multi-field   No
      C            Metamodel concept          Multi-field   Assigned according to the metamodel concept
      D*           Metamodel concept          Multi-field   Assigned according to the metamodel concept
      * The indexing phase includes a graph-based algorithm that enriches the document representation of a model element with information extracted from its neighboring elements.
  • 9. Prototype Architecture • Based on SMILA: an extensible framework for building search solutions to access unstructured information. • Uses Apache Solr: a scalable search platform featuring full-text search. [Figure: architecture diagram, labeled Configurator, Data Source, Crawler, Router, Queue, Listener, BPEL pipeline, BPEL Processor, Analyzers, Index, Apache Solr]
  • 11. Case Studies • UML Class Diagram • 84 meta-models from AtlanMod • Small size • General purpose • WebML • 12 real-life industrial projects • Large size • Large quantity of concepts • Domain specific
  • 12. UML Case Study: Experiment A • Granularity: Project • Index: Flat Content Field: location commentsBefore commentsAfter entries predicates name type allFields fields predicate name expression field value LocatedElement Query Entry Field Predicate Expression
  • 13. UML Case Study: Experiment B • Granularity: Class • Index: Multi-Field ProjectName Field: BQL ClassName Field: Entry AttributeNames Field: name type allFields fields predicate
  • 14. UML Case Study: Experiment C • Granularity: Class • Index: Multi-Field, Weighted ProjectName Field: BQL|1.0 ClassName Field: Entry|1.7 AttributeNames Field: name|1.0 type|1.0 allFields|1.0 fields|1.5 predicate|1.6
  • 15. UML Case Study: Experiment D • Granularity: Class • Index: Multi-Field, Weighted ProjectName Field: BQL|1.0 ClassName Field: Entry|1.7 AttributeNames Field: name|0.75 location|0.9 commentsBefore|0.9 commentsAfter|0.9 name|1.0 type|1.0 allFields|1.0 predicate|1.6 fields|1.3 Predicate|0.765 Query|0.816 #HOP = 1 Field|0.85 LocatedElement|0.9
  • 16. WebML Case Study: Experiment B • Granularity: Area • Index: Multi-Field AreaName Field: Book requests Content Field: Book requests Create book ConnectUserToBook New book request New book User Book request list
  • 17. WebML Case Study: Experiment C • Granularity: Area • Index: Multi-Field, Weighted AreaName Field: Book|1.2 requests|1.2 Content Field: Create|1.0 book|1.0 ConnectUserToBook|1.0 New|1.1 book|1.1 request|1.1 New|1.0 Book|1.0 User|1.0 Book|1.1 request|1.1 list|1.1
  • 18. Tests and Evaluation: Meta-queries
      Meta-query   Type of searched document   Information need
      1            Project                     All projects related to one specific topic
      2            Project                     All projects related to one general topic
      3            Pattern                     Searches for a pattern by using as query string the terms belonging to different classes connected by some relation
      4            Class                       Searching for a class using as query string all (or some) of the terms belonging to a class
      5            Class                       Searching for a class using as query string some of the terms belonging to a class and some terms related to the project
  • 19. UML Experiment A (Project Granularity, Flat Index) • DCG and iDCG are very close in the first 3 positions. • ALWAYS able to retrieve the most relevant document in the first position.
  • 20. Other UML Experiments • The weighted experiment is always better than the non-weighted one. • Both Experiments B and C are close to the ideal curve in the first positions. • Experiment D is meant to answer a different user need than the one captured by the ground truth used.
  • 21. WebML Experiments • Experiments B and C perform identically up to the third position. • After that, the experiment using weights always performs slightly better than the non-weighted one.
  • 22. Conclusions • The system has been tested with both a general purpose and a domain specific modeling language. • Good performance in the first rank positions. • The weighted case always performs better than or equal to the others, albeit slightly. • The prototype has shown good results in retrieving documents that are relevant in terms of conceptual and terminological similarity. • Structural similarity is difficult to capture in a text-based search. Future Directions • Integrating a content-based solution • Metamodel integration • Testing more configurations • Weight training
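
Slides 19 to 21 compare DCG and iDCG curves without spelling out the metric. The following is a minimal sketch of how such curves can be computed, assuming the common log2-discount formulation of DCG (the slides do not state which variant was used). The function names and the relevance grades in the example are hypothetical and are not taken from the thesis.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of graded relevances.

    Assumed formulation: sum of rel_i / log2(i + 1) for ranks i = 1..k.
    """
    rels = relevances if k is None else relevances[:k]
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(rels, start=1))


def ideal_dcg(relevances, k=None):
    """iDCG: the DCG of the same judgments sorted from most to least relevant."""
    return dcg(sorted(relevances, reverse=True), k)


def dcg_curve(relevances):
    """DCG at every rank position, i.e. the curve plotted against iDCG."""
    return [dcg(relevances, k) for k in range(1, len(relevances) + 1)]


# Hypothetical relevance grades of the results returned for one query.
ranked = [3, 2, 3, 0, 1, 2]
for position, value in enumerate(dcg_curve(ranked), start=1):
    print(position, round(value, 3), round(ideal_dcg(ranked, position), 3))
```

When the two curves coincide at a given position, the ranking up to that position is as good as the ground truth allows, which is how the statement "DCG and iDCG are very close in the first 3 positions" can be read.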

Editor's Notes

  1. The tests for the UML case study involve different types of keyword-based query. Each type, called a “meta-query” in the following, has different characteristics in terms of the document that is searched by the query (e.g., project, class) and in terms of the information need that is expressed through the query (e.g., the user may want to search for a specific project or for all the projects related to a topic). We first outlined a set of five meta-queries, then we chose two of them. For each of these, we built a set of ten instances that we used to test the UML case. The tests for the WebML case involve a set of ten queries.
  2. An attempt to take the structure of the model into account within a text-based approach. It does not improve retrieval in terms of relevance, but it is useful in “exploratory” cases. Besides retrieving the relevant classes with respect to a query, Experiment D retrieves their neighboring classes too, which are not necessarily relevant to that query. These neighboring classes are present among the results because they have imported terms that are part of the query string. Since their “content” field is larger due to the imported terms, those neighboring classes are penalized by the FieldNorm; at the same time, the truly relevant classes are ranked in a higher position, so the results are better. To conclude, the FieldNorm helps when it penalizes classes that are retrieved only because they are neighbors of relevant classes, but it provides misleading results when it penalizes the relevant classes because of the larger size of their “content” field after the import algorithm.
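
Note 2 relies on the FieldNorm without defining it. As a rough sketch, assuming the Solr index uses Lucene's classic TF-IDF similarity (the notes do not say which similarity is configured), the length norm of a field is the field boost divided by the square root of the number of terms in the field. The function name `field_norm` is illustrative, and the term lists only mirror the Entry class shown on slides 13 and 15; term weights and Lucene's lossy storage of the norm are ignored here.

```python
import math


def field_norm(num_terms, boost=1.0):
    """Length norm under the assumed classic TF-IDF similarity: boost / sqrt(num_terms)."""
    if num_terms <= 0:
        return 0.0
    return boost / math.sqrt(num_terms)


# Terms of the Entry class before the import step (as in Experiments B and C).
own_terms = ["name", "type", "allFields", "fields", "predicate"]

# Terms imported from neighboring classes at one hop (as in Experiment D).
imported_terms = [
    "name", "location", "commentsBefore", "commentsAfter",
    "Predicate", "Query", "Field", "LocatedElement",
]

norm_before = field_norm(len(own_terms))
norm_after = field_norm(len(own_terms) + len(imported_terms))

# The same class scores lower on a matching term once its field has grown.
print(round(norm_before, 4), round(norm_after, 4))
```

Under this assumption, every class that imports terms is penalized in proportion to how many terms it imports, whether or not it is relevant. That is consistent with the two-sided effect described in the note.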