Sayed Mohsin
Reza
ModelMine: A Tool to Facilitate Mining Models
from Open Source Repositories
Sayed Mohsin Reza, Omar Badreddin, and Khandoker Rahad
Presenter
Sayed Mohsin Reza
PhD Student
Department of Computer Science
University of Texas at El Paso, USA
Email: sreza3@miners.utep.edu
Website: https://www.smreza.com/
Tool available at https://www.smreza.com/projects/modelmine/
Introduction & Background
• Mining Software Repositories (MSR) has witnessed tremendous growth in
the past few years
• MSR contributes to establishing research agendas in software
development, cost estimation, testing, quality assurance and more
• MSR contributes to analyzing software defects, development activities,
processes, patterns, and more.
Outcome
Reducing Software Development Cost
Better Software Design
Better Software Quality
Problem & Motivation
• Few tools target mining models and design artifacts from open
source repositories
• Limitations of existing tools
• Both repositories and mining tools focus on textual artifacts
• Data representation
• Limited search criteria
• Limited ranking scope
• We faced these limitations while writing the paper “The Human in MDE Loop: A Case Study
on Integrating Handwritten Code in Model-Driven Engineering Repositories”,
which was recently accepted by a journal.
Existing Tools
• Metric Miner (https://github.com/Woutrrr/metricminer2)
Capabilities: mining commits and modifications; exporting results in CSV format
Limitations: limited search functionality and result ranking; requires Java coding
knowledge.
• Qualitas Corpus (http://qualitascorpus.com/)
A collection of software systems intended for empirical studies of
code artefacts
Limitations: updates of the systems, search functionality
• GHTorrent (https://ghtorrent.org/)
Pros: provides GitHub repositories with extracted metadata
Cons: provides the GitHub repositories only in a static way.
Contributions
This paper presents a novel model mining tool called ModelMine
• Facilitates mining models with various search criteria
• Faster data extraction for non-textual artifacts
• User-friendly tool available to non-MSR experts
• Ensures search capability in the following mining areas
1. Model-based Repository Search - Available in some tools
2. Model-based Artifact Search - Novel feature
3. Commit History Search - Available in PyDriller, Metric Miner
ModelMine Architecture
Open Source Repository Server: GitLab, BitBucket, SourceForge etc.
Tool Demonstration
Link: https://www.smreza.com/projects/modelmine/
Live Demo
Upcoming Topics
• Evaluation of the tool
• Results of evaluation
• Conclusion
Evaluation
Comparative analysis with PyDriller.
1. Performance Analysis - execution time and memory
consumption
2. Usability Analysis - how easy the tool is to learn
• Ten participants working in software engineering research:
• Eight doctoral students
• Two master’s students in computer science
Evaluation Forms: https://forms.gle/kJcWASsKM13AHh9a6
Evaluation Tasks
1. Task 1 (Size related): Retrieve the list of repositories with at least one UML
model and a repository size > 30 MB.
2. Task 2 (Time related): Retrieve the list of repositories with at least one UML
model that were created between January 2019 and December 2019.
3. Task 3 (File property related): Retrieve the list of artifacts with the .uml file
extension.
4. Task 4 (Commit related): Retrieve the list of commits that include a model artifact.
5. Task 5 (File property + commit related): Retrieve the list of commits that include
any model artifact (any model-based file extension).
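Tasks 1 to 3 can be pictured as simple filters over repository metadata. The sketch below is illustrative only: the `Repo` record, its field names, the extension set, and the sample data are all hypothetical and do not reflect ModelMine's actual data model or API.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

# Assumed set of model file extensions; the tasks above only name ".uml".
MODEL_EXTENSIONS = {".uml"}


@dataclass
class Repo:
    """Hypothetical repository record (not ModelMine's real schema)."""
    name: str
    size_mb: float
    created: date
    files: List[str] = field(default_factory=list)


def model_files(repo, extensions=MODEL_EXTENSIONS):
    """Return the files in `repo` whose extension marks them as model artifacts."""
    return [f for f in repo.files if any(f.lower().endswith(e) for e in extensions)]


def task1(repos, min_size_mb=30):
    """Task 1: at least one UML model and repository size > 30 MB."""
    return [r for r in repos if model_files(r) and r.size_mb > min_size_mb]


def task2(repos, start=date(2019, 1, 1), end=date(2019, 12, 31)):
    """Task 2: at least one UML model and created between start and end."""
    return [r for r in repos if model_files(r) and start <= r.created <= end]


def task3(repos):
    """Task 3: every artifact with the .uml extension, as (repo, path) pairs."""
    return [(r.name, f) for r in repos for f in model_files(r, {".uml"})]


# Made-up sample data, for illustration only.
SAMPLE = [
    Repo("alpha", 45.0, date(2019, 3, 14), ["design/class.uml", "src/Main.java"]),
    Repo("beta", 12.5, date(2019, 11, 2), ["models/state.UML"]),
    Repo("gamma", 80.0, date(2018, 6, 30), ["README.md"]),
]
```

Each task is a separate function so the criteria can be combined or swapped independently, which mirrors the way the tasks vary one search dimension at a time.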
Performance Analysis Results
Performance results show that
PyDriller takes more time and memory
than ModelMine.
• PyDriller downloads the whole git
repository and then mines commit
information.
• ModelMine fetches the information
directly, without downloading any
file and with no intermediate
process.
Figure: Performance evaluation results
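To make the download step concrete, a clone-based commit search for Tasks 4 and 5 could look roughly like the sketch below. It is not the script used in the evaluation: `MODEL_EXTENSIONS`, `is_model_file`, and `commits_with_models` are hypothetical names, the extension list is an assumption, and the code assumes the PyDriller 2.x API (`Repository`, `traverse_commits`, `modified_files`) is installed.

```python
import os

# Assumed model file extensions; adjust to the artifacts of interest.
MODEL_EXTENSIONS = (".uml", ".xmi", ".ecore")


def is_model_file(filename):
    """True if the file name ends with one of the assumed model extensions."""
    return os.path.splitext(filename)[1].lower() in MODEL_EXTENSIONS


def commits_with_models(repo_url):
    """Yield (commit hash, model file names) for commits touching a model artifact.

    When given a remote URL, PyDriller first clones the repository into a
    temporary directory; this is the download step that costs time and memory.
    """
    # Imported lazily so that is_model_file can be used without PyDriller.
    from pydriller import Repository

    for commit in Repository(repo_url).traverse_commits():
        touched = [m.filename for m in commit.modified_files
                   if is_model_file(m.filename)]
        if touched:
            yield commit.hash, touched
```

Iterating over `commits_with_models(url)` for a repository URL of your choice yields one entry per matching commit; every commit in the history is still traversed, so the cost grows with repository size.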
Usability Analysis Results
• ModelMine scores better than PyDriller
on all usability criteria
• In the user interface and learning curve
categories, ModelMine's ratings are 50%
higher than PyDriller's.
• One participant commented that
ModelMine provides a faster learning
experience than PyDriller due to its
easy UI design and better
readability.
Figure: Usability study results
Conclusion
• Mining models from open source repositories with non-textual
artifacts
• User-friendly tool for non-experts.
• Performance superior to existing mining tools
• Tool available at https://www.smreza.com/projects/modelmine/
Questions?

