On the way of listening to the crowd for supporting modeling activities
Davide Di Ruscio
Dipartimento di Ingegneria e Scienze dell’Informazione e Matematica
Università degli Studi dell’Aquila
http://people.disim.univaq.it/diruscio/
davide.diruscio@univaq.it
@ddiruscio
https://www.slideshare.net/CrossingMinds/recommendation-system-explained?from_action=save
Recommendation systems
Information filtering systems
Deal with choice overload
Focused on the user’s:
– Preferences
– Interests
– Observed behaviour
https://www.slideshare.net/CrossingMinds/recommendation-system-explained?from_action=save
Recommendation systems - Examples
Facebook–“People You May Know”
Netflix–“Other Movies You May Enjoy”
LinkedIn–“Jobs You May Be Interested In”
Amazon–“Customers who bought this item also bought …”
YouTube–“Recommended Videos”
Google–“Search results adjusted”
Pinterest–“Recommended Images”
…
https://www.slideshare.net/CrossingMinds/recommendation-system-explained?from_action=save
Recommendation systems
Recommendation systems (RS) help to match users with items
– Ease information overload
Different system designs / paradigms
– Based on availability of exploitable data
– Implicit and explicit user feedback
– Domain characteristics
RS are software agents that elicit the interests and preferences of individual consumers […] and make recommendations accordingly. They have the potential to support and improve the quality of the decisions consumers make while searching for and selecting products online.
[Xiao & Benbasat, MISQ, 2007]
http://clgiles.ist.psu.edu/IST441/materials/powerpoint/RC/rec.pptx
Recommendation systems
RS seen as a function
Given:
– User model (e.g. ratings, preferences, demographics, situational context)
– Items (with or without description of item characteristics)
Find:
– Relevance score. Used for ranking.
Finally:
– Recommend items that are assumed to be relevant
http://clgiles.ist.psu.edu/IST441/materials/powerpoint/RC/rec.pptx
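The "RS seen as a function" view can be sketched in a few lines of Python. This is only an illustration: the user model, the item features, and the scoring rule (a sum of preference weights) are all assumptions made for the example, not something taken from the slides.

```python
# Hypothetical user model: the user's preference weight per item feature.
user_model = {"action": 0.9, "comedy": 0.2, "drama": 0.5}

# Hypothetical items, each described by a set of features.
items = {
    "Movie A": {"action"},
    "Movie B": {"comedy", "drama"},
    "Movie C": {"action", "drama"},
}


def relevance(user, features):
    """Relevance score: sum of the user's weights for the item's features."""
    return sum(user.get(feature, 0.0) for feature in features)


def recommend(user, catalog, top_n=2):
    """Rank all items by relevance score and return the top_n item names."""
    ranked = sorted(
        catalog,
        key=lambda name: relevance(user, catalog[name]),
        reverse=True,
    )
    return ranked[:top_n]


top = recommend(user_model, items)
print(top)  # ['Movie C', 'Movie A']
```

Given a user model and a set of items, the function finds a relevance score per item, uses it for ranking, and finally recommends the items assumed to be most relevant.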
The road ahead
– Recommendation Systems
– Recommendation Systems in Software Engineering
– Developing Recommendation Systems: Challenges and Lessons learned
– What about Model Recommenders?
Recommendation Systems in Software Engineering (RSSE)
Recommendation Systems in Software Engineering
A recommendation system in software engineering is “... a software application that provides information items estimated to be valuable for a software engineering task in a given context.”
Recommendation Systems in Software Engineering
– Data Preprocessing
– Capturing Context
– Producing Recommendations
– Presenting Recommendations
Understanding complex problems
Software Analytics
"Software analytics is analytics on software data for managers and software engineers with the aim of empowering software development individuals and teams to gain and share insight from their data to make better decisions."
R. Buse, T. Zimmermann. Information Needs for Software Development Analytics. Proc. Int'l Conf. Software Engineering (ICSE), IEEE CS, 2012
Mining Software Repositories field
The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects.
http://www.msrconf.org/
– Q&A systems
– Bug Reports
– API Documentation
Some numbers on EMSE research
Research on empirical software engineering has increasingly used data
made available in online repositories or collective efforts
[Figures: cumulative number of FOSS projects per year; average number of FOSS projects per year]
Today, GitHub hosts more than 94 million repositories
Philippe Krief, Eclipse Foundation
The CROSSMINER experience
https://www.crossminer.org/
http://eclipse.org/scava
Context
Development of new software systems by reusing existing open source components
– Source code
– Q&A systems
– Bug Reports
– API Documentation
– Tutorials
– Configuration Management Systems
CROSSMINER: high-level view
[Diagram: Source code, Q&A systems, Bug Reports, API Documentation, Tutorials and Configuration Management Systems; Mining and Knowledge Extraction Tools; Advanced IDEs]
CROSSMINER: high-level view
– Data Preprocessing
– Capturing Context
– Producing Recommendations
– Presenting Recommendations
[Diagram: Mining and Analysis Tools (Source Code Miner, NLP Miner, Configuration Miner, Cross project Analysis) mine OSS forges (Source Code, Natural language channels, Configuration Scripts) and lookup/store data in the Knowledge Base]
CROSSMINER: high-level view
[Diagram: the Developer IDE sends a query to the Knowledge Base (Data Storage) and receives recommendations]
Real-time recommendations that serve to increase productivity and quality
Examples of recommendations
Use of machine learning algorithms to produce recommendations during development:
– Depending on the set of selected third-party libraries, the system is able to recommend additional libraries that should be included in the project being developed
– Given a selected library, the system is able to suggest alternative ones that share some similarities with the selected one
– Depending on the set of selected libraries, the system shows API documentation and Q&A posts that can help developers to understand how to use the selected libraries
– During the development, developers get recommendations about API function calls and usage patterns that might be used
– …
The CROSSMINER Recommendation Systems
CrossSim – Recommending similar projects
CrossRec – Recommending third-party libraries
FOCUS – Recommending API function calls and usage patterns
MNBN – Recommending GitHub topics
PostFinder – Recommending Stack Overflow posts
CrossSim – Recommending similar projects
Overview of CrossSim
Graphs for representing different kinds of relationships in the OSS ecosystem
• e.g., developers commit to repositories, users star repositories, projects contain source code files, etc.
Cross Project Relationships for Computing Open Source Software Similarity
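A graph of this kind can be sketched as follows. The edges, node names, and the neighbour-overlap (Jaccard) measure are assumptions chosen for brevity; the sketch only illustrates how shared relationships can signal similarity between projects and is not the CrossSim similarity algorithm.

```python
from collections import defaultdict

# Hypothetical edges (source, relation, target) in an OSS ecosystem graph.
edges = [
    ("dev:alice", "commits", "repo:A"),
    ("dev:alice", "commits", "repo:B"),
    ("user:bob", "stars", "repo:A"),
    ("user:bob", "stars", "repo:B"),
    ("user:carol", "stars", "repo:C"),
    ("repo:A", "contains", "file:Main.java"),
    ("repo:C", "contains", "file:App.java"),
]

# Undirected adjacency sets; the relation label is ignored in this sketch.
neighbours = defaultdict(set)
for source, _relation, target in edges:
    neighbours[source].add(target)
    neighbours[target].add(source)


def overlap(a, b):
    """Jaccard overlap of the neighbour sets of nodes a and b."""
    union = neighbours[a] | neighbours[b]
    return len(neighbours[a] & neighbours[b]) / len(union) if union else 0.0


print(overlap("repo:A", "repo:B"))  # two shared neighbours out of three
print(overlap("repo:A", "repo:C"))  # 0.0, no shared neighbours
```

Here repo:A and repo:B come out as similar because the same developer commits to both and the same user stars both, while repo:C shares no neighbour with repo:A.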
CrossRec – Recommending third-party libraries
Collaborative-Filtering Recommendation
◼ User-item matrix: ratings given to pizza restaurants (R1, R2, R3) by customers (C1, C2, C3)

      R1   R2   R3
C1     5    5    2
C2     3    3    4
C3     5    5    ?

◼ Unknown ratings can be deduced from the most similar customers
CROSSMINER Lisbon Meeting, 27-28 February 2018
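The pizza-restaurant matrix can be worked through in code. The ratings are the ones on the slide; the similarity measure (inverse Euclidean distance over co-rated restaurants) and the similarity-weighted average are assumed choices for this sketch, since the slide does not name a specific measure.

```python
import math

# Ratings from the slide; None marks the unknown rating of C3 for R3.
ratings = {
    "C1": {"R1": 5, "R2": 5, "R3": 2},
    "C2": {"R1": 3, "R2": 3, "R3": 4},
    "C3": {"R1": 5, "R2": 5, "R3": None},
}


def similarity(a, b):
    """Similarity in (0, 1] based on Euclidean distance over co-rated items."""
    shared = [
        item for item in ratings[a]
        if ratings[a][item] is not None and ratings[b].get(item) is not None
    ]
    if not shared:
        return 0.0
    dist = math.sqrt(sum((ratings[a][i] - ratings[b][i]) ** 2 for i in shared))
    return 1.0 / (1.0 + dist)


def predict(customer, item, k=1):
    """Similarity-weighted average of the item's ratings by the k nearest customers."""
    candidates = [
        (similarity(customer, other), ratings[other][item])
        for other in ratings
        if other != customer and ratings[other].get(item) is not None
    ]
    nearest = sorted(candidates, reverse=True)[:k]
    total = sum(sim for sim, _ in nearest)
    if total == 0:
        return None
    return sum(sim * rating for sim, rating in nearest) / total


print(predict("C3", "R3", k=1))  # 2.0
print(predict("C3", "R3", k=2))  # about 2.41
```

C3 rated R1 and R2 exactly like C1, so with a single neighbour the missing rating is copied from C1. With two neighbours, C2 pulls the estimate up slightly, weighted by its lower similarity.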
CrossRec: Projects-Libraries Representation
◼ Representing the project-library relationships using a user-item ratings matrix
◼ Predict the inclusion of additional libraries
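The same idea carries over to projects and libraries, with projects in the role of users and libraries in the role of items. The sketch below is a simplified illustration under stated assumptions: the project data is invented, and the Jaccard similarity and vote-based scoring are stand-ins, not the actual CrossRec implementation.

```python
# Hypothetical projects and the third-party libraries each one includes.
# Each set is one row of a binary project-library matrix.
projects = {
    "p1": {"junit", "slf4j", "guava"},
    "p2": {"junit", "slf4j", "log4j"},
    "p3": {"junit", "commons-io"},
    "active": {"junit", "slf4j"},
}


def jaccard(a, b):
    """Jaccard similarity of two library sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_libraries(target, k=2, top_n=2):
    """Score libraries missing from the target by votes of the k most similar projects."""
    own = projects[target]
    nearest = sorted(
        ((jaccard(own, libs), name) for name, libs in projects.items() if name != target),
        reverse=True,
    )[:k]
    scores = {}
    for sim, name in nearest:
        for lib in projects[name] - own:
            scores[lib] = scores.get(lib, 0.0) + sim
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda lib: (-scores[lib], lib))[:top_n]


print(recommend_libraries("active"))  # ['guava', 'log4j']
```

The active project shares two of three libraries with both p1 and p2, so the libraries it lacks from those two projects are the ones suggested.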
FOCUS – Recommending API function calls and usage patterns
Problem
“Which API methods should this piece of client code invoke, considering that it has already invoked these other API methods?”
Explanatory example: method under development
Explanatory example: method declaration
– Method declaration (MD)
– Method invocations (MI)
Explanatory example: complete method declaration
Context-aware recommendation
Examples of context: day of the week, hour of the day, weather conditions, …
Predict the inclusion of additional invocations
University of L'Aquila, CROSSMINER Toulouse Meeting, 10-12 June 2018
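One way to picture the prediction of additional invocations is to treat the enclosing project as the context of each method declaration. The sketch below is a rough illustration: the projects, declarations, and API names are invented, and weighting by the product of project similarity and declaration similarity is an assumption of this example rather than the FOCUS algorithm.

```python
# Hypothetical data: project -> method declaration (MD) -> set of method invocations (MI).
data = {
    "proj1": {"readFile": {"File.open", "File.read", "File.close"}},
    "proj2": {"loadConfig": {"File.open", "File.read", "Json.parse"}},
    "proj3": {"render": {"Canvas.draw"}},
    "active": {"importData": {"File.open", "File.read"}},
}


def jaccard(a, b):
    """Jaccard similarity of two sets of invocations."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def project_apis(project):
    """All invocations used anywhere in a project (the context of its declarations)."""
    return set().union(*data[project].values())


def recommend_invocations(project, declaration, top_n=2):
    """Rank invocations missing from the declaration under development."""
    active = data[project][declaration]
    scores = {}
    for other, declarations in data.items():
        if other == project:
            continue
        project_sim = jaccard(project_apis(project), project_apis(other))
        for invocations in declarations.values():
            weight = project_sim * jaccard(active, invocations)
            for call in invocations - active:
                scores[call] = scores.get(call, 0.0) + weight
    ranked = sorted(scores, key=lambda call: (-scores[call], call))
    return [call for call in ranked if scores[call] > 0][:top_n]


print(recommend_invocations("active", "importData"))  # ['File.close', 'Json.parse']
```

The declaration under development already invokes File.open and File.read, so invocations from similar declarations in similar projects are suggested, while Canvas.draw from an unrelated project receives no weight.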
PostFinder – Recommending Stack Overflow posts
MNBN – Recommending GitHub topics
GitHub topics
Proposed approach
The CROSSMINER experience: challenges and lessons learned
Development of the CROSSMINER recommendation systems: main activities
Requirement elicitation phase: main challenge
Clear understanding of the needed recommendation systems:
• Understanding the functionalities that final users expect from the envisioned recommendation systems
• Otherwise, you risk spending time developing systems whose recommendations are not relevant to, or in line with, the actual user needs.
Requirement elicitation phase: main challenge
Solution employed in CROSSMINER
– We implemented demo projects that reflected real-world scenarios
– Explanatory context inputs and the corresponding recommendation items that the envisioned recommendation systems should be able to produce.
Development phase: main challenge
Clear awareness of existing recommendation techniques
– Knowledge of techniques and patterns that might be employed
– Comparing and evaluating candidate approaches can be a very daunting task
Development phase: main challenge
Applied solution
– Significant effort was devoted to analyzing existing approaches that could serve as starting points.
Evaluation phase: main challenge
There is no golden rule for evaluating all possible recommendation systems due to their intrinsic features as well as heterogeneity
– Which evaluation methodology is suitable?
– Which metric(s) can be used?
– Which dataset is eligible/available for evaluation?
– Which baseline(s) can the system be compared with?
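As one possible answer to the metric question, precision and recall over the top-k recommended items are widely used for ranked recommendations. The sketch below is illustrative: the recommended list and the ground-truth set are invented, and other metrics may suit a given system better.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the first k recommended items against a ground-truth set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked recommendations and ground truth (e.g. libraries
# deliberately hidden from a test project before asking for recommendations).
recommended = ["guava", "log4j", "commons-io", "junit"]
relevant = {"guava", "commons-io", "mockito"}

precision, recall = precision_recall_at_k(recommended, relevant, k=3)
print(precision, recall)  # two hits among the top 3; two of three relevant items found
```

Hiding part of a project's known items and checking whether they are recommended back is one common way to obtain the ground truth; whether it suits a given recommender depends on the dataset and baselines available.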
Lessons learned
User scepticism: target users might be sceptical about the relevance of the items that can be recommended
Quality of data: large, high-quality datasets are essential for training and evaluation activities
Baseline availability: it is not always possible to reuse the tools and data of the identified baselines
Lessons learned
In the case of the FOCUS evaluation, one of the considered datasets initially consisted of 5,147 Java projects retrieved from the Software Heritage archive
To comply with the requirements of the baseline and of FOCUS, we had to restrict the dataset
- we ended up with a dataset consisting of 610 Java projects
- we had to create a dataset ten times bigger than the one used for the evaluation
What about Model recommenders?
Model recommenders
A recommender system for model driven software engineering can combine data from different sources in order to infer a list of relevant and actionable model changes in real time.
Stefan Kögel, Recommender system for model driven software development, ESEC/FSE 2017: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
Model recommenders
Recommendation systems for supporting
- the development of metamodels
- the development of models
- the development of model-to-model transformations
…
Model recommenders
Mussbacher, G., Combemale, B., Kienzle, J. et al. Opportunities in intelligent modeling assistance. Softw Syst Model 19, 1045–1053 (2020).
Model recommenders
The devil is in the data (rather than in the details)
Google’s AI-related software
The lines of code in Google’s AI-related software
D. Sculley et al., Hidden technical debt in machine learning systems, in Proc. 28th Int. Conf. Neural Information Processing Systems, vol. 2. Cambridge, MA: MIT Press, pp. 2503–2511. [Online]. Available: http://dl.acm.org/citation.cfm?id=2969442.2969519
Model recommenders
The devil is in the data (rather than in the details)
The availability of source code forges enabled so many research directions and possibilities in EMSE
What’s the situation concerning repositories of modeling artifacts?
All of them seem to struggle to attract contributions from the community
CloudMDE 2015
Model-Driven Engineering on and for the Cloud
Proceedings of the
3rd International Workshop on Model-Driven Engineering on and for the Cloud
18th International Conference on Model Driven Engineering Languages and Systems
(MoDELS 2015)
Ottawa, Canada, September 29, 2015.
Edited by Richard Paige, Jordi Cabot, Marco Brambilla, James H. Hill
My main points to conclude
The devil is in the data (rather than in the details)
My “fear” is that:
- technologies are there
- knowledge and expertise are there
But we are missing the necessary raw material
- there are alternatives (e.g., use of synthetic data) even though they might enable only sub-optimal solutions
– Recommendation Systems
– Recommendation Systems in Software Engineering
– Developing Recommendation Systems: Challenges and Lessons learned
– What about Model Recommenders?
Eclipse SCAVA project
eclipse.org/scava

On the way of listening to the crowd for supporting modeling activities

  • 1. http://people.disim.univaq.it/diruscio/ davide.diruscio@univaq.it @ddiruscio Dipartimento di Ingegneria e Scienze Università degli Studi dell’Aquila dell’Informazione e Matematica On the way of listening to the crowd for supporting modeling activities Davide Di Ruscio
  • 2. 2
  • 4. 4 Recommendation systems Information filtering systems Deal with choice overload Focused on user’s: – Preferences – Interest – Observed Behaviour https://www.slideshare.net/CrossingMinds/recommendation-system-explained?from_action=save
  • 5. 5 Recommendation systems - Examples Facebook–“People You May Know” Netflix–“Other Movies You May Enjoy” LinkedIn–“Jobs You May Be Interested In” Amazon–“Customer who bought this item also bought …” YouTube–“Recommended Videos” Google–“Search results adjusted” Pinterest–“Recommended Images” … https://www.slideshare.net/CrossingMinds/recommendation-system-explained?from_action=save
  • 6. 6 Recommendation systems Recommendation systems (RS) help to match users with items – Ease information overload Different system designs / paradigms – Based on availability of exploitable data – Implicit and explicit user feedback – Domain characteristics RS are software agents that elicit the interests and preferences of individual consumers […] and make recommendations accordingly. They have the potential to support and improve the quality of the decision's consumers make while searching for and selecting products online. [Xiao & Benbasat, MISQ, 2007] http://clgiles.ist.psu.edu/IST441/materials/powerpoint/RC/rec.pptx
  • 7. 7 Recommendation systems RS seen as a function Given: – User model (e.g. ratings, preferences, demographics, situational context) – Items (with or without description of item characteristics) Find: – Relevance score. Used for ranking. Finally: – Recommend items that are assumed to be relevant http://clgiles.ist.psu.edu/IST441/materials/powerpoint/RC/rec.pptx
  • 8. 8 The road ahead Recommendation Systems Recommendation Systems in Software Engineering Developing Recommendation Systems: Challenges and Lessons learned What about Model Recommenders?
  • 10. 10 Recommendation Systems in Software Engineering A recommendation system in software engineering is “. . . a software application that provides information items estimated to be valuable for a software engineering task in a given context.”
  • 11. 11 Recommendation Systems in Software Engineering Data Preprocessing Capturing Context Producing Recommendations Presenting Recommendations
  • 14. 14 Software Analytics "Software analytics is analytics on software data for managers and software engineers with the aim of empowering software development individuals and teams to gain and share insight form their data to make better decisions." R. Buse, T. Zimmermann. Information Needs for Software Development Analytics. Proc. Int'l Conf. Software Engineering (ICSE), IEEE CS, 2012
  • 15. 15 Mining Software Repositories field The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects. http://www.msrconf.org/ Q&A systems Bug Reports API Documentation
  • 16. 16 Some numbers on EMSE research Research on empirical software engineering has increasingly used data made available in online repositories or collective efforts Cumulative number of FOSS projects per year Average number of FOSS projects per year
  • 17. Today, GitHub hosts more than 94 Millions of repositoriesPhilippe Krief, Eclipse Foundation
  • 18. Today, GitHub hosts more than 94 Millions of repositoriesPhilippe Krief, Eclipse Foundation
  • 19. Today, GitHub hosts more than 94 Millions of repositoriesPhilippe Krief, Eclipse Foundation
  • 20. Today, GitHub hosts more than 94 Millions of repositoriesPhilippe Krief, Eclipse Foundation
  • 22. 22 Context Source code Q&A systems Bug Reports API Documentation Tutorials Configuration Management Systems Development of new software systems by reusing existing open source components
  • 23. 23 Mining and Knowledge Extraction Tools Source code Q&A systems Bug Reports API Documentation Tutorials Configuration Management Systems Advanced IDEs CROSSMINER: high-level view
  • 24. 24 CROSSMINER: high-level view Data Preprocessing Capturing Context Producing Recommendations Presenting Recommendations
  • 25. 25 Mining and Analysis Tools CROSSMINER: high-level view Data Preprocessing Capturing Context Producing Recommendations Presenting Recommendations Knowledge Base Source Code Miner NLP Miner Configuration Miner Cross project Analysis OSS forges Source Code Natural language channels Configuration Scripts lookup/store mine
  • 26. 26 CROSSMINER: high-level view Data Preprocessing Capturing Context Producing Recommendations Presenting Recommendations Developer IDE Knowledge Base query recommendations Data Storage Real-time recommendations that serve productivity and quality increase
  • 27. 27 Examples of recommendations Use of machine learning algorithms to produce recommendations during development: – Depending on the set of selected third-party libraries, the system is able to recommend additional libraries that should be included in the project being developed – Given a selected library, the system is able to suggest alternative ones that share some similarities with the selected one – Depending on the set of selected libraries, the system shows API documentation and Q&A posts that can help developers to understand how to use the selected libraries – During the development, developers get recommendations about API function calls and usage patterns that might be used – …
  • 28. The CROSSMINER Recommendation Systems: CrossSim – Recommending similar projects; CrossRec – Recommending third-party libraries; FOCUS – Recommending API function calls and usage patterns; MNBN – Recommending GitHub topics; PostFinder – Recommending Stack Overflow posts
  • 29. The CROSSMINER Recommendation Systems: CrossSim – Recommending similar projects; CrossRec – Recommending third-party libraries; FOCUS – Recommending API function calls and usage patterns; MNBN – Recommending GitHub topics; PostFinder – Recommending Stack Overflow posts
  • 30.
  • 31. Overview of CrossSim: Cross Project Relationships for Computing Open Source Software Similarity. Graphs represent different kinds of relationships in the OSS ecosystem, e.g., developers commit to repositories, users star repositories, projects contain source code files, etc.
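The graph idea on this slide can be illustrated with a minimal sketch. It assumes a tiny hand-made graph (all node names are hypothetical) and uses a simple shared-neighbour (Jaccard) measure as a stand-in; it is not necessarily the similarity algorithm CrossSim itself applies.

```python
from collections import defaultdict

# Hypothetical OSS ecosystem graph as (source, relation, target) edges.
edges = [
    ("dev:alice", "commits", "repo:A"),
    ("dev:alice", "commits", "repo:B"),
    ("user:bob", "stars", "repo:A"),
    ("user:bob", "stars", "repo:B"),
    ("user:carol", "stars", "repo:C"),
    ("repo:A", "contains", "file:Parser.java"),
    ("repo:C", "contains", "file:Parser.java"),
]

def neighbours(edges):
    """Build an undirected adjacency map from the edge list."""
    adj = defaultdict(set)
    for src, _relation, dst in edges:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj

def repo_similarity(edges, a, b):
    """Jaccard overlap of the nodes directly connected to two repositories."""
    adj = neighbours(edges)
    union = adj[a] | adj[b]
    return len(adj[a] & adj[b]) / len(union) if union else 0.0
```

With this data, `repo:A` and `repo:B` share a committer and a stargazer, so they score higher than `repo:A` and `repo:C`, which share only one file node.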
  • 32. CrossSim – Recommending similar projects; CrossRec – Recommending third-party libraries; FOCUS – Recommending API function calls and usage patterns; MNBN – Recommending GitHub topics; PostFinder – Recommending Stack Overflow posts
  • 33.
  • 34. Collaborative-Filtering Recommendation (CROSSMINER Lisbon Meeting, 27-28 February 2018) ◼ User-item matrix: ratings given to pizza restaurants (R1, R2, R3) by customers: C1 = 5, 5, 2; C2 = 3, 3, 4; C3 = 5, 5, ? ◼ Unknown ratings can be deduced from the most similar customers
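The matrix on this slide can be worked through in a few lines. This is a minimal sketch that assumes an inverse-distance similarity over co-rated restaurants and a similarity-weighted average; the slide does not say which similarity measure or aggregation was intended, so both choices are assumptions.

```python
import math

# Ratings from the slide; C3 has not rated R3 yet.
ratings = {
    "C1": {"R1": 5, "R2": 5, "R3": 2},
    "C2": {"R1": 3, "R2": 3, "R3": 4},
    "C3": {"R1": 5, "R2": 5},
}

def similarity(a, b):
    """Inverse Euclidean distance over co-rated items, in (0, 1]."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dist = math.sqrt(sum((a[i] - b[i]) ** 2 for i in shared))
    return 1.0 / (1.0 + dist)

def predict(ratings, user, item):
    """Similarity-weighted average of the ratings other users gave to item."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        weight = similarity(ratings[user], their)
        num += weight * their[item]
        den += weight
    return num / den if den else None

prediction = predict(ratings, "C3", "R3")
```

C3 rated R1 and R2 exactly as C1 did, so C1 dominates the weighted average and the predicted rating for R3 comes out at roughly 2.4, much closer to C1's 2 than to C2's 4.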
  • 35. CrossRec: Projects-Libraries Representation ◼ Representing the project-library relationships using a user-item ratings matrix ◼ Predict the inclusion of additional libraries
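The same collaborative-filtering idea carries over when projects play the role of users and libraries the role of items. The sketch below assumes a binary "project includes library" relation, Jaccard similarity between projects, and made-up project data; it illustrates the representation, not CrossRec's actual implementation.

```python
# Hypothetical project -> included libraries data (illustrative names only).
projects = {
    "p1": {"junit", "slf4j", "guava"},
    "p2": {"junit", "slf4j", "log4j"},
    "p3": {"junit", "guava", "commons-io"},
    "p4": {"spring-core", "jackson"},
}

def jaccard(a, b):
    """Overlap between two library sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(projects, current_libs, k=2, n=3):
    """Rank libraries used by the k most similar projects but not yet included."""
    nearest = sorted(
        ((jaccard(current_libs, libs), name) for name, libs in projects.items()),
        reverse=True,
    )[:k]
    scores = {}
    for sim, name in nearest:
        for lib in projects[name] - current_libs:
            scores[lib] = scores.get(lib, 0.0) + sim
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda lib: (-scores[lib], lib))[:n]

suggestions = recommend(projects, {"junit", "slf4j"})
```

For a project that already includes `junit` and `slf4j`, the two nearest projects are `p1` and `p2`, so the libraries they add (`guava` and `log4j`) are the ones suggested.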
  • 36. CrossSim – Recommending similar projects; CrossRec – Recommending third-party libraries; FOCUS – Recommending API function calls and usage patterns; MNBN – Recommending GitHub topics; PostFinder – Recommending Stack Overflow posts
  • 37. Problem: “Which API methods should this piece of client code invoke, considering that it has already invoked these other API methods?”
  • 38. Explanatory example: method under development
  • 39. Explanatory example: method declaration. Method declaration (MD), method invocations (MI)
  • 40. Explanatory example: complete method declaration
  • 41. Context-aware recommendation (University of L'Aquila, CROSSMINER Toulouse Meeting, 10-12 June 2018). Examples of context: day of the week, hour of the day, weather conditions, …
  • 42. Context-aware recommendation: predict the inclusion of additional invocations
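One way to picture "predicting additional invocations" is to treat the enclosing project as context, the method declaration as the user, and the method invocation as the item. The sketch below is a simplified illustration under those assumptions: the mined data, the names, and the weighting (project similarity times declaration similarity) are all hypothetical and are not taken from the FOCUS implementation.

```python
# Hypothetical mined data: project -> method declaration -> invoked API methods.
mined = {
    "projA": {
        "readConfig": {"File.exists", "Files.readAllLines", "List.size"},
        "saveLog": {"Files.write", "String.getBytes"},
    },
    "projB": {
        "loadLines": {"File.exists", "Files.readAllLines", "String.trim"},
    },
}

def jaccard(a, b):
    """Overlap between two sets of invocations, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_invocations(mined, active_project_apis, active_invocations, n=3):
    """Rank invocations missing from the method under development."""
    scores = {}
    for declarations in mined.values():
        # Context: how similar is the mined project to the active one?
        project_apis = set().union(*declarations.values())
        project_sim = jaccard(active_project_apis, project_apis)
        for invocations in declarations.values():
            # How similar is this declaration to the method being written?
            weight = project_sim * jaccard(active_invocations, invocations)
            for inv in invocations - active_invocations:
                scores[inv] = scores.get(inv, 0.0) + weight
    ranked = sorted(scores, key=lambda inv: (-scores[inv], inv))
    return [inv for inv in ranked if scores[inv] > 0][:n]

suggested = recommend_invocations(
    mined,
    active_project_apis={"File.exists", "Files.readAllLines", "Files.write"},
    active_invocations={"File.exists"},
)
```

Here the method under development has only called `File.exists`; both similar declarations go on to call `Files.readAllLines`, so it is ranked first.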
  • 43.
  • 44. CrossSim – Recommending similar projects; CrossRec – Recommending third-party libraries; FOCUS – Recommending API function calls and usage patterns; MNBN – Recommending GitHub topics; PostFinder – Recommending Stack Overflow posts
  • 45.
  • 46.
  • 47. CrossSim – Recommending similar projects; CrossRec – Recommending third-party libraries; FOCUS – Recommending API function calls and usage patterns; MNBN – Recommending GitHub topics; PostFinder – Recommending Stack Overflow posts
  • 51. Development of the CROSSMINER recommendation systems: main activities
  • 52. Requirement elicitation phase: main challenge. Clear understanding of the needed recommendation systems: • Understanding the functionalities that the final users expect from the envisioned recommendation systems • Otherwise, you risk spending time developing systems whose recommendations are not relevant to, or in line with, the actual user needs.
  • 53. Requirement elicitation phase: main challenge. Solution employed in CROSSMINER – We implemented demo projects that reflected real-world scenarios – They provided explanatory context inputs and the corresponding recommendation items that the envisioned recommendation systems should be able to produce.
  • 54. Development phase: main challenge. Clear awareness of existing recommendation techniques – Knowledge of techniques and patterns that might be employed – Comparing and evaluating candidate approaches can be a daunting task
  • 55. Development phase: main challenge. Applied solution – Significant effort was devoted to analyzing existing approaches that could serve as starting points. Phases: Data Preprocessing, Capturing Context, Producing Recommendations, Presenting Recommendations
  • 56.
  • 57. Evaluation phase: main challenge. There is no golden rule for evaluating all possible recommendation systems, because of their intrinsic features and their heterogeneity – Which evaluation methodology is suitable? – Which metric(s) can be used? – Which dataset is eligible/available for evaluation? – Which baseline(s) can the system be compared with?
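As one possible answer to the "which metric(s)" question, precision and recall over the top-k recommended items are a common choice when part of the ground truth can be held out. The sketch below assumes that setting; the function name and the sample data are hypothetical, and the slides do not state which metrics were used for each system.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the first k recommended items.

    recommended: ranked list produced by the system.
    relevant: set of held-out items considered correct (ground truth).
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical run: three libraries were hidden from a project and the
# recommender returned this ranked list.
ranked = ["guava", "log4j", "jackson", "gson"]
held_out = {"guava", "gson", "junit"}
precision, recall = precision_recall_at_k(ranked, held_out, k=3)
```

With k=3 only `guava` is a hit, so both precision and recall are one third; raising k to 4 also picks up `gson`.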
  • 58. Lessons learned. User scepticism: target users might be sceptical about the relevance of the items that can be recommended. Quality of data: large, high-quality datasets are important for training and evaluation activities. Baseline availability: it is not always possible to reuse the tools and data of the identified baselines
  • 59. Lessons learned. In the FOCUS evaluation, one of the considered datasets initially consisted of 5,147 Java projects retrieved from the Software Heritage archive. To comply with the requirements of the baseline and of FOCUS, we had to restrict the dataset - we ended up with a dataset consisting of 610 Java projects - that is, we had to create a dataset ten times bigger than the one actually used for the evaluation
  • 61. Model recommenders. A recommender system for model driven software engineering can combine data from different sources in order to infer a list of relevant and actionable model changes in real time. Stefan Kögel, Recommender system for model driven software development, ESEC/FSE 2017: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering
  • 62. Model recommenders. Recommendation systems for supporting - the development of metamodels - the development of models - the development of model-to-model transformations - …
  • 63. Model recommenders. Mussbacher, G., Combemale, B., Kienzle, J. et al. Opportunities in intelligent modeling assistance. Softw Syst Model 19, 1045–1053 (2020).
  • 64. Model recommenders. The devil is in the details (data)
  • 65. Google’s AI-related software: the lines of code in Google’s AI-related software. D. Sculley et al., Hidden technical debt in machine learning systems, in Proc. 28th Int. Conf. Neural Information Processing Systems, vol. 2. Cambridge, MA: MIT Press, pp. 2503–2511. [Online]. Available: http://dl.acm.org/citation.cfm?id=2969442.2969519
  • 66. Model recommenders. The devil is in the details (data)
  • 67. Model recommenders. The devil is in the details (data). The availability of source code forges has enabled many research directions and possibilities in EMSE. What’s the situation concerning repositories of modeling artifacts?
  • 68. Model recommenders. The devil is in the details (data). The availability of source code forges has enabled many research directions and possibilities in EMSE. What’s the situation concerning repositories of modeling artifacts? All of them seem to struggle to attract contributions from the community
  • 69. CloudMDE 2015: Model-Driven Engineering on and for the Cloud. Proceedings of the 3rd International Workshop on Model-Driven Engineering on and for the Cloud, 18th International Conference on Model Driven Engineering Languages and Systems (MoDELS 2015), Ottawa, Canada, September 29, 2015. Edited by Richard Paige, Jordi Cabot, Marco Brambilla, James H. Hill
  • 70. CloudMDE 2015: Model-Driven Engineering on and for the Cloud. Proceedings of the 3rd International Workshop on Model-Driven Engineering on and for the Cloud, 18th International Conference on Model Driven Engineering Languages and Systems (MoDELS 2015), Ottawa, Canada, September 29, 2015. Edited by Richard Paige, Jordi Cabot, Marco Brambilla, James H. Hill
  • 71. My main points to conclude. The devil is in the details (data). My “fear” is that: - technologies are there - knowledge and expertise are there. But we are missing the necessary raw material - there are alternatives (e.g., use of synthetic data), even though they might enable only sub-optimal solutions
  • 72. Recommendation Systems; Recommendation Systems in Software Engineering; Developing Recommendation Systems: Challenges and Lessons learned; What about Model Recommenders?