http://people.disim.univaq.it/diruscio/
davide.diruscio@univaq.it
@ddiruscio
FOCUS: A Recommender System for
Mining API Function Calls and
Usage Patterns
Davide Di Ruscio
Joint work with Phuong T. Nguyen, Juri Di Rocco, Lina Ochoa, Thomas Degueule, Massimiliano Di Penta
ICSE 2019 – May 31, 2019 – Montréal, QC, Canada
Context
Development of new software systems by reusing existing open source components
Related activities
• Searching for candidate components
• Evaluating a set of retrieved candidate components to find the most suitable one
• Understanding how to use the selected components
• Monitoring the selected components
www.crossminer.org
@crossminer
eclipse.org/scava
CROSSMINER: high-level view
[Figure labels: Mining and Knowledge Extraction Tools; Source code; Q&A systems; Bug Reports; API Documentation; Tutorials; Configuration Management Systems; Advanced IDEs]
Bringing to the domain of software development the notion of
recommendation systems that are typically used for popular e-commerce
systems to present users with interesting items previously unknown to
them
Kinds of recommendation
• Depending on the set of selected third-party libraries, the system can recommend additional libraries that should be included in the project being developed
• Given a selected library, the system can suggest alternative ones that share some similarities with it
• Depending on the set of selected libraries, the system shows API documentation and Q&A posts that can help developers understand how to use the selected libraries
• During development, developers get recommendations about API function calls and usage patterns that might be used
…
Problem
“Which API methods should this piece of client code
invoke, considering that it has already invoked these
other API methods?”
Explanatory example: method under development
Explanatory example: method declaration
Method declaration (MD)
Method invocations (MI)
Explanatory example: complete method declaration
Explanatory example: requested recommendations
List of API function calls:
• get, equal, where, select, ...
Usage patterns:
• Snippets of code containing the recommended function calls
FOCUS
It recommends API FunctiOn Calls and USage patterns
It works on the basis of a context-aware collaborative-filtering system
Collaborative-Filtering Technique
Recommend products to customers with similar preferences
Image source: https://towardsdatascience.com/various-implementations-of-collaborative-filtering-100385c6dfe0
      R1   R2   R3
c1    5    5    2
c2    3    3    4
c3    5    5    ?
User-item matrix: Ratings given to Pizza restaurants by customers
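A minimal sketch of how the missing rating for c3 on R3 could be filled in with user-based collaborative filtering. The slide does not say which similarity measure is used, so the choice of 1 / (1 + Euclidean distance) over co-rated restaurants is an assumption made only for illustration, as are the function names.

```python
from math import sqrt

# Ratings from the slide: one row per customer, columns are R1, R2, R3.
# None marks the unknown rating to be predicted.
ratings = {
    "c1": [5, 5, 2],
    "c2": [3, 3, 4],
    "c3": [5, 5, None],
}

def similarity(u, v):
    """Assumed measure: 1 / (1 + Euclidean distance) over co-rated items."""
    shared = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if not shared:
        return 0.0
    distance = sqrt(sum((a - b) ** 2 for a, b in shared))
    return 1.0 / (1.0 + distance)

def predict(target, item):
    """Similarity-weighted average of the other customers' ratings for `item`."""
    num = den = 0.0
    for name, row in ratings.items():
        if name == target or row[item] is None:
            continue
        weight = similarity(ratings[target], row)
        num += weight * row[item]
        den += weight
    return num / den if den else None

predicted = predict("c3", 2)  # roughly 2.41
```

Because c3 rated R1 and R2 exactly like c1, the prediction lands much closer to c1's rating (2) than to c2's (4).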
Context-aware recommendation
Examples of context: day of the week, hour of the day, weather conditions, …
Predict the inclusion of additional invocations
FOCUS architecture
Code Parser
The available OSS repositories are mined to extract for each project:
- Method declarations
- Method invocations
- Field accesses
- Interface implementations
- Class extensions
- …
Rascal
Metaprogramming Language
https://www.rascal-mpl.org/
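To make the kind of facts extracted by the Code Parser concrete, here is a small sketch. It is not the Rascal-based parser from the slides: it uses Python's standard `ast` module on Python source purely as an analogy, and the sample function and all of its names are hypothetical.

```python
import ast

def extract_declarations(source):
    """Map each function name to the sorted names of the calls in its body."""
    tree = ast.parse(source)
    result = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            calls = set()
            for sub in ast.walk(node):
                if isinstance(sub, ast.Call):
                    func = sub.func
                    if isinstance(func, ast.Attribute):   # obj.method(...)
                        calls.add(func.attr)
                    elif isinstance(func, ast.Name):      # plain function(...)
                        calls.add(func.id)
            result[node.name] = sorted(calls)
    return result

# Hypothetical client code, loosely shaped like a query-building method.
sample = '''
def find_rows(builder, session):
    query = builder.create_query()
    root = query.select_from("Row")
    query.where(builder.equal(root.get("id"), 1))
    return session.run(query)
'''

facts = extract_declarations(sample)
```

Each key plays the role of a method declaration and each value the role of its method invocations.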
Data encoder
Extracted method declarations and invocations of each project are
represented in a corresponding rating matrix
Representation of Projects-MDs-MIs
3D user-item-context
ratings matrix
Mappings:
– contexts ←→ projects
– users ←→ declarations
– items ←→ invocations
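The mapping above can be sketched as a nested structure indexed by project, declaration, and invocation. The binary encoding (1 if the declaration contains the invocation, 0 otherwise) is an assumption, and the project and declaration names are hypothetical.

```python
# Hypothetical mined data: project -> declaration -> set of invocations.
projects = {
    "p1": {"d1": {"get", "where"}, "d2": {"select"}},
    "p2": {"d3": {"get", "equal", "where"}},
}

def build_rating_tensor(projects):
    """Return (tensor, invocations) where tensor[p][d][k] is 1 if declaration d
    of project p contains invocations[k], and 0 otherwise."""
    invocations = sorted({
        inv
        for declarations in projects.values()
        for invs in declarations.values()
        for inv in invs
    })
    tensor = {
        project: {
            decl: [1 if inv in invs else 0 for inv in invocations]
            for decl, invs in declarations.items()
        }
        for project, declarations in projects.items()
    }
    return tensor, invocations

tensor, invocations = build_rating_tensor(projects)
```

Projects act as contexts, declarations as users, and invocations as items, so each project contributes one declaration-by-invocation slice.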
Similarity calculator
Given an active declaration in an active project, we find the subset of:
- the most similar projects
- and then the most similar declarations in those similar projects
Similarity calculator: Projects and method declarations
Graph-based representation of projects and invocations
The similarity of two projects p and q is calculated by considering their feature vectors (TF-IDF)
The similarities among method declarations are calculated using the Jaccard similarity index
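The two similarity measures named above can be sketched as follows. The slide only mentions TF-IDF feature vectors for projects; comparing them with cosine similarity, and the exact TF and IDF weighting used here, are assumptions. The project names are hypothetical.

```python
from math import log, sqrt

def jaccard(a, b):
    """Jaccard index of two sets of invocations."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def tf_idf_vectors(projects):
    """projects: name -> list of invocation names (repetitions allowed).
    Returns name -> {invocation: tf * idf}. An invocation used by every
    project gets weight 0."""
    n = len(projects)
    doc_freq = {}
    for invs in projects.values():
        for term in set(invs):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {
        name: {t: invs.count(t) * log(n / doc_freq[t]) for t in set(invs)}
        for name, invs in projects.items()
    }

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical projects and the invocations they contain.
vectors = tf_idf_vectors({
    "p": ["get", "get", "where"],
    "q": ["get", "select"],
    "r": ["where", "equal"],
})
sim_pq = cosine(vectors["p"], vectors["q"])
sim_pr = cosine(vectors["p"], vectors["r"])
decl_sim = jaccard({"get", "where"}, {"get", "equal", "where"})
```

Here p is closer to q than to r because p calls `get` twice, and two declarations sharing two of three distinct invocations have a Jaccard index of 2/3.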
Recommendation engine: API function calls
Generation of a ranked list of API function calls
• Additional invocations for the active declaration are predicted by
computing the missing ratings
• Ranked list of invocations with scores in descending order
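A simplified sketch of how a ranked list could be produced. It scores each candidate invocation by the similarity-weighted share of neighbouring declarations that contain it. This is a stand-in for the missing-rating computation and not necessarily the formula used by FOCUS; the Jaccard weighting and the function names are assumptions.

```python
def jaccard(a, b):
    """Jaccard index of two sets of invocations."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(active, neighbours, top_n=3):
    """active: invocations already present in the declaration under development.
    neighbours: invocation sets of similar declarations.
    Returns up to top_n (invocation, score) pairs in descending score order."""
    weights = [jaccard(active, n) for n in neighbours]
    total = sum(weights)
    if total == 0:
        return []
    scores = {}
    for weight, invs in zip(weights, neighbours):
        if weight == 0:
            continue  # unrelated declarations contribute nothing
        for inv in invs - active:
            scores[inv] = scores.get(inv, 0.0) + weight / total
    # Sort by score (descending), then by name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

ranked = recommend(
    active={"get", "where"},
    neighbours=[
        {"get", "where", "equal"},
        {"get", "where", "equal", "select"},
        {"close"},
    ],
)
```

In this toy input `equal` appears in every similar declaration and is ranked first, followed by `select`; `close` comes from an unrelated declaration and is not recommended.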
Recommendation engine: API usage patterns
From the ranked list, the top-N method invocations are used as a query to search for relevant declarations
Source code snippets containing the identified relevant declarations are retrieved from the available source code base
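The retrieval step can be sketched as a nearest-declaration lookup. Using the Jaccard index to match the query against stored declarations is an assumption, and the code base entries and snippet strings below are hypothetical placeholders.

```python
def jaccard(a, b):
    """Jaccard index of two sets of invocations."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def find_usage_patterns(query, code_base, k=1):
    """Return the snippets of the k declarations most similar to the query.
    code_base: list of dicts with 'name', 'invocations' (set) and 'snippet'."""
    scored = [(jaccard(query, d["invocations"]), d) for d in code_base]
    scored = [(s, d) for s, d in scored if s > 0]        # drop unrelated ones
    scored.sort(key=lambda sd: (-sd[0], sd[1]["name"]))  # best match first
    return [d["snippet"] for _, d in scored[:k]]

# Hypothetical code base mined from training projects.
code_base = [
    {"name": "findAll", "invocations": {"get", "select"},
     "snippet": "<source of findAll>"},
    {"name": "findById", "invocations": {"get", "equal", "where", "select"},
     "snippet": "<source of findById>"},
    {"name": "shutdown", "invocations": {"flush", "close"},
     "snippet": "<source of shutdown>"},
]

# Top-N recommended invocations used as the query.
patterns = find_usage_patterns({"equal", "where", "select"}, code_base, k=2)
```

The declaration sharing the most invocations with the query is returned first, so the developer sees the recommended calls in a realistic surrounding context.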
Evaluation
Assessing FOCUS capability to recommend API function calls
– Accuracy (precision and recall)
– Success rate
– Time performance
Comparing FOCUS with a state-of-the-art tool (PAM*)
Two dataset sources:
– More than 600 GitHub projects retrieved from Software Heritage
– A set of 3,600 jars retrieved from Maven Central
* Jaroslav Fowkes, Charles Sutton. Parameter-free probabilistic API mining across GitHub. Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2016)
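The accuracy metrics listed on this slide can be sketched as below. The definitions used (precision as matches over recommended items, recall as matches over ground-truth items, success rate as the fraction of queries with at least one match) are the common ones and are assumed here; the slide does not spell them out.

```python
def precision_recall(recommended, ground_truth):
    """Precision and recall of one ranked list against its ground truth."""
    matches = len(set(recommended) & set(ground_truth))
    precision = matches / len(recommended) if recommended else 0.0
    recall = matches / len(ground_truth) if ground_truth else 0.0
    return precision, recall

def success_rate(results):
    """results: list of (recommended, ground_truth) pairs, one per query.
    A query succeeds if at least one recommended item is in its ground truth."""
    if not results:
        return 0.0
    hits = sum(1 for rec, truth in results if set(rec) & set(truth))
    return hits / len(results)

# Hypothetical query results.
p, r = precision_recall(
    ["equal", "select", "close"],
    {"equal", "select", "orderBy", "getResultList"},
)
rate = success_rate([
    (["equal"], {"equal", "where"}),   # hit
    (["close"], {"select"}),           # miss
])
```

With two of three recommendations correct and two of four ground-truth items found, precision is 2/3 and recall is 0.5.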
Evaluation process
[Figure: evaluation process; diagram labels: Source Code, metadata]
Evaluation process: testing project
Total number of declarations
Declarations that are kept (the rest are discarded)
Total number of invocations in a given declaration
Invocations that are used as query
Four different configurations
Declarations kept in the testing project:
- C1.1, C1.2: the first half of the declarations is used as testing data and the second half is removed
- C2.1, C2.2: the last method declaration is selected as testing data and all the remaining declarations are used as training data
Invocations used as query:
- C1.1, C2.1: only the first invocation is provided as a query, and the rest is used as ground-truth data
- C1.2, C2.2: four invocations are provided as a query, and the rest is used as ground-truth data
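The four configurations can be sketched as a small splitting routine. Treating the last kept declaration as the one under test, and the function and field names, are assumptions made for illustration; the declarations below are hypothetical.

```python
# Configuration -> how many declarations are kept and how large the query is.
CONFIGS = {
    "C1.1": {"first_half_only": True, "query_size": 1},
    "C1.2": {"first_half_only": True, "query_size": 4},
    "C2.1": {"first_half_only": False, "query_size": 1},
    "C2.2": {"first_half_only": False, "query_size": 4},
}

def prepare_testing_project(declarations, config):
    """declarations: ordered list of (name, [invocations]).
    Returns (context, active_name, query, ground_truth), where the last kept
    declaration is assumed to be the one under test."""
    cfg = CONFIGS[config]
    if cfg["first_half_only"]:
        kept = declarations[: len(declarations) // 2]
    else:
        kept = list(declarations)
    if not kept:
        raise ValueError("not enough declarations for this configuration")
    context, (active_name, invocations) = kept[:-1], kept[-1]
    query = invocations[: cfg["query_size"]]
    ground_truth = invocations[cfg["query_size"]:]
    return context, active_name, query, ground_truth

# Hypothetical testing project with four declarations.
declarations = [
    ("d1", ["open", "read"]),
    ("d2", ["get", "equal", "where", "select", "orderBy"]),
    ("d3", ["close"]),
    ("d4", ["i1", "i2", "i3", "i4", "i5", "i6"]),
]
c11 = prepare_testing_project(declarations, "C1.1")
c22 = prepare_testing_project(declarations, "C2.2")
```

C1.x simulates an early stage of development with few declarations available, while C2.x simulates a nearly finished project; x.1 and x.2 vary how much of the active declaration is already written.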
Evaluation key points
The performance of FOCUS relies on the availability of background data
– the system becomes more effective as more OSS projects are available for recommendation
Accuracy improves substantially when the query contains more invocations
[Figure: precision and recall for C1.1 and C1.2 on the SH dataset]
[Figure: precision and recall for C1.1 and C1.2 on the MV dataset]
Evaluation key points
A dataset consisting of only 200 projects has been considered
Leave-one-out cross-validation has been performed to exploit as much as possible the projects available as background data, given a testing project
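Leave-one-out cross-validation over projects can be sketched as follows: each project is used once as the testing project while all the others serve as background data. The function name and the project data are hypothetical.

```python
def leave_one_out(projects):
    """Yield (testing_project, training_projects) pairs, one per project."""
    names = list(projects)
    for held_out in names:
        training = {n: projects[n] for n in names if n != held_out}
        yield held_out, training

# Hypothetical dataset: project name -> mined declarations (details omitted).
dataset = {"p1": ["d1", "d2"], "p2": ["d3"], "p3": ["d4", "d5"]}
folds = list(leave_one_out(dataset))
```

With N projects this gives N folds, each training on N - 1 projects, which is useful when the dataset is small.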
PAM requires 9 seconds to provide each recommendation, while FOCUS needs just 0.095 seconds
What’s next
Embedding FOCUS directly into the Eclipse IDE
– Under development in CROSSMINER
A user study to thoroughly evaluate the system’s performance
Conclusions
https://github.com/crossminer/FOCUS

What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Best Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdfBest Web Development Agency- Idiosys USA.pdf
Best Web Development Agency- Idiosys USA.pdf
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 

FOCUS: A Recommender System for Mining API Function Calls and Usage Patterns

  • 1. FOCUS: A Recommender System for Mining API Function Calls and Usage Patterns. Davide Di Ruscio (http://people.disim.univaq.it/diruscio/, davide.diruscio@univaq.it, @ddiruscio). Joint work with Phuong T. Nguyen, Juri Di Rocco, Lina Ochoa, Thomas Degueule, Massimiliano Di Penta
  • 2. ICSE 2019 – May 31, 2019 – Montréal, QC, Canada. Context: development of new software systems by reusing existing open source components. Related activities: searching for candidate components; evaluating a set of retrieved candidate components to find the most suitable one; understanding how to use the selected components; monitoring the selected components. www.crossminer.org, @crossminer, eclipse.org/scava
  • 3. CROSSMINER: high-level view. Diagram labels: Mining and Knowledge Extraction Tools; Source code; Q&A systems; Bug Reports; API Documentation; Tutorials; Configuration Management Systems; Advanced IDEs. The goal is to bring to the domain of software development the notion of recommendation systems, which popular e-commerce systems typically use to present users with interesting items previously unknown to them
  • 4. Kinds of recommendation: (a) depending on the set of selected third-party libraries, the system is able to recommend additional libraries that should be included in the project being developed; (b) given a selected library, the system is able to suggest alternative ones that share some similarities with the selected one; (c) depending on the set of selected libraries, the system shows API documentation and Q&A posts that can help developers understand how to use the selected libraries; (d) during development, developers get recommendations about API function calls and usage patterns that might be used; …
  • 5. Problem: “Which API methods should this piece of client code invoke, considering that it has already invoked these other API methods?”
  • 6. Explanatory example: method under development
  • 7. Explanatory example: method declaration. Figure labels: Method declaration (MD), Method invocations (MI)
  • 8. Explanatory example: complete method declaration
  • 9. Explanatory example: requested recommendations
  • 10. Explanatory example: requested recommendations. List of API function calls: get, equal, where, select, ...
  • 11. Explanatory example: requested recommendations. Usage patterns: snippets of code containing the recommended function calls
  • 12. FOCUS: it recommends API FunctiOn Calls and USage patterns. It works on the basis of a context-aware collaborative-filtering system
  • 13. Collaborative-Filtering Technique: recommend products to customers with similar preferences. Image source: https://towardsdatascience.com/various-implementations-of-collaborative-filtering-100385c6dfe0
  • 14. Collaborative-Filtering Technique (University of L'Aquila, Internal Meeting, 31 October 2017). User-item matrix of ratings given to pizza restaurants (R1, R2, R3) by customers (c1, c2, c3): c1 = (5, 5, 2); c2 = (3, 3, 4); c3 = (5, 5, ?)
  • 15. Context-aware recommendation (University of L'Aquila, CROSSMINER Toulouse Meeting, 10-12 June 2018)
  • 16. Context-aware recommendation. Examples of context: day of the week, hour of the day, weather conditions, …
  • 17. Context-aware recommendation: predict the inclusion of additional invocations
  • 18. FOCUS architecture
  • 19. FOCUS architecture
  • 20. Code Parser: the available OSS repositories are mined to extract, for each project, method declarations, method invocations, field accesses, interface implementations, class extensions, … Tooling: Rascal Metaprogramming Language, https://www.rascal-mpl.org/
  • 21. Code Parser: the available OSS repositories are mined to extract, for each project, method declarations, method invocations, field accesses, interface implementations, class extensions, … Tooling: Rascal Metaprogramming Language, https://www.rascal-mpl.org/
  • 22. FOCUS architecture
  • 23. Data encoder: the method declarations and invocations extracted from each project are represented in a corresponding rating matrix
  • 24. Representation of Projects-MDs-MIs: a 3D user-item-context ratings matrix. Mappings: contexts ←→ projects; users ←→ declarations; items ←→ invocations
  • 25. Similarity calculator: given an active declaration in an active project, we find the subset of the most similar projects, and then the most similar declarations in those similar projects
  • 26. Similarity calculator: projects and method declarations. Graph-based representation of projects and invocations. The similarity of two projects p and q is calculated by considering their feature vectors (TF-IDF). The similarities among method declarations are calculated using the Jaccard similarity index
  • 27. FOCUS architecture
  • 28. Recommendation engine: API function calls. Generation of a ranked list of API function calls: additional invocations for the active declaration are predicted by computing the missing ratings; the result is a list of invocations ranked by score in descending order
  • 29. Recommendation engine: API usage patterns. From the ranked list, the top-N method invocations are used as a query to search for relevant declarations. Source code snippets containing the identified relevant declarations are retrieved from the available source code base
  • 30. Evaluation. Assessing FOCUS capability to recommend API function calls: accuracy (precision and recall), success rate, time performance. Comparing FOCUS with a state-of-the-art tool (PAM*). Two dataset sources: more than 600 GitHub projects retrieved from Software Heritage; a set of 3,600 jars retrieved from Maven Central. * Jaroslav Fowkes, Charles Sutton. Parameter-free probabilistic API mining across GitHub, Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2016)
  • 31. Evaluation process: Source Code metadata
  • 32. Evaluation process: testing project. Figure labels: total number of declarations; declarations that are kept (the rest are discarded); total number of invocations in a given declaration; invocations that are used as query
  • 33. Evaluation process: testing project. Four different configurations. C1.1 and C1.2: the first half of the declarations is used as testing data and the second half is removed. C2.1 and C2.2: the last method declaration is selected as testing and all the remaining declarations are used as training data. C1.1 and C2.1: only the first invocation is provided as a query, and the rest is used as ground-truth data. C1.2 and C2.2: four invocations are provided as a query, and the rest is used as ground-truth data
  • 34. Evaluation key points. The performance of FOCUS relies on the availability of background data: the system works more effectively when more OSS projects are available for recommendation. Accuracy improves substantially when the query contains more invocations. Figures: precision and recall for C1.1 and C1.2 on the SH (Software Heritage) dataset; precision and recall for C1.1 and C1.2 on the MV (Maven Central) dataset
  • 35. Evaluation key points. A dataset consisting of only 200 projects has been considered. Leave-one-out cross-validation has been performed so that, given a testing project, as many of the available projects as possible are exploited as background data. PAM requires 9 seconds to provide each recommendation, while FOCUS needs just 0.095 seconds
  • 36. What’s next. Embedding FOCUS directly into the Eclipse IDE (under development in CROSSMINER). A user study to thoroughly assess the system’s performance
  • 37. Conclusions: https://github.com/crossminer/FOCUS
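
The similarity and ranking steps in slides 25 to 28, together with the precision and recall metrics in slide 30, can be illustrated with a small Python sketch. This is not the FOCUS implementation: it skips the project-level TF-IDF similarity, treats each declaration as a plain set of invocation names, and scores candidates with a similarity-weighted vote that is assumed here for illustration. The declaration names (`findById`, `listAll`, `save`) and the invocations `from`, `persist`, and `flush` are hypothetical; only `get`, `equal`, `where`, and `select` appear in the slides.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |a ∩ b| / |a ∪ b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_invocations(
    active: Set[str],
    background: Dict[str, Set[str]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Rank invocations missing from the active declaration.

    Assumed scoring (not the formula from the talk): each of the top_k
    most similar background declarations votes for its invocations,
    weighted by its normalised Jaccard similarity to the active one.
    """
    neighbours = sorted(
        ((jaccard(active, invs), name) for name, invs in background.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )[:top_k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return []
    scores: Dict[str, float] = {}
    for sim, name in neighbours:
        for inv in background[name] - active:
            scores[inv] = scores.get(inv, 0.0) + sim / total
    # Descending score; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def precision_recall(
    recommended: List[str], ground_truth: Set[str]
) -> Tuple[float, float]:
    """Precision and recall of a recommendation list against ground truth."""
    hits = len(set(recommended) & ground_truth)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical background declarations mined from similar projects.
background = {
    "findById": {"get", "select", "where", "equal"},
    "listAll": {"get", "select", "from"},
    "save": {"persist", "flush"},
}
# Invocations already written in the declaration under development.
active = {"get", "select"}

ranked = rank_invocations(active, background)
top_2 = [name for name, _ in ranked[:2]]
print(ranked)
print(precision_recall(top_2, {"where", "equal"}))
```

In this toy run the query holds two invocations and the ground truth is the pair the developer eventually writes, which mirrors the evaluation setup where part of a declaration is given as the query and the rest is held back.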