
FOCUS: A Recommender System for Mining API Function Calls and Usage Patterns

Software developers interact with APIs on a daily basis and, therefore, often need to learn how to use new APIs suitable for their purposes. Previous work has shown that recommending usage patterns to developers facilitates the learning process. Current approaches to usage pattern recommendation, however, still suffer from high redundancy and poor run-time performance. In this paper, we reformulate the problem of usage pattern recommendation in terms of a collaborative-filtering recommender system. We present a new tool, FOCUS, which mines open-source project repositories to recommend API method invocations and usage patterns by analyzing how APIs are used in projects similar to the current project. We evaluate FOCUS on a large number of Java projects extracted from GitHub and Maven Central and find that it outperforms the state-of-the-art approach PAM with regard to success rate, accuracy, and execution time. Results indicate the suitability of context-aware collaborative-filtering recommender systems for providing API usage patterns.

  1. FOCUS: A Recommender System for Mining API Function Calls and Usage Patterns. Davide Di Ruscio (davide.diruscio@univaq.it, @ddiruscio, http://people.disim.univaq.it/diruscio/). Joint work with Phuong T. Nguyen, Juri Di Rocco, Lina Ochoa, Thomas Degueule, Massimiliano Di Penta
  2. ICSE 2019 – May 31, 2019 – Montréal, QC, Canada. Context: development of new software systems by reusing existing open source components. Related activities: • Searching for candidate components • Evaluating a set of retrieved candidate components to find the most suitable one • Understanding how to use the selected components • Monitoring the selected components. www.crossminer.org @crossminer eclipse.org/scava
  3. CROSSMINER: high-level view. Bringing to the domain of software development the notion of recommendation systems, which are typically used in popular e-commerce systems to present users with interesting items previously unknown to them. Diagram labels: Mining and Knowledge Extraction Tools; Source code; Q&A systems; Bug Reports; API Documentation; Tutorials; Configuration Management Systems; Advanced IDEs
  4. Kinds of recommendation: • Depending on the set of selected third-party libraries, the system recommends additional libraries that should be included in the project being developed • Given a selected library, the system suggests alternative libraries that share some similarities with it • Depending on the set of selected libraries, the system shows API documentation and Q&A posts that help developers understand how to use them • During development, developers get recommendations about API function calls and usage patterns that might be used • …
  5. Problem: “Which API methods should this piece of client code invoke, considering that it has already invoked these other API methods?”
  6. Explanatory example: method under development
  7. Explanatory example: method declaration. Figure labels: method declaration (MD), method invocations (MI)
  8. Explanatory example: complete method declaration
  9. Explanatory example: requested recommendations
  10. Explanatory example: requested recommendations. List of API function calls: • get, equal, where, select, ...
  11. Explanatory example: requested recommendations. Usage patterns: • Snippets of code containing the recommended function calls
  12. FOCUS: it recommends API FunctiOn Calls and USage patterns. It works on the basis of a context-aware collaborative-filtering system
  13. Collaborative-Filtering Technique: recommend products to customers with similar preferences. Image source: https://towardsdatascience.com/various-implementations-of-collaborative-filtering-100385c6dfe0
  14. Collaborative-Filtering Technique. User-item matrix: ratings given to pizza restaurants (R1, R2, R3) by customers (c1, c2, c3); the entry marked ? is the rating to predict.
           R1  R2  R3
       c1   5   5   2
       c2   3   3   4
       c3   5   5   ?
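To make the user-item matrix concrete, here is a minimal sketch of user-based collaborative filtering on the pizza ratings from slide 14. The similarity measure (inverse Euclidean distance over co-rated restaurants) and the function names are assumptions chosen for illustration; the slides do not say which measure was used in this example.

```python
from math import sqrt

# Ratings from the slide: customers c1-c3, restaurants R1-R3 (None = unknown).
ratings = {
    "c1": {"R1": 5, "R2": 5, "R3": 2},
    "c2": {"R1": 3, "R2": 3, "R3": 4},
    "c3": {"R1": 5, "R2": 5, "R3": None},
}

def similarity(a, b):
    """Assumed measure: 1 / (1 + Euclidean distance) over co-rated items."""
    shared = [i for i in a if a[i] is not None and b.get(i) is not None]
    if not shared:
        return 0.0
    dist = sqrt(sum((a[i] - b[i]) ** 2 for i in shared))
    return 1.0 / (1.0 + dist)

def predict(user, item):
    """Similarity-weighted average of the ratings other users gave to item."""
    num = den = 0.0
    for other, row in ratings.items():
        if other == user or row.get(item) is None:
            continue
        weight = similarity(ratings[user], row)
        num += weight * row[item]
        den += weight
    return num / den if den else None

predicted = predict("c3", "R3")
# c3 rates R1 and R2 exactly like c1, so the estimate leans toward c1's rating of 2.
print(round(predicted, 2))
```

Cosine similarity would not separate c1 from c2 here, because (5, 5) and (3, 3) point in the same direction; that is why this sketch uses a distance-based measure instead.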
  15. Context-aware recommendation
  16. Context-aware recommendation. Examples of context: day of the week, hour of the day, weather conditions, …
  17. Context-aware recommendation: predict the inclusion of additional invocations
  18. FOCUS architecture
  20. Code Parser: the available OSS repositories are mined to extract, for each project: • Method declarations • Method invocations • Field accesses • Interface implementations • Class extensions • … Rascal Metaprogramming Language: https://www.rascal-mpl.org/
  22. FOCUS architecture
  23. Data encoder: the extracted method declarations and invocations of each project are represented in a corresponding rating matrix
  24. Representation of Projects-MDs-MIs: a 3D user-item-context ratings matrix. Mappings: • contexts ←→ projects • users ←→ declarations • items ←→ invocations
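As an illustration of this mapping, the sketch below encodes each project as a binary matrix whose rows are method declarations and whose columns are method invocations; keeping one such matrix per project gives the third (context) dimension. All project, declaration, and invocation names are hypothetical, and the nested-dictionary layout is an assumption rather than the data structure FOCUS itself uses.

```python
# Hypothetical mined data: project -> declaration -> set of invocations.
projects = {
    "projA": {
        "findUser()": {"Session.get", "Query.where"},
        "listUsers()": {"Query.select", "Query.where"},
    },
    "projB": {
        "loadItem()": {"Session.get"},
    },
}

def encode(projects):
    """Return (invocation columns, {project: {declaration: binary row}})."""
    # One shared, sorted column index so rows are comparable across projects.
    columns = sorted({mi for mds in projects.values()
                      for mis in mds.values() for mi in mis})
    tensor = {
        project: {
            md: [1 if mi in mis else 0 for mi in columns]
            for md, mis in mds.items()
        }
        for project, mds in projects.items()
    }
    return columns, tensor

columns, tensor = encode(projects)
print(columns)
for project, rows in tensor.items():
    for md, row in rows.items():
        print(project, md, row)
```

A cell holds 1 when the declaration (user) contains the invocation (item) within that project (context), and 0 otherwise.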
  25. Similarity calculator: given an active declaration in an active project, we find the subset of: • the most similar projects • and then the most similar declarations in those similar projects
  26. Similarity calculator: projects and method declarations. Graph-based representation of projects and invocations. The similarity of two projects p and q is calculated by considering their feature vectors (TF-IDF). The similarities among method declarations are calculated using the Jaccard similarity index
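The two similarity measures named on slide 26 can be sketched as follows. This is a minimal illustration, assuming plain term-frequency times log inverse-document-frequency weights compared with cosine similarity; the exact weighting and the graph construction used by FOCUS are not given in the slides, and the project and invocation names are hypothetical.

```python
from collections import Counter
from math import log, sqrt

def tfidf_vectors(projects):
    """projects: {name: Counter of invocation occurrences} -> TF-IDF vectors."""
    n = len(projects)
    # Document frequency: in how many projects each invocation appears.
    df = Counter(term for counts in projects.values() for term in counts)
    return {
        name: {term: tf * log(n / df[term]) for term, tf in counts.items()}
        for name, counts in projects.items()
    }

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm = sqrt(sum(w * w for w in u.values())) * sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Jaccard index between two sets of invocations."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

projects = {
    "p": Counter({"get": 2, "where": 1}),
    "q": Counter({"get": 1, "where": 1}),
    "r": Counter({"select": 3}),
}
vectors = tfidf_vectors(projects)
print(round(cosine(vectors["p"], vectors["q"]), 4))  # p and q share invocations
print(cosine(vectors["p"], vectors["r"]))            # p and r share none
print(jaccard({"get", "where", "select"}, {"where", "select", "equal"}))
```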
  27. FOCUS architecture
  28. Recommendation engine: API function calls. Generation of a ranked list of API function calls: • Additional invocations for the active declaration are predicted by computing the missing ratings • The result is a list of invocations ranked by score in descending order
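One way to picture "computing the missing ratings" is a similarity-weighted vote among the most similar declarations. The sketch below assumes that simple formulation; the formula FOCUS actually uses (for example, how project similarity is factored in) is not stated in the slides, and the similarity values and invocation names here are hypothetical.

```python
def rank_invocations(active, neighbours):
    """Score invocations missing from the active declaration.

    active: set of invocations already present in the active declaration.
    neighbours: list of (similarity, set of invocations) for similar declarations.
    Returns (invocation, score) pairs sorted by score in descending order.
    """
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return []
    scores = {}
    for sim, invocations in neighbours:
        # Only invocations the active declaration does not yet contain are candidates.
        for mi in invocations - active:
            scores[mi] = scores.get(mi, 0.0) + sim / total
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

active = {"get", "where"}
neighbours = [
    (0.8, {"get", "where", "select", "equal"}),
    (0.5, {"get", "select"}),
    (0.2, {"where", "orderBy"}),
]
ranked = rank_invocations(active, neighbours)
for name, score in ranked:
    print(name, round(score, 3))
```

An invocation used by several highly similar declarations ends up near the top, which matches the intuition behind the ranked list described on slide 28.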
  29. Recommendation engine: API usage patterns. From the ranked list, the top-N method invocations are used as a query to search for relevant declarations. Source code snippets containing the identified relevant declarations are then retrieved from the available source code base
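The retrieval step can be sketched as a nearest-neighbour search over the mined declarations. The sketch assumes that relevance is measured with the Jaccard index between the query and each declaration's invocations; the slide does not state the relevance measure, and the code base contents below are hypothetical placeholders.

```python
def jaccard(a, b):
    """Jaccard index between two sets of invocations."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def find_pattern(query, code_base):
    """Return (declaration name, snippet) of the declaration closest to the query.

    code_base: {declaration name: (set of invocations, source snippet)}.
    """
    if not code_base:
        return None
    best = max(code_base, key=lambda name: jaccard(query, code_base[name][0]))
    return best, code_base[best][1]

# Hypothetical code base; snippets are placeholders, not real mined code.
code_base = {
    "findByName": ({"get", "where", "select", "equal"}, "<snippet of findByName>"),
    "saveAll": ({"persist", "flush"}, "<snippet of saveAll>"),
}
query = {"get", "where", "select"}  # top-N invocations from the ranked list
match = find_pattern(query, code_base)
print(match)
```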
  30. Evaluation: assessing the capability of FOCUS to recommend API function calls in terms of • Accuracy (precision and recall) • Success rate • Time performance. Comparing FOCUS with a state-of-the-art tool (PAM*). Two dataset sources: • More than 600 GitHub projects retrieved from Software Heritage • A set of 3,600 jars retrieved from Maven Central. * Jaroslav Fowkes, Charles Sutton. Parameter-free probabilistic API mining across GitHub. Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2016)
  31. Evaluation process. Figure labels: Source Code metadata
  32. Evaluation process: testing project. Figure legend: total number of declarations; declarations that are kept (the rest are discarded); total number of invocations in a given declaration; invocations that are used as query
  33. Evaluation process: testing project. Four different configurations:
       • C1.1: the first half of the declarations is used as testing data and the second half is removed; only the first invocation is provided as a query, and the rest is used as ground-truth data
       • C1.2: same declarations as C1.1; four invocations are provided as a query, and the rest is used as ground-truth data
       • C2.1: the last method declaration is selected as testing and all the remaining declarations are used as training data; only the first invocation is provided as a query, and the rest is used as ground-truth data
       • C2.2: same declarations as C2.1; four invocations are provided as a query, and the rest is used as ground-truth data
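The query and ground-truth split, together with the accuracy metrics listed on slide 30, can be sketched as follows. The metric definitions are the standard ones (hits over recommendations for precision, hits over ground truth for recall, at least one hit for success); the precise definitions used in the evaluation may differ in detail, and the invocation names are hypothetical.

```python
def split_query(invocations, k):
    """First k invocations form the query; the rest is ground-truth data."""
    return invocations[:k], invocations[k:]

def evaluate(recommended, ground_truth):
    """Return (precision, recall, success) for one testing declaration."""
    hits = len(set(recommended) & set(ground_truth))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    return precision, recall, hits > 0

# Hypothetical testing declaration with six invocations, in source order.
invocations = ["get", "where", "equal", "select", "orderBy", "list"]

# One invocation as the query, as in configurations C1.1 and C2.1.
query, ground_truth = split_query(invocations, 1)

# Hypothetical top-3 list returned by a recommender for that query.
recommended = ["where", "select", "close"]
precision, recall, success = evaluate(recommended, ground_truth)
print(query, round(precision, 3), round(recall, 3), success)
```

Calling `split_query(invocations, 4)` instead gives the four-invocation query of C1.2 and C2.2, which leaves a smaller ground-truth set.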
  34. Evaluation key points: the performance of FOCUS relies on the availability of background data; the system works more effectively as more OSS projects become available for recommendation. Accuracy improves substantially when the query contains more invocations. Figures: precision and recall for C1.1 and C1.2 on the SH dataset; precision and recall for C1.1 and C1.2 on the MV dataset
  35. Evaluation key points: a dataset consisting of only 200 projects was considered. Leave-one-out cross-validation was performed to exploit as much as possible the projects available as background data, given a testing project. PAM requires 9 seconds to provide each recommendation, while FOCUS needs just 0.095 seconds
  36. What’s next: • Embedding FOCUS directly into the Eclipse IDE (under development in CROSSMINER) • A user study to thoroughly evaluate the system’s performance
  37. Conclusions: https://github.com/crossminer/FOCUS
