On Evaluating Code Recommender Systems for API Usages

Presentation by Marcel Bruch at the RSSE 2008 Workshop.

To ease framework understanding, tools have been developed that analyze existing framework instantiations to extract API usage patterns and present them to the user. However, detailed quantitative evaluations of such recommender systems are lacking. In this paper we present an automated evaluation process which extracts queries and expected results from existing code bases. This enables the validation of recommender systems with large test beds in an objective manner, by means of precision and recall measures.

Presentation Transcript

  • On Evaluating Recommender Systems for API Usages
    Marcel Bruch | Thorsten Schäfer | Mira Mezini
    Fachbereich Informatik | Software Technology Group | bruch@cs.tu-darmstadt.de
    International Workshop on Recommendation Systems for Software Engineering
  • Goals of a Code Recommender Evaluation
    • An evaluation should allow us to
      • assess the approach’s overall performance
      • compare performance of different approaches
      • figure out where further research is needed
  • Requirements of a Code Recommender Evaluation
    • An evaluation process should
      • be based on large-scale test suites
      • be customizable w.r.t. different evaluation scenarios
      • execute the test suites and evaluate the results
      • require minimal manual effort
  • Sample Usage of a Code Recommender (slide diagram, reconstructed; a code sketch follows below)
    • Incomplete Code (Type: Text, Calls: ???): the developer has a Text widget but does not yet know which calls to make
    • «create»: the tool builds a Query, "Calls for Type=Text?"
    • «send»: the Query goes to the Code Recommender
    • «predict»: the Recommender returns Recommendations (Calls: <init>, setText, setFont, setLayoutData)
    • «inspect & use» / «refine»: the developer applies the fitting recommendations, yielding the Resulting Code (Type: Text, Calls: <init>, setText, setLayoutData)
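
    A minimal Java/SWT sketch of this interaction. The Text widget and the calls <init>, setText, setFont, setLayoutData come from the slide; the Display/Shell boilerplate and the concrete arguments are assumptions added here so the snippet runs:

        import org.eclipse.swt.SWT;
        import org.eclipse.swt.layout.GridData;
        import org.eclipse.swt.layout.GridLayout;
        import org.eclipse.swt.widgets.Display;
        import org.eclipse.swt.widgets.Shell;
        import org.eclipse.swt.widgets.Text;

        public class TextWidgetExample {
            public static void main(String[] args) {
                Display display = new Display();
                Shell shell = new Shell(display);
                shell.setLayout(new GridLayout(1, false));

                // Incomplete code: the developer has only the constructor call
                // ("<init>") and asks the recommender: "Calls for Type=Text?"
                Text widget = new Text(shell, SWT.BORDER);

                // Resulting code after inspecting the recommendations:
                // setText and setLayoutData were recommended and adopted.
                widget.setText("Hello RSSE");
                widget.setLayoutData(new GridData(SWT.FILL, SWT.CENTER, true, false));

                shell.pack();
                shell.open();
                while (!shell.isDisposed()) {
                    if (!display.readAndDispatch()) display.sleep();
                }
                display.dispose();
            }
        }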
  • General Process of Training & Evaluating Code Recommenders (slide diagram, reconstructed; the legend distinguishes artifacts from processing steps; a pipeline skeleton follows below)
    • 1. Code Analysis: extract Code Observations from a code base
    • 2. Training: build the Recommender from the Training Data
    • 3. Automated Query Creation: derive Queries and Expected Recommendations from the Test Data
    • 4. Query Phase: run the Queries against the Recommender to obtain Recommendations
    • 5. Automated Reports: compute Performance Measures and Visualizations
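
    A hypothetical Java skeleton of this five-step pipeline. All type and method names below are illustrative assumptions, not the authors' actual tool:

        import java.util.List;
        import java.util.Map;

        // Skeleton mirroring the five steps above (all names are assumptions).
        public abstract class EvaluationPipeline {

            // 1. Code Analysis: extract API usage observations from source code.
            abstract List<Usage> analyze(Iterable<String> sourceFiles);

            // 2. Training: build a recommender from the training observations.
            abstract Recommender train(List<Usage> trainingData);

            // 3. Automated Query Creation: split each test usage into a query
            //    (partial usage) and the expected recommendations (held-out calls).
            abstract List<TestCase> createQueries(List<Usage> testData);

            // 4. Query Phase: run every query against the recommender.
            abstract Map<TestCase, List<String>> runQueries(Recommender r, List<TestCase> cases);

            // 5. Automated Reports: compute performance measures and visualize them.
            abstract Report report(Map<TestCase, List<String>> results);

            // Minimal supporting types (hypothetical).
            record Usage(String receiverType, List<String> calls) {}
            record TestCase(Usage query, List<String> expectedCalls) {}
            record Report(double precision, double recall) {}
            interface Recommender { List<String> recommend(Usage query); }
        }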
  • Sample Evaluation Scenario (slide diagram, reconstructed; covers steps 2–5; a hold-out sketch follows below)
    • The Test Data contains a known usage: Type: Text, Calls: <init>, setText, setLayoutData
    • «refine»: the usage is split into a Query ("Calls for Type=Text?") and the Expected Recommendations (Calls: <init>, setText, setLayoutData)
    • «send» / «predict»: the trained Recommender answers the Query with Recommendations (Calls: <init>, setText, setFont, setLayoutData)
    • «compare»: the Recommendations are checked against the Expected Recommendations
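
    One way to automate the «refine» step, sketched in Java: take a complete usage from the test data and hold out calls as the expected recommendations. The split point is an assumption; in the slide's scenario all observed calls serve as the expected recommendations:

        import java.util.ArrayList;
        import java.util.List;

        public final class QueryCreation {

            // A query (the calls kept) plus the expected recommendations (held out).
            public record Split(List<String> queryCalls, List<String> expectedCalls) {}

            // Keep the first `keep` calls as the query; expect the rest back.
            public static Split holdOut(List<String> observedCalls, int keep) {
                return new Split(
                        new ArrayList<>(observedCalls.subList(0, keep)),
                        new ArrayList<>(observedCalls.subList(keep, observedCalls.size())));
            }

            public static void main(String[] args) {
                // Complete usage of Text found in the test data (from the slide):
                List<String> usage = List.of("<init>", "setText", "setLayoutData");

                // keep = 0 reproduces the slide's scenario: the query only names
                // the type, and all observed calls become expected recommendations.
                Split s = holdOut(usage, 0);
                System.out.println("Query calls:    " + s.queryCalls());    // []
                System.out.println("Expected calls: " + s.expectedCalls()); // [<init>, setText, setLayoutData]
            }
        }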
  • Measuring Performance (step 5, Reports; a precision/recall sketch follows below)
    • Compare the Recommendations (Calls: <init>, setText, setFont, setLayoutData) against the Expected Recommendations (Calls: <init>, setText, setLayoutData)
    • Measures can be set-based (e.g., precision and recall over the call sets) or ranking-based (taking the order of the recommendations into account)
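
    For the set-based case, precision and recall over the recommended versus expected call sets can be computed as in this Java sketch; the numbers match the slide's example, and the ranking-based variants are not reproduced here:

        import java.util.HashSet;
        import java.util.List;
        import java.util.Set;

        public final class SetBasedMeasures {

            // precision = |recommended ∩ expected| / |recommended|
            public static double precision(Set<String> recommended, Set<String> expected) {
                if (recommended.isEmpty()) return 0.0;
                Set<String> hits = new HashSet<>(recommended);
                hits.retainAll(expected); // true positives
                return (double) hits.size() / recommended.size();
            }

            // recall = |recommended ∩ expected| / |expected|
            public static double recall(Set<String> recommended, Set<String> expected) {
                if (expected.isEmpty()) return 0.0;
                Set<String> hits = new HashSet<>(expected);
                hits.retainAll(recommended); // true positives
                return (double) hits.size() / expected.size();
            }

            public static void main(String[] args) {
                // Slide example: four recommended calls, three of them expected.
                Set<String> recommended =
                        new HashSet<>(List.of("<init>", "setText", "setFont", "setLayoutData"));
                Set<String> expected =
                        new HashSet<>(List.of("<init>", "setText", "setLayoutData"));
                System.out.printf("precision = %.2f%n", precision(recommended, expected)); // 0.75
                System.out.printf("recall    = %.2f%n", recall(recommended, expected));    // 1.00
            }
        }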
  • Sample Evaluation Results (results chart not captured in the transcript)
  • Summary (recap of steps 2. Training, 3. Automated Query Creation, 4. Query Phase, 5. Reports)
    • An approach to derive queries directly from sample code
    • Automated extraction of expected recommendations
    • Automated query execution
    • An approach to measure
      • set-based accuracy
      • ranking-based accuracy
    • using precision & recall
    (recap diagram, as in the sample evaluation scenario: «query» the Recommender, «predict» Recommendations, «compare» with the Expected Recommendations)
  • Q & A
    • What do you think:
      • Is this kind of evaluation helpful to assess the overall performance of code recommender systems?
      • Can you imagine evaluating your code recommender with such an approach?
      • Do we need user studies to assess the progress brought by novel recommendation models?