Recommendation Systems for Code Reuse
  Tao Xie, Department of Computer Science, North Carolina State University, Raleigh, USA
Motivation
  Programmers commonly reuse APIs of existing frameworks or libraries
  Advantages: low cost and high efficiency of development
  Challenges: complexity and lack of documentation
  E.g., searching for information takes nearly ¼ of developer time [metallect.com]
Example Task from Eclipse
  Programming task: How to parse code in a dirty editor of Eclipse?
  Query: "IEditorPart -> ICompilationUnit"
  [Figure: PARSEWeb workflow. Open source projects 1..N -> Extract -> MIS 1..MIS k -> Mine -> FMIS 1..FMIS n -> Recommend]
  MIS: method-invocation sequence; FMIS: frequent MIS
  PARSEWeb [Thummalapenta&Xie ASE 07]
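The extract, mine, and recommend flow on this slide can be pictured with a small sketch. It is only an illustration under simplifying assumptions, not PARSEWeb's actual algorithm: each method-invocation sequence (MIS) is a plain list of method names, "frequent" means an exact sequence that occurs at least `minSupport` times, and the class name `FrequentMisSketch`, the method `mineFrequent`, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: keep the method-invocation sequences (MIS) that recur often.
final class FrequentMisSketch {

    // Returns the distinct sequences seen at least minSupport times,
    // most frequent first (ties keep first-seen order).
    static List<List<String>> mineFrequent(List<List<String>> sequences, int minSupport) {
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        for (List<String> mis : sequences) {
            counts.merge(mis, 1, Integer::sum);
        }
        List<Map.Entry<List<String>, Integer>> frequent = new ArrayList<>();
        for (Map.Entry<List<String>, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minSupport) {
                frequent.add(entry);
            }
        }
        // List.sort is stable, so equal counts stay in first-seen order.
        frequent.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        List<List<String>> result = new ArrayList<>();
        for (Map.Entry<List<String>, Integer> entry : frequent) {
            result.add(entry.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up extracted sequences, for illustration only.
        List<String> viaManager =
                Arrays.asList("getEditorInput", "getWorkingCopyManager", "getWorkingCopy");
        List<String> other = Arrays.asList("getEditorInput", "getAdapter");
        List<List<String>> extracted = Arrays.asList(viaManager, other, viaManager);
        System.out.println(mineFrequent(extracted, 2));
    }
}
```

Counting exact duplicates is the simplest possible notion of frequency; a real miner would likely also merge sequences that differ only in irrelevant calls.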
Scenario 1
  While reusing APIs of existing open source frameworks or libraries, programmers often know what type of object they need but do not know how to write the code that obtains that object
  Query: "Source -> Destination"
  How to use these APIs?
  Prospector [Mandelin et al. PLDI 05], XSnippet [Sahavechaphan&Claypool OOPSLA 06], PARSEWeb [Thummalapenta&Xie ASE 07]
Example Task from Eclipse
  Programming task: How to parse code in a dirty editor?
  Query: IEditorPart -> ICompilationUnit
  Example solution from Prospector/PARSEWeb:
    IEditorPart iep = ...
    IEditorInput editorInp = iep.getEditorInput();
    IWorkingCopyManager wcm = JavaUI.getWorkingCopyManager();
    ICompilationUnit icu = wcm.getWorkingCopy(editorInp);
  Difficulties:
    a. Needs an instance of IWorkingCopyManager
    b. Needs to invoke a static method of JavaUI to get that instance
  Prospector [Mandelin et al. PLDI 05], XSnippet [Sahavechaphan&Claypool OOPSLA 06], PARSEWeb [Thummalapenta&Xie ASE 07]
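One way to picture how a Source -> Destination query could be answered is forward chaining over method signatures: start with the source type, repeatedly apply any method whose inputs are already available, and stop once the destination type appears. The sketch below assumes a tiny hand-written signature table in place of signatures mined from real code, and it is not the algorithm of Prospector, XSnippet, or PARSEWeb. The names `TypeChainSketch`, `Signature`, and `findCalls` are hypothetical. A static method is modelled as a signature with no inputs, which is how the sketch reaches `JavaUI.getWorkingCopyManager()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: chain method signatures from a source type to a destination type.
final class TypeChainSketch {

    // One API method: the call text, the types it needs, and the type it yields.
    static final class Signature {
        final String call;
        final List<String> inputs;
        final String output;

        Signature(String call, List<String> inputs, String output) {
            this.call = call;
            this.inputs = inputs;
            this.output = output;
        }
    }

    // Applies signatures whose inputs are all available until the destination
    // type is produced. Returns the calls in the order applied, or empty if the
    // destination cannot be reached. Steps are not pruned for minimality.
    static Optional<List<String>> findCalls(String source, String destination,
                                            List<Signature> signatures) {
        Set<String> available = new HashSet<>();
        available.add(source);
        List<String> steps = new ArrayList<>();
        boolean progress = true;
        while (!available.contains(destination) && progress) {
            progress = false;
            for (Signature s : signatures) {
                if (!available.contains(s.output) && available.containsAll(s.inputs)) {
                    available.add(s.output);
                    steps.add(s.call);
                    progress = true;
                    if (s.output.equals(destination)) {
                        break;
                    }
                }
            }
        }
        return available.contains(destination) ? Optional.of(steps) : Optional.empty();
    }

    public static void main(String[] args) {
        List<Signature> table = Arrays.asList(
                new Signature("wcm.getWorkingCopy(editorInp)",
                        Arrays.asList("IWorkingCopyManager", "IEditorInput"), "ICompilationUnit"),
                new Signature("iep.getEditorInput()",
                        Arrays.asList("IEditorPart"), "IEditorInput"),
                // Static method: needs no receiver, so it has no inputs.
                new Signature("JavaUI.getWorkingCopyManager()",
                        Collections.<String>emptyList(), "IWorkingCopyManager"));
        System.out.println(findCalls("IEditorPart", "ICompilationUnit", table));
    }
}
```

Treating static methods as zero-input signatures is what lets the search discover the `IWorkingCopyManager` instance, which is the difficulty the slide highlights.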
Scenario 2
  While reusing APIs of existing open source frameworks or libraries, programmers often know what method call they need but do not know what code to write before and after this method call
  Query: "Method name"
  How to use these APIs?
  MAPO [Xie&Pei MSR 05]
Example Task from BCEL
  Programming task: How to instrument the bytecode of a Java class by adding an extra method to the class?
  Query: org.apache.bcel.generic.ClassGen public void addMethod(Method m)
  Example solution from MAPO:
    public void generateStubMethod(ClassGen c) {
      InstructionList il = new InstructionList();
      MethodGen m = genFromISList(il);
      // Compute the local-variable and operand-stack sizes before adding the method.
      m.setMaxLocals();
      m.setMaxStack();
      c.addMethod(m.getMethod());
      System.out.println("…");
      …
    }
  MAPO [Xie&Pei MSR 05]
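A "Method name" query can be approximated by scanning call sequences that contain the queried method and counting which other calls tend to appear before or after it. The sketch below is a simplified illustration and not MAPO's actual mining technique. The class `CallContextSketch`, the method `neighbours`, and the sample sequences are hypothetical. Each call is counted at most once per sequence.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: which calls usually surround a given method call?
final class CallContextSketch {

    // For every sequence containing target, counts the distinct calls that occur
    // before (or after) its first occurrence. Keys come back in alphabetical order.
    static Map<String, Integer> neighbours(List<List<String>> sequences,
                                           String target, boolean before) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> sequence : sequences) {
            int at = sequence.indexOf(target);
            if (at < 0) {
                continue; // this sequence never calls the target
            }
            List<String> side = before
                    ? sequence.subList(0, at)
                    : sequence.subList(at + 1, sequence.size());
            for (String call : new LinkedHashSet<>(side)) {
                counts.merge(call, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up call sequences loosely modelled on the BCEL example.
        List<List<String>> sequences = Arrays.asList(
                Arrays.asList("setMaxLocals", "setMaxStack", "getMethod", "addMethod", "println"),
                Arrays.asList("setMaxStack", "getMethod", "addMethod"));
        System.out.println("before: " + neighbours(sequences, "addMethod", true));
        System.out.println("after:  " + neighbours(sequences, "addMethod", false));
    }
}
```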
Scenario 3
  While reusing APIs of existing open source frameworks or libraries, programmers often know the structural context (such as a class's type, its parents, its fields' types, a method's signature, or method and constructor callees) but do not know how to write code in this context
  Query: Structural context
  How to use these APIs?
  Strathcona [Holmes et al. 05], XSnippet [Sahavechaphan&Claypool OOPSLA 06]
Example Task from HttpClient
  Programming task: How to evolve a system to use a third-party library, HttpClient, for handling HTTP connections?
  Query: HttpClient, PostMethod classes
  Example solution from Strathcona: [figure not preserved]
  Strathcona [Holmes et al. 05], XSnippet [Sahavechaphan&Claypool OOPSLA 06]
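Structural-context matching can be pictured as scoring each stored example by how many structural facts it shares with the query. The sketch below assumes facts are plain strings such as `uses:HttpClient`. The fact vocabulary, the example names, the class `StructuralMatchSketch`, and the overlap-count score are all hypothetical and are not Strathcona's actual heuristics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: rank stored examples by structural overlap with a query.
final class StructuralMatchSketch {

    // Returns example names ordered by number of facts shared with the query,
    // highest first. Examples sharing nothing are dropped; ties keep map order.
    static List<String> rank(Set<String> queryFacts, Map<String, Set<String>> examples) {
        final Map<String, Integer> scores = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> example : examples.entrySet()) {
            Set<String> shared = new HashSet<>(example.getValue());
            shared.retainAll(queryFacts);
            if (!shared.isEmpty()) {
                scores.put(example.getKey(), shared.size());
            }
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> Integer.compare(scores.get(b), scores.get(a)));
        return ranked;
    }

    public static void main(String[] args) {
        Set<String> query = new HashSet<>(Arrays.asList("uses:HttpClient", "uses:PostMethod"));
        // Made-up repository of examples and their structural facts.
        Map<String, Set<String>> examples = new LinkedHashMap<>();
        examples.put("ExampleA", new HashSet<>(Arrays.asList("uses:HttpClient")));
        examples.put("ExampleB", new HashSet<>(Arrays.asList("uses:HttpClient", "uses:PostMethod")));
        examples.put("ExampleC", new HashSet<>(Arrays.asList("uses:Socket")));
        System.out.println(rank(query, examples));
    }
}
```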
Steps in Recommenders
  1. Data collection/extraction
  2. Data preprocessing
  3. Data analysis/mining
  4. Result postprocessing
  5. Result representation
Data Collection/Extraction
  From one or multiple local code repositories
    Often followed by offline analysis or mining
    Challenges: lack of relevant code examples
    Ex.: Strathcona, Prospector, XSnippet
  From the whole open source world with a code search engine!
    Often followed by on-the-fly analysis and mining
    Challenges: only partial code files
    Ex.: MAPO, PARSEWeb
Exploiting a Code Search Engine
  Accepts queries that include keywords of class and/or method names
  Interacts with a code search engine such as Google Code Search to gather related code samples
  Stores the gathered code samples (source files) in a local code repository, to be analyzed and mined later
  Challenges: gathered code samples are partial and not compilable, because code search engines retrieve individual source files instead of entire projects
  PARSEWeb [Thummalapenta&Xie ASE 07]
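The gathering step can be sketched as three small pieces: build a keyword query, ask an engine for matching source files, and store them locally. Everything named below is an assumption made for illustration. `SearchEngine` is a hypothetical interface and not the API of any real service, the in-memory map stands in for the local code repository, and the file names are generated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: gather code samples for later analysis and mining.
final class SampleGathererSketch {

    // Stand-in for a code search engine: keyword query in, source-file texts out.
    interface SearchEngine {
        List<String> search(String query);
    }

    // Joins class-name and method-name keywords into one space-separated query.
    static String buildQuery(List<String> classNames, List<String> methodNames) {
        List<String> terms = new ArrayList<>(classNames);
        terms.addAll(methodNames);
        return String.join(" ", terms);
    }

    // Stores every returned file under a generated name. The files may be
    // partial and uncompilable, so they are kept as raw text.
    static Map<String, String> gather(SearchEngine engine, String query) {
        Map<String, String> repository = new LinkedHashMap<>();
        int index = 0;
        for (String source : engine.search(query)) {
            repository.put("sample" + index + ".java", source);
            index++;
        }
        return repository;
    }

    public static void main(String[] args) {
        // A canned engine so the sketch runs without any network access.
        SearchEngine canned = query -> Arrays.asList("class A { /* ... */ }");
        String query = buildQuery(Arrays.asList("IEditorPart", "ICompilationUnit"),
                new ArrayList<String>());
        System.out.println(query + " -> " + gather(canned, query).keySet());
    }
}
```

Keeping the samples as raw text reflects the challenge noted on the slide: the later analysis cannot rely on a compiler to resolve types.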
Available Code Search Engines
  Google Code Search: http://www.google.com/codesearch
  Krugle: http://www.krugle.com/
  Koders: http://www.koders.com/
  Codase: http://www.codase.com/
  JExamples: http://www.jexamples.com/
  etc.
  Why not just use code search engines?
What Are Developers Searching For?
  Assieme [Hoffmann et al. UIST 07]
  Data: 15 million queries of Windows Live Search from May 2006
  339 sessions related to Java programming
  117 API sessions (34.2%); 70 troubleshooting sessions (20.6%)
API-related Search Sessions
  64.1% of sessions contained queries that were merely descriptive and did not contain actual names of APIs, packages, types, or members
  The remaining sessions contained API or package names (12.8%), type names (17.9%), or method names (5.1%)
  Among all these API-related sessions, 17.9% contained terms like "example", "using", or "sample code"
  Assieme [Hoffmann et al. UIST 07]

