Making Sense of Online Code Snippets
Siddharth Subramanian, Reid Holmes
University of Waterloo
Presentation for MSR '13, San Francisco.

  • Good morning everybody. I’m Siddharth Subramanian, and I’m here to present our submission to the MSR Challenge, work done in collaboration with Reid Holmes at the University of Waterloo. We built a system that helps developers find better API usage examples on the internet. We do so by extracting the structural information hidden in code snippets, using it both to guide code search and to construct a repository of curated source code examples from Stack Overflow.
  • Developers frequently reuse source code or search for examples to learn about a new API. In the process, they often turn to websites like Google Code or Krugle. How do these code search engines work? They index millions of code snippets that are publicly available on the internet. However, many of these snippets are of poor quality, and there is no assurance that they actually work. To overcome this issue, we built a curated code search engine that searches only the code snippets in accepted answers on Stack Overflow. This way, developers can search for code examples with more assurance that the results they get will actually work. Talking points: What does it do? What is the problem? What do we do? Why Android?
  • However, lexically searching through source code is lossy. Consider the following code snippet from a post on SO. The type declaration of the chrono object is missing, so we do not know which of the Android API's run(), setBase() and start() methods are being called. The Android API has 20 different methods named run() and 26 methods named start(), and it is not clear which ones are being called in this context. However, by parsing and analysing the code snippet, we can infer that the chrono object is of type android.widget.Chronometer, that the run() method being overridden is java.lang.Runnable.run(), and that start() is android.widget.Chronometer.start(). But since Stack Overflow deals in code snippets rather than complete programs, parsing them is difficult.
  • We built a tool called SnipParse that can parse incomplete Java source code snippets and extract structural information from them. We ran this tool on the posts about the Android framework, since previous research by Parnin and others has shown that SO discussions cover a significant portion of the Android API. We used its output to build a curated source code repository in which code is indexed by the types and methods it uses. This repository is accessible through a web interface called CodeHunter, which lets users search for precise API usage examples.
  • To summarize, we built a tool that can identify structy
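The ambiguity described in the notes above can be made concrete with a small sketch. It is not the authors' implementation: the two snippets, their IDs, and the hand-written `RESOLVED` table (standing in for the output of a type-resolving parser) are assumptions made for illustration. Only the Android method names come from the slides or the public Android API.

```python
import re

# Two hypothetical snippets; neither declares the type of its receiver.
SNIPPETS = {
    "s1": "chrono.setBase(t); chrono.start();",
    "s2": "player.prepare(); player.start();",
}

# Assumed output of a type-resolving analysis, written by hand here:
# each snippet ID mapped to the fully qualified methods it calls.
RESOLVED = {
    "s1": {"android.widget.Chronometer.setBase",
           "android.widget.Chronometer.start"},
    "s2": {"android.media.MediaPlayer.prepare",
           "android.media.MediaPlayer.start"},
}


def lexical_search(name):
    """Return IDs of snippets whose text contains a call to `name`."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\s*\(")
    return sorted(sid for sid, code in SNIPPETS.items() if pattern.search(code))


def structural_search(qualified_name):
    """Return IDs of snippets that call the given fully qualified method."""
    return sorted(sid for sid, methods in RESOLVED.items()
                  if qualified_name in methods)


# The lexical query cannot tell the two start() methods apart.
print(lexical_search("start"))
# The structural query returns only the Chronometer example.
print(structural_search("android.widget.Chronometer.start"))
```

The lexical query matches both snippets because the method name collides, while the qualified query keeps only the snippet whose receiver was resolved to `android.widget.Chronometer`.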

    1. Making Sense of Online Code Snippets. Siddharth Subramanian, Reid Holmes, University of Waterloo
    2. Traditional Code Search: a crawler indexes millions of random code snippets from the public code on the internet. MAKING SENSE OF ONLINE CODE SNIPPETS - SIDDHARTH SUBRAMANIAN, REID HOLMES, MSR 13
    3. Curated Code Search: a crawler indexes a limited set of good quality code snippets. Traditional Code Search: a crawler indexes millions of random code snippets from the public code on the internet.
    4. Code Search Challenges
    5. Code Search Challenges. chrono - Type unknown! run() - 20 different methods: java.util.TimerTask.run(), android.os.HandlerThread.run(), … start() - 26 different methods: android.media.MediaPlayer.start(), android.animation.Animator.start(), …
    6. Code Search Challenges. PROBLEMS WITH LEXICAL SEARCH: Code is treated as plain text; Underlying API linkage is lost; Method name collisions
    7. Code Search Challenges. chrono - android.widget.Chronometer; run() - java.lang.Runnable.run(); start() - android.widget.Chronometer.start()
    8. Code Search Challenges. PROBLEMS WITH LEXICAL SEARCH: Code is treated as plain text; Underlying API linkage is lost; Method name collisions. PROBLEMS WITH PARSING CODE: Code snippets are often incomplete; Missing class declarations; Missing method declarations; Incomplete code fragments
    9. Approach
    10. Approach [Parnin et al., Georgia Tech, Tech. Rep., 2012]
    11. Approach. 1: SnipParse (Code Snippet Parser)
    12. Approach. 1: SnipParse (Code Snippet Parser). Curated Snippet Repository: Android Types, Android Methods
    13. Approach. 1: SnipParse (Code Snippet Parser). Curated Snippet Repository: Android Types, Android Methods. 2: CodeHunter (Code Search Web Interface)
    14. Summary: Enabling Curated Search. http://awenda.cs.uwaterloo.ca/snippet/ https://cs.uwaterloo.ca/~rtholmes/papers/msr_2013_subramanian.pdf 1: SnipParse (Code Snippet Parser). Curated Snippet Repository: Android Types, Android Methods. CodeHunter (Code Search Web Interface)
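
The slides describe a curated repository indexed by Android types and Android methods but do not show its data structure. One plausible shape is an inverted index, sketched below. The class name `SnippetIndex`, its methods, and the post IDs are all hypothetical, and the sketch assumes the parser has already produced fully qualified method names for each snippet.

```python
from collections import defaultdict


class SnippetIndex:
    """Hypothetical inverted index from API types and methods to snippet IDs."""

    def __init__(self):
        self._by_type = defaultdict(set)
        self._by_method = defaultdict(set)

    def add(self, snippet_id, qualified_methods):
        """Register a snippet under every method it calls and each owning type."""
        for method in qualified_methods:
            # "android.widget.Chronometer.start" -> "android.widget.Chronometer"
            type_name, _, _ = method.rpartition(".")
            self._by_type[type_name].add(snippet_id)
            self._by_method[method].add(snippet_id)

    def find_by_type(self, type_name):
        """Return sorted IDs of snippets that use the given type."""
        return sorted(self._by_type.get(type_name, ()))

    def find_by_method(self, method):
        """Return sorted IDs of snippets that call the given method."""
        return sorted(self._by_method.get(method, ()))


index = SnippetIndex()
index.add("post-1", ["android.widget.Chronometer.start", "java.lang.Runnable.run"])
index.add("post-2", ["android.media.MediaPlayer.start"])
print(index.find_by_type("android.widget.Chronometer"))
print(index.find_by_method("android.media.MediaPlayer.start"))
```

Keying on fully qualified names is what would let a search interface return usage examples for one specific `start()` rather than every method that shares the name.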
