Making Sense of Online Code Snippets

  1. Making Sense of Online Code Snippets. Siddharth Subramanian, Reid Holmes, University of Waterloo.
  2. Traditional Code Search: indexes millions of random code snippets from the internet. Diagram labels: Public Code on the Internet; crawler. (Slide footer: MAKING SENSE OF ONLINE CODE SNIPPETS - SIDDHARTH SUBRAMANIAN, REID HOLMES, MSR '13)
  3. Curated Code Search: indexes a limited set of good quality code snippets. Traditional Code Search: indexes millions of random code snippets from the internet. Diagram labels: Public Code on the Internet; crawler (twice).
  4. Code Search Challenges
  5. Code Search Challenges. chrono: type unknown! run(): 20 different methods (java.util.TimerTask.run(), android.os.HandlerThread.run(), …). start(): 26 different methods (android.media.MediaPlayer.start(), android.animation.Animator.start(), …).
  6. Code Search Challenges. PROBLEMS WITH LEXICAL SEARCH: code is treated as plain-text; underlying API linkage is lost; method name collisions.
  7. Code Search Challenges. chrono: android.widget.Chronometer. run(): java.lang.Runnable.run(). start(): android.widget.Chronometer.start().
  8. Code Search Challenges. PROBLEMS WITH LEXICAL SEARCH: code is treated as plain-text; underlying API linkage is lost; method name collisions. PROBLEMS WITH PARSING CODE: code snippets are often incomplete; missing class declarations; missing method declarations; incomplete code fragments.
  9. Approach
  10. Approach [Parnin et al., Georgia Tech, Tech. Rep., 2012]
  11. Approach. (1) SnipParse: Code Snippet Parser.
  12. Approach. (1) SnipParse: Code Snippet Parser. Curated Snippet Repository: Android Types, Android Methods.
  13. Approach. (1) SnipParse: Code Snippet Parser. Curated Snippet Repository: Android Types, Android Methods. (2) CodeHunter: Code Search Web Interface.
  14. Summary. Enabling Curated Search. http://awenda.cs.uwaterloo.ca/snippet/ https://cs.uwaterloo.ca/~rtholmes/papers/msr_2013_subramanian.pdf (1) SnipParse: Code Snippet Parser. Curated Snippet Repository: Android Types, Android Methods. CodeHunter: Code Search Web Interface.
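
The method name collision described on the "Code Search Challenges" slides can be illustrated with a small sketch. This is not SnipParse's algorithm, only a toy model of the problem: the class name `CollisionSketch`, the table `CANDIDATES`, and both helper methods are hypothetical, and the table holds just the handful of API methods named on the slides rather than the full set of 20 `run()` and 26 `start()` methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy illustration only: a hypothetical, hand-written table standing in for
// the real API. It lists, per simple method name, some declaring types that
// the slides mention.
class CollisionSketch {
    static final Map<String, List<String>> CANDIDATES = Map.of(
        "run", List.of(
            "java.util.TimerTask",
            "android.os.HandlerThread",
            "java.lang.Runnable"),
        "start", List.of(
            "android.media.MediaPlayer",
            "android.animation.Animator",
            "android.widget.Chronometer"));

    // Lexical view: only the bare method name is known, so every
    // declaring type in the table is a possible match.
    static List<String> lexicalMatches(String methodName) {
        List<String> out = new ArrayList<>();
        for (String type : CANDIDATES.getOrDefault(methodName, List.of())) {
            out.add(type + "." + methodName + "()");
        }
        return out;
    }

    // Structural view: once the receiver's type has been inferred,
    // only the method declared on that type remains.
    static List<String> structuralMatches(String methodName, String receiverType) {
        List<String> out = new ArrayList<>();
        for (String type : CANDIDATES.getOrDefault(methodName, List.of())) {
            if (type.equals(receiverType)) {
                out.add(type + "." + methodName + "()");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Ambiguous: several start() methods share the same name.
        System.out.println(lexicalMatches("start"));
        // Unambiguous once chrono is known to be a Chronometer.
        System.out.println(structuralMatches("start", "android.widget.Chronometer"));
    }
}
```

The point of the sketch is only that the receiver type is the missing piece of information: a plain-text search sees `start()` and cannot tell the candidates apart, while a search that knows the type of `chrono` can.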

Editor's Notes

  • Good morning everybody, I’m Siddharth Subramanian and I’m here to explain our submission to the MSR Challenge, which was work done in collaboration with Reid Holmes at the University of Waterloo. We built a system that helps developers better find API usage examples from the internet. We do so by extracting structural information hidden in code snippets to guide code search and to construct a repository of curated source code examples from Stack Overflow.
  • Developers frequently reuse source code or search for examples to learn about a new API. In the process, they often use websites like Google Code or Krugle to look for examples. How do these code search engines work? They index millions of code snippets that are publicly available on the internet. However, many of these code snippets are of poor quality, and there is no assurance that they would actually work. To overcome this issue, we built a curated code search engine that searches through code snippets in accepted answers on Stack Overflow. This way, developers can search for code examples with a guarantee that the results they get would actually work. What it does? Problem? What we do? Why Android?
  • However, lexically searching through source code is lossy. Consider the following code snippet from a post on Stack Overflow (SO). The type declaration of the chrono object is missing, so we do not know which particular run(), setBase() and start() methods from the Android API are being called. The Android API has 20 different methods named run() and 26 methods named start(), and it is not clear which ones are being called in this context. However, on parsing and analysing the code snippet, we can infer that the chrono object belongs to the android.widget.Chronometer type and the method run() being overridden is the ___ and start is ___. But since Stack Overflow deals with code snippets, parsing them is difficult.
  • We built a tool called SnipParse that can parse incomplete Java source code snippets and extract structural information from them. We ran this tool on posts about the Android framework, since previous research by Parnin and others has shown that SO discussions cover a significant portion of the Android API. We used this tool to build a curated source code repository where code is indexed based on the types and methods used in it. This repository is made accessible through a web interface called CodeHunter that allows users to search for precise API usage examples.
  • To summarize, we built a tool that can identify structy
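
The notes above describe a repository in which code is indexed by the types and methods it uses. A minimal sketch of that idea, assuming a simple in-memory inverted index, might look like the following. The class `SnippetIndexSketch`, its methods, and the snippet ids are all hypothetical; the notes do not say how the actual repository or the CodeHunter interface stores or queries its data.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory inverted index: fully qualified API element
// (type or method) -> ids of the snippets that use it.
class SnippetIndexSketch {
    private final Map<String, Set<String>> index = new HashMap<>();

    // Record that a snippet uses each of the given API elements.
    // The elements are assumed to have been resolved already,
    // for example by a snippet parser.
    void add(String snippetId, List<String> apiElements) {
        for (String element : apiElements) {
            index.computeIfAbsent(element, k -> new TreeSet<>()).add(snippetId);
        }
    }

    // Exact lookup by fully qualified name; a bare name such as
    // "start" deliberately matches nothing.
    Set<String> search(String apiElement) {
        return index.getOrDefault(apiElement, Collections.emptySet());
    }

    public static void main(String[] args) {
        SnippetIndexSketch idx = new SnippetIndexSketch();
        // Made-up snippet ids, for illustration only.
        idx.add("snippet-1", List.of(
            "android.widget.Chronometer",
            "android.widget.Chronometer.start()"));
        idx.add("snippet-2", List.of(
            "android.media.MediaPlayer",
            "android.media.MediaPlayer.start()"));

        System.out.println(idx.search("android.widget.Chronometer.start()"));
        System.out.println(idx.search("android.media.MediaPlayer.start()"));
    }
}
```

Keying the index on fully qualified names is what separates this from a lexical index: two snippets that both call a method named `start()` end up under different keys when their receiver types differ.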