Evaluating Recommended Applications

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    1 Event

    Evaluating Recommended Applications - Presentation Transcript

    1. Evaluating Recommended Applications Mark Grechanik Accenture Technology Labs David Coppit Denys Poshyvanyk The College of William and Mary The College of William and Mary Denys Poshyvanyk The College of William and Mary
    2. How Many Open Source Applications Are There? • Sourceforge.net reports that they host 180,000 projects as of August 1, 2008. • There are dozens of other open source repositories containing tens of thousands of different applications. • Companies have internal source control management systems containing hundreds of thousands of applications. 2
    3. Problem • Finding/checking existing software matching high-level user requirements • Would reduce the cost of many software projects • Would provide users with examples of different implementations • Challenges: • Finding relevant applications is difficult • Evaluating recommended applications is very difficult 3
    4. What Search Engines Do Matching keywords descriptions of apps 4
    5. What Search Engines Do Matching keywords descriptions of apps 5
    6. Fundamental Problems • Vocabulary problem • Mismatch between the high-level intent reflected in the descriptions of applications and their low-level implementation details • Concept assignment problem Code snippet implementing High level concept “Send data” “Send data” s = socket.socket(proto, socket.SOCK_DGRAM) s.sendto(teststring, addr) buf = data = receive(s, 100) while data and '\\n' not in buf: data = receive(s, 100) buf += data • Many application repositories are polluted with poorly functioning projects 6
    7. Working without a Recommender • Find relevant application (s) • Download application • Locate and examine fragments of the code that implement the desired features • Observe the runtime behavior of this application to ensure that this behavior matches requirements • This process is manual since programmers: • study the source code of the retrieved applications • locate various API calls • read information about these calls in help documents • Still, it is difficult for programmers to link high- level concepts from requirements to their implementations in source code 7
    8. Our Goal search 8
    9. Our Goal 9
    10. Key observations • While studying retrieved apps developers : • locate various API calls • read information about these calls in help documents • Help docs are supplied by the same vendors whose packages/APIs are used in software • Programmers read and rely on these API docs • Help docs are written and reviewed by many developers • Help documents are usually more verbose and accurate than project descriptions 10
    11. How K9 Recommender System Works API call1 descriptions API call2 match of API calls API call3  Automatically matching words in user queries against API help docs instead of:  searching in project descriptions;  searching in source code.  K9 uses help documents to produce a list of relevant API calls 11
    12. The Architecture behind K9 API calls API call Integrity API calls Dictionary lookup Engine Applications Help page Recommended User metadata processor applications Help pages Source code Retrieved Static search engine applications Analyzer 12
    13. API Call Lookup (ACL) • After the user enters keywords, the ACL searches through the documentation text to find these keywords • Use Information Retrieval to rank the documents to the user query • The ACL returns the absolute path to the programming entity for the text that contains entered keywords 13
    14. The Architecture behind K9 API calls API call Integrity API calls Dictionary lookup Engine Applications Help page Recommended User metadata processor applications Help pages Source code Retrieved Static search engine applications Analyzer 14
    15. Integrity Engine • We assume that keywords in queries represent logically connected concepts • Consider query “compress encrypt XML” • Queries are often formed by constructing a question first, and then extracting keywords from this sentence • If a relation is present between concepts in the query, then there should be a relation between API calls that implement these concepts in the program code • This heuristics is based on the observation that relations between concepts are often preserved in their implementations in the program code 15
    16. Computing the Ranking • Total number of API calls is n • Total number of connections between these API calls is less or equal to n(n-1) • Each connection i is assign some weight wi • Weak connections have weights 0.5 and strong connections have weights 1.0 • l is the connectivity value • 1 : loop link, 0.5 :single link, 0 : no link • Rank is computed using the following formula wi li rank  n  n  1 16
    17. Current Status • Restricting the scope to Java projects • Challenges: • How to automatically locate and download the latest version of the software (e.g., from sourceforge)? • How to automatically locate the correct entry point (i.e., main) for static analysis? • How to reduce the time for the static analysis? • How and when to update API call dictionary? • Testing other ranking heuristics • Evaluation is pending 17
    18. Related Work • CodeFinder/Helgon • ExpertFinder • CodeBroker • Hipikat • Automated Method Completion (AMC) • Strathcona • Prospector • XSnippet • Google code search, Krugle,… 18
    19. Conclusions & Future Work • K9 recommends/checks relevant applications based on: • the analysis of relevant API help documents; • the connectivity analysis of retrieved API calls. • Indexing available open-source projects and pre-computing data and control flow among API calls • Analyzing multiple releases of the same project 19

    + rsse2008rsse2008, 2 years ago

    custom

    525 views, 0 favs, 0 embeds more stats

    Presentation by Denys Poshyvanyk at the RSSE 2008 W more

    More info about this presentation

    © All Rights Reserved

    • Total Views 525
      • 525 on SlideShare
      • 0 from embeds
    • Comments 0
    • Favorites 0
    • Downloads 0
    Most viewed embeds

    more

    All embeds

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories