API Specification-Based Function Search Engine Using Natural Language Query (seminar presentation)
This is the seminar I conducted as part of my syllabus.

Published in: Education, Technology
  1. By Sanif S S, Reg No: 10007399, S7 IT
  2. Overview
     • Understanding the terms
     • Objectives
     • In detail
       o Keyword Retrieval
       o Variable Retrieval
       o API Specification Mining
       o Function Retrieval
       o Code Generation
     • Experiment
     • Conclusion
     • References
  3. API Specification-Based Function Search Engine Using Natural Language Query
     • API: Application Programming Interface.
     • An API is a set of commands, functions, and protocols that programmers can use when building software for a specific platform, such as an operating system or a library.
     • In languages such as C, APIs are usually exposed through header files.
     • Examples:
       o Java APIs
       o ODBC for Microsoft Windows
  4. API Specification
     • A description of the classes and methods inside the API.
     • Each method (or function) and its use is briefly described in the API specification.
  5. Function Search Engine
     • As the name suggests, a function search engine is a search engine for all the methods in an API.
  6. Natural Language Query
     • A natural language query is a query that uses a complete sentence or question to begin a search.
     • Examples:
       o “What is the capital of India?”
       o “How to make pizza?”
  7. • Taken as a whole, the title means a search engine that finds the functions/methods in an application programming interface (API) using simple queries.
     • The paper also proposes a way to automatically generate function calls based on the search.
  8. • Programmers nearly always use existing functions while developing their applications.
     • These functions have grown more numerous and more diverse.
     • The problem: programmers must work out which functions they need and how to call those functions.
     • The solution:
       o The paper presents two novel approaches to address these problems.
       o The first finds the right functions based on the API specification.
       o The second automatically generates code for the function call.
  9. • The paper has two main objectives:
       o Retrieving functions, and
       o Generating code for function calls.
     • Two different forms of query correspond to these objectives:
       o A “function search query” asks the engine to look for functions.
       o A “function call query” asks the engine to generate code for a function call.
  10. [Figure: Function Search Model. Boxes: Function Search Query, Keyword Retrieval, API Document, Mining, Function Retrieval, Function Description, Function Call Query, Variable Retrieval, Code Generation, Function Call.]
      • Keyword retrieval is the process of extracting keywords from a function search query.
      • Mining is the process of extracting content from the API specification to support function retrieval.
      • Function retrieval is the process of finding suitable functions by matching the keywords extracted from a function search query against the descriptions of functions in the API specification.
      • Variable retrieval is the process of extracting variables from a function call query.
      • Code generation is the process of generating code for a function call based on the variables extracted from the function call query.
  11. • There are several methods for identifying keywords in a natural language sequence.
      • Some methods treat a keyword as a single word, while others identify a keyword phrase.
      • The paper uses four natural language processing techniques to extract keywords: POS tagging, POS filtering, stemming, and synonym generation.
  12. [Figure: Keyword retrieval process. NL sentence -> POS tagging -> POS filter -> stemming -> synonym generation -> keywords. Intermediate labels in the figure: Word/POS, Main Word, Original Word.]
      • POS tagging (part-of-speech tagging) marks up each word in a natural language sentence (NL sentence) with its part of speech.
      • POS filtering removes stopwords such as prepositions, pronouns, conjunctions, and interjections.
      • Stemming reduces inflected (or sometimes derived) words to their root form. Example: “return” is the root form of “returns”, “returning”, and “returned”.
      • Synonym generation identifies synonyms of the retrieved keywords.
  13. • For the natural language query “Gets an element in the collection”, the stages above produce the following results:
        o POS tagging: Gets/VB an/DT element/NN in/IN the/DT collection/NN.
        o POS filtering: Gets element collection.
        o Stemming: Get element collection.
        o Synonym generation: Get-have/return element-object/component collection-list/set.
      • NOTE: VB = verb, DT = determiner, NN = noun, IN = preposition.
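The four stages above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the tiny `POS` lexicon, the suffix-stripping `stem` method, and the `SYNONYMS` table are hypothetical stand-ins for a real POS tagger, stemmer, and thesaurus, and the class name `KeywordRetrievalSketch` is not from the paper.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class KeywordRetrievalSketch {
    // Hypothetical toy lexicon standing in for a real POS tagger.
    static final Map<String, String> POS = Map.of(
        "gets", "VB", "an", "DT", "element", "NN",
        "in", "IN", "the", "DT", "collection", "NN");

    // Tags kept by the POS filter in this sketch: verbs and nouns only.
    static final Set<String> KEPT_TAGS = Set.of("VB", "NN");

    // Hypothetical synonym table, filled with the synonyms from the slide example.
    static final Map<String, List<String>> SYNONYMS = Map.of(
        "get", List.of("have", "return"),
        "element", List.of("object", "component"),
        "collection", List.of("list", "set"));

    // Naive stemmer (an assumption): strips a few common English suffixes.
    static String stem(String word) {
        for (String suffix : List.of("ing", "ed", "s")) {
            if (word.endsWith(suffix) && word.length() > suffix.length() + 2) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Returns each keyword (root form) mapped to its synonyms, in query order.
    static Map<String, List<String>> keywords(String query) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String token : query.toLowerCase().split("[^a-z]+")) {
            if (token.isEmpty()) {
                continue;
            }
            // Stage 1, POS tagging: unknown words are assumed to be nouns.
            String tag = POS.getOrDefault(token, "NN");
            // Stage 2, POS filtering: drop determiners, prepositions, and so on.
            if (!KEPT_TAGS.contains(tag)) {
                continue;
            }
            // Stage 3, stemming.
            String root = stem(token);
            // Stage 4, synonym generation.
            result.put(root, SYNONYMS.getOrDefault(root, List.of()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(keywords("Gets an element in the collection"));
    }
}
```

A production system would replace each stand-in with a proper NLP component; the sketch only shows how the stages chain together.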
  14. • A function call query contains two kinds of objects: words and variables.
      • Many words may be related to each variable in the query.
      • Each word in the query is relevant to at most one variable.
      • The words that are relevant to a variable are called the features of that variable.
  15. • Every relation between words and a variable is represented by a “variable retrieval rule” derived from a corresponding syntactic rule.
      • Examples of variable retrieval rules:
        o Root(sf V) -> VB(wf W) NP(sf V)
        o NP(sf {v1, v2}) -> NP(vf v1) PP(vf v2)
        o NP(sf V ∪ {v}) -> NP(sf V) PP(vf v)
        o NP(sf V1 ∪ V2) -> NP(sf V1) PP(sf V2)
        o PP(vf v[W1 W2]) -> IN(wf W1) NP(wf v[W2])
        o PP(sf V) -> IN(wf W) NP(sf V)
        o NP(wf W1 W2) -> NN(wf W1) NN(wf W2)
        o NP(vf v[W1 W2]) -> NN(wf W1) NN(vf v[W2])
        o NP(vf v[W1 W2 W3]) -> DT(wf W1) VBN(wf W2) NN(vf v[W3])
  16. • In Fig. 3, a natural language query (“Insert element e in a set at index k”) is parsed into a tree structure using the Stanford Parser tool.
      • The final result is:
        o e[element];
        o a[a set];
        o k[at index];
      [Fig. 3: Parsing tree for the function call query]
  17. • This subsection focuses on mining the API specification of Java, called the Java API specification.
      • The Java API specification contains several kinds of function-related content that can be mined to support the function retrieval and code generation processes:
        o function specification
        o functionality description
        o parameter features
  18. • Function specification: structured data that describes how the function is used.
        o Information that can be extracted from it includes the function name, function scope, return type, list of parameters, and so on.
      • Functionality description: unstructured natural language text that describes what the function does.
        o The keyword retrieval method (presented on an earlier slide) is used to extract information from it.
      • Parameter features: unstructured natural language text that describes the features of the parameters in the function specification.
        o The necessary information is extracted using natural language processing techniques.
  19. Example:
      • The function add() is described in the Java API specification for ArrayList as follows.
      • Function specification: public void add(int index, Object element).
      • Functionality description: “Inserts the specified element at the specified position in this list”.
      • Parameter features: “index - index at which the specified element is to be inserted” and “element - element to be inserted”.
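As a rough illustration of how the structured function specification in this example could be mined, the following Java sketch splits a signature string into scope, return type, name, and parameters. The class `FunctionSpecSketch` and its regular expression are assumptions made for this example, not the paper's implementation; modifiers such as `static` and generic parameter lists containing commas are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FunctionSpecSketch {
    // Groups: 1 = optional scope, 2 = return type, 3 = function name, 4 = raw parameter list.
    static final Pattern SIGNATURE = Pattern.compile(
        "\\s*(public|protected|private)?\\s*([\\w.<>\\[\\]]+)\\s+(\\w+)\\s*\\(([^)]*)\\)\\s*");

    final String scope;
    final String returnType;
    final String name;
    // Each entry is a two-element array: {type, name}.
    final List<String[]> parameters = new ArrayList<>();

    private FunctionSpecSketch(String scope, String returnType, String name) {
        this.scope = scope;
        this.returnType = returnType;
        this.name = name;
    }

    static FunctionSpecSketch parse(String signature) {
        Matcher m = SIGNATURE.matcher(signature);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised signature: " + signature);
        }
        String scope = m.group(1) == null ? "package-private" : m.group(1);
        FunctionSpecSketch spec = new FunctionSpecSketch(scope, m.group(2), m.group(3));
        String rawParameters = m.group(4).trim();
        if (!rawParameters.isEmpty()) {
            for (String parameter : rawParameters.split(",")) {
                // "int index" becomes {"int", "index"}.
                spec.parameters.add(parameter.trim().split("\\s+"));
            }
        }
        return spec;
    }

    public static void main(String[] args) {
        FunctionSpecSketch spec = parse("public void add(int index,Object element)");
        System.out.println(spec.scope + " " + spec.returnType + " " + spec.name);
        for (String[] p : spec.parameters) {
            System.out.println("  parameter " + p[1] + " of type " + p[0]);
        }
    }
}
```

The functionality description and parameter features are free text, so they would go through the keyword retrieval stages rather than a pattern like this one.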
  20. • The function retrieval process has three stages.
      • Stage 1: extract the functions related to the user’s query, based on some constraints.
      • Stage 2: refine the result of the previous stage by removing irrelevant functions.
      • Stage 3: rank the remaining relevant functions in descending order of relevance to the query.
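The three stages can be illustrated with a minimal Java sketch. The slides do not state the actual constraints or ranking measure, so this example assumes a simple keyword-overlap count as the score, a `minHits` threshold for refinement, and a small hand-written index (`sampleIndex`); all of these names and choices are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FunctionRetrievalSketch {
    // Hypothetical index: function name -> keywords mined from its functionality description.
    static Map<String, Set<String>> sampleIndex() {
        Map<String, Set<String>> index = new LinkedHashMap<>();
        index.put("ArrayList.add", Set.of("insert", "element", "position", "list"));
        index.put("ArrayList.get", Set.of("return", "element", "position", "list"));
        index.put("ArrayList.clear", Set.of("remove", "element", "list"));
        index.put("ArrayList.size", Set.of("return", "number", "element", "list"));
        return index;
    }

    // Assumed relevance measure: number of query keywords found in the description.
    static int score(Set<String> queryKeywords, Set<String> descriptionKeywords) {
        int hits = 0;
        for (String keyword : queryKeywords) {
            if (descriptionKeywords.contains(keyword)) {
                hits++;
            }
        }
        return hits;
    }

    static List<String> retrieve(Set<String> queryKeywords,
                                 Map<String, Set<String>> index,
                                 int minHits) {
        // Stage 1: extract every function that shares at least one keyword with the query.
        Map<String, Integer> candidates = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : index.entrySet()) {
            int hits = score(queryKeywords, entry.getValue());
            if (hits > 0) {
                candidates.put(entry.getKey(), hits);
            }
        }
        // Stage 2: refine by removing weakly related functions.
        candidates.values().removeIf(hits -> hits < minHits);
        // Stage 3: rank in descending order of score (the sort is stable for ties).
        List<String> ranked = new ArrayList<>(candidates.keySet());
        ranked.sort((a, b) -> candidates.get(b) - candidates.get(a));
        return ranked;
    }

    public static void main(String[] args) {
        System.out.println(retrieve(Set.of("insert", "element", "position"), sampleIndex(), 2));
    }
}
```

With synonym generation in place, the query keywords would be expanded before scoring so that, for example, “index” could also match “position”.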
  21. • The standard syntax of a function call statement is object.callName(arg1, arg2, ..., argk).
      • To generate code for a function call, the user’s query is mapped to the corresponding function call based on its function definition.
      • Two steps:
        i. identify a certain variable vj as the object o, and
        ii. map the remaining variables to the corresponding arguments arg1, arg2, ..., argk.
  22. • In the first step, the function retrieval method is used to identify a set of functions related to the user’s query.
      • To use this method, however, the “function call query” must first be converted to a “function search query” by removing all variables from it.
      • The variable whose type contains at least one function related to the new query is the desired object o.
      • In the second step, all other variables are set as parameters.
      • For example, given the query “inserts an element <e:Object> in a collection <a:ArrayList>”, the variable a with type ArrayList contains the function add, which is related to the new query “inserts an element in a collection”, so a.add(?) is a suitable function call.
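The two steps can be sketched in Java as follows. This is a simplified illustration, not the paper's method: the `<name:Type>` annotation syntax is taken from the example query, while `retrievedByType` is a hypothetical stand-in for the function retrieval step (it maps a type name to the function found for the stripped query), and the remaining variables are passed in query order instead of being matched against parameter features.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CallGenerationSketch {
    // Matches annotated variables such as <e:Object> or <a:ArrayList>.
    static final Pattern VARIABLE = Pattern.compile("<(\\w+):([\\w.]+)>");

    // Converts a function call query into a function search query by removing all variables.
    static String toSearchQuery(String callQuery) {
        return VARIABLE.matcher(callQuery).replaceAll("").replaceAll("\\s+", " ").trim();
    }

    // Extracts variable name -> declared type, in the order they appear in the query.
    static Map<String, String> variables(String callQuery) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = VARIABLE.matcher(callQuery);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    // retrievedByType is a hypothetical stand-in for function retrieval:
    // type name -> function related to the stripped query. Returns null if no type matches.
    static String generate(String callQuery, Map<String, String> retrievedByType) {
        Map<String, String> vars = variables(callQuery);
        for (Map.Entry<String, String> variable : vars.entrySet()) {
            String function = retrievedByType.get(variable.getValue());
            if (function == null) {
                continue;
            }
            // Step i: this variable's type offers a related function, so it is the object o.
            // Step ii: all other variables become arguments (query order, a simplification).
            List<String> arguments = new ArrayList<>(vars.keySet());
            arguments.remove(variable.getKey());
            return variable.getKey() + "." + function + "(" + String.join(", ", arguments) + ")";
        }
        return null;
    }

    public static void main(String[] args) {
        String query = "inserts an element <e:Object> in a collection <a:ArrayList>";
        System.out.println(toSearchQuery(query));
        System.out.println(generate(query, Map.of("ArrayList", "add")));
    }
}
```

For the example query, the sketch picks `a` as the object because only `ArrayList` has a retrieved function, and `e` becomes the single argument.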
  23. A. User Study
      • In the first user study, ten common search tasks were designed and assigned to the participants.
      • Each participant then used FSE and some other search engines to complete these tasks.
      • Three search engines were given to the users for the study: FSE, Krugle, and Koder.
  24. • In the second user study, the participants suggested over 100 requests to generate code for function calls.
      • They then checked the degree of fitness between the obtained results and their requests to calculate the accuracy of FSE.
      • There are four degrees of fitness: Highly Relevant, Somewhat Relevant, Somewhat Irrelevant, and Highly Irrelevant.
        o Highly Relevant: the top result in the set of returned solutions fits the user’s request exactly.
        o Somewhat Relevant: the desired result is in the result set but not in the first position.
        o Somewhat Irrelevant: the result set contains the function with the correct name but the wrong parameters.
        o Highly Irrelevant: the lowest level.
  25. B. Results
      [Figure: bar chart comparing Krugle, Koder, and FSE for User 1, User 2, and User 3; vertical axis from 0 to 0.6.]
  26. B. Results
      The figure on this slide reports:
      • 92%: correct functions relevant to the user’s request were returned.
      • 71%: the correct function was in the first position of the solution set.
      • 7%: no proper function was found.
  27. • The paper proposes an efficient function search approach that uses the API specification.
      • It also presents a novel function call generation method that generates source code to invoke functions, based on variable features extracted from the user’s query.
      • Finally, the authors implemented FSE, a function search engine that helps programmers quickly examine different functions that might be appropriate for a problem, obtain more information about particular functions, and automatically generate code for function calls to learn how to use a function.
