Jeeves - natural language interface application


A simple natural language interface application that launches applications and shows information to the user based on voice input, which is processed using natural language processing concepts.

Published in: Technology

  1. JEEVES - A NATURAL LANGUAGE PROCESSING APPLICATION FOR ANDROID
     Presented by: Anshul Agarwal (IP108IS017), Karan Harsh Wardhan (1PI08IS045), Pavani Deepak Mehta (1PI08IS070)
     Jan 2012 - May 2012, Dept. of ISE
  2. AGENDA
     • Project Overview
     • Relevance
     • Requirements
     • Introduction
     • Technologies Used
     • System Design
  3. AGENDA
     • Software Development Strategy
     • Implementation
     • Difficulties
     • Screenshots
     • Conclusion
     • Future Enhancements
     • References
  4. PROJECT OVERVIEW
     • Goal of the project:
       – To design a Natural Language Processing (NLP) interface application for the Android platform
     • Scope of the project:
       – Take input from the user through voice recognition
       – Process the input and recognize user commands spoken in natural English
       – Basic features: making calls and sending messages to contacts
       – Advanced features: showing the weather, Google search and launching apps
  5. RELEVANCE
     • Allows for more intuitive human-computer interaction
     • More convenient when the computer automates tasks usually performed by the user
     • Goal is to reduce keypad usage as much as possible and let the user speak naturally
     • Can help people with disabilities use phones more easily
     • Acts as a virtual assistant, enhancing productivity
  6. REQUIREMENTS
  7. INTRODUCTION
     • NLP is a field of computer science and linguistics concerned with the interactions between computers and human languages
     • The aim is to design software that can analyze, understand and generate the languages that humans use naturally
     • Eventually, users will be able to address a computer the way they talk to another person
  8. TECHNOLOGIES USED - ANDROID
     WHY ANDROID?
     • Android is an open-source, Linux-based operating system for mobile devices
     • A developer license is required only to publish, not to develop
     • Google voice recognition is built into most Android devices
     • Layered architecture facilitates rapid development of applications
  10. TECHNOLOGIES USED - ANDROID
      INTENTS
      • A messaging facility for late run-time binding between components
      • An Intent object is a passive data structure holding a description of the operation to be performed
      • The three core components of an application (activities, services and broadcast receivers) are activated through intents
  11. TECHNOLOGIES USED - ANDROID
      INTENTS
      • Two types: implicit and explicit
      • Intent filters are implemented to help with intent resolution
      • Used to turn apps into high-level libraries and to make code modular and reusable:
          // Look up the launch intent for an installed app by its package name
          Intent intent = getPackageManager().getLaunchIntentForPackage(appPackageName);
          startActivity(intent);
      • Additional information can be attached as extras:
          intent.putExtra("title", "Hello codeandroid");
  12. TECHNOLOGIES USED – HTTP POST
      • A POST request is used to send data to the server
      • The string detected by the voice recognizer is passed to the server using this method
      • Accomplished using the built-in HttpCore API, i.e. the org.apache.http package
      • The server performs the processing and returns a JSON response
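The client in this project is an Android app using org.apache.http. Purely as an illustration of the request shape, here is a minimal Python sketch that builds the same kind of form-encoded POST. The endpoint URL and the form field name `text` are assumptions; the slides do not state either.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint; the deck does not give the server address.
SERVER_URL = "http://nlp-server.example/parse"


def build_post_request(recognized_text, url=SERVER_URL):
    """Build (but do not send) a POST request carrying the recognized string."""
    # "text" is an assumed form field name.
    body = urllib.parse.urlencode({"text": recognized_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_post_request("call smith")
# Sending it would be: urllib.request.urlopen(request).read()
```

Building the request separately from sending it keeps the sketch runnable without a live server.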
  13. TECHNOLOGIES USED - JSON
      • JavaScript Object Notation (JSON) is a lightweight data-interchange format
      • Based on a subset of the JavaScript programming language
      • Completely language independent
      • In Java, org.json.JSONObject is used to parse these strings:
          JSONObject responseJSON = new JSONObject(responseString);
          String workId = responseJSON.getString("id");
  14. TECHNOLOGIES USED - JSON
      • Example:
          {"menu": {
            "id": "file",
            "value": "File",
            "popup": {
              "menuitem": [
                {"value": "New", "onclick": "CreateNewDoc()"},
                {"value": "Open", "onclick": "OpenDoc()"},
                {"value": "Close", "onclick": "CloseDoc()"}
              ]
            }
          }}
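Since the NLP server in this project is written in Python, the same menu example can be read with the standard `json` module. This is only a sketch of nested access; the variable names are illustrative.

```python
import json

# The menu example from the slide, as a JSON string.
menu_json = """
{"menu": {"id": "file", "value": "File", "popup": {"menuitem": [
  {"value": "New", "onclick": "CreateNewDoc()"},
  {"value": "Open", "onclick": "OpenDoc()"},
  {"value": "Close", "onclick": "CloseDoc()"}
]}}}
"""

doc = json.loads(menu_json)  # JSON objects become dicts, arrays become lists

menu_id = doc["menu"]["id"]
labels = [item["value"] for item in doc["menu"]["popup"]["menuitem"]]
```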
  15. TECHNOLOGIES USED – VOICE RECOGNITION
      WHY GOOGLE VOICE RECOGNITION?
      • Automatic speech recognition was not the focus of the project
      • Pre-installed on most Android phones and easy to access
      • Requires no special permission or payment
      • Developed, optimized and maintained by Google since 2007
      • Recognition happens off-device, on Google's servers, so no heavyweight voice recognition software needs to be installed on the phone
      • Only the android.speech.RecognizerIntent class is needed
  16. TECHNOLOGIES USED – VOICE RECOGNITION
      HOW DOES RECOGNITION WORK?
      • Google uses artificial intelligence algorithms to recognize spoken sentences
      • Voice data is stored anonymously for analysis
      • Spoken data is cross-matched with written queries on the server
      • The key problems of computational power, data availability and managing large amounts of information are handled on Google's servers rather than on the phone
  17. TECHNOLOGIES USED - NLTK
      NATURAL LANGUAGE TOOLKIT
      • Open-source suite of NLP libraries for the Python language
      • Includes graphical demonstrations and sample data
      • Provides NLP APIs, e.g. for importing a corpus or loading a grammar from a file
  18. SYSTEM DESIGN
      • DFD Level 0
  19. SYSTEM DESIGN
      • DFD Level 1
  20. SYSTEM DESIGN
      • DFD Level 2
  21. SOFTWARE DEVELOPMENT STRATEGY
      • The software development strategy used is Extreme Programming
      • It is a type of agile software development
      • Advocates frequent releases in multiple short development cycles rather than one long cycle
      • Involves programming in groups and extensive code review
      • Works best with smaller groups
  22. SOFTWARE DEVELOPMENT STRATEGY
  23. IMPLEMENTATION
      WHAT WE HAVE IMPLEMENTED
      • Corpus: a modification of the Brown corpus
      • Tokenizer
      • Part-of-speech tagger
      • Grammar
      • Syntactic and semantic analysis
      • Client application on Android:
        – Takes voice input from the user, converts it to text, passes the text to the NLP server, receives an id from the server and launches the corresponding intent
  24. IMPLEMENTATION
      • The client application starts up and prompts the user for input using Google Voice Recognition
      • The voice input is sent to Google's servers for processing, and the recognized text is returned to the client
      • The text is then passed to the NLP server using HTTP POST
      • The server performs natural language processing on it
  25. IMPLEMENTATION
      • Steps involved in NLP:
        – Lexical analysis: converts a sequence of characters into a sequence of tokens
        – Morphological analysis: identifies, analyzes and describes the structure of a language's linguistic units
        – Syntactic analysis: analyzes text, made up of a sequence of tokens, to determine its grammatical structure
        – Semantic analysis: relates syntactic structures, from the level of phrases and sentences, to their language-independent meanings
  26. IMPLEMENTATION
      • A corpus is a large, structured set of texts
      • Used in part-of-speech tagging to tag words with their part of speech
      • Tags are stored along with the words in the corpus
      • We modified the Brown corpus to include more relevant phrases and commands
      • The corpus contains text from books, news articles, journals, etc.
      • NLTK's TaggedCorpusReader is needed to import the corpus
  27. IMPLEMENTATION
      • During lexical analysis, the string is split into tokens, using the space character as the separator
      • The tokens, which are words in this case, are passed to a part-of-speech (POS) tagger
      • The POS tagger assigns a tag to each word according to its part of speech
      • Example tags: adjective (ADJ), common noun (NN), proper noun (NP), verb (VB), etc.
      • A custom command tag (CMD) was created to recognize application commands such as call, message and launch
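The tokenizing and tagging steps above can be sketched in plain Python. This is a simplified stand-in, not the project's code: the real tagger is trained on the modified Brown corpus, whereas the lookup table `LEXICON` and the `DET` tag used here are assumptions made for illustration.

```python
# Hypothetical mini-lexicon standing in for a tagger trained on a corpus.
LEXICON = {
    "call": "CMD", "message": "CMD", "launch": "CMD",  # custom command tag
    "the": "DET", "smith": "NP", "camera": "NN",
}


def tokenize(text):
    """Lexical analysis: split on whitespace, lowercasing for lookup."""
    return text.lower().split()


def tag(tokens, default="NN"):
    """Give each token its lexicon tag, or a default noun tag if unknown."""
    return [(token, LEXICON.get(token, default)) for token in tokens]


tagged = tag(tokenize("Launch the camera"))
```

The result is a list of (word, tag) tuples, the form in which tokens are handed to the parser later in the pipeline.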
  28. IMPLEMENTATION
      • Different POS taggers are available:
        – Default Tagger: the simplest; tagging accuracy is 13%
        – Unigram Tagger: tagging accuracy is 81%
        – Bigram Tagger: accuracy is 10% when used alone, but 85% with the Unigram Tagger as backoff
        – Trigram Tagger: accuracy is 91.3% with the Bigram Tagger above as backoff
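Tagging accuracy is the fraction of tokens whose predicted tag matches a hand-tagged reference. A minimal sketch of that calculation follows; the four-word sample is hypothetical and is not the data behind the percentages above.

```python
def tagging_accuracy(predicted, gold):
    """Fraction of tokens whose predicted tag equals the reference tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(1 for (_, p), (_, g) in zip(predicted, gold) if p == g)
    return correct / len(gold)


# Hypothetical reference tagging of one sentence.
gold = [("launch", "CMD"), ("the", "DET"), ("camera", "NN"), ("now", "ADV")]

# A default tagger labels every word as a noun.
predicted = [(word, "NN") for word, _ in gold]

score = tagging_accuracy(predicted, gold)
```

Only one of the four words is really a common noun, so the default tagger scores 0.25 on this sample, which illustrates why a default tagger alone performs poorly.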
  29. IMPLEMENTATION
      • Because it gives the highest accuracy, the following chained model is used for tagging:
        – First, a Default Tagger, which assigns a default noun tag
        – Then a Unigram Tagger, with the Default Tagger as backoff
        – Then a Bigram Tagger, with the Unigram Tagger as backoff
        – Lastly a Trigram Tagger, with the Bigram Tagger as backoff
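The backoff chain can be illustrated with a small hand-rolled tagger. This is a sketch of the idea only, not NLTK's implementation, and the two-sentence training set is hypothetical. Each n-gram tagger looks up the current word together with the previous n-1 tags, and hands the word to its backoff tagger when that context was never seen in training.

```python
from collections import Counter, defaultdict


class DefaultTagger:
    """End of the chain: always answers with the same tag."""

    def __init__(self, tag):
        self.default = tag

    def choose(self, word, history):
        return self.default


class NgramTagger:
    """Looks up (previous n-1 tags, word); defers to `backoff` on a miss."""

    def __init__(self, n, train, backoff):
        self.n = n
        self.backoff = backoff
        counts = defaultdict(Counter)
        for sentence in train:
            history = []
            for word, tag in sentence:
                counts[self._context(word, history)][tag] += 1
                history.append(tag)
        # Keep only the most frequent tag seen for each context.
        self.table = {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}

    def _context(self, word, history):
        prev = tuple(history[-(self.n - 1):]) if self.n > 1 else ()
        return (prev, word)

    def choose(self, word, history):
        tag = self.table.get(self._context(word, history))
        return tag if tag is not None else self.backoff.choose(word, history)


def tag_sentence(tagger, words):
    """Tag words left to right, feeding earlier tags back in as context."""
    history = []
    for word in words:
        history.append(tagger.choose(word, history))
    return list(zip(words, history))


# Tiny hypothetical training set; "call" is a command in one sentence, a noun in the other.
train = [
    [("call", "CMD"), ("smith", "NP")],
    [("make", "VB"), ("a", "DET"), ("call", "NN")],
]

default = DefaultTagger("NN")
unigram = NgramTagger(1, train, backoff=default)
bigram = NgramTagger(2, train, backoff=unigram)
trigram = NgramTagger(3, train, backoff=bigram)

known = tag_sentence(trigram, ["make", "a", "call"])
unknown = tag_sentence(trigram, ["call", "jones"])
```

In `known`, the trigram context distinguishes "call" as a noun after "make a". In `unknown`, the word "jones" misses at every n-gram level and ends up with the default noun tag.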
  30. IMPLEMENTATION
      • Taggers must be trained on the corpus so that they can recognize words and tag them accordingly
      • Training takes time
      • Taggers can be trained in advance to avoid paying this cost on every run
      • Python pickle files are used to save the pre-trained taggers
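Saving and reloading a trained object with Python's standard `pickle` module might look like the sketch below. The dictionary `trained_model` is a hypothetical stand-in for a trained tagger, and the file name is an assumption.

```python
import os
import pickle
import tempfile


def save_tagger(tagger, path):
    """Serialize a trained tagger so later runs can skip training."""
    with open(path, "wb") as f:
        pickle.dump(tagger, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_tagger(path):
    """Load a previously saved tagger. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a trained tagger (hypothetical data).
trained_model = {"call": "CMD", "smith": "NP", "weather": "NN"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tagger.pickle")  # assumed file name
    save_tagger(trained_model, path)
    restored = load_tagger(path)
```

A server would typically train and save once, then call `load_tagger` at startup.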
  31. IMPLEMENTATION
      • The tagger uses statistical and probabilistic techniques to assign tags
      • Tagged tokens are passed to the parser
      • The parser checks that the sentence conforms to the rules of the grammar, so only grammatically valid sentences are accepted
      • Predefined commands specify the functionality they represent
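The idea of accepting only sentences that fit the grammar can be sketched by matching the sequence of tags against a pattern. This is a deliberate simplification: the slides do not show the project's grammar, so the rule below (a command, an optional determiner, then one or more nouns) is an assumption.

```python
import re

# Assumed rule: CMD, optional DET, then one or more NN/NP tags.
COMMAND_PATTERN = re.compile(r"CMD( DET)?( (NN|NP))+")


def is_valid(tagged):
    """Return True if the tag sequence of the sentence matches the rule."""
    tags = " ".join(tag for _, tag in tagged)
    return COMMAND_PATTERN.fullmatch(tags) is not None


accepted = is_valid([("call", "CMD"), ("smith", "NP")])
rejected = is_valid([("smith", "NP"), ("call", "CMD")])
```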
  32. IMPLEMENTATION
      • (word, tag) tuples are parsed to recognize commands and the entities to which those commands apply
      • Command words and receivers are extracted
      • A unique id is returned to the client along with the receiver
      • The id represents the required functionality, and the receiver is the variable parameter to which that functionality is applied
      • The client uses the id to launch an intent, passing the receiver as its parameter
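Extracting the command and its receiver from the tagged tuples, then packaging them for the client, could be sketched as follows. The id values and the JSON field name `receiver` are hypothetical; only the `id` field appears in the deck's own Java snippet.

```python
import json

# Hypothetical id table; the deck does not list the actual ids.
COMMAND_IDS = {"call": "1", "message": "2", "launch": "3"}


def interpret(tagged):
    """Return {"id", "receiver"} for the first command word, or None."""
    for i, (word, tag) in enumerate(tagged):
        if tag == "CMD" and word in COMMAND_IDS:
            # The receiver is taken to be the nouns that follow the command.
            receiver = " ".join(w for w, t in tagged[i + 1:] if t in ("NN", "NP"))
            return {"id": COMMAND_IDS[word], "receiver": receiver}
    return None


result = interpret([("call", "CMD"), ("john", "NP"), ("smith", "NP")])
reply = json.dumps(result)  # the string sent back to the client
```

On the client side, the id would select which intent to launch and the receiver would be passed along as its parameter.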
  33. DIFFICULTIES
      • Modifying the corpus to suit the application's needs
      • Choosing a POS tagger
      • Advanced semantics: multiple phrasings of the same command, e.g. call, make a call, make a phone call; message, send a message
      • Separating training from processing
  34-38. SCREENSHOTS
  39. CONCLUSION
      • The aim was to create a natural language interface application that acts as a virtual assistant
      • The app can perform basic functions such as calling and messaging
      • It also performs advanced functions such as search, showing the weather and launching apps
      • Works with multiple phrasings of the same command, spoken in natural English
      • Processing a command takes only a few seconds
  40. FUTURE ENHANCEMENTS
      • Support more phrasings of the same command
      • Increase accuracy for longer sentences
      • Process short basic commands such as ‘call smith’ on the mobile device
      • Provide custom settings to the user