Voice Enabled Desktop Interaction and Control System (VEDICS).



Nischal Rao & Bharat Joshi, Rashtreeya Vidyalaya College of Engineering

Published in: Technology


  1. Desktop at Your Command
  2. • Team Members: Nischal E Rao, Bharat Joshi, Suhas Kamath N, Sharath M Puranik
     • Project Guide: Prof. Shantharam Nayak
     • Carried out at: R.V. College of Engineering, Bangalore, India.
  3. • Voice Enabled Desktop Interaction and Control System (VEDICS) is a software solution for controlling the desktop system using voice-based commands.
     • The system takes audio signals as input, processes and recognizes them, and executes the desired action on the desktop system.
  4. • All software products should incorporate accessibility features to enable differently-abled people to use the software easily and efficiently.
     • For persons with physical disabilities, the ability to simply talk to a computer could be a priceless asset.
     • Hands-free computing is more convenient than conventional I/O.
  5. • The user should be able to
       o access any element present on the user’s screen.
       o run common programs and applications.
       o navigate through the file system.
       o perform common window operations like minimize, maximize, close etc.
     • User commands should be easy to remember and use.
     • The user must be able to turn the system on and off whenever required.
  6. • VEDICS follows the MVC (Model-View-Controller) design pattern.
     • Any speech-to-text converter can be used with VEDICS.
     • VEDICS uses a feedback mechanism to learn what is being displayed on the desktop.
     • Accuracy increases because only words relevant to the current screen are recognized.
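  7. [Architecture diagram. Components: Speech-to-text Converter, Desktop Control System, User’s Desktop. Flows: Recognized Text (converter to control system); Command (control system to desktop); Currently visible objects (desktop to control system); Grammar and Names of visible elements (control system to converter).]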
  8. • Speech-to-text conversion.
     [Diagram: Speech To Text Converter]
  9. • Grammar and Dictionary are used to convert sound signals into text.
     [Diagram: Speech To Text Converter, Grammar, Dictionary]
  10. • The recognized text is given as input to the Desktop Control System.
      [Diagram: Speech To Text Converter -> “Open Firefox” -> Desktop Control System; Grammar, Dictionary]
  11. • The Desktop Control System determines the command to execute on the desktop.
      [Diagram: Speech To Text Converter, Desktop Control System -> Open_firefox command]
  12. • After successful execution, the names of objects visible on the screen are collected.
      [Diagram: Speech To Text Converter, Desktop Control System <- “File” | “Edit” | “Google”]
  13. • The collected names are used to update the grammar and the dictionary files.
      [Diagram: Desktop Control System -> “File”, “Edit”, “Google” -> Grammar, Dictionary; Speech To Text Converter]
  14. • The updated grammar and dictionary files are used in the next recognition cycle.
      [Diagram: Speech To Text Converter, Updated Grammar, Updated Dictionary]
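
The grammar update in this cycle can be illustrated with a small sketch. The slides name the Java Speech Grammar Format (JSGF) in their references but do not show the grammar VEDICS actually writes, so everything below is an assumption made for illustration: the grammar name `vedics`, the rule name `<command>`, the helper names `to_token` and `build_jsgf`, and the choice to lower-case names and drop punctuation.

```python
import re


def to_token(name):
    """Lower-case a visible element name and keep only letters, digits and spaces.

    Assumption: the recognizer's dictionary is keyed on lower-case words.
    """
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def build_jsgf(grammar_name, fixed_commands, visible_names):
    """Return JSGF text with one public rule that accepts any fixed command
    or the name of any element currently visible on the screen."""
    alternatives = []
    for phrase in list(fixed_commands) + list(visible_names):
        token = to_token(phrase)
        if token and token not in alternatives:  # skip empties and duplicates
            alternatives.append(token)
    if not alternatives:
        raise ValueError("a grammar rule needs at least one alternative")
    body = " | ".join(alternatives)
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <command> = {body};\n"
    )


# Names collected after "Open Firefox" ran, as in the slides' example.
grammar_text = build_jsgf(
    "vedics",
    ["open firefox", "close window"],
    ["File", "Edit", "Google"],
)
print(grammar_text)
```

Regenerating the whole rule on every cycle, rather than patching it, keeps the active vocabulary limited to what is on the screen, which is the property the slides credit for the improved accuracy.
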
  15. • VEDICS consists of the following parts:
        o Sphinx 4 Sub-system: Open-source tool used to convert speech to text.
        o Desktop Control Sub-system: Maps the recognized text to the corresponding command and executes it on the desktop. It also re-creates the grammar file based on what is displayed on the screen.
        o Logios Tool: Used to generate a new dictionary based on what is displayed on the screen.
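
As a rough illustration of the Desktop Control Sub-system's first job, the sketch below maps recognized text either to a fixed command or to an element that is currently visible. It is not taken from the VEDICS source: the function `interpret`, the table `FIXED_COMMANDS`, and the action name `activate_element` are hypothetical; only `open_firefox` echoes the command name shown on the slides.

```python
# Hypothetical table of built-in commands: spoken phrase -> command identifier.
FIXED_COMMANDS = {
    "open firefox": "open_firefox",
    "minimize window": "minimize_window",
}


def interpret(recognized_text, fixed_commands, visible_names):
    """Map recognized text to an (action, argument) pair, or None if unknown.

    Fixed commands are checked first; otherwise the text is matched,
    case-insensitively, against the names of visible screen elements.
    """
    text = " ".join(recognized_text.lower().split())
    if text in fixed_commands:
        return fixed_commands[text], None
    for name in visible_names:
        if name.lower() == text:
            return "activate_element", name
    return None


visible = ["File", "Edit", "Google"]
print(interpret("Open Firefox", FIXED_COMMANDS, visible))  # ('open_firefox', None)
print(interpret("edit", FIXED_COMMANDS, visible))          # ('activate_element', 'Edit')
print(interpret("banana", FIXED_COMMANDS, visible))        # None
```
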
  16. • Accuracy of VEDICS depends on the accuracy of Sphinx 4.
      • Summary of performance of Sphinx 4:

        Parameter                                  Performance
        Vocabulary size (words)                    79
        Word error rate (%)                        1.192
        RT ratio, single-CPU configuration*        0.25
        RT ratio, dual-CPU configuration*          0.20

        * RT ratio: ratio of the time taken to decode an utterance to the duration of the utterance (lower is faster).
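
The two metrics in this summary can be made concrete with a short sketch. It assumes the conventional definitions: word error rate is the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words, and the RT ratio is decode time divided by utterance duration, a reading that fits the table, where the dual-CPU figure is the lower one. The function names and sample sentences are invented for illustration and are not the data behind the figures above.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate in percent, using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def rt_ratio(decode_seconds, utterance_seconds):
    """Decode time divided by utterance duration; below 1.0 is faster than real time."""
    return decode_seconds / utterance_seconds


print(word_error_rate("open the file menu", "open file menu"))  # 25.0
print(rt_ratio(0.5, 2.0))                                       # 0.25
```
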
  17. • Increased accuracy due to the context-aware nature of VEDICS.
      • Use of a small vocabulary further improves accuracy.
      • Use of Logios enables recognition of custom words. Words with any sequence of characters can be recognized.
      • Almost all components on the desktop are accessible.
  18. • VEDICS can be used to perform most actions that can be done using a pointing device.
      • Using voice to access and control the desktop has many advantages. This feature can be a boon to differently-abled people.
      • VEDICS can navigate through the file system, open applications, control desktop windows, and recognize almost any word.
      • VEDICS is context aware. It determines what is currently being displayed on the desktop and dynamically generates the grammar and the dictionary.
  19. • Dictation facility: The ability to dictate into a text editor or text field.
      • Artificial Intelligence in VEDICS.
      • If several objects on the screen share the same name, the user should be able to select the right object.
      • The user should be able to either pronounce the entire word or spell out its individual characters.
      • Facility to add custom commands to suit the user.
      • Screen reader facility.
  20. Project Link: http://vedics.sourceforge.net/
      References:
      • Willie Walker, Paul Lamere, Philip Kwok, Bhiksha Raj, Rita Singh, Evandro Gouvea, Peter Wolf, Joe Woelfel, “Sphinx-4: A Flexible Open Source Framework for Speech Recognition”, SML Technical Report, Sun Microsystems, SMLI TR-2004-139, Nov. 2004.
      • Kai-Fu Lee, Hsiao-Wuen Hon, Raj Reddy, “An Overview of the SPHINX Speech Recognition System”, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 38, No. 1, Jan. 1990.
      • Frank Buschmann, Regine Meunier, Hans Rohnert, Peter Sommerlad, Michael Stal, “Pattern-Oriented Software Architecture, Vol. 1: A System of Patterns”, Wiley Publications, 1996.
  21. • “Gnome Voice Control” [Online]. Available: http://live.gnome.org/GnomeVoiceControl
      • “Java Speech Grammar Format (JSGF)” [Online]. Available: http://java.sun.com/products/java-media/speech/forDevelopers/JSGF/
      • “Logios Lexicon Tool” [Online]. Available: http://www.speech.cs.cmu.edu/tools/lextool.html
      • “Gnome Accessibility API” [Online]. Available: http://library.gnome.org/devel/at-spi-cspi/
      • “Libwnck: Window Navigator Construction Kit” [Online]. Available: http://library.gnome.org/devel/libwnck/
      • “GConf Configuration System” [Online]. Available: http://library.gnome.org/devel/gconf/