Desktop at Your Command
• Team Members:
       Nischal E Rao
       Bharat Joshi
       Suhas Kamath N
       Sharath M Puranik


• Project Guide: Prof. Shantharam Nayak
• Carried out at:
        R.V. College of Engineering,
              Bangalore, India.
• Voice Enabled Desktop Interaction and
  Control System (VEDICS) is a software
  solution for controlling the desktop system
  using voice-based commands.

• The system takes audio signals as input,
  processes and recognizes them, and executes
  the desired action on the desktop system.
• All software products should incorporate
  accessibility features to enable differently-abled
  people to use the software easily and efficiently.

• For persons with physical disabilities, the ability
  to simply talk to a computer could be a priceless
  asset.

• Hands-free computing can be more convenient than
  conventional keyboard and mouse input.
• The user should be able to
   o access any element present on the screen.
   o run common programs and applications.
   o navigate through the file system.
   o perform common window operations such as minimize,
     maximize, and close.

• User commands should be easy to remember and use.

• The user must be able to turn the system on and off
  whenever required.
• VEDICS follows the Model-View-Controller (MVC) design pattern.

• Any speech-to-text converter can be used with VEDICS.

• VEDICS uses a feedback mechanism to learn what is
  being displayed on the desktop.

• Accuracy increases because only relevant words are
  recognized.
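
One way the converter could be made swappable is to hide it behind a small interface that the rest of the system depends on. The sketch below is only an illustration of that idea: the names `SpeechToTextConverter`, `ScriptedConverter`, and `run_once` are hypothetical and are not taken from the VEDICS sources, and the stand-in recognizer returns canned phrases instead of decoding audio.

```python
from typing import Iterable, List, Protocol


class SpeechToTextConverter(Protocol):
    """Hypothetical interface: anything that turns audio into text."""

    def load_vocabulary(self, words: Iterable[str]) -> None: ...

    def recognize(self, audio: bytes) -> str: ...


class ScriptedConverter:
    """Stand-in recognizer for illustration; it 'hears' canned phrases."""

    def __init__(self, phrases: List[str]) -> None:
        self._phrases = list(phrases)
        self.vocabulary: List[str] = []

    def load_vocabulary(self, words: Iterable[str]) -> None:
        # Store the active words once each, in a stable order.
        self.vocabulary = sorted(set(words))

    def recognize(self, audio: bytes) -> str:
        # A real converter would decode the audio; this one ignores it.
        return self._phrases.pop(0) if self._phrases else ""


def run_once(converter: SpeechToTextConverter, audio: bytes) -> str:
    """The caller depends only on the interface, not on a specific engine."""
    return converter.recognize(audio)
```

Any engine that offers these two methods, Sphinx 4 included, could then be wrapped and passed to `run_once` without changing the caller.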
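[Figure: VEDICS architecture. The Speech-to-text Converter passes the recognized text to the Desktop Control System, which sends a command to the user's desktop. The desktop reports its currently visible objects to the Desktop Control System, which passes the grammar and the names of visible elements back to the Speech-to-text Converter.]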
• Speech-to-text conversion
[Diagram: the Speech-to-text Converter.]
• Grammar and Dictionary are used to
  convert sound signals into text.
[Diagram: the Speech-to-text Converter reads the Grammar and the Dictionary.]
• The recognized text is given as input to
  the Desktop Control System.
[Diagram: the Speech-to-text Converter, using the Grammar and the Dictionary, passes the recognized text “Open Firefox” to the Desktop Control System.]
• The Desktop Control System determines
  the command to execute on the desktop.
[Diagram: the Desktop Control System issues the Open_firefox command.]
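
The step from recognized text to a desktop command can be pictured as a lookup. The following is a minimal sketch under stated assumptions: the command table, the action names, and the `resolve_command` function are hypothetical, since the slides do not list the actual VEDICS command set.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical built-in commands; the real VEDICS command set is not given in the slides.
BUILTIN_COMMANDS: Dict[str, str] = {
    "open firefox": "open_firefox",
    "minimize": "minimize_window",
    "maximize": "maximize_window",
    "close": "close_window",
}


def resolve_command(text: str, visible: Iterable[str]) -> Tuple[str, str]:
    """Return (action, argument) for recognized text, or ("unknown", text)."""
    # Normalize case and whitespace before any comparison.
    phrase = " ".join(text.lower().split())
    if phrase in BUILTIN_COMMANDS:
        return (BUILTIN_COMMANDS[phrase], "")
    for name in visible:
        if name.lower() == phrase:
            # Speaking the name of a visible element activates that element.
            return ("activate", name)
    return ("unknown", phrase)
```

For example, `resolve_command("file", ["File", "Edit"])` yields `("activate", "File")`, while a phrase that matches neither table nor screen is reported as unknown.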
• After successful execution, the names of
  objects visible on the screen are collected.
[Diagram: the Desktop Control System collects the visible names “File”, “Edit”, and “Google”.]
• The collected names are used to update
  the grammar and the dictionary files.
[Diagram: the Desktop Control System writes the names “File”, “Edit”, and “Google” into the Grammar and the Dictionary.]
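
The grammar update can be sketched as regenerating a small JSGF file from the collected names. This is an assumption-laden illustration, not the VEDICS implementation: the grammar name `vedics`, the rule name `<command>`, and the helper functions are hypothetical, and names are reduced to plain lowercase words so that no JSGF quoting is needed.

```python
import re
from typing import Iterable, List


def clean_name(name: str) -> str:
    """Keep only letters, digits and spaces so the name is a plain token sequence."""
    words = re.findall(r"[A-Za-z0-9]+", name)
    return " ".join(w.lower() for w in words)


def build_jsgf(names: Iterable[str], grammar_name: str = "vedics") -> str:
    """Build a one-rule JSGF grammar whose alternatives are the visible names."""
    seen: List[str] = []
    for name in names:
        cleaned = clean_name(name)
        if cleaned and cleaned not in seen:
            seen.append(cleaned)
    # JSGF's special rule <VOID> can never be spoken; use it when nothing is visible.
    alternatives = " | ".join(seen) if seen else "<VOID>"
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <command> = {alternatives};\n"
    )
```

With the names from the slide, `build_jsgf(["File", "Edit", "Google"])` produces a rule of the form `public <command> = file | edit | google;`.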
• The updated grammar and dictionary files
  are used in the next recognition cycle.
[Diagram: the Speech-to-text Converter reads the Updated Grammar and the Updated Dictionary.]
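
Taken together, the steps above form a loop in which each executed command changes what can be recognized next. The simulation below is a rough sketch of that loop with a fake desktop: `SCREENS`, `BASE_COMMANDS`, and `recognition_cycle` are hypothetical names, and the screen contents are invented for illustration.

```python
from typing import Dict, List, Set

# Hypothetical fake desktop: the names that become visible after each command.
SCREENS: Dict[str, List[str]] = {
    "open firefox": ["File", "Edit", "Google"],
    "file": ["New Tab", "Quit"],
}
# Hypothetical commands that are always available.
BASE_COMMANDS: Set[str] = {"open firefox"}


def recognition_cycle(utterances: List[str]) -> List[Set[str]]:
    """Run the loop and return the active vocabulary after each utterance."""
    vocabulary = set(BASE_COMMANDS)
    history: List[Set[str]] = []
    for spoken in utterances:
        phrase = spoken.lower()
        if phrase in vocabulary:
            # "Execute" the command and collect the names now visible.
            visible = SCREENS.get(phrase, [])
            # Rebuild the vocabulary from the fixed commands plus those names.
            vocabulary = BASE_COMMANDS | {name.lower() for name in visible}
        # Phrases outside the vocabulary are rejected and leave it unchanged.
        history.append(set(vocabulary))
    return history
```

Saying "file" before Firefox is open is rejected in this sketch, because the word only enters the vocabulary once the "File" menu is visible.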
• VEDICS consists of the following parts:
  o   Sphinx 4 sub-system: an open-source tool used to convert
      speech to text.

  o   Desktop Control sub-system: translates the recognized text
      into the corresponding command and executes it on the desktop.
      It re-creates the grammar file based on what is displayed on
      the screen.

  o   Logios tool: used to generate a new dictionary based on
      what is displayed on the screen.
• The accuracy of VEDICS depends on the accuracy of Sphinx 4.
• Summary of the performance of Sphinx 4:
    Parameter                                   Performance
    ------------------------------------------  -----------
    Vocabulary size (words)                     79
    Word error rate (%)                         1.192
    RT ratio, single-CPU configuration*         0.25
    RT ratio, dual-CPU configuration*           0.20

    * RT ratio: ratio of utterance duration to the time taken to decode the utterance.
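
For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognized text into the reference text, divided by the number of reference words. The sketch below computes it with a standard word-level edit distance; the function name and the sample sentences are illustrative and are not the data behind the table above.

```python
from typing import List


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """WER in percent: (substitutions + deletions + insertions) / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 100.0 * prev[-1] / len(reference)
```

For instance, recognizing "open file menu" when the user said "open the file menu" is one deletion out of four reference words, a WER of 25%.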
• The context-aware nature of VEDICS increases accuracy.

• The small vocabulary further improves accuracy.

• Logios enables recognition of custom words.
  Words with any sequence of characters can be
  recognized.

• Almost all components on the desktop are accessible.
• VEDICS can be used to perform most actions that can
  be done using a pointing device.

• Using voice to access and control the desktop has many
  advantages. This feature can be a boon to
  differently-abled people.

• VEDICS can navigate through the file system, open
  applications, control desktop windows, and recognize
  almost any word.

• VEDICS is context aware. It determines what
  is currently being displayed on the desktop and
  dynamically generates the grammar and the dictionary.
• Dictation facility: the ability to dictate into a text editor or
  text field.

• Artificial intelligence in VEDICS.

• If several objects on the screen share a name, the user
  should be able to select the right one.

• The user should be able to either pronounce an entire
  word or spell out its individual characters.

• A facility to add custom commands to suit the user.

• A screen reader facility.
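
As one possible approach to the name-conflict item above, objects that share a name could be given numbered spoken labels so the user can pick among them. This is a speculative sketch of future work rather than an existing VEDICS feature; `spoken_labels` is a hypothetical helper, and the grammar would also need to cover the number words.

```python
from collections import Counter
from typing import List, Tuple


def spoken_labels(names: List[str]) -> List[Tuple[str, int]]:
    """Pair each on-screen object (by index) with a label the user can speak."""
    totals = Counter(name.lower() for name in names)
    seen: Counter = Counter()
    labels: List[Tuple[str, int]] = []
    for index, name in enumerate(names):
        key = name.lower()
        if totals[key] == 1:
            # Unique names are spoken as they are.
            labels.append((key, index))
        else:
            # Repeated names get a running number: "open 1", "open 2", ...
            seen[key] += 1
            labels.append((f"{key} {seen[key]}", index))
    return labels
```

With two "Open" buttons and one "Save" button on screen, the user would say "open 1" or "open 2" to choose between the duplicates.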
Project Link: http://vedics.sourceforge.net/
References:
• Willie Walker, Paul Lamere, Philip Kwok, Bhiksha Raj, Rita Singh,
  Evandro Gouvea, Peter Wolf, Joe Woelfel, “Sphinx-4: A Flexible
  Open Source Framework for Speech Recognition”, SML Technical
  Report, Sun Microsystems, SMLI TR-2004-139, Nov. 2004
• Kai-Fu Lee, Hsiao-Wuen Hon, Raj Reddy, “An Overview of the
  SPHINX Speech Recognition System”, IEEE Transactions on
  Acoustics Speech and Signal Processing, Vol 38, No. 1, Jan,
  1990.
• Frank Buschmann, Regine Meunier, Hans Rohnert, Peter
  Sommerlad, Michael Stal, “Pattern-Oriented Software Architecture
  – Vol 1: A System of Patterns”, Wiley Publications, 1996.
• Gnome Voice Control [Online]. Available:
  http://live.gnome.org/GnomeVoiceControl
• “Java Speech Grammar Format (JSGF)” [Online]. Available:
  http://java.sun.com/products/java-media/speech/forDevelopers/JSGF/
• “Logios Lexicon Tool” [Online]. Available:
  http://www.speech.cs.cmu.edu/tools/lextool.html
• “Gnome Accessibility API” [Online]. Available:
  http://library.gnome.org/devel/at-spi-cspi/
• “Libwnck: Window Navigator Construction Kit” [Online]. Available:
  http://library.gnome.org/devel/libwnck/
• “GConf Configuration System” [Online]. Available:
  http://library.gnome.org/devel/gconf/
Voice Enabled Desktop Interaction and Control System (VEDICS).

Voice Enabled Desktop Interaction and Control System (VEDICS).

  • 1. Desktop at Your Command
  • 2. • Team Members:  Nischal E Rao  Bharat Joshi  Suhas Kamath N  Sharath M Puranik • Project Guide: Prof. Shantharam Nayak • Carried out at: R.V. College of Engineering, Bangalore, India.
  • 3. • Voice Enabled Desktop Interaction and Control System(VEDICS) is a software solution for controlling the desktop system using voice based commands. • The system takes audio signals as input, processes it, recognizes it and executes the desired action on the desktop system.
  • 4. • All software products should incorporate accessibility features to enable differently-abled people to use the software easily and efficiently. • For persons with physical disabilities, the ability to simply talk to a computer could be a priceless asset. • Hands-free computing is more convenient than conventional I/O.
  • 5. • The user should be able to o access any element present on the user’s screen. o run common programs and applications. o navigate through the file system. o perform common window operations like minimize, maximize, close etc. • User commands should be easy to remember and use. • The user must be able to turn the system on and off whenever required.
  • 6. • VEDICS follows MVC design pattern. • Flexibility of using any speech-to-text converter for use with VEDICS. • VEDICS uses a feedback mechanism to learn what is being displayed on the desktop. • Increased accuracy since only relevant words are recognized.
  • 7. Recognized Text Desktop Speech-to-text Control Converter System Grammar and Names of visible elements Command Currently visible objects User’s Desktop
  • 8. • Speech-to-text conversion. [Diagram: Speech To Text Converter]
  • 9. • Grammar and Dictionary are used to convert sound signals into text. [Diagram: Speech To Text Converter, with Grammar and Dictionary as inputs]
  • 10. • The recognized text is given as input to the Desktop Control System. [Diagram: Speech To Text Converter passes “Open Firefox” to the Desktop Control System]
  • 11. • The Desktop Control System determines the command to execute on the desktop. [Diagram: Desktop Control System issues the Open_firefox command]
  • 12. • After successful execution, the names of objects visible on the screen are collected. [Diagram: Desktop Control System collects “File” | “Edit” | “Google”]
  • 13. • The collected names are used to update the grammar and the dictionary files. [Diagram: “File”, “Edit”, “Google” flow from the Desktop Control System into Grammar and Dictionary]
  • 14. • The updated grammar and dictionary files are used in the next recognition cycle. [Diagram: Speech To Text Converter with Updated Grammar and Updated Dictionary]
  • 15. • VEDICS consists of the following parts:
            o Sphinx 4 sub-system: open-source tool used to convert speech to text.
            o Desktop Control sub-system: maps the recognized text to the corresponding command and executes it on the desktop. It also re-creates the grammar file based on what is displayed on the screen.
            o Logios tool: used to generate a new dictionary based on what is displayed on the screen.
  • 16.
  • 17. • The accuracy of VEDICS depends on the accuracy of Sphinx 4.
        • Summary of Sphinx 4 performance:

          Parameter                                 Performance
          Vocabulary size                           79
          Word error rate (%)                       1.192
          RT ratio, single-CPU configuration*       0.25
          RT ratio, dual-CPU configuration*         0.20

          * RT ratio: ratio of the time taken to decode an utterance to the duration of the utterance (lower is faster).
  • 18. • Accuracy is increased by the context-aware nature of VEDICS.
        • The small vocabulary further improves accuracy.
        • Logios enables recognition of custom words, so words with any sequence of characters can be recognized.
        • Almost all components on the desktop are accessible.
  • 19. • VEDICS can be used to perform most actions that can be done using a pointing device.
        • Using voice to access and control the desktop has many advantages, and can be a boon to differently-abled people.
        • VEDICS can navigate through the file system, open applications, control desktop windows, and recognize almost any word.
        • VEDICS is context-aware: it determines what is currently being displayed on the desktop and dynamically generates the grammar and the dictionary.
  • 20. • Dictation facility: the ability to dictate into a text editor or text field.
        • Artificial Intelligence in VEDICS.
        • If several objects on the screen share the same name, the user should be able to select the right one.
        • The user should be able to either pronounce the entire word or spell out its individual characters.
        • Facility to add custom commands to suit the user.
        • Screen reader facility.
  • 21. Project Link: http://vedics.sourceforge.net/
        References:
        • Willie Walker, Paul Lamere, Philip Kwok, Bhiksha Raj, Rita Singh, Evandro Gouvea, Peter Wolf, Joe Woelfel, “Sphinx-4: A Flexible Open Source Framework for Speech Recognition”, SML Technical Report, Sun Microsystems, SMLI TR-2004-139, Nov. 2004.
        • Kai-Fu Lee, Hsiao-Wuen Hon, Raj Reddy, “An Overview of the SPHINX Speech Recognition System”, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 38, No. 1, Jan. 1990.
        • Frank Buschmann, Regine Meunier, Hans Rohnert, Peter Sommerlad, Michael Stal, “Pattern-Oriented Software Architecture, Vol. 1: A System of Patterns”, Wiley Publications, 1996.
  • 22. • Gnome Voice Control [Online]. Available: http://live.gnome.org/GnomeVoiceControl
        • “Java Speech Grammar Format (JSGF)” [Online]. Available: http://java.sun.com/products/java-media/speech/forDevelopers/JSGF/
        • “Logios Lexicon Tool” [Online]. Available: http://www.speech.cs.cmu.edu/tools/lextool.html
        • “Gnome Accessibility API” [Online]. Available: http://library.gnome.org/devel/at-spi-cspi/
        • “Libwnck: Window Navigator Construction Kit” [Online]. Available: http://library.gnome.org/devel/libwnck/
        • “GConf Configuration System” [Online]. Available: http://library.gnome.org/devel/gconf/
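
The feedback cycle on slides 12 to 14 (collect the names of visible objects, then regenerate the grammar and the dictionary word list) can be illustrated with a short sketch. This is not the VEDICS implementation. The function names, the fixed command list, and the grammar name `vedics` are assumptions made for illustration; only the JSGF header and rule syntax follow the JSGF format listed in the references. A real system would obtain the visible names from the desktop's accessibility API and pass the word list to a dictionary tool such as Logios.

```python
import re

# Hypothetical always-available commands; the slides do not list the real ones.
FIXED_COMMANDS = ["open firefox", "minimize", "maximize", "close"]


def to_tokens(name):
    """Normalize a visible element name into lowercase, space-separated words."""
    return " ".join(re.findall(r"[a-z0-9]+", name.lower()))


def phrases(visible_names, fixed_commands=FIXED_COMMANDS):
    """Fixed commands followed by visible names, without duplicates or empties."""
    result = []
    for phrase in list(fixed_commands) + [to_tokens(n) for n in visible_names]:
        if phrase and phrase not in result:
            result.append(phrase)
    return result


def build_grammar(visible_names, fixed_commands=FIXED_COMMANDS, grammar_name="vedics"):
    """Return JSGF text with one public rule listing every phrase as an alternative."""
    body = " | ".join(phrases(visible_names, fixed_commands))
    return (
        "#JSGF V1.0;\n\n"
        f"grammar {grammar_name};\n\n"
        f"public <command> = {body};\n"
    )


def vocabulary(visible_names, fixed_commands=FIXED_COMMANDS):
    """Sorted unique words that the dictionary must contain pronunciations for."""
    words = set()
    for phrase in phrases(visible_names, fixed_commands):
        words.update(phrase.split())
    return sorted(words)


if __name__ == "__main__":
    # Names taken from the example on slide 12.
    names = ["File", "Edit", "Google"]
    print(build_grammar(names))
    print(vocabulary(names))
```

Rebuilding the whole grammar on every cycle keeps the recognizer's search space limited to what is currently on screen, which is the property the slides credit for the improved accuracy.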