BRANA (ብራና): APPLICATION OF AMHARIC SPEECH RECOGNITION SYSTEM
FOR
DICTATION IN JUDICIAL DOMAIN
Presenter: Bantegize Addis
Adviser: Solomon Teferra
June 05, 2015
What I’m going to talk about…
1. Introduction
2. Related works
3. Speech database
4. Architecture
5. Implementation
6. Results
7. Conclusion and future work
1. Introduction
General Background
• Automatic speech recognition gives us a new channel for communicating with computers.
• Speech technology is the technology of today and tomorrow.
• It has practical applications for both entertainment and serious work.
• It is mostly applied in command and control, data entry and retrieval, and dictation.
Introduction (Cont.)
Statement of the Problem
• Dictation has long been a common application area for automatic speech recognition systems (ASRS).
• Amharic is the second most-spoken Semitic language in the world.
• It is the official working language of the Federal Democratic Republic of Ethiopia (FDRE).
• To the best of our knowledge, no prior work has attempted to apply speech recognition to dictation in Amharic.
2. Related works
• Automatic speech recognition for Amharic

Author | Type of recognizer | Unit of recognition | Recognition result
Solomon Berhanu | Isolated | Consonant-Vowel (CV) syllable | Speaker dependent: 87.68%; speaker independent: 72.75%; speaker not involved in the training: 49.21%
Kinfe Taddesse | Isolated | Phoneme; tri-phone; Consonant-Vowel (CV) syllable | Speaker independent tri-phone (Test set I): 91.46%; speaker independent tri-phone (Test set I): 77.87%
Related works (Cont.)

Author | Type of recognizer | Unit of recognition | Recognition result
Solomon Teferra | Continuous | Tri-phone; Consonant-Vowel (CV) syllable | Speaker independent tri-phone: 91.31%; speaker independent CV: 90.43%
Application of Amharic Speech Recognition
• Martha developed an Amharic speech input interface for command and control of Microsoft Word.
3. Speech Corpus & Annotation
• Two types of speech corpus:
  I. Spontaneous speech corpus
  II. Read speech corpus
4. The Architecture of BRANA
5. Implementation of BRANA
Development Tools
• JDK 1.7.0_05 with Eclipse
• Sphinx-4
• Sphinx Trainer
• SRILM
• cmuclmtk-0.7-win32
Implementation (Cont.)
User Interface Implementation
• The user interface of our system is responsible for:
  • editing RTF text documents
  • push-to-talk (PTT) functionality
  • displaying the hypothesis for the uttered word or sentence to the user
• Developed using Java
Implementation (Cont.)
[Figure: GUI snapshot of BRANA (ብራና), showing the push button that initiates recognition]
Implementation (Cont.)
Python program implementation
• Pronunciation dictionary: a Python script generates a grapheme-based canonical pronunciation dictionary.
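The slides do not show the script itself. As a rough illustration only, a grapheme-based dictionary can be produced by treating each Ethiopic syllable character of a word as one recognition unit and writing a single (canonical) entry per word. The function names, the "word unit unit ..." line layout, and the choice to use the characters themselves as unit names are all assumptions made for this sketch, not details taken from BRANA.

```python
import unicodedata


def graphemes(word):
    """Return the Ethiopic syllable characters of a word, in order.

    Punctuation, digits and non-Ethiopic characters are skipped
    (an assumption of this sketch).
    """
    return [ch for ch in word
            if unicodedata.name(ch, "").startswith("ETHIOPIC SYLLABLE")]


def build_dictionary(words):
    """Map each distinct word to one grapheme sequence (its canonical entry)."""
    entries = {}
    for word in words:
        units = graphemes(word)
        if units:  # ignore tokens with no Ethiopic syllables
            entries[word] = units
    return entries


def format_entries(entries):
    """Render entries as 'word unit1 unit2 ...' lines, sorted by word."""
    return [f"{word} {' '.join(units)}"
            for word, units in sorted(entries.items())]
```

Under these assumptions, `format_entries(build_dictionary(["ብራና"]))` yields one line in which the word is followed by its three characters separated by spaces. A real acoustic-model toolchain may require unit names in a restricted character set, in which case each character would need to be mapped to a label first.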
6. Our results
Recognition performance of the spontaneous speech recognizer (batch recognition mode)

Acoustic model | Language model | Accuracy (%) | WER (%)
AM-CD_8 | LM-ABS | 49.537 | 55.093
AM-CD_8 | LM-GT | 49.769 | 56.019
AM-CD_8 | LM-MKN | 50.463 | 54.861
AM-CD_8 | LM-WB | 49.769 | 55.093
Results (Cont.)
Recognition performance of the continuous read speech recognizer (batch recognition mode)

Acoustic model | Language model | Accuracy (%) | WER (%)
AM-CD_8 | LM-ABS | 84.867 | 16.134
AM-CD_8 | LM-GT | 77.227 | 24.310
AM-CD_8 | LM-MKN | 84.306 | 16.768
AM-CD_8 | LM-WB | 84.550 | 16.475
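In both result tables, Accuracy and WER add up to slightly more than 100 in every row (for example 49.537 + 55.093). One reading that would explain this, assumed here rather than stated on the slides, is that accuracy counts the reference words recognized correctly, while word error rate (WER) also charges inserted words. A minimal Python sketch of that scoring, with hypothetical function names:

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) of a minimum-edit
    alignment between two word lists."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # table[i][j] = (total_edits, subs, dels, ins) for ref[:i] vs hyp[:j]
    table = [[None] * cols for _ in range(rows)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # delete every reference word
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)  # insert every hypothesis word
    for i in range(1, rows):
        for j in range(1, cols):
            t, s, d, n = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, n)          # match, no cost
            else:
                diag = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = table[i - 1][j]
            delete = (t + 1, s, d + 1, n)
            t, s, d, n = table[i][j - 1]
            insert = (t + 1, s, d, n + 1)
            table[i][j] = min(diag, delete, insert)  # fewest total edits
    _, subs, dels, ins = table[-1][-1]
    return subs, dels, ins


def score(reference, hypothesis):
    """Return (accuracy %, WER %) for whitespace-tokenized transcripts."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    subs, dels, ins = align_counts(ref, hyp)
    accuracy = 100.0 * (len(ref) - subs - dels) / len(ref)
    wer = 100.0 * (subs + dels + ins) / len(ref)
    return accuracy, wer
```

With a four-word reference, one substituted word and one inserted word give an accuracy of 75% and a WER of 50%, so the two figures sum to more than 100 exactly as in the tables.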
7. CONCLUSION AND FUTURE WORK
Conclusion
• BRANA: an Amharic speech recognition application for dictation
• Main components: speech recognizer engine module and dictation application module
• Sub-modules of the speech recognizer: acoustic model module, language model module, and pronunciation dictionary module
• Front-end user interface
• 90-minute spontaneous speech corpus and 20-hour read speech corpus
• Hidden Markov model (HMM) approach
Conclusion (Cont.)
Future Work
• Automatic error correction
• Improve the performance of the speech recognizer
• Incorporate command and control
• Noise cancellation
• Generic dictation system
• Spontaneous speech corpus
Thank You!
Questions?

8th Ethiopian ICT Conference Bazaar and Exhibition.pptx
