CN 711 Speech Recognition

Course Instructor: Dr. M. Sabarimalai Manikandan
E-mail: msm.sabari@gmail.com
Course Objectives:
This course provides an introduction to the field of digital speech
processing and its applications. It offers a practical and theoretical
understanding of how human speech can be processed by computers, covering
speech analysis and synthesis, speech features, speech and speaker
recognition, and applications. The course includes practical work in which
students build a working text-to-speech system in their native language,
speech recognition systems, their own synthetic voice, and a complete
telephone spoken dialogue system.

Course Topics

A.   Review of Some Basic DSP Concepts

B.   Introduction to Speech Signals
•    Speech production mechanism
•    Types of Sounds, Vowels and Consonants
•    Loudness, Sound Pressure
•    Nature of speech signal, models of speech production
•    Silence, Voiced and Unvoiced Speech
•    Naturalness and Intelligibility
•    Speech data acquisition system
•    Why speech processing
•    Speech perception model

C.   Speech Analysis and Synthesis
•    Short-time Fourier Analysis, Spectrogram
•    Autocorrelation and cross-correlation
•    Human speech production model
•    Temporal and spectral characteristics
•    Linear prediction (LP) filter theory
•    All-pole Filter, Inverse Filtering
•    Formants and Pitch Determination
•    LP Residuals and Hilbert Transform
•    Vocal tract length normalization

D.   Speech Features for Recognition
•    Temporal and Short-Time Fourier Transform Features
•    Teager Energy Based Features, Entropy
•    Cepstral Coefficients
•    Linear Prediction-based Cepstral Coefficients (LPCC)
•    Mel Frequency Cepstral Coefficients (MFCCs)
•    AM-FM Features, Time-Frequency Analysis
•    Wavelet Octave Coefficients of Residues (WOCR)
•    Voice Activity Detection
•    Silence, Voiced, and Unvoiced Speech Classification
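
As an illustration of the temporal features and the silence/voiced/unvoiced classification listed under topic D, the following is a minimal Python sketch (the course itself names MATLAB and the Java Media Framework as its tools). It computes two common frame-level measures, short-time energy and zero-crossing rate, and applies a simple two-threshold rule. The function names, the threshold values, and the synthetic test signals are assumptions made for illustration only; they are not taken from the course material, and a practical classifier would tune or learn its decision rule.

```python
import math
import random


def frame_signal(x, frame_len, hop):
    """Split a sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def classify_frame(frame, energy_thresh=1e-3, zcr_thresh=0.25):
    """Label one frame; both thresholds are illustrative, assumed values."""
    if short_time_energy(frame) < energy_thresh:
        return "silence"
    # High energy with many sign changes suggests noise-like (unvoiced) speech.
    return "unvoiced" if zero_crossing_rate(frame) > zcr_thresh else "voiced"


# Synthetic stand-ins for real speech frames (30 ms at an assumed 8 kHz rate).
fs, n = 8000, 240
rng = random.Random(0)
silence = [0.0] * n
voiced_like = [0.5 * math.sin(2 * math.pi * 100 * k / fs) for k in range(n)]
unvoiced_like = [rng.uniform(-0.1, 0.1) for _ in range(n)]

for name, frame in [("silence", silence), ("tone", voiced_like), ("noise", unvoiced_like)]:
    print(name, classify_frame(frame))
```

The low-frequency tone has high energy and few zero crossings, while the noise has moderate energy and crosses zero on roughly half of its sample pairs, which is the contrast this rule relies on.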

E.   Speech Enhancement, Coding and Quality Assessment
•    Acoustic echo cancellation
•    Reverberant speech enhancement
•    Removal of Different Types of noise and artifacts
•    Speech Coding
•    Subjective and Objective Metrics

F.   Speaker Recognition
•    Basic ASR System
•    Closed-set and Open-set ASR System
•    Speaker Identification and Verification
•    Text-Independent and Text-Dependent Recognition
•    Mean Normalization, Feature Smoothing
•    Dynamic Time Warping (DTW), Vector Quantization
•    Gaussian Mixture Models (GMMs) and Universal Background Model (UBM)
•    Log-Likelihood Ratio (LLR)
•    False Acceptance Probability, False Rejection Probability
•    Detection Error Trade-off (DET) curve
•    Equal Error Rate (EER)
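
The verification metrics at the end of topic F (false acceptance, false rejection, equal error rate) can be illustrated with a short Python sketch. It assumes each trial yields a single score, such as a log-likelihood ratio, and that a trial is accepted when its score is at or above a threshold. The score values and function names below are made up for illustration; the EER is approximated by scanning the observed scores as candidate thresholds, which is one simple convention among several.

```python
def far_frr(genuine, impostor, threshold):
    """Return (false acceptance rate, false rejection rate) at a threshold.

    A trial is accepted when its score is >= threshold.
    """
    far = sum(1 for s in impostor if s >= threshold) / len(impostor)
    frr = sum(1 for s in genuine if s < threshold) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold.

    Returns (eer, threshold) at the point where |FAR - FRR| is smallest;
    the EER is reported as the mean of FAR and FRR at that point.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores: target-speaker trials and impostor trials.
genuine_scores = [2.1, 1.7, 0.9, 1.4, 0.2]
impostor_scores = [-1.0, 0.1, 0.5, -0.3, 1.0]

eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

Plotting the (FAR, FRR) pairs over all thresholds on normal-deviate axes gives the DET curve; the EER is the point on that curve where the two error rates coincide.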
G.   Speech Recognition
•    Signal Processing, Template Matching
•    Phoneme Recognition
•    HMMs, Acoustic Modeling, Language Modeling
•    Continuous and Emotional Speech Recognition
•    Performance Evaluation

H.   Speech Preprocessing Applications
•    Voice Conversion, Text-to-Speech Synthesis
•    Spoken Dialogue System
•    Interactive Voice Response (IVR) System
•    Identify Your ID
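
Template matching (topic G) is often carried out with dynamic time warping (listed under topic F), which aligns two sequences that differ in speaking rate. The Python sketch below is a minimal version under simplifying assumptions: each frame is a single number rather than a feature vector such as MFCCs, the local cost is the absolute difference, and no path constraints or length normalization are applied. The template values and labels are hypothetical.

```python
def dtw_distance(a, b):
    """Accumulated DTW cost between two 1-D sequences with |x - y| local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Allowed moves: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(query, templates):
    """Return the label of the stored template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical one-dimensional "feature tracks" for two reference words.
templates = {
    "yes": [1, 2, 3, 2, 1],
    "no": [3, 3, 1, 1],
}
query = [1, 1, 2, 3, 3, 2, 1]  # a time-stretched version of the "yes" template

print(recognize(query, templates))
```

Because the query only repeats frames of the "yes" template, the warping path absorbs the stretch and the accumulated cost to that template is zero.
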
Textbooks and Materials
[1].    Li Tan, Digital Signal Processing: Fundamentals and Applications, Elsevier, 2008.
[2].    N. S. Jayant and P. Noll, Digital Coding of Waveforms: Principles and Applications to Speech and Video,
        Prentice Hall, Englewood Cliffs, NJ, 1984. ISBN 0132119137.
[3].    L. R. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice Hall, Englewood Cliffs, NJ,
        1993. ISBN 0130151572.
[4].    L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Prentice Hall, 1978.
[5].    J. L. Flanagan, Speech Analysis, Synthesis and Perception, 2nd edition, Springer-Verlag, 1972.
[6].    F. Jelinek, Statistical Methods for Speech Recognition, MIT Press, 1997.
[7].    D. Jurafsky and J. H. Martin, Speech and Language Processing: An Introduction to Natural Language
        Processing, Computational Linguistics, and Speech Recognition, Prentice Hall, 2000.
[8].    T. F. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice, Prentice Hall, 2001.
[9].    J. R. Deller, J. H. L. Hansen, and J. G. Proakis, Discrete-Time Processing of Speech Signals, 2nd edition,
        IEEE Press, 2000.
[10].   T. W. Parsons, Voice and Speech Processing, McGraw-Hill, 1987.
[11].   X. Huang, A. Acero, H. Hon, and R. Reddy, Spoken Language Processing: A Guide to Theory, Algorithm and
        System Development, Prentice Hall, 2001.
[12].   Instructor's Notes

Programming Languages: MATLAB and Java Media Framework


Important Standard Journals in the Field of Audio and Speech Processing
• IEEE Transactions on Audio, Speech and Language Processing
• IEEE Transactions on Signal Processing
• IEEE Signal Processing Magazine
• IEEE Transactions on Information Forensics and Security
• ACM Transactions on Speech and Language Processing
• IEEE Multimedia
• Speech Communication (by Elsevier)
• IEEE Signal Processing Letters
• Signal Processing (by Elsevier)
• Digital Signal Processing (by Elsevier)
• International Journal of Speech Technology (by Springer)
• Signal, Image and Video Processing (by Springer)
• Computer Speech and Language
• EURASIP Journal on Audio, Speech, and Music Processing
• Journal of the Acoustical Society of America (JASA)
• Audio Engineering Society

Important Conferences in the Field of Audio and Speech Processing
• IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP)
• Eurospeech
• Int. Conf. on Spoken Language Processing (ICSLP)
• Acoustical Society of America
