WOLAITA SODO UNIVERSITY
SCHOOL OF INFORMATICS
DEPARTMENT OF INFORMATION TECHNOLOGY
IT MSc Regular
Course:- IMS
Article review on speech segmentation.
By:-
Abebe Tora Pgr/82835/15
Submitted To: - Dr. Siraj
Sub. Date: - Jan 12/2014
Contents
1. Title and authors
2. Introduction
3. Description
4. Comparison
5. Future work
Title and authors
• Automatic Speech Segmentation. Alaa Ehab Sakran et al. (April 2017).
• Amharic Speech Search Using Text Word Query Based on Automatic Sentence-like Segmentation. Getnet Mezgebu et al. (Nov. 2022).
• Automatic Speech Segmentation for Amharic Phonemes Using Hidden Markov Model Toolkit (HTK). Eshete Derb Emiru [1], Walelign Tewabe Sewunetie [2] (Aug 2016).
• Phoneme level automatic speech segmentation for Amharic language using HMM approach. Dr. Sebsbie Hailmariam.
Introduction
• For more than thirty years, researchers have studied automatic speech segmentation in an effort to divide speech signals into smaller pieces for use in speech synthesis and recognition, among other applications.
• Speech segmentation is the process of identifying the boundaries between words, syllables, or phonemes in spoken natural languages.
• In this article, I present an analysis of four research papers that examine various strategies and developments in automatic speech segmentation.
• Each article uses its own methods and techniques.
• In this review, I aim to provide insights into the
advancements and challenges in automatic speech
segmentation techniques.
• By examining these four articles, I will gain a better
understanding of the various approaches,
methodologies, and applications in this field.
• More detailed descriptions are given below in table format, followed by comparisons and future work.
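To make the idea of "identifying boundaries" concrete, here is a minimal sketch of one of the simplest techniques named later in this review, short-term energy. It is only an illustration: the toy signal, the 8 kHz sampling rate, the 10 ms frame length, and the energy threshold are all my own assumptions and are not taken from any of the four articles.

```python
import math

def short_term_energy(samples, frame_len, hop):
    """Return the mean squared amplitude of each frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def energy_boundaries(energies, threshold):
    """Return the frame indices where the energy crosses the threshold."""
    boundaries = []
    for i in range(1, len(energies)):
        was_speech = energies[i - 1] >= threshold
        is_speech = energies[i] >= threshold
        if was_speech != is_speech:
            boundaries.append(i)
    return boundaries

# Toy signal (assumed): 100 ms silence, 200 ms of a 200 Hz tone, 100 ms silence.
rate = 8000  # samples per second (assumed)
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 200 * n / rate) for n in range(1600)]
signal = silence + tone + silence

frame_len = 80  # 10 ms frames at 8 kHz, no overlap (assumed)
energies = short_term_energy(signal, frame_len=frame_len, hop=frame_len)
frames = energy_boundaries(energies, threshold=0.01)

# Convert boundary frame indices to milliseconds.
times_ms = [f * frame_len * 1000 / rate for f in frames]
print(frames, times_ms)
```

On this toy signal the two detected boundaries fall where the tone starts and stops. Real speech would need a threshold tuned to the recording conditions, which is one reason the reviewed articles turn to statistical models such as HMMs.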
Description
1. Authors: Alaa Ehab Sakran et al. (April 2017)
Title: Automatic Speech Segmentation.
Methods: Wavelet, fuzzy methods (based on IoT devices), Artificial Neural Networks, and Hidden Markov Models.
Findings: Applications include speech synthesis, training for speech recognizers, and prosodic database creation. The authors highlight the advantages of automatic segmentation over manual segmentation, such as consistency and time efficiency.
Limitations:
• Lack of up-to-date information.
• Incomplete information.
• Does not explicitly mention the specific evaluation metrics.
• Does not explicitly mention future work.

2. Authors: Getnet Mezgebu Brhanemeskel et al. (2022)
Title: Amharic Speech Search Using Text Word Query Based on Automatic Sentence-like Segmentation.
Methods: Manual segmentation is used as the baseline for the word error rate (WER) of the automatic segmentation approach; Artificial Neural Network.
Findings: The sentence-like automatic segmentation resulted in a WER close to the WER achieved on manually segmented test speech. Two speech corpora were used, one from the broadcast news domain and one from the spiritual domain.
Limitations:
• A limited training dataset.
• Lack of detailed information on the dataset and validation process.

3. Authors: Eshete Derb Emiru [1], Walelign Tewabe Sewunetie [2] (Aug 2016)
Title: Automatic Speech Segmentation for Amharic Phonemes Using Hidden Markov Model Toolkit (HTK).
Methods: Unsupervised method for automatic speech segmentation using the Hidden Markov Model (HMM) Toolkit (HTK). Techniques include context-independent, context-dependent with single Gaussian mixture, and context-dependent with multiple Gaussian mixtures.
Findings: In a context-dependent setting with two Gaussian mixtures, the phoneme-based technique produced the best results in terms of the lowest percentage of time boundary deviations. The suggested approach effectively divided Amharic speech into phonemes for use in several speech research fields.
Limitations:
• The article does not explicitly discuss the limitations of the proposed method.
• Does not address the performance of the method on different speakers or in noisy environments.
• The speech corpus was recorded by a single female speaker.

4. Authors: Dr. Sebsbie Hailmariam
Title: Phoneme level automatic speech segmentation for Amharic language using HMM approach.
Methods: Hidden Markov Model (HMM) approach for modeling the Amharic phonemes. Techniques used are context-independent, context-dependent with single Gaussian mixture, and context-dependent with multiple Gaussian mixtures.
Findings: The proposed method effectively segments continuous speech into phonemes in the Amharic language.
Limitations:
• The performance of the system in capturing variations in speech due to different speakers, accents, and other factors is not examined.
• The study focuses on the Amharic language only.
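Articles 3 and 4 both obtain phoneme boundaries by aligning speech frames to HMM states. The sketch below shows that general idea with a Viterbi alignment over a left-to-right chain of states. It is not the HTK procedure used in the articles: I assume one state per phoneme, a single one-dimensional Gaussian per state, equal stay and advance probabilities, a 10 ms frame shift, and made-up feature values, where the real systems use multi-dimensional acoustic features and trained models.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def align(frames, means, variances,
          log_stay=math.log(0.5), log_next=math.log(0.5)):
    """Viterbi alignment of frames to a left-to-right state sequence.

    Returns the index of the first frame assigned to each state.
    Assumes there are at least as many frames as states.
    """
    n_states, n_frames = len(means), len(frames)
    neg_inf = float("-inf")
    score = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    # The path must start in the first state.
    score[0][0] = log_gauss(frames[0], means[0], variances[0])
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = score[t - 1][s] + log_stay
            move = score[t - 1][s - 1] + log_next if s > 0 else neg_inf
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            if best > neg_inf:
                score[t][s] = best + log_gauss(frames[t], means[s], variances[s])
    # Trace back from the last state at the last frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return [path.index(s) for s in range(n_states)]

# Hypothetical one-dimensional features for three phonemes in sequence.
frames = [0.1, -0.2, 0.0, 4.9, 5.2, 5.0, 5.1, 9.8, 10.1]
starts = align(frames, means=[0.0, 5.0, 10.0], variances=[1.0, 1.0, 1.0])

# With an assumed 10 ms frame shift, internal boundaries in milliseconds:
boundaries_ms = [s * 10 for s in starts[1:]]
print(starts, boundaries_ms)
```

The context-dependent and multiple-mixture variants compared in the articles change how each state scores a frame, but the alignment step that produces the boundaries follows the same pattern.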
Comparison of articles
1. Strengths: Mentions various approaches and methods used in speech segmentation (Wavelet Method, Artificial Neural Networks, Blocking Black Area Method, Short Term Energy, Hybrid Speech Segmentation Algorithm, Word Chopper Technique, and Hidden Markov Model).
Contributions: Covers the basics of speech segmentation, discusses state-of-the-art solutions, explores different segmentation units, examines evaluation methods, and highlights the challenges and trends in automatic speech segmentation.
Evaluation metrics: Does not explicitly mention the specific evaluation metrics.

2. Strengths: Focuses on the issue of speech search using text word queries for the Amharic language, which can have practical applications. Introduces the concept of automatic sentence-like segmentation, which may enhance the accuracy of the speech search system.
Contributions: The proposed approach aims to enable efficient and accurate searching of Amharic speech by automatically segmenting the speech into meaningful units and aligning them with text queries.
Evaluation metrics: Word Error Rate (WER) as a measure of performance.

3. Strengths:
• Novelty: introduces an unsupervised method for automatic speech segmentation.
• Methodology: describes the use of the Hidden Markov Model (HMM) toolkit (HTK) for modeling Amharic phonemes.
• Data preparation: collection and preparation of both the text and speech corpora used in the experiments.
• Evaluation: evaluates the performance of the segmentation system by comparing it to manual segmentation results.
Contributions: Contributes to the field of speech segmentation by proposing an automated approach specifically designed for the Amharic language and demonstrating its effectiveness through experimental evaluation.
Evaluation metrics: Percentage of boundary deviations at tolerance values of 5 ms, 10 ms, 15 ms, and 20 ms, compared to manual segmentation results. This measures the accuracy of the system.

4. Strengths: Focuses on automatic speech segmentation for the Amharic language, which is a valuable contribution to the field. Utilizes the Hidden Markov Model (HMM) approach, which is a commonly used and effective method for speech segmentation.
Contributions: Proposes an HMM approach for automatically segmenting Amharic speech at the phoneme level. The proposed method aims to improve speech processing systems, such as speech recognition and synthesis.
Evaluation metrics:
• Percentage of boundary deviations.
• Recognition accuracy.
• Boundary alignment: measures the consistency and precision of the system.
• Time efficiency.
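The two metrics that recur in this comparison, word error rate and boundary deviation at fixed tolerances, can be sketched as follows. This is my own illustration under assumptions: WER is computed with the standard word-level edit distance, and the boundary metric is read as the percentage of automatic boundaries that fall within each tolerance of the manual boundary. The example sentences and boundary times are invented and do not come from the articles.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j] is the edit distance between ref[:i] and hyp[:j].
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitute = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            delete = dist[i - 1][j] + 1
            insert = dist[i][j - 1] + 1
            dist[i][j] = min(substitute, delete, insert)
    return dist[len(ref)][len(hyp)] / len(ref)

def within_tolerance(manual_ms, automatic_ms, tolerances_ms=(5, 10, 15, 20)):
    """Percentage of automatic boundaries within each tolerance of the manual ones."""
    deviations = [abs(a - m) for m, a in zip(manual_ms, automatic_ms)]
    return {
        tol: 100.0 * sum(d <= tol for d in deviations) / len(deviations)
        for tol in tolerances_ms
    }

# Invented example: one substitution and one deletion over six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# Invented boundary times in milliseconds for the same four boundaries.
manual = [100, 250, 400, 620]
automatic = [103, 258, 412, 640]
percentages = within_tolerance(manual, automatic)
print(round(wer, 3), percentages)
```

A lower WER and a higher percentage at the tighter tolerances both indicate that the automatic segmentation is closer to the manual reference.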
Future works
1. The article does not explicitly mention future work.
2. First, more advanced algorithms need to be investigated and developed to handle the challenges posed by noisy and non-standard speech data. The authors suggest exploring novel features and techniques for improved segmentation accuracy and efficiency.
3. The authors propose a method that combines automatic speech
recognition with automatic sentence-like segmentation and provide
experimental results to support their findings.
4. The study has potential limitations related to the size of the text
corpus and the speaker characteristics. Further research can
address these limitations and explore the generalization and
robustness of the proposed method in diverse settings.
• Based on this review, I will try to do automatic speech segmentation for the Wolaita language.