speech segmentation based on four articles in one.

WOLAITA SODO UNIVERSITY
SCHOOL OF INFORMATICS
DEPARTMENT OF INFORMATION TECHNOLOGY
IT MSc Regular
Course:- IMS
Article review on speech segmentation.
By:-
Abebe Tora Pgr/82835/15
Submitted To: Dr. Siraj
Sub. Date: - Jan 12/2014

Contents
1. Title and authors
2. Introduction
3. Description
4. Comparison
5. Future work

Title and authors
• Automatic Speech Segmentation. Alaa Ehab Sakran [1] et al.
(April 2017).
• Amharic Speech Search Using Text Word Query Based on
Automatic Sentence-like Segmentation. Getnet Mezgebu et al.
(Nov. 2022).
• Automatic Speech Segmentation for Amharic Phonemes Using
Hidden Markov Model Toolkit (HTK). Eshete Derb Emiru [1],
Walelign Tewabe Sewunetie [2] (Aug 2016).
• Phoneme level automatic speech segmentation for Amharic
language using HMM approach. Dr. Sebsbie Hailmariam.

Introduction
• For more than thirty years, researchers have studied
automatic speech segmentation in an effort to divide speech
signals into smaller units for use in speech synthesis and
recognition, among other applications.
• Speech segmentation is the process of identifying the boundaries
between words, syllables, or phonemes in spoken natural languages.
• In this review, I analyze four research papers that examine
various strategies and developments in automatic speech
segmentation.
• Each article uses its own methods and techniques.

• In this review, I aim to provide insights into the
advancements and challenges in automatic speech
segmentation techniques.
• By examining these four articles, I aim to gain a better
understanding of the various approaches,
methodologies, and applications in this field.
• The articles are described in detail below in table
format, followed by a comparison and a list of future
work.

Description
No. Authors Titles Methods Findings Limitations
1. Authors: Alaa Ehab Sakran et al. (April 2017)
   Title: Automatic Speech Segmentation.
   Methods: Wavelet, fuzzy methods (based on IoT devices), Artificial Neural Networks, and Hidden Markov Models.
   Findings: Covers applications in speech synthesis, training for speech recognizers, and prosodic database creation. The authors highlight the advantages of automatic segmentation over manual segmentation, such as consistency and time efficiency.
   Limitations:
   • Lack of up-to-date information.
   • Incomplete information.
   • Does not explicitly mention the specific evaluation metrics.
   • Does not explicitly mention future work.
2. Authors: Getnet Mezgebu Brhanemeskel et al. (2022)
   Title: Amharic Speech Search Using Text Word Query Based on Automatic Sentence-like Segmentation
   Methods: Artificial Neural Network. Manual segmentation served as the baseline against which the word error rate (WER) of the automatic segmentation approach was compared.
   Findings: The sentence-like automatic segmentation resulted in a WER close to the WER achieved on manually segmented test speech. Two speech corpora were used, one from the broadcast news domain and one from the spiritual domain.
   Limitations:
   • A limited training dataset.
   • Lack of detailed information on the dataset and validation process.

3. Authors: Eshete Derb Emiru [1], Walelign Tewabe Sewunetie [2] (Aug 2016)
   Title: Automatic Speech Segmentation for Amharic Phonemes Using Hidden Markov Model Toolkit (HTK)
   Methods: Unsupervised method for automatic speech segmentation using the Hidden Markov Model (HMM) Toolkit (HTK). The techniques are context-independent models, context-dependent models with a single Gaussian mixture, and context-dependent models with multiple Gaussian mixtures.
   Findings: In a context-dependent setting with two Gaussian mixtures, the phoneme-based technique produced the best results in terms of the lowest percentage of time boundary deviations. The suggested approach effectively divided Amharic speech into phonemes for use in several speech research fields.
   Limitations:
   • The article does not explicitly discuss the limitations of the proposed method.
   • Does not address the performance of the method on different speakers or in noisy environments.
   • The speech corpus was recorded by a single female speaker.

4. Authors: Dr. Sebsbie Hailmariam
   Title: Phoneme level automatic speech segmentation for Amharic language using HMM approach.
   Methods: Hidden Markov Model (HMM) approach for modeling the Amharic phonemes. The techniques are context-independent models, context-dependent models with a single Gaussian mixture, and context-dependent models with multiple Gaussian mixtures.
   Findings: The proposed method effectively segments continuous speech into phonemes in the Amharic language.
   Limitations:
   • The system's performance under variations in speech due to different speakers, accents, and other factors is not addressed.
   • The study focuses on the Amharic language only.
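
Article 2 above reports its results as a word error rate (WER). As a reading aid, here is a minimal sketch of how WER is commonly computed: the word-level edit distance between a reference transcript and a recognizer's output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; the reviewed article does not describe its own scoring code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over four words.
print(word_error_rate("the cat sat down", "the cat sit"))  # 0.5
```

A lower WER on automatically segmented speech that approaches the WER on manually segmented speech is the kind of result article 2 reports.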

Comparison of articles
No. Strengths Contributions Evaluation metrics
1. Strengths: Mentions various approaches and methods used in speech segmentation (Wavelet Method, Artificial Neural Networks, Blocking Black Area Method, Short Term Energy, Hybrid Speech Segmentation Algorithm, Word Chopper Technique, and Hidden Markov Model).
   Contributions: Covers the basics of speech segmentation, discusses state-of-the-art solutions, explores different segmentation units, examines evaluation methods, and highlights the challenges and trends in automatic speech segmentation.
   Evaluation metrics: Does not explicitly mention the specific evaluation metrics.
2. Strengths: Focuses on the issue of speech search using text word queries for the Amharic language, which can have practical applications. Introduces the concept of automatic sentence-like segmentation, which may enhance the accuracy of the speech search system.
   Contributions: The proposed approach aims to enable efficient and accurate searching of Amharic speech by automatically segmenting the speech into meaningful units and aligning them with text queries.
   Evaluation metrics: Word Error Rate (WER) as a measure of performance.

3. Strengths:
   • Novelty: introduces an unsupervised method for automatic speech segmentation.
   • Methodology: describes the use of the Hidden Markov Model (HMM) Toolkit (HTK) for modeling Amharic phonemes.
   • Data preparation: describes the collection and preparation of both the text and speech corpora used in the experiments.
   • Evaluation: evaluates the performance of the segmentation system by comparing it to manual segmentation results.
   Contributions: Contributes to the field of speech segmentation by proposing an automated approach specifically designed for the Amharic language and demonstrating its effectiveness through experimental evaluation.
   Evaluation metrics: Percentage of boundary deviations at tolerance values of 5 ms, 10 ms, 15 ms, and 20 ms, compared to manual segmentation results. This measures the accuracy of the system.
4. Strengths: Focuses on automatic speech segmentation for the Amharic language, which is a valuable contribution to the field. Utilizes the Hidden Markov Model (HMM) approach, which is a commonly used and effective method for speech segmentation.
   Contributions: Proposes an HMM approach for automatically segmenting Amharic speech at the phoneme level. The proposed method aims to improve speech processing systems, such as speech recognition and synthesis.
   Evaluation metrics:
   • Percentage of boundary deviations.
   • Recognition accuracy.
   • Boundary alignment: measures the consistency and precision of the system.
   • Time efficiency.
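
Articles 3 and 4 above evaluate segmentation by boundary deviations against manual segmentation at tolerance values of 5 ms, 10 ms, 15 ms, and 20 ms. The sketch below shows one common way to compute such a figure, under two assumptions that are mine and not stated in the articles: automatic and manual boundaries are paired one-to-one (same phoneme sequence), and the score is the percentage of automatic boundaries that fall within each tolerance. The function name and the sample boundary times are hypothetical.

```python
def within_tolerance_rates(manual_ms, automatic_ms, tolerances_ms=(5, 10, 15, 20)):
    """Percentage of automatic boundaries within each tolerance of the manual ones.

    Assumes both lists hold boundary times in milliseconds for the same
    phoneme sequence, so the i-th entries refer to the same boundary.
    """
    if len(manual_ms) != len(automatic_ms) or not manual_ms:
        raise ValueError("boundary lists must be non-empty and of equal length")

    # Absolute time difference for each paired boundary.
    deviations = [abs(a - m) for m, a in zip(manual_ms, automatic_ms)]
    return {
        tol: 100.0 * sum(d <= tol for d in deviations) / len(deviations)
        for tol in tolerances_ms
    }


# Hypothetical boundary times (ms); deviations are 3, 8, 18, and 0 ms.
manual = [100, 250, 400, 620]
automatic = [103, 258, 418, 620]
print(within_tolerance_rates(manual, automatic))
# {5: 50.0, 10: 75.0, 15: 75.0, 20: 100.0}
```

Reporting several tolerances rather than one shows how quickly agreement with manual segmentation improves as the allowed deviation grows.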

Future work
1. The article does not explicitly mention future work.
2. First, more advanced algorithms need to be investigated and
developed to handle the challenges posed by noisy and non-standard
speech data. The authors suggest exploring novel features and
techniques for improved segmentation accuracy and efficiency.
3. The authors propose a method that combines automatic speech
recognition with automatic sentence-like segmentation and provide
experimental results to support their findings.
4. The study has potential limitations related to the size of the text
corpus and the speaker characteristics. Further research can
address these limitations and explore the generalization and
robustness of the proposed method in diverse settings.
• Based on this review, I will try to develop automatic speech
segmentation for the Wolaita language.
