1. Presented by-
ASHISH MAURYA
(2015 VLSI-13)
AN EFFICIENT FIXED CODEBOOK SEARCH
METHOD FOR G.729 SPEECH CODEC
ABV-Indian Institute of Information Technology and Management Gwalior,
Morena Link Road, Gwalior, Madhya Pradesh, INDIA - 474015.
May 13, 2016
2. I. Speech coding
i. Block Diagram and Description
II. Speech coder
III. G.729???
IV. Algebraic Code Excited Linear Prediction (ACELP)
i. Block Diagram
ii. Fixed Codebook Structure
iii. Process Flowgraph
V. Proposed Search Method
i. IFPR Search Method
ii. RCM Search Method
iii. Combined Search Method
- Process Flowgraph
VI. Conclusion & Scope
VII. Reference Paper
CONTENTS
3. Speech coding is a procedure for representing a digitized speech
signal using as few bits as possible while maintaining a
reasonable level of speech quality.
Due to the increasing demand for speech communication, speech
coding technology has received growing interest from
the research, standardization, and business communities.
SPEECH CODING
5. Speech source: A continuous-time analog speech signal source.
Filter: Band-limits the signal, passing the required frequency band and
rejecting unwanted components.
Sampler: Converts the continuous-time signal to a discrete-time signal.
Sampling is done at 8 kHz, which satisfies the Nyquist criterion for speech
band-limited to below 4 kHz.
ADC: Quantizes the discrete-time samples to produce a digital signal.
Source encoder: Encodes the digitized signal to reduce the bit rate.
BLOCK DESCRIPTION
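As a rough illustration of the sampler and ADC stages, the sketch below samples a test tone at 8 kHz and quantizes each sample uniformly to 8 bits. The 1 kHz tone, the 8-bit uniform quantizer, and all function names are assumptions made for this example; telephone-band systems typically use companded rather than uniform quantization.

```python
import math

SAMPLE_RATE_HZ = 8000   # sampling rate stated on the slide
BITS_PER_SAMPLE = 8     # assumption: 8 bits per sample


def sample_tone(freq_hz, n_samples, rate_hz=SAMPLE_RATE_HZ):
    """Sampler: evaluate a continuous-time sine at t = n / rate_hz."""
    return [math.sin(2.0 * math.pi * freq_hz * n / rate_hz)
            for n in range(n_samples)]


def quantize(x, bits=BITS_PER_SAMPLE):
    """ADC: map a value in [-1, 1] to a signed integer code (uniform steps)."""
    max_code = 2 ** (bits - 1) - 1
    clipped = max(-1.0, min(1.0, x))
    return round(clipped * max_code)


samples = sample_tone(1000.0, 80)             # 10 ms of an assumed 1 kHz tone
codes = [quantize(s) for s in samples]
bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE   # bits per second leaving the ADC
nyquist_limit_hz = SAMPLE_RATE_HZ / 2         # highest alias-free input frequency
```

Under these assumptions the ADC output rate is the product of the sampling rate and the word length, which is the rate the source encoder then has to reduce.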
6. Channel encoder: Provides error protection to the bit-stream
before transmission to the communication channel.
Channel decoder: Processes the error-protected data to
recover the encoded data.
Source decoder: Generates the digital speech signal having
the original bit rate.
DAC: Converts digital speech signal to continuous-time
analog signal.
BLOCK DESCRIPTION (CONTD..)
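To make the channel encoder and decoder roles concrete, here is a minimal sketch using a 3-fold repetition code with majority voting. The repetition code is chosen only because it is the simplest error-protection scheme; it is an assumption for illustration and not the channel code of any particular system.

```python
def channel_encode(bits, repeat=3):
    """Add redundancy: send every bit `repeat` times."""
    return [b for b in bits for _ in range(repeat)]


def channel_decode(coded, repeat=3):
    """Recover each bit by majority vote over its `repeat` copies."""
    decoded = []
    for i in range(0, len(coded), repeat):
        block = coded[i:i + repeat]
        decoded.append(1 if 2 * sum(block) > len(block) else 0)
    return decoded


payload = [1, 0, 1, 1]            # bits from the source encoder (made-up data)
sent = channel_encode(payload)
received = sent[:]
received[4] ^= 1                  # the channel flips one bit
recovered = channel_decode(received)
```

A single flipped bit per group of three is corrected, at the cost of tripling the transmitted bit rate.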
7. The encoder/decoder structure shown in the figure is known as a speech
coder.
The input speech is encoded to produce a low-rate bit-stream.
This bit-stream is input to the decoder, which constructs an approximation of
the original signal.
SPEECH CODER
8. Encoding
Derive the filter coefficients from the speech frame.
Derive the scale factor from the speech frame.
Transmit filter coefficients and scale factor to the decoder.
Decoding
Generate white noise sequence.
Multiply the white noise samples by the scale factor.
Construct the filter using the coefficients from the encoder and
filter the scaled white noise sequence.
Output speech is the output of the filter.
SPEECH CODER (CONTD..)
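The encoding and decoding steps above can be sketched as a toy linear-prediction coder. This is a minimal illustration, assuming the filter coefficients come from the autocorrelation method with the Levinson-Durbin recursion, a predictor order of 10, and a scale factor equal to the RMS prediction error; the function names are hypothetical and no quantization of the transmitted parameters is modelled.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """r[lag] = sum over n of frame[n] * frame[n - lag]."""
    return [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return predictor coefficients a (x[n] ~ sum a[i] * x[n-1-i])
    and the remaining prediction-error energy."""
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:          # silent or perfectly predictable frame
            break
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        prev = a[:]
        a[i] = k
        for j in range(i):
            a[j] = prev[j] - k * prev[i - 1 - j]
        err *= 1.0 - k * k
    return a, err


def encode(frame, order=10):
    """Encoder: derive filter coefficients and scale factor from one frame."""
    coeffs, err = levinson_durbin(autocorrelation(frame, order), order)
    scale = math.sqrt(max(err, 0.0) / len(frame))
    return coeffs, scale


def decode(coeffs, scale, n_samples, rng=random):
    """Decoder: filter scaled white noise through the all-pole filter."""
    out = []
    for _ in range(n_samples):
        excitation = scale * rng.gauss(0.0, 1.0)
        past = sum(c * out[-1 - i] for i, c in enumerate(coeffs) if i < len(out))
        out.append(excitation + past)
    return out
```

Only the coefficients and one scale factor per frame cross the channel, which is why the bit rate falls; the decoder output matches the spectral envelope of the input frame rather than its waveform.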
9. G.729 is a speech coding standard (an ITU-T recommendation).
Its most important use is in VoIP.
It compresses the speech signal from 64 kbit/s to 8 kbit/s.
It uses the CS-ACELP algorithm.
CS-ACELP stands for conjugate-structure algebraic-code-excited
linear prediction.
G.729 ???
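The compression figures above translate into a small per-frame bit budget. The sketch below works it out, assuming 8 kHz sampling and the 10 ms frame length specified for G.729; the variable names are illustrative.

```python
PCM_RATE_BPS = 64_000     # input bit rate stated on the slide
G729_RATE_BPS = 8_000     # output bit rate stated on the slide
SAMPLE_RATE_HZ = 8_000    # assumption: 8 kHz sampling
FRAME_MS = 10             # assumption: 10 ms frames

compression_ratio = PCM_RATE_BPS / G729_RATE_BPS
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000
bits_per_frame = G729_RATE_BPS * FRAME_MS // 1000
bits_per_sample = bits_per_frame / samples_per_frame
```

Under these assumptions the coder has, on average, a single bit per input sample, which is why it transmits model parameters instead of waveform samples.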
10. Algebraic CELP (ACELP) is an attempt to reduce the
computational cost of standard CELP coders.
The term “algebraic” refers to the use of simple algebraic
rules, namely addition and shifting, to create the excitation
codevectors.
The advantage of this method is that the codebook does not have to be
stored, resulting in significant memory savings.
Algebraic CELP or ACELP
12. The fixed codebook is based on an algebraic codebook structure
using an interleaved single-pulse permutation (ISPP) design.
The 40-sample subframe is divided into five interleaved tracks of
eight positions each.
Each codevector contains four pulses, m0 to m3: m0, m1, and m2 each
lie in one of the first three tracks, and m3 may fall in either of the
last two.
Hence the name Interleaved Single-Pulse Permutation (ISPP).
STRUCTURE OF FIXED CODEBOOK
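The interleaved structure can be generated on the fly instead of being stored. The following sketch assumes the track layout of the G.729 recommendation, in which four pulses are placed and the fourth pulse may fall in either of the last two of the five interleaved tracks; the helper names are hypothetical.

```python
SUBFRAME_LEN = 40   # samples per subframe
NUM_TRACKS = 5      # interleaving step


def track(t):
    """Positions t, t+5, t+10, ... inside the subframe."""
    return list(range(t, SUBFRAME_LEN, NUM_TRACKS))


# Allowed positions per pulse; the last pulse covers tracks 3 and 4 (assumed layout).
PULSE_POSITIONS = [track(0), track(1), track(2), sorted(track(3) + track(4))]


def build_codevector(positions, signs):
    """Place four signed unit pulses; every other sample stays zero."""
    codevector = [0] * SUBFRAME_LEN
    for i, (m, s) in enumerate(zip(positions, signs)):
        if m not in PULSE_POSITIONS[i] or s not in (1, -1):
            raise ValueError(f"invalid pulse {i}: position {m}, sign {s}")
        codevector[m] = s
    return codevector


example = build_codevector([5, 11, 22, 39], [1, -1, 1, -1])
```

Because every codevector follows from four positions and four signs, no table of excitation vectors has to be kept in memory.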
13. Each pulse has an amplitude of either +1 or -1,
so each pulse requires 1 bit for its sign.
For m0, m1, and m2, 3 bits are needed for the position,
while m3 requires 4 bits.
In total, 13 position bits (3 + 3 + 3 + 4) and 4 sign bits, i.e. 17 bits,
are needed to index the whole codebook.
STRUCTURE OF FIXED CODEBOOK (CONTD..)
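The 17-bit budget can be checked by packing four pulses into a 13-bit position index and a 4-bit sign index. The bit layout below is one possible packing, written as an assumption for illustration; it presumes pulse positions m0, m1, m2 on tracks 0, 1, 2 and m3 on track 3 or 4, and the function names are hypothetical.

```python
POSITION_BITS = [3, 3, 3, 4]   # bits per pulse position, from the slide
SIGN_BITS = 4                  # one sign bit per pulse
TOTAL_BITS = sum(POSITION_BITS) + SIGN_BITS


def encode_pulses(positions, signs):
    """Pack four pulse positions and signs into two integer indices."""
    m0, m1, m2, m3 = positions
    which_track = 0 if m3 % 5 == 3 else 1          # extra bit for the last pulse
    pos_index = ((m0 // 5)
                 + ((m1 // 5) << 3)
                 + ((m2 // 5) << 6)
                 + ((2 * (m3 // 5) + which_track) << 9))
    sign_index = sum((1 if s > 0 else 0) << i for i, s in enumerate(signs))
    return pos_index, sign_index


def decode_pulses(pos_index, sign_index):
    """Inverse of encode_pulses."""
    m0 = 5 * (pos_index & 7)
    m1 = 5 * ((pos_index >> 3) & 7) + 1
    m2 = 5 * ((pos_index >> 6) & 7) + 2
    last = (pos_index >> 9) & 15
    m3 = 5 * (last >> 1) + 3 + (last & 1)
    signs = [1 if (sign_index >> i) & 1 else -1 for i in range(4)]
    return [m0, m1, m2, m3], signs


pos_index, sign_index = encode_pulses([5, 11, 22, 39], [1, -1, 1, -1])
```

Decoding the two indices returns the original positions and signs, so the pair of indices identifies the codevector completely.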
14.
PROCESS FLOW GRAPH
Maximizing term:
$$\frac{C_k^2}{E_k} = \frac{\left(\sum_{n=0}^{39} d(n)\,c_k(n)\right)^2}{c_k^t\,\Phi\,c_k}$$
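The maximizing term is the ratio C_k^2 / E_k, where C_k is the correlation of codevector c_k with d(n) and E_k is the energy of the filtered codevector. The sketch below evaluates it, assuming d(n) is the correlation of the target signal x(n) with the impulse response h(n) and Φ is the correlation matrix of h(n), as in the G.729 recommendation; the function names and the test data are made up for the example.

```python
def correlations(h, x):
    """d(n) = sum_{i>=n} x(i) h(i-n);  phi(i,j) = sum_{n>=max(i,j)} h(n-i) h(n-j)."""
    length = len(x)
    d = [sum(x[i] * h[i - n] for i in range(n, length)) for n in range(length)]
    phi = [[sum(h[n - i] * h[n - j] for n in range(max(i, j), length))
            for j in range(length)]
           for i in range(length)]
    return d, phi


def criterion(codevector, d, phi):
    """C_k^2 / E_k; only the non-zero pulses contribute to either sum."""
    pulses = [(n, v) for n, v in enumerate(codevector) if v != 0]
    corr = sum(v * d[n] for n, v in pulses)
    energy = sum(vi * vj * phi[i][j] for i, vi in pulses for j, vj in pulses)
    return corr * corr / energy


# Made-up data: a unit-impulse response makes phi the identity and d equal to x.
h = [1.0] + [0.0] * 39
x = [float(n) for n in range(40)]
d, phi = correlations(h, x)

c = [0] * 40
c[5], c[11], c[22], c[39] = 1, -1, 1, -1
value = criterion(c, d, phi)
```

Since a codevector holds only four pulses, each evaluation touches four entries of d and sixteen entries of Φ, so the cost of a search is dominated by how many pulse combinations are tried.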
15.
PROPOSED SEARCH METHOD
The proposed method combines the reduced candidate mechanism (RCM) with
iteration-free pulse replacement (IFPR).
RCM gives the individual pulse contribution in each track.
IFPR performs the pulse replacement by searching over
the sorted top N pulses.
This method requires a search load of about 7.5% of that of G.729A.
16.
IFPR SEARCH METHOD
In the IFPR method, pulse contributions are first evaluated for every
track, and new pulses are then sought through several pulse replacements
at a time.
The search criterion is maximized over all combinations in which pulses of
the initial codevector are replaced with the most significant pulses of
each track.
Replacing the pulses of the initial codevector in this way requires 48
searches in total.
17.
RCM SEARCH METHOD
The number of candidate pulses in each track is reduced to lower the
search complexity.
As the first step, the pulses in each track are sorted by contribution in
descending order; the top N pulses are then chosen as the candidates for a
full search.
The search therefore needs to be performed only N^4 times, and the best
combination of the candidates is selected.
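The RCM steps can be sketched as a sort followed by an exhaustive search over the reduced candidate sets. Several details are assumptions made for this example and are not taken from the slides: the contribution of a pulse is taken to be the single-pulse value d(m)^2 / Φ(m, m) of the search criterion, each pulse sign is preset to the sign of d(m), and the four pulse tracks follow the G.729 layout. All names are hypothetical.

```python
from itertools import product

# Assumed layout: pulses 0-2 on tracks 0-2, pulse 3 on tracks 3 and 4.
TRACKS = [list(range(0, 40, 5)), list(range(1, 40, 5)), list(range(2, 40, 5)),
          sorted(list(range(3, 40, 5)) + list(range(4, 40, 5)))]


def contribution(m, d, phi):
    """Assumed measure: the criterion for a codevector with one pulse at m."""
    return d[m] * d[m] / phi[m][m]


def criterion(positions, d, phi):
    """C^2 / E for pulses at `positions`, signs preset to sign(d) (assumption)."""
    signs = [1 if d[m] >= 0 else -1 for m in positions]
    pulses = list(zip(signs, positions))
    corr = sum(s * d[m] for s, m in pulses)
    energy = sum(si * sj * phi[mi][mj] for si, mi in pulses for sj, mj in pulses)
    return corr * corr / energy


def rcm_search(d, phi, n_candidates):
    """Keep the top-N pulses per track, then try all N^4 combinations."""
    candidates = [sorted(t, key=lambda m: contribution(m, d, phi),
                         reverse=True)[:n_candidates]
                  for t in TRACKS]
    best, best_value, searches = None, -1.0, 0
    for combo in product(*candidates):
        searches += 1
        value = criterion(combo, d, phi)
        if value > best_value:
            best, best_value = combo, value
    return best, best_value, searches


# Made-up data: identity correlation matrix and increasing d(n).
d = [float(n + 1) for n in range(40)]
phi = [[1.0 if i == j else 0.0 for j in range(40)] for i in range(40)]
best, best_value, searches = rcm_search(d, phi, n_candidates=2)
```

With N = 2 the loop runs 2^4 = 16 times, compared with the 8 x 8 x 8 x 16 combinations of an unrestricted search over the same tracks.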
19.
PROCESS FLOW GRAPH
The individual pulse contribution is evaluated, and the pulses are sorted
by contribution within their associated track.
The pulse with the global maximum contribution, named G1, is located
among the top-ranked pulses of all the tracks.
G1 is presumed to be one of the four optimal pulses.
20.
PROCESS FLOW GRAPH (CONTD..)
The value of N is chosen, and the remaining three tracks are searched
through RCM.
The search terminates as soon as the combination of optimal pulses is
found.
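The flow described on these two slides can be sketched as follows: rank the pulses per track, fix the globally best top-ranked pulse G1, and search the top N candidates of the three remaining tracks. As assumptions for this example, the pulse contribution is taken to be d(m)^2 / Φ(m, m), pulse signs are preset to the sign of d(m), and the tracks follow the G.729 layout; none of these details or names come from the slides.

```python
from itertools import product

# Assumed layout: pulses 0-2 on tracks 0-2, pulse 3 on tracks 3 and 4.
TRACKS = [list(range(0, 40, 5)), list(range(1, 40, 5)), list(range(2, 40, 5)),
          sorted(list(range(3, 40, 5)) + list(range(4, 40, 5)))]


def contribution(m, d, phi):
    """Assumed measure: the criterion for a codevector with one pulse at m."""
    return d[m] * d[m] / phi[m][m]


def criterion(positions, d, phi):
    """C^2 / E for pulses at `positions`, signs preset to sign(d) (assumption)."""
    pulses = [(1 if d[m] >= 0 else -1, m) for m in positions]
    corr = sum(s * d[m] for s, m in pulses)
    energy = sum(si * sj * phi[mi][mj] for si, mi in pulses for sj, mj in pulses)
    return corr * corr / energy


def combined_search(d, phi, n_candidates):
    """Fix G1, then search the top-N pulses of the other three tracks."""
    ranked = [sorted(t, key=lambda m: contribution(m, d, phi), reverse=True)
              for t in TRACKS]
    # G1: the best pulse among the top-ranked pulse of every track.
    g1_track = max(range(len(TRACKS)),
                   key=lambda t: contribution(ranked[t][0], d, phi))
    g1 = ranked[g1_track][0]
    candidates = [[g1] if t == g1_track else ranked[t][:n_candidates]
                  for t in range(len(TRACKS))]
    best, best_value, searches = None, -1.0, 0
    for combo in product(*candidates):
        searches += 1
        value = criterion(combo, d, phi)
        if value > best_value:
            best, best_value = combo, value
    return g1, best, best_value, searches


# Made-up data: identity correlation matrix and increasing d(n).
d = [float(n + 1) for n in range(40)]
phi = [[1.0 if i == j else 0.0 for j in range(40)] for i in range(40)]
g1, best, best_value, searches = combined_search(d, phi, n_candidates=2)
```

Fixing G1 removes one track from the search, so under these assumptions N = 2 needs 2^3 = 8 evaluations of the criterion.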
21.
CONCLUSION & SCOPE
A pulse with a high contribution is more likely to be one of the optimal
pulses in its track.
The proposal requires eight searches for N = 2, i.e., a search load of
about 2.5% of that of G.729A, with similar reductions for other values of
N.
The improved G.729A speech codec can be used to improve Voice over
Internet Protocol (VoIP) performance on smartphones.
As a consequence of the reduced computational load, energy efficiency
improves, allowing a longer operating time.
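The search-load figures quoted in the slides can be cross-checked with a little arithmetic. In the sketch below the G.729A baseline is not taken from the slides directly; it is inferred from the statement that eight searches amount to 2.5%. The N^3 count for the combined method is likewise an assumption that happens to match the eight searches quoted for N = 2.

```python
COMBINED_SEARCHES_N2 = 8    # from the slides: eight searches for N = 2
COMBINED_SHARE_N2 = 0.025   # from the slides: about 2.5% of G.729A
IFPR_SEARCHES = 48          # from the IFPR slide

# Inferred baseline: number of searches that 100% of G.729A would correspond to.
baseline_searches = round(COMBINED_SEARCHES_N2 / COMBINED_SHARE_N2)


def rcm_searches(n):
    """Full search over the top-N pulses of four tracks (from the RCM slide)."""
    return n ** 4


def combined_searches(n):
    """Assumption: G1 is fixed, so only three tracks are searched."""
    return n ** 3


def share_percent(searches):
    """Search load relative to the inferred G.729A baseline."""
    return 100.0 * searches / baseline_searches


loads = {
    "combined, N=2": share_percent(combined_searches(2)),
    "RCM, N=2": share_percent(rcm_searches(2)),
    "IFPR": share_percent(IFPR_SEARCHES),
}
```

The dictionary `loads` then holds each method's load as a percentage of the inferred baseline, which makes the relative ordering of the three methods easy to compare.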
22.
REFERENCE PAPER
C. Y. Yeh, "An efficient fixed codebook search for G.729
speech codec derived from RCM-based search algorithm,"
in 2014 IEEE China Summit & International Conference on Signal and
Information Processing (ChinaSIP), Xi'an, 2014, pp. 76-79.