Presented by-
ASHISH MAURYA
(2015 VLSI-13)
AN EFFICIENT FIXED CODEBOOK SEARCH
METHOD FOR G.729 SPEECH CODEC
ABV-Indian Institute of Information Technology and Management Gwalior,
Morena Link Road, Gwalior, Madhya Pradesh, INDIA - 474015.
May 13, 2016
CONTENTS
I. Speech coding
i. Block Diagram and Description
II. Speech coder
III. G.729???
IV. Algebraic Code Excited Linear Prediction (ACELP)
i. Block Diagram
ii. Fixed Codebook Structure
iii. Process Flowgraph
V. Proposed Search Method
i. IFPR Search Method
ii. RCM Search Method
iii. Combined Search Method
- Process Flowgraph
VI. Conclusion & Scope
VII. Reference Paper
SPEECH CODING
• Speech coding is a procedure for representing a digitized speech
signal with as few bits as possible while maintaining a reasonable
level of speech quality.
• Due to the increasing demand for speech communication, speech
coding technology has attracted growing interest from the research,
standardization, and business communities.
BLOCK DIAGRAM
[Figure: block diagram of the speech coding system]
BLOCK DESCRIPTION
• Speech source: A continuous-time analog speech signal source.
• Filter: Selects the required frequency band by filtering out
unwanted signals.
• Sampler: Converts the continuous-time signal to a discrete-time signal.
Sampling is done at 8 kHz, which satisfies the Nyquist criterion for
speech band-limited to below 4 kHz.
• ADC: The discrete-time signal is quantized to obtain a digital signal.
• Source encoder: Encodes the digitized signal to reduce the bit rate.
BLOCK DESCRIPTION (CONTD.)
• Channel encoder: Adds error protection to the bit-stream
before it is transmitted over the communication channel.
• Channel decoder: Processes the error-protected data to
recover the encoded data.
• Source decoder: Reconstructs the digital speech signal at
the original bit rate.
• DAC: Converts the digital speech signal to a continuous-time
analog signal.
SPEECH CODER
• The encoder/decoder structure shown in the block diagram is known as a
speech coder.
• The input speech is encoded to produce a low-rate bit-stream.
• This bit-stream is input to the decoder, which constructs an approximation of
the original signal.
SPEECH CODER (CONTD.)
Encoding
• Derive the filter coefficients from the speech frame.
• Derive the scale factor from the speech frame.
• Transmit the filter coefficients and the scale factor to the decoder.
Decoding
• Generate a white noise sequence.
• Multiply the white noise samples by the scale factor.
• Construct the filter using the coefficients received from the encoder
and filter the scaled white noise sequence.
• The output of the filter is the output speech.
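The encoding and decoding steps listed for the speech coder can be sketched in a few lines of Python. This is a minimal illustration, not the G.729 algorithm: it assumes the filter is an all-pole linear predictor estimated with the autocorrelation method and the Levinson-Durbin recursion, and that the scale factor is the RMS of the prediction residual. The function names (`autocorr`, `levinson`, `encode`, `decode`) and the default order of 10 are choices made for this sketch.

```python
import math
import random

def autocorr(frame, order):
    """Autocorrelation lags r[0..order] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion.
    Returns predictor coefficients a[1..order], where s[n] ~ sum a[k]*s[n-k],
    and the remaining prediction-error energy."""
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err

def encode(frame, order=10):
    """Encoder: derive filter coefficients and scale factor from a frame."""
    r = autocorr(frame, order)
    if r[0] == 0.0:                  # silent frame: nothing to predict
        return [0.0] * order, 0.0
    coeffs, err = levinson(r, order)
    gain = math.sqrt(max(err, 0.0) / len(frame))   # RMS of the residual
    return coeffs, gain

def decode(coeffs, gain, n_samples, seed=0):
    """Decoder: filter scaled white noise through the all-pole filter."""
    rng = random.Random(seed)
    out = []
    for n in range(n_samples):
        excitation = gain * rng.gauss(0.0, 1.0)    # scaled white noise
        past = sum(a * out[n - 1 - k]
                   for k, a in enumerate(coeffs) if n - 1 - k >= 0)
        out.append(excitation + past)
    return out
```

Calling `encode(frame)` on one frame and passing the result to `decode(coeffs, gain, len(frame))` yields a noise-excited approximation with a similar spectral envelope; only the coefficients and the scale factor need to be transmitted.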
G.729 ???
• G.729 is an ITU-T speech coding standard.
• Its most important use is in VoIP.
• It compresses the speech signal from 64 kbps to 8 kbps.
• It uses the CS-ACELP algorithm.
• CS-ACELP stands for conjugate-structure algebraic-code-excited
linear prediction.
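The bit rates quoted for G.729 can be checked with a little arithmetic. The sketch below assumes that the 64 kbps figure refers to 8-bit PCM at the 8 kHz sampling rate (as in G.711) and uses the 10 ms frame length of G.729; the variable names are chosen for this example.

```python
SAMPLE_RATE_HZ = 8000        # sampling rate stated in the block description
PCM_BITS_PER_SAMPLE = 8      # assumption: 8-bit PCM input, as in G.711
G729_BIT_RATE_BPS = 8000     # 8 kbps, as stated for G.729
FRAME_MS = 10                # G.729 processes speech in 10 ms frames

pcm_rate_bps = SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE          # input bit rate
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000        # samples per frame
bits_per_frame = G729_BIT_RATE_BPS * FRAME_MS // 1000        # coded bits per frame
compression_ratio = pcm_rate_bps / G729_BIT_RATE_BPS
```

Under these assumptions each 10 ms frame of 80 samples is represented by 80 bits, an 8:1 reduction relative to 64 kbps PCM.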
Algebraic CELP or ACELP
• Algebraic CELP (ACELP) is an attempt to reduce the
computational cost of standard CELP coders.
• The term "algebraic" refers to the use of simple algebraic
rules, namely addition and shifting, to create the excitation
code vectors.
• The advantage of this method is that the code vectors need not be
physically stored, which saves a significant amount of memory.
ACELP Encoder Block Diagram
[Figure: ACELP encoder block diagram; labeled block: Fixed CB search]
STRUCTURE OF FIXED CODEBOOK
• The fixed codebook is based on an algebraic codebook structure
using an interleaved single-pulse permutation (ISPP) design.
• The 40-sample ACELP fixed codebook is built from five interleaved
tracks of single-pulse positions.
• Hence the name Interleaved Single-Pulse Permutation (ISPP).
STRUCTURE OF FIXED CODEBOOK (CONTD.)
• Each pulse has an amplitude of either +1 or -1.
• Each pulse requires 1 bit for its sign.
• For m0, m1, and m2, 3 bits are needed for the position,
while for m3, 4 bits are required.
• In total, 17 bits (13 position bits and 4 sign bits) are needed to
index the whole codebook.
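The codebook structure described on these two slides can be illustrated with a short Python sketch. It assumes the G.729 position layout in which m0, m1 and m2 each take one of eight positions on their own track and m3 takes one of sixteen positions drawn from the last two tracks. The function names and the bit-packing order in `pack_positions` are assumptions made for illustration and should be checked against the Recommendation before being relied on.

```python
SUBFRAME = 40

# Allowed positions for each of the four pulses (assumed G.729 layout).
TRACKS = [
    list(range(0, 40, 5)),                          # m0: 0, 5, ..., 35
    list(range(1, 40, 5)),                          # m1: 1, 6, ..., 36
    list(range(2, 40, 5)),                          # m2: 2, 7, ..., 37
    list(range(3, 40, 5)) + list(range(4, 40, 5)),  # m3: 16 positions
]

def build_codevector(positions, signs):
    """Create the 40-sample codevector algebraically; no table is stored."""
    c = [0] * SUBFRAME
    for track, m, s in zip(TRACKS, positions, signs):
        if m not in track:
            raise ValueError(f"position {m} is not allowed on this track")
        if s not in (1, -1):
            raise ValueError("pulse amplitude must be +1 or -1")
        c[m] = s
    return c

def index_bits():
    """Bits needed to index the codebook: position bits plus one sign bit each."""
    position_bits = [(len(t) - 1).bit_length() for t in TRACKS]   # 3, 3, 3, 4
    return sum(position_bits) + len(TRACKS)

def pack_positions(positions):
    """Pack the four positions into a 13-bit index (illustrative bit order)."""
    m0, m1, m2, m3 = positions
    jx = 0 if m3 % 5 == 3 else 1      # which of its two tracks m3 lies on
    return (m0 // 5) + 8 * (m1 // 5) + 64 * (m2 // 5) + 512 * (2 * (m3 // 5) + jx)
```

For example, `build_codevector([5, 11, 37, 39], [1, -1, 1, -1])` returns a vector that is zero everywhere except at the four chosen positions, which is why the codebook needs no physical storage.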
PROCESS FLOW GRAPH
Maximizing Term:
$$\frac{C_k^2}{E_k} = \frac{\left(\sum_{n=0}^{39} d(n)\,c_k(n)\right)^2}{c_k^t\,\Phi\,c_k}$$
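Because a codevector contains only four non-zero pulses, the maximizing term C_k^2 / E_k can be evaluated from a handful of table look-ups instead of a full 40-sample sum. The sketch below assumes the usual ACELP definitions, which the slide does not spell out: d(n) is the correlation between the target signal x and the impulse response h, and Φ is the matrix of correlations of h. The helper names `correlations` and `criterion` are chosen for this example.

```python
def correlations(h, x):
    """Build d = H^t x and Phi = H^t H from impulse response h and target x.
    H is the lower-triangular Toeplitz convolution matrix of h (assumed setup)."""
    size = len(x)
    d = [sum(x[i] * h[i - n] for i in range(n, size)) for n in range(size)]
    phi = [[sum(h[n - i] * h[n - j] for n in range(max(i, j), size))
            for j in range(size)] for i in range(size)]
    return d, phi

def criterion(positions, signs, d, phi):
    """Return (C_k^2, E_k) for a codevector with pulses signs[i] at positions[i]."""
    pulses = list(zip(positions, signs))
    corr = sum(s * d[m] for m, s in pulses)            # C_k = sum d(n) c_k(n)
    energy = 0.0                                       # E_k = c_k^t Phi c_k
    for i, (mi, si) in enumerate(pulses):
        energy += phi[mi][mi]
        for mj, sj in pulses[i + 1:]:
            energy += 2.0 * si * sj * phi[mi][mj]
    return corr * corr, energy
```

A search procedure then keeps the pulse combination with the largest ratio `C_k^2 / E_k`; comparing by cross-multiplication avoids a division per candidate.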
PROPOSED SEARCH METHOD
• The proposed method combines the reduced candidate mechanism (RCM)
with iteration-free pulse replacement (IFPR).
• RCM provides the individual pulse contribution in each track.
• IFPR replaces a pulse by searching over the sorted top N pulses.
• This method requires a search load of about 7.5% of that of G.729A.
IFPR SEARCH METHOD
• In the IFPR method, pulse contributions are first evaluated for
every track, and new pulses are then sought by trying a number of
pulse replacements at a time.
• The pulses of the initial codevector are replaced with the most
significant pulses of each track so as to maximize the search
criterion over all tested combinations.
• Replacing the pulses of the initial codevector with the most
significant pulses of every track has an overall search
complexity of 48.
RCM SEARCH METHOD
• The number of candidate pulses in each track is reduced in order to
lower the search complexity.
• As the first step, the pulses are sorted by contribution in
descending order; the top N pulses in each track are then chosen as
the candidate pulses for a full search.
• In this way, the search needs to be performed only N^4 times to
find the optimal pulse combination.
• The best combination of the candidates is then selected.
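The RCM steps above can be sketched as follows. The slides do not give the formula for pulse contribution, so this example uses |d(n)| as a stand-in and presets each pulse sign to the sign of d(n); both are assumptions, as are the function names and the G.729-style track layout. With the top N candidates kept in each of the four tracks, the loop runs N^4 times.

```python
from itertools import product

# Assumed G.729-style pulse tracks (the fourth pulse has 16 positions).
TRACKS = [
    list(range(0, 40, 5)),
    list(range(1, 40, 5)),
    list(range(2, 40, 5)),
    list(range(3, 40, 5)) + list(range(4, 40, 5)),
]

def criterion_terms(positions, d, phi):
    """(C_k^2, E_k) with each pulse sign preset to the sign of d(n)."""
    signs = [1 if d[m] >= 0 else -1 for m in positions]
    corr = sum(s * d[m] for m, s in zip(positions, signs))
    energy = sum(si * sj * phi[mi][mj]
                 for mi, si in zip(positions, signs)
                 for mj, sj in zip(positions, signs))
    return corr * corr, energy

def rcm_search(d, phi, n_top):
    """Full search restricted to the top n_top candidates of every track."""
    # Step 1: sort each track by contribution (assumed here to be |d(n)|).
    candidates = [sorted(track, key=lambda m: abs(d[m]), reverse=True)[:n_top]
                  for track in TRACKS]
    # Step 2: test every combination of the retained candidates.
    best, best_c2, best_e, searches = None, -1.0, 1.0, 0
    for combo in product(*candidates):
        c2, e = criterion_terms(combo, d, phi)
        searches += 1
        # Compare c2/e with best_c2/best_e without dividing (energies > 0).
        if c2 * best_e > best_c2 * e:
            best, best_c2, best_e = combo, c2, e
    return best, searches
```

With `n_top=2` the function performs 16 searches. Setting `n_top=16` keeps every position and reproduces an exhaustive search over all 8 x 8 x 8 x 16 = 8192 combinations, which makes the saving easy to see.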
COMBINED SEARCH METHOD
COMPARISON TABLE
PROCESS FLOW GRAPH
• The individual pulse contribution is evaluated, and the pulses
are sorted by contribution within their associated track.
• The pulse with the global maximum contribution, named G1, is
located among the top-1 pulses of all the tracks.
• G1 is presumed to be one of the four optimal pulses.
PROCESS FLOW GRAPH (CONTD.)
• The value of N is chosen for the search conducted over the
remaining three tracks through RCM.
• The search terminates as soon as the combination of optimal
pulses is obtained.
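The flow described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's implementation: pulse contribution is taken to be |d(n)|, pulse signs are preset to the sign of d(n), and the track layout follows G.729. Fixing G1 and keeping the top N candidates in the three remaining tracks gives N^3 searches, i.e. eight for N = 2.

```python
from itertools import product

# Assumed G.729-style pulse tracks (the fourth pulse has 16 positions).
TRACKS = [
    list(range(0, 40, 5)),
    list(range(1, 40, 5)),
    list(range(2, 40, 5)),
    list(range(3, 40, 5)) + list(range(4, 40, 5)),
]

def criterion_terms(positions, d, phi):
    """(C_k^2, E_k) with each pulse sign preset to the sign of d(n)."""
    signs = [1 if d[m] >= 0 else -1 for m in positions]
    corr = sum(s * d[m] for m, s in zip(positions, signs))
    energy = sum(si * sj * phi[mi][mj]
                 for mi, si in zip(positions, signs)
                 for mj, sj in zip(positions, signs))
    return corr * corr, energy

def combined_search(d, phi, n_top):
    """Fix G1, then search the top n_top candidates of the other three tracks."""
    # Sort every track by contribution (assumed here to be |d(n)|).
    ranked = [sorted(t, key=lambda m: abs(d[m]), reverse=True) for t in TRACKS]
    # G1: the largest contribution among the top-1 pulses of all tracks.
    g1_track = max(range(len(TRACKS)), key=lambda t: abs(d[ranked[t][0]]))
    g1 = ranked[g1_track][0]
    candidates = [[g1] if t == g1_track else ranked[t][:n_top]
                  for t in range(len(TRACKS))]
    best, best_c2, best_e, searches = None, -1.0, 1.0, 0
    for combo in product(*candidates):
        c2, e = criterion_terms(combo, d, phi)
        searches += 1
        if c2 * best_e > best_c2 * e:      # cross-multiplied ratio comparison
            best, best_c2, best_e = combo, c2, e
    return best, g1, searches
```

Because G1 is never re-examined, the result can differ from the exhaustive optimum; the trade-off is a search count that grows as N^3 rather than N^4.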
CONCLUSION & SCOPE
• A pulse with a high contribution is more likely to be one of the optimal
pulses in its associated track.
• This proposal requires eight searches in the case of N = 2, i.e., a search load
of about 2.5% of that of G.729A, with similar reductions for other values of N.
• The improved G.729A speech codec can be used to improve Voice over
Internet Protocol (VoIP) performance on smartphones.
• The reduced computational load also improves energy efficiency, allowing
a longer operating time.
REFERENCE PAPER
C. Y. Yeh, "An efficient fixed codebook search for G.729
speech codec derived from RCM-based search algorithm,"
2014 IEEE China Summit & International Conference on Signal and
Information Processing (ChinaSIP), Xi'an, 2014, pp. 76-79.
