Front-end Audio Processing:
Reflections on Issues, Requirements, and Solutions
Tomas Gaensler
mh acoustics
www.mhacoustics.com
Summit NJ/Burlington VT
USA
Front-end Audio Processing
Processing to enhance perceived and/or measured sound quality
in communication and recording devices
Not So Famous Quotes (Acoustic Jewelry/Bluetooth Headset)
• Gary Elko (mh/Bell Labs colleague)
  • At IWAENC 1995: “Acoustic echo cancellation will not be needed in the future when people wear acoustic jewelry”
• Arno Penzias (1978 Nobel Prize laureate)
  • “No one would want acoustic jewelry because people would think the users talking to themselves are crazy”
• I’m glad the success of Bluetooth headsets shows that both were completely wrong!
Classical Front-end Architectures - POTS
[Block diagram: receive side (Rx) and send side (Tx), each with a BPF, plus a switched-loss block]
• Carbon microphone with expansion effect that reduces noise
• Large coupling loss in handset mode
• Switch loss in telephones that support speakerphone mode
Classical Front-end Architectures – Cellphone 1995
[Block diagram with receive side (Rx) and send side (Tx). Blocks: BPF/ADC, BPF/DAC, EQ (two instances), Encoder/Decoder, Vol, AEC, NLP]
Classical Front-end Architectures – Cellphone 2005 - 2010
[Block diagram with receive side (Rx) and send side (Tx). Blocks: BPF/ADC, BPF/DAC, EQ (two instances), Encoder/Decoder, Vol, AEC, TX LEV, NS, NLP, RX LEV, NS]
Cellphones and Handsfree
• Common problems:
  • Far-end listener does not hear near-end talker
  • Near-end listener does not understand far-end talker
• Why?
  • Form factor – Size
  • Limited understanding of physics and acoustics(?)
RX/TX Levels, Coupling and Doubletalk
• Far-end → 95–100 dB SPL at loudspeaker, 85–90 dB SPL at mic
• Near-end talker → 55–70 dB SPL at mic
• Echo louder than near-end:
  • Linear AEC
  • ERLE ≈ 20-30 dB
  • After cancellation, Residual Echo to Near-end Ratio (RENR):
  • RENR ≈ 90 - 20 - 70 = 0 dB
• >20 dB of residual echo suppression required
• Duplexness suffers
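The RENR figure is plain level arithmetic, and a minimal sketch in Python makes the budget explicit. It uses the slide's numbers (90 dB SPL echo at the mic, 20 dB ERLE, 70 dB SPL near-end speech); the function name `renr_db` is illustrative and not from the talk.

```python
def renr_db(echo_at_mic_db, erle_db, near_end_db):
    """Residual Echo to Near-end Ratio in dB after linear echo cancellation."""
    residual_echo_db = echo_at_mic_db - erle_db  # echo left after the linear AEC
    return residual_echo_db - near_end_db


# Figures from the slide: echo 90 dB SPL at the mic, ERLE 20 dB, near-end 70 dB SPL.
renr = renr_db(90.0, 20.0, 70.0)
print(renr)  # 0.0
```

At 0 dB the residual echo is as loud as the near-end talker, so the remaining echo has to be removed by suppression rather than cancellation.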
TX: Dynamic Range and Noise
[Figure: ADC input annotated with near-end speech level 70 dB SPL and echo level 90 dB SPL, next to a level diagram of SPL [dB] against digital level: RAIL (e.g. 32768 or 1) at 110, speech level at 70, room noise level at 43, mic circuit noise (1/f) at 29 (mic SNR = 65 dB re 94), Q-noise (white, 14 bits) at 26]
• Echo 90 dB SPL → peak echo 105-110 dB
  • No saturation of echo in TX path
• Actual speech to room noise ratio is only about 27 dB at best
• Gain is required to get loud enough output
  • Perceived noise level is ~20 dB above normal room noise level
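The levels on this slide can be reproduced with a short calculation. The sketch below assumes, as a reading of the diagram, that the rail sits at 110 dB SPL, that the "14 bits" label belongs to the white Q-noise floor, and that the mic SNR is referenced to 94 dB SPL. The 6.02 dB-per-bit constant and all function names are assumptions for illustration.

```python
DB_PER_BIT = 6.02  # approximate dynamic range per bit of a linear quantizer


def quantization_floor_db_spl(rail_db_spl, bits):
    """Level of the quantization step relative to the rail, in dB SPL."""
    return rail_db_spl - DB_PER_BIT * bits


def mic_noise_db_spl(mic_snr_db, reference_db_spl=94.0):
    """Equivalent input noise of a microphone whose SNR is quoted re 94 dB SPL."""
    return reference_db_spl - mic_snr_db


rail = 110.0                                  # assumed rail level in dB SPL
q_floor = quantization_floor_db_spl(rail, 14) # about 26 dB SPL
mic_floor = mic_noise_db_spl(65.0)            # 29 dB SPL
speech_to_room = 70.0 - 43.0                  # 27 dB, speech level minus room noise
```

Under these assumptions the room noise at 43 dB SPL, not the electronics, sets the noise floor, which is why the speech-to-noise ratio cannot exceed about 27 dB.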
TX: Fixed-point Processing and Quantization Noise
[Figure: blocks ADC, AFB (FFT), SFB (FFT), next to a level diagram of SPL [dB] against digital level: RAIL (e.g. 32768 or 1) at 110, speech level at 70, Q-noise from 64-point FFT processing at 50, LSB for 16 bits at 14]
• Q-noise increases by 6 log2(N) dB!
• N = 64 → Q-noise increases by 6 log2(64) = 36 dB
• Double-precision “required”
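The 6 log2(N) rule from this slide is easy to tabulate. The sketch below applies it to the 16-bit LSB level of 14 dB SPL shown in the diagram; the function name is hypothetical.

```python
import math


def fft_qnoise_growth_db(n_points, db_per_doubling=6.0):
    """Quantization noise growth of fixed-point FFT processing: 6*log2(N) dB."""
    return db_per_doubling * math.log2(n_points)


lsb_16bit_db_spl = 14.0                      # LSB level for 16 bits, from the diagram
growth = fft_qnoise_growth_db(64)            # 36.0 dB
raised_floor = lsb_16bit_db_spl + growth     # 50.0 dB SPL
print(growth, raised_floor)  # 36.0 50.0
```

A floor at 50 dB SPL sits above the 43 dB SPL room noise level from the previous slide, which is the argument for double precision, block scaling, or floating point.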
RX: Dynamic Range and Distortion
[Block diagram blocks: digital gain, EQ, DAC, analog gain, with a tap labeled “To AEC”]
• Small loudspeakers have a rather high cut-off frequency (high-pass)
• EQ is often required to get an acceptable “sound” (frequency response). However, EQ means:
  • Loss of signal loudness and dynamic range
  • Increased (analog) distortion
• Many manufacturers compensate for the loss of signal level with excessive digital gain and therefore get (digital) saturation
What Can or Should be Done?
• Minimize acoustical coupling by good physical design
• TX
  • Use noise suppression but not excessively
  • Double-precision, block scaling, or floating-point
• RX
  • Compression instead of fixed gain
  • 10% or less loudspeaker/driver THD is desired
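One way to read “compression instead of fixed gain” is a level-dependent gain that boosts quiet passages but backs off near full scale. The sketch below is a static gain curve only, with no attack or release smoothing; the threshold, ratio, and make-up values are hypothetical and not taken from the talk.

```python
def compressor_gain_db(level_db, threshold_db=-12.0, ratio=4.0, makeup_db=6.0):
    """Static gain in dB of a simple downward compressor with make-up gain."""
    over = max(0.0, level_db - threshold_db)       # dB above the threshold
    return makeup_db - over * (1.0 - 1.0 / ratio)  # reduce gain above threshold


def output_level_db(level_db, **params):
    """Output level in dBFS for a given input level in dBFS."""
    return level_db + compressor_gain_db(level_db, **params)


quiet = output_level_db(-30.0)  # -24.0 dBFS: the full 6 dB of make-up gain
loud = output_level_db(-3.0)    # -3.75 dBFS: a fixed +6 dB would clip at +3 dBFS
```

With these example settings any input at or below 0 dBFS stays below 0 dBFS at the output, whereas a fixed 6 dB gain would saturate for inputs above -6 dBFS.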
What about Non-linear AEC Algorithms?
• Interesting problem proposed and worked on for many years
• Not practical in most AEC applications since
  • Complicated model
  • Gain and therefore saturation possibly in both TX and RX paths
  • Added complexity and system cost
  • Often slow convergence
  • Difficult to fine-tune in field
• Even when non-linear cancellation works perfectly, the user still perceives a distorted loudspeaker signal!
Classical Front-end Architectures – Cellphone 2005 - 2010
[Block diagram with receive side (Rx) and send side (Tx). Blocks: BPF/ADC, BPF/DAC, EQ (two instances), Encoder/Decoder, Vol, AEC, TX LEV, NS, NLP, RX LEV, NS]
• Why RX NS?
• Why TX NS?
Single Channel Noise Suppression
• Basic single channel noise suppressor
  • An extremely successful signal processing invention by Manfred Schroeder in the 1960s
• Musical tones – is it a (solved) problem?
• How do we evaluate and improve quality?
• How about convergence rate?
Background to Single Channel Noise Suppressors
[Block diagram: speech s(n) plus noise v(n) gives y(n) = s(n) + v(n); the NS block outputs the “enhanced” speech ŝ(n)]
• Block processing:
  $$X(k,m) = \sum_{n=0}^{K-1} w(n)\, x(m-n)\, e^{-j 2\pi k n / K}$$
• Frequency domain model:
  $$Y(k,m) = S(k,m) + V(k,m)$$
• Linear time-varying filter:
  $$\hat{S}(k,m) = H(k,m)\, Y(k,m)$$
• Wiener filter:
  $$H(k,m) = \frac{P_s(k,m)}{P_s(k,m) + P_v(k,m)} = \frac{P_y(k,m) - P_v(k,m)}{P_y(k,m)}$$
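The Wiener gain can be sketched per frequency bin as follows. The gain floor is an assumption added here to cap the suppression (0.25 is roughly -12 dB); the slide's formula itself has no floor, and the function name is illustrative.

```python
def wiener_gain(p_y, p_v, floor=0.25):
    """Per-bin gain H = (P_y - P_v) / P_y, limited to the range [floor, 1]."""
    gains = []
    for py, pv in zip(p_y, p_v):
        h = (py - pv) / py if py > 0.0 else floor  # guard against empty bins
        gains.append(max(floor, min(1.0, h)))
    return gains


# Three bins: speech-dominated, noise only, and barely above the noise estimate.
p_y = [4.0, 1.0, 1.2]   # noisy-signal power estimates P_y(k, m)
p_v = [1.0, 1.0, 1.0]   # noise power estimates P_v(k, m)
print(wiener_gain(p_y, p_v))  # [0.75, 0.25, 0.25]
```

The enhanced spectrum is then the gain times the noisy spectrum in each bin, as in the linear time-varying filter equation.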
Background to Single Channel Noise Suppressors
• Estimation of spectra is often done recursively:
  $$P_y(k,m) = \beta\,[P_y(k,m-1) - |Y(k,m)|^2] + |Y(k,m)|^2$$
  $$P_v(k,m) = \gamma\,[P_v(k,m-1) - |Y(k,m)|^2] + |Y(k,m)|^2, \quad \text{when speech is “not” present}$$
  β, γ: time-dimension averaging constants
• Frequency smoothing:
  $$\bar{H}(k,m) = \sum_{k'=-k_b}^{k_b} b(k',k)\, H(k-k',m)$$
  b(k',k): frequency-dimension averaging constants
• β, γ, and b(k',k) are critical for musical tone control
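A minimal sketch of the two smoothing steps, assuming the recursion P(k,m) = c [P(k,m-1) - |Y(k,m)|^2] + |Y(k,m)|^2 with averaging constant c between 0 and 1. For simplicity the frequency weights here do not depend on the bin k and are renormalized at the band edges; both are choices made for this example, not details from the talk.

```python
def update_psd(p_prev, y_mag2, c):
    """One recursive update per bin: c * (P_prev - |Y|^2) + |Y|^2."""
    return [c * (p - y2) + y2 for p, y2 in zip(p_prev, y_mag2)]


def smooth_gain(h, weights):
    """Weighted average of the gain over neighbouring bins (odd-length weights)."""
    half = len(weights) // 2
    out = []
    for k in range(len(h)):
        acc, norm = 0.0, 0.0
        for i, b in enumerate(weights):
            j = k - (i - half)          # neighbouring bin index k - k'
            if 0 <= j < len(h):         # skip neighbours outside the band
                acc += b * h[j]
                norm += b
        out.append(acc / norm)
    return out


p_new = update_psd([1.0], [3.0], 0.5)                      # [2.0]
h_bar = smooth_gain([0.0, 1.0, 0.0], [0.25, 0.5, 0.25])    # middle bin becomes 0.5
```

A constant closer to 1 gives a smoother but slower power estimate, and wider frequency weights reduce the bin-to-bin gain variance that is heard as musical tones.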
Musical Tones – Is it a (Solved) Problem?
• Examples
  • Original (“Sally Sievers’ reel, June-Sept. 1964” by Manfred Schroeder and Mohan Sondhi at Bell Labs)
  • Original + noise (iSNR ~ 6 dB)
  • Schroeder – 1960s
  • “Generic spectral subtraction” – Boll 1979
  • IS-127 – 1995
• “A problem of last century”, only a constraint in design
  • Controlling variance of suppression gains
  • Any NS algorithm should be constrained not to have musical tones
  • Must only have a small impact on voice quality
Quality Metrics
• Most importantly: Listen!
• SNR
  • Total
  • Segmental
  • During speech
• Distortion metrics:
  • ISD (Itakura-Saito distance)
  • ITU-T P.862: PESQ/MOS-LQO
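Of the SNR variants listed, segmental SNR is the one that most often needs spelling out: it averages the per-frame SNR in dB instead of taking one ratio over the whole file. The sketch below uses a common textbook form; the frame length of 160 samples and the clamping range of -10 to 35 dB are assumptions, not values from the talk.

```python
import math


def segmental_snr_db(clean, processed, frame_len=160, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi]."""
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        signal = sum(a * a for a in s)
        error = sum((a - b) ** 2 for a, b in zip(s, p))
        if signal == 0.0:
            continue  # skip silent frames, they carry no speech
        snr = hi if error == 0.0 else 10.0 * math.log10(signal / error)
        values.append(max(lo, min(hi, snr)))
    return sum(values) / len(values) if values else float("nan")


clean = [1.0] * 320
processed = [0.9] * 320   # 10% amplitude error in every sample
seg = segmental_snr_db(clean, processed)  # close to 20 dB
```

Restricting the frames to those where speech is active gives the “during speech” variant from the list.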
Quality Metric – P.862 (PESQ/MOS-LQO)
• MOS-LQO (MOS Listening Quality Objective)
• Alg-1/2 – Wiener methods with 12 dB noise suppression
[Figure: P.862.2 MOS-LQO (1.5 to 4.5) versus SNR (0 to 60 dB) for unproc, Alg-1, and Alg-2]
• What can the best noise suppressor achieve?
Quality Metric – “My Rule of Thumb”
[Figure: P.862.2 MOS-LQO (1.5 to 4.5) versus SNR (0 to 60 dB) for unproc, Alg-1, Alg-2, and Bound (12 dB)]
• The ideal MOS (PESQ) performance bound is given by shifting the unprocessed PESQ curve to the left
• Example for 12 dB suppression → 12 dB shift to the left
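The rule of thumb amounts to evaluating the unprocessed curve at an SNR that is higher by the amount of suppression. A minimal sketch, using a made-up unprocessed curve (the numbers below are placeholders, NOT the data plotted on the slide) and hypothetical function names:

```python
def interp(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the end points."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


def mos_bound(snr_db, snr_grid, mos_unproc, suppression_db=12.0):
    """Bound at a given input SNR: the unprocessed curve shifted left."""
    return interp(snr_db + suppression_db, snr_grid, mos_unproc)


# Placeholder unprocessed MOS-LQO curve for illustration only.
snr_grid = [0.0, 10.0, 20.0, 30.0, 40.0]
mos_unproc = [1.5, 2.0, 2.8, 3.6, 4.2]
bound_at_8 = mos_bound(8.0, snr_grid, mos_unproc)  # unprocessed value at 20 dB
```

An algorithm whose measured MOS-LQO lies well below this bound at a given SNR is paying for its suppression with speech distortion.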
Convergence Rate
• Important performance criterion:
  • Non-stationary noise conditions
  • Frame loss
• Main objective:
  • Maximize convergence rate while maintaining speech quality
Convergence Rate – A Useful Test
[Figure panels: a) Input sequence; b) IS-127; c) Wiener based; d) A spectral subtraction m-script retrieved from the internet]
Convergence Rate and MOS-LQO
[Figure panels: a) “Normal”; b) “Fast”; c) MOS-LQO]
Current Applications and Drivers of NS Technology
• Where is NS going in industry now?
  • Beyond “12 dB” of suppression
  • Multi-microphone solutions
    • Two- or more channel suppressors
    • Linear beamforming
• Applications
  • Mobile phones (a few two-microphone models have reached the market)
  • Bluetooth headsets: great "new" application for signal processing (Ericsson BT headset 2000)
Background to Linear Beamforming
• N: number of microphones
• Broadside linear beamforming (e.g. delay-sum)
  • Directional gain: 10 log10(N) dB
  • White Noise Gain (WNG) > 0
  • Practical size: “large” (~30 cm)
• Endfire differential beamforming
  • Directional gain: 20 log10(N) dB
  • WNG < 0
  • Practical size: “small” (1.5-5 cm)
[Figure: microphone array feeding a processing block, with the endfire and broadside directions marked]
• Differential beamformers are more suitable for small form factors
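The two directional-gain rules can be compared directly. The sketch below just evaluates 10 log10(N) and 20 log10(N) for a few array sizes; the function names are illustrative.

```python
import math


def broadside_gain_db(n_mics):
    """Directional gain of a broadside delay-sum array: 10*log10(N) dB."""
    return 10.0 * math.log10(n_mics)


def endfire_differential_gain_db(n_mics):
    """Directional gain of an endfire differential array: 20*log10(N) dB."""
    return 20.0 * math.log10(n_mics)


for n in (2, 3, 4):
    print(n, round(broadside_gain_db(n), 1), round(endfire_differential_gain_db(n), 1))
```

For 2, 3, and 4 microphones the differential rule gives about 6, 9.5, and 12 dB, which matches the DI values tabulated on the critical distance slide.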
Background to Linear Beamforming
• What do we gain?
  • Less reverberation (increased intelligibility)
  • Less (environmental) noise
  • No (or low) distortion on axis
  • Possible interference rejection by spatial zero(s)
• Some issues:
  • Performance is given by critical distance!
  • Increase in sensor noise (WNG, differential beamforming)
Beamforming: Critical Distance
• Critical distance (reverberation radius): reverberant-to-direct path energy ratio is 0 dB:
  $$r_c = 0.1 \left( \frac{Q V}{\pi T_{60}} \right)^{1/2}$$
  Q = directivity factor = $10^{DI/10}$
• DI = Directivity Index: gain of direct to reverberant energy over an omnidirectional microphone
• Order: order of finite differences used (1st: 2 mics, 2nd: 3 mics, etc.)

Order   DI [dB]   r_c / r_0
0       0         1.0
1       6         2.0
2       9.5       3.0
3       12        4.0

(r_0: critical distance for an omnidirectional microphone)
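The critical distance formula r_c = 0.1 (Q V / (π T60))^(1/2) with Q = 10^(DI/10) is simple to evaluate. The sketch below assumes SI units (volume in cubic metres, T60 in seconds, distance in metres), and the 50 m³ room with a 0.5 s reverberation time is a made-up example.

```python
import math


def critical_distance_m(volume_m3, t60_s, di_db=0.0):
    """Critical distance for a microphone with directivity index di_db."""
    q = 10.0 ** (di_db / 10.0)  # directivity factor
    return 0.1 * math.sqrt(q * volume_m3 / (math.pi * t60_s))


# Hypothetical small office: 50 m^3, T60 = 0.5 s.
r_omni = critical_distance_m(50.0, 0.5)        # about 0.56 m
r_first = critical_distance_m(50.0, 0.5, 6.0)  # about 1.13 m
print(round(r_first / r_omni, 2))  # 2.0
```

The ratio depends only on the DI, so each 6 dB of directivity index doubles the distance at which the direct sound still dominates.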
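First-Order Differential Beamforming
[Block diagram: plane wave of amplitude P_0 arriving at angle θ on two microphones m1 and m2 spaced d apart; one microphone signal is delayed by T_1 and subtracted from the other to give E(ω,θ), which is filtered by H_L(ω) to give Y(ω)]
$$E(\omega,\theta) \approx P_0\,\omega \left[ T_1 + \frac{d}{c}\cos(\theta) \right], \qquad H_L(\omega) = \frac{1}{\omega}$$
$$Y(\omega) = E(\omega,\theta)\, H_L(\omega) \approx P_0 \left[ T_1 + \frac{d}{c}\cos(\theta) \right] = P_0 \left( T_1 + \frac{d}{c} \right) \left[ \alpha_1 + (1-\alpha_1)\cos(\theta) \right], \qquad \alpha_1 = \frac{T_1}{T_1 + d/c}$$
Beamformer response: $\alpha_1 + (1-\alpha_1)\cos(\theta)$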
Classical First-Order Beamformer Responses
[Figure: polar responses of the three patterns]
• α_1 = 0.5: Cardioid
• α_1 = 0.25: Hypercardioid
• α_1 = 0.0: Dipole
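The three classical patterns all follow from the first-order response α_1 + (1 - α_1) cos(θ) with α_1 = T_1 / (T_1 + d/c). The sketch below computes α_1 from the delay, the null direction, and the directivity index. The DI expression assumes a spherically isotropic noise field, the speed of sound is taken as 343 m/s, and all function names are illustrative.

```python
import math


def alpha_from_delay(t1_s, d_m, c_m_s=343.0):
    """alpha_1 = T_1 / (T_1 + d/c) for microphone spacing d and delay T_1."""
    return t1_s / (t1_s + d_m / c_m_s)


def response(alpha, theta_rad):
    """First-order beamformer response alpha + (1 - alpha) * cos(theta)."""
    return alpha + (1.0 - alpha) * math.cos(theta_rad)


def null_angle_deg(alpha):
    """Angle of the response null in degrees, or None if there is no null."""
    if alpha >= 1.0:
        return None
    x = -alpha / (1.0 - alpha)
    return math.degrees(math.acos(x)) if x >= -1.0 else None


def directivity_index_db(alpha):
    """DI in a spherically isotropic field: Q = 1 / (a^2 + (1 - a)^2 / 3)."""
    q = 1.0 / (alpha ** 2 + (1.0 - alpha) ** 2 / 3.0)
    return 10.0 * math.log10(q)


for name, a in (("cardioid", 0.5), ("hypercardioid", 0.25), ("dipole", 0.0)):
    print(name, round(null_angle_deg(a), 1), round(directivity_index_db(a), 2))
```

Setting the delay equal to the acoustic travel time d/c gives α_1 = 0.5, the cardioid, and the hypercardioid reaches the 6 dB directivity index listed for first order in the critical distance table.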
Beamforming Demo: DEWIND processing


Editor's Notes

  1. Before moving on to processing: Microphone model – 1938 Western Electric 630A. Baffle makes it look like an omni