Learning How Long to Wait: Adaptively-Constrained Monotonic Multihead Attention for Streaming ASR
Jaeyun Song, Hajin Shim, Eunho Yang
ASRU 2021
Machine Learning & Intelligence Laboratory
Motivation
● Monotonic Multihead Attention (MMA) performs comparably to state-of-the-art online ASR methods, but its latency can still be reduced.
● HeadDrop and Head-Synchronous Beam Search Decoding reduce the latency of MMA, but they leave a gap between the training and testing phases.
● Mutually-Constrained MMA (MCMMA) reduces the latency of MMA with a fixed waiting time threshold, but the optimal threshold may differ from one input sequence to another.
In this work,
● We propose Adaptively-Constrained MMA (ACMMA), which assigns an adequate waiting time threshold to decrease latency without a performance drop.
● We reduce latency while also improving on the performance of MCMMA on LibriSpeech 100-hour and AISHELL-1.
The Overview of Adaptively-Constrained MMA
[Figure: ACMMA architecture diagram. Labels: Encoder states, 1D-Convolution, SAN, FFN, MMA with Threshold Predictor, Previous output tokens, Token embedding, Linear & Softmax, Prediction, block repeat counts ×4 and ×2, Context Update, Memory. Inset with axes Memory h and Head m, marking Right Bound, Activation, and Waiting Threshold ε = 3.]
Threshold Predictor in ACMMA
● The threshold predictor (TP) predicts an appropriate waiting time threshold from a partially observed input sequence. → Non-differentiable
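$$\epsilon_i = \mathrm{TP}\left(\tilde{h}_{i-1}, \hat{s}_i\right)$$
where
$$\tilde{h}_{i-1} = \mathrm{ConCat}\left(\tilde{h}_{i-1}^{1}, \ldots, \tilde{h}_{i-1}^{H}\right) \quad \text{and} \quad \tilde{h}_{i-1}^{m} = \sum_{1 \le j \le T} \tilde{\delta}_{i-1,j}^{m}(\epsilon_{i-1})\, h_j$$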
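The input to TP can be illustrated with a small sketch: each head m forms a context vector as an attention-weighted sum of the memory vectors h_j, and the per-head contexts are concatenated. The function names, the toy memory, and the attention weights below are illustrative assumptions, not values from the slides, and TP itself is not modeled.

```python
def head_context(delta, memory):
    """Weighted sum of memory vectors h_j using one head's attention weights."""
    dim = len(memory[0])
    return [sum(d * h[k] for d, h in zip(delta, memory)) for k in range(dim)]


def concat_heads(contexts):
    """Concatenate per-head context vectors into a single flat vector."""
    return [x for context in contexts for x in context]


# Toy memory with T = 3 encoder states of dimension 2 (made-up values).
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Attention distributions for H = 2 heads at the previous output step (made-up values).
deltas = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]

contexts = [head_context(delta, memory) for delta in deltas]
tp_input = concat_heads(contexts)
```

Here `tp_input` has length H times the memory dimension; in the slide's notation it would be passed to TP together with the decoder-side input.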
● We compute the attention distribution via linear interpolation.
● In the testing phase, we choose the nearest integer as the predicted threshold.
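$$\tilde{\delta}_{i,j}^{m}(\epsilon_i) = \left(\lfloor \epsilon_i \rfloor + 1 - \epsilon_i\right) \tilde{\delta}_{i,j}^{m}\left(\lfloor \epsilon_i \rfloor\right) + \left(\epsilon_i - \lfloor \epsilon_i \rfloor\right) \tilde{\delta}_{i,j}^{m}\left(\lfloor \epsilon_i \rfloor + 1\right)$$
where $\tilde{\delta}_{i,j}^{m}(\lfloor \epsilon_i \rfloor)$ and $\tilde{\delta}_{i,j}^{m}(\lfloor \epsilon_i \rfloor + 1)$ are attention distributions calculated by MCMMA.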
Threshold Regularization in ACMMA
● To induce TP to predict a low waiting time threshold, we introduce threshold regularization (TR).
● TR is computed by averaging the predicted thresholds and is weighted by $\lambda_{\mathrm{TR}}$.
● We train TP in two ways: end-to-end, or by fine-tuning from a pretrained MCMMA.
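$$\mathcal{L}_{\mathrm{TR}} = \frac{1}{LS} \sum_{1 \le l \le L} \sum_{1 \le i \le S} \epsilon_i^{(l)}$$
$$\mathcal{L} = \left(1 - \lambda_{\mathrm{ctc}}\right) \mathcal{L}_{\mathrm{s2s}} + \lambda_{\mathrm{ctc}} \mathcal{L}_{\mathrm{ctc}} + \lambda_{\mathrm{TR}} \mathcal{L}_{\mathrm{TR}}$$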
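As a rough illustration of how the regularizer enters the objective, the sketch below averages a grid of predicted thresholds and adds the result, scaled by a TR weight, to a weighted mix of two base losses. The names `loss_s2s` and `loss_ctc` reflect an assumed reading of the loss subscripts, and all numeric values are invented for the example.

```python
def threshold_regularization(eps):
    """Mean of predicted thresholds eps[l][i] over both indices (L rows, S columns)."""
    num_rows = len(eps)
    num_cols = len(eps[0])
    return sum(sum(row) for row in eps) / (num_rows * num_cols)


def total_loss(loss_s2s, loss_ctc, eps, lambda_ctc, lambda_tr):
    """Weighted base losses plus the threshold regularization term."""
    base = (1.0 - lambda_ctc) * loss_s2s + lambda_ctc * loss_ctc
    return base + lambda_tr * threshold_regularization(eps)


# Made-up predicted thresholds with L = 2 and S = 3.
eps = [[2.0, 3.0, 1.0], [4.0, 2.0, 0.0]]

l_tr = threshold_regularization(eps)
loss = total_loss(loss_s2s=1.5, loss_ctc=2.5, eps=eps, lambda_ctc=0.3, lambda_tr=0.1)
```

A larger TR weight pushes TP toward smaller thresholds, and therefore lower latency, at the cost of giving the recognition losses relatively less influence.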
Trade-offs between Performance and Latency
● Our approach outperforms MMA, HeadDrop, and MCMMA at a latency comparable to that of HeadDrop.

J. Song et al., ASRU 2021, MLILAB, KAIST AI
