SlideShare a Scribd company logo
Xavier Giro-i-Nieto
Associate Professor
Universitat Politecnica de Catalunya
@DocXavi
xavier.giro@upc.edu
Attention Mechanisms
Lecture 14
[course site]
2
Acknowledgments
Amaia Salvador
Doctor
Universitat Politècnica de Catalunya
Marta R. Costa-jussà
Associate Professor
Universitat Politècnica de Catalunya
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
3
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
4
Motivation: Vision
5
Image:
H x W x 3
bird
The whole input volume is used to predict the output...
Motivation: Vision
6
Image:
H x W x 3
bird
...despite the fact that not all pixels are equally important
The whole input volume is used to predict the output...
Motivation: Automatic Speech Recognition (ASR)
7
Figure: distill.pub
Chan, W., Jaitly, N., Le, Q., & Vinyals, O. Listen, attend and spell: A neural network for large vocabulary conversational
speech recognition. ICASSP 2016.
Transcribed letters correspond with just a few phonemes.
Motivation: Neural Machine Translation (NMT)
8
Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015)
The translated words are often related to a subset of the words from the source
sentence.
(Edge thicknesses represent the attention weights found by the attention model)
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
9
Seq2Seq with RNN
10
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." NIPS 2014.
The Seq2Seq scheme can be implemented with RNNs:
● trigger the output generation with an input <go> symbol.
● the predicted word at timestep t, becomes the input at t+1.
11
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." NIPS 2014.
Seq2Seq with RNN
Seq2Seq for NMT
12
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
13
Seq2Seq for NMT
How does the length of the source sentence affect the size (d) of its encoding ?
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
14
How does the length of the input sentence affect the size (d) of its encoded
representation ?
Ray Mooney, ACL 2014 Workshop on Semantic Parsing.
Seq2Seq for NMT
"You can't cram the meaning of
a whole %&!$# sentence into a
single $&!#* vector!"
15
Seq2Seq for NMT: Information bottleneck
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
16
Query, Keys & Values
Databases store information as pair of keys and values (K,V).
17
Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020)
Example:
<Key> <Value>
myfile.pdf
The (K,Q,V) terminology used to retrieve information from databases is adopted to
formulate attention.
18
Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020)
Query, Keys & Values
Attention is a mechanism to compute a context vector (c) for a query (Q) as a
weighted sum of values (V).
19
Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020)
Query, Keys & Values
20
Key (K) / Values (V) Query (Q)
Usually, both keys and values correspond to the encoder states.
Query, Keys & Values
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
21
22
Chis Olah & Shan Cate, “Attention and Augmented Recurrent Neural Networks” distill.pub 2016.
Seq2Seq + Attention
23
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
24
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
25
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
26
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
27
Seq2Seq + Attention: NMT
Softmax
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
28
Seq2Seq + Attention: NMT
29
Seq2Seq + Attention: NMT
Context (c)
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
30
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
31
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
32
Seq2Seq + Attention: NMT
Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
33Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and
translate." ICLR 2015.
Seq2Seq + Attention: NMT
34
Chis Olah & Shan Cate, “Attention and Augmented Recurrent Neural Networks” distill.pub 2016.
Seq2Seq + Attention
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
35
36
Slide: Marta R. Costa-jussà
Attention Mechanisms
How to compute
these attention
scores ?
● Additive attention
● Multiplicative
● Scaled dot product
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
○ Additive
37
38
Attention Mechanisms: Additive Attention
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and
translate." ICLR 2015.
The first attention mechanism used a one-hidden layer feed-forward network to
calculate the attention alignment:
39
Attention Mechanisms: Additive Attention
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and
translate." ICLR 2015.
hj
zi
Attention
weight
(aj
)
Key
(k)
Query
(q)
The first attention mechanism used a one-hidden layer feed-forward network to
calculate the attention alignment:
40
Attention Mechanisms: Additive Attention
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and
translate." ICLR 2015.
hj
zi
Attention
weight
(aj
)
The original attention mechanism uses a one-hidden layer feed-forward network
to calculate the attention alignment:
a(q,k) = va
T
tanh(W1
q + W2
k)
a(q,k) = va
T
tanh(Wa
[q;k])
It is often expressed as two separate
linear transformations W1
and W2
for zi
and hi
, respectively:
concatenation
Key
(k)
Query
(q)
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
○ Multiplicative
41
42
Attention Mechanisms: Multiplicative Attention
Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation.
EMNLP 2015
A computational efficient approach is just learning matrix multplication:
a(q,k) = qT
W k
=
43
Attention Mechanisms: Multiplicative Attention
Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation.
EMNLP 2015
If the dimension of q & k are equal (eg. in seq2seq, by learning a linear
projection), an even more computational efficient approach is a dot product
between query and key:
a(q,k) = qT
k
=
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
○ Scaled dot product
44
45
Attention Mechanisms: Scaled dot product
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I.. Attention is all you need.
NeurIPS 2017.
Multiplicative attention performs worse than additive attention for large
dimensionality of the decoder state (dk
) unless a scaling factor is added:
a(q,k) = ( qT
k ) / √ dk
=
46
Attention Mechanisms: Example
Key (K) / Values (V) Queries (Q)
Softmax
dh
Q KT =
=
C
VC At
Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
Attention Mechanisms: Example
Figure: Ilharco, Ilharco, Turc, Dettmers, Ferreira, Lee, “High Performance Natural Language Processing”. EMNLP 2020.
Outline
1. Motivation
2. Seq2Seq
3. Key, Query and Value
4. Seq2Seq + Attention
5. Attention mechanisms
6. Study case: Image captioning
48
Image Captioning without Attention
LSTMLSTM LSTM
CNN LSTM
A bird flying
...
<EOS>
Features:
D
49
...
Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2015). Show and tell: A neural image caption generator. CVPR 2015.
Limitation: All output predictions are based on the final and static output
of the encoder
CNN
Image:
H x W x 3
50
Image Captioning with Attention
Slide: Amaia Salvador
Attention for Image Captioning
CNN
Image:
H x W x 3
L x
visual
embeddings
h0
51
c0
The first context
vector is the
average of the L
embeddings
a1
L x Attention weights
y1
Predicted word
y0
First word (<start> token)
Slide: Amaia Salvador
Attention for Image Captioning
CNN
Image:
H x W x 3
h0
y1c1
Visual features weighted
with attention give the next
context
h1
a2 y2
52
a1 y1
c0 y0
Predicted word in
previous timestep
Slide: Amaia Salvador
Attention for Image Captioning
CNN
Image:
H x W x 3
h0
c1 y1
h1
a2 y2
h2
a3 y3
c2 y2
53
a1 y1
c0 y0
Slide: Amaia Salvador
Attention for Image Captioning
54Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua
Bengio. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." ICML 2015
Attention for Image Captioning
55Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua
Bengio. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." ICML 2015
56
Side-note: attention can be computed with previous or current hidden state
CNN
Image:
H x W x 3
h1
v y1
h2 h3
v y2
a1
y1
v y0average
c1
a2
y2
c2
a3
y3
c3
Attention for Image Captioning
57
Exercise
Given a query vector and two keys:
● q = [0.3, 0.2, 0.1]
● k1
= [0.1, 0.3, 0.1]
● k2
= [0.6, 0.4, 0.2]
a) What are the attention weights a1 and a2 computing the dot product ?
b) Considering dh
=3, what are the attention weights 1&2 when computing the
scaled dot product ?
c) Which key vector will receive more attention ?
58
Learn more
Alex Graves, UCL x Deepmind 2020 [slides]
59
Learn more
● Tutorials
○ Sebastian Ruder, Deep Learning for NLP Best Practices # Attention (2017).
○ Chris Olah, Shan Carter, “Attention and Augmented Recurrent Neural Networks”. Distill 2016.
○ Lilian Weng, “Attention? Attention!”. Lil’Log 2017.
● Demos
○ François Fleuret (EPFL)
● Scientific publications
○ Katharopoulos, A., Vyas, A., Pappas, N., & Fleuret, F. (2020). Transformers are rnns: Fast autoregressive
transformers with linear attention. ICML 2020.
○ Siddhant M. Jayakumar, Wojciech M. Czarnecki, Jacob Menick, Jonathan Schwarz, Jack Rae, Simon Osindero,
Yee Whye Teh, Tim Harley, Razvan Pascanu, “Multiplicative Interactions and Where to Find Them”. ICLR 2020.
[tweet]
○ Woo, Sanghyun, Jongchan Park, Joon-Young Lee, and In So Kweon. "Cbam: Convolutional block attention
module." ECCV 2018.
○ Lu, Xiankai, Wenguan Wang, Chao Ma, Jianbing Shen, Ling Shao, and Fatih Porikli. "See More, Know More:
Unsupervised Video Object Segmentation with Co-Attention Siamese Networks." CVPR 2019.
○ Wang, Ziqin, Jun Xu, Li Liu, Fan Zhu, and Ling Shao. "Ranet: Ranking attention network for fast video object
segmentation." ICCV 2019.
60
Questions ?

More Related Content

What's hot

Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Universitat Politècnica de Catalunya
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
Universitat Politècnica de Catalunya
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
Universitat Politècnica de Catalunya
 
Multimodal Deep Learning
Multimodal Deep LearningMultimodal Deep Learning
Multimodal Deep Learning
Universitat Politècnica de Catalunya
 
Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)
Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
One Perceptron to Rule them All: Deep Learning for Multimedia #A2IC2018
One Perceptron  to Rule them All: Deep Learning for Multimedia #A2IC2018One Perceptron  to Rule them All: Deep Learning for Multimedia #A2IC2018
One Perceptron to Rule them All: Deep Learning for Multimedia #A2IC2018
Universitat Politècnica de Catalunya
 
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Deep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningDeep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal Learning
Marc Bolaños Solà
 
Neural Architectures for Video Encoding
Neural Architectures for Video EncodingNeural Architectures for Video Encoding
Neural Architectures for Video Encoding
Universitat Politècnica de Catalunya
 
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
Universitat Politècnica de Catalunya
 
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Universitat Politècnica de Catalunya
 
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC BarcelonaSelf-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Universitat Politècnica de Catalunya
 
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Universitat Politècnica de Catalunya
 
Backpropagation for Deep Learning
Backpropagation for Deep LearningBackpropagation for Deep Learning
Backpropagation for Deep Learning
Universitat Politècnica de Catalunya
 
Backpropagation for Neural Networks
Backpropagation for Neural NetworksBackpropagation for Neural Networks
Backpropagation for Neural Networks
Universitat Politècnica de Catalunya
 
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaDeep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Universitat Politècnica de Catalunya
 
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Universitat Politècnica de Catalunya
 
[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events
Taegyun Jeon
 

What's hot (20)

Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Multimodal Deep Learning
Multimodal Deep LearningMultimodal Deep Learning
Multimodal Deep Learning
 
Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)Deep Learning Representations for All (a.ka. the AI hype)
Deep Learning Representations for All (a.ka. the AI hype)
 
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
 
One Perceptron to Rule them All: Deep Learning for Multimedia #A2IC2018
One Perceptron  to Rule them All: Deep Learning for Multimedia #A2IC2018One Perceptron  to Rule them All: Deep Learning for Multimedia #A2IC2018
One Perceptron to Rule them All: Deep Learning for Multimedia #A2IC2018
 
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
The Perceptron - Xavier Giro-i-Nieto - UPC Barcelona 2018
 
Deep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal LearningDeep Neural Networks for Multimodal Learning
Deep Neural Networks for Multimodal Learning
 
Neural Architectures for Video Encoding
Neural Architectures for Video EncodingNeural Architectures for Video Encoding
Neural Architectures for Video Encoding
 
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
Unsupervised Learning (DLAI D9L1 2017 UPC Deep Learning for Artificial Intell...
 
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
Self-supervised Learning from Video Sequences - Xavier Giro - UPC Barcelona 2019
 
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC BarcelonaSelf-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
Self-supervised Visual Learning 2020 - Xavier Giro-i-Nieto - UPC Barcelona
 
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
Deep Learning Architectures for Video - Xavier Giro - UPC Barcelona 2019
 
Backpropagation for Deep Learning
Backpropagation for Deep LearningBackpropagation for Deep Learning
Backpropagation for Deep Learning
 
Backpropagation for Neural Networks
Backpropagation for Neural NetworksBackpropagation for Neural Networks
Backpropagation for Neural Networks
 
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN BarcelonaDeep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
Deep Video Object Tracking 2020 - Xavier Giro - UPC TelecomBCN Barcelona
 
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
Language and Vision (D3L5 2017 UPC Deep Learning for Computer Vision)
 
[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events[PR12] PR-036 Learning to Remember Rare Events
[PR12] PR-036 Learning to Remember Rare Events
 

Similar to Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020

Multimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QAMultimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QA
Jin-Hwa Kim
 
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked DataDedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Vrije Universiteit Amsterdam
 
Open-ended Visual Question-Answering
Open-ended  Visual Question-AnsweringOpen-ended  Visual Question-Answering
Open-ended Visual Question-Answering
Universitat Politècnica de Catalunya
 
Fcv learn yu
Fcv learn yuFcv learn yu
Fcv learn yuzukun
 
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Universitat Politècnica de Catalunya
 
Scalable Dynamic Graph Summarization
Scalable Dynamic Graph SummarizationScalable Dynamic Graph Summarization
Scalable Dynamic Graph Summarization
Ioanna Tsalouchidou
 
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Universitat Politècnica de Catalunya
 
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
Tomoyuki Suzuki
 
Machine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptMachine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.ppt
Anshika865276
 
More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?
Dhafer Malouche
 
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron modelsJAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
hirokazutanaka
 
Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)
Universitat Politècnica de Catalunya
 
Mit15 082 jf10_lec01
Mit15 082 jf10_lec01Mit15 082 jf10_lec01
Mit15 082 jf10_lec01Saad Liaqat
 
Recurrent Neuronal Network tailored for Weather Radar Nowcasting
Recurrent Neuronal Network tailored for Weather Radar NowcastingRecurrent Neuronal Network tailored for Weather Radar Nowcasting
Recurrent Neuronal Network tailored for Weather Radar Nowcasting
Andreas Scheidegger
 
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
Universitat Politècnica de Catalunya
 
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
MLAI2
 
Data Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLPData Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLP
Data Con LA
 
nnml.ppt
nnml.pptnnml.ppt
nnml.ppt
yang947066
 
Multimodal Learning Analytics
Multimodal Learning AnalyticsMultimodal Learning Analytics
Multimodal Learning Analytics
Xavier Ochoa
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Feynman Liang
 

Similar to Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020 (20)

Multimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QAMultimodal Residual Networks for Visual QA
Multimodal Residual Networks for Visual QA
 
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked DataDedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
 
Open-ended Visual Question-Answering
Open-ended  Visual Question-AnsweringOpen-ended  Visual Question-Answering
Open-ended Visual Question-Answering
 
Fcv learn yu
Fcv learn yuFcv learn yu
Fcv learn yu
 
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)
 
Scalable Dynamic Graph Summarization
Scalable Dynamic Graph SummarizationScalable Dynamic Graph Summarization
Scalable Dynamic Graph Summarization
 
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
Deep Reinforcement Learning: MDP & DQN - Xavier Giro-i-Nieto - UPC Barcelona ...
 
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transforma...
 
Machine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.pptMachine Learning and Artificial Neural Networks.ppt
Machine Learning and Artificial Neural Networks.ppt
 
More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?More investment in Research and Development for better Education in the future?
More investment in Research and Development for better Education in the future?
 
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron modelsJAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
JAISTサマースクール2016「脳を知るための理論」講義01 Single neuron models
 
Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)Deep Learning for Computer Vision: Attention Models (UPC 2016)
Deep Learning for Computer Vision: Attention Models (UPC 2016)
 
Mit15 082 jf10_lec01
Mit15 082 jf10_lec01Mit15 082 jf10_lec01
Mit15 082 jf10_lec01
 
Recurrent Neuronal Network tailored for Weather Radar Nowcasting
Recurrent Neuronal Network tailored for Weather Radar NowcastingRecurrent Neuronal Network tailored for Weather Radar Nowcasting
Recurrent Neuronal Network tailored for Weather Radar Nowcasting
 
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
Deep Learning for Computer Vision (1/4): Image Analytics @ laSalle 2016
 
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
Generating Diverse and Consistent QA pairs from Contexts with Information-Max...
 
Data Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLPData Con LA 2022 - Transformers for NLP
Data Con LA 2022 - Transformers for NLP
 
nnml.ppt
nnml.pptnnml.ppt
nnml.ppt
 
Multimodal Learning Analytics
Multimodal Learning AnalyticsMultimodal Learning Analytics
Multimodal Learning Analytics
 
Accelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference CompilationAccelerating Metropolis Hastings with Lightweight Inference Compilation
Accelerating Metropolis Hastings with Lightweight Inference Compilation
 

More from Universitat Politècnica de Catalunya

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Universitat Politècnica de Catalunya
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
Universitat Politècnica de Catalunya
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Universitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Universitat Politècnica de Catalunya
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Universitat Politècnica de Catalunya
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Universitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Universitat Politècnica de Catalunya
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
Universitat Politècnica de Catalunya
 
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Universitat Politècnica de Catalunya
 
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Universitat Politècnica de Catalunya
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Universitat Politècnica de Catalunya
 
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic ModerationHate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Universitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (16)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and...
 
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
Object Detection with Deep Learning - Xavier Giro-i-Nieto - UPC School Barcel...
 
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
Self-supervised Audiovisual Learning 2020 - Xavier Giro-i-Nieto - UPC Telecom...
 
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic ModerationHate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
 

Recently uploaded

做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
alex933524
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 

Recently uploaded (20)

做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 

Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020

  • 1. Xavier Giro-i-Nieto Associate Professor Universitat Politecnica de Catalunya @DocXavi xavier.giro@upc.edu Attention Mechanisms Lecture 14 [course site]
  • 2. 2 Acknowledgments Amaia Salvador Doctor Universitat Politècnica de Catalunya Marta R. Costa-jussà Associate Professor Universitat Politècnica de Catalunya
  • 3. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 3
  • 4. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 4
  • 5. Motivation: Vision 5 Image: H x W x 3 bird The whole input volume is used to predict the output...
  • 6. Motivation: Vision 6 Image: H x W x 3 bird ...despite the fact that not all pixels are equally important The whole input volume is used to predict the output...
  • 7. Motivation: Automatic Speech Recognition (ASR) 7 Figure: distill.pub Chan, W., Jaitly, N., Le, Q., & Vinyals, O. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition. ICASSP 2016. Transcribed letters correspond with just a few phonemes.
  • 8. Motivation: Neural Machine Translation (NMT) 8 Kyunghyun Cho, “Introduction to Neural Machine Translation with GPUs” (2015) The translated words are often related to a subset of the words from the source sentence. (Edge thicknesses represent the attention weights found by the attention model)
  • 9. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 9
  • 10. Seq2Seq with RNN 10 Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." NIPS 2014. The Seq2Seq scheme can be implemented with RNNs: ● trigger the output generation with an input <go> symbol. ● the predicted word at timestep t, becomes the input at t+1.
  • 11. 11 Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." NIPS 2014. Seq2Seq with RNN
  • 12. Seq2Seq for NMT 12 Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 13. 13 Seq2Seq for NMT How does the length of the source sentence affect the size (d) of its encoding ? Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 14. 14 How does the length of the input sentence affect the size (d) of its encoded representation ? Ray Mooney, ACL 2014 Workshop on Semantic Parsing. Seq2Seq for NMT "You can't cram the meaning of a whole %&!$# sentence into a single $&!#* vector!"
  • 15. 15 Seq2Seq for NMT: Information bottleneck Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 16. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 16
  • 17. Query, Keys & Values Databases store information as pair of keys and values (K,V). 17 Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020) Example: <Key> <Value> myfile.pdf
  • 18. The (K,Q,V) terminology used to retrieve information from databases is adopted to formulate attention. 18 Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020) Query, Keys & Values
  • 19. Attention is a mechanism to compute a context vector (c) for a query (Q) as a weighted sum of values (V). 19 Figure: Nikhil Shah, “Attention? An Other Perspective! [Part 1]” (2020) Query, Keys & Values
  • 20. 20 Key (K) / Values (V) Query (Q) Usually, both keys and values correspond to the encoder states. Query, Keys & Values Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 21. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 21
  • 22. 22 Chis Olah & Shan Cate, “Attention and Augmented Recurrent Neural Networks” distill.pub 2016. Seq2Seq + Attention
  • 23. 23 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 24. 24 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 25. 25 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 26. 26 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 27. 27 Seq2Seq + Attention: NMT Softmax Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 29. 29 Seq2Seq + Attention: NMT Context (c) Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 30. 30 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 31. 31 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 32. 32 Seq2Seq + Attention: NMT Slide: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 33. 33Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015. Seq2Seq + Attention: NMT
  • 34. 34 Chis Olah & Shan Cate, “Attention and Augmented Recurrent Neural Networks” distill.pub 2016. Seq2Seq + Attention
  • 35. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 35
  • 36. 36 Slide: Marta R. Costa-jussà Attention Mechanisms How to compute these attention scores ? ● Additive attention ● Multiplicative ● Scaled dot product
  • 37. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms ○ Additive 37
  • 38. 38 Attention Mechanisms: Additive Attention Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015. The first attention mechanism used a one-hidden layer feed-forward network to calculate the attention alignment:
  • 39. 39 Attention Mechanisms: Additive Attention Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015. hj zi Attention weight (aj ) Key (k) Query (q) The first attention mechanism used a one-hidden layer feed-forward network to calculate the attention alignment:
  • 40. 40 Attention Mechanisms: Additive Attention Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015. hj zi Attention weight (aj ) The original attention mechanism uses a one-hidden layer feed-forward network to calculate the attention alignment: a(q,k) = va T tanh(W1 q + W2 k) a(q,k) = va T tanh(Wa [q;k]) It is often expressed as two separate linear transformations W1 and W2 for zi and hi , respectively: concatenation Key (k) Query (q)
  • 41. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms ○ Multiplicative 41
  • 42. 42 Attention Mechanisms: Multiplicative Attention Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation. EMNLP 2015 A computational efficient approach is just learning matrix multplication: a(q,k) = qT W k =
  • 43. 43 Attention Mechanisms: Multiplicative Attention Luong, M.-T., Pham, H., & Manning, C. D. (2015). Effective Approaches to Attention-based Neural Machine Translation. EMNLP 2015 If the dimension of q & k are equal (eg. in seq2seq, by learning a linear projection), an even more computational efficient approach is a dot product between query and key: a(q,k) = qT k =
  • 44. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms ○ Scaled dot product 44
  • 45. 45 Attention Mechanisms: Scaled dot product Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I.. Attention is all you need. NeurIPS 2017. Multiplicative attention performs worse than additive attention for large dimensionality of the decoder state (dk ) unless a scaling factor is added: a(q,k) = ( qT k ) / √ dk =
  • 46. 46 Attention Mechanisms: Example Key (K) / Values (V) Queries (Q) Softmax dh Q KT = = C VC At Slide concept: Abigail See, Matthew Lamm (Stanford CS224N), 2020
  • 47. Attention Mechanisms: Example Figure: Ilharco, Ilharco, Turc, Dettmers, Ferreira, Lee, “High Performance Natural Language Processing”. EMNLP 2020.
  • 48. Outline 1. Motivation 2. Seq2Seq 3. Key, Query and Value 4. Seq2Seq + Attention 5. Attention mechanisms 6. Study case: Image captioning 48
  • 49. Image Captioning without Attention LSTMLSTM LSTM CNN LSTM A bird flying ... <EOS> Features: D 49 ... Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2015). Show and tell: A neural image caption generator. CVPR 2015. Limitation: All output predictions are based on the final and static output of the encoder
  • 50. CNN Image: H x W x 3 50 Image Captioning with Attention Slide: Amaia Salvador
  • 51. Attention for Image Captioning CNN Image: H x W x 3 L x visual embeddings h0 51 c0 The first context vector is the average of the L embeddings a1 L x Attention weights y1 Predicted word y0 First word (<start> token) Slide: Amaia Salvador
  • 52. Attention for Image Captioning CNN Image: H x W x 3 h0 y1c1 Visual features weighted with attention give the next context h1 a2 y2 52 a1 y1 c0 y0 Predicted word in previous timestep Slide: Amaia Salvador
  • 53. Attention for Image Captioning CNN Image: H x W x 3 h0 c1 y1 h1 a2 y2 h2 a3 y3 c2 y2 53 a1 y1 c0 y0 Slide: Amaia Salvador
  • 54. Attention for Image Captioning 54Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." ICML 2015
  • 55. Attention for Image Captioning 55Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio. "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention." ICML 2015
  • 56. 56 Side-note: attention can be computed with previous or current hidden state CNN Image: H x W x 3 h1 v y1 h2 h3 v y2 a1 y1 v y0average c1 a2 y2 c2 a3 y3 c3 Attention for Image Captioning
  • 57. 57 Exercise Given a query vector and two keys: ● q = [0.3, 0.2, 0.1] ● k1 = [0.1, 0.3, 0.1] ● k2 = [0.6, 0.4, 0.2] a) What are the attention weights a1 and a2 computing the dot product ? b) Considering dh =3, what are the attention weights 1&2 when computing the scaled dot product ? c) Which key vector will receive more attention ?
  • 58. 58 Learn more Alex Graves, UCL x Deepmind 2020 [slides]
  • 59. 59 Learn more ● Tutorials ○ Sebastian Ruder, Deep Learning for NLP Best Practices # Attention (2017). ○ Chris Olah, Shan Carter, “Attention and Augmented Recurrent Neural Networks”. Distill 2016. ○ Lilian Weng, “Attention? Attention!”. Lil’Log 2017. ● Demos ○ François Fleuret (EPFL) ● Scientific publications ○ Katharopoulos, A., Vyas, A., Pappas, N., & Fleuret, F. (2020). Transformers are rnns: Fast autoregressive transformers with linear attention. ICML 2020. ○ Siddhant M. Jayakumar, Wojciech M. Czarnecki, Jacob Menick, Jonathan Schwarz, Jack Rae, Simon Osindero, Yee Whye Teh, Tim Harley, Razvan Pascanu, “Multiplicative Interactions and Where to Find Them”. ICLR 2020. [tweet] ○ Woo, Sanghyun, Jongchan Park, Joon-Young Lee, and In So Kweon. "Cbam: Convolutional block attention module." ECCV 2018. ○ Lu, Xiankai, Wenguan Wang, Chao Ma, Jianbing Shen, Ling Shao, and Fatih Porikli. "See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks." CVPR 2019. ○ Wang, Ziqin, Jun Xu, Li Liu, Fan Zhu, and Ling Shao. "Ranet: Ranking attention network for fast video object segmentation." ICCV 2019.