Attention is all you need
Whi Kwon
About me
2008 ~ 2015: Chemical and Biological Engineering
2015 ~ 2017: Quality / Customer Support Engineer
2017 ~ 2018: Self-directed study of deep learning
2018 ~: Medical startup
Interests
~2017.12: Vision, NLP
~2018.06: RL, GAN
2018.06~: Relational, Imitation
Outline
Part 1: Attention
Part 2: Self-Attention
Part 1. Attention
Attention, also referred to as enthrallment, is the behavioral and cognitive process
of selectively concentrating on a discrete aspect of information, whether deemed
subjective or objective, while ignoring other perceivable information. It is a state of
arousal. It is the taking possession by the mind in clear and vivid form of one out
of what seem several simultaneous objects or trains of thought. Focalization, the
concentration of consciousness, is of its essence. Attention, or enthrallment, has
also been described as the allocation of limited cognitive processing resources.
Recurrent Neural Network
[Diagram: an RNN reading the passage one word at a time ("attention", "also", "referred", ..., "resources")]
Problems: non-parallel computation, limited long-range dependencies
Convolutional Neural Network
[Diagram: a convolution filter sliding over the passage, seeing only a local window of words at each step]
Problems: limited long-range dependencies, computationally inefficient
Attention mechanism
Parallel computation, long-range dependencies, explainable
Attention mechanism
Fig. from Vaswani et al. Attention is all you need. ArXiv. 2017
1. Compute the similarity between Q and K.
2. Normalize so that excessively large values do not dominate.
3. Turn the similarities into weights (summing to 1).
4. Multiply the weights by V.
Each piece of information {K: V} will be related to some query Q. We compute the
similarity between Q and K and apply it to V, so that more of the information in V
that is directly related to Q is passed on.
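The four steps above amount to scaled dot-product attention; here is a minimal NumPy sketch (the matrix shapes in the example are illustrative assumptions):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # 1-2. Q-K similarity, scaled
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # 3. softmax: each row sums to 1
    return weights @ V, weights                     # 4. weighted sum of the values

Q = np.random.default_rng(0).normal(size=(2, 4))    # 2 queries, d_k = 4
K = np.random.default_rng(1).normal(size=(3, 4))    # 3 keys
V = np.random.default_rng(2).normal(size=(3, 5))    # 3 values, d_v = 5
out, w = scaled_dot_product_attention(Q, K, V)      # out: (2, 5), w: (2, 3)
```

Each row of `w` is one query's weight distribution over the three key-value pairs, so the output mixes more of the values whose keys resemble that query.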
e.g. Attention mechanism with Seq2Seq
(Machine translation, Encoder-Decoder, Attention)
[Diagram: encoder and decoder RNNs]
The encoder's state at each timestep depends on the previous hidden state and the
current input. Only the encoder's final state is passed to the decoder, and the
decoder's state depends only on the previous timestep.
e.g. Attention mechanism with Seq2Seq
(Machine translation, Encoder-Decoder, Attention)
Fig from Bahdanau et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. 2015
[Diagram: attention connects every encoder state to each decoder step, providing a long-range dependency]
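Concretely, Bahdanau-style attention scores every encoder state against the previous decoder state and feeds the weighted sum (the context vector) into the decoder. A minimal sketch of one decoder step, assuming the additive scoring form from the paper (variable names and shapes are illustrative):

```python
import numpy as np

def additive_attention(s_prev, enc_states, W_s, W_h, v):
    """One decoder step of Bahdanau attention.
    s_prev: (d_s,) previous decoder state; enc_states: (T, d_h) encoder states."""
    energies = np.tanh(s_prev @ W_s + enc_states @ W_h) @ v  # e_t = v^T tanh(...)
    energies -= energies.max()                 # numerical stability
    weights = np.exp(energies)
    weights /= weights.sum()                   # alignment weights over encoder steps
    context = weights @ enc_states             # context vector fed to the decoder
    return context, weights

rng = np.random.default_rng(0)
T, d_s, d_h, d_a = 5, 8, 8, 6                  # 5 encoder steps, attention dim 6
context, w = additive_attention(
    rng.normal(size=d_s), rng.normal(size=(T, d_h)),
    rng.normal(size=(d_s, d_a)), rng.normal(size=(d_h, d_a)),
    rng.normal(size=d_a))
```

Because the decoder can look back at every encoder state through `w`, it no longer relies solely on the encoder's final state.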
e.g. Style-token
(Text to speech, Encoder-Decoder, Style transfer, Attention)
Fig. from Wang et al. Style-tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ArXiv. 2018
[Diagram: two encoders; one attends over a bank of randomly initialized global style tokens (GST), and the resulting style embedding is combined with the text encoder output and fed to the decoder]
Demo: https://google.github.io/tacotron/publications/global_style_tokens/
Part 2. Self-Attention
Self-attention
[Diagram: a 3×3 feature map passes through a self-attention layer. For each output position i′, similarity weights against every input position (summing to 1) are multiplied with the input values and summed, so every output position aggregates information from all input positions.]
Self-attention
Fig. from Wang et al. Non-local neural networks. ArXiv. 2017.
1. Compute the similarity between pixels i and j.
2. Multiply by the value of pixel j.
3. Normalization term.
The information at positions i and j will be mutually related. We compute the
similarity between every pair of positions and apply it as weights, so the model
can learn the relation between all pairs of positions (long-range dependency!).
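The three terms above can be sketched as an embedded-Gaussian non-local operation over a feature map, which treats every pixel as a token; this NumPy version is a minimal sketch (the projection matrices and shapes are illustrative assumptions):

```python
import numpy as np

def non_local_block(x, W_theta, W_phi, W_g):
    """y_i = (1 / C(x)) * sum_j f(x_i, x_j) g(x_j), with f = exp(theta_i . phi_j).
    x: (H, W, C) feature map."""
    H, W, C = x.shape
    flat = x.reshape(H * W, C)                    # every pixel becomes a token
    theta, phi, g = flat @ W_theta, flat @ W_phi, flat @ W_g
    scores = theta @ phi.T                        # 1. similarity between pixels i, j
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    f = np.exp(scores)
    weights = f / f.sum(axis=-1, keepdims=True)   # 3. normalization term C(x)
    y = weights @ g                               # 2. weighted sum of values g(x_j)
    return y.reshape(H, W, -1), weights

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 3, 4))                    # 3x3 feature map, 4 channels
y, w = non_local_block(x, rng.normal(size=(4, 2)), rng.normal(size=(4, 2)),
                       rng.normal(size=(4, 4)))
```

Every output pixel is a weighted sum over all nine input pixels, which is exactly the long-range dependency that convolutions with small filters lack.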
e.g. Self-Attention GAN
(Image generation, GAN, Self-attention)
Fig. from Zhang et al. Self-Attention Generative Adversarial Networks. ArXiv. 2018.
[Diagram: Generator — latent z → transposed convolutions interleaved with self-attention layers → generated image x′. Discriminator — image x → convolutions interleaved with self-attention layers → FC → probability.]
Conclusion
Attention: weight the values V by the Q-K similarity, enabling parallel computation, long-range dependencies, and explainability.
Self-Attention: attention applied within a single input, so every position can relate to every other position.
Next...?
Relational Network, Graphical Model...
Reference
- Bahdanau et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. 2015.
- Wang et al. Non-local Neural Networks. ArXiv. 2017.
- Vaswani et al. Attention Is All You Need. ArXiv. 2017.
- Wang et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ArXiv. 2018.
- Zhang et al. Self-Attention Generative Adversarial Networks. ArXiv. 2018.
- Blog post explaining Attention Is All You Need (https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/)
- Video explaining Attention Is All You Need (https://www.youtube.com/watch?v=iDulhoQ2pro)
More Related Content

What's hot

Survey of Attention mechanism & Use in Computer Vision
Survey of Attention mechanism & Use in Computer VisionSurvey of Attention mechanism & Use in Computer Vision
Survey of Attention mechanism & Use in Computer Vision
SwatiNarkhede1
 
ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]
Dongmin Choi
 
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
Universitat Politècnica de Catalunya
 
Transforming deep into transformers – a computer vision approach
Transforming deep into transformers – a computer vision approachTransforming deep into transformers – a computer vision approach
Transforming deep into transformers – a computer vision approach
Ferdin Joe John Joseph PhD
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)
Yuta Niki
 
Survey of Attention mechanism
Survey of Attention mechanismSurvey of Attention mechanism
Survey of Attention mechanism
SwatiNarkhede1
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
Arvind Devaraj
 
PR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic ModelsPR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic Models
Hyeongmin Lee
 
Introduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNIntroduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNN
Hye-min Ahn
 
Transformers
TransformersTransformers
Transformers
Anup Joseph
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
Knoldus Inc.
 
Attention Mechanism in Language Understanding and its Applications
Attention Mechanism in Language Understanding and its ApplicationsAttention Mechanism in Language Understanding and its Applications
Attention Mechanism in Language Understanding and its Applications
Artifacia
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
Kuppusamy P
 
[AIoTLab]attention mechanism.pptx
[AIoTLab]attention mechanism.pptx[AIoTLab]attention mechanism.pptx
[AIoTLab]attention mechanism.pptx
TuCaoMinh2
 
Transformers AI PPT.pptx
Transformers AI PPT.pptxTransformers AI PPT.pptx
Transformers AI PPT.pptx
RahulKumar854607
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
Fellowship at Vodafone FutureLab
 
Attention scores and mechanisms
Attention scores and mechanismsAttention scores and mechanisms
Attention scores and mechanisms
JaeHo Jang
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
Illia Polosukhin
 
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoders
suga93
 
Attention
AttentionAttention
Attention
SEMINARGROOT
 

What's hot (20)

Survey of Attention mechanism & Use in Computer Vision
Survey of Attention mechanism & Use in Computer VisionSurvey of Attention mechanism & Use in Computer Vision
Survey of Attention mechanism & Use in Computer Vision
 
ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]ViT (Vision Transformer) Review [CDM]
ViT (Vision Transformer) Review [CDM]
 
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
Attention is all you need (UPC Reading Group 2018, by Santi Pascual)
 
Transforming deep into transformers – a computer vision approach
Transforming deep into transformers – a computer vision approachTransforming deep into transformers – a computer vision approach
Transforming deep into transformers – a computer vision approach
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)
 
Survey of Attention mechanism
Survey of Attention mechanismSurvey of Attention mechanism
Survey of Attention mechanism
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
 
PR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic ModelsPR-409: Denoising Diffusion Probabilistic Models
PR-409: Denoising Diffusion Probabilistic Models
 
Introduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNNIntroduction For seq2seq(sequence to sequence) and RNN
Introduction For seq2seq(sequence to sequence) and RNN
 
Transformers
TransformersTransformers
Transformers
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
Attention Mechanism in Language Understanding and its Applications
Attention Mechanism in Language Understanding and its ApplicationsAttention Mechanism in Language Understanding and its Applications
Attention Mechanism in Language Understanding and its Applications
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
 
[AIoTLab]attention mechanism.pptx
[AIoTLab]attention mechanism.pptx[AIoTLab]attention mechanism.pptx
[AIoTLab]attention mechanism.pptx
 
Transformers AI PPT.pptx
Transformers AI PPT.pptxTransformers AI PPT.pptx
Transformers AI PPT.pptx
 
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
AlexNet(ImageNet Classification with Deep Convolutional Neural Networks)
 
Attention scores and mechanisms
Attention scores and mechanismsAttention scores and mechanisms
Attention scores and mechanisms
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
 
Conditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN DecodersConditional Image Generation with PixelCNN Decoders
Conditional Image Generation with PixelCNN Decoders
 
Attention
AttentionAttention
Attention
 

Similar to Attention mechanism 소개 자료

Ch. 5 Pt. A Guidelines Manual (November 1, 2018) .docx
Ch. 5 Pt. A    Guidelines Manual (November 1, 2018) .docxCh. 5 Pt. A    Guidelines Manual (November 1, 2018) .docx
Ch. 5 Pt. A Guidelines Manual (November 1, 2018) .docx
bartholomeocoombs
 
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
Nao (Naotsugu) Tsuchiya
 
Paying and capturing attention - A decision/action model for soccer - pt.6
Paying and capturing attention - A decision/action model for soccer - pt.6Paying and capturing attention - A decision/action model for soccer - pt.6
Paying and capturing attention - A decision/action model for soccer - pt.6
Larry Paul
 
Behavioral analysis of cognition
Behavioral analysis of cognitionBehavioral analysis of cognition
Behavioral analysis of cognitionGheraldine Fillaro
 
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Numenta
 
Growing evidence for separate neural mechanisms for attention and consciousne...
Growing evidence for separate neural mechanisms for attention and consciousne...Growing evidence for separate neural mechanisms for attention and consciousne...
Growing evidence for separate neural mechanisms for attention and consciousne...
Nao (Naotsugu) Tsuchiya
 
1810.mid1043.07
1810.mid1043.071810.mid1043.07
1810.mid1043.07
vizualizer
 
Boost your strategic thinking
Boost your strategic thinkingBoost your strategic thinking
Boost your strategic thinking
The BrainLink Group
 
Week 8 : The neural basis of consciousness : consciousness vs. attention
Week 8 : The neural basis of consciousness : consciousness vs. attention Week 8 : The neural basis of consciousness : consciousness vs. attention
Week 8 : The neural basis of consciousness : consciousness vs. attention
Nao (Naotsugu) Tsuchiya
 
Cognitive Science Unit 4
Cognitive Science Unit 4Cognitive Science Unit 4
Cognitive Science Unit 4CSITSansar
 
pending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxpending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptx
kumarkaushal17
 
pending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxpending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptx
kumarkaushal17
 
A Handbook of Cognition for UX Designers
A Handbook of Cognition for UX DesignersA Handbook of Cognition for UX Designers
A Handbook of Cognition for UX DesignersXinLei Guo
 
Can Marketers Get to Grips with the Human Condition?
Can Marketers Get to Grips with the Human Condition?Can Marketers Get to Grips with the Human Condition?
Can Marketers Get to Grips with the Human Condition?
Klaxon
 
UP LBL880 - Article on Systemic Thinking
UP LBL880 - Article on Systemic ThinkingUP LBL880 - Article on Systemic Thinking
UP LBL880 - Article on Systemic ThinkingEducation Moving Up Cc.
 
NeuroscienceLaboratory__03_2016C
NeuroscienceLaboratory__03_2016CNeuroscienceLaboratory__03_2016C
NeuroscienceLaboratory__03_2016CValeria Trezzi
 
Human function and attention ppt
Human function and attention pptHuman function and attention ppt
Human function and attention ppt
Henry Mwanza
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.pptbutest
 
Information Processing and Motor Skill Performance
Information Processing and Motor Skill PerformanceInformation Processing and Motor Skill Performance
Information Processing and Motor Skill Performance
John John
 

Similar to Attention mechanism 소개 자료 (20)

Ch. 5 Pt. A Guidelines Manual (November 1, 2018) .docx
Ch. 5 Pt. A    Guidelines Manual (November 1, 2018) .docxCh. 5 Pt. A    Guidelines Manual (November 1, 2018) .docx
Ch. 5 Pt. A Guidelines Manual (November 1, 2018) .docx
 
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
Week 9 the neural basis of consciousness : dissociation of consciousness &amp...
 
Paying and capturing attention - A decision/action model for soccer - pt.6
Paying and capturing attention - A decision/action model for soccer - pt.6Paying and capturing attention - A decision/action model for soccer - pt.6
Paying and capturing attention - A decision/action model for soccer - pt.6
 
Tvcg.12a
Tvcg.12aTvcg.12a
Tvcg.12a
 
Behavioral analysis of cognition
Behavioral analysis of cognitionBehavioral analysis of cognition
Behavioral analysis of cognition
 
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
Brains@Bay Meetup: The Effect of Sensorimotor Learning on the Learned Represe...
 
Growing evidence for separate neural mechanisms for attention and consciousne...
Growing evidence for separate neural mechanisms for attention and consciousne...Growing evidence for separate neural mechanisms for attention and consciousne...
Growing evidence for separate neural mechanisms for attention and consciousne...
 
1810.mid1043.07
1810.mid1043.071810.mid1043.07
1810.mid1043.07
 
Boost your strategic thinking
Boost your strategic thinkingBoost your strategic thinking
Boost your strategic thinking
 
Week 8 : The neural basis of consciousness : consciousness vs. attention
Week 8 : The neural basis of consciousness : consciousness vs. attention Week 8 : The neural basis of consciousness : consciousness vs. attention
Week 8 : The neural basis of consciousness : consciousness vs. attention
 
Cognitive Science Unit 4
Cognitive Science Unit 4Cognitive Science Unit 4
Cognitive Science Unit 4
 
pending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxpending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptx
 
pending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptxpending-1664760315-2 knowledge based agent student.pptx
pending-1664760315-2 knowledge based agent student.pptx
 
A Handbook of Cognition for UX Designers
A Handbook of Cognition for UX DesignersA Handbook of Cognition for UX Designers
A Handbook of Cognition for UX Designers
 
Can Marketers Get to Grips with the Human Condition?
Can Marketers Get to Grips with the Human Condition?Can Marketers Get to Grips with the Human Condition?
Can Marketers Get to Grips with the Human Condition?
 
UP LBL880 - Article on Systemic Thinking
UP LBL880 - Article on Systemic ThinkingUP LBL880 - Article on Systemic Thinking
UP LBL880 - Article on Systemic Thinking
 
NeuroscienceLaboratory__03_2016C
NeuroscienceLaboratory__03_2016CNeuroscienceLaboratory__03_2016C
NeuroscienceLaboratory__03_2016C
 
Human function and attention ppt
Human function and attention pptHuman function and attention ppt
Human function and attention ppt
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.ppt
 
Information Processing and Motor Skill Performance
Information Processing and Motor Skill PerformanceInformation Processing and Motor Skill Performance
Information Processing and Motor Skill Performance
 

Recently uploaded

一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
mzpolocfi
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
rwarrenll
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Enterprise Wired
 

Recently uploaded (20)

一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTESAdjusting OpenMP PageRank : SHORT REPORT / NOTES
Adjusting OpenMP PageRank : SHORT REPORT / NOTES
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.My burning issue is homelessness K.C.M.O.
My burning issue is homelessness K.C.M.O.
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 

Attention mechanism 소개 자료

  • 1. Attention is all you need Whi Kwon
  • 2. 소개 2008 ~ 2015: 화공생명공학과 2015 ~ 2017: 품질 / 고객지원 엔지니어 2017 ~ 2018: 딥러닝 자유롭게 공부 2018 ~: 의료 분야 스타트업
  • 3. 관심사 ~2017.12: Vision, NLP ~2018.06: RL, GAN 2018.06~: Relational, Imitation
  • 6. Attention, also referred to as enthrallment, is the behavioral and cognitive process of selectively concentrating on a discrete aspect of information, whether deemed subjective or objective, while ignoring other perceivable information. It is a state of arousal. . It is the taking possession by the mind in clear and vivid form of one out of what seem several simultaneous objects or trains of thought. Focalization, the concentration of consciousness, is of its essence. Attention or enthrallment or attention has also been described as the allocation of limited cognitive processing resources.
  • 7. Attention, also referred to as enthrallment, is the behavioral and cognitive process of selectively concentrating on a discrete aspect of information, whether deemed subjective or objective, while ignoring other perceivable information. It is a state of arousal. . It is the taking possession by the mind in clear and vivid form of one out of what seem several simultaneous objects or trains of thought. Focalization, the concentration of consciousness, is of its essence. Attention or enthrallment or attention has also been described as the allocation of limited cognitive processing resources.
  • 8. Recurrent Neural Network: the same definition text, read one word at a time ("attention ... also ... referred ... resources"). Problem: non-parallel computation, no long-range dependencies
  • 9. Convolutional Neural Network: the same definition text, scanned by a filter over local windows ("cognitive process of selectively ... whether deemed"). Problem: no long-range dependencies, computationally inefficient
  • 10. Attention mechanism: the same definition text, with every word able to attend to every other word. Parallel computation, long-range dependencies, explainable
  • 11. Attention mechanism. Fig. from Vaswani et al. Attention is all you need. ArXiv. 2017. 1. Compute the similarity between Q and K.
  • 12. Attention mechanism. Fig. from Vaswani et al. Attention is all you need. ArXiv. 2017. 1. Compute the similarity between Q and K. 2. Normalize so that overly large values do not dominate.
  • 13. Attention mechanism. Fig. from Vaswani et al. Attention is all you need. ArXiv. 2017. 1. Compute the similarity between Q and K. 2. Normalize so that overly large values do not dominate. 3. Similarity → weights (summing to 1).
  • 14. Attention mechanism. Fig. from Vaswani et al. Attention is all you need. ArXiv. 2017. 1. Compute the similarity between Q and K. 2. Normalize so that overly large values do not dominate. 3. Similarity → weights (summing to 1). 4. Multiply the weights by V.
  • 15. Attention mechanism. Fig. from Vaswani et al. Attention is all you need. ArXiv. 2017. The information {K: V} is related to some query Q. Compute the similarity between K and Q and apply it to V; this passes on more of the information in V that is directly relevant to Q. 1. Compute the similarity between Q and K. 2. Normalize so that overly large values do not dominate. 3. Similarity → weights (summing to 1). 4. Multiply the weights by V.
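The four steps above are the scaled dot-product attention of Vaswani et al.; a minimal NumPy sketch (shapes and variable names are illustrative, not from the slides):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Steps 1-4: similarity -> scaling -> softmax weights -> weighted sum of V."""
    d_k = K.shape[-1]
    # 1. Similarity between Q and K (dot product).
    scores = Q @ K.T
    # 2. Scale by sqrt(d_k) so large dot products do not dominate the softmax.
    scores = scores / np.sqrt(d_k)
    # 3. Softmax turns similarities into weights that sum to 1 per query.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # 4. Multiply the weights into V.
    return weights @ V, weights

Q = np.random.randn(2, 4)   # 2 queries of dimension 4
K = np.random.randn(5, 4)   # 5 keys of dimension 4
V = np.random.randn(5, 8)   # 5 values of dimension 8
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)            # (2, 8): one blended value vector per query
```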
  • 16. e.g. Attention mechanism with Seq2Seq (Machine translation, Encoder-Decoder, Attention). ... Encoder Decoder ... Information flow in the decoder depends only on the previous timestep t. The encoder's final state is passed to the decoder. Information flow in the encoder depends on the previous timestep's hidden state and the current timestep's input.
  • 17. e.g. Attention mechanism with Seq2Seq (Machine translation, Encoder-Decoder, Attention): attention connects the decoder to every encoder state, providing a long-range dependency.
  • 18. e.g. Attention mechanism with Seq2Seq Fig from Bahdanau et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. 2015 (Machine translation, Encoder-Decoder, Attention) Attention
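The Bahdanau et al. figure above scores each encoder state against the current decoder state and blends them into a context vector. A minimal sketch of that additive attention; `W_s`, `W_h`, and `v` stand in for the learned parameters and are random here purely for illustration:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(decoder_state, encoder_states, W_s, W_h, v):
    # Score every encoder hidden state against the current decoder state.
    scores = np.array([
        v @ np.tanh(W_s @ decoder_state + W_h @ h) for h in encoder_states
    ])
    weights = softmax(scores)           # attention over source positions
    context = weights @ encoder_states  # weighted sum of encoder states
    return context, weights

d = 4
rng = np.random.default_rng(0)
enc = rng.standard_normal((6, d))       # 6 source timesteps
dec = rng.standard_normal(d)            # current decoder hidden state
W_s = rng.standard_normal((d, d))
W_h = rng.standard_normal((d, d))
v = rng.standard_normal(d)
ctx, w = additive_attention(dec, enc, W_s, W_h, v)
print(ctx.shape)                        # (4,): context fed to the decoder
```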
  • 19. e.g. Style-token Fig. from Wang et al. Style-tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ArXiv. 2018 Decoder Encoder1 Encoder2 GST (Random init token) ⊕ Attention (Text to speech, Encoder-Decoder, Style transfer, Attention) Demo: https://google.github.io/tacotron/publications/global_style_tokens/
  • 21. Self-attention: position 1's vector is compared against every position (1-9); the resulting weights (e.g. 0.1, 0.3, ..., summing to 1) multiply all value vectors, and their sum is added back to give the new representation 1' out of the self-attention layer.
  • 22. Self-attention: the same computation for position 2 yields its own weight distribution and output 2'; every position is updated this way in parallel.
  • 23. Self-attention. Fig. from Wang et al. Non-local neural networks. ArXiv. 2017. 1. Compute the similarity between pixels i and j.
  • 24. Self-attention. Fig. from Wang et al. Non-local neural networks. ArXiv. 2017. 1. Compute the similarity between pixels i and j. 2. Multiply by the value of pixel j.
  • 25. Self-attention. Fig. from Wang et al. Non-local neural networks. ArXiv. 2017. 1. Compute the similarity between pixels i and j. 2. Multiply by the value of pixel j. 3. Normalization term.
  • 26. Self-attention. Fig. from Wang et al. Non-local neural networks. ArXiv. 2017. The information at positions i and j is related: compute the similarity for every pair of positions and apply it as a weight, and the relationship between all positions can be learned (long-range dependency!). 1. Compute the similarity between pixels i and j. 2. Multiply by the value of pixel j. 3. Normalization term.
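The three steps above can be sketched on a toy feature map. This follows the embedded-Gaussian form of the non-local block, where the softmax supplies the normalization term; the paper's learned 1x1-conv embeddings (theta, phi, g) are replaced with the identity here for brevity:

```python
import numpy as np

def non_local_self_attention(x):
    """y_i = (1/C(x)) * sum_j f(x_i, x_j) * g(x_j), softmax-normalized over j."""
    H, W, C = x.shape
    flat = x.reshape(H * W, C)      # every pixel attends to every other pixel
    # 1. Pairwise similarity f(x_i, x_j) for all pixel pairs.
    scores = flat @ flat.T
    # 3. Normalization term: softmax over j, so weights per pixel sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # 2. Multiply the value g(x_j) of every pixel j into position i.
    y = weights @ flat
    return y.reshape(H, W, C)

x = np.random.randn(8, 8, 16)       # toy 8x8 feature map with 16 channels
y = non_local_self_attention(x)
print(y.shape)                      # (8, 8, 16): same shape, globally mixed
```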
  • 27. e.g. Self-Attention GAN (Image generation, GAN, Self-attention). Generator: Latent (z) → Transpose Conv → Self-Attention → Image (x'). Discriminator: Image (x) → Conv → Self-Attention → FC → Prob.
  • 28. Fig. from Zhang et al. Self-Attention Generative Adversarial Networks. ArXiv. 2018. e.g. Self-Attention GAN (Image generation, GAN, Self-attention)
  • 31. References
  - Bahdanau et al. Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. 2015
  - Wang et al. Non-local Neural Networks. ArXiv. 2017
  - Vaswani et al. Attention Is All You Need. ArXiv. 2017
  - Wang et al. Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ArXiv. 2018
  - Zhang et al. Self-Attention Generative Adversarial Networks. ArXiv. 2018
  - Attention is all you need explanatory blog post (https://mchromiak.github.io/articles/2017/Sep/12/Transformer-Attention-is-all-you-need/)
  - Attention is all you need explanatory video (https://www.youtube.com/watch?v=iDulhoQ2pro)