Show and Tell: A Neural Image Caption Generator
May 17, 2021
Hee Dae Kwon
Contents
• Introduction
• Model
• Experiments
• Results
Introduction
Proposes an end-to-end system for the image description problem.
The proposed model combines parts of the architectures of
state-of-the-art (SOTA) vision and language models.
Introduction
Encoder RNN → encoder CNN (the RNN encoder is replaced by a CNN)
Image: I
Target sequence of words: S
$S = \{S_1, S_2, \ldots\}$
Trained to maximize p(S|I)
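By the chain rule, the objective p(S|I) factorizes over the words of the caption: log p(S|I) = Σ_t log p(S_t | I, S_0, …, S_{t-1}). The following is a minimal sketch of that sum, not the authors' code. The function names and the input `step_probs` (the probability a model assigns to the correct word at each step) are hypothetical and introduced only for illustration.

```python
import math


def caption_log_prob(step_probs):
    """Return log p(S|I) as the sum of per-word log-probabilities.

    step_probs[t] is assumed to be p(S_t | I, S_0..S_{t-1}), i.e. the
    probability the model gave to the correct word at step t.
    """
    if any(p <= 0.0 or p > 1.0 for p in step_probs):
        raise ValueError("each probability must lie in (0, 1]")
    return sum(math.log(p) for p in step_probs)


def training_loss(step_probs):
    """Negative log-likelihood of one caption; minimizing it maximizes p(S|I)."""
    return -caption_log_prob(step_probs)
```

Summing logs rather than multiplying raw probabilities avoids numerical underflow on long captions.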
Model
Model
Encoder: CNN (pre-trained)
Decoder: RNN (LSTM)
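The encoder/decoder split can be illustrated with a small pure-Python sketch. It is a toy under stated assumptions, not the authors' implementation: the image feature vector stands in for the output of a pre-trained CNN, the LSTM cell is the standard textbook formulation (which may differ in detail from the variant in the paper), the weights are random, and the names `lstm_step`, `init_params`, and `greedy_caption` are hypothetical. The sketch also assumes the image is shown to the decoder once, as its first input, and that decoding is greedy.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h, c, W):
    """One standard LSTM step; W maps gate name -> matrix over [x; h]."""
    z = x + h  # list concatenation of input and previous hidden state
    i = [sigmoid(a) for a in matvec(W["i"], z)]    # input gate
    f = [sigmoid(a) for a in matvec(W["f"], z)]    # forget gate
    o = [sigmoid(a) for a in matvec(W["o"], z)]    # output gate
    g = [math.tanh(a) for a in matvec(W["g"], z)]  # candidate cell value
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def init_params(vocab, dim, hidden, seed=0):
    """Random toy weights; a real model would learn these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W": {gate: mat(hidden, dim + hidden) for gate in "ifog"},
        "embed": mat(vocab, dim),     # word embeddings
        "W_out": mat(vocab, hidden),  # hidden state -> word scores
    }


def greedy_caption(image_feature, params, start_id, end_id, max_len=10):
    """Decode word ids greedily from an image feature (stand-in for CNN output)."""
    hidden = len(params["W"]["i"])
    h, c = [0.0] * hidden, [0.0] * hidden
    # Encoder output is fed to the decoder once, before any word.
    h, c = lstm_step(image_feature, h, c, params["W"])
    token, caption = start_id, []
    for _ in range(max_len):
        h, c = lstm_step(params["embed"][token], h, c, params["W"])
        scores = matvec(params["W_out"], h)
        token = max(range(len(scores)), key=scores.__getitem__)
        if token == end_id:
            break
        caption.append(token)
    return caption
```

For example, `greedy_caption([0.1, 0.2, 0.3, 0.4], init_params(vocab=6, dim=4, hidden=5), start_id=0, end_id=1)` returns a list of word ids. With untrained random weights the ids are meaningless, so the sketch only shows the data flow from image feature to word sequence.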
Experiments
Techniques to prevent overfitting
Use a pre-trained deep CNN.
Use pre-trained word embedding vectors. (Little effect)
Dropout & ensembling
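Two of the techniques listed above, dropout and ensembling, can be sketched in a few lines. This is an illustrative sketch only: it uses inverted dropout and a plain average of the models' output distributions, both of which are assumptions, since the slides do not say which dropout form or ensembling scheme was used. The function names are hypothetical.

```python
import random


def dropout(values, p, rng):
    """Inverted dropout: zero each value with probability p and scale
    the survivors by 1 / (1 - p) so the expected value is unchanged."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]


def ensemble_average(distributions):
    """Average the word distributions predicted by several models."""
    n = len(distributions)
    return [sum(column) / n for column in zip(*distributions)]
```

Dropout is applied only during training; at test time the activations are passed through unchanged, while the ensemble average is applied at prediction time.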
Results
• How dataset size affects generalization
• What kinds of transfer learning it would be able to achieve
• How it would deal with weakly labeled examples
Results
• Generation Results
Results
• Flickr30k -> Flickr8k
BLEU increases by 4 points
• MSCOCO -> Flickr8k
BLEU decreases by 10 points
• MSCOCO -> SBU
BLEU drops to 16
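The transfer results above are reported in BLEU. As a reminder of what the metric measures, here is a minimal sketch of unigram BLEU (BLEU-1): clipped unigram precision times a brevity penalty. It is a simplified illustration with whitespace tokenization, not the evaluation script behind the numbers above, and the slide does not state which BLEU variant those numbers refer to. The function name `bleu1` is hypothetical.

```python
import math
from collections import Counter


def bleu1(candidate, references):
    """Unigram BLEU for one candidate caption against reference captions."""
    cand = candidate.split()
    if not cand:
        return 0.0
    # For each word, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for word, n in Counter(ref.split()).items():
            max_ref[word] = max(max_ref[word], n)
    # Clip candidate counts so repeated words are not over-rewarded.
    clipped = sum(min(n, max_ref[word]) for word, n in Counter(cand).items())
    precision = clipped / len(cand)
    # Brevity penalty uses the reference length closest to the candidate's.
    ref_len = min(
        (len(ref.split()) for ref in references),
        key=lambda length: (abs(length - len(cand)), length),
    )
    penalty = 1.0 if len(cand) >= ref_len else math.exp(1.0 - ref_len / len(cand))
    return penalty * precision
```

Scores are usually multiplied by 100 when reported, so a change of "4 points" corresponds to a difference of 0.04 on this scale.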
Results
• Generation Diversity
Discussion
=> The generative model produces novel,
diverse, high-quality sentences
Results
• Ranking Results
Results
• Human Evaluation
Results
• Analysis of Embedding
• Thank you
