Department of Artificial Intelligence
Winter 2022 (Session: 2022-2023)
G H Raisoni College of Engineering, Nagpur
Presented By:
1. Akundi Harshvardhan (A-20)
2. Arya Bharne (A-24)
3. Priyesh Gawali (A-62)
4. Rajat Satpure (A-65)
Guide:-
Prof. Pravin Kshirsagar
Assistant Professor
GHRCE, Nagpur
Title of the Project:-
Image Captioning using Deep Learning
and NLP
Contents
1. Introduction
2. Abstract
3. Objectives
4. Literature Survey
5. Proposed Methodology/System Architecture
6. Hardware / Software Specification
7. Conclusion
8. References
Introduction
• Given an image, a person can describe what it
shows simply by looking at it.
• In this project, we are developing a
system/model that describes the content
of an image, i.e. our model assigns a
caption to an image. This task is called
Image Captioning.
• For this, we use a Deep Neural
Network (DNN), a Convolutional Neural
Network (CNN) and Long Short-Term
Memory (LSTM).
• Because we are dealing with text data (image
captions), we also use Natural
Language Processing (NLP).
• By combining all of these, we are developing
a model called an Image Caption
Generator.
Example caption (image not shown): “A muscular man standing.”
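The NLP side of the pipeline mentioned above usually starts by normalising the caption text and mapping words to integer ids. The following is a minimal sketch of that step, not the project's actual code: the marker tokens `startseq`/`endseq`, the helper names and the second example caption are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed marker tokens that tell the model where a caption starts and ends.
START, END = "startseq", "endseq"

def clean_caption(text):
    """Lowercase the caption, drop everything except letters, add markers."""
    words = re.sub(r"[^a-z ]", " ", text.lower()).split()
    return " ".join([START] + words + [END])

def build_vocab(captions, min_count=1):
    """Map each sufficiently frequent word to an id (0 is kept for padding)."""
    counts = Counter(w for c in captions for w in c.split())
    kept = sorted(w for w, n in counts.items() if n >= min_count)
    return {w: i for i, w in enumerate(kept, start=1)}

captions = [clean_caption(c) for c in
            ["A muscular man standing.", "A man is standing!"]]
vocab = build_vocab(captions)
print(captions[0])
print(vocab)
```

Raising `min_count` drops rare words, which keeps the vocabulary (and the output layer of the model) smaller.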
Abstract
• Image captioning is an important task
today. It can describe images in
editing software, assist visually
impaired users, and generate captions
for social media posts.
• In recent years, researchers have made
significant progress in image captioning.
• Our solution uses Long Short-Term
Memory (LSTM) together with a
Convolutional Neural Network (CNN).
• The CNN extracts features from images, and
the LSTM generates a description from the
extracted image features.
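Generating a description from extracted features is typically done one word at a time. The sketch below shows a greedy decoding loop under stated assumptions: `predict_next` is a hypothetical stand-in for the trained CNN+LSTM model, and `toy_predict_next` is a hard-coded dummy used only so the loop can be run.

```python
def generate_caption(image_features, predict_next, max_len=20,
                     start="startseq", end="endseq"):
    """Greedy decoding: keep appending the most probable next word."""
    words = [start]
    for _ in range(max_len):
        # predict_next returns a dict mapping candidate words to probabilities.
        probs = predict_next(image_features, words)
        best = max(probs, key=probs.get)
        if best == end:
            break
        words.append(best)
    return " ".join(words[1:])

def toy_predict_next(image_features, words):
    """Dummy predictor (illustration only): follows a fixed word sequence."""
    script = {"startseq": "a", "a": "muscular", "muscular": "man",
              "man": "standing", "standing": "endseq"}
    nxt = script[words[-1]]
    other = "a" if nxt == "endseq" else "endseq"
    return {nxt: 0.9, other: 0.1}

caption = generate_caption(None, toy_predict_next)
print(caption)
```

Greedy decoding is the simplest choice; beam search keeps several candidate captions per step and often gives better sentences at higher cost.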
Objectives
• To describe the contents of an image using a CNN.
• To showcase the effectiveness of LSTM.
• To create a working model that describes an image on the basis of the
features extracted from it.
• To understand the features of an image.
• To predict the next words from the extracted features to build a caption.
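The objective of predicting the next word can be turned into training data by splitting each caption into (prefix, next word) pairs. This is a sketch under assumptions: the small vocabulary, the helper name `next_word_pairs` and left-padding with id 0 are illustrative choices, not details taken from the project.

```python
def next_word_pairs(caption_ids, max_len):
    """Turn one encoded caption into (left-padded prefix, next word id) pairs."""
    pairs = []
    for i in range(1, len(caption_ids)):
        prefix = caption_ids[:i][-max_len:]              # words seen so far
        padded = [0] * (max_len - len(prefix)) + prefix  # pad on the left with 0
        pairs.append((padded, caption_ids[i]))           # target: the next word
    return pairs

# Hypothetical vocabulary; id 0 is reserved for padding.
vocab = {"startseq": 1, "a": 2, "man": 3, "standing": 4, "endseq": 5}
ids = [vocab[w] for w in "startseq a man standing endseq".split()]
pairs = next_word_pairs(ids, max_len=4)
for prefix, target in pairs:
    print(prefix, "->", target)
```

During training, each pair would be combined with the feature vector of the corresponding image, so the model learns the next word given both the image and the words so far.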
Literature Survey (Survey of existing products)
● Template-based approaches can generate grammatically correct
captions, but because the templates are predefined, they cannot generate
variable-length captions.
● Retrieval-based methods produce general and grammatically correct captions.
However, they cannot generate more descriptive and semantically correct
captions.
● Captions are most often generated for the whole scene in an image. However,
captions can also be generated for different regions of an image, as in dense
captioning.
● A plain RNN used together with a CNN retains only a very short-term memory.
● The multimodal recurrent neural network method is similar to the method of Kiros,
which uses a fixed-length context, but in this method the temporal context is
stored in a recurrent architecture that allows an arbitrary context length.
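The short-term memory of a plain RNN noted in the survey can be illustrated with a deliberately simplified scalar recurrence. Assuming h_t = w * h_{t-1} + x_t (a real RNN uses weight matrices and a nonlinearity), the influence of the first step on step T is w**T, which shrinks quickly when |w| < 1. The function name and the value w = 0.5 are assumptions chosen for illustration.

```python
def influence_of_first_step(w, steps):
    """d h_T / d h_0 for the scalar linear recurrence h_t = w * h_{t-1} + x_t."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # each time step multiplies the gradient by w
    return grad

for steps in (1, 5, 10, 20):
    print(steps, influence_of_first_step(0.5, steps))
```

An LSTM reduces this effect by carrying information in a cell state that is updated additively through gates rather than being repeatedly multiplied by the same weight.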
Proposed Methodology/System Architecture
• We describe a thing by its features, and the same holds for an image.
For example, on seeing a large red rose we might say “A big beautiful red
rose.” This sentence uses features such as ‘large’ (size), ‘red’ (colour) and
‘beautiful’ (appearance) to describe the flower, i.e. we are giving it a caption.
Describing an image in this way is called Image Captioning.
• In this project we are developing an Image Caption Generator that extracts
features from an image using a Convolutional Neural Network (CNN). From the
extracted features, our model generates a caption for the given image by
arranging the features in a meaningful order using a Recurrent Neural
Network (RNN) with Long Short-Term Memory (LSTM).
• The LSTM remembers the previous words, which helps it predict the words
that come later and so form a proper sentence (the caption).
• We build our model by combining a CNN (for feature extraction) with an RNN
and LSTM (for predicting and arranging the words).
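One common way to combine a CNN feature extractor with an LSTM word predictor in Keras is a two-branch "merge" model. The sketch below is an assumption about how such a model could look, not the project's actual architecture: the 2048-dimensional feature vector, the 256 units, the dropout rate and the function name `build_caption_model` are all illustrative choices.

```python
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

def build_caption_model(vocab_size, max_len, feature_dim=2048, units=256):
    # Image branch: a precomputed CNN feature vector, projected to `units`.
    img_in = Input(shape=(feature_dim,))
    img = Dense(units, activation="relu")(Dropout(0.5)(img_in))

    # Text branch: the caption prefix as word ids (0 = padding) through an LSTM.
    txt_in = Input(shape=(max_len,))
    txt = Embedding(vocab_size, units, mask_zero=True)(txt_in)
    txt = LSTM(units)(Dropout(0.5)(txt))

    # Merge both branches and predict a distribution over the next word.
    merged = Dense(units, activation="relu")(add([img, txt]))
    out = Dense(vocab_size, activation="softmax")(merged)

    model = Model(inputs=[img_in, txt_in], outputs=out)
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model

# vocab_size counts the padding id 0 as well, i.e. number of words + 1.
model = build_caption_model(vocab_size=50, max_len=10)
print(model.output_shape)
```

Keeping the CNN outside this model means image features can be computed once and cached, so only the smaller caption model is trained.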
Hardware / Software Specification
• Category: Machine Learning, Deep Learning and NLP
• Programming Language: Python
• Tools & Libraries: TensorFlow, Keras, NumPy, TQDM
• IDE: Google Colab, Kaggle and Jupyter Notebook
• Prerequisites: Python, Machine Learning, Deep Learning and NLP
• Dataset: Flickr dataset
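Flickr caption datasets are commonly distributed as a text file with several captions per image. The sketch below assumes the Flickr8k-style line layout `image_name.jpg#index<TAB>caption`; the slides do not say which Flickr variant was used, so this layout, the helper name `load_captions` and the sample lines are assumptions.

```python
from collections import defaultdict

def load_captions(lines):
    """Group captions by image id from lines like 'name.jpg#0<TAB>caption'."""
    captions = defaultdict(list)
    for line in lines:
        key, _, text = line.strip().partition("\t")
        if not text:
            continue  # skip blank or malformed lines
        image_id = key.split("#")[0].rsplit(".", 1)[0]  # drop '#n' and '.jpg'
        captions[image_id].append(text)
    return dict(captions)

# Made-up sample lines in the assumed format.
sample = [
    "img_001.jpg#0\tA man is standing .",
    "img_001.jpg#1\tA muscular man poses .",
    "",
    "img_002.jpg#0\tA dog runs on grass .",
]
by_image = load_captions(sample)
print(by_image)
```

With a real file, the same function can be called on an open file handle, for example `load_captions(open(path, encoding="utf-8"))`, where `path` points to the captions file.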
Conclusion
Our solution is a model that describes an image using features
extracted from it, i.e. it assigns a caption to the image. We used a
Convolutional Neural Network (CNN) to extract features from the image and a
Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM)
to predict the words that make up a proper caption for the given image.
References
1. International Journal of Innovative Research in Electrical, Electronics, Instrumentation and
Control Engineering, NCIET-2020.
2. Aditya, A. N., Anditya, A. and Suyanto (2019). “Generating Image Description on
Indonesian Language using Convolutional Neural Network and Gated Recurrent Unit”, 7th
International Conference on Information and Communication Technology (ICoICT).
3. Chetan, A. and Vaishli, J. (2018). “Image Caption Generation using Deep Learning
Technique”, Fourth International Conference on Computing Communication Control and
Automation (ICCUBEA).
4. Huda A. Al-muzaini, Tasniem N. and Hafida B. (2018). “Automatic Arabic Image
Captioning using RNN LSTM-Based Language Model and CNN”, International Journal of
Advanced Computer Science and Applications (IJACSA), Vol. 9, No. 6.
5. J. Liu, G. Wang, P. Hu, L.-Y. Duan, and A. C. Kot. Global context-aware
attention LSTM networks for 3D action recognition. CVPR, 2017.
6. J. Lu, C. Xiong, D. Parikh, and R. Socher. Knowing when to look: Adaptive
attention via a visual sentinel for image captioning. CVPR, 2017.
7. S. J. Rennie, E. Marcheret, Y. Mroueh, J. Ross, and V. Goel. Self-critical
sequence training for image captioning. CVPR, 2017.
8. I. Loshchilov and F. Hutter. SGDR: Stochastic gradient descent with restarts.
ICLR, 2016.
9. J. Johnson, A. Karpathy, and L. Fei-Fei. DenseCap: Fully convolutional
localization networks for dense captioning. CVPR, 2016.
Thank You

 
Technical Drawings introduction to drawing of prisms
Technical Drawings introduction to drawing of prismsTechnical Drawings introduction to drawing of prisms
Technical Drawings introduction to drawing of prisms
 
Wearable antenna for antenna applications
Wearable antenna for antenna applicationsWearable antenna for antenna applications
Wearable antenna for antenna applications
 

Image captioning using DL and NLP.pptx

  • 1. Department of Artificial Intelligence, Winter 2022 (Session: 2022-2023), G H Raisoni College of Engineering, Nagpur. Presented by: 1. Akundi Harshvardhan (A-20) 2. Arya Bharne (A-24) 3. Priyesh Gawali (A-62) 4. Rajat Satpure (A-65). Guide: Prof. Pravin Kshirsagar, Assistant Professor, GHRCE, Nagpur. Title of the Project: Image Captioning using Deep Learning and NLP
  • 2. Contents 1. Introduction 2. Abstract 3. Objectives 4. Literature Survey 5. Proposed Methodology/System Architecture 6. Hardware / Software Specification 7. Conclusion 8. References
  • 3. Introduction • Given an image, a person can describe what it shows simply by looking at it. • In this project, we are developing a model that describes the content of an image, i.e. it gives the image a caption. This task is called image captioning. • We use a Deep Neural Network (DNN), a Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). • Because we are dealing with text data (image captions), we also use Natural Language Processing (NLP). • By combining all of these, we are developing a model called an Image Caption Generator. Example caption: “A muscular man standing.”
  • 4. Abstract • Image captioning is an important task today. It can describe images in editing software, assist visually impaired people, and generate captions for social media posts. • In recent years, researchers have made significant progress in image captioning. • Our solution uses Long Short-Term Memory (LSTM) together with a Convolutional Neural Network (CNN). • The CNN extracts features from an image, and the LSTM generates a description from those extracted features.
  • 5. Objectives • To describe the contents of an image using a CNN. • To showcase the effectiveness of LSTM. • To create a working model that describes an image on the basis of the features extracted from it. • To understand the features of an image. • To predict the next words from the extracted features to form a caption.
  • 6. Literature Survey (Survey of existing products) ● Template-based approaches generate grammatically correct captions, but because the templates are predefined, they cannot generate variable-length captions. ● Retrieval-based methods produce general, grammatically correct captions. However, they cannot generate more descriptive and semantically correct captions. ● Captions are most often generated for the whole scene in an image. However, captions can also be generated for different regions of an image, as in dense captioning. ● An RNN used together with a CNN has only a very short-term memory. ● The multimodal recurrent neural network method is similar to the method of Kiros, which uses a fixed-length context. In this method, however, the temporal context is stored in a recurrent architecture, which allows an arbitrary context length.
  • 7. Proposed Methodology/System Architecture • We describe a thing using its features, and the same holds for an image. For example, on seeing a large red rose, we might say “A big beautiful red rose.” This sentence uses features such as size (big), colour (red), and appearance (beautiful) to describe the flower, i.e. it gives the flower a caption. Describing something by looking at it in this way is called image captioning. • In this project we are developing an Image Caption Generator that extracts features from an image using a Convolutional Neural Network (CNN). From the extracted features, the model generates a caption for the image by arranging the features in a meaningful order using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). • LSTM remembers the previous words, which helps it predict the words that come later and form a proper sentence (the caption). • We are developing our model by combining a CNN (for feature extraction) with an RNN and LSTM (for predicting and arranging words).
  • 8.
  • 9. Hardware / Software Specification • Category: Machine Learning, Deep Learning and NLP • Programming Language: Python • Tools & Libraries: TensorFlow, Keras, NumPy, tqdm • IDE: Google Colab, Kaggle and Jupyter Notebook • Prerequisites: Python, Machine Learning, Deep Learning and NLP • Datasets: Flickr Dataset
  • 10. Conclusion Our solution is a model that describes an image using the features extracted from it, i.e. it gives the image a caption. We used a Convolutional Neural Network (CNN) to extract features from the image and a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) to predict the words that form a proper caption for the given image.
  • 11. References 1. International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering, NCIET-2020. 2. Aditya, A. N., Anditya, A. and Suyanto (2019). “Generating Image Description on Indonesian Language using Convolutional Neural Network and Gated Recurrent Unit”, 7th International Conference on Information and Communication Technology (ICoICT). 3. Chetan, A. and Vaishli, J. (2018). “Image Caption Generation using Deep Learning Technique”, Fourth International Conference on Computing Communication Control and Automation (ICCUBEA). 4. Huda A. Al-muzaini, Tasniem N. and Hafida B. (2018). “Automatic Arabic Image Captioning using RNN LSTM-Based Language Model and CNN”, International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 9, No. 6.
  • 12. References (continued) 1. J. Liu, G. Wang, P. Hu, L.-Y. Duan, and A. C. Kot. Global context-aware attention LSTM networks for 3D action recognition. CVPR, 2017. 2. J. Lu, C. Xiong, D. Parikh, and R. Socher. Knowing when to look: Adaptive attention via a visual sentinel for image captioning. CVPR, 2017. 3. S. J. Rennie, E. Marcheret, Y. Mroueh, J. Ross, and V. Goel. Self-critical sequence training for image captioning. CVPR, 2017. 4. I. Loshchilov and F. Hutter. SGDR: Stochastic gradient descent with restarts. ICLR, 2016. 5. J. Johnson, A. Karpathy, and L. Fei-Fei. DenseCap: Fully convolutional localization networks for dense captioning. CVPR, 2016.
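
The word-by-word prediction described in the Proposed Methodology slide (the model looks at the image features and the words generated so far, then predicts the next word) can be sketched as a greedy decoding loop. This is only an illustrative sketch, not the presenters' code: the `startseq`/`endseq` marker tokens, the `generate_caption` function, and the `toy_predict_next` stand-in are all assumptions introduced here. In a real system the stand-in would be replaced by the trained CNN + LSTM model.

```python
from typing import Callable, Dict, List

# Hypothetical marker tokens for the start and end of a caption (assumed, not from the slides).
START, END = "startseq", "endseq"

# A predictor maps (image features, words so far) to a probability for each candidate next word.
Predictor = Callable[[object, List[str]], Dict[str, float]]


def generate_caption(features: object, predict_next: Predictor, max_length: int = 20) -> str:
    """Greedy decoding: repeatedly append the most probable next word."""
    words = [START]
    for _ in range(max_length):
        probs = predict_next(features, words)
        next_word = max(probs, key=probs.get)  # pick the highest-probability word
        if next_word == END:
            break
        words.append(next_word)
    return " ".join(words[1:])  # drop the start marker


def toy_predict_next(features: object, words: List[str]) -> Dict[str, float]:
    """Stand-in for a trained model: follows a fixed word chain and ignores the features."""
    chain = {START: "a", "a": "muscular", "muscular": "man", "man": "standing", "standing": END}
    vocab = ["a", "muscular", "man", "standing", END]
    probs = {word: 0.0 for word in vocab}
    probs[chain[words[-1]]] = 1.0  # all probability mass on the next word in the chain
    return probs


if __name__ == "__main__":
    print(generate_caption(None, toy_predict_next))
```

Because each step depends on the words already generated, the predictor needs a memory of the earlier words, which is the role the slides assign to the LSTM. Greedy selection is the simplest choice. Beam search is a common alternative that keeps several candidate captions at each step.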