142 GOVERNMENT POLYTECHNIC COLLEGE
KELAMANGALAM
REVIEW-01
DEPARTMENT OF COMPUTER
ENGINEERING
IMAGE CAPTIONING USING DEEP LEARNING
FOR MULTILANGUAGE
PROJECT MEMBERS
SAHANA .S - 23501538
AMUDHA .G - 23501506
DHIVYA .B - 23501515
MONISHA .Y - 23501526
SOWMIYA .V - 23501545
VEDHA SHREE .M - 23501555
PROJECT GUIDE:
B. SANTHIMEENA, M.E., HOD
ABSTRACT
• The major goal of this project is to produce a model that can predict the
caption, or text, that should accompany an image.
• Creating captions for pictures is a complex and challenging task.
• The main aim of this project is to develop a model that predicts the
caption closest to the content of a given image.
INTRODUCTION
The term "image captioning" refers to describing the content of an image in
words.
Image captioning is a fairly recent research area that is currently attracting
a lot of interest.
The main objective of image captioning is to generate a natural-language
description for the input image that is given to the model.
In this project, we present a model that treats caption generation as a
natural-language generation task.
The neural network's components are taken from well-established
architectures such as CNN and LSTM.
EXISTING SYSTEM
Initially, it was impractical for a computer to describe an image.
There are still some problems in existing systems.
Disadvantages:
• Caption quality still needs improvement.
• Captions are not creative.
• Unseen objects are handled poorly.
• Dealing with multiple objects and their relationships is difficult.
PROPOSED SYSTEM
Image Caption Generator Model (CNN-RNN model) = CNN + LSTM
A pre-trained CNN called Xception is used as the feature extractor.
CNN: extracts features from the image.
LSTM: generates a description from the extracted image features.
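The way the LSTM turns image features into a description can be sketched as a word-by-word loop. This is a minimal illustration, not the project's implementation: the names `generate_caption`, `predict_next`, `toy_decoder` and the `startseq`/`endseq` markers are assumptions, and `toy_decoder` is only a stand-in for a trained LSTM so that the loop can run without a model.

```python
from typing import Callable, Dict, List, Sequence

START, END = "startseq", "endseq"  # assumed start/end markers


def generate_caption(
    image_features: Sequence[float],
    predict_next: Callable[[Sequence[float], List[str]], Dict[str, float]],
    max_length: int = 20,
) -> str:
    """Greedy decoding: keep the most likely next word until END or max_length."""
    words = [START]
    for _ in range(max_length):
        # The decoder sees the image features and the words generated so far.
        scores = predict_next(image_features, words)
        next_word = max(scores, key=scores.get)
        if next_word == END:
            break
        words.append(next_word)
    return " ".join(words[1:])  # drop the start marker


def toy_decoder(features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a trained LSTM: always follows one sentence."""
    sentence = ["a", "dog", "runs", END]
    position = len(words) - 1
    target = sentence[min(position, len(sentence) - 1)]
    return {word: (1.0 if word == target else 0.0) for word in sentence}


print(generate_caption([0.1, 0.2], toy_decoder))  # prints "a dog runs"
```

In a real system, `image_features` would come from the CNN and `predict_next` would wrap the trained LSTM; greedy selection is only one option, and beam search is a common alternative.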
MODULES
• Data pre-processing
• Feature extraction (VGG16 / Xception)
• CNN-LSTM caption model
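The data pre-processing module can be illustrated with a short sketch of caption cleaning and vocabulary building. The exact steps used in the project are not stated, so everything here is an assumption: the helper names (`clean_caption`, `build_vocabulary`, `encode`), the `startseq`/`endseq` markers, and the choice to reserve index 0 for padding.

```python
import string
from collections import Counter
from typing import Dict, Iterable, List


def clean_caption(text: str) -> str:
    """Lowercase, strip punctuation, keep alphabetic words, add sequence markers."""
    table = str.maketrans("", "", string.punctuation)
    words = [w for w in text.lower().translate(table).split() if w.isalpha()]
    return "startseq " + " ".join(words) + " endseq"


def build_vocabulary(captions: Iterable[str], min_count: int = 1) -> Dict[str, int]:
    """Map each sufficiently frequent word to an integer; 0 is left for padding."""
    counts = Counter(word for caption in captions for word in caption.split())
    kept = sorted(word for word, n in counts.items() if n >= min_count)
    return {word: index for index, word in enumerate(kept, start=1)}


def encode(caption: str, vocab: Dict[str, int]) -> List[int]:
    """Turn a cleaned caption into the integer sequence fed to the LSTM."""
    return [vocab[word] for word in caption.split() if word in vocab]


raw = ["A dog runs, fast!", "Two dogs play."]  # made-up example captions
cleaned = [clean_caption(c) for c in raw]
vocab = build_vocabulary(cleaned)
print(cleaned[0])                 # startseq a dog runs fast endseq
print(encode(cleaned[0], vocab))  # [8, 1, 2, 7, 5, 4]
```

Raising `min_count` drops rare words, which keeps the vocabulary small on a dataset the size of Flickr 8k.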
Advantages:
• Efficiency.
• Generating creative captions.
SYSTEM REQUIREMENT SPECIFICATION
HARDWARE REQUIREMENTS
Processor : Intel or Pentium
Speed : 2.4 GHz
Hard Disk : 1 TB HDD, 256 GB SSD
Input : Keyboard, Mouse
RAM : 8 GB
SOFTWARE REQUIREMENTS
Operating System : Windows 10 or higher
Dataset : Flickr 8k
Software : Jupyter Notebook