Welcome
Name: Vinothkumar
Course: BSc Computer Science
Semester: Final Year
Batch: 2018-2022
[College Logo]
Flickr8k Dataset Image Captioning Project
Student: Vinothkumar
Register No.: XXXXXXX
Guide: [Faculty Advisor Name]
[College Logo]
Problem Statement (Abstract)
• Huge image collections make manual captioning impractical.
• Goal: Automatically generate captions for images.
• Useful for accessibility, image search, and photo organization.
Introduction
• Combines Computer Vision + NLP.
• Dataset: Flickr8k (images + human-written captions).
• Scope: Feature extraction, preprocessing, model training, caption generation, evaluation.
Existing System (Literature Survey)
• Manual tagging or keyword-based search.
• Drawbacks: Time-consuming, inaccurate, and not scalable.
Proposed System
• Encoder-decoder architecture (VGG16 encoder + LSTM decoder).
• Generates captions for unseen images.
• Advantages: Scalable, automated, and improves accessibility.
System Design
• Data Flow Diagram (DFD)
• UML (Use Case / Class)
• ER Diagram
• Flowchart: preprocessing → training → captioning
Hardware and Software Requirements
Hardware:
• Processor: Intel i5 or above
• RAM: 8 GB minimum
• GPU recommended
Software:
• Python, TensorFlow/Keras
• NLTK, NumPy, Matplotlib
Modules Overview
1. Image Feature Extraction (VGG16)
2. Caption Preprocessing
3. Encoder-Decoder Model
4. Training & Validation
5. Inference & Caption Generation
6. Evaluation (BLEU Scores)
Module Explanation - Part 1
• Image Feature Extraction: VGG16's penultimate layer gives a 4096-dimensional feature vector per image (sketch below).
• Caption Preprocessing: cleaning, tokenization, padding (sketch below).
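
A minimal sketch of the feature-extraction step with Keras, assuming the standard approach of taking VGG16's second-to-last fully connected layer (fc2), which is where the 4096 dimensions come from; the image path is illustrative:

import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Drop the final softmax layer; keep the 4096-dim fc2 output.
base = VGG16()
model = Model(inputs=base.inputs, outputs=base.layers[-2].output)

def extract_features(image_path):
    # VGG16 expects 224x224 RGB input.
    image = load_img(image_path, target_size=(224, 224))
    array = img_to_array(image)
    array = preprocess_input(np.expand_dims(array, axis=0))
    return model.predict(array, verbose=0)[0]  # shape: (4096,)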
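
A minimal preprocessing sketch; the sample captions are illustrative, and the startseq/endseq markers follow the convention visible in the example output later in the deck:

import re
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer

def clean_caption(text):
    # Lowercase, drop punctuation/digits and single letters,
    # then wrap with start/end markers.
    text = re.sub(r'[^a-z ]', '', text.lower())
    words = [w for w in text.split() if len(w) > 1]
    return 'startseq ' + ' '.join(words) + ' endseq'

raw_captions = ['A dog runs across the grass.', 'Two children play ball.']  # illustrative
captions = [clean_caption(c) for c in raw_captions]

tokenizer = Tokenizer()
tokenizer.fit_on_texts(captions)
vocab_size = len(tokenizer.word_index) + 1          # +1 for the padding index 0
max_length = max(len(c.split()) for c in captions)  # longest caption, in words
padded = pad_sequences(tokenizer.texts_to_sequences(captions), maxlen=max_length)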
Module Explanation - Part 2
• Encoder-Decoder Model: image encoder + LSTM decoder (sketch below).
• Training & Validation: data generator, epochs (generator sketch below).
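
One common way to wire the two branches together in Keras; a sketch, where the 256-unit layer sizes are assumptions, not taken from the slides:

from tensorflow.keras.layers import Input, Dense, LSTM, Embedding, Dropout, add
from tensorflow.keras.models import Model

# vocab_size and max_length come from the preprocessing step.
# Image branch: project the 4096-dim VGG16 feature into the decoder space.
inputs1 = Input(shape=(4096,))
fe1 = Dropout(0.5)(inputs1)
fe2 = Dense(256, activation='relu')(fe1)

# Text branch: embed the partial caption and run it through an LSTM.
inputs2 = Input(shape=(max_length,))
se1 = Embedding(vocab_size, 256, mask_zero=True)(inputs2)
se2 = Dropout(0.5)(se1)
se3 = LSTM(256)(se2)

# Merge both branches and predict the next word.
decoder1 = add([fe2, se3])
decoder2 = Dense(256, activation='relu')(decoder1)
outputs = Dense(vocab_size, activation='softmax')(decoder2)

model = Model(inputs=[inputs1, inputs2], outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')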
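
A sketch of a generator that streams (image feature, partial caption) → next-word pairs, using the yield format that tf.keras's model.fit accepts:

import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

def data_generator(captions, features, tokenizer, max_length, vocab_size):
    # captions: dict image_id -> list of cleaned caption strings
    # features: dict image_id -> 4096-dim VGG16 feature vector
    while True:
        for image_id, caption_list in captions.items():
            feature = features[image_id]
            for caption in caption_list:
                seq = tokenizer.texts_to_sequences([caption])[0]
                for i in range(1, len(seq)):
                    in_seq = pad_sequences([seq[:i]], maxlen=max_length)[0]
                    out_word = to_categorical([seq[i]], num_classes=vocab_size)[0]
                    # model.fit expects an (inputs, targets) tuple,
                    # with the two model inputs grouped together.
                    yield (np.array([feature]), np.array([in_seq])), np.array([out_word])

The generator is then passed straight to model.fit with a steps_per_epoch value.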
Module Explanation - Part 3
• Caption Generation: predict_caption, greedy word-by-word decoding (sketch below).
• Evaluation: BLEU scores, example output (sketch below).
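
The predict_caption name comes from the slides; its body here is a hedged reconstruction of standard greedy decoding:

import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

def predict_caption(model, feature, tokenizer, max_length):
    # Greedy decoding: start from 'startseq' and append the most
    # probable next word until 'endseq' or the length limit.
    index_to_word = {i: w for w, i in tokenizer.word_index.items()}
    caption = 'startseq'
    for _ in range(max_length):
        seq = tokenizer.texts_to_sequences([caption])[0]
        seq = pad_sequences([seq], maxlen=max_length)
        yhat = model.predict([np.array([feature]), seq], verbose=0)
        word = index_to_word.get(int(np.argmax(yhat)))
        if word is None:
            break
        caption += ' ' + word
        if word == 'endseq':
            break
    return caption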
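
BLEU-1 and BLEU-2 can be computed with NLTK's corpus_bleu; the reference and hypothesis lists below are illustrative:

from nltk.translate.bleu_score import corpus_bleu

# references: one list of tokenized human captions per test image
# hypotheses: one tokenized generated caption per test image
references = [[['a', 'dog', 'runs'], ['dog', 'running']]]  # illustrative
hypotheses = [['a', 'dog', 'runs']]                        # illustrative

print('BLEU-1: %.2f' % corpus_bleu(references, hypotheses, weights=(1.0, 0, 0, 0)))
print('BLEU-2: %.2f' % corpus_bleu(references, hypotheses, weights=(0.5, 0.5, 0, 0)))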
Output & Results
• Training failed due to a TypeError in the data generator (see the note below).
• Example caption: 'startseq ended ended...'
• BLEU-1 = 0.02, BLEU-2 = 0.00
• Lesson: Data formatting is critical.
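
The slides do not show the traceback, so this is only a plausible reconstruction of the failure: tf.keras's model.fit is strict about how a generator packages each batch.

import numpy as np

# Dummy shapes, only to contrast the two yield formats.
X1 = np.zeros((1, 4096))  # image-feature batch
X2 = np.zeros((1, 20))    # padded word-index batch
y = np.zeros((1, 5000))   # one-hot next-word batch

def broken_generator():
    while True:
        # Old fit_generator-style nesting; model.fit can
        # reject this with a TypeError.
        yield [[X1, X2], y]

def fixed_generator():
    while True:
        # Accepted format: an (inputs, targets) tuple, with the
        # two model inputs grouped as the first element.
        yield (X1, X2), y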
Conclusion
Expected Outcome:
• Automated caption generation.
• Applications: Accessibility, search, tagging.
Future Scope:
• Fix the training error.
• Use larger datasets (Flickr30k, MS COCO).
• Explore Transformers & attention mechanisms.
Thank You
Acknowledgement:
• Guide
• Review Committee
— Vinothkumar
