Major Project Presentation
on
REAL TIME OBJECT RECOGNITION FOR VISUALLY
IMPAIRED PEOPLE
Mahatma Gandhi Mission’s College Of
Engineering & Technology
A-09, Sector 62, Noida, Uttar Pradesh 201301
Submitted by:
Vikas Kumar Pandey (Roll No. 1900950310011), Akshay Kumar (Roll No. 1900950310002), Hariom (Roll No. 190950310006)
Content
Introduction
Problems faced by blind people
Literature review
Objective
Block diagram
Yolo algorithm
Block diagram of yolo algorithm
Object detection
Database used
Methodology
Flow chart
Hardware used
Advantages of yolo algorithm
Survey
Advantages
Conclusion
Future work
References
Introduction
 The World Health Organization (WHO) conducted a survey covering a population of around 7,889 million people. The statistics showed that, of the population surveyed, 253 million were visually impaired.[4]
 Many visually impaired people in our society face serious difficulties in daily life.
 The device developed here can detect objects in the user's surroundings.
 A model is proposed that enables visually impaired people to detect objects in their surroundings. The output of the system is in audio form, which a blind user can easily understand.
Problem faced by blind people
Visually impaired people confront many problems in recognizing the objects around them.
Blind people are not able to recognize the objects next to them.
The proposed system is developed to detect the objects in the user's surroundings.
It can also reduce the need to carry a walking stick.
Literature review
1. “The authors in (Seema et al.) suggested a smart system that guides a blind person, in 2016.[1]”
• The system detects obstacles that cannot be detected by the user's cane. However, the proposed system was designed only to protect the blind user in the area near his/her head.
Problem statement - A buzzer and a vibrator were employed as the output modes to the user. This is useful only for obstacle detection at head level, without recognizing the type of obstacle.
Contd.
2. “A modification of several systems used in visual recognition was proposed
in 2014.[2]”
• The authors used fast-feature pyramids and provided findings on general object detection systems. The results showed that the proposed scheme can strictly be used only for wide-spectrum images.
 Problem statement - It does not succeed for narrow-spectrum images. Hence, their work cannot serve as an efficient general object detector.
Contd.
3. “In (Nazli Mohajeri et al, 2011) the authors suggested a two-camera
system to capture photos”.[3]
• However, the proposed system was tested under only three conditions and for three objects. Only specific obstacles at distances of about 70 cm from the cameras were detected.
Problem statement - The results showed some range of error. Blind-assistance systems need to cover more cases with efficient and satisfactory results.
Objective
This project aims to relieve some of these problems using assistive technology.
Simply put, it is a technique for real-time recognition of stationary objects.
To make visually impaired people self-reliant.
To provide a device for the detection of objects.
Our main aim is an object recognition function: the device should be able to detect certain items through the camera and return an audio output announcing what each one is. In order to recognize objects, machine learning has to be involved.
Block diagram
Capturing the video → Object detection using YOLO algorithm (on Raspberry Pi 3B+) → Text to speech → Speaker
Object detection
 Object detection is a phenomenon in computer vision that
involves the detection of various objects in digital images or
videos.
 Some of the objects detected include people, cars, chairs, stones,
buildings, and animals.
 It identifies the objects present in a specific image.
 It establishes the exact location of each object within the image.
EXISTING ALGORITHMS
1. ResNet
• Advantage: solves the degradation problem via shortcut (skip) connections.
• Disadvantage: in deeper networks, detecting errors becomes difficult.
2. R-CNN
• Advantage: very accurate at image recognition and classification.
• Disadvantage: fails to encode the position and orientation of objects.
3. Fast R-CNN
• Advantage: saves time compared to traditional algorithms like Selective Search.
• Disadvantage: still uses the Selective Search algorithm, which is slow and time-consuming.
4. SSD
• Advantage: makes more predictions and has better coverage of location, scale, and aspect ratios.
• Disadvantage: shallow layers in a neural network may not generate enough high-level features to predict small objects.
5. YOLO
• Advantage: allows real-time object detection; the system trains in a single pass; more efficient and fast.
• Disadvantage: struggles to detect close objects because each grid cell can propose only 2 bounding boxes.
ALGORITHM SELECTION
 The YOLOv4 performance was evaluated with previous YOLO versions (YOLOv3 and YOLOv2) as baselines.
 YOLOv4 shows the best speed-to-accuracy balance compared to state-of-the-art object detectors.
 In general, YOLOv4 surpasses all previous object detectors in terms of both speed and accuracy, ranging from 5 FPS to as much as 160 FPS.
 The YOLOv4 algorithm achieves the highest accuracy among real-time object detection models while achieving 30 FPS or higher on a GPU.
YOLO algorithm
YOLO is an abbreviation for 'You Only Look Once'.
Created by Joseph Redmon, Santosh Divvala, Ross Girshick and Ali Farhadi.
The YOLO algorithm detects and recognizes various objects in a picture.
 Object detection in YOLO is framed as a regression problem, and the algorithm provides the class probabilities of the detected objects.
 Prediction over the entire image is done in a single algorithmic run.
The YOLO family consists of various variants, including Tiny YOLO and YOLOv1, v2, v3, and v4.
 Popular because of its speed and accuracy.
Yolo evolution
Algorithm Description
The original YOLO - YOLO was the first object detection network to combine the problem of drawing
bounding boxes and identifying class labels in one end-to-end differentiable
network.
YOLOv2 - YOLOv2 made a number of iterative improvements on top of YOLO including
BatchNorm, higher resolution, and anchor boxes.
YOLOv3 - YOLOv3 built upon previous models by adding an objectness score to bounding box prediction, added connections to the backbone network layers, and made predictions at three separate levels of granularity to improve performance on smaller objects.
YOLOv4 - A one-stage detector with several components. It detects objects in real time, with better speed and accuracy than the earlier versions.
YOLOv4 architecture (one-stage detector):
Input → Backbone (CSPDarknet53) → Neck (SPP + PAN) → Dense prediction
YOLOv4 = CSPDarknet53 + SPP + PAN + BoF + BoS
CSP DARKNET53
 CSPDarknet53 is a convolutional neural network and
backbone for object detection.
 It employs a strategy to partition the feature map of the
Image into two parts and then merges them through a
cross-stage hierarchy.
 The use of a split and merge strategy allows for more
gradient flow through the network.
SPATIAL PYRAMID POOLING
 A CNN consists of some Convolutional (Conv) layers followed by some Fully-Connected (FC) layers. Conv layers do not require fixed-size input, but FC layers do.
 The solution to this problem lies in the
Spatial Pyramid Pooling (SPP) layer. It is
placed between the last Conv layer and the
first FC layer and removes the fixed-size
constraint of the network.
 The goal of the SPP layer is to pool the
variable-size features that come from the
Conv layer and generate fixed-length outputs
that will then be fed to the first FC layer of
the network.
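As a minimal sketch of the SPP idea described above, the function below max-pools a feature map of any size into a fixed-length vector. The bin sizes (1x1, 2x2, 4x4) and the single-channel list-of-lists representation are illustrative assumptions, not taken from the slides.

```python
# Spatial Pyramid Pooling sketch: pool a variable-size 2D feature map
# into a fixed-length output, regardless of the input dimensions.
def spp_pool(feature_map, levels=(1, 2, 4)):
    h, w = len(feature_map), len(feature_map[0])
    output = []
    for n in levels:  # each level splits the map into n x n bins
        for i in range(n):
            for j in range(n):
                # bin boundaries; max() keeps every bin at least 1 cell wide
                r0, r1 = i * h // n, max((i + 1) * h // n, i * h // n + 1)
                c0, c1 = j * w // n, max((j + 1) * w // n, j * w // n + 1)
                output.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return output
```

Whatever the input size, the output length is fixed at 1 + 4 + 16 = 21 values, which is exactly what lets the following FC layer accept images of any resolution.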
BAG OF FREEBIES AND SPECIALS
 ‘Bag of Freebies’ (BoF) is a general framework of training strategies for improving
the overall accuracy of an object detection model.
 The set of techniques or methods that change the training strategy or training cost
for improvement of model accuracy is termed as Bag of Freebies.
 Bag of Specials (BoS) can be considered as an add-on for any object detectors
present right now to make them more accurate.
METHODOLOGY
The steps of the proposed recognition system based on image processing are as follows –
 Image capturing
 Image acquisition
 Object detection
 YOLO algorithm
 Prediction
Block diagram of YOLO algorithm:
Start → Residual blocks → Localization → Bounding box → Target label Y → Intersection over union → Non-max suppression → Prediction → Output as audio
Capturing image
 Image capture is done by the camera module; both real-time moving objects and stationary objects can be captured.
Image Acquisition
 The image is captured by the digital camera as an RGB image and is converted to a grayscale version using intensity equation (1):
I = (R + G + B) / 3 (1)
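Equation (1) can be sketched per pixel as below; representing the image as a flat list of (R, G, B) tuples is an illustrative simplification.

```python
# Grayscale conversion from equation (1): I = (R + G + B) / 3,
# applied to each pixel of an RGB image.
def to_grayscale(rgb_pixels):
    return [(r + g + b) // 3 for (r, g, b) in rgb_pixels]
```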
RESIDUAL BLOCKS
The image is divided into an S x S grid of cells; dimensions of 3 x 3, 13 x 13, and 19 x 19 are used.
All grid cells have equal dimensions, and every grid cell detects the objects that appear within it.
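The cell responsible for an object is the one containing the object's centre. A minimal sketch of that assignment, assuming centre coordinates normalised to [0, 1):

```python
# Grid-cell assignment: for an S x S grid, return the (row, col) of the
# cell that contains the object's centre (cx, cy), both in [0, 1).
def responsible_cell(cx, cy, S):
    return int(cy * S), int(cx * S)
```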
LOCALIZATION
The term 'localization' refers to where in the image the object is present. In YOLO object detection, we perform classification with localization, i.e., a supervised learning algorithm is trained not only to predict the class but also the bounding box around the object in the image.
Classification + localization = object detection
BOUNDING BOXES
A bounding box is an outline that highlights an
object in an image.
Every bounding box in the image consists of
the following attributes:
• Bounding box center (bx, by)
• Height (bh)
• Width (bw)
• Class (for example, person, car, traffic light,
etc.). This is represented by the letter c.
BOUNDING BOXES - CONT...
Each cell of the 13x13 feature map detects objects in the input image via its specified number of bounding boxes. In YOLOv4, each cell has 3 bounding boxes, so the total number of bounding boxes for a 13x13 feature map is
(13x13)x3 = 507 bounding boxes.
The remaining bounding boxes are discarded as
they don't localize the objects in the picture.
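The box-count arithmetic above can be written out directly; the helper name is illustrative.

```python
# Candidate boxes for one feature map: grid x grid cells, each
# proposing boxes_per_cell bounding boxes (3 in YOLOv4).
def boxes_per_map(grid, boxes_per_cell=3):
    return grid * grid * boxes_per_cell
```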
TARGET LABEL Y
Target label y for this supervised learning task is
explained as:
Y is a vector containing Pc, Bx, By, Bh, Bw, C1, ..., Cn.
Pc is the probability that an object of some class is present in the grid cell, with 0 <= Pc <= 1; Pc = 0 means no object is found, and Pc = 1 means 100% probability that an object is present.
(Bx, By) defines the mid-point of the object, and (Bh, Bw) define the height and width of the bounding box.
Also, if Pc > 0 then there are n values C1, ..., Cn, which represent the classes of objects present in the image.
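The layout of y can be sketched as below; the three-class setup and the one-hot class encoding are illustrative assumptions, not specifics from the slides.

```python
# Build the target label y = [Pc, Bx, By, Bh, Bw, C1, ..., Cn]
# for one grid cell.
def make_target(present, box=None, class_index=None, n_classes=3):
    if not present:              # Pc = 0: the remaining entries are unused
        return [0.0] * (5 + n_classes)
    bx, by, bh, bw = box
    classes = [0.0] * n_classes
    classes[class_index] = 1.0   # one-hot indicator for the object's class
    return [1.0, bx, by, bh, bw] + classes
```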
Intersection over union (IOU)
Intersection over Union (IOU) is a term used to describe the extent of overlap between two boxes. The greater the region of overlap, the greater the IOU.
IOU is mainly used in applications related to
object detection, where we train a model to
output a box that fits perfectly around an object.
IOU is also used in non max suppression
algorithm.
IOU = Area of Overlap (Intersection) / Area of Union
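The formula can be sketched for axis-aligned boxes; the (x1, y1, x2, y2) corner representation is an assumption for illustration.

```python
# Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2).
def iou(a, b):
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))   # overlap width
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))   # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)          # overlap / union
```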
NMS - NON-MAX SUPPRESSION
To select the best bounding box, from the multiple predicted bounding
boxes, an algorithm called Non-Max Suppression is used to
"suppress" the less likely bounding boxes and keep only the best one.
Prediction
YOLOv4 makes detections at 3 different points, i.e., layers 82, 94 and 106. The network down-samples the input image by strides of 32, 16 and 8 at those points respectively.
After reaching a stride of 32, the network produces a 13x13 feature map for an input image of size 416x416.
At the next detection layer, where the stride is 16, we obtain a 26x26 output feature map.
And at the detection layer where the stride is 8, a 52x52 feature map. Thus, the total number of bounding boxes produced by YOLOv4 for a 416x416 input image is
((13x13)+(26x26)+(52x52))x3 = 10647 bounding boxes per image.
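The multi-scale total above follows directly from the strides; a small sketch of the arithmetic:

```python
# Total candidate boxes across the three detection scales: a 416x416
# input down-sampled by strides 32, 16 and 8 gives 13x13, 26x26 and
# 52x52 feature maps, each cell proposing 3 boxes.
def total_boxes(input_size=416, strides=(32, 16, 8), boxes_per_cell=3):
    return sum((input_size // s) ** 2 for s in strides) * boxes_per_cell
```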
Database used
Coco dataset – the COCO dataset ("Common Objects In Context").
It is a large-scale image dataset containing 328,000 images of everyday objects and humans.
The dataset contains annotations that machine learning models can use to learn to recognize, label, and describe objects.
COCO provides the following types of annotations:
• Object detection
• Captioning
• Key points
• Dense pose
Contd:
 Object detection comprises various approaches such as Fast R-CNN, RetinaNet, and sliding-window detection, but none of these methods can detect objects in one single run. Hence the need for a more efficient and faster algorithm: the YOLO algorithm.
Flow Chart
Start → Capture image → Image captured correctly? (No: send error message and capture again; Yes: continue) → Processing → Deep learning algorithm → Predicted? (No: process again; Yes: continue) → Object recognition → Output in audio format
Hardware
Raspberry pi 3B+
Camera module v2
Jumper wires
Speaker
Button
Raspberry pi 3B+
The Raspberry Pi 3 Model B+ is the latest product in the Raspberry Pi 3 range, boasting a 64-bit quad-core processor running at 1.4 GHz, dual-band 2.4 GHz and 5 GHz wireless LAN, and Bluetooth 4.2/BLE.
Camera module v2
The Raspberry Pi Camera v2 is a high-quality add-on board for the Raspberry Pi, custom-designed around an 8-megapixel Sony IMX219 image sensor and featuring a fixed-focus lens.
Advantage of yolo algorithm
YOLO algorithm is important because of the following reasons:
Speed : This algorithm improves the speed of detection because it can predict
objects in real-time.
High accuracy: YOLO is a predictive technique that provides accurate results. It uses a convolutional implementation: if you have a 3x3 grid (i.e., the image is divided into 9 grid cells), you do not need to run the algorithm 9 times to check each grid cell for an object; it is all done in one single convolutional pass.
Learning capabilities: The algorithm has excellent learning capabilities that
enable it to learn the representations of objects and apply them in object
detection.
Surveys
 According to the National Federation of the Blind, blind people can use all of these devices easily, so they can also use our object recognition system.[5]
Advantage
This work is implemented using PTTS.
Easy to set up.
Open-source tools were used for this project.
Cheap and cost-efficient.
The project runs entirely on the device; no extra equipment needs to be bought.
Conclusion
A simple Indian object recognition system based on the YOLO algorithm has been proposed.
The system has been written using OpenCV.
Future Work
Enhancing the accuracy by building a model of features for each object class.
Working on using local features instead of template matching.
Enhancing the selection of the best frame to be processed for runtime application.
Adding more objects to the database.
References
1. https://www.researchgate.net/publication/334811299_Real-Time_Objects_Recognition_Approach_for_Assisting_Blind_People
2. https://www.researchgate.net/publication/334811299_Real-Time_Objects_Recognition_Approach_for_Assisting_Blind_People
3. https://www.researchgate.net/publication/235987140_An_obstacle_detection_system_for_blind_people
4. https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment#:~:text=Prevalence,near%20or%20distance%20vision%20impairment.
5. https://www.irjet.net/archives/V5/i2/IRJET-V5I2249.pdf
6. https://1000projects.org/detection-of-currency-notes-and-medicine-names-for-the-blind-people-project.html
7. https://DrNouraSemary/currency-recognition-system-for-visually-impaired-egyptian-banknote-as-a-study-case-icta2015
8. https://www.researchgate.net/publication/329487411_Currency_Recognition_System_for_Blind_people_using_ORB_Algorithm
Thank you

More Related Content

Similar to ppt - of a project will help you on your college projects

auto-assistance system for visually impaired person
auto-assistance system for visually impaired personauto-assistance system for visually impaired person
auto-assistance system for visually impaired person
shahsamkit73
 
presentation on Faster Yolo
presentation on Faster Yolo presentation on Faster Yolo
presentation on Faster Yolo
toontown1
 
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operatorProposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
QUESTJOURNAL
 
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTIONSENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
sipij
 

Similar to ppt - of a project will help you on your college projects (20)

YOLOv4: A Face Mask Detection System
YOLOv4: A Face Mask Detection SystemYOLOv4: A Face Mask Detection System
YOLOv4: A Face Mask Detection System
 
Detection of a user-defined object in an image using feature extraction- Trai...
Detection of a user-defined object in an image using feature extraction- Trai...Detection of a user-defined object in an image using feature extraction- Trai...
Detection of a user-defined object in an image using feature extraction- Trai...
 
Road signs detection using voila jone's algorithm with the help of opencv
Road signs detection using voila jone's algorithm with the help of opencvRoad signs detection using voila jone's algorithm with the help of opencv
Road signs detection using voila jone's algorithm with the help of opencv
 
Introduction talk to Computer Vision
Introduction talk to Computer Vision Introduction talk to Computer Vision
Introduction talk to Computer Vision
 
auto-assistance system for visually impaired person
auto-assistance system for visually impaired personauto-assistance system for visually impaired person
auto-assistance system for visually impaired person
 
Real Time Sign Language Recognition Using Deep Learning
Real Time Sign Language Recognition Using Deep LearningReal Time Sign Language Recognition Using Deep Learning
Real Time Sign Language Recognition Using Deep Learning
 
Development of wearable object detection system &amp; blind stick for visuall...
Development of wearable object detection system &amp; blind stick for visuall...Development of wearable object detection system &amp; blind stick for visuall...
Development of wearable object detection system &amp; blind stick for visuall...
 
presentation on Faster Yolo
presentation on Faster Yolo presentation on Faster Yolo
presentation on Faster Yolo
 
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
 
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operatorProposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
Proposed Multi-object Tracking Algorithm Using Sobel Edge Detection operator
 
An assistive model of obstacle detection based on deep learning: YOLOv3 for v...
An assistive model of obstacle detection based on deep learning: YOLOv3 for v...An assistive model of obstacle detection based on deep learning: YOLOv3 for v...
An assistive model of obstacle detection based on deep learning: YOLOv3 for v...
 
Csit3916
Csit3916Csit3916
Csit3916
 
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...
 
Deep learning based object detection
Deep learning based object detectionDeep learning based object detection
Deep learning based object detection
 
Conference research paper_target_tracking
Conference research paper_target_trackingConference research paper_target_tracking
Conference research paper_target_tracking
 
IRJET - Real Time Object Detection using YOLOv3
IRJET - Real Time Object Detection using YOLOv3IRJET - Real Time Object Detection using YOLOv3
IRJET - Real Time Object Detection using YOLOv3
 
INDOOR AND OUTDOOR NAVIGATION ASSISTANCE SYSTEM FOR VISUALLY IMPAIRED PEOPLE ...
INDOOR AND OUTDOOR NAVIGATION ASSISTANCE SYSTEM FOR VISUALLY IMPAIRED PEOPLE ...INDOOR AND OUTDOOR NAVIGATION ASSISTANCE SYSTEM FOR VISUALLY IMPAIRED PEOPLE ...
INDOOR AND OUTDOOR NAVIGATION ASSISTANCE SYSTEM FOR VISUALLY IMPAIRED PEOPLE ...
 
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEM
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEMDEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEM
DEEP LEARNING APPROACH FOR EVENT MONITORING SYSTEM
 
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTIONSENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
SENSITIVITY OF A VIDEO SURVEILLANCE SYSTEM BASED ON MOTION DETECTION
 
slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
 

Recently uploaded

DR PROF ING GURUDUTT SAHNI WIKIPEDIA.pdf
DR PROF ING GURUDUTT SAHNI WIKIPEDIA.pdfDR PROF ING GURUDUTT SAHNI WIKIPEDIA.pdf
DR PROF ING GURUDUTT SAHNI WIKIPEDIA.pdf
DrGurudutt
 
School management system project report.pdf
School management system project report.pdfSchool management system project report.pdf
School management system project report.pdf
Kamal Acharya
 
Paint shop management system project report.pdf
Paint shop management system project report.pdfPaint shop management system project report.pdf
Paint shop management system project report.pdf
Kamal Acharya
 

Recently uploaded (20)

Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxCloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
 
Online resume builder management system project report.pdf
Online resume builder management system project report.pdfOnline resume builder management system project report.pdf
Online resume builder management system project report.pdf
 
Arduino based vehicle speed tracker project
Arduino based vehicle speed tracker projectArduino based vehicle speed tracker project
Arduino based vehicle speed tracker project
 
Electrostatic field in a coaxial transmission line
Electrostatic field in a coaxial transmission lineElectrostatic field in a coaxial transmission line
Electrostatic field in a coaxial transmission line
 
NO1 Pandit Black Magic Removal in Uk kala jadu Specialist kala jadu for Love ...
NO1 Pandit Black Magic Removal in Uk kala jadu Specialist kala jadu for Love ...NO1 Pandit Black Magic Removal in Uk kala jadu Specialist kala jadu for Love ...
NO1 Pandit Black Magic Removal in Uk kala jadu Specialist kala jadu for Love ...
 
Lect_Z_Transform_Main_digital_image_processing.pptx
Lect_Z_Transform_Main_digital_image_processing.pptxLect_Z_Transform_Main_digital_image_processing.pptx
Lect_Z_Transform_Main_digital_image_processing.pptx
 
"United Nations Park" Site Visit Report.
"United Nations Park" Site  Visit Report."United Nations Park" Site  Visit Report.
"United Nations Park" Site Visit Report.
 
DR PROF ING GURUDUTT SAHNI WIKIPEDIA.pdf
DR PROF ING GURUDUTT SAHNI WIKIPEDIA.pdfDR PROF ING GURUDUTT SAHNI WIKIPEDIA.pdf
DR PROF ING GURUDUTT SAHNI WIKIPEDIA.pdf
 
School management system project report.pdf
School management system project report.pdfSchool management system project report.pdf
School management system project report.pdf
 
Dairy management system project report..pdf
Dairy management system project report..pdfDairy management system project report..pdf
Dairy management system project report..pdf
 
internship exam ppt.pptx on embedded system and IOT
internship exam ppt.pptx on embedded system and IOTinternship exam ppt.pptx on embedded system and IOT
internship exam ppt.pptx on embedded system and IOT
 
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
 
Natalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in KrakówNatalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in Kraków
 
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfA CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
 
Construction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxConstruction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptx
 
Paint shop management system project report.pdf
Paint shop management system project report.pdfPaint shop management system project report.pdf
Paint shop management system project report.pdf
 
Software Engineering - Modelling Concepts + Class Modelling + Building the An...
Software Engineering - Modelling Concepts + Class Modelling + Building the An...Software Engineering - Modelling Concepts + Class Modelling + Building the An...
Software Engineering - Modelling Concepts + Class Modelling + Building the An...
 
İTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering WorkshopİTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering Workshop
 
Top 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering ScientistTop 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering Scientist
 
Research Methodolgy & Intellectual Property Rights Series 2
Research Methodolgy & Intellectual Property Rights Series 2Research Methodolgy & Intellectual Property Rights Series 2
Research Methodolgy & Intellectual Property Rights Series 2
 

ppt - of a project will help you on your college projects

  • 1. Major Project Presentation on REAL TIME OBJECT RECOGNITION FOR VISUALLY IMPAIRED PEOPLE Mahatma Gandhi Mission’s College Of Engineering & Technology A-09, Sector 62, Noida, Uttar Pradesh 201301 Submitted by: Vikas Kumar Pandey Akshay kumar Hariom Roll No.:1900950310011 Roll no:1900950310002 Roll no: 190950310006
  • 2. Content Introduction Problems faced by blind peoples Literature review Objective Block diagram Yolo algorithm Block diagram of yolo algorithm Object detection Database used Methodology Flow chart Hardware used Advantages of yolo algorithm Survey Advantages Conclusion Future work Reference
  • 3. Introduction  The World Health Organization (WHO) had a survey over around 7889 million people. The statistics showed that among the population under consideration while survey, 253 millions were visually impaired.[4]  There are many visually impaired people facing many problems in our society.  The device developed can detect the objects in the user's surroundings.  This is a model has been proposed which makes the visually impaired people detect objects in his surroundings. The output of the system is in audio form that can be easily understandable for a blind user.
  • 4. Problem faced by blind people Visually Impaired People confront many problems in recognizing the objects. Blind people doesn’t able to recognize the objects next to them. This is developed to detect the objects in the user's surroundings. It will also solve the problem of keeping a walking stick.
  • 5. Literature review 1. “The authors in(Seema et al ) suggested using a smart system that guides a blind person in 2016[1]” • The system detects the obstacles that could not be detected by his/her cane. However, the proposed system was designed to protect the blind from the area near to his/her head. Problem statement - The buzzer and vibrator were used and employed as output modes to a user. This is useful for obstacles detection only at a head level without recognizing the type of obstacles.
  • 6. Contd. 2. “A modification of several systems used in visual recognition was proposed in 2014.[2]” • The authors used fast-feature pyramids and provided findings on general object detection systems. The results showed that the proposed scheme can be strictly used for wide-spectrum images.  Problem statement - It does not succeed for narrow-spectrum images. Hence, their work cannot be used as efficient general objects detection.
  • 7. Contd. 3. “In (Nazli Mohajeri et al, 2011) the authors suggested a two-camera system to capture photos”.[3] • However, the proposed system was only tested under three conditions and for three objects. Specific obstacles that have distances from cameras of about 70 cm were detected. Problem statement - The results showed some range of error. Blind helping systems need to cover more cases with efficient and satisfied results.
  • 8. Objective This project aims to relieve some of their problems using assistive technology. Simply it is the technique of real time stationary object recognition. To make visually impaired people self independent. To provide a device for detection of objects. Our main aim is, an object recognition function with device should be able to detect certain items from the camera and return an audio output to announce what it is. In order to recognize object, machine learning has to be involved.
  • 9. Block diagram Capturing the video Object detection using Yolo Algorithm Raspberry 3B+ Text to speech Speaker
  • 10. Object detection  Object detection is a phenomenon in computer vision that involves the detection of various objects in digital images or videos.  Some of the objects detected include people, cars, chairs, stones, buildings, and animals.  It identify the object in a specific image.  Establish the exact location of the object within the image.
  • 11. Sr no ALGORITHM ADVANTAGE DISADVANTAGE 1 RESNET • solve degradation problem by shortcuts • skip connections. • RESNETs are that for a deeper network the detection of errors becomes difficult. 2 R-CNN • very accurate at image recognition and classification • They fail to encode the position and orientation of objects. 3 FAST R-CNN • save time compared to traditional algorithms like Selective Search. • It still uses the Selective Search Algorithm which is slow and a time- consuming process. 4 SSD • SSD makes more predictions. • It has better coverage on location, scale, and aspect ratios. • Shallow layers in a neural network may not generate enough high level features to do prediction for small objects. 5 YOLO • Allows real time object detection. • System trains in single go. • More efficient and fast. • Struggles to detect close objects because each grid can propose only 2 bounding boxes. EXISTING ALGORITHM
  • 12.  The YOLOv4 performance was evaluated based on previous YOLO versions (YOLOv3 and YOLOv2)as baselines.  The new YOLOv4 shows the best speed-to-accuracy balance compared to state-of-the-art object detectors.  In general, YOLOv4 surpasses all previous object detectors in terms of both speed and accuracy, ranging from 5 FPS to as much as 160 FPS.  The YOLO v4 algorithm achieves the highest accuracy among all other real-time object detection models – while achieving 30 FPS or higher using a GPU. ALGORITHM SELECTION
  • 13. YOLO algorithm YOLO is an abbreviation for the term 'You Only Look Once’. Created by Joseph Redmon, Santosh Divvala, Ross Girshick and Ali Farhadi. YOLO algorithm detects and recognizes various objects in the picture.  Object detection in YOLO is done as a regression problem and provides the class probabilities of the detected images  Prediction in the entire image is done in a single algorithmic run. YOLO algorithm consists of various variants including tiny YOLO and YOLOv1, v2, v3, v4.  Popular because of its speed and accuracy.
  • 14. YOLO EVOLUTION Algorithm — Description. The original YOLO: the first object detection network to combine drawing bounding boxes and identifying class labels in one end-to-end differentiable network. YOLOv2: made a number of iterative improvements on top of YOLO, including BatchNorm, higher input resolution, and anchor boxes. YOLOv3: built upon the previous models by adding an objectness score to bounding box prediction, adding connections to the backbone network layers, and making predictions at three separate levels of granularity to improve performance on smaller objects. YOLOv4: a one-stage detector with several components that detects objects in real time, faster and more accurate than the earlier versions.
  • 15. YOLOv4 ARCHITECTURE A one-stage detector is organized as: Input -> Backbone -> Neck -> Dense prediction (head). In YOLOv4 the backbone is CSPDarknet53 and the neck is SPP + PAN, trained with Bag of Freebies (BoF) and Bag of Specials (BoS): YOLOv4 = CSPDarknet53 + SPP + BoF + BoS
  • 16. CSP DARKNET53  CSPDarknet53 is a convolutional neural network used as the backbone for object detection.  It partitions the feature map of the image into two parts and then merges them through a cross-stage hierarchy.  This split-and-merge strategy allows more gradient flow through the network.
  • 17. SPATIAL PYRAMID POOLING  A CNN consists of some convolutional (Conv) layers followed by some fully-connected (FC) layers. Conv layers can handle inputs of any size, but the FC layers that follow them require a fixed-size input.  The solution to this problem is the Spatial Pyramid Pooling (SPP) layer. Placed between the last Conv layer and the first FC layer, it removes the fixed-size constraint of the network.  The goal of the SPP layer is to pool the variable-size features coming from the Conv layer and generate fixed-length outputs that are then fed to the first FC layer of the network.
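Why SPP produces a fixed-length output can be sketched with a little arithmetic. This is an illustration, not the exact YOLOv4 configuration: the pyramid levels {1x1, 2x2, 4x4} are a common choice taken as an assumption here.

```python
# Sketch of why Spatial Pyramid Pooling yields a fixed-length vector.
# Each pyramid level pools the whole feature map into bins x bins cells,
# so the concatenated output length depends only on the channel count and
# the bin sizes, never on the feature map's spatial size.

def spp_output_length(num_channels, bins=(1, 2, 4)):
    """Length of the concatenated SPP output for any input spatial size."""
    return num_channels * sum(b * b for b in bins)

# A 13x13x256 map and a 19x19x256 map both produce the same length:
print(spp_output_length(256))  # 256 * (1 + 4 + 16) = 5376
```

Because this length is constant, the first FC layer can be wired up once, regardless of the image size fed to the Conv layers.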
  • 18. BAG OF FREEBIES AND BAG OF SPECIALS  'Bag of Freebies' (BoF) is a general framework of training strategies for improving the overall accuracy of an object detection model: the set of techniques that change the training strategy or training cost to improve model accuracy.  'Bag of Specials' (BoS) can be considered an add-on for any existing object detector to make it more accurate.
  • 19. METHODOLOGY The steps of the object recognition system, based on image processing, are as follows:  Image capturing  Image acquisition  Object detection  YOLO algorithm  Prediction
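The steps above can be sketched as a single pipeline. Every function body here is a hypothetical placeholder: a real implementation would read frames from the Pi camera, run YOLO inference, and speak through a text-to-speech engine.

```python
# Minimal pipeline sketch for the methodology steps above.
# All names and bodies are illustrative stand-ins, not the real system.

def capture_image():
    return "frame"                      # stand-in for a camera frame

def acquire(frame):
    return frame                        # e.g. resize / RGB-to-gray conversion

def detect_objects(frame):
    return ["person"]                   # stand-in for YOLO inference

def speak(labels):
    return "Detected: " + ", ".join(labels)  # stand-in for audio output

def pipeline():
    frame = acquire(capture_image())
    return speak(detect_objects(frame))

print(pipeline())  # Detected: person
```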
  • 20. BLOCK DIAGRAM Start -> Capture image -> YOLO algorithm (residual blocks -> localization -> bounding boxes -> target label y -> intersection over union -> non-max suppression) -> Prediction -> Output as audio
  • 21. CAPTURING IMAGE  The image is captured by the camera module; objects can be captured both in real time and when stationary.
  • 22. IMAGE ACQUISITION  The image is captured by the digital camera as an RGB image and converted to a grayscale version using the average-intensity equation: I = (R + G + B) / 3
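The conversion can be shown directly. Note this is the simple per-pixel average from the equation above; libraries such as OpenCV instead use weighted luma coefficients, so real output would differ slightly.

```python
# Average-intensity grayscale conversion, I = (R + G + B) / 3,
# applied to each pixel of a nested-list RGB image.

def to_grayscale(rgb_image):
    """rgb_image: rows of (R, G, B) tuples -> rows of intensity values."""
    return [[(r + g + b) / 3 for (r, g, b) in row] for row in rgb_image]

img = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (255, 255, 255)]]
print(to_grayscale(img))  # [[85.0, 85.0], [85.0, 255.0]]
```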
  • 23. RESIDUAL BLOCKS The image is divided into an S x S grid of equal-sized cells; grid dimensions of 3 x 3, 13 x 13 and 19 x 19 are used. Every grid cell detects the objects that appear within it.
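"Every grid cell detects objects that appear within it" means the cell containing an object's center is responsible for that object. A small helper (an illustrative assumption, not code from the project) makes the mapping concrete:

```python
# Map an object's center point to the S x S grid cell responsible for it.
# Illustrative helper; coordinates are in pixels of the input image.

def responsible_cell(cx, cy, img_w, img_h, S):
    """Return (row, col) of the grid cell containing the center (cx, cy)."""
    col = min(int(cx / img_w * S), S - 1)  # clamp points on the far edge
    row = min(int(cy / img_h * S), S - 1)
    return row, col

# For a 416x416 image and a 13x13 grid, each cell covers 32x32 pixels:
print(responsible_cell(208, 100, 416, 416, 13))  # (3, 6)
```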
  • 24. LOCALIZATION The term 'localization' refers to where in the image the object is present. In YOLO we perform classification with localization, i.e., a supervised learning algorithm is trained not only to predict the class but also the bounding box around the object in the image. Classification + localization = object detection
  • 25. BOUNDING BOXES A bounding box is an outline that highlights an object in an image. Every bounding box consists of the following attributes: • Bounding box center (bx, by) • Height (bh) • Width (bw) • Class (for example, person, car, traffic light), represented by the letter c.
  • 26. BOUNDING BOXES - CONT... Each cell of the 13x13 feature map detects objects in the input image via its specified number of bounding boxes. In YOLOv4, each cell predicts 3 bounding boxes, so the total number of bounding boxes from the 13x13 feature map is (13x13)x3 = 507. Bounding boxes that do not localize an object in the picture are discarded.
  • 27. TARGET LABEL Y The target label y for this supervised learning task is a vector containing Pc, bx, by, bh, bw, c1, ..., cn. Pc is the probability that an object of some class is present in the grid cell, with 0 <= Pc <= 1: Pc = 0 means no object is found, and Pc = 1 means the object is present with 100% certainty. (bx, by) defines the mid-point of the object, and (bh, bw) the height and width of its bounding box. When Pc > 0, the n values c1, ..., cn represent the classes of the objects present in the image.
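Building y for one grid cell can be sketched as follows. The three class names are illustrative (borrowed from the bounding-box slide), not the project's actual class list.

```python
# Construct the target label y = [Pc, bx, by, bh, bw, c1, ..., cn]
# for a single grid cell. Class list is an illustrative assumption.

CLASSES = ["person", "car", "traffic light"]

def make_target(present, box=None, class_name=None):
    if not present:                       # Pc = 0: no object in this cell
        return [0.0] * (5 + len(CLASSES))
    one_hot = [1.0 if c == class_name else 0.0 for c in CLASSES]
    bx, by, bh, bw = box
    return [1.0, bx, by, bh, bw] + one_hot

print(make_target(True, box=(0.5, 0.5, 0.2, 0.3), class_name="car"))
# [1.0, 0.5, 0.5, 0.2, 0.3, 0.0, 1.0, 0.0]
```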
  • 28. INTERSECTION OVER UNION (IOU) Intersection over Union is a measure of the extent of overlap between two boxes: the greater the region of overlap, the greater the IOU. It is mainly used in applications related to object detection, where a model is trained to output a box that fits as tightly as possible around an object. IOU is also used in the non-max suppression algorithm. IOU = Intersection (area of overlap) / Union (total area covered by both boxes)
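The ratio is easy to compute for axis-aligned boxes. A minimal sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates:

```python
# Intersection over Union for two axis-aligned boxes (x1, y1, x2, y2).

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1) # 0 if boxes don't overlap
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7 -> ~0.143
```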
  • 29. NMS - NON-MAX SUPPRESSION To select the best bounding box from the multiple predicted bounding boxes, an algorithm called Non-Max Suppression is used to "suppress" the less likely boxes and keep only the best one.
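A minimal greedy NMS sketch: keep the highest-scoring box, drop every remaining box that overlaps it beyond a threshold, and repeat. Boxes here are assumed to be (x1, y1, x2, y2, score) tuples; the IoU helper is restated so the sketch is self-contained.

```python
# Greedy Non-Max Suppression over (x1, y1, x2, y2, score) boxes.

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def nms(boxes, iou_threshold=0.5):
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    while boxes:
        best = boxes.pop(0)               # highest-scoring remaining box
        kept.append(best)
        # suppress boxes overlapping the kept box too strongly
        boxes = [b for b in boxes if iou(best, b) <= iou_threshold]
    return kept

dets = [(0, 0, 10, 10, 0.9), (1, 1, 10, 10, 0.8), (50, 50, 60, 60, 0.7)]
print(nms(dets))  # the 0.8 box overlaps the 0.9 box and is suppressed
```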
  • 30. PREDICTION YOLOv4 makes detections at 3 different points, i.e., layers 82, 94 and 106, where the network has down-sampled the input image by strides of 32, 16 and 8 respectively. After a stride of 32, the network produces a 13x13 feature map for an input image of size 416x416; at the detection layer with stride 16 we obtain a 26x26 feature map, and at stride 8 a 52x52 feature map. Thus the total number of bounding boxes produced by YOLOv4 for a 416x416 input image is ((13x13) + (26x26) + (52x52)) x 3 = 10,647.
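The count above follows directly from the strides. A quick check, parameterized by input size:

```python
# Bounding-box count across YOLOv4's three detection scales:
# strides 32, 16, 8 and 3 anchor boxes per grid cell.

def total_boxes(img_size=416, strides=(32, 16, 8), anchors_per_cell=3):
    return sum((img_size // s) ** 2 * anchors_per_cell for s in strides)

print(total_boxes())  # (13*13 + 26*26 + 52*52) * 3 = 10647
```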
  • 31. DATABASE USED COCO dataset – COCO stands for "Common Objects In Context". It is a large-scale image dataset containing 328,000 images of everyday objects and humans, with annotations that machine learning models use to learn to recognize, label, and describe objects. COCO provides the following types of annotations: • Object detection • Captioning • Key points • Dense pose
  • 32. CONTD:  Object detection has various approaches, such as Fast R-CNN, RetinaNet, and sliding-window detection, but none of these methods can detect objects in one single run. This is where the more efficient and faster YOLO algorithm comes in.
  • 33. FLOW CHART Start -> Capture image -> Image captured correctly? (No: send error message and capture again; Yes: continue) -> Processing with the deep learning algorithm -> Object predicted? (No: process again; Yes: continue) -> Object recognition -> Output in audio format
  • 34. Hardware Raspberry pi 3B+ Camera module v2 Jumper wires Speaker Button
  • 35. RASPBERRY PI 3B+ The Raspberry Pi 3 Model B+ is the latest product in the Raspberry Pi 3 range, boasting a 64-bit quad-core processor running at 1.4GHz, dual-band 2.4GHz and 5GHz wireless LAN, and Bluetooth 4.2/BLE.
  • 36. CAMERA MODULE V2 The Raspberry Pi Camera v2 is a custom-designed add-on board for the Raspberry Pi built around a high-quality 8-megapixel Sony IMX219 image sensor, featuring a fixed-focus lens.
  • 37. ADVANTAGES OF THE YOLO ALGORITHM The YOLO algorithm is important for the following reasons: Speed: the algorithm improves detection speed because it can predict objects in real time. High accuracy: YOLO is a predictive technique that provides accurate results. It uses a convolutional implementation, which means that with a 3x3 grid (i.e., the image divided into 9 grid cells) you do not need to run the algorithm 9 times to check for an object in each grid cell; a single convolutional pass covers all cells at once. Learning capabilities: the algorithm has excellent learning capabilities that enable it to learn representations of objects and apply them in object detection.
  • 38. SURVEYS  According to the National Federation of the Blind, blind people can use such devices easily, so they can also use our object recognition system.[5]
  • 39. ADVANTAGES This work is implemented using PTTS. Easy to set up. Open-source tools were used for this project. Cheap and cost-efficient. The project runs entirely on the device; no extra equipment needs to be purchased.
  • 40. CONCLUSION A simple object recognition system based on the YOLO algorithm has been proposed. The system has been written using OpenCV.
  • 41. FUTURE WORK Enhancing accuracy by building a model of features for each object class. Working on using local features instead of template matching. Selecting the best frame to be processed for the runtime application. Adding more objects to the database.
  • 42. REFERENCES 1. https://www.researchgate.net/publication/334811299_Real-Time_Objects_Recognition_Approach_for_Assisting_Blind_People 2. https://www.researchgate.net/publication/334811299_Real-Time_Objects_Recognition_Approach_for_Assisting_Blind_People 3. https://www.researchgate.net/publication/235987140_An_obstacle_detection_system_for_blind_people 4. https://www.who.int/news-room/fact-sheets/detail/blindness-and-visual-impairment#:~:text=Prevalence,near%20or%20distance%20vision%20impairment. 5. https://www.irjet.net/archives/V5/i2/IRJET-V5I2249.pdf