SlideShare a Scribd company logo
Object Detection in Videos
Team 8:
Priyesh Kaushik
Pranay Mankad
Introduction
• Videos are basically multiple frames in a sequence which have several
objects in them at any given moment. Machine learning can be used
to identify these objects and make them searchable using tags.
Our Scope
• Identify objects in videos
• Listing objects
• Localizing them per frame and
• Bounding them with boxes
Our Approach - Dataset
• We chose our dataset based on observations of mean objects per
image. We observed that the maximum were in the MS-COCO
dataset.
Approach – Selecting Model
• There are several models available for making Convoluted Neural
Networks. Based on research we found that Faster R-CNN and The
SSD (Single-Shot Multibox Detector) are highly efficient at detecting
objects in frames.
• Based on comparitive results we decided to go with the SSD model,
with the coco-dataset.
Comparative Analysis
Model Name Speed (ms) COCO mAP [^1]
ssd_mobilenet_v1_coco 30 21
ssd_inception_v2_coco 42 24
faster_rcnn_inception_v2_coco 58 28
faster_rcnn_resnet50_coco 89 30
mAP is the mean average precision that is calculated for the basis of classification.
After the comparative analysis, we decided on using the SSD_MOBILENET_V1_COCO. Here are some details
about what we’re dealing with.
CNN – a birds eye view
Object Detection – a birds eye view
Single Shot Multibox Detection Specifics
• Takes inputs of 300x300
• Training requires image and the ground bounding boxes
• Performs non-maximum suppression internally
SSD v/s The Rest
On the basis of a different dataset, but proportions stay the same with COCO.
Transfer Learning
• We performed transfer learning over the SSD model, using Python,
LXML, LabelImg, Paperspace and Tensorflow.
• Steps involved were:
• Gathering Images for custom objects,
• Drawing bounding box for images,
• Generating an XML with dimensions for the bounding box,
• Using Tensorflow to train model on the object,
• Used Paperspace for utilizing a GPU.
• Used Tensorboard to monitor accuracy at various iterations.
How we made it
• We started off using openCV for capturing videos and rendering as
images.
• But openCV was harder to configure on cloud platforms as an API for
accessing web camera footage, which was a goal.
• So here’s what we followed.
Flask
Application
Client
Side
WebRTC Image
Stream
Start Object
Detection
Client
Side
Classify and
Box Images
Return
Coordinates
Render on
Browser using
JS
What we could do with this
Further down the line
• This application can be used in inventory management using
computer vision. We see segmentation as a possibility for bring smart
checkouts to convenience stores that may not be as heavy on
infrastructure as Amazon or competition.
• Achieve better performance by pruning the model.
Work Allocation
• We split the work almost equally across all fields.
Priyesh Kaushik Pranay Mankad
Implementation 50% Implementation 50%
Model Training Custom Object Training
Web Interfacing 25% Web Interfacing 75%
Transfer Learning 75% Transfer Learning 24%
Documentation 50% Documentation 50%
Presentation 49% Presentation 49%
Have Questions?
• Thank you!

More Related Content

What's hot

Anomaly detection in deep learning (Updated) English
Anomaly detection in deep learning (Updated) EnglishAnomaly detection in deep learning (Updated) English
Anomaly detection in deep learning (Updated) English
Adam Gibson
 
Open Source vs. Open Standards by Sage Weil
Open Source vs. Open Standards by Sage WeilOpen Source vs. Open Standards by Sage Weil
Open Source vs. Open Standards by Sage Weil
Red_Hat_Storage
 
Open Stack Cheng Du Swift Alex Yang
Open Stack Cheng Du Swift Alex YangOpen Stack Cheng Du Swift Alex Yang
Open Stack Cheng Du Swift Alex Yang
OpenCity Community
 
CEPH technical analysis 2014
CEPH technical analysis 2014CEPH technical analysis 2014
CEPH technical analysis 2014
Erwan Quigna
 
"Using the OpenCL C Kernel Language for Embedded Vision Processors," a Presen...
"Using the OpenCL C Kernel Language for Embedded Vision Processors," a Presen..."Using the OpenCL C Kernel Language for Embedded Vision Processors," a Presen...
"Using the OpenCL C Kernel Language for Embedded Vision Processors," a Presen...
Edge AI and Vision Alliance
 
2016 08-05 - Intro to OpenStack
2016 08-05 - Intro to OpenStack2016 08-05 - Intro to OpenStack
2016 08-05 - Intro to OpenStack
Alfonso Peletier
 
The Old New Crash: Cloud Memory Dump Analysis
The Old New Crash: Cloud Memory Dump AnalysisThe Old New Crash: Cloud Memory Dump Analysis
The Old New Crash: Cloud Memory Dump Analysis
Dmitry Vostokov
 
Modeling Catastrophic Events in Spark: Spark Summit East Talk by Georg Hofman...
Modeling Catastrophic Events in Spark: Spark Summit East Talk by Georg Hofman...Modeling Catastrophic Events in Spark: Spark Summit East Talk by Georg Hofman...
Modeling Catastrophic Events in Spark: Spark Summit East Talk by Georg Hofman...
Spark Summit
 
Long running aws lambda - Joel Schuweiler, Minneapolis
Long running aws lambda -  Joel Schuweiler, MinneapolisLong running aws lambda -  Joel Schuweiler, Minneapolis
Long running aws lambda - Joel Schuweiler, Minneapolis
AWS Chicago
 
Dell openstack cloud with inktank ceph – large scale customer deployment
Dell openstack cloud with inktank ceph – large scale customer deploymentDell openstack cloud with inktank ceph – large scale customer deployment
Dell openstack cloud with inktank ceph – large scale customer deployment
Kamesh Pemmaraju
 
Disaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoFDisaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoF
ShapeBlue
 
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
Romeo Kienzler
 
Object Storage in a Cloud-Native Container Envirnoment
Object Storage in a Cloud-Native Container EnvirnomentObject Storage in a Cloud-Native Container Envirnoment
Object Storage in a Cloud-Native Container Envirnoment
Minio
 
Server-less solution for moving Millions of Images in Cloud - Brett Sutter, ...
 Server-less solution for moving Millions of Images in Cloud - Brett Sutter, ... Server-less solution for moving Millions of Images in Cloud - Brett Sutter, ...
Server-less solution for moving Millions of Images in Cloud - Brett Sutter, ...
AWS Chicago
 
Running a Massively Parallel Self-serve Distributed Data System At Scale
Running a Massively Parallel Self-serve Distributed Data System At ScaleRunning a Massively Parallel Self-serve Distributed Data System At Scale
Running a Massively Parallel Self-serve Distributed Data System At Scale
Zhenzhong Xu
 
Tensorflow vs MxNet
Tensorflow vs MxNetTensorflow vs MxNet
Tensorflow vs MxNet
Ashish Bansal
 
DotNet 2019 | Javier Cantón - Writing high performance code in NetCore 3.0
DotNet 2019 | Javier Cantón - Writing high performance code in NetCore 3.0DotNet 2019 | Javier Cantón - Writing high performance code in NetCore 3.0
DotNet 2019 | Javier Cantón - Writing high performance code in NetCore 3.0
Plain Concepts
 
Writing high performance code in NetCore 3.0
Writing high performance code in NetCore 3.0Writing high performance code in NetCore 3.0
Writing high performance code in NetCore 3.0
Javier Cantón Ferrero
 
Scaling drupal on amazon web services dr
Scaling drupal on amazon web services drScaling drupal on amazon web services dr
Scaling drupal on amazon web services dr
Tristan Roddis
 
Introduce_non-volatile_generic_object_programming_model_for_In-Memory_Computing
Introduce_non-volatile_generic_object_programming_model_for_In-Memory_ComputingIntroduce_non-volatile_generic_object_programming_model_for_In-Memory_Computing
Introduce_non-volatile_generic_object_programming_model_for_In-Memory_Computing
YanpingWang
 

What's hot (20)

Anomaly detection in deep learning (Updated) English
Anomaly detection in deep learning (Updated) EnglishAnomaly detection in deep learning (Updated) English
Anomaly detection in deep learning (Updated) English
 
Open Source vs. Open Standards by Sage Weil
Open Source vs. Open Standards by Sage WeilOpen Source vs. Open Standards by Sage Weil
Open Source vs. Open Standards by Sage Weil
 
Open Stack Cheng Du Swift Alex Yang
Open Stack Cheng Du Swift Alex YangOpen Stack Cheng Du Swift Alex Yang
Open Stack Cheng Du Swift Alex Yang
 
CEPH technical analysis 2014
CEPH technical analysis 2014CEPH technical analysis 2014
CEPH technical analysis 2014
 
"Using the OpenCL C Kernel Language for Embedded Vision Processors," a Presen...
"Using the OpenCL C Kernel Language for Embedded Vision Processors," a Presen..."Using the OpenCL C Kernel Language for Embedded Vision Processors," a Presen...
"Using the OpenCL C Kernel Language for Embedded Vision Processors," a Presen...
 
2016 08-05 - Intro to OpenStack
2016 08-05 - Intro to OpenStack2016 08-05 - Intro to OpenStack
2016 08-05 - Intro to OpenStack
 
The Old New Crash: Cloud Memory Dump Analysis
The Old New Crash: Cloud Memory Dump AnalysisThe Old New Crash: Cloud Memory Dump Analysis
The Old New Crash: Cloud Memory Dump Analysis
 
Modeling Catastrophic Events in Spark: Spark Summit East Talk by Georg Hofman...
Modeling Catastrophic Events in Spark: Spark Summit East Talk by Georg Hofman...Modeling Catastrophic Events in Spark: Spark Summit East Talk by Georg Hofman...
Modeling Catastrophic Events in Spark: Spark Summit East Talk by Georg Hofman...
 
Long running aws lambda - Joel Schuweiler, Minneapolis
Long running aws lambda -  Joel Schuweiler, MinneapolisLong running aws lambda -  Joel Schuweiler, Minneapolis
Long running aws lambda - Joel Schuweiler, Minneapolis
 
Dell openstack cloud with inktank ceph – large scale customer deployment
Dell openstack cloud with inktank ceph – large scale customer deploymentDell openstack cloud with inktank ceph – large scale customer deployment
Dell openstack cloud with inktank ceph – large scale customer deployment
 
Disaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoFDisaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoF
 
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
Intro to DeepLearning4J on ApacheSpark SDS DL Workshop 16
 
Object Storage in a Cloud-Native Container Envirnoment
Object Storage in a Cloud-Native Container EnvirnomentObject Storage in a Cloud-Native Container Envirnoment
Object Storage in a Cloud-Native Container Envirnoment
 
Server-less solution for moving Millions of Images in Cloud - Brett Sutter, ...
 Server-less solution for moving Millions of Images in Cloud - Brett Sutter, ... Server-less solution for moving Millions of Images in Cloud - Brett Sutter, ...
Server-less solution for moving Millions of Images in Cloud - Brett Sutter, ...
 
Running a Massively Parallel Self-serve Distributed Data System At Scale
Running a Massively Parallel Self-serve Distributed Data System At ScaleRunning a Massively Parallel Self-serve Distributed Data System At Scale
Running a Massively Parallel Self-serve Distributed Data System At Scale
 
Tensorflow vs MxNet
Tensorflow vs MxNetTensorflow vs MxNet
Tensorflow vs MxNet
 
DotNet 2019 | Javier Cantón - Writing high performance code in NetCore 3.0
DotNet 2019 | Javier Cantón - Writing high performance code in NetCore 3.0DotNet 2019 | Javier Cantón - Writing high performance code in NetCore 3.0
DotNet 2019 | Javier Cantón - Writing high performance code in NetCore 3.0
 
Writing high performance code in NetCore 3.0
Writing high performance code in NetCore 3.0Writing high performance code in NetCore 3.0
Writing high performance code in NetCore 3.0
 
Scaling drupal on amazon web services dr
Scaling drupal on amazon web services drScaling drupal on amazon web services dr
Scaling drupal on amazon web services dr
 
Introduce_non-volatile_generic_object_programming_model_for_In-Memory_Computing
Introduce_non-volatile_generic_object_programming_model_for_In-Memory_ComputingIntroduce_non-volatile_generic_object_programming_model_for_In-Memory_Computing
Introduce_non-volatile_generic_object_programming_model_for_In-Memory_Computing
 

Similar to ADS Team 8 Final Presentation

kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
AshrafDabbas1
 
Object Detection for Autonomous Cars using AI/ML
Object Detection for Autonomous Cars using AI/MLObject Detection for Autonomous Cars using AI/ML
Object Detection for Autonomous Cars using AI/ML
IRJET Journal
 
Real Time Object Dectection using machine learning
Real Time Object Dectection using machine learningReal Time Object Dectection using machine learning
Real Time Object Dectection using machine learning
pratik pratyay
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Ganesan Narayanasamy
 
slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
SharanrajK22MMT1003
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NET
Marco Parenzan
 
Anomaly Detection with Azure and .net
Anomaly Detection with Azure and .netAnomaly Detection with Azure and .net
Anomaly Detection with Azure and .net
Marco Parenzan
 
Mongo DB at Community Engine
Mongo DB at Community EngineMongo DB at Community Engine
Mongo DB at Community Engine
Community Engine
 
MongoDB at community engine
MongoDB at community engineMongoDB at community engine
MongoDB at community engine
mathraq
 
Object Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNetObject Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNet
IRJET Journal
 
OpenCV @ Droidcon 2012
OpenCV @ Droidcon 2012OpenCV @ Droidcon 2012
OpenCV @ Droidcon 2012
Wingston
 
MongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and QueueingMongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and Queueing
Boxed Ice
 
DCEU 18: Use Cases and Practical Solutions for Docker Container Storage on Sw...
DCEU 18: Use Cases and Practical Solutions for Docker Container Storage on Sw...DCEU 18: Use Cases and Practical Solutions for Docker Container Storage on Sw...
DCEU 18: Use Cases and Practical Solutions for Docker Container Storage on Sw...
Docker, Inc.
 
AWS re:Invent 2016: Deep Learning at Cloud Scale: Improving Video Discoverabi...
AWS re:Invent 2016: Deep Learning at Cloud Scale: Improving Video Discoverabi...AWS re:Invent 2016: Deep Learning at Cloud Scale: Improving Video Discoverabi...
AWS re:Invent 2016: Deep Learning at Cloud Scale: Improving Video Discoverabi...
Amazon Web Services
 
Deep learning on mobile
Deep learning on mobileDeep learning on mobile
Deep learning on mobile
Anirudh Koul
 
odtslide-180529073940.pptx
odtslide-180529073940.pptxodtslide-180529073940.pptx
odtslide-180529073940.pptx
ahmedchammam
 
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
AMD Developer Central
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learning
Sushant Shrivastava
 
Machine Learning Inference at the Edge
Machine Learning Inference at the EdgeMachine Learning Inference at the Edge
Machine Learning Inference at the Edge
Amazon Web Services
 
What multimodal foundation models cannot perceive
What multimodal foundation models cannot perceiveWhat multimodal foundation models cannot perceive
What multimodal foundation models cannot perceive
University of Amsterdam
 

Similar to ADS Team 8 Final Presentation (20)

kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
 
Object Detection for Autonomous Cars using AI/ML
Object Detection for Autonomous Cars using AI/MLObject Detection for Autonomous Cars using AI/ML
Object Detection for Autonomous Cars using AI/ML
 
Real Time Object Dectection using machine learning
Real Time Object Dectection using machine learningReal Time Object Dectection using machine learning
Real Time Object Dectection using machine learning
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
 
slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NET
 
Anomaly Detection with Azure and .net
Anomaly Detection with Azure and .netAnomaly Detection with Azure and .net
Anomaly Detection with Azure and .net
 
Mongo DB at Community Engine
Mongo DB at Community EngineMongo DB at Community Engine
Mongo DB at Community Engine
 
MongoDB at community engine
MongoDB at community engineMongoDB at community engine
MongoDB at community engine
 
Object Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNetObject Detetcion using SSD-MobileNet
Object Detetcion using SSD-MobileNet
 
OpenCV @ Droidcon 2012
OpenCV @ Droidcon 2012OpenCV @ Droidcon 2012
OpenCV @ Droidcon 2012
 
MongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and QueueingMongoDB Tokyo - Monitoring and Queueing
MongoDB Tokyo - Monitoring and Queueing
 
DCEU 18: Use Cases and Practical Solutions for Docker Container Storage on Sw...
DCEU 18: Use Cases and Practical Solutions for Docker Container Storage on Sw...DCEU 18: Use Cases and Practical Solutions for Docker Container Storage on Sw...
DCEU 18: Use Cases and Practical Solutions for Docker Container Storage on Sw...
 
AWS re:Invent 2016: Deep Learning at Cloud Scale: Improving Video Discoverabi...
AWS re:Invent 2016: Deep Learning at Cloud Scale: Improving Video Discoverabi...AWS re:Invent 2016: Deep Learning at Cloud Scale: Improving Video Discoverabi...
AWS re:Invent 2016: Deep Learning at Cloud Scale: Improving Video Discoverabi...
 
Deep learning on mobile
Deep learning on mobileDeep learning on mobile
Deep learning on mobile
 
odtslide-180529073940.pptx
odtslide-180529073940.pptxodtslide-180529073940.pptx
odtslide-180529073940.pptx
 
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learning
 
Machine Learning Inference at the Edge
Machine Learning Inference at the EdgeMachine Learning Inference at the Edge
Machine Learning Inference at the Edge
 
What multimodal foundation models cannot perceive
What multimodal foundation models cannot perceiveWhat multimodal foundation models cannot perceive
What multimodal foundation models cannot perceive
 

Recently uploaded

ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
pavan998932
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
Neo4j
 
E-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet DynamicsE-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet Dynamics
Hornet Dynamics
 
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdfRevolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Undress Baby
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Neo4j
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
Octavian Nadolu
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
Peter Muessig
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Łukasz Chruściel
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
TheSMSPoint
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
lorraineandreiamcidl
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
Green Software Development
 
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
kalichargn70th171
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
Hironori Washizaki
 
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Envertis Software Solutions
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
Łukasz Chruściel
 

Recently uploaded (20)

ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 
E-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet DynamicsE-commerce Development Services- Hornet Dynamics
E-commerce Development Services- Hornet Dynamics
 
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdfRevolutionizing Visual Effects Mastering AI Face Swaps.pdf
Revolutionizing Visual Effects Mastering AI Face Swaps.pdf
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
 
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
 
Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
 
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise EditionWhy Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
Why Choose Odoo 17 Community & How it differs from Odoo 17 Enterprise Edition
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 

ADS Team 8 Final Presentation

  • 1. Object Detection in Videos Team 8: Priyesh Kaushik Pranay Mankad
  • 2. Introduction • Videos are basically multiple frames in a sequence which have several objects in them at any given moment. Machine learning can be used to identify these objects and make them searchable using tags.
  • 3. Our Scope • Identify objects in videos • Listing objects • Localizing them per frame and • Bounding them with boxes
  • 4. Our Approach - Dataset • We chose our dataset based on observations of mean objects per image. We observed that the maximum were in the MS-COCO dataset.
  • 5. Approach – Selecting Model • There are several models available for making Convoluted Neural Networks. Based on research we found that Faster R-CNN and The SSD (Single-Shot Multibox Detector) are highly efficient at detecting objects in frames. • Based on comparitive results we decided to go with the SSD model, with the coco-dataset.
  • 6. Comparative Analysis Model Name Speed (ms) COCO mAP [^1] ssd_mobilenet_v1_coco 30 21 ssd_inception_v2_coco 42 24 faster_rcnn_inception_v2_coco 58 28 faster_rcnn_resnet50_coco 89 30 mAP is the mean average precision that is calculated for the basis of classification. After the comparative analysis, we decided on using the SSD_MOBILENET_V1_COCO. Here are some details about what we’re dealing with.
  • 7. CNN – a birds eye view
  • 8. Object Detection – a birds eye view
  • 9. Single Shot Multibox Detection Specifics • Takes inputs of 300x300 • Training requires image and the ground bounding boxes • Performs non-maximum suppression internally
  • 10. SSD v/s The Rest On the basis of a different dataset, but proportions stay the same with COCO.
  • 11. Transfer Learning • We performed transfer learning over the SSD model, using Python, LXML, LabelImg, Paperspace and Tensorflow. • Steps involved were: • Gathering Images for custom objects, • Drawing bounding box for images, • Generating an XML with dimensions for the bounding box, • Using Tensorflow to train model on the object, • Used Paperspace for utilizing a GPU. • Used Tensorboard to monitor accuracy at various iterations.
  • 12.
  • 13.
  • 14. How we made it • We started off using openCV for capturing videos and rendering as images. • But openCV was harder to configure on cloud platforms as an API for accessing web camera footage, which was a goal. • So here’s what we followed. Flask Application Client Side WebRTC Image Stream Start Object Detection Client Side Classify and Box Images Return Coordinates Render on Browser using JS
  • 15. What we could do with this
  • 16. Further down the line • This application can be used in inventory management using computer vision. We see segmentation as a possibility for bring smart checkouts to convenience stores that may not be as heavy on infrastructure as Amazon or competition. • Achieve better performance by pruning the model.
  • 17. Work Allocation • We split the work almost equally across all fields. Priyesh Kaushik Pranay Mankad Implementation 50% Implementation 50% Model Training Custom Object Training Web Interfacing 25% Web Interfacing 75% Transfer Learning 75% Transfer Learning 24% Documentation 50% Documentation 50% Presentation 49% Presentation 49%