ADS Team 8 Final Presentation

The document describes a project to perform object detection in videos. The team's scope was to identify, list, localize and bound objects in video frames using machine learning. They chose the MS-COCO dataset and the SSD model for its efficiency and speed at object detection. A comparative analysis found SSD_MOBILENET_V1_COCO to have the best balance of speed and accuracy. The team performed transfer learning to customize the model for new object types. They developed a web application using Flask that streams video frames from the client to perform object detection and returns bounding box coordinates.

Object Detection in Videos
Team 8:
Priyesh Kaushik
Pranay Mankad

Introduction
• A video is a sequence of frames, each of which can contain several
objects at any given moment. Machine learning can be used
to identify these objects and make the video searchable using tags.

Our Scope
• Identify objects in videos
• List the identified objects
• Localize them per frame
• Bound them with boxes

Our Approach - Dataset
• We chose our dataset based on the mean number of objects per
image. We observed that this value was highest for the MS-COCO
dataset.
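
As a rough illustration of the selection criterion above, the mean number of objects per image can be computed from a COCO-style annotation file. This is a minimal sketch: the `images` and `annotations` keys follow the COCO annotation layout, while the sample data and the function name are made up for the example and are not the team's actual code.

```python
def mean_objects_per_image(coco: dict) -> float:
    """Average number of annotated objects per image in a COCO-style dict."""
    n_images = len(coco["images"])
    n_objects = len(coco["annotations"])
    return n_objects / n_images if n_images else 0.0


# Tiny hand-written sample in the COCO layout (hypothetical, not real data).
sample = {
    "images": [{"id": 1}, {"id": 2}],
    "annotations": [
        {"image_id": 1, "category_id": 18},
        {"image_id": 1, "category_id": 1},
        {"image_id": 2, "category_id": 3},
    ],
}

print(mean_objects_per_image(sample))  # 3 objects over 2 images -> 1.5
```

Running the same function over the annotation files of each candidate dataset would give the numbers to compare.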

Approach – Selecting Model
• There are several model architectures available for building Convolutional Neural
Networks. Based on our research, we found that Faster R-CNN and
SSD (Single-Shot Multibox Detector) are highly efficient at detecting
objects in frames.
• Based on comparative results, we decided to go with the SSD model,
trained on the COCO dataset.

Comparative Analysis
Model Name                      Speed (ms)   COCO mAP [^1]
ssd_mobilenet_v1_coco           30           21
ssd_inception_v2_coco           42           24
faster_rcnn_inception_v2_coco   58           28
faster_rcnn_resnet50_coco       89           30
[^1] mAP is the mean average precision, the measure of detection accuracy used for this comparison.
After the comparative analysis, we decided to use SSD_MOBILENET_V1_COCO. Here are some details
about the model.
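
To make the speed column easier to relate to video, the per-frame times can be converted to frames per second. The sketch below only does this arithmetic on the figures from the table; the hardware behind those timings is not stated, so the frame rates are indicative only.

```python
# (speed in ms per frame, COCO mAP), copied from the comparison table.
models = {
    "ssd_mobilenet_v1_coco": (30, 21),
    "ssd_inception_v2_coco": (42, 24),
    "faster_rcnn_inception_v2_coco": (58, 28),
    "faster_rcnn_resnet50_coco": (89, 30),
}


def frames_per_second(ms_per_frame: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


for name, (ms, m_ap) in models.items():
    print(f"{name}: {frames_per_second(ms):.1f} frames/s at mAP {m_ap}")

# The fastest model in the table.
fastest = min(models, key=lambda name: models[name][0])
print("fastest:", fastest)
```

At 30 ms per frame the MobileNet variant is the only one in the table that exceeds 30 frames per second, at the cost of the lowest mAP.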

Single Shot Multibox Detection Specifics
• Takes 300x300 inputs
• Training requires images and their ground-truth bounding boxes
• Performs non-maximum suppression internally
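
For readers unfamiliar with non-maximum suppression, here is a minimal pure-Python sketch of the greedy form of the technique. It is an illustration, not the code that runs inside the SSD model; the `(xmin, ymin, xmax, ymax)` box layout and the 0.5 IoU threshold are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is dropped in favour of the higher-scoring one, while the third box is far away and survives.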

SSD vs. the Rest
The comparison is based on a different dataset, but the proportions stay the same with COCO.

Transfer Learning
• We performed transfer learning on the SSD model, using Python,
LXML, LabelImg, Paperspace and TensorFlow.
• Steps involved were:
• Gathering images of the custom objects,
• Drawing bounding boxes on the images,
• Generating an XML file with the dimensions of each bounding box,
• Using TensorFlow to train the model on the objects,
• Using Paperspace for access to a GPU,
• Using TensorBoard to monitor accuracy at various iterations.
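
The XML step above can be sketched as follows. LabelImg saves annotations in the Pascal VOC XML layout, and this example reads one such file into flat records that a training pipeline could consume. It is a sketch under assumptions: it uses the standard-library `xml.etree.ElementTree` rather than LXML, which the team used, and the sample annotation, file name and `parse_voc_annotation` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout that LabelImg writes.
SAMPLE_XML = """<annotation>
  <filename>mug_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>mug</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_annotation(xml_text):
    """Return one record per labelled object in a Pascal VOC annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append({
            "filename": root.findtext("filename"),
            "width": width,
            "height": height,
            "label": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return rows


rows = parse_voc_annotation(SAMPLE_XML)
print(rows[0]["label"], rows[0]["xmin"], rows[0]["ymax"])  # mug 100 360
```

Records like these are typically written to a CSV or converted to the training framework's input format before training starts.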

How we made it
• We started off using OpenCV for capturing videos and rendering them as
images.
• But OpenCV was harder to configure on cloud platforms as an API for
accessing web camera footage, which was a goal.
• So here’s the flow we followed instead.
Flask Application → Client Side → WebRTC Image Stream → Start Object Detection → Client Side → Classify and Box Images → Return Coordinates → Render on Browser using JS
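
The "Return Coordinates" step of this flow could look roughly like the sketch below, which turns detector output into a JSON payload the browser can draw. It assumes the detector emits boxes normalized to the 0..1 range in `[ymin, xmin, ymax, xmax]` order, as TensorFlow Object Detection API models typically do. The function name, the payload field names and the 0.5 score cut-off are hypothetical, and the Flask route that would return this string is omitted.

```python
import json


def detections_to_payload(boxes, scores, labels, frame_width, frame_height,
                          min_score=0.5):
    """Convert normalized boxes to pixel coordinates and serialize as JSON."""
    results = []
    for (ymin, xmin, ymax, xmax), score, label in zip(boxes, scores, labels):
        if score < min_score:
            continue  # skip low-confidence detections
        results.append({
            "label": label,
            "score": round(score, 3),
            "left": int(xmin * frame_width),
            "top": int(ymin * frame_height),
            "right": int(xmax * frame_width),
            "bottom": int(ymax * frame_height),
        })
    return json.dumps({"detections": results})


# Made-up detections for a 640x480 frame.
payload = detections_to_payload(
    boxes=[(0.25, 0.5, 0.75, 1.0), (0.1, 0.1, 0.2, 0.2)],
    scores=[0.9, 0.3],
    labels=["person", "cup"],
    frame_width=640,
    frame_height=480,
)
print(payload)
```

Returning only coordinates keeps each response small; the client-side JavaScript can then draw the boxes over the frame it already holds.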

Further down the line
• This application can be used for inventory management using
computer vision. We see segmentation as a possibility for bringing smart
checkouts to convenience stores that may not be as heavy on
infrastructure as Amazon or its competition.
• Achieve better performance by pruning the model.

Work Allocation
• We split the work almost equally across all fields.
Priyesh Kaushik          Pranay Mankad
Implementation 50%       Implementation 50%
Model Training           Custom Object Training
Web Interfacing 25%      Web Interfacing 75%
Transfer Learning 75%    Transfer Learning 24%
Documentation 50%        Documentation 50%
Presentation 49%         Presentation 49%

Similar to ADS Team 8 Final Presentation

kanimozhi2019.pdf

AshrafDabbas1

Object Detection for Autonomous Cars using AI/ML

IRJET Journal

Real Time Object Dectection using machine learning

pratik pratyay

This document discusses the development of a real-time object detection system using computer vision techniques. It aims to recognize and label moving objects in video streams from monitoring cameras with high accuracy and in a short amount of time. The system will use a hybrid model of convolutional neural networks and support vector machines for feature extraction and classification of objects from camera feeds into predefined classes. It is intended to help analyze surveillance video by only flagging clips that contain objects of interest like people or vehicles, reducing wasted storage and review time.

Deep learning fundamental and Research project on IBM POWER9 system from NUS

Ganesan Narayanasamy

Moving object recognition (MOR) corresponds to the localisation and classification of moving objects in videos. Discriminating moving objects from static objects and background in videos is an essential task for many computer vision applications. MOR has widespread applications in intelligent visual surveillance, intrusion detection, anomaly detection and monitoring, industrial sites monitoring, detection-based tracking, autonomous vehicles, etc. In this session, Murari is going to talk about the deep learning algorithms to identify both locations and corresponding categories of moving objects with a convolutional network. The challenges in developing such algorithms will be discussed. The discourse will also include the implementation details of these models in both conventional and UAV videos.

slide-171212080528.pptx

SharanrajK22MMT1003

This document summarizes a project on real-time object detection using computer vision techniques. It discusses using a system that can recognize objects in a video stream from a camera and label them with bounding boxes and labels. It notes that most video surveillance footage is uninteresting unless there are moving objects. The project aims to address this by building an accurate, fast object detection system that can run on resource-constrained devices. It proposes using a hybrid CNN-SVM model trained on a large dataset to recognize objects and discusses the training and detection phases of the system.

Anomaly Detection with Azure and .NET

Marco Parenzan

Anomaly Detection with Azure and .net

Marco Parenzan

Mongo DB at Community Engine

Community Engine

MongoDB at community engine

mathraq

Object Detetcion using SSD-MobileNet

IRJET Journal

This document presents a study on object detection using SSD-MobileNet. The researchers developed a lightweight object detection model using SSD-MobileNet that can perform real-time object detection on embedded systems with limited processing resources. They tested the model on images and video captured using webcams. The model was able to detect objects like people, cars, and animals with good accuracy. The SSD-MobileNet framework provides fast and efficient object detection for applications like autonomous driving assistance systems that require real-time performance on low-power devices.

OpenCV @ Droidcon 2012

Wingston

The document discusses OpenCV and its suitability for image processing on Android devices, noting that OpenCV is an open source library for computer vision and image processing that allows treating images as matrices and provides functions for tasks like blurring, edge detection, and object recognition; it provides an overview of some key OpenCV classes for Android and approaches for building image processing applications using OpenCV on Android.

MongoDB Tokyo - Monitoring and Queueing

Boxed Ice

The document discusses using MongoDB as both a primary data store and queueing system for Server Density. It describes how Server Density implemented queuing functionality in MongoDB using the findAndModify command to atomically retrieve and update documents. It also provides an overview of monitoring considerations for MongoDB in production, including keeping indexes and frequently accessed data in memory, watching for disk I/O spikes or slow queries that may indicate insufficient memory, and using db.serverStatus() to monitor connection usage and check for limits.

DCEU 18: Use Cases and Practical Solutions for Docker Container Storage on Sw...

Docker, Inc.

Mark Church - Product Manager, Docker Don Stewart - Solutions Architect, Docker Persistent storage has quickly advanced from something considered incompatible with containers to a mature set of solutions and patterns that have been thoroughly adopted by the industry. We’ll define the persistent characteristics of different use-cases and map these to some of the many solutions that exist for container storage. From this talk you’ll learn about the storage options available to users on Swarm, Kubernetes, on-premises, cloud, and how they work and compare to each other. You’ll also learn how to characterize different persistent application requirements and the solutions best for suited for them.

AWS re:Invent 2016: Deep Learning at Cloud Scale: Improving Video Discoverabi...

Amazon Web Services

Deep learning continues to push the state of the art in domains such as video analytics, computer vision, and speech recognition. Deep networks are powered by amazing levels of representational power, feature learning, and abstraction. This approach comes at the cost of a significant increase in required compute power, which makes the AWS cloud an excellent environment for training. Innovators in this space are applying deep learning to a variety of applications. One such innovator, Vilynx, a startup based in Palo Alto, realized that the current pre-roll advertising-based models for mobile video weren’t returning publishers' desired levels of engagement. In this session, we explain the algorithmic challenges of scaling across multiple nodes, and what Intel is doing on AWS to overcome them. We describe the benefits of using AWS CloudFormation to set up a distributed training environment for deep networks. We also showcase Vilynx’s contributions to video discoverability, and explain how Vilynx uses AWS tools to understand video content. This session is sponsored by Intel.

Deep learning on mobile

Anirudh Koul

odtslide-180529073940.pptx

ahmedchammam

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaav

ADS Team 8 Final Presentation

1. Object Detection in Videos Team 8: Priyesh Kaushik Pranay Mankad

2. Introduction
• A video is a sequence of frames, each of which may contain several objects at any given moment. Machine learning can be used to identify these objects and make them searchable using tags.

3. Our Scope
• Identifying objects in videos
• Listing the objects
• Localizing them per frame
• Bounding them with boxes

4. Our Approach - Dataset
• We chose our dataset based on the mean number of objects per image. Of the datasets we looked at, MS-COCO had the highest.

5. Approach – Selecting Model
• Several models based on Convolutional Neural Networks are available. Based on our research, we found that Faster R-CNN and SSD (Single-Shot Multibox Detector) are highly efficient at detecting objects in frames.
• Based on comparative results, we decided to go with the SSD model, trained on the COCO dataset.

6. Comparative Analysis
Model Name                       Speed (ms)   COCO mAP [^1]
ssd_mobilenet_v1_coco            30           21
ssd_inception_v2_coco            42           24
faster_rcnn_inception_v2_coco    58           28
faster_rcnn_resnet50_coco        89           30
mAP is the mean average precision, the accuracy measure used as the basis for comparison (higher is better); speed is the time per frame in milliseconds (lower is better).
After the comparative analysis, we decided to use ssd_mobilenet_v1_coco. The next slides give some details about it.
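
To make the speed column easier to relate to video, the sketch below converts each latency into an upper bound on frames per second. It assumes that the "Speed (ms)" figure is the inference time for one frame and that nothing else (decoding, network transfer) adds to the per-frame cost; the names `MODELS`, `frames_per_second` and `summarize` are illustrative and do not come from the presentation.

```python
# Latency (ms per frame) and COCO mAP, copied from the comparison table.
MODELS = {
    "ssd_mobilenet_v1_coco": (30, 21),
    "ssd_inception_v2_coco": (42, 24),
    "faster_rcnn_inception_v2_coco": (58, 28),
    "faster_rcnn_resnet50_coco": (89, 30),
}


def frames_per_second(latency_ms):
    """Upper bound on throughput, assuming inference is the only per-frame cost."""
    return 1000.0 / latency_ms


def summarize(models):
    """Return (name, fps rounded to one decimal, mAP) for every model."""
    return [
        (name, round(frames_per_second(latency_ms), 1), coco_map)
        for name, (latency_ms, coco_map) in models.items()
    ]


if __name__ == "__main__":
    for name, fps, coco_map in summarize(MODELS):
        print(f"{name:32s} {fps:5.1f} fps   mAP {coco_map}")
```

Under these assumptions, ssd_mobilenet_v1_coco is the only model in the table that stays above 30 frames per second, which is consistent with the choice made on the slide.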

7. CNN – a bird's-eye view

8. Object Detection – a bird's-eye view

9. Single Shot Multibox Detection Specifics
• Takes 300x300 input images
• Training requires the image and its ground-truth bounding boxes
• Performs non-maximum suppression internally
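
The non-maximum suppression step mentioned above can be illustrated with a minimal greedy sketch in plain Python. This is not the SSD implementation itself: the box layout `(ymin, xmin, ymax, xmax)`, the 0.5 IoU threshold and the function names are assumptions made for illustration, and real detectors usually apply the step per class.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (ymin, xmin, ymax, xmax)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    # Visit candidates from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(non_max_suppression(boxes, scores))
```

In the example, the second box overlaps the first heavily and has a lower score, so it is dropped, while the third box does not overlap either and is kept.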

10. SSD vs. the Rest
The comparison is based on a different dataset, but the proportions stay the same with COCO.

11. Transfer Learning
• We performed transfer learning on the SSD model, using Python, lxml, LabelImg, Paperspace and TensorFlow.
• The steps involved were:
  • Gathering images of the custom objects,
  • Drawing a bounding box on each image,
  • Generating an XML file with the dimensions of the bounding box,
  • Using TensorFlow to train the model on the object,
  • Using Paperspace for access to a GPU,
  • Using TensorBoard to monitor accuracy at various iterations.
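
As a rough sketch of the annotation step, the code below reads one LabelImg annotation in its Pascal VOC XML format and turns each bounding box into a row with coordinates normalized to the image size. The slides list lxml; this sketch uses the standard library's `xml.etree.ElementTree` instead so that it runs without extra packages. The sample file name, the label and the function name `parse_annotation` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout that LabelImg writes.
SAMPLE_XML = """<annotation>
  <filename>shelf_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cereal_box</name>
    <bndbox><xmin>64</xmin><ymin>48</ymin><xmax>320</xmax><ymax>240</ymax></bndbox>
  </object>
</annotation>"""


def parse_annotation(xml_text):
    """Return one dict per labelled object, with box corners scaled to [0, 1]."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append({
            "filename": filename,
            "label": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")) / width,
            "ymin": int(box.findtext("ymin")) / height,
            "xmax": int(box.findtext("xmax")) / width,
            "ymax": int(box.findtext("ymax")) / height,
        })
    return rows


if __name__ == "__main__":
    for row in parse_annotation(SAMPLE_XML):
        print(row)
```

Rows like these would typically be collected across all annotated images and converted into whatever input format the training pipeline expects.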

12.

13.

14. How we made it
• We started off using OpenCV to capture videos and render them as images.
• However, OpenCV was harder to configure on cloud platforms as an API for accessing webcam footage, which was one of our goals.
• So this is the flow we followed instead:
Flask Application → Client Side WebRTC Image Stream → Start Object Detection Client Side → Classify and Box Images → Return Coordinates → Render on Browser using JS
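
The "Return Coordinates" step can be sketched as a small helper that turns raw detections into a JSON payload the browser can draw. The sketch assumes the model returns boxes as normalized `(ymin, xmin, ymax, xmax)` tuples with one score and one label per box; the function name `detections_to_payload`, the payload field names and the 0.5 score cut-off are hypothetical and not taken from the team's application.

```python
import json


def detections_to_payload(boxes, scores, labels, frame_width, frame_height,
                          min_score=0.5):
    """Convert normalized detections into pixel rectangles, serialized as JSON."""
    results = []
    for (ymin, xmin, ymax, xmax), score, label in zip(boxes, scores, labels):
        if score < min_score:
            continue  # skip low-confidence detections
        results.append({
            "label": label,
            "score": round(float(score), 3),
            # Top-left corner plus size, in pixels of the client's frame.
            "x": int(xmin * frame_width),
            "y": int(ymin * frame_height),
            "width": int((xmax - xmin) * frame_width),
            "height": int((ymax - ymin) * frame_height),
        })
    return json.dumps({"detections": results})


if __name__ == "__main__":
    payload = detections_to_payload(
        boxes=[(0.25, 0.5, 0.75, 1.0), (0.0, 0.0, 1.0, 1.0)],
        scores=[0.9, 0.2],
        labels=["person", "chair"],
        frame_width=640,
        frame_height=480,
    )
    print(payload)
```

In a Flask view, a string like this could be sent back as the response body, and client-side JavaScript could then draw each rectangle over the video element.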

15. What we could do with this

16. Further down the line
• This application can be used for inventory management using computer vision. We see segmentation as a possibility for bringing smart checkouts to convenience stores that may not be as heavy on infrastructure as Amazon or its competitors.
• Better performance could be achieved by pruning the model.

17. Work Allocation
• We split the work almost equally across all fields.
Priyesh Kaushik          Pranay Mankad
Implementation 50%       Implementation 50%
Model Training           Custom Object Training
Web Interfacing 25%      Web Interfacing 75%
Transfer Learning 75%    Transfer Learning 24%
Documentation 50%        Documentation 50%
Presentation 49%         Presentation 49%

18. Have Questions?

19. Thank you!
