[5 minutes LT] Brief Introduction to Recent Image Recognition Methods and ChainerCV


This document provides a brief introduction to recent image recognition methods and ChainerCV. It first introduces the presenter Shunta Saito and their background and research interests. It then outlines several major image recognition problems including image classification, object detection, semantic segmentation, instance-aware segmentation, image captioning, and visual question answering. For each problem, it lists some popular datasets and example methods that have been proposed. It also provides an overview and link to ChainerCV, an open source framework for computer vision research. Finally, it mentions some datasets and methods for computer vision applications in fashion.


Brief Introduction to
Recent Image Recognition Methods
and ChainerCV
Shunta Saito
Researcher at Preferred Networks, Inc.

Self-introduction
● Shunta Saito, Ph. D. in Engineering (@mitmul)
● Background: Keio Univ. → UC Berkeley → Keio Leading Edge Lab.
→ Facebook, Inc. → Preferred Networks, Inc.
● Research interest: Computer Vision, semantic segmentation, etc.
● Job: Research on computer vision applications, Development of
Chainer, Driving global alliance with Microsoft, Intel, etc...

Related work to fashion…?
Virtual fitting demo using Kinect (2011) at Geis, Inc.

Major Image Recognition Problems
● Image Classification
● Object Detection
● Semantic Segmentation
● Instance-aware Segmentation
● Image Captioning
● Visual Question Answering

Image Classification
● MNIST, CIFAR-10/100, ImageNet, Places205, etc...
● Various methods based on ConvNets have been proposed
● ILSVRC 2017: the last ImageNet challenge
– 1.28 million images, 1000 classes

Object Detection
● Dataset: Pascal VOC, MS COCO, KITTI, Cityscapes, etc…
● Methods: Faster R-CNN, SSD, R-FCN, etc...
Complicated...

Semantic Segmentation
● Dataset: Pascal VOC, MS COCO, Cityscapes, KITTI, CamVid,
LabelMe, SUN RGB-D, Mapillary, etc...
● Methods: FCN, Deconv Net, U-Net, SegNet, Dilated, RefineNet,
PSPNet, etc...
See details here:
http://bit.ly/seg-slide
[Figure: semantic segmentation result on the Cityscapes dataset]

Instance-aware Segmentation
● Dataset: MS COCO, Cityscapes, Mapillary, etc...
● Methods: DeepMask, SharpMask, Multi-task Network
Cascades, Mask R-CNN, etc...
Instance-aware: Differentiate each instance of the same class
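
To make the difference in output format concrete, here is a toy sketch in plain Python. It is not one of the methods listed above: it simply assumes that instances of a class never touch, so each 4-connected region of that class can be given its own id. The function name `label_instances` and the sample label map are hypothetical, chosen only for illustration.

```python
from collections import deque


def label_instances(semantic, target_class):
    """Assign a distinct id (1, 2, ...) to each 4-connected region of target_class.

    `semantic` is a 2D list of class ids (a semantic segmentation label map).
    Returns (instance_map, number_of_instances); pixels outside target_class get 0.
    Toy assumption: separate instances of the class never touch each other.
    """
    height, width = len(semantic), len(semantic[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 0
    for y in range(height):
        for x in range(width):
            if semantic[y][x] != target_class or instances[y][x]:
                continue
            # Found an unvisited pixel of the class: start a new instance.
            next_id += 1
            instances[y][x] = next_id
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and semantic[ny][nx] == target_class
                            and not instances[ny][nx]):
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id


# Semantic label map: 0 = background, 1 = some object class (two separate objects).
semantic_map = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
instance_map, n_instances = label_instances(semantic_map, target_class=1)
```

In the semantic map both objects carry the same label 1, while in `instance_map` the first object is labeled 1 and the second 2. Real instance-aware methods also have to separate touching or overlapping objects, which this sketch cannot do.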

Image Captioning
● Dataset: MS COCO, etc...
● Methods: Show and Tell; Show, Attend and Tell; Review Networks;
Knowing When to Look; Hierarchical Recurrent Network; etc...

Visual Question Answering
● Dataset: VQA dataset ( http://visualqa.org ) ...
● Methods: MCB for VQA, Dual Attention Networks, etc...

ChainerCV
Want to compare your own method on public datasets...
● Datasets: Pascal VOC, Caltech-UCSD Birds-200-2011, Stanford Online Products, CamVid, etc.
● Models: Faster R-CNN, SSD, SegNet (will add more models!)
● Training scripts
● Evaluation tools
● Dataset abstraction
Want to apply to new data...
Make image recognition research / development much easier
https://github.com/chainer/chainercv
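
As a rough illustration of why a dataset abstraction and shared evaluation tools help, the sketch below uses only the Python standard library. It does not use ChainerCV's actual API: `ToySegmentationDataset`, `mean_iou`, `evaluate`, and `threshold_model` are hypothetical names, and the data is made up. The idea it assumes is that a dataset is any indexable object returning (image, label) pairs, so the same evaluation code runs on a public dataset or on new data.

```python
class ToySegmentationDataset:
    """Hypothetical dataset: indexable, returns (image, label_map) pairs."""

    def __init__(self, images, labels):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self._images = images
        self._labels = labels

    def __len__(self):
        return len(self._images)

    def __getitem__(self, i):
        return self._images[i], self._labels[i]


def mean_iou(pred_labels, gt_labels, n_class):
    """Mean intersection-over-union over classes that appear in pred or gt."""
    intersection = [0] * n_class
    union = [0] * n_class
    for pred, gt in zip(pred_labels, gt_labels):
        for pred_row, gt_row in zip(pred, gt):
            for p, g in zip(pred_row, gt_row):
                if p == g:
                    intersection[g] += 1
                    union[g] += 1
                else:
                    # The pixel belongs to the union of both classes involved.
                    union[p] += 1
                    union[g] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious)


def evaluate(dataset, predict, n_class):
    """Run `predict` on every image of `dataset` and score it with mean IoU."""
    preds, gts = [], []
    for i in range(len(dataset)):
        image, label = dataset[i]
        preds.append(predict(image))
        gts.append(label)
    return mean_iou(preds, gts, n_class)


def threshold_model(image):
    """Stand-in for a real model: class 1 where the pixel value exceeds 0.5."""
    return [[1 if value > 0.5 else 0 for value in row] for row in image]


# One made-up 2x2 grayscale image with its ground-truth label map.
dataset = ToySegmentationDataset(
    images=[[[0.1, 0.9], [0.8, 0.7]]],
    labels=[[[0, 0], [1, 1]]],
)
score = evaluate(dataset, threshold_model, n_class=2)
```

Swapping in another dataset only requires an object with the same `__len__` / `__getitem__` behavior; `evaluate` and `mean_iou` stay unchanged, which is the kind of reuse the slide is pointing at.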

Computer Vision for Fashion
● Dataset: Etsy dataset, Wear dataset, Fashion144k, DeepFashion, etc...
● Methods: Algorithmic clothing, Pose Guided Person Image Generation, etc...
[Figure labels: Kota Yamaguchi (above), Edgar Simo-Serra (left)]
https://sites.google.com/zalando.de/cvf-iccv2017


