Applying image matching algorithms to video recognition and autonomous robot navigation
Maxim Kamensky, CEO, Invarivision
Dmitriy Yeremeyev, CTO, Invarivision
EECVC presentation, July 9 2016
Image matching algorithms
Feature-based algorithms
● well-explored technology
● can find partially occluded images
● can find rotated images
● work slowly
● recognize only a small number of objects
Template-based algorithms
● work fast
● can store many images
● do not cope well with occluded (overlapped) images
● do not recognize rotated images
Feature-based pipeline: Input image → Feature extraction → Feature vector → Classification → Object type
Template-based pipeline: Input image → Template → Search in template database → Object type
Keywords (feature-based): SURF, SIFT, ConvNet, etc.
Keywords (template-based): BiGG
AVM - Associative Video Memory
Templates - recognition matrices: 3x3, 7x7, 15x15, 31x31
Associative tree: Root base → Level 1 (Base 1, Base 2, ... Base n) → ... → Level m (Base 1, ...)
Associative base: Recognition matrix + Associated data
Read / Write operation: Image, Associated data
Template-based image matching algorithm
Technique of AVM testing
We also tested the AVM algorithm on images from the "Amsterdam Library of Object Images" (ALOI).
The ALOI database contains several views of each object, so it lets us compare how well the algorithm recognizes different views of the same object with how well it discriminates between different objects. To calculate this, we perform the steps listed below.
Split the database into training and test parts:
● Each object is rotated out of plane in 5-degree steps;
● Training part: rotations 0, 10, 20, 30, ... 350; 36 views in total;
● Testing part: rotations 5, 15, 25, ... 355; 36 views in total;
● Do this for N objects from the dataset.
Create models from the training part:
● Each model is in a separate AVM;
● Add the 36 training views to the AVM, for instance with an 80x80 key image size.
So for every object we have a separate AVM with 36 learned object views:
● Match each model against each image of the test part of the database;
● Take the model's maximal similarity response for a test image;
● If a model and a test image are of the same object, this is a genuine (same) matching pair;
● If a model and a test image are of different objects, this is an impostor (different) matching pair;
● Now we have N * 36 genuine matching pairs and N * ((N - 1) * 36) impostor matching pairs.
We draw a special kind of ROC graph called a Detection Error Tradeoff (DET) graph:
● On the X axis we have the False Acceptance Rate (FAR), or False Positive Rate (FPR);
● On the Y axis we have the False Rejection Rate (FRR), or False Negative Rate (FNR);
● Both axes are logarithmic.
Christmas bear, © Amsterdam Library of Object Images
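The test protocol can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the AVM code: the function names (`split_views`, `pair_counts`, `det_points`) and the score values in the usage note are hypothetical, and similarity scores are assumed to be numbers where higher means a better match.

```python
def split_views(step=5, total=360):
    """Split rotation angles into training (0, 10, ...) and test (5, 15, ...) parts."""
    angles = list(range(0, total, step))
    return angles[0::2], angles[1::2]


def pair_counts(n_objects, views_per_object=36):
    """Number of genuine and impostor (model, test image) pairs for N objects."""
    genuine = n_objects * views_per_object
    impostor = n_objects * (n_objects - 1) * views_per_object
    return genuine, impostor


def det_points(genuine_scores, impostor_scores, thresholds):
    """Return (threshold, FAR, FRR) triples; a pair is accepted when score >= threshold."""
    points = []
    for t in thresholds:
        # FAR: share of impostor pairs that were wrongly accepted.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # FRR: share of genuine pairs that were wrongly rejected.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        points.append((t, far, frr))
    return points
```

For example, `det_points([0.9, 0.8, 0.4], [0.1, 0.5, 0.3, 0.2], [0.45])` gives one point with FAR 0.25 and FRR of one third. Plotting FAR against FRR over many thresholds, with both axes logarithmic, gives the DET graph.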
AVM performance
Time performance - average time of processing each image (in ms)
Tree capacity - total number of images in the tree (Intel® Xeon® CPU L5630 @ 2.13GHz)
Object search in image
Object training (write)
Sliding window (read):
● Scan step is ⅛ of the window size.
● Window size is scaled up by 25% on each step.
● Window position is adjusted by AVM.
Result: object id, x, y, scale
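A minimal sketch of the sliding-window scan, using the scan step (⅛ of the window size) and the 25% scale-up from the slide. The starting window size, the image dimensions and the function name `sliding_windows` are assumptions made for illustration. The position adjustment performed by AVM is not modeled here.

```python
def sliding_windows(img_w, img_h, min_size=80, scale=1.25, step_frac=1 / 8):
    """Yield (x, y, size) for square windows covering the image at growing scales."""
    size = float(min_size)
    while size <= min(img_w, img_h):
        win = int(size)
        # Scan step is 1/8 of the current window size (at least one pixel).
        step = max(1, int(win * step_frac))
        for y in range(0, img_h - win + 1, step):
            for x in range(0, img_w - win + 1, step):
                yield x, y, win
        # Window size grows by 25% on each pass.
        size *= scale
```

Each yielded window would then be passed to the read operation, which returns an object id (if any) for that position and scale.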
Autonomous navigation of robots in indoor spaces
A navigation module based on AVM technology lets the robot orient itself in a space and navigate precisely to a defined point on the map.
Webcam → Images → AVM search tree → Recognized? → Yes → Actual position (X, Y coordinates and azimuth)
The AVM search tree stores pairs: image -> X, Y and azimuth
In our case, visual navigation for the robot is just a sequence of images with associated coordinates memorized inside the AVM tree.
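The idea of memorizing "image -> X, Y and azimuth" pairs can be illustrated with a toy associative memory. This is not the AVM search tree: it swaps in a brute-force comparison of small grayscale images (flat lists of 0..255 values) purely to show the write and read operations. The names `Pose`, `PoseMemory` and `similarity`, and the 0.9 threshold, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    x: float
    y: float
    azimuth: float


def similarity(a, b):
    """1.0 for identical images, approaching 0.0 as pixel values diverge."""
    diff = sum(abs(p - q) for p, q in zip(a, b))
    return 1.0 - diff / (255.0 * len(a))


class PoseMemory:
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (image, pose) pairs

    def write(self, image, pose):
        """Training: memorize an image together with the robot pose."""
        self.entries.append((tuple(image), pose))

    def read(self, image):
        """Navigation: return the pose of the best match, or None if not recognized."""
        best_pose, best_score = None, 0.0
        for stored, pose in self.entries:
            if len(stored) != len(image):
                continue
            score = similarity(stored, image)
            if score > best_score:
                best_pose, best_score = pose, score
        return best_pose if best_score >= self.threshold else None
```

During route training the robot would call `write` for each webcam frame with its current pose. During navigation a successful `read` gives the actual position, and a `None` result corresponds to the "not recognized" branch.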
Using AVM in robotics
Object tracking
Follow me
Augmented reality by AVM
3D marker of target position
Implementation for RoboRealm - AVM Navigator
AVM Navigator is a module of the RoboRealm system that provides
object recognition and autonomous robot navigation using a single video
camera on the robot as the main sensor for navigation.
Localization error
The localization error is about 0.1 meter (10 centimeters).
Quake3 robot simulator mod
Navigating outdoors
Route recognition
Route training
Image matching in video processing
Automatic searching of video fragments
Film → Frame → Image → S-core cluster (s-core #1, s-core #2, ... s-core #N, each with its own AVM search tree)
S-core cluster → Film ID, Position → MultiTrack - assembling module
Database
MultiTrack output: Video fragment #1, Video fragment #2, ... Video fragment #M, each with film name, position, length
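The slides do not describe how MultiTrack assembles fragments, so the following is only one plausible sketch of the step. It groups per-frame hits (film ID and position returned by the search cores) into fragments described by film ID, start position and length. The function name `assemble_fragments`, the tuple layout, and the `max_gap` and `max_drift` parameters are all assumptions. Positions are treated as frame indices.

```python
def assemble_fragments(hits, max_gap=2, max_drift=1):
    """Group hits into fragments.

    hits: list of (scan_pos, film_id, film_pos), sorted by scan_pos.
    Returns a list of (film_id, film_start_pos, length_in_scanned_frames).
    """
    fragments = []
    current = None
    for scan_pos, film_id, film_pos in hits:
        offset = film_pos - scan_pos
        same_run = (
            current is not None
            and current["film_id"] == film_id
            # Consecutive hits must be close together in the scanned video...
            and scan_pos - current["last_scan"] <= max_gap
            # ...and keep roughly the same offset into the source film.
            and abs(offset - current["offset"]) <= max_drift
        )
        if same_run:
            current["last_scan"] = scan_pos
        else:
            if current is not None:
                fragments.append(current)
            current = {
                "film_id": film_id,
                "offset": offset,
                "film_pos": film_pos,
                "first_scan": scan_pos,
                "last_scan": scan_pos,
            }
    if current is not None:
        fragments.append(current)
    return [
        (f["film_id"], f["film_pos"], f["last_scan"] - f["first_scan"] + 1)
        for f in fragments
    ]
```

With hits `[(0, "A", 100), (1, "A", 101), (2, "A", 102), (10, "B", 5), (11, "B", 6)]` this returns `[("A", 100, 3), ("B", 5, 2)]`: two fragments, one per source film.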
MultiTrack - assembling of duplicates
Fragment #1 Unknown video Fragment #2 Fragment #3 Fragment #4
Scanned video
Source fragment #1
Duplicate video #1.1
Duplicate video #1.2
Duplicate video #1.3
Source fragment #2
Duplicate video #2.1
Duplicate video #2.2
Source fragment #3 Source fragment #4
Duplicate video #4.1
Search results
Distributed system
Customer system
REST API
Invarivision - ISS
Base server
* Task management
* Database
* s-core
* s-coordinator
Node server #1
* s-core
* s-coordinator
Node server #2
* s-core
* s-coordinator
Node server #N
* s-core
* s-coordinator
All these servers can run applications for video processing and image recognition.
s-coordinator - application that coordinates video processing.
s-core - application that reads and writes individual images in the search tree.
Software structure
Video → Frame change detector (passes 12% of frames) → s-coordinator
s-coordinator → Image (with Film ID and Position) → s-core #1, s-core #2, ... s-core #N, over Ethernet (UDP Multicast)
Operations: Write operation, Read/search operation
Database
Scaling of the search system
s-core #1,1 s-core #2,1 s-core #N,1
s-core #1,2 s-core #2,2 s-core #N,2
s-core #1,M s-core #2,M s-core #N,M
Write speed and capacity
Read speed
Network of the search cores
Computer cluster
Scheme of the video write
1 2 3 4 5 6 7 8 9 ...
Video frames
3x3
s-core #1,1 s-core #2,1 s-core #3,1
s-core #1,2 s-core #2,2 s-core #3,2
s-core #1,3 s-core #2,3 s-core #3,3
Scheme of the video read
1 2 3 4 5 6 7 8 9 ...
Video frames
3x3
s-core #1,3 s-core #2,3 s-core #3,3
s-core #1,2 s-core #2,2 s-core #3,2
s-core #1,1 s-core #2,1 s-core #3,1
Tree splitting
Scaling of capacity
Tree #1
Tree #1.1 Tree #1.2
Capacity alignment Adding video
for searching
Alignment system Images
Tree #1
Tree #2
Tree #N
Next stage
Tree #1
Tree #2
Tree #N
Storage capacity
RAM: 6.2 MB → 1 hour of video (with FCD at 12%)
Server with 256 GB RAM → 41290 hours of source video for searching
Using an SSD as swap space: 1.4 TB → 225806 hours
Speed: with FCD set to 12% of frames, one base server (dual Xeon 2xE5690, 3.47 GHz) processes about 50 hours of video per hour.
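The capacity figures follow from the 6.2 MB per hour estimate. The short check below reproduces them, assuming decimal units (1 GB = 1000 MB, 1 TB = 1,000,000 MB), which is the convention that matches the numbers on the slide. The names are illustrative.

```python
MB_PER_VIDEO_HOUR = 6.2  # memory needed per hour of source video, with FCD at 12%


def hours_of_video(storage_mb):
    """Whole hours of source video that fit into the given storage (in MB)."""
    return int(storage_mb / MB_PER_VIDEO_HOUR)


ram_hours = hours_of_video(256 * 1000)   # 256 GB of RAM
ssd_hours = hours_of_video(1_400_000)    # 1.4 TB of SSD swap space

print(ram_hours)  # 41290
print(ssd_hours)  # 225806
```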
Interference resistance
● Worst quality - CRF 51
● Padded corner 15%
● Padded center 10%
● White noise 100%
● 10 degrees rotated
● Padded center 5%
● Grayscale
● Padded corner 5%
● Padded corner 10%
● White noise 50%
● Cropped from center 15%
● 5 degrees rotated
Test results
Data set                     Average precision, %   Average recall, %   Average F-measure, %
5 degrees rotated*           100                    93.82               96.81
10 degrees rotated*          100                    18.54               31.28
White noise 50%*             100                    98.09               99.03
White noise 100%*            100                    93.9                96.85
Padded center 5%*            100                    97.04               98.5
Padded center 10%*           100                    48.8                65.59
Padded corner 5%*            100                    97.84               98.91
Padded corner 10%*           100                    89.89               94.68
Padded corner 15%*           100                    41.12               58.28
Cropped from center 10%*     100                    96.54               98.24
Cropped from center 15%*     100                    67.44               80.55
Constant Rate Factor 51*     100                    96.53               98.23
Grayscale*                   100                    97.5                98.73
For each scanned interval in the video, we can define one of the following situations:
● True Positive (TP) - the system found the correct matching original interval
● False Positive (FP) - the system found an incorrect matching original interval
● False Negative (FN) - the system didn't find a matching original interval, although one exists
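Precision, recall and F-measure follow from these counts by the standard definitions. The sketch below shows the arithmetic; the function name and the example counts are illustrative, not data from the tests.

```python
def precision_recall_f(tp, fp, fn):
    """Standard precision, recall and F-measure (F1) from TP, FP and FN counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

With no false positives, precision is 100% and the F-measure depends only on recall. For instance, `precision_recall_f(9382, 0, 618)` gives a recall of 0.9382 and an F-measure of about 0.9681, which is the same relationship as in the "5 degrees rotated" row.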
Thank you for your attention!
Questions?
Site: Invarivision.com
Email: maxim.kamensky@invarivision.com
Skype: maxim.kamensky
Phone: +380662346738

Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 

Maxim Kamensky - Applying image matching algorithms to video recognition and autonomous robot navigation

  • 1. Applying image matching algorithms to video recognition and autonomous robot navigation. Maxim Kamensky, CEO, Invarivision; Dmitriy Yeremeyev, CTO, Invarivision. EECVC presentation, July 9, 2016.
  • 2. Image matching algorithms.
      Feature selection (keywords: SURF, SIFT, ConvNet, etc.):
      ● well-explored technology
      ● able to find partially occluded images
      ● finds rotated images
      ● works slowly
      ● recognizes only a small number of objects
      ● pipeline: Input image → Feature extraction → Feature vector → Classification → Object type
      Template-based algorithms (keywords: BiGG):
      ● work fast
      ● able to store many images
      ● do not cope well with overlapped images
      ● do not recognize rotated images
      ● pipeline: Input image → Template → Search in template database → Object type
  • 3. AVM - Associative Video Memory: a template-based image matching algorithm.
      ● Templates (recognition matrices): 3x3, 7x7, 15x15, 31x31
      ● Associative tree: Root base; Level 1: Base 1L1, Base 2L1, ... Base nL1; Level m: Base 1Lm, Base 1Lm, Base 1Lm
      ● Associative base: Recognition matrix + Associated data
      ● Read / Write operation: Image in, Associated data out
  • 4. Technique of AVM testing. We also tested the AVM algorithm on images from the "Amsterdam Library of Object Images" (ALOI). The ALOI database contains several expressions (views) of each object, so it lets us compare how well the algorithm recognizes different expressions of the same object against how well it discriminates between different objects. To measure this we perform the steps listed below.
      Split the database into training and test parts:
      ● Each object is rotated out of plane in steps of 5 degrees;
      ● Training part: rotations 0, 10, 20, 30, ... 350; 36 expressions in all;
      ● Test part: rotations 5, 15, 25, ... 355; 36 expressions in all;
      ● Do this for N objects from the dataset.
      Create models from the training part:
      ● Each model is stored in a separate AVM;
      ● Add the 36 training expressions to the AVM, for instance with an 80x80 key image size.
      So for every object we have a separate AVM with 36 learned object expressions:
      ● Match each model against each image of the test part of the database;
      ● Take the model's maximal similarity response for a test image;
      ● If a model and a test image are of the same object, this is a genuine (same) matching pair;
      ● If a model and a test image are of different objects, this is an impostor (different) matching pair;
      ● Now we have N * 36 genuine matching pairs and N * ((N - 1) * 36) impostor matching pairs.
      We draw a special kind of ROC graph called a Detection Error Tradeoff (DET) graph:
      ● The X axis shows the False Acceptance Rate (FAR), also called the False Positive Rate (FPR);
      ● The Y axis shows the False Rejection Rate (FRR), also called the False Negative Rate (FNR);
      ● Both axes are logarithmic.
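The pair counts and error rates from slide 4 can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names, the example scores, and the rule "accept a pair when its similarity is at or above the threshold" are assumptions made for illustration.

```python
from typing import List, Tuple


def pair_counts(n_objects: int, views_per_object: int = 36) -> Tuple[int, int]:
    """Genuine and impostor pair counts for the protocol on the slide."""
    genuine = n_objects * views_per_object
    impostor = n_objects * (n_objects - 1) * views_per_object
    return genuine, impostor


def det_point(genuine: List[float], impostor: List[float],
              threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) for one threshold.

    Assumption: a pair is accepted when its similarity >= threshold.
    FAR = share of impostor pairs accepted, FRR = share of genuine pairs rejected.
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def det_curve(genuine: List[float], impostor: List[float],
              thresholds: List[float]) -> List[Tuple[float, float]]:
    """One (FAR, FRR) point per threshold; plot both axes on a log scale."""
    return [det_point(genuine, impostor, t) for t in thresholds]


# Hypothetical similarity scores, only to show the call pattern.
genuine_scores = [0.9, 0.8, 0.6, 0.95]
impostor_scores = [0.1, 0.3, 0.7, 0.2]
points = det_curve(genuine_scores, impostor_scores, [0.5, 0.65, 0.85])
```

Sweeping the threshold from low to high trades false acceptances for false rejections, which is what the DET graph visualizes.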
  • 5. Christmas bear, © Amsterdam Library of Object Images
  • 6.
  • 7. AVM performance. Time performance: average time of processing each image (in ms). Tree capacity: total number of images in the tree. Test machine: Intel® Xeon® CPU L5630 @ 2.13GHz.
  • 8. Object search in image. Object training (write). Sliding window (read): the scan step is ⅛ of the window size; the window size is scaled up by 25% on each step; the window position is adjusted by AVM. Result: object id, x, y, scale.
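The scan pattern on slide 8 can be sketched as a window generator. This is only an illustration under stated assumptions: a square window, a hypothetical starting size `min_size`, and integer rounding of sizes and steps. The position refinement that AVM performs itself is not modelled.

```python
from typing import Iterator, Tuple


def sliding_windows(image_w: int, image_h: int, min_size: int,
                    scale_factor: float = 1.25) -> Iterator[Tuple[int, int, int]]:
    """Yield (x, y, size) for every window position and scale.

    Scan step is 1/8 of the current window size; the window grows by 25%
    after each full pass, as described on the slide.
    """
    size = float(min_size)
    while size <= min(image_w, image_h):
        win = int(size)
        step = max(1, win // 8)  # scan step: 1/8 of the window size
        for y in range(0, image_h - win + 1, step):
            for x in range(0, image_w - win + 1, step):
                yield x, y, win
        size *= scale_factor  # scale the window up by 25%


# Example: scan a 64x64 image starting from a 32x32 window.
windows = list(sliding_windows(64, 64, 32))
```

Each yielded window would be passed to the AVM read operation, which on a match returns the object id together with x, y and scale.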
  • 9. Autonomous navigation of robots in indoor spaces. The navigation module based on AVM technology allows the robot to orient itself in a space and navigate precisely to a defined point on the map. Pipeline: Webcam → Images → AVM search tree (stores pairs: image -> X, Y and azimuth) → Recognized? → Yes → Actual position (X, Y coordinates, azimuth). In our case, visual navigation for the robot is just a sequence of images with associated coordinates that were memorized inside the AVM tree.
  • 10. Using AVM in robotics: object tracking, "Follow me".
  • 11. Augmented reality by AVM: 3D marker of the target position.
  • 12. Implementation for RoboRealm - AVM Navigator. AVM Navigator is a module of the RoboRealm system that provides object recognition and autonomous robot navigation using a single video camera on the robot as the main sensor for navigation. Localization error: about 0.1 meter (10 centimeters).
  • 16. Automatic searching of video fragments. Film → Frame → Image → S-core cluster (s-core #1, s-core #2, ... s-core #N, each with an AVM search tree) → Film ID, Position → MultiTrack (assembling module), Database → Video fragment #1, Video fragment #2, ... Video fragment #M (each: film name, position, length).
  • 17. MultiTrack - assembling of duplicates. Scanned (unknown) video: Fragment #1, Fragment #2, Fragment #3, Fragment #4. Search results: Source fragment #1 (Duplicate video #1.1, #1.2, #1.3); Source fragment #2 (Duplicate video #2.1, #2.2); Source fragment #3; Source fragment #4 (Duplicate video #4.1).
  • 18. Distributed system. Customer system, REST API, Invarivision - ISS. Base server: task management, database, s-core, s-coordinator. Node servers #1, #2, ... #N: s-core and s-coordinator on each. All these servers can contain applications for video processing and image recognition. s-coordinator: application for coordinating video processing. s-core: application for reading/writing separate images in the search tree.
  • 19. Software structure. Components: Video, Database, s-coordinator, Frame change detector (12%), s-core #1, s-core #2, ... s-core #N; transport: Ethernet, UDP Multicast. Data: Image, Film ID, Position. Operations: write; read/search.
  • 20. Scaling of the search system. Network of the search cores (computer cluster): s-core #1,1, s-core #2,1, ... s-core #N,1; s-core #1,2, s-core #2,2, ... s-core #N,2; s-core #1,M, s-core #2,M, ... s-core #N,M. Axes: write speed and capacity; read speed.
  • 21. Scheme of the video write. Video frames: 1 2 3 4 5 6 7 8 9 ... 3x3 s-core grid: s-core #1,1, s-core #2,1, s-core #3,1; s-core #1,2, s-core #2,2, s-core #3,2; s-core #1,3, s-core #2,3, s-core #3,3.
  • 22. Scheme of the video read. Video frames: 1 2 3 4 5 6 7 8 9 ... 3x3 s-core grid: s-core #1,3, s-core #2,3, s-core #3,3; s-core #1,2, s-core #2,2, s-core #3,2; s-core #1,1, s-core #2,1, s-core #3,1.
  • 23. Tree splitting. Scaling of capacity: Tree #1 → Tree #1.1, Tree #1.2. Capacity alignment when adding video for searching: Images → Alignment system → Tree #1, Tree #2, ... Tree #N; next stage: Tree #1, Tree #2, ... Tree #N.
  • 24. Storage capacity. RAM: 6.2 MB → 1 hour of video (with FCD, the frame change detector, set to 12%). Server with 256 GB RAM → 41290 hours of source video for searching. Using an SSD disk as swap space: 1.4 TB → 225806 hours. Speed with FCD set to 12% of frames on 1 base server (Dual Xeon 2xE5690, 3.47 GHz) is about 50 video hours per hour.
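The capacity figures on slide 24 follow from dividing the available storage by 6.2 MB per video hour. A short check, assuming decimal units (1 GB = 1000 MB, 1 TB = 1,000,000 MB), which is the convention under which the slide's numbers come out exactly; the function names are illustrative only.

```python
MB_PER_VIDEO_HOUR = 6.2        # from the slide, with FCD at 12%
VIDEO_HOURS_PER_HOUR = 50      # processing speed quoted for one base server


def hours_of_video(storage_mb: float,
                   mb_per_hour: float = MB_PER_VIDEO_HOUR) -> int:
    """Whole hours of source video that fit into the given storage."""
    return int(storage_mb // mb_per_hour)


def processing_hours(video_hours: float,
                     speed: float = VIDEO_HOURS_PER_HOUR) -> float:
    """Wall-clock hours one base server needs to process that much video."""
    return video_hours / speed


ram_hours = hours_of_video(256_000)      # 256 GB of RAM
ssd_hours = hours_of_video(1_400_000)    # 1.4 TB of SSD swap
print(ram_hours, ssd_hours)
```

At the quoted speed of about 50 video hours per hour, filling the RAM-sized index on a single base server would take roughly 826 hours of processing.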
  • 25. Interference resistance. Tested distortions: worst quality (CRF 51); padded corner 15%; padded center 10%; white noise 100%; 10 degrees rotated; padded center 5%; grayscale; padded corner 5%; padded corner 10%; white noise 50%; cropped from center 15%; 5 degrees rotated.
  • 26. Test results.
      Data set                    Average precision %   Average recall %   Average F-measure %
      5 degrees rotated*          100                   93.82              96.81
      10 degrees rotated*         100                   18.54              31.28
      White noise 50%*            100                   98.09              99.03
      White noise 100%*           100                   93.9               96.85
      Padded center 5%*           100                   97.04              98.5
      Padded center 10%*          100                   48.8               65.59
      Padded corner 5%*           100                   97.84              98.91
      Padded corner 10%*          100                   89.89              94.68
      Padded corner 15%*          100                   41.12              58.28
      Cropped from center 10%*    100                   96.54              98.24
      Cropped from center 15%*    100                   67.44              80.55
      Constant Rate Factor 51*    100                   96.53              98.23
      Grayscale*                  100                   97.5               98.73
      For each scanned interval in the video we can define one of the following situations:
      ● True Positive (TP): the system found the correct matching original interval
      ● False Positive (FP): the system found an incorrect matching original interval
      ● False Negative (FN): the system didn't find a matching original interval, but one does exist
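The table's columns relate to the TP, FP and FN counts through the standard definitions of precision, recall and F-measure (their harmonic mean). The sketch below uses those textbook formulas; the slides do not show how the averages were aggregated across videos, so the interval counts in the example are hypothetical.

```python
def precision(tp: int, fp: int) -> float:
    """Share of reported matches that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of existing matches that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts: 9 intervals found correctly, none wrongly, 1 missed.
p = precision(tp=9, fp=0)   # 1.0
r = recall(tp=9, fn=1)      # 0.9

# Reproducing table rows from their precision and recall percentages.
f_rot5 = round(f_measure(100.0, 93.82), 2)
f_rot10 = round(f_measure(100.0, 18.54), 2)
```

With precision at 100% in every row, the F-measure is driven entirely by recall, which is why the 10-degree rotation row drops to 31.28%.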
  • 27. Thank you for your attention! Questions? Site: Invarivision.com Email: maxim.kamensky@invarivision.com Skype: maxim.kamensky Phone: +380662346738 EECVC presentation, July 9 2016