Automatic Selection of Object
Recognition Methods Using
Reinforcement Learning
Reinaldo A.C. Bianchi†, Arnau Ramisa‡,
and Ramón López de Mántaras‡
†Centro Universitário da FEI, Brazil
‡Artificial Intelligence Research Institute, Spain
Presenter: Shunta SAITO
Wednesday, May 22, 2013
Authors
• Reinaldo A.C. Bianchi
‣ full professor at the Electrical Engineering Department of the
Centro Universitário da FEI, in São Bernardo do Campo, São
Paulo, Brazil
• Arnau Ramisa
‣ postdoc with the Perception and Manipulation team at the
Industrial Robotics Institute (IRI UPC-CSIC), at the
Universitat Politècnica de Catalunya
• Ramón López de Mántaras
‣ Director of the IIIA (Artificial Intelligence Research Institute)
of the CSIC (Spanish National Research Council)
Abstract
Question:
Which algorithm should be used to recognize objects?
Goal:
Automatically select the best algorithm from 2 state-of-the-art object recognition algorithms
Methodology:
Using Reinforcement Learning
Background:
The robot should be able to decide by itself which object
recognition method should be used, depending on the
current conditions of the world
Reinforcement Learning
The RL problem is meant to be a straightforward framing of
the problem of learning from interaction to achieve a goal
(Sutton and Barto, 1998)
Formulation:
assuming that the environment satisfies the Markov property
- A finite set of states $s \in S$ that the agent can reach;
- A finite set of possible actions $a \in A$ that the agent can perform;
- A state transition function $T : S \times A \to \Pi(S)$, where $\Pi(S)$ is a probability distribution over $S$;
- A finite set of bounded reinforcements (payoffs) $R : S \times A \to \mathbb{R}$
task: find a stationary policy of actions $\pi^* : S \to A$
Reinforcement Learning
Formulation:
$Q^*(s, a)$: optimal state-action function
is the reward received upon performing action $a$ in state $s$
+ the discounted value of following the optimal policy thereafter
$Q^*(s, a) \equiv R(s, a) + \gamma \sum_{s' \in S} T(s, a, s') V^*(s')$
$T(s, a, s')$: transition probability
optimal policy $\pi^* \equiv \arg\max_a Q^*(s, a)$
$Q^*(s, a) \equiv R(s, a) + \gamma \sum_{s' \in S} T(s, a, s') \max_{a'} Q^*(s', a')$
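The recursive definition of $Q^*$ can be turned into a fixed-point iteration when $T$ and $R$ are known. The sketch below is only an illustration on a hypothetical two-state MDP (the states, actions, rewards and the function names `q_value_iteration` and `greedy_policy` are assumptions, not taken from the paper); it repeatedly applies the last equation and then reads off $\pi^*$ as the arg max over actions.

```python
def q_value_iteration(states, actions, T, R, gamma=0.9, sweeps=200):
    """Apply Q(s,a) = R(s,a) + gamma * sum_s' T(s,a,s') * max_a' Q(s',a') repeatedly."""
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(sweeps):
        # Each sweep builds the new table from the previous one.
        Q = {
            (s, a): R[(s, a)] + gamma * sum(
                p * max(Q[(s2, a2)] for a2 in actions)
                for s2, p in T[(s, a)].items()
            )
            for s in states
            for a in actions
        }
    return Q


def greedy_policy(Q, states, actions):
    """pi*(s) = argmax_a Q(s, a)."""
    return {s: max(actions, key=lambda a: Q[(s, a)]) for s in states}


# Hypothetical toy MDP: staying in s1 pays 1, everything else pays 0.
states = ["s0", "s1"]
actions = ["stay", "move"]
T = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "move"): {"s1": 1.0},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "move"): {"s0": 1.0},
}
R = {
    ("s0", "stay"): 0.0,
    ("s0", "move"): 0.0,
    ("s1", "stay"): 1.0,
    ("s1", "move"): 0.0,
}

Q = q_value_iteration(states, actions, T, R)
policy = greedy_policy(Q, states, actions)
```

With $\gamma = 0.9$ the value of staying in `s1` approaches $1 / (1 - 0.9) = 10$, so the greedy policy moves from `s0` to `s1` and then stays.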
Q-Learning
Formulation:
Let $\hat{Q}$ be the learner's estimate of $Q^*(s, a)$
The Q-learning algorithm iteratively approximates $\hat{Q}$;
the $\hat{Q}$ values will converge with probability 1 to $Q^*$
$\hat{Q}(s, a) \leftarrow \hat{Q}(s, a) + \alpha \left[ r + \gamma \max_{a'} \hat{Q}(s', a') - \hat{Q}(s, a) \right]$
$\gamma$: discount factor $(0 \le \gamma < 1)$
$\alpha$: step-size parameter
$\alpha = \frac{1}{1 + \mathrm{visits}(s, a)}$
$\mathrm{visits}(s, a)$: total number of times this state-action pair has been visited
[Figure: backup diagram from $\hat{Q}(s, a)$ to $\hat{Q}(s', a')$]
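A single Q-learning backup with the visit-based step size can be sketched as follows. The class name `QTable` and its layout are hypothetical, and the sketch assumes that visits(s, a) counts the visits made before the current one, so the first update uses a step size of 1; the slide does not say which convention is intended.

```python
from collections import defaultdict


class QTable:
    """Tabular Q estimate with a step size of 1 / (1 + visits(s, a))."""

    def __init__(self, actions, gamma=0.9):
        self.actions = list(actions)
        self.gamma = gamma
        self.q = defaultdict(float)     # unseen pairs start at 0
        self.visits = defaultdict(int)  # visit counter per (s, a)

    def update(self, s, a, r, s_next):
        # Assumption: visits counts previous visits only.
        alpha = 1.0 / (1 + self.visits[(s, a)])
        self.visits[(s, a)] += 1
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        td_error = r + self.gamma * best_next - self.q[(s, a)]
        self.q[(s, a)] += alpha * td_error
        return self.q[(s, a)]


table = QTable(actions=["left", "right"])
first = table.update(s=0, a="right", r=1.0, s_next=1)   # alpha = 1
second = table.update(s=0, a="right", r=0.0, s_next=1)  # alpha = 1/2
```

With this step size the estimate is the running average of the sampled targets, which is why the second update lands halfway between the two targets.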
Reinforcement Learning
Q-learning Algorithm:
Q.initialize(arbitrarily)
episodes.each do
  s = initial_state
  until s.is_terminal?                       # loop until a terminal state is reached
    a = epsilon_greedy(Q, s)                 # using policy derived from Q
    reward, next_state = take_action(s, a)
    Q.update(s, a, reward, next_state)       # backup rule below
    s = next_state
  end
end
$\hat{Q}(s, a) \leftarrow \hat{Q}(s, a) + \alpha \left[ r + \gamma \max_{a'} \hat{Q}(s', a') - \hat{Q}(s, a) \right]$
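The loop above can be made concrete on a small example. The following is a minimal sketch, assuming a hypothetical five-state chain world in which only reaching the last state pays a reward; the environment, the function names and the tie-breaking rule are illustrative choices and are not part of the paper.

```python
import random


def epsilon_greedy(Q, s, actions, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(Q[(s, a)] for a in actions)
    return rng.choice([a for a in actions if Q[(s, a)] == best])


def take_action(s, a, n_states):
    """Hypothetical chain world: states 0..n_states-1, the last state is terminal."""
    s_next = min(max(s + a, 0), n_states - 1)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return reward, s_next


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    actions = [-1, 1]  # step left or right
    Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:  # until s is terminal
            a = epsilon_greedy(Q, s, actions, epsilon, rng)
            r, s_next = take_action(s, a, n_states)
            target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s_next
    return Q


Q = q_learning()
```

After training, stepping right has the larger estimate in every non-terminal state, which is the optimal behaviour in this toy world.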
RL applications in Computer Vision
Active Vision
Whitehead and Ballard (1991) => Machine Learning (Jnl.)
described an adaptive control architecture to integrate active sensory-motor
systems with RL based decision systems
Minut and Mahadevan (2001) => ICAA
proposed a model of selective attention for visual search tasks, such as
deciding where to fixate next in order to reach the region where an object is
most likely to be found
Darrell and Pentland (1996a,b) => NIPS, ICPR
proposed a gesture recognition system that guides an active camera to
foveate salient features based on an RL paradigm
(foveate: to angle one's eyes such that the foveae are directed at an object in one's field of view)
RL applications in Computer Vision
Active Vision
Darrell (1998) => NIPS
concisely represented active recognition behavior derived from hidden-state
reinforcement learning techniques
Paletta and Pinz (2000) => Robotics and Autonomous Systems (Jnl.)
applied RL in an active object recognition system, to learn how to move
the camera to informative viewpoints, defining the recognition process as a
sequential decision problem with the objective of disambiguating initial
object hypotheses
For these authors...
Reinforcement Learning provides then an efficient method to
autonomously develop near-optimal decision strategies in terms of
sensorimotor mappings (Paletta et al, 1998)
RL applications in Computer Vision
Active Vision
Borotschnig, et al. (1999) => Image and Vision Computing (Jnl.)
built a system that learns to reposition the camera to capture additional
views to improve the image classification result which is obtained from a
single view
Paletta, et al. (2005) => ICML
proposed the use of Q-learning to associate shift of attention actions to
cumulative reward with respect to object recognition
Image Segmentation
Peng and Bhanu (1998) => IEEE Trans. on PAMI
used RL to learn to adapt the image segmentation parameters of a specific
algorithm to the changing environmental conditions
RL applications in Computer Vision
Image Segmentation and Object Recognition
Peng and Bhanu (1998, 2000) => IEEE Trans. on SMC
improved the recognition results by using the output at the highest level as
feedback for the learning system
Parameter Selection in Vision Problem
Taylor (2004) => MSc Thesis
proposed a general framework for applying RL to parameter selection
problems in vision
Tizhoosh and Taylor (2006) => Int. Jnl. of Image and Graphics
proposed an automated technique for obtaining a subjectively ideal image
enhancement
RL applications in Computer Vision
Parameter Selection in Vision Problem
Shokri and Tizhoosh (2003∼2008) => some Int. Jnl.s...
Yin (2002) => Signal Processing (Jnl.)
proposed a reinforcement agent for finding an optimal threshold in order
to segment digital images
Sahba, et al. (2008) => Expert Systems with Applications (Jnl.)
designed a general framework for an intelligent system to extract one object
of interest from ultrasound images based on reinforcement learning
Hossain, et al. (1999) => IEEE SMC
proposed an RL system for adaptive tropical cyclone pattern segmentation
and feature extraction from satellite imagery and introduced a closed-loop
tropical cyclone forecast system based on RL
RL applications in Computer Vision
Object Recognition
Draper, et al. (1999) => Jnl. of Computer Vision Research
modeled the object recognition problem as a Markov Decision Problem,
and proposed a theoretically sound method for constructing object
recognition strategies by combining CV algorithms to perform
segmentation (the result is a system called ADORE (Adaptive Object
Recognition) that automatically learns object-recognition strategies from
training data)
There are many applications of RL in Computer Vision
Summary of RL in CV
Whitehead and Ballard (1991)
adaptive control architecture:
active sensory-motor systems + RL based decision systems
diversified to...
optimize the performance of active vision systems
decide where the focus of attention should be
learn how to move a camera to more informative viewpoints
optimize parameters of existing and new CV algorithms
Limitations caused by RL
In Object Recognition Task
the reward value associated with a situation
‣ is usually not directly available
‣ requires a certain amount of knowledge about the
world to be defined
the large state space
‣ makes it difficult to converge
‣ raises performance issues for RL algorithms
Two Object Recognition Methods
Lowe's Feature Matching method (Lowe, 2004)
‣ is proposed together with SIFT
‣ is a single view object detection
and recognition system
‣ matches features between a test and model images as below
Vocabulary Tree Algorithm (Nistér and Stewénius, 2006)
‣ uses visual words (Bag-of-Words) to classify images
Lowe's Method
[Figure: Model, Input, Result; task: Identification]
Vocabulary Tree Algorithm
[Figure: "which class?"; task: Classification]
Preprocessing
• segmentation
1. apply bilateral filtering to remove texture from the
image
2. the Canny edge detector is applied to define the
edges in the image
3. mathematical morphology operators are applied in
order to close the contours that remained open
4. a flood-fill algorithm is used to fill connected areas
divided by the edges
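Step 4 of the segmentation can be illustrated without any imaging library. The sketch below assumes the edge map from steps 1 to 3 is already available as a binary grid (1 = edge pixel) and labels each 4-connected non-edge area with a breadth-first flood fill; the function name `label_regions` and the toy edge map are hypothetical, and the slides do not say which connectivity or implementation the authors used.

```python
from collections import deque


def label_regions(edges):
    """Flood-fill every 4-connected area of non-edge cells; edge cells keep label 0."""
    rows, cols = len(edges), len(edges[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if edges[r0][c0] or labels[r0][c0]:
                continue  # edge pixel or already filled
            next_label += 1
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not edges[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
    return labels, next_label


# Toy edge map: a closed contour separates the top-left corner from the rest.
edges = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
labels, n_regions = label_regions(edges)
```

Each labelled area is a candidate segment; an open contour would let the fill leak into the neighbouring area, which is why step 3 closes the contours first.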
Weaknesses of two methods
Lowe's Feature Matching Method
performs poorly when recognizing sparsely textured
objects or objects with repetitive patterns
Vocabulary Tree Algorithm
needs an accurate segmentation stage prior to
classification, which can be very time consuming, and it
depends on the quality of the segmentation stage to
provide good results
Learning to Select Object Recognition Methods
use Reinforcement Learning as a classification method
1st stage
1. decide to use the image for recognition
‣ because the image contains an object
2. decide the image should be discarded
‣ because the image does not contain objects
2nd stage
1. decide to use Lowe's algorithm
2. decide to use Vocabulary Tree (VT) Algorithm
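The two-stage decision can be sketched as a small dispatch function. Everything here is hypothetical: the feature names, the two stand-in classifiers and their thresholds only mark where the learned classifiers of each stage would plug in.

```python
def select_method(features, contains_object, prefers_lowe):
    """Return 'discard', 'lowe' or 'vocabulary_tree' for one image."""
    # 1st stage: keep the image for recognition or discard it.
    if not contains_object(features):
        return "discard"
    # 2nd stage: choose the recognition method.
    return "lowe" if prefers_lowe(features) else "vocabulary_tree"


# Hypothetical stand-ins for the two learned classifiers.
def contains_object(features):
    return features["std"] > 5.0


def prefers_lowe(features):
    return features["interest_points"] >= 20


flat_background = {"std": 1.0, "interest_points": 0}
textured_object = {"std": 30.0, "interest_points": 150}
plain_object = {"std": 30.0, "interest_points": 3}

decisions = [
    select_method(f, contains_object, prefers_lowe)
    for f in (flat_background, textured_object, plain_object)
]
```

Keeping the stages separate means the second classifier only has to handle images that already passed the first one.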
State
State space
attributes extracted from the images + the possible classification of the image
State definition example in 1st stage
$s = [I, \sigma, c]$
$I$ : mean image intensity
$\sigma$ : standard deviation of image intensity
$c$ : class ID
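As an illustration of the state definition, the sketch below computes the mean and standard deviation of the grey levels and combines them with a class ID. The discretisation into bins of width 10 and the function name `image_state` are assumptions made here so that the state space stays finite; the slides do not give the bin size used by the authors.

```python
import math


def image_state(pixels, class_id, bin_size=10):
    """Build s = [I, sigma, c] from a 2-D list of grey levels (bin size is an assumption)."""
    values = [v for row in pixels for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    # Discretise both attributes so that states form a finite grid.
    return (int(mean // bin_size), int(std // bin_size), class_id)


pixels = [
    [40, 60],
    [40, 60],
]
state = image_state(pixels, class_id=1)
```

For this toy image the mean intensity is 50 and the standard deviation is 10, which fall into bins 5 and 1.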
Action
Update action (not a real action happening in the world)
→ action toward the state space
update the value $Q(s, a)$ of a state-action pair at one state
using the value of a neighbor pair
Example:
if the state space is composed of [I, σ] (2D)
Q(s, a) values (axes: I and σ)
0.1 0.7 1.2 3.1 1.8
0.5 1.1 0.3 2.6 4.1
1.4 2.3 3.2 0.9 2.7
0.7 4.3 2.7 1.4 3.9
3.2 4.6 1.3 1.7 0.7
(after update action a)
0.1 0.7 1.2 3.1 1.8
0.5 1.1 0.3 2.6 4.1
1.4 2.3 2.3 0.9 2.7
0.7 4.3 2.7 1.4 3.9
3.2 4.6 1.3 1.7 0.7
Reward
If the learning agent reaches a state where a training image
exists and the state corresponds to the correct classification
of the image, the agent receives a reward
Example:
a state (mean = 50, std = 10, classification = discard)
a training image (mean = 50, std = 10, does not contain objects) exists?
yes → reward > 0
no → reward = 0
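The reward rule can be written as a lookup into the set of pre-classified training images. This is a sketch under assumptions: the dictionary layout, the function name `make_reward`, the label strings and the value +10 for the positive reward are choices made here; the slide only states that the reward is positive on a match and zero otherwise.

```python
def make_reward(training_labels, positive=10.0):
    """training_labels maps (mean, std) to the known class of a training image."""

    def reward(state):
        mean, std, class_id = state
        # Positive only if a training image exists here AND the class matches.
        if training_labels.get((mean, std)) == class_id:
            return positive
        return 0.0

    return reward


# The example from the slide: a background image with mean 50 and std 10.
reward = make_reward({(50, 10): "discard"})
r_match = reward((50, 10, "discard"))
r_wrong_class = reward((50, 10, "recognize"))
r_no_image = reward((60, 10, "discard"))
```

Only the first call returns the positive reward; a wrong class or a state without a training image yields zero.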
MDP Definition
• The set of update actions $a \in A$ that the agent can perform,
defined as update the Q value using the value of a neighbor
• The finite set of states $s \in S$ in this case is the n-dimensional
space of values of the attributes extracted from the images
plus its classification;
• The state transition function allows updates to be made
between any pair of neighbors in the set of states
• The reinforcements $R : S \times A \to \mathbb{R}$ are defined using a set of
training images
Training phase
Reinforcement learning is performed over a set of pre-
classified images
→ learn a mapping from images to image classes
Learning algorithm
Training phase
What is happening during the learning phase?
Every time the robot moves,
it iteratively updates the origin
state-action pair
Every time the robot finds a
goal state and receives a
reward, the state-action pair
where the robot was before
reaching the goal state is
updated
[Figure label: Goal]
Training phase
[Figure: class id table over I and σ, before and after learning]
class id
id 0: no such image in dataset
id 1: an image that should be recognized by Lowe's method is found
id 2: an image that should be recognized by VTA is found
Training phase
[Figure: class id table over I and σ, before and after learning]
The right image shows a table where the classification was
spread over to states where there are no prior examples, and
that allows the classification of other images
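How learned values can spread a classification to states without training examples can be sketched on a small grid. This is a simplified illustration, not the authors' implementation: it uses deterministic sweeps of the Q-learning backup (step size 1) instead of sampled episodes, treats goal states as terminal, and uses hypothetical names and a hypothetical 5 x 5 grid with two labelled cells.

```python
def spread_classification(width, height, training, gamma=0.9, reward=10.0, sweeps=50):
    """training maps (x, y) grid cells to class ids; returns a class for every cell."""
    classes = sorted(set(training.values()))
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # V[c][(x, y)] stands for max_a Q((x, y, c), a).
    V = {c: {(x, y): 0.0 for x in range(width) for y in range(height)} for c in classes}
    for _ in range(sweeps):
        for c in classes:
            new_v = {}
            for (x, y) in V[c]:
                if training.get((x, y)) == c:
                    new_v[(x, y)] = 0.0  # goal state, treated as terminal
                    continue
                best = 0.0
                for dx, dy in moves:
                    nx, ny = x + dx, y + dy
                    if not (0 <= nx < width and 0 <= ny < height):
                        continue
                    # Reward is paid on entering a cell holding a class-c example.
                    r = reward if training.get((nx, ny)) == c else 0.0
                    best = max(best, r + gamma * V[c][(nx, ny)])
                new_v[(x, y)] = best
            V[c] = new_v
    table = {}
    for x in range(width):
        for y in range(height):
            if (x, y) in training:
                table[(x, y)] = training[(x, y)]
            else:
                table[(x, y)] = max(classes, key=lambda c: V[c][(x, y)])
    return table


# Two labelled training cells; every other cell has no prior example.
table = spread_classification(5, 5, {(0, 0): 1, (4, 4): 2})
```

Under these simplifications the value of a cell decays with its grid distance to the nearest example of a class, so each unlabelled cell takes the class of the closest example. Cells that are equally far from both examples are tie cases and fall to the first class in sorted order.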
Learning and Test database
image dataset of nine typical household objects
(Ramisa et al, 2008)
3 categories: objects with repetitive texture, textured objects, non-textured objects
each category consists of 3 different objects and each object has
approximately 20 training images
Learning and Test database
image dataset of nine typical household objects
(Ramisa et al, 2008)
test images
include occlusions, illumination changes, blur and other typical nuisances
that will be encountered while navigating with a mobile robot
Learning and Test database
image dataset of nine typical household objects
(Ramisa et al, 2008)
background images
that do not contain objects to be recognized
State Space Descriptors
MS
mean and standard deviation of the image intensity
MSE
mean and standard deviation of the image intensity plus
entropy of the image
MSI
mean and standard deviation of the image intensity plus the
number of interest points detected by the Difference of
Gaussians operator
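The MSE descriptor can be sketched in a few lines. The sketch assumes that "entropy of the image" means the Shannon entropy (in bits) of the grey-level histogram, which the slides do not spell out; the function name `mse_descriptor` is hypothetical. The interest point count of the MSI descriptor needs a Difference of Gaussians detector and is left out.

```python
import math
from collections import Counter


def mse_descriptor(pixels):
    """Return (mean, std, entropy) of a 2-D list of grey levels."""
    values = [v for row in pixels for v in row]
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Shannon entropy of the grey-level histogram, in bits (an assumption).
    counts = Counter(values)
    entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
    return mean, std, entropy


# Half black, half white: two equally likely grey levels give 1 bit of entropy.
mean, std, entropy = mse_descriptor([[0, 0], [255, 255]])
```

Dropping the third value gives the MS descriptor; discretising the values, as the state definition requires, would be a separate step.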
Experiments
Reward table
reward +10
reward -10
x: an image with this combination of mean and std dev values
exists in the training set
.: represents images that do not
contain objects (backgrounds)
(whitespace): represents absence of
images
Classification Table
the results of applying the RL algorithm during the learning phase
x: an image with this combination of mean and std dev values
exists in the training set
.: represents images that do not
contain objects (backgrounds)
(whitespace): represents absence of
images
Experiments
object image
(3 categories)
test image
(with nuisance)
background image
(with no objects)
image dataset of nine typical household objects
(Ramisa et al, 2008)
Experiment phases
1. the training of the RL
‣ 40 test images, from which approximately 160 images containing objects
were segmented and previously classified
‣ 360 background images, also resulting from the segmentation process
2. the execution phase where training quality can be verified
Correctly Classified Images (%)
        Full Img               Small Img              Expert
        MS     MSE    MSI      MS     MSE    MSI
Back    91.9   100.0  98.0     92.6   100.0  98.9     100.0
Lowe    84.5   100.0  44.4     76.0   98.4   38.1     93.2
$\alpha = 0.1$, $\gamma = 0.9$
Incorrect Classification (%)
        Full Img               Small Img              Expert
        MS     MSE    MSI      MS     MSE    MSI
Back    12.8   1.8    14.2     20.4   2.4    25.3     8.2
Lowe    11.6   1.9    7.9      15.8   1.9    9.9      10.8
$\alpha = 0.1$, $\gamma = 0.9$
Conclusion
• Reinforcement Learning has been widely used in the
Computer Vision field
• In this paper we presented a method that uses
Reinforcement Learning to decide which algorithm
should be used to recognize objects seen by a mobile
robot in an indoor environment
• Another important contribution of this work is a
method that allows the use of a Reinforcement
Learning algorithm as a Classifier
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Automatic selection of object recognition methods using reinforcement learning

  • 1. Automatic Selection of Object Recognition Methods Using Reinforcement Learning. Reinaldo A.C. Bianchi†, Arnau Ramisa‡, and Ramón López de Mántaras‡. †Centro Universitário da FEI, Brazil; ‡Artificial Intelligence Research Institute, Spain. Presenter: Shunta SAITO. Wednesday, May 22, 2013
  • 2. Authors
      • Reinaldo A.C. Bianchi ‣ Full professor at the Electrical Engineering Department of the Centro Universitário da FEI, São Bernardo do Campo, São Paulo, Brazil
      • Arnau Ramisa ‣ Postdoc with the Perception and Manipulation team at the Industrial Robotics Institute (IRI UPC-CSIC), Universitat Politècnica de Catalunya
      • Ramón López de Mántaras ‣ Director of the IIIA (Artificial Intelligence Research Institute) of the CSIC (Spanish National Research Council)
  • 3. Abstract
      Question: Which algorithm should be used to recognize objects?
      Goal: Automatically select the better of 2 state-of-the-art object recognition algorithms
      Methodology: Reinforcement Learning
      Background: The robot should be able to decide by itself which object recognition method to use, depending on the current conditions of the world
  • 5. Reinforcement Learning
      "The RL problem is meant to be a straightforward framing of the problem of learning from interaction to achieve a goal" (Sutton and Barto, 1998)
      Formulation, assuming that the environment satisfies the Markov property:
      - A finite set of states $s \in S$ that the agent can reach;
      - A finite set of possible actions $a \in A$ that the agent can perform;
      - A state transition function $T : S \times A \to \Pi(S)$, where $\Pi(S)$ is a probability distribution over $S$;
      - A finite set of bounded reinforcements (payoffs) $R : S \times A \to \mathbb{R}$.
      Task: find an optimal stationary policy of actions $\pi^* : S \to A$
  • 6. Reinforcement Learning
      Formulation: $Q^*(s, a)$ is the reward received upon performing action $a$ in state $s$, plus the discounted value of following the optimal policy thereafter:
      $Q^*(s, a) \equiv R(s, a) + \gamma \sum_{s' \in S} T(s, a, s') V^*(s')$
      where $T(s, a, s')$ is the transition probability.
      Optimal policy: $\pi^* \equiv \arg\max_a Q^*(s, a)$
      Optimal state-action function: $Q^*(s, a) \equiv R(s, a) + \gamma \sum_{s' \in S} T(s, a, s') \max_{a'} Q^*(s', a')$
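When the transition function T and the rewards R are known, the fixed-point equation for the optimal state-action function can be solved by repeated sweeps. The following is a minimal sketch under stated assumptions, not something taken from the slides: the two-state MDP, the names `q_value_iteration` and `greedy_policy`, and the sweep count are all hypothetical and chosen only for illustration.

```python
def q_value_iteration(states, actions, T, R, gamma=0.9, sweeps=200):
    """Repeat Q(s,a) <- R(s,a) + gamma * sum_s' T(s,a,s') * max_a' Q(s',a')."""
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(sweeps):
        new_Q = {}
        for s in states:
            for a in actions:
                # Expected value of acting optimally from the successor state.
                expected = sum(
                    p * max(Q[(s2, a2)] for a2 in actions)
                    for s2, p in T[(s, a)].items()
                )
                new_Q[(s, a)] = R[(s, a)] + gamma * expected
        Q = new_Q
    return Q


def greedy_policy(Q, states, actions):
    """pi(s) = argmax_a Q(s, a)."""
    return {s: max(actions, key=lambda a: Q[(s, a)]) for s in states}


# Hypothetical toy MDP: staying in s1 pays 1, everything else pays 0.
STATES = ["s0", "s1"]
ACTIONS = ["stay", "move"]
T = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "move"): {"s1": 1.0},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "move"): {"s0": 1.0},
}
R = {
    ("s0", "stay"): 0.0,
    ("s0", "move"): 0.0,
    ("s1", "stay"): 1.0,
    ("s1", "move"): 0.0,
}

Q_STAR = q_value_iteration(STATES, ACTIONS, T, R)
POLICY = greedy_policy(Q_STAR, STATES, ACTIONS)
```

In this toy case the greedy policy moves from s0 to s1 and then stays there. Q-learning, introduced on the next slide, targets the same fixed point without needing T or R in advance.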
  • 7. Q-Learning
      Formulation: Let $\hat{Q}$ be the learner's estimate of $Q^*(s, a)$. The Q-learning algorithm iteratively approximates $\hat{Q}$, and the $\hat{Q}$ values will converge with probability 1 to $Q^*$.
      Update rule: $\hat{Q}(s, a) \leftarrow \hat{Q}(s, a) + \alpha \left[ r + \gamma \max_{a'} \hat{Q}(s', a') - \hat{Q}(s, a) \right]$
      Discount factor: $0 \le \gamma < 1$
      Step-size parameter: $\alpha = \frac{1}{1 + \mathrm{visits}(s, a)}$, where $\mathrm{visits}(s, a)$ is the total number of times this state-action pair has been visited
      [Backup diagram: $\hat{Q}(s, a)$ is backed up from the successor values $\hat{Q}(s', a')$]
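A single application of this update rule, with the visit-count step size, might look like the sketch below. The class name `QTable` and its layout are hypothetical. The code also assumes the visit counter is read before it is incremented, so the first update uses α = 1, which is a point the slide does not settle.

```python
from collections import defaultdict


class QTable:
    """Tabular Q estimate with a per-pair decaying step size (illustrative)."""

    def __init__(self, actions, gamma=0.9):
        self.actions = list(actions)
        self.gamma = gamma
        self.q = defaultdict(float)     # Q-hat(s, a), zero-initialized
        self.visits = defaultdict(int)  # visits(s, a)

    def update(self, s, a, r, s_next):
        # alpha = 1 / (1 + visits(s, a)); assumption: count excludes this visit.
        alpha = 1.0 / (1 + self.visits[(s, a)])
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions)
        target = r + self.gamma * best_next
        self.q[(s, a)] += alpha * (target - self.q[(s, a)])
        self.visits[(s, a)] += 1
        return self.q[(s, a)]


table = QTable(actions=["left", "right"])
first = table.update(0, "right", 1.0, 1)   # alpha = 1, estimate jumps to the target
second = table.update(0, "right", 0.0, 1)  # alpha = 1/2, estimate moves halfway
```

With this schedule the estimate for a pair is effectively a running average of the targets seen so far for that pair.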
  • 8. Reinforcement Learning: Q-learning Algorithm
      Q.initialize(arbitrarily)
      Episodes.each do
        s.initialize
        until s.is_terminal?
          a = epsilon_greedy(s)  # using the policy derived from Q
          reward, next_state = take_action a
          Q.update(s, a, reward, next_state)
          s = next_state
        end
      end
      Q.update applies $\hat{Q}(s, a) \leftarrow \hat{Q}(s, a) + \alpha \left[ r + \gamma \max_{a'} \hat{Q}(s', a') - \hat{Q}(s, a) \right]$
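The loop above can be made concrete on a small problem. The sketch below runs tabular Q-learning on a hypothetical five-state chain, where the agent starts at state 0 and receives a reward of 1 on reaching state 4. The environment, the function names, and the random tie-breaking are assumptions for illustration. The values α = 0.1 and γ = 0.9 match those reported with the results later in the deck.

```python
import random

N_STATES = 5        # states 0..4; state 4 is terminal (hypothetical toy chain)
ACTIONS = (-1, +1)  # move left / move right


def take_action(s, a):
    """Deterministic chain: clamp to [0, 4]; reward 1 on reaching the goal."""
    s_next = min(max(s + a, 0), N_STATES - 1)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return reward, s_next


def epsilon_greedy(Q, s, epsilon, rng):
    """Explore with probability epsilon, else act greedily (ties broken randomly)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(Q[(s, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if Q[(s, a)] == best])


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:  # run until the terminal state is reached
            a = epsilon_greedy(Q, s, epsilon, rng)
            reward, s_next = take_action(s, a)
            best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
            Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
            s = s_next
    return Q


Q = q_learning()
GREEDY = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
```

After training, the greedy action in every non-terminal state is to move right, toward the rewarded state.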
  • 9. RL applications in Computer Vision: Active Vision
      - Whitehead and Ballard (1991), Machine Learning (journal): described an adaptive control architecture to integrate active sensory-motor systems with RL based decision systems
      - Minut and Mahadevan (2001), ICAA: proposed a model of selective attention for visual search tasks, such as deciding where to fixate next in order to reach the region where an object is most likely to be found
      - Darrell and Pentland (1996a,b), NIPS, ICPR: proposed a gesture recognition system that guides an active camera to foveate salient features based on an RL paradigm
      Note: to foveate is to angle one's eyes such that the foveae are directed at an object in one's field of view
  • 10. RL applications in Computer Vision: Active Vision
      - Darrell (1998), NIPS: concisely represented active recognition behavior derived from hidden-state reinforcement learning techniques
      - Paletta and Pinz (2000), Robotics and Autonomous Systems (journal): applied RL in an active object recognition system to learn how to move the camera to informative viewpoints, defining the recognition process as a sequential decision problem with the objective of disambiguating initial object hypotheses
      For these authors, "Reinforcement Learning provides then an efficient method to autonomously develop near-optimal decision strategies in terms of sensorimotor mappings" (Paletta et al, 1998)
  • 11. RL applications in Computer Vision
      Active Vision
      - Borotschnig, et al. (1999), Image and Vision Computing (journal): built a system that learns to reposition the camera to capture additional views, improving the image classification result obtained from a single view
      - Paletta, et al. (2005), ICML: proposed the use of Q-learning to associate shift-of-attention actions with cumulative reward with respect to object recognition
      Image Segmentation
      - Peng and Bhanu (1998), IEEE Trans. on PAMI: used RL to learn to adapt the image segmentation parameters of a specific algorithm to changing environmental conditions
  • 12. RL applications in Computer Vision
      Image Segmentation and Object Recognition
      - Peng and Bhanu (1998, 2000), IEEE Trans. on SMC: improved the recognition results by using the output at the highest level as feedback for the learning system
      Parameter Selection in Vision Problems
      - Taylor (2004), MSc Thesis: proposed a general framework for applying RL to parameter selection problems in vision
      - Tizhoosh and Taylor (2006), Int. Jnl. of Image and Graphics: proposed an automated technique for obtaining a subjectively ideal image enhancement
  • 13. RL applications in Computer Vision: Parameter Selection in Vision Problems
      - Shokri and Tizhoosh (2003∼2008), several international journals, and Yin (2002), Signal Processing (journal): proposed a reinforcement agent for finding an optimal threshold in order to segment digital images
      - Sahba, et al. (2008), Expert Systems with Applications (journal): designed a general framework for an intelligent system to extract one object of interest from ultrasound images based on reinforcement learning
      - Hossain, et al. (1999), IEEE SMC: proposed an RL system for adaptive segmentation and feature extraction of tropical cyclone patterns from satellite imagery, and introduced a closed-loop tropical cyclone forecast system based on RL
  • 14. RL applications in Computer Vision: Object Recognition
      - Draper, et al. (1999), Jnl. of Computer Vision Research: modeled the object recognition problem as a Markov Decision Problem and proposed a theoretically sound method for constructing object recognition strategies by combining CV algorithms to perform segmentation. The result is a system called ADORE (Adaptive Object Recognition) that automatically learns object-recognition strategies from training data.
      There are many applications of RL in Computer Vision
  • 15. Summary of RL in CV
      Whitehead and Ballard (1991): adaptive control architecture = active sensory-motor systems + RL based decision systems
      Since then, the work has diversified to:
      - optimize the performance of active vision systems
      - decide where the focus of attention should be
      - learn how to move a camera to more informative viewpoints
      - optimize parameters of existing and new CV algorithms
  • 16. Limitations caused by RL in the Object Recognition Task
      The reward value associated with a situation
      ‣ is usually not directly available
      ‣ requires a certain amount of knowledge about the world to be defined
      The large state space
      ‣ makes it difficult to converge
      ‣ raises performance issues for RL algorithms
  • 17. Two Object Recognition Methods
      Lowe's Feature Matching method (Lowe, 2004)
      ‣ is proposed together with SIFT
      ‣ is a single view object detection and recognition system
      ‣ matches features between a test image and model images
      Vocabulary Tree Algorithm (Nistér and Stewénius, 2006)
      ‣ uses visual words (Bag-of-Words) to classify images
  • 18. Lowe's Method [Figure: a model image and an input image are matched to produce the result; the task is identification]
  • 19. Vocabulary Tree Algorithm [Figure: the input image is assigned to a class ("which class?"); the task is classification]
  • 20. Preprocessing: segmentation
      1. Bilateral filtering is applied to remove texture from the image
      2. The Canny edge detector is applied to define the edges in the image
      3. Mathematical morphology operators are applied to close the contours that remained open
      4. A flood-fill algorithm is used to fill the connected areas divided by the edges
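Steps 1 to 3 normally rely on an image-processing library, so only step 4 is sketched here, using the standard library alone. The sketch assumes the earlier steps have already produced a binary edge map with closed contours. The function name `label_regions`, the 4-connectivity, and the tiny example map are hypothetical choices for illustration.

```python
from collections import deque


def label_regions(edges):
    """Flood-fill every non-edge area of a binary edge map.

    edges: 2D list where 1 marks an edge pixel and 0 a free pixel.
    Returns (labels, count); labels holds 0 on edges and 1..count elsewhere.
    """
    height, width = len(edges), len(edges[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if edges[y][x] or labels[y][x]:
                continue  # edge pixel, or already filled by an earlier seed
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:  # breadth-first fill over 4-connected neighbors
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not edges[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical edge map: one vertical edge splits the image into two areas.
EDGES = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
LABELS, REGION_COUNT = label_regions(EDGES)
```

Each labeled area can then be cropped and passed on as a candidate object region.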
  • 21. Weaknesses of the two methods
      Lowe's Feature Matching Method: performs poorly when recognizing sparsely textured objects or objects with repetitive patterns
      Vocabulary Tree Algorithm: needs an accurate segmentation stage prior to classification, which can be very time-consuming, and its results depend on the quality of that segmentation stage
  • 22. Learning to Select Object Recognition Methods
      Reinforcement Learning is used as a classification method.
      1st stage
      1. decide to use the image for recognition ‣ because the image contains an object
      2. decide the image should be discarded ‣ because the image does not contain objects
      2nd stage
      1. decide to use Lowe's algorithm
      2. decide to use the Vocabulary Tree (VT) Algorithm
  • 23. State
      State space = attributes extracted from the images + the possible classification of the image
      State definition example in the 1st stage: $s = [I, \sigma, c]$
      - $I$: mean image intensity
      - $\sigma$: standard deviation of image intensity
      - $c$: class ID
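Building such a state from a grayscale image could look like the sketch below. The slides do not say how the continuous attributes are discretized, so the fixed bin width, the function name `image_state`, and the use of the population standard deviation are assumptions made only for illustration.

```python
import statistics


def image_state(pixels, class_id, bin_width=10):
    """Map a grayscale image (2D list of intensities) to a state [I, sigma, c].

    Assumption: mean and standard deviation are binned with a fixed width
    so that the state space is finite.
    """
    flat = [p for row in pixels for p in row]
    mean = statistics.fmean(flat)   # I: mean image intensity
    std = statistics.pstdev(flat)   # sigma: population standard deviation
    return (int(mean // bin_width), int(std // bin_width), class_id)


# Hypothetical 2x2 image with mean 50 and standard deviation 10.
STATE = image_state([[40, 60], [40, 60]], class_id=1)
```

Here the image falls into mean bin 5 and standard deviation bin 1, paired with the candidate class ID.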
  • 24. Action
      Update action (not a real action happening in the world): update the value $Q(s, a)$ of a state-action pair at one state using the value of a neighbor pair. An update action is therefore an action taken within the state space of $Q(s, a)$ values.
      Example: if the state space is composed of $[I, \sigma]$ (2D), the table of values over the $I$ and $\sigma$ axes is
          0.1  0.7  1.2  3.1  1.8
          0.5  1.1  0.3  2.6  4.1
          1.4  2.3  3.2  0.9  2.7
          0.7  4.3  2.7  1.4  3.9
          3.2  4.6  1.3  1.7  0.7
      and after action a it becomes
          0.1  0.7  1.2  3.1  1.8
          0.5  1.1  0.3  2.6  4.1
          1.4  2.3  2.3  0.9  2.7
          0.7  4.3  2.7  1.4  3.9
          3.2  4.6  1.3  1.7  0.7
      (the center cell changes from 3.2 to 2.3, the value of its left neighbor)
  • 25. Reward
      If the learning agent reaches a state where a training image exists and the state corresponds to the correct classification of the image, the agent receives a reward.
      Example: given a training image (mean = 50, std = 10, contains no objects), does a state (mean = 50, std = 10, classification = discard) exist? yes → reward > 0; no → reward = 0
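A reward lookup of this kind can be sketched as a dictionary keyed by the attribute part of the state. The magnitudes +10 and -10 are borrowed from the reward table legend later in the deck. Penalizing a wrong classification with -10 is an assumption here, and the names `reward` and `TRAINING_LABELS` are hypothetical.

```python
def reward(state, training_labels, positive=10.0, negative=-10.0):
    """Return the reinforcement for reaching `state` = (mean, std, class_id).

    training_labels maps (mean, std) of each training image to its true class.
    Assumption: a wrong classification of an existing image is penalized.
    """
    mean, std, class_id = state
    true_class = training_labels.get((mean, std))
    if true_class is None:
        return 0.0  # no training image has these attribute values
    return positive if true_class == class_id else negative


# The example from the slide: a background image with mean 50 and std 10.
TRAINING_LABELS = {(50, 10): "discard"}
```

With this table, `reward((50, 10, "discard"), TRAINING_LABELS)` is positive, while a state with no matching training image yields zero.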
  • 26. MDP Definition
      • The finite set of states $s \in S$ is, in this case, the n-dimensional space of values of the attributes extracted from the images plus its classification;
      • The set of update actions $a \in A$ that the agent can perform, defined as "update the Q value using the value of a neighbor";
      • The state transition function allows updates to be made between any pair of neighbors in the set of states;
      • The reinforcements $R : S \times A \to \mathbb{R}$ are defined using a set of training images
  • 27. Training phase
      Reinforcement learning is performed over a set of pre-classified images → learn a mapping from images to image classes
      [Figure: learning algorithm]
  • 28. Training phase
      What is happening during the learning phase?
      - Every time the robot moves, it iteratively updates the origin state-action pair
      - Every time the robot finds a goal state and receives a reward, the state-action pair where the robot was before reaching the goal state is updated
  • 29. Training phase [Figure: class ID table over the σ and I axes, before learning]
      - id 0: no such image in the dataset
      - id 1: an image that should be recognized by Lowe's method is found
      - id 2: an image that should be recognized by the VT algorithm is found
  • 30. Training phase [Figure: class ID table over the σ and I axes, before and after learning]
      The right image shows a table where the classification has spread to states for which there are no prior examples, which allows other images to be classified
  • 31. Learning and Test database
      Image dataset of nine typical household objects (Ramisa et al, 2008)
      3 categories: objects with repetitive texture, textured objects, and non-textured objects. Each category consists of 3 different objects, and each object has approximately 20 training images.
  • 32. Learning and Test database
      Image dataset of nine typical household objects (Ramisa et al, 2008)
      Test images include occlusions, illumination changes, blur and other typical nuisances that will be encountered while navigating with a mobile robot
  • 33. Learning and Test database
      Image dataset of nine typical household objects (Ramisa et al, 2008)
      Background images that do not contain objects to be recognized
  • 34. State Space Descriptors
      - MS: mean and standard deviation of the image intensity
      - MSE: mean and standard deviation of the image intensity plus entropy of the image
      - MSI: mean and standard deviation of the image intensity plus the number of interest points detected by the Difference of Gaussians operator
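The extra attribute in the MSE descriptor can be illustrated with a short sketch. The slides do not define which entropy is meant, so the code assumes the Shannon entropy of the gray-level histogram, in bits. The function name `intensity_entropy` is hypothetical.

```python
import math
from collections import Counter


def intensity_entropy(pixels):
    """Shannon entropy (bits) of the gray-level histogram of a 2D image.

    Assumption: "entropy of the image" means histogram entropy.
    """
    flat = [p for row in pixels for p in row]
    total = len(flat)
    counts = Counter(flat)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A flat image carries no information; a half-black, half-white one carries 1 bit.
FLAT_ENTROPY = intensity_entropy([[7, 7], [7, 7]])
SPLIT_ENTROPY = intensity_entropy([[0, 0], [255, 255]])
```

Appending this value to the mean and standard deviation gives a three-attribute descriptor in the spirit of MSE.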
  • 36. Reward table [Figure: table of rewards over mean and std dev values, with cells marked reward +10 or reward -10]
      Legend:
      - "x": an image with this combination of mean and std dev values exists in the training set
      - ".": represents images that do not contain objects (backgrounds)
      - (whitespace): represents the absence of images
  • 37. Classification Table [Figure: the results of applying the RL algorithm during the learning phase]
      Legend:
      - "x": an image with this combination of mean and std dev values exists in the training set
      - ".": represents images that do not contain objects (backgrounds)
      - (whitespace): represents the absence of images
  • 38. Experiments
      Image dataset of nine typical household objects (Ramisa et al, 2008): object images (3 categories), test images (with nuisances), background images (with no objects)
      Experiment phases
      1. The training of the RL
         ‣ 40 test images, from which approximately 160 images containing objects were segmented and previously classified
         ‣ 360 background images, also resulting from the segmentation process
      2. The execution phase, where training quality can be verified
  • 39. Correctly Classified Images (%), with $\alpha = 0.1$ and $\gamma = 0.9$
                 Full Img                Small Img              Expert
                 MS      MSE     MSI     MS      MSE     MSI
          Back   91.9    100.0   98.0    92.6    100.0   98.9   100.0
          Lowe   84.5    100.0   44.4    76.0    98.4    38.1   93.2
  • 40. Incorrect Classification (%), with $\alpha = 0.1$ and $\gamma = 0.9$
                 Full Img                Small Img              Expert
                 MS      MSE     MSI     MS      MSE     MSI
          Back   12.8    1.8     14.2    20.4    2.4     25.3   8.2
          Lowe   11.6    1.9     7.9     15.8    1.9     9.9    10.8
  • 41. Conclusion
      • Reinforcement Learning has been widely used in the Computer Vision field
      • In this paper we presented a method that uses Reinforcement Learning to decide which algorithm should be used to recognize objects seen by a mobile robot in an indoor environment
      • Another important contribution of this work is a method that allows the use of a Reinforcement Learning algorithm as a classifier