Scalable Robust Learning from
Demonstration with Leveraged
Deep Neural Networks
Sungjoon Choi, Kyungjae Lee, and Songhwai Oh
Seoul National University
Contents
 Learning from Demonstration
 Preliminaries
 Leveraged Gaussian Process
 Leverage Optimization
 Leveraged Deep Neural Network
 Experiment
 Conclusion
Learning from Demonstration
[Diagram: Human Expert → Learning from Demonstration → Execute in Unseen Environments]
Existing Limitations
Most LfD methods assume the optimality of demonstrated actions.
We want to incorporate demonstrations with mixed qualities without labeling them.
Existing Limitations
Some LfD methods are not scalable.
We want to incorporate large-scale demonstrations.
Leveraged Gaussian Process
Choi et al., "Leveraged Non-Stationary Gaussian Process Regression for Autonomous Robot Navigation," ICRA 2015
A standard Gaussian process models trajectories 𝜉𝑗, 𝑗 = 1, …, 𝑁, generated by a single policy 𝜋.
A leveraged Gaussian process models trajectories 𝜉𝑖𝑗, 𝑖 = 1, …, 𝑀, 𝑗 = 1, …, 𝑁, generated by 𝑀 policies 𝜋𝑖, each with a leverage parameter 𝛾𝑖, where the correlation between 𝜋𝑖 and 𝜋𝑖′ is defined by cos((𝜋/2)(𝛾𝑖 − 𝛾𝑖′)) (𝛾𝑖 is between −1 and +1).
For example, demonstrations may be assigned leverages 𝛾 = 1.0, 𝛾 = 0.8, 𝛾 = −0.5, or 𝛾 = 0.
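The leverage correlation above can be sketched in code. Note that the squared-exponential base kernel and the way it is modulated by the correlation term are illustrative assumptions, not the paper's exact leveraged kernel:

```python
import numpy as np

def leverage_correlation(g_i, g_j):
    # Correlation between two policies with leverages g_i, g_j in [-1, +1]:
    # cos((pi/2) * (g_i - g_j)), as defined on the slide.
    return np.cos(0.5 * np.pi * (g_i - g_j))

def leveraged_kernel(x_i, x_j, g_i, g_j, length_scale=1.0):
    # Hypothetical sketch: a squared-exponential base kernel over inputs,
    # modulated by the leverage correlation (assumed combination).
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    base = np.exp(-np.sum(diff ** 2) / (2.0 * length_scale ** 2))
    return base * leverage_correlation(g_i, g_j)
```

Under this correlation, two fully trusted demonstrations (𝛾 = 1) correlate like an ordinary kernel, a 𝛾 = +1 and a 𝛾 = −1 demonstration are perfectly anti-correlated, and 𝛾 = 0 decouples a demonstration entirely.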
Leveraged Gaussian Process
Positive definiteness of the leveraged kernel function can be shown with Bochner's theorem
(an isotropic kernel function is PSD iff its Fourier coefficients are non-negative).
[Diagram: Leveraged Kernel Function → Leveraged Gaussian Process]
However, Gaussian process regression is not suitable for large training sets (exact inference scales cubically in the number of training points).
Leverage Optimization
[Figure: dataset with mixed qualities]
Choi et al., "Robust Learning from Demonstration Using Leveraged Gaussian Processes and Sparse-Constrained Optimization," ICRA 2016
Leverage Optimization
[Figure: dataset with mixed qualities, with optimized leverages of 1, 0.9, 0.7, −1, and 0 assigned to individual demonstrations]
Leverage Optimization
The key intuition behind leverage optimization is to cast it as a model selection problem in Gaussian process regression.
However, the number of leverage parameters equals the number of training data.
To handle this issue, we present a sparse-constrained leverage optimization that assumes the majority of leverage parameters are +1.
The resulting optimization problem minimizes the negative log likelihood −L(⋅) with a sparsity constraint on 𝛾̃ = 𝛾 − 1.
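One way to picture the sparse constraint is a proximal-gradient loop that pulls most leverages toward +1. The soft-thresholding update, step sizes, and the placeholder likelihood gradient below are all assumptions for illustration, not the paper's actual solver:

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||v||_1: shrinks entries toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def optimize_leverage(grad_neg_log_lik, gamma0, lam=1.0, step=0.1, iters=200):
    # Hypothetical sketch: minimize -L(gamma) plus a sparsity penalty on
    # gamma_tilde = gamma - 1, so that most leverages stay at +1.
    gamma = np.asarray(gamma0, dtype=float).copy()
    for _ in range(iters):
        g_tilde = (gamma - 1.0) - step * grad_neg_log_lik(gamma)
        gamma = np.clip(soft_threshold(g_tilde, step * lam) + 1.0, -1.0, 1.0)
    return gamma
```

With a flat likelihood (zero gradient), the penalty alone drives every leverage to +1, matching the assumption that most demonstrations are good; the data term then moves only the few leverages the likelihood strongly disagrees with.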
Leveraged Neural Network
• We propose a leveraged deep neural network built on a leveraged cost function, obtained by interpreting the objective function of the leveraged Gaussian process via the representer theorem.
Representer theorem + empirical risk term (assume 𝜸𝒊 is either −1 or +1).
Leveraged Neural Network
[Diagram: the leveraged cost function combines a parameterized estimator, a regularizer, and a new target formed from the input, the original target, and the leverage.]
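A toy version of a leveraged cost for 𝛾 ∈ {−1, +1} can make the idea concrete: positive-leverage demonstrations are fitted, negative-leverage ones are repelled, with the repulsion bounded so the cost stays well defined. This specific functional form is an illustration only, not the paper's exact leveraged cost function:

```python
import numpy as np

def leveraged_cost(pred, target, gamma, weight_decay=1e-4, params_norm2=0.0):
    # Toy sketch (not the paper's exact form): with gamma in {-1, +1},
    # gamma = +1 demos contribute a squared error (attraction) and
    # gamma = -1 demos a bounded negated error (repulsion); a
    # weight-decay term stands in for the regularizer.
    err2 = (np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)) ** 2
    gamma = np.asarray(gamma, dtype=float)
    risk = np.sum(np.where(gamma > 0, err2, -np.minimum(err2, 1.0)))
    return risk + weight_decay * params_norm2
```

Minimizing this cost pushes predictions toward 𝛾 = +1 targets and away from 𝛾 = −1 targets, which is the behavior the leveraged cost function is designed to give a neural network.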
Experiment: Setting
Input: (𝑑_L^B, 𝑑_L^F, 𝑑_C^B, 𝑑_C^F, 𝑑_R^B, 𝑑_R^F, 𝑑_dev) ∈ ℝ⁷
Output: 𝜃_dev ∈ ℝ
*Two hidden layers (512 units each) with tanh activation functions.
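The policy network from the slide (7 distance/deviation features in, one deviation angle out, two 512-unit tanh hidden layers) can be sketched as a plain forward pass; the random weights below are illustrative stand-ins for trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Architecture from the slide: 7-D input, two 512-unit tanh hidden
# layers, scalar output (random initialization, untrained).
W1, b1 = rng.normal(0.0, 0.05, (7, 512)), np.zeros(512)
W2, b2 = rng.normal(0.0, 0.05, (512, 512)), np.zeros(512)
W3, b3 = rng.normal(0.0, 0.05, (512, 1)), np.zeros(1)

def policy(d):
    # d: the 7 input features (lane/center/road distances and deviation)
    h1 = np.tanh(d @ W1 + b1)
    h2 = np.tanh(h1 @ W2 + b2)
    return (h2 @ W3 + b3).item()  # predicted deviation angle
```

In the experiments this mapping is trained with the leveraged cost function instead of a plain squared error, which is what lets the same architecture absorb mixed-quality demonstrations.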
Experiment: Data Collection
Experiment: Leverage Optimization
[Figure: optimized leverages on the dataset with mixed qualities]
Safe:Inexp:Suicidal = 70:15:15
Experiment: Driving Results
[Videos: Gaussian process without optimization vs. leveraged Gaussian process (5,000 demos) vs. leveraged deep neural network (20,000 demos)]
Safe:Inexp:Suicidal = 70:15:15
Conclusion
 A robust and scalable learning from demonstration method is presented.
 Robustness comes from the leverage optimization [1].
 Scalability comes from the leveraged deep neural network using the proposed leveraged cost function.
 The proposed method is successfully applied to a track driving task where the demonstrations are collected from multiple modes with different proficiencies.
 Further work will focus on incorporating the uncertainty information of a model prediction using a Bayesian neural network, where initial results can be found in [2].
[1] Choi et al., "Robust Learning from Demonstration Using Leveraged Gaussian Processes and Sparse-Constrained Optimization," ICRA 2016
[2] Choi et al., "Uncertainty-Aware Learning from Demonstration Using Mixture Density Networks with Sampling-Free Variance Modeling," arXiv:1709.02249, 2017
Thank you for your attention.
Contact: sungjoon.choi@cpslab.snu.ac.kr
Uncertainty-Aware LfD
Choi et al., "Uncertainty-Aware Learning from Demonstration Using Mixture Density Networks with Sampling-Free Variance Modeling," arXiv:1709.02249, 2017
IROS 2017 Slides
