Scalable Robust LfD with Leveraged DNNs

Scalable Robust Learning from
Demonstration with Leveraged
Deep Neural Networks
Sungjoon Choi, Kyungjae Lee, and Songhwai Oh
Seoul National University

Contents
§ Leverage Optimization
§ Leveraged Gaussian Process
§ Leveraged Deep Neural Network
§ Conclusion
§ Learning from Demonstration
§ Experiment
§ Preliminaries

Learning from Demonstration
[Figure: Human Expert → Learning from Demonstration → Execute in Unseen Environments]

Existing Limitations
We want to incorporate demonstrations with
mixed qualities without labeling.
Most LfD methods assume the optimality of actions.

Existing Limitations
We want to incorporate large-scale demonstrations.
Some LfD methods are not scalable.

Leveraged Gaussian Process
Choi et al., "Leveraged Non-Stationary Gaussian Process Regression for Autonomous Robot Navigation," ICRA 2015
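For example: [Figure: four panels, with $\gamma = 1.0$, $\gamma = 0.8$, $\gamma = -0.5$, and $\gamma = 0$]
Gaussian process: $\pi$, $\xi$, with $j = 1, \dots, N$
Leveraged Gaussian process: $\pi_i$, $\xi_i$, $\gamma_i$, with $i = 1, \dots, M$ and $j = 1, \dots, N$,
where the correlation between $\pi_i$ and $\pi_j$ is defined by $\cos\left(\frac{\pi}{2}\,|\gamma_i - \gamma_j|\right)$ ($\gamma_i$ is between $-1$ and $+1$).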

Leveraged Gaussian Process
Positive definiteness can be shown with Bochner's theorem.
(An isotropic kernel function is positive semidefinite if and only if its Fourier transform is non-negative.)
[Equations: leveraged kernel function; leveraged Gaussian process]
However, Gaussian process regression does not scale well to large training sets: exact inference costs $O(N^3)$ in the number of training points $N$.
Choi et al., "Leveraged Non-Stationary Gaussian Process Regression for Autonomous Robot Navigation," ICRA 2015
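The leveraged kernel described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the squared-exponential base kernel, the 1-D inputs, and the names `se_kernel` and `leveraged_kernel` are assumptions made for the example. It builds a kernel matrix for random leverages in $[-1, +1]$ and checks numerically that its smallest eigenvalue is non-negative up to round-off.

```python
import numpy as np

def se_kernel(x1, x2, length_scale=1.0):
    """Squared-exponential base kernel on 1-D inputs (an assumed choice)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def leveraged_kernel(x1, g1, x2, g2, length_scale=1.0):
    """Base kernel scaled by cos(pi/2 * |gamma_i - gamma_j|)."""
    leverage_corr = np.cos(0.5 * np.pi * np.abs(g1[:, None] - g2[None, :]))
    return leverage_corr * se_kernel(x1, x2, length_scale)

# Random inputs and leverages (hypothetical data for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=30)
gamma = rng.uniform(-1.0, 1.0, size=30)

K = leveraged_kernel(x, gamma, x, gamma)
min_eig = np.linalg.eigvalsh(K).min()  # non-negative up to round-off

# Correlation at the same input for two leverage pairs.
same_x = np.zeros(1)
corr_pos_neg = leveraged_kernel(same_x, np.array([1.0]), same_x, np.array([-1.0]))[0, 0]
corr_pos_zero = leveraged_kernel(same_x, np.array([1.0]), same_x, np.array([0.0]))[0, 0]
```

At the same input, leverages $+1$ and $-1$ are perfectly anti-correlated ($\cos \pi = -1$), while leverages $+1$ and $0$ are uncorrelated ($\cos \frac{\pi}{2} = 0$).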

Leverage Optimization
[Figure: dataset with mixed qualities; panels labeled leverage: 1, leverage: 0.9, leverage: 0.7, leverage: -1, leverage: 0]
Choi et al., "Robust Learning from Demonstration Using Leveraged Gaussian Processes and Sparse-Constrained Optimization," ICRA 2016

The key intuition behind leverage optimization is to cast it as a model selection problem in Gaussian process regression.
However, the number of leverage parameters equals the number of training data.
To handle this issue, we present a sparse-constrained leverage optimization, which assumes that the majority of leverage parameters are +1.
The resulting optimization problem becomes:
[Equation: sparse-constrained leverage optimization]
where $-L(\cdot)$ is the negative log likelihood and $\bar{\gamma} = \gamma - 1$.
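One way to read the sparse-constrained formulation is as a Gaussian process negative log likelihood plus a penalty that pulls $\bar{\gamma} = \gamma - 1$ toward zero. The sketch below assumes an $\ell_1$ penalty with a hypothetical weight `lam`, a squared-exponential base kernel, and toy 1-D data; none of these are taken from the talk, and the exact objective in the cited ICRA 2016 paper may differ. It only evaluates the objective for two candidate leverage vectors and does not run an optimizer.

```python
import numpy as np

def leveraged_kernel(x, gamma, length_scale=1.0):
    """Squared-exponential kernel scaled by cos(pi/2 * |gamma_i - gamma_j|)."""
    diff = x[:, None] - x[None, :]
    base = np.exp(-0.5 * (diff / length_scale) ** 2)
    return np.cos(0.5 * np.pi * np.abs(gamma[:, None] - gamma[None, :])) * base

def neg_log_likelihood(x, y, gamma, noise_var=1e-2):
    """GP negative log marginal likelihood under the leveraged kernel."""
    n = x.size
    K = leveraged_kernel(x, gamma) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * n * np.log(2.0 * np.pi)

def sparse_objective(x, y, gamma, lam=1.0):
    """Assumed form: -L(gamma) + lam * ||gamma - 1||_1."""
    gamma_bar = gamma - 1.0
    return neg_log_likelihood(x, y, gamma) + lam * np.abs(gamma_bar).sum()

# Toy data: 20 consistent demonstrations and 5 that do the opposite.
x_good = np.linspace(0.0, 2.0 * np.pi, 20)
x_bad = np.array([1.0, 1.5, 2.0, 4.5, 5.0])
x = np.concatenate([x_good, x_bad])
y = np.concatenate([np.sin(x_good), -np.sin(x_bad)])

gamma_all_pos = np.ones(x.size)                           # trust everything
gamma_mixed = np.concatenate([np.ones(20), -np.ones(5)])  # flag the 5 bad demos

obj_all_pos = sparse_objective(x, y, gamma_all_pos)
obj_mixed = sparse_objective(x, y, gamma_mixed)
```

On this toy set the mixed leverage vector gives the lower objective: the likelihood gain from explaining the conflicting demonstrations outweighs the sparsity penalty on the five leverages that differ from +1.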

Leveraged Neural Network
• We propose a leveraged deep neural network, trained with a leveraged cost function obtained by interpreting the objective function of leveraged Gaussian processes through the representer theorem.
(Assume $\gamma_i$ is either $-1$ or $+1$.)
[Equation: derivation using the representer theorem, with the empirical risk term labeled]

Leveraged Neural Network
[Equation: leveraged cost function, with terms labeled input, new target, leverage, parameterized estimator, and regularizer]
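When every leverage is $-1$ or $+1$, $\cos\left(\frac{\pi}{2}|\gamma_i - \gamma_j|\right) = \gamma_i \gamma_j$, so a leveraged Gaussian process queried at leverage $+1$ gives the same mean as a standard Gaussian process fitted to the sign-adjusted targets $\gamma_i y_i$. The sketch below checks this identity numerically and then defines a squared-error cost on those targets with an L2 regularizer. The cost `sign_flipped_cost`, its `weight_decay` parameter, and the squared-exponential kernel are assumptions made for illustration, and this is not necessarily the leveraged cost function proposed in the talk.

```python
import numpy as np

def se_kernel(a, b, length_scale=1.0):
    """Squared-exponential base kernel on 1-D inputs (an assumed choice)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

# Hypothetical toy data with binary leverages.
rng = np.random.default_rng(0)
n = 15
x = rng.uniform(-3.0, 3.0, size=n)
y = rng.normal(size=n)
gamma = rng.choice([-1.0, 1.0], size=n)
x_test = np.linspace(-3.0, 3.0, 7)
noise_var = 1e-2

# Leveraged GP mean, predicted for a test leverage of +1.
lev_train = np.cos(0.5 * np.pi * np.abs(gamma[:, None] - gamma[None, :]))
lev_test = np.cos(0.5 * np.pi * np.abs(1.0 - gamma))[None, :]
K_lev = lev_train * se_kernel(x, x) + noise_var * np.eye(n)
mean_leveraged = (lev_test * se_kernel(x_test, x)) @ np.linalg.solve(K_lev, y)

# Standard GP mean fitted to the sign-adjusted targets gamma_i * y_i.
K_plain = se_kernel(x, x) + noise_var * np.eye(n)
mean_flipped = se_kernel(x_test, x) @ np.linalg.solve(K_plain, gamma * y)

def sign_flipped_cost(pred, y, gamma, params, weight_decay=1e-4):
    """Mean squared error against gamma_i * y_i plus an L2 penalty on params."""
    risk = np.mean((gamma * y - pred) ** 2)
    reg = weight_decay * sum(np.sum(p ** 2) for p in params)
    return risk + reg
```

Because such a cost depends only on per-sample predictions, it can be minimized with mini-batch gradient methods, which is what makes a parameterized estimator attractive for large demonstration sets.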

Experiment-setting
Input: $d_A^B, d_A^C, d_D^B, d_D^C, d_E^B, d_E^C, d_{FGH} \in \mathbb{R}^7$
Output: $\theta_{FGH} \in \mathbb{R}$
*Two hidden layers (512 units) with tanh activation functions.
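A forward pass through a network of this shape can be sketched in NumPy as follows. The sketch assumes a 7-dimensional input (one entry per input listed above), 512 units in each of the two hidden layers, a linear scalar output, and a simple scaled-Gaussian initialization; the helper names `init_mlp` and `forward` are hypothetical, and training is omitted.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Create (weight, bias) pairs for consecutive layer sizes."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((W, b))
    return params

def forward(params, x):
    """tanh on the hidden layers, linear scalar output."""
    h = x
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return (h @ W + b)[:, 0]

rng = np.random.default_rng(0)
params = init_mlp(rng, [7, 512, 512, 1])  # assumed layer sizes
batch = rng.normal(size=(4, 7))           # four hypothetical input vectors
theta = forward(params, batch)            # one scalar output per input
```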

Experiment-data collection

Experiment-leverage optimization
[Figure: dataset with mixed qualities]
Safe:Inexp:Suicidal = 70:15:15

Experiment-driving results
[Figure: driving results for a Gaussian process without optimization, a leveraged Gaussian process (5,000 demos), and a leveraged deep neural network (20,000 demos)]
Safe:Inexp:Suicidal = 70:15:15

Conclusion
§ Robust and scalable learning from demonstration is presented.
§ Robustness comes from the leverage optimization [1].
§ Scalability comes from the leveraged deep neural network using
the proposed leveraged cost function.
§ The proposed method is successfully applied to a track driving
task where the demonstrations are collected from multiple modes
with different proficiencies.
§ Future work will focus on incorporating the uncertainty of model predictions using a Bayesian network; initial results can be found in [2].
[1] Choi et al., "Robust Learning from Demonstration Using Leveraged Gaussian Processes and Sparse-Constrained Optimization," ICRA 2016
[2] Choi et al., "Uncertainty-Aware Learning from Demonstration using Mixture Density Networks with Sampling-Free Variance Modeling," arXiv:1709.02249, 2017

Thank you for your attention.
Contact information (sungjoon.choi@cpslab.snu.ac.kr)

Uncertainty-Aware LfD
Choi et al., "Uncertainty-Aware Learning from Demonstration using Mixture Density Networks with Sampling-Free Variance Modeling," arXiv:1709.02249, 2017
