Deep Learning - a Path from Big Data Indexing to Robotic Applications – Darius Burschka
These are the slides to my ShanghAI lecture from Dec 10, 2020. They propose the extensions necessary to make DeepNets appropriate tools for robotic systems.
The talk can be found on https://fb.watch/2hXDC6K4Pq/
Robust and Efficient Coupling of Perception to Actuation with Metric and Non-... – Darius Burschka
The talk motivates a rethinking of how perception passes information to the control modules. Metric space is not the native space of the camera, and apparently it is not used in biology for navigation either. Early abstraction of information from images loses a lot of important information that can be used directly for following (visual servoing), motion estimation (motion blur), and collision relations (optical-flow clustering). In this talk I present ways to use the image information in the "classical" way, which requires no learning and runs on low-power CPUs.
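The optical-flow-clustering idea mentioned above can be sketched in a few lines: pixels whose flow vectors agree are treated as one motion group, and a region whose flow deviates from the dominant (ego-motion) group is a candidate independently moving object. This is only a toy numpy illustration of the concept, not the method from the talk; the function name and the median-as-ego-motion shortcut are my own assumptions.

```python
import numpy as np

def cluster_flow(flow, ego_tol=1.0):
    """Split flow vectors into 'ego-motion' vs 'independent motion'.

    flow: (H, W, 2) array of per-pixel displacement vectors.
    The dominant motion (median flow) is taken as a crude estimate of
    the camera's ego-motion; pixels deviating by more than ego_tol are
    flagged as candidate moving objects. A real flow-clustering scheme
    would group vectors properly; the median is a stand-in here.
    """
    ego = np.median(flow.reshape(-1, 2), axis=0)    # dominant motion
    residual = np.linalg.norm(flow - ego, axis=2)   # per-pixel deviation
    return residual > ego_tol                       # moving-object mask

# Synthetic example: uniform background flow plus one moving patch.
flow = np.full((40, 40, 2), 0.5)
flow[10:20, 10:20] = [4.0, 0.0]   # independently moving region
mask = cluster_flow(flow)
```

Because it is a fixed-size median and a threshold, this kind of grouping runs comfortably on a low-power CPU, which is the point the talk makes about classical image-space processing.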
Visual Mapping and Collision Avoidance in Dynamic Environments – Darius Burschka
Conventional vision is more appropriate for control because it also provides error analysis. A lot of information in the images is lost when converting to 3D.
Deconstructing SfM-Net architecture and beyond
"SfM-Net, a geometry-aware neural network for motion estimation in videos that decomposes frame-to-frame pixel motion in terms of scene and object depth, camera motion and 3D object rotations and translations. Given a sequence of frames, SfM-Net predicts depth, segmentation, camera and rigid object motions, converts those into a dense frame-to-frame motion field (optical flow), differentiably warps frames in time to match pixels and back-propagates."
Alternative download:
https://www.dropbox.com/s/aezl7ro8sy2xq7j/sfm_net_v2.pdf?dl=0
Event recognition image & video segmentation – eSAT Journals
Abstract: This paper takes a close look at the segmentation process at a basic level. Segmentation is performed at multiple levels, so different results are obtained. Segmenting relative-motion descriptors gives a clear picture of the segmentation performed on a given input video; relative-motion computation and histogram incrementation are used to evaluate this approach. We also give complete information about related research on how segmentation can be done for both images and videos. Keywords: Image Segmentation, Video Segmentation.
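The "relative motion computation and histogram incrementation" mentioned in the abstract can be illustrated with a toy example: compute pairwise relative-motion magnitudes between tracked points and accumulate them into a histogram, where a peak in the low bins indicates points moving coherently (likely the same segment). This is my own minimal reading of the idea; the paper's actual descriptors are not reproduced here.

```python
import numpy as np

def relative_motion_histogram(velocities, bins=4, vmax=8.0):
    """Accumulate pairwise relative-motion magnitudes into a histogram.

    velocities: (N, 2) point velocities from a tracked video frame.
    Points on the same rigid object have near-zero relative motion,
    so mass in the lowest bins suggests coherent segments.
    """
    diffs = velocities[:, None, :] - velocities[None, :, :]
    mags = np.linalg.norm(diffs, axis=-1)
    mags = mags[np.triu_indices(len(velocities), k=1)]  # unique pairs only
    hist, _ = np.histogram(mags, bins=bins, range=(0.0, vmax))
    return hist

# Two points moving together, one moving differently.
v = np.array([[1.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
hist = relative_motion_histogram(v)
```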
Inclined Image Recognition for Aerial Mapping using Deep Learning and Tree ba... – TELKOMNIKA JOURNAL
One of the important capabilities of an unmanned aerial vehicle (UAV) is aerial mapping. Aerial mapping is an image registration problem, i.e., the problem of transforming different sets of images into one coordinate system. In image registration, the quality of the output is strongly influenced by the quality of the input (i.e., images captured by the UAV). Therefore, selecting high-quality input images becomes important, and it is one of the challenging tasks in aerial mapping because the ground truth in the mapping process is not given before the UAV flies. Typically, a UAV takes images in sequence irrespective of its flight orientation and roll angle. This may result in the acquisition of poor-quality images, possibly compromising the quality of the mapping results and increasing the computational cost of the registration process. To address these issues, we need a recognition system that is able to recognize images that are not suitable for the registration process. In this paper, we define these unsuitable images as “inclined images,” i.e., images captured by the UAV when it is not perpendicular to the ground. Although we can calculate the inclination angle using a gyroscope attached to the UAV, our interest here is to recognize these inclined images without the use of additional sensors, in order to mimic how humans perform this task visually. To realize this, we utilize a deep learning method in combination with tree-based models to build an inclined image recognition system. We have validated the proposed system with images captured by the UAV. We collected 192 images and labelled them with two different levels of classes (i.e., coarse- and fine-classification). We compared this with several models, and the results showed that our proposed system yielded an accuracy improvement of up to 3%.
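The two-stage idea in the abstract (learned image features fed to tree-based models) can be caricatured with an ensemble of decision stumps voting on a feature vector. Everything here is a hypothetical stand-in: the feature vector, the stump thresholds, and the function name are invented for illustration, and the paper's actual CNN features and tree models are not reproduced.

```python
import numpy as np

def stump_ensemble_predict(features, stumps):
    """Majority vote of decision stumps over a feature vector.

    stumps: list of (feature_index, threshold, label_if_above) tuples,
    each stump voting 'label_if_above' when its feature exceeds its
    threshold and the opposite label otherwise. A toy stand-in for the
    tree-based stage that decides 'inclined' (1) vs 'not inclined' (0).
    """
    votes = [lab if features[i] > t else 1 - lab for i, t, lab in stumps]
    return int(np.round(np.mean(votes)))

# Hypothetical 3-dim feature vector (e.g. pooled CNN activations).
stumps = [(0, 0.5, 1), (1, 0.2, 1), (2, 0.9, 0)]
pred = stump_ensemble_predict(np.array([0.8, 0.1, 0.3]), stumps)
```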
Simultaneous Mapping and Navigation For Rendezvous in Space Applications – Nandakishor Jahagirdar
To design and develop an image-processing algorithm that can identify the target spacecraft's docking station, as well as the distance, location, and angle of the docking station with respect to the chaser vehicle, using the image from a single camera.
V. Caglioti, A. Giusti, A. Riva, M. Uberti: "Drawing Motion without Understanding It".
Proc. of International Symposium on Visual Computing (ISVC) 2009. Oral presentation (acceptance rate ~30%). Volt / Microsoft MSDN Best Paper Award.
Image restoration techniques covered include denoising, deblurring, and super-resolution for 3D images and models.
From classical computer vision techniques to contemporary deep-learning-based processing for both ordered and unordered point clouds, depth maps, and meshes.
Falling costs with rising quality via hardware innovations and deep learning.
A technical introduction to scanning technologies, from Structure-from-Motion (SfM) and range sensing (e.g., Kinect and Matterport) to laser scanning (e.g., LiDAR), and the associated traditional and deep-learning-based processing techniques.
Note: due to the small font size and poor rendering by SlideShare, it is better to download the slides to your device.
Alternative download link for the PDF:
https://www.dropbox.com/s/eclyy45k3gz66ve/proptech_emergingScanningTech.pdf?dl=0
Real Time Object Identification for Intelligent Video Surveillance Applications – Editor IJCATR
Intelligent video surveillance has emerged as a very important research topic in computer vision in recent years. It is well suited to a broad range of applications, such as monitoring activities at traffic intersections to detect congestion and predict traffic flow. Object classification is a key component of smart surveillance software. This paper proposes two robust methodologies and algorithms for people and object classification in automated surveillance systems. The first method uses a background-subtraction model to detect object motion: background subtraction and image segmentation based on morphological transformations are proposed for tracking and object classification on highways. The algorithm applies erosion followed by dilation on successive frames and segments the image while preserving important edges, which improves the adaptive background mixture model and makes the system learn faster and more accurately. The second method adopts object detection without background subtraction, because the objects to be detected are static. Segmentation is done with a bounding-box registration technique, and classification with a multiclass SVM using edge histograms as features; the edge histograms are computed for various bin values in different environments. The results obtained demonstrate the effectiveness of the proposed approach.
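The first method's core loop (difference against the background, then erosion followed by dilation, i.e. a morphological opening) can be sketched with plain numpy. This is a minimal illustration of the technique the abstract names, not the paper's algorithm: a single previous frame stands in for the adaptive background model, and the 3x3 structuring element is an assumption.

```python
import numpy as np

def detect_motion(prev, curr, thresh=20):
    """Frame differencing followed by 3x3 erosion then dilation.

    Background subtraction yields a binary motion mask; erosion
    removes isolated speckle noise and dilation restores the size
    of the blobs that survive.
    """
    mask = np.abs(curr.astype(int) - prev.astype(int)) > thresh

    def erode(m):   # pixel survives only if its whole 3x3 patch is set
        p = np.pad(m, 1)
        out = np.ones_like(m)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out &= p[1 + dy : 1 + dy + m.shape[0],
                         1 + dx : 1 + dx + m.shape[1]]
        return out

    def dilate(m):  # pixel set if anything in its 3x3 patch is set
        p = np.pad(m, 1)
        out = np.zeros_like(m)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out |= p[1 + dy : 1 + dy + m.shape[0],
                         1 + dx : 1 + dx + m.shape[1]]
        return out

    return dilate(erode(mask))

prev = np.zeros((20, 20), dtype=np.uint8)
curr = prev.copy()
curr[5:12, 5:12] = 100   # a moving object
curr[0, 0] = 100         # a single noise pixel
mask = detect_motion(prev, curr)
```

The noise pixel is wiped out by the erosion while the 7x7 object comes back at (almost) full size after the dilation, which is exactly why opening is the standard cleanup step here.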
Serene 2015
Davide Scaramuzza
Abstract: With drones becoming more and more popular, safety is a big concern. A critical situation occurs when a drone temporarily loses its GPS position information, which might lead it to crash. This can happen, for instance, when flying close to buildings where GPS signal is lost. In such situations, it is desirable that the drone can rely on fall-back systems and regain stable flight as soon as possible. In this talk, I will present novel methods to automatically recover and stabilize a quadrotor from any initial condition or execute emergency landing. On the one hand, this new technology will allow quadrotors to be launched by simply tossing them in the air, like a “baseball ball”. On the other hand, it will allow them to recover back into stable flight or land on a safe area after a system failure. Since this technology does not rely on any external infrastructure, such as GPS, it enables the safe use of drones in both indoor and outdoor environments. Thus, it can become relevant for commercial use of drones, such as parcel delivery.
Recent videos:
Automatic failure recovery without GPS: https://youtu.be/pGU1s6Y55JI
Autonomous Landing-site detection and landing: https://youtu.be/phaBKFwfcJ4
IEEE CASE 2016: On Avoiding Moving Objects for Indoor Autonomous Quadrotors – Peter SHIN
Abstract: A mini quadrotor can be used in many applications, such as indoor airborne surveillance, payload delivery, and warehouse monitoring. In these applications, vision-based autonomous navigation is one of the most interesting research topics because precise navigation can be implemented based on vision analysis. However, pixel-based vision analysis approaches require a high-powered computer, which is inappropriate to be attached to a small indoor quadrotor. This paper proposes a method called Motion-vector-based Moving Objects Detection. This method detects and avoids obstacles using stereo motion vectors instead of individual pixels, thereby substantially reducing the data processing requirement. Although this method can also be used in the avoidance of stationary obstacles by taking into account the ego-motion of the quadrotor, this paper primarily focuses on providing our empirical verification on the real-time avoidance of moving objects.
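The key saving in the abstract above is working on a few hundred block motion vectors instead of every pixel. A minimal sketch of that idea: flag the macroblocks whose motion vector disagrees with the vector expected from the quadrotor's own motion. The function name and the assumption that the ego-motion vector is known (e.g. from odometry) are mine, not the paper's.

```python
import numpy as np

def moving_blocks(mvs, ego_mv, tol=2.0):
    """Flag macroblocks whose motion vector disagrees with ego-motion.

    mvs: (N, 2) per-block motion vectors (as a video codec or block
    matcher would supply them); ego_mv: the vector the quadrotor's own
    motion should induce, assumed known here. Comparing N block vectors
    instead of H*W pixels is what makes this cheap enough for a small
    onboard computer.
    """
    return np.linalg.norm(mvs - ego_mv, axis=1) > tol

mvs = np.array([[1.0, 0.0], [1.2, -0.1], [6.0, 3.0]])
flags = moving_blocks(mvs, ego_mv=np.array([1.0, 0.0]))
```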
Real Time Detection of Moving Object Based on FPGA – iosrjce
Hand gesture recognition using support vector machine – theijes
Real-time Moving Object Detection using SURF – iosrjce
Similar to Semantic Perception for Telemanipulation at SPME Workshop at ICRA 2013 (20)
3. Darius Burschka – MVP Group at TUM, http://mvp.visual-navigation.com – SPME Workshop, May 5, 2013
ASCENT – Augmented Shared-Control for Efficient Natural Telemanipulation
(ICRA 2013, J. Bohren et al., Teleoperation session WeF6, 5:45 pm, Clubraum)
Fig. 1: The experiments were conducted with a human operator at The Johns Hopkins University (JHU) Homewood Campus in Baltimore, MD, USA, utilizing a da Vinci® Master Console (left) commanding a DLR LWR as part of the SAPHARI platform at the German Aerospace Center (DLR) in Oberpfaffenhofen, Germany (right).
• Many remote telerobotic applications have limitations on bandwidth, creating a situation where the fidelity of the imaging is compromised. The availability of stereoscopic imaging, image resolution, and frame rates may be limited, leading to a limited ability to resolve necessary detail for manipulation. This is particularly challenging given that the absence of haptic cues noted above increases the reliance on visual perception.
• Some environments impose additional communication latency (time delay) on telemetry as well. For example, telemanipulation from Earth to low-earth orbit typically imposes delays that exceed half a second for direct line-of-sight communications and 2-7 seconds when using larger-coverage on-orbit communications networks.
The limitations of human performance in telemanipulation […] constrained circumstances. ASCENT takes a collaborative systems approach that transcends the limitations of either purely autonomous or purely teleoperated control modes by combining task-specific sensor-based feedback with input from an operator. As a result, the operator is able to provide gross motion guidance to the system, and the remote manipulator is able to adapt that motion based on environmental information. We have implemented this approach with a DLR lightweight arm driven by a da Vinci® S master console separated by over 4000 miles. We demonstrate that ASCENT greatly improves manipulation performance, particularly when subtle motions are necessary in order to correctly perform the task.
II. BACKGROUND
Problems:
• Depth perception is essential for grasping.
• Limited bandwidth does not always allow remote image transmission.
• Significant latency in transmission deteriorates dexterity of the control.
• Moving objects in the scene limit the allowed latency in the control for robust direct manipulation in remote environments.
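The shared-control idea described above (operator supplies gross motion, the remote side adapts it from sensing) can be illustrated as a weighted blend of two velocity commands. This is only a toy illustration of the concept; the function name, the fixed weight alpha, and the linear blend are my assumptions, not the ASCENT control law.

```python
import numpy as np

def shared_control_blend(op_cmd, correction, alpha=0.6):
    """Blend the operator's gross-motion command with a sensor-based
    correction.

    op_cmd: velocity commanded by the human operator;
    correction: velocity suggested by remote sensing (e.g. to stay
    aligned with a grasp target despite latency). alpha weights
    operator authority against autonomous adaptation.
    """
    return alpha * np.asarray(op_cmd) + (1 - alpha) * np.asarray(correction)

# Operator pushes forward; sensing nudges sideways toward the target.
v = shared_control_blend([1.0, 0.0, 0.0], [0.0, 0.5, 0.0])
```

In a latency-tolerant setup like the one described, the correction term is computed at the remote site, so it can react at sensor rate even when the operator's commands arrive seconds late.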
14. Hybrid Model of the Environment (J. C. Ramirez)
[Block diagram: the sensor input data stream feeds two branches, 3D reconstruction & plane detection and blob detection; a fusion stage combines the resulting 3D data and blobs into a map with a geometric layer and an object layer; a map-update system emits the output data stream of objects and 3D structure.]
15. World model saves additional info, like texture, motion, etc. (VISAPP 2013, J. Ramirez et al.)
3D Mapping for 3D Structures in Dynamic Environments
Juan Carlos Ramirez and Darius Burschka
Faculty for Informatics, Technische Universitaet Muenchen, Boltzmannstr. 3, Garching bei Muenchen, Germany
ramirezd@in.tum.de, burschka@cs.tum.edu
INTRODUCTION
[Figure panels: Scene | Tentative object candidates | Encapsulated 3D blobs | Motion estimation]
An approach to consistently model and characterize potential object candidates presented in non-static scenes. Three principal procedures support our method:
i) the segmentation of the captured range images into 3D clusters or blobs, by which we obtain a first gross idea of the spatial structure of the scene,
ii) the maintenance and reliability of the map, which are obtained through the fusion of the captured and mapped data, to which we assign a degree of existence (confidence value),
iii) the visual motion estimation of potential object candidates, which, through the combination of the texture and 3D information, allows not only to update the state of the actors and perceive their changes in a scene, but also to … and refine their individual 3D structures over time.
17. Local Feature Tracking Algorithms
• Image-gradient based → Extended KLT (ExtKLT)
  • patch-based implementation
  • feature propagation
  • corner-binding
  + sub-pixel accuracy
  − algorithm scales badly with the number of features
• Tracking-by-Matching → AGAST tracker
  • AGAST corner detector
  • efficient descriptor
  • high frame rates (hundreds of features in a few milliseconds)
  + algorithm scales well with the number of features
  − pixel accuracy only
18. Adaptive and Generic Accelerated Segment Test (AGAST)
Improvements compared to FAST:
• full exploration of the configuration space by backward induction (no learning)
• binary decision tree (not ternary)
• computation of the actual probability and processing costs (no greedy algorithm)
• automatic scene adaptation by tree switching (at no cost)
• various corner pattern sizes (not just one)
No drawbacks!
Mair, Hager, Burschka, Suppa, Hirzinger, ECCV, Springer, 2010 (FAST: E. Rosten)
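The accelerated segment test underlying FAST and AGAST classifies a pixel as a corner when a long enough contiguous arc on a radius-3 circle is uniformly brighter or darker than the center. A brute-force sketch (the decision-tree speedups that AGAST contributes are deliberately omitted; threshold `t` and arc length `n` are typical values, not fixed by the paper):

```python
# Offsets of the 16-pixel Bresenham circle of radius 3 used by FAST/AGAST,
# listed in ring order.
CIRCLE = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
          (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def segment_test(img, x, y, t=20, n=9):
    """Corner iff >= n contiguous circle pixels are all brighter than
    center + t or all darker than center - t (img is row-major: img[y][x])."""
    c = img[y][x]
    labels = [1 if img[y + dy][x + dx] > c + t
              else (-1 if img[y + dy][x + dx] < c - t else 0)
              for dx, dy in CIRCLE]
    labels = labels + labels          # handle arcs wrapping around the ring
    best = cur = 0
    prev = 0
    for v in labels:                  # longest run of equal non-zero labels
        cur = cur + 1 if (v != 0 and v == prev) else (1 if v != 0 else 0)
        best = max(best, cur)
        prev = v
    return best >= n

# A flat patch with a bright 12-pixel arc around the center is a corner.
img = [[100] * 7 for _ in range(7)]
for dx, dy in CIRCLE[:12]:
    img[3 + dy][3 + dx] = 200
```

The decision trees in FAST (learned) and AGAST (built by backward induction over the full configuration space) exist only to reach the same yes/no answer while examining as few of these 16 pixels as possible.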
23. Knowledge Representation
Each tool used in the procedure has its own container describing its shape, handling properties, etc.
A functionality map for a specific procedure describes how the tool was used while being moved between points in the world (Petsch/Burschka, IROS 2011).
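A hypothetical sketch of such a per-tool container, pairing shape and handling properties with a functionality map of how the tool moved between world points; every field name here is an assumption chosen for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class ToolContainer:
    """Container describing one tool used in a procedure."""
    name: str
    shape_model: str                     # e.g. a path to a 3D mesh
    handling: dict                       # grasp points, mass, ...
    # Functionality map: how the tool was used while moved between
    # points in the world (waypoint label -> 3D position).
    functionality_map: dict = field(default_factory=dict)

    def record_use(self, waypoint, position):
        self.functionality_map[waypoint] = position

scalpel = ToolContainer("scalpel", "scalpel.stl", {"mass_g": 30})
scalpel.record_use("incision_start", (0.10, 0.02, 0.00))
```

Keeping the geometry and the usage history in one container is what lets a later procedure query not just what a tool looks like but how it was previously handled.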
26. Knowledge Representation
Atlas:
– Long-term memory
– Experience of the system
Working memory:
– Short-term memory
– Experience grounded in a given environment
– Temporal handling information
27. Conclusions
Fig. 1: The experiments were conducted with a human operator at The Johns Hopkins University (JHU) Homewood Campus in Baltimore, MD, USA, utilizing a da Vinci® Master Console (left) commanding a DLR LWR as part of the SAPHARI platform at the German Aerospace Center (DLR) in Oberpfaffenhofen, Germany (right).
Why is perception necessary:
• Allows data reduction over slow links; in the worst case, only symbolic information about the objects in the scene is transmitted
• Together with motion estimation, allows a transparent switch between direct control and autonomous handling
• Allows dealing with high latencies and fast motions in the scene
… Questions?
28. Research of the MVP Group (http://mvp.visual-navigation.com)
The Machine Vision and Perception Group @TUM works on aspects of visual perception and control in medical, mobile, and HCI applications:
• Visual navigation
• Biologically motivated perception
• Perception for manipulation
• Visual action analysis
• Photogrammetric monocular reconstruction
• Rigid and deformable registration
29. Research of the MVP Group (http://mvp.visual-navigation.com)
• Exploration of physical object properties
• Sensor substitution
• Multimodal sensor fusion
• Development of new optical sensors