SitLog
SitLog** is a programming language and environment for the specification and interpretation of service robots' tasks. All the RoboCup@Home tests in Golem are programmed in SitLog. The computational mechanism consists of two interpreters working in tandem: one interprets the task's structure, which is represented through a Functional Recursive Transition Network (F-RTN), and the other interprets the content and control information associated with the nodes of the F-RTN, which stand for task situations. Together, the two interpreters allow applications to be defined at a highly abstract level, in a declarative and compact form. DMs are represented through this formalism. There are two main kinds of DMs: those representing the structure of a task (e.g., RoboCup@Home's tests) and those representing task-independent generic behaviors (see, grasp, follow, find, etc.).
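As a rough illustration of this computational mechanism (not SitLog's actual syntax, which is Prolog-based), the following hypothetical Python sketch interprets a task as a transition network of situations, each pairing expectations with actions and a next situation; the real system additionally handles recursion and the functional content attached to nodes.

```python
def run_task(model, situation, context):
    """Hypothetical sketch only, not SitLog code: interpret a task given as a
    transition network. `model` maps a situation id to a list of
    (expectation, action, next_situation) triples."""
    while situation not in ("final", "error"):
        for expectation, action, next_situation in model[situation]:
            if expectation(context):          # does the current input meet this expectation?
                action(context)               # perform the associated action
                situation = next_situation    # move to the next situation
                break
        else:
            situation = "error"               # no expectation was met
    return situation
```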
(Figure: a Dialogue Model and its Hierarchy of Behaviors.)
Golem-II+ is the latest service robot developed by the Golem Group. We design and build domain-independent service robots. Our developments are based on a theory of Human-Robot Communication centered on the specification of protocols that represent the structure of service robots' tasks, called Dialogue Models (DMs).
GOLEM-II+
Computer Science Department
Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas
Universidad Nacional Autónoma de México
The Golem Group: Luis Pineda (team leader), Ivan Meza, Caleb Rascón,
Gibran Fuentes, Mario Peña, Lisset Salinas, Arturo Rodríguez, Mauricio Reyes,
Hernando Ortega, Joel Durán, Varinia Estrada
http://golem.iimas.unam.mx
golem@turing.iimas.unam.mx
INTERACTION-ORIENTED COGNITIVE ARCHITECTURE
The main communication cycle involves perceptual interpretation, DMs
and intentional action, and subsumes autonomous systems, which deal
with reactive behavior.
(Figure: the IOCA architecture, connecting the Environment through Recognition and Rendering to the Perceptual Interpreter, the Coordinator, the SitLog Dialogue Models, Action Specification, and the Autonomous Reactive Systems.)
Within the present framework we have developed IOCA. The architecture has three main layers (a sketch of one pass through them follows the list):
- TOP: Expectation / Action-Selection
- MIDDLE: Interpretation / Action-Specification
- BOTTOM: Recognition / Rendering
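As a hedged illustration of how these layers might interact in one pass of the communication cycle, the following Python sketch uses invented module names and interfaces; they are assumptions for this sketch, not the Golem code base.

```python
def communication_cycle(recognizer, interpreter, dialogue_models, specifier, renderer):
    """Illustrative loop through IOCA's three layers; all interfaces here are
    assumptions made for this sketch, not actual Golem modules."""
    while True:
        stimulus = recognizer.recognize()              # BOTTOM: recognition
        meaning = interpreter.interpret(stimulus)      # MIDDLE: interpretation
        intention = dialogue_models.select(meaning)    # TOP: expectation / action-selection
        action = specifier.specify(intention)          # MIDDLE: action-specification
        renderer.render(action)                        # BOTTOM: rendering
```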
VISUAL OBJECT RECOGNITION AND GRASPING
The Golem Project uses the Multiple Object Pose Estimation and Detection (MOPED) algorithm and framework*. Golem is equipped with two in-house developed arms with 4 degrees of freedom. The vision algorithm provides the parameters h, a and b, representing the distance, height and depth between the robot's eye and the object; a triangle composed of b and the two segments of the arm (l1 and l2) is then defined. The elbow's position is located at the intersection of the two circles (c1 and c2) and is computed through diagrammatic reasoning, as illustrated below, determining the angles α and β at the shoulder and the wrist. This process corresponds to Golem's gross grasping plan. Once the arm approaches the object, it searches for the object reactively through a mechanism involving three infra-red sensors that adjust the target's position. This strategy compensates dynamically for vision and mechanical errors.
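The geometry can be made concrete with the law of cosines. Below is a minimal Python sketch, assuming a planar triangle with sides b, l1 and l2, circle c1 centered at the shoulder with radius l1, and circle c2 centered at the target with radius l2; the function names are illustrative, not Golem's actual code.

```python
import math

def grasp_angles(l1, l2, b):
    """Illustrative planar sketch (not Golem's code): given the arm segments
    l1, l2 and the triangle side b, return the interior angles alpha and beta
    via the law of cosines."""
    if b > l1 + l2 or b < abs(l1 - l2):
        raise ValueError("target out of reach for the two-link arm")
    alpha = math.acos((l1**2 + b**2 - l2**2) / (2 * l1 * b))  # angle at the shoulder
    beta = math.acos((l2**2 + b**2 - l1**2) / (2 * l2 * b))   # angle at the wrist end
    return alpha, beta

def elbow_position(shoulder, target, l1, l2):
    """Elbow as one intersection of circles c1 (shoulder, radius l1) and
    c2 (target, radius l2); the 'elbow-up' solution is returned."""
    dx, dy = target[0] - shoulder[0], target[1] - shoulder[1]
    b = math.hypot(dx, dy)                 # distance shoulder -> target
    alpha, _ = grasp_angles(l1, l2, b)
    base = math.atan2(dy, dx)              # direction from shoulder to target
    return (shoulder[0] + l1 * math.cos(base + alpha),
            shoulder[1] + l1 * math.sin(base + alpha))
```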
(Figure: MOPED's pose estimation.)
HARDWARE
- PeopleBot robotic base
- Dell Precision M4600
- LAIDETEC-IIMAS robotic arms x2
- QuickCam Pro 9000 Webcam
- Microsoft Kinect Camera
- Hokuyo UTM-30LX Laser
- Shure Base omnidirectional microphones x3
- RODE VideoMic directional microphone
- M-Audio Fast Track Ultra audio external interface
- Infinity 3.5-Inch Two-Way speakers x2
SOFTWARE
MODULE: SOFTWARE LIBRARIES
- Navigation: Player/Stage and Gearbox
- Vision: OpenCV, OpenNI, MOPED
- Voice Recognition and Audio Processing: PocketSphinx, JACK
- Voice Synthesizer: Festival TTS
- Language Interpretation: WordSpotting, GF grammar
- Object Manipulation: Roboplus
- SitLog: Sicstus Prolog
(Figure: F-RTN of the generic find behavior, with recursive search, scan, neutral and final situations, transitions labelled found, not_found, fs_found, fs_error, move_success and move_error, and conditions Pos = [ ] and Pos ≠ [ ].)
(Figure: gross grasping plan through diagrammatic reasoning, showing the parameters a, b and h, the arm segments l1 and l2, the angles α and β, and the circles c1 and c2; reactive behaviour provides dynamic local adjustment.)
* Collet, Alvaro and Martinez, Manuel and Srinivasa, Siddhartha S. "The MOPED framework: Object Recognition and Pose
Estimation for Manipulation". In The International Journal of Robotics Research. April, 2011.
** Pineda, Luis and Salinas, Lisset and Meza, Ivan and Rascón, Caleb and Fuentes, Gibrán. "SitLog: A Programming Language
for Service Robots' Tasks". Submitted to International Journal of Advanced Robotic Systems. 2013.
MULTI-DIRECTION OF ARRIVAL
The M-DOA (Multi-Direction Of Arrival)*** algorithm is an autonomous system that allows Golem to locate several sources of sound in its surroundings.
(Figure: example interaction. User: "Hi, Golem! Bring me the juice." Golem: "Sorry! Did you say juice?" User: "Yes, please." Golem: "Ok." User: "Thanks!")
(Figure: Multiple Conversational Partners. Two users speak at once: "I want water..." and "I want orange juice..." Golem: "Could you please talk one at a time?" User: "I want water." Golem: "Ok, water and... you?" User: "I want orange juice.")
(Figure: Interruption Handling. Golem: "I'll be with you in a moment.")
This functionality can be used to react directly to such stimuli. It can also be embedded in language interaction, making the robot capable of handling single or multiple interruptions while it is engaged in a conversation with multiple partners.
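The poster does not detail the clustering step, but as a hedged sketch, individual direction-of-arrival estimates could be grouped into distinct sources by angular proximity; the tolerance, greedy strategy and function names below are assumptions for illustration, not the M-DOA implementation.

```python
import math

def circular_mean(angles_deg):
    """Mean of angles in degrees, respecting wrap-around at 360."""
    x = sum(math.cos(math.radians(a)) for a in angles_deg)
    y = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(y, x))

def cluster_doas(estimates_deg, tolerance_deg=15.0):
    """Hypothetical sketch: greedily group DOA estimates (in degrees) into
    distinct sound sources; not the actual M-DOA algorithm."""
    clusters = []                                        # each cluster is one source
    for angle in estimates_deg:
        for cluster in clusters:
            diff = (angle - circular_mean(cluster) + 180.0) % 360.0 - 180.0
            if abs(diff) <= tolerance_deg:               # close enough to this source
                cluster.append(angle)
                break
        else:
            clusters.append([angle])                     # start a new source
    return [circular_mean(c) for c in clusters]

# Example: three talkers at roughly -60, 0 and 90 degrees
print(cluster_doas([-62.0, -58.0, 1.0, -3.0, 88.0, 92.0, 0.0]))
```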
(Figure: the Single Source Detection Algorithm (Image, Triangle and Array-Redundancy checks; acceptable samples are processed, unacceptable samples are not) and DOA Clustering of the estimates into sources 1, 2 and 3.)
*** Rascón, Caleb and Pineda, Luis. "Lightweight Multi-direction-of-arrival Estimation on a Mobile Robotic Platform"
In Lecture Notes in Eng. and Comp. Science: Proc. of The World Congress on Engineering and Computer Science 2012, Vol I.

