The Golem Group (1998-2011)

Two types of conversational structures are considered: Obligations and
Common Ground.
The dialogue acts contribute to conversations as "charges" or "credits" in
at least one of the two main structures.
Every charge must be credited to balance the transaction before continuing
with the next one, until the task is finished.
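The charge and credit bookkeeping can be sketched as a small ledger. This is a minimal illustration, not the DIME-DAMSL tagging scheme itself: the function names, the `(structure, kind)` entry format, and the sample transaction are all assumptions made for the example.

```python
from collections import Counter

# Structures named on the poster: obligations, plus the two planes of
# common ground (agreement and understanding).
STRUCTURES = {"obligations", "agreement", "understanding"}


def outstanding(entries):
    """Return charges minus credits for every structure that is not balanced.

    `entries` is a list of (structure, kind) pairs, where kind is
    "charge" or "credit". This entry format is hypothetical.
    """
    balance = Counter()
    for structure, kind in entries:
        if structure not in STRUCTURES:
            raise ValueError(f"unknown structure: {structure}")
        if kind == "charge":
            balance[structure] += 1
        elif kind == "credit":
            balance[structure] -= 1
        else:
            raise ValueError(f"unknown entry kind: {kind}")
    return {s: n for s, n in balance.items() if n != 0}


def is_balanced(entries):
    """A transaction is balanced when no charge is left uncredited."""
    return not outstanding(entries)


# Hypothetical transaction: a request opens an obligation and asks for
# agreement; an acceptance and the requested action then close both.
transaction = [
    ("obligations", "charge"),
    ("agreement", "charge"),
    ("agreement", "credit"),
    ("obligations", "credit"),
]
```

With these entries, `outstanding(transaction[:2])` reports one open charge each for obligations and agreement, and `is_balanced(transaction)` holds once both credits have arrived.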
The empirical base of the DIME-DAMSL scheme is the Corpus DIME. It was
produced to analyze dialogue acts in practical dialogues. It consists of
26 dialogues in Spanish, in which the user designs a kitchen by giving
spoken instructions to the system.
DIME-DAMSL 1998-2006
The DIME group designed and produced a large speech corpus to create
acoustic models for Mexican Spanish. The Corpus DIMEx100 is composed
of 6,000 sentences (between 5 and 15 words each) recorded by 100
speakers. Each sentence was analyzed with Mexbet, a phonetic alphabet,
and tagged at multiple levels:
-T22 (phonemes)
-T44 (allophones, coarse)
-T54 (allophones, fine)
-TP (words)
A speech recognition system for Mexican Spanish was created using the
final tagging and the Sphinx speech recognizer.
DIMEx100 2003-2006
GOLEM
Golem's debut was at UNAM's Science Museum Universum in 2007. It was
widely covered by Mexican TV, radio, and press. It gave several
demonstrations at academic events in Mexico during 2008 and 2009, before
retiring.
2001-2009
GUESS THE CARD 2009-2011
GOLEM-II+ 2010-2011
Golem's capabilities were tested with the "Guess the card" game. This fixed
application is a permanent exhibition at the Universum Museum. The
system presents Artificial Intelligence technologies to the general public.
The interaction is carried out through spoken Spanish and visual interpretation.
Golem-II+ is the group's newest service robot. Like the previous version,
it can guide a poster session. In addition, it is able to recognize
pointing gestures, to navigate through busy rooms, and to audio-locate
its user; furthermore, its framework now includes the tests of the
RoboCup@Home competition.
Golem-II+ is based on a cognitive architecture named IOCA
(Interaction-Oriented Cognitive Architecture).
DIME-DAMSL is a theory, inspired by DAMSL, about how practical
dialogues or task-oriented conversations are structured.
In DIME-DAMSL, a practical dialogue not only transmits information about
the task, but also manages the task and the dialogue itself.
Practical dialogues are series of transactions. In each transaction,
obligations are created to reach a specific goal of the task. To achieve this,
the levels of agreement and understanding between conversational agents
are negotiated.
DIME 1998-2009
The DIME group was a research team that focused on the development of a
theory of conversation with its computer implementation, an infrastructure for
the construction of spoken Spanish recognition systems, and a flexible
interaction-oriented architecture, that can be embodied on different
hardware platforms, for the development of applications in diverse domains.
This project gave the name to the Golem group. The
main interest of the research team is the
development of multimodal interaction systems for
fixed and mobile platforms. Golem was the first
implementation of a multimodal system in a service
robot. It was able to guide a poster session through
simple spoken conversation in Spanish, and to
move to the selected poster. The system was a set
of computational agents; each one representing a
modality of information.
Departamento de Ciencias
de la Computación
THE GOLEM GROUP
1998-2011
Luis A. Pineda and the Golem Group
Computer Science Department
Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas
Universidad Nacional Autónoma de México
luis@leibniz.iimas.unam.mx
http://golem.iimas.unam.mx
[Figure: Dialogue Manager (DM = Dialogue Model). Two state graphs, captioned "Main DM" and "Subordinated DM".
Graph 1: states is, ls1, ls2, ls3, rs1, rs2, rs3, rs, fs; edge labels ε:ra1(tour), ok:ra2([ai,pr,pl]), no:ra3(tour), ai:ra4(ai), pr:ra4(pr), pl:ra4(pl), ε:ra5(ai), ε:ra5(pr), ε:ra5(pl), ok:w.
Graph 2: states is, ls1, ls2, ts1, ts2, ts3, fs; edge labels empty:ra1([per,area,proy]), no:ra2(error), per:ra2(per), area:ra2(area), proy:ra2(proy), ε:ra3(per), ε:ra3(area), ε:ra3(proy), no:ra3(ai), no:ra4(ai), ok:w, help.]
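A dialogue model of this kind can be read as a state graph and walked by a small interpreter. The sketch below assumes that each edge label pairs an expected intention with an action to perform (for example `ok:ra2([ai,pr,pl])`), and that `ε` edges fire without user input. The state names echo the figure (is, ls1, ls2, fs), but the transitions, the `run` function, and the dictionary layout are hypothetical, since the figure's exact topology is not reproduced here.

```python
from collections import deque

EPS = "ε"  # label of an edge that fires without waiting for the user

# Hypothetical model: state -> {expected intention: (action, next state)}.
DIALOGUE_MODEL = {
    "is": {EPS: ("ra1(tour)", "ls1")},
    "ls1": {
        "ok": ("ra2([ai,pr,pl])", "ls2"),
        "no": ("ra3(tour)", "fs"),
    },
    "ls2": {
        "ai": ("ra4(ai)", "fs"),
        "pr": ("ra4(pr)", "fs"),
        "pl": ("ra4(pl)", "fs"),
    },
}
FINAL_STATES = {"fs"}


def run(model, intentions, start="is"):
    """Walk the model and return (last state, actions performed).

    `intentions` is the sequence of intentions recognized from the user.
    The walk stops at a final state or when the user input runs out.
    """
    pending = deque(intentions)
    state, actions = start, []
    while state not in FINAL_STATES:
        edges = model[state]
        if EPS in edges:
            label = EPS
        elif pending:
            label = pending.popleft()
            if label not in edges:
                raise ValueError(f"unexpected intention {label!r} in state {state}")
        else:
            break  # waiting for the next user turn
        action, state = edges[label]
        actions.append(action)
    return state, actions
```

For instance, `run(DIALOGUE_MODEL, ["ok", "ai"])` offers the tour, lists the posters, and ends in `fs` after presenting the selected one, while `run(DIALOGUE_MODEL, [])` stops in `ls1` waiting for the user's answer.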
[Figure: Interaction cycle. Input side: Speech Recognition ("I want to visit the AI poster..."), Interpretation with Expected Intentions and an Intentions Filter, Recognized Intention. Center: Dialogue Manager, Situation. Output side: Multimodal Rhetorical Act, Interpretation of Basic Rhetorical Acts, Basic Rhetorical Acts (Modal Specific), Output Devices: Display (Image or Video), Navigation, Synthesizer (text: "The developments in AI...").]
Golem: Hello, do you want to play?
Player: Yes, please.
Player: Is it round?
Golem: Yes, it's round.
...
Player: Is it a planet?
Golem: No, sorry, it's not a planet. You can ask me again,
but the question session is almost finished.
Player: Is it yellow?
Golem: Yes, it's yellow. Now, show me the card you think it is.
Player: (The user puts the card in front of the camera)
Golem: I can't see very well, but I think your card is this one.
(The system displays an image on the screen)
Player: Yes, that's the card I think you chose.
Golem: You won! I chose the card of the Sun.
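The question-and-answer logic of such a game can be sketched in a few lines. This is only an illustration of the idea: the card set, the feature names, and the functions below are hypothetical and do not describe the exhibited system, which also relies on speech and vision components.

```python
# Hypothetical card set with yes/no features; only the Sun card and the
# three features asked about appear in the sample dialogue.
CARDS = {
    "sun": {"round": True, "planet": False, "yellow": True},
    "earth": {"round": True, "planet": True, "yellow": False},
    "rocket": {"round": False, "planet": False, "yellow": False},
}


def answer(chosen, feature):
    """System side: truthfully answer a yes/no question about the chosen card."""
    return CARDS[chosen][feature]


def consistent_cards(answers):
    """Player side: cards that agree with every answer received so far."""
    return sorted(
        name
        for name, features in CARDS.items()
        if all(features[f] == value for f, value in answers.items())
    )


def judge(chosen, shown):
    """System side: compare the card shown to the camera with the chosen one."""
    return "You won!" if shown == chosen else "Sorry, that is not my card."
```

Replaying the sample dialogue, the answers to "round", "planet", and "yellow" for the Sun card leave `consistent_cards` with a single candidate, and showing that card makes `judge` report a win.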
[Table: DIME-DAMSL annotation of one transaction. Columns: utterance; obligations (charge, credit); common ground: agreement (charge, credit) and understanding (charge, credit); dialogue acts.]
1. u: after that <sil> can you put <sil> the cooker hood on the top of the <sil> of the stove (dialogue acts: action-dir)
2. s: okay (dialogue acts: commit, accept)
3. s: <move-obj> (dialogue acts: move-obj)
4. s: is this okay? (dialogue acts: info-request)
5. u: yes, it-s okay (dialogue acts: answer, accept)
[Figure: Golem's Agents Architecture. A Dialogue Manager & Facilitator connected to the Video Controller, the Image Controller, the Speech Recognition Agent (voice input), the Speech Synthesizer Agent, and the Navigation Agent.]
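The idea of a facilitator that routes requests from the dialogue manager to one agent per modality can be sketched as follows. The class, its methods, and the agent behaviors are hypothetical and chosen only to illustrate the pattern; they are not the interfaces of the Golem system.

```python
class Facilitator:
    """Minimal message router: agents register by name and receive requests."""

    def __init__(self):
        self._agents = {}

    def register(self, name, handler):
        """Register a callable that handles requests for one modality."""
        self._agents[name] = handler

    def send(self, name, message):
        """Deliver a message to the named agent and return its reply."""
        if name not in self._agents:
            raise KeyError(f"no agent registered as {name!r}")
        return self._agents[name](message)


# Hypothetical agents, one per output modality.
facilitator = Facilitator()
facilitator.register("synthesizer", lambda text: f"say: {text}")
facilitator.register("navigation", lambda goal: f"move to: {goal}")
facilitator.register("image", lambda name: f"display: {name}")


def present_poster(poster):
    """Dialogue-manager side: one multimodal act split into per-agent requests."""
    return [
        facilitator.send("navigation", poster),
        facilitator.send("image", poster),
        facilitator.send("synthesizer", f"This is the {poster} poster."),
    ]
```

Keeping the agents behind a single router means the dialogue manager only needs agent names, so a modality can be replaced without touching the dialogue logic.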
