The document discusses several projects carried out between 1998 and 2011 by the Golem group from the Computer Science Department at UNAM. It describes the creation of a large speech corpus for Mexican Spanish called DIMEx100, composed of 6,000 sentences recorded by 100 speakers. It also discusses the development of a speech recognition system for Mexican Spanish using this corpus. Several iterations of an interactive robot named Golem are presented, including its debut in 2007 and capabilities like basic conversation, navigation, and playing a card guessing game. The document also introduces DIME-DAMSL, a theory for analyzing dialogue acts in practical dialogues that was inspired by DAMSL.
Little Languages, also known as Domain Specific Languages (DSLs), can be characterized as having a simple grammar, offering high-level abstractions, allowing for hierarchical factoring of solutions, reducing the solution space, and allowing us to develop the tools that matter. In this talk, Chris will provide a history of two dozen influential Little Languages, including some he worked on himself, and highlight their unique approach to solving a given problem.
Bos, Foster & Matheson (eds): Proceedings of the sixth workshop on the semantics and pragmatics of dialogue (EDILOG 2002), 4-6 September 2002, Edinburgh, UK, Pages 5-11.
Cooperation and Collaboration in Natural Command Language Dialogues
J. Gabriel Amores and José F. Quesada
University of Seville
Abstract
This paper discusses cooperative and collaborative behaviour in Natural Command Language Dialogues (NCLDs). We first introduce Natural Command Language Dialogues and then briefly compare them with other types of dialogue. Cooperation and collaboration in NCLDs are then analyzed. Finally, a typology of conflicts in NCLDs is proposed, for which a solution has been implemented in two spoken dialogue prototypes. Sample dialogues are taken from research carried out under the Siridus and D’Homme European projects.
1 Natural Command Language Dialogues
A Natural Command Language (NCL) is a command language expressed through the medium of natural language. We take NCLs as the set of input and output natural language expressions which are acceptable in a given application domain. This domain is semantically defined by the functions (commands) known by the user and the system, and the natural language vocabulary which may be used to express those commands. In addition, NCLs should contain metalinguistic patterns and expressions typical of human-like interaction. Natural Command Language Dialogues are artificially constructed models of action-oriented dialogues (including knowledge representation and reasoning) able to guide the interaction between the different parties involved in a dialogue based on an NCL. An NCLD should allow the following kinds of phenomena:
• Multiple Task NCL: In contrast to Task-Oriented Dialogue models, NCLDs must be able to manage different tasks. Thus, one of the main functions of NCLD systems will be task detection, as will be explained below.
• Context Dependency: Only at the dialogue level is it possible to understand anaphora, ellipsis and other context dependent constructions. From the dialogue system design perspective, the treatment of these discourse phenomena will imply the representation and storage of the whole dialogue history. For an illustration, see (Quesada and Amo ...
This talk will cover various aspects of Logic Programming. We examine Logic Programming in the contexts of Programming Languages, Mathematical Logic and Machine Learning.
We will start with an introduction to Prolog and metaprogramming in Prolog. We will also discuss how miniKanren and Core.Logic differ from Prolog while maintaining the paradigms of logic programming.
We will then cover the Unification Algorithm in depth and examine the mathematical motivations which are rooted in Skolem Normal Form. We will describe the process of converting a statement in first order logic to clausal form logic. We will also discuss the applications of the Unification Algorithm to automated theorem proving and type inferencing.
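As a rough illustration of what unification does, here is a minimal Python sketch. It is not taken from the talk, whose examples are in Common Lisp and Clojure. The term encoding is an assumption made only for this example: variables are strings starting with an uppercase letter, constants are other strings, and compound terms are tuples whose first element is the functor.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def walk(term, subst):
    # Follow variable bindings until reaching an unbound variable or a non-variable.
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def occurs(var, term, subst):
    # Occurs check: does var appear inside term under the current bindings?
    term = walk(term, subst)
    if term == var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term)
    return False

def unify(x, y, subst=None):
    """Return a substitution (dict) that makes x and y equal, or None if none exists."""
    subst = {} if subst is None else subst
    x, y = walk(x, subst), walk(y, subst)
    if x == y:
        return subst
    if is_var(x):
        return None if occurs(x, y, subst) else {**subst, x: y}
    if is_var(y):
        return None if occurs(y, x, subst) else {**subst, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        # Unify compound terms argument by argument, threading the substitution.
        for a, b in zip(x, y):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None

# f(X, b) unifies with f(a, Y) by binding X to a and Y to b.
print(unify(("f", "X", "b"), ("f", "a", "Y")))
# X does not unify with f(X): the occurs check rejects the circular binding.
print(unify("X", ("f", "X")))
```

The occurs check is included because it matters for sound theorem proving and type inference; many Prolog systems omit it by default for speed.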
Finally we will look at the role of Prolog in the context of Machine Learning. This is known as Inductive Logic Programming. In that context we will briefly review Decision Tree Learning and its relationship to ILP. We will then examine Sequential Covering Algorithms for learning clauses in Propositional Calculus and then the more general FOIL algorithm for learning sets of Horn clauses in First Order Predicate Calculus. Examples will be given in both Common Lisp and Clojure for these algorithms.
Pierre de Lacaze has over 20 years’ experience with Lisp and AI based technologies. He holds a Bachelor of Science in Applied Mathematics and Computer Science and a Master’s Degree in Computer Science. He is the president of LispNYC.org
This lecture series covers the use of the R language, its interface and the functions required to evaluate financial risk models. It also covers R applications to financial market data, risk measurement, modern portfolio theory, risk modeling of returns with generalized hyperbolic and lambda distributions, Value at Risk (VaR) modelling, extreme value methods and models, the class of ARCH models, GARCH risk models and portfolio optimization approaches.
This lecture introduces the R language for statistical computing to beginners who want to start working on financial analytics with R. It is also useful for finance professionals who want to use R for financial data analysis.
"Hour of Code": Back to the roots... [1987-1993]
Yannis Kotsanis
A Microworld Oriented Approach in a Multi-Functional Logo-Based Curriculum
G. Bariamis, S. Chaimantas, Y. Kotsanis, L. Papathomaidi
Doukas School
EUROLOGO '93, University of Athens, 28-31/8/1993
User-centered design assumes that a research phase with a representative sample of the final users should be the basis for defining the functional and soft requirements of a project. How can we translate the results of the UX research into actionable requirements?
In my talk, I wish to give you some suggestions on how to informally analyse the verbal results of the UX research to identify the schemata, the ontologies, the taxonomies and the functions of your application.
Golem-II+ is the latest service robot developed by the Golem Group. We design and construct domain-independent service robots. Our developments are based on a theory of Human-Robot Communication centered on the specification of protocols representing the structure of service robots' tasks, which are called Dialogue Models (DMs).
The robot Golem-II+ is the latest service robot developed by the Golem Group.
We design and construct domain-independent service robots based on a theory of Human-Robot Communication centered on the interaction context, and the specification of Speech Acts protocols, which are named Dialogue Models (DMs).
The Golem Group presents a generic model for solving service robot tasks. The model proposes an abstract solution for managing diverse kinds of behaviors in systems based on human-robot interaction.
The model is implemented in Golem-II+, which coordinates its multimodal capabilities to perform communicative interactions independently of the domain and the concrete task. It also includes communicative strategies for error prevention and recovery.
The focus of the Golem Group is to design and construct domain-independent service robots based on a theory of Human-Robot Communication.
The theory is centered on a model of the interaction context, and the abstract specification and concrete interpretation of intentional or Speech Acts protocols, which are named Dialogue Models (DMs).
The core of the system is the DM-interpreter program, which interprets DMs continuously during the robot's behavior. The robot Golem-II+ is our latest service robot developed within this framework.
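To make the idea of interpreting a Dialogue Model more concrete, the following Python sketch treats a DM as a graph of situations whose arcs pair an expected intention with an act to perform. This is only an illustration under stated assumptions, not the group's DM-interpreter: the dictionary layout, the `interpret` function, the `"other"` recovery arc and all act names are hypothetical, and the situation names merely echo labels that appear in the poster's figures.

```python
def interpret(dm, start, final, intentions):
    """Walk a Dialogue Model and return (last situation, acts performed).

    dm maps each situation to {expected intention: (act, next situation)}.
    """
    current, performed = start, []
    for intention in intentions:
        if current == final:
            break
        arcs = dm[current]
        # Unexpected input takes the recovery arc of the current situation.
        act, current = arcs.get(intention, arcs["other"])
        performed.append(act)
    return current, performed

# Hypothetical poster-tour model; every name below is an assumption.
TOUR_DM = {
    "is": {"ok": ("offer_posters", "ls1"), "other": ("ask_again", "is")},
    "ls1": {
        "ai": ("explain_ai_poster", "ls2"),
        "pr": ("explain_pr_poster", "ls2"),
        "other": ("repeat_offer", "ls1"),
    },
    "ls2": {
        "ok": ("offer_posters", "ls1"),
        "no": ("say_goodbye", "fs"),
        "other": ("ask_again", "ls2"),
    },
    "fs": {},
}

state, acts = interpret(TOUR_DM, "is", "fs", ["ok", "mumble", "ai", "no"])
print(state, acts)
```

Keeping the task structure in data rather than in code is what would let the same interpreter run a different task by swapping in another model, and the recovery arc shows one simple way to handle input that was not expected in the current situation.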
Golem-II+ is a service robot developed at the Computer Science Department, IIMAS, UNAM.
Its main development is the cognitive modeling of interactions.
Golem-II+ can speak, navigate and recognize objects, pointing gestures and persons. It combines its aptitudes to carry out different tasks, such as guiding a full poster session or following a person across the room. Ask him about it!
Through spoken conversation, Golem-II+ interacts with the user to manage different modalities of behaviors.
The behavior of our robot, Golem-II+, is governed by an Interaction-Oriented Cognitive Architecture (IOCA). A diagram of IOCA is shown below.
Our approach extracts different document representations and computes multiple metric distances from each of them. The system then tries to find the combination of distances and representations that best fits the training data by means of linear programming, Support Vector Regression and Neural Network models. We tested this system on English, Greek and Spanish document sets and achieved moderately successful results.
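The first stage of such a pipeline, computing several distances over several representations of a document pair, might look like the Python sketch below. The specific representations (character trigrams, word counts) and distances (cosine, Manhattan, Euclidean) are assumptions chosen for illustration; the description above does not say which ones the system uses. The resulting feature vector is what a model such as Support Vector Regression would then be fitted on, a step not shown here.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    # Representation 1 (assumed): counts of character n-grams.
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def word_counts(text):
    # Representation 2 (assumed): counts of whitespace-separated words.
    return Counter(text.lower().split())

def cosine_distance(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0

def manhattan_distance(a, b):
    return float(sum(abs(a[k] - b[k]) for k in a.keys() | b.keys()))

def euclidean_distance(a, b):
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a.keys() | b.keys()))

REPRESENTATIONS = [char_ngrams, word_counts]
DISTANCES = [cosine_distance, manhattan_distance, euclidean_distance]

def distance_features(doc_a, doc_b):
    """One feature per (representation, distance) pair for a pair of documents."""
    features = []
    for represent in REPRESENTATIONS:
        rep_a, rep_b = represent(doc_a), represent(doc_b)
        features.extend(dist(rep_a, rep_b) for dist in DISTANCES)
    return features

print(distance_features("the robot guides the tour", "the robot follows a person"))
```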
DIME-DAMSL 1998-2006
Two types of conversational structures are considered: Obligations and Common Ground. The dialogue acts contribute to conversations as "charges" or "credits" in at least one of the two main structures. Any charge should be credited to balance the transaction and continue with the next one until the task is finished.
The empirical base of the DIME-DAMSL scheme is the Corpus DIME. It was produced to analyze dialogue acts in practical dialogues. It consists of 26 dialogues in Spanish in which the user designs a kitchen by giving spoken instructions to the system.
DIMEx100 2003-2006
The DIME group designed and produced a large speech corpus to create acoustic models for Mexican Spanish. The Corpus DIMEx100 is composed of 6,000 sentences (between 5 and 15 words each) recorded by 100 speakers. Each sentence was analyzed with Mexbet, a phonetic alphabet, and tagged at multiple levels:
-T22 (phonemes)
-T44 (allophones gross)
-T54 (allophones fine)
-TP (words)
A Speech Recognition System for Mexican Spanish was created with the final tagging and the Sphinx algorithm.
GOLEM 2001-2009
Golem's debut was at UNAM’s Science Museum Universum in 2007. It was widely covered by the Mexican TV, radio, and press. It made several demonstrations in academic events in Mexico during 2008 and 2009, before retiring.
GUESS THE CARD 2009-2011
Golem's capabilities were tested with the "Guess the card" game. This fixed application is a permanent exhibition at the Universum Museum. This system presents Artificial Intelligence technologies to the general public. The interaction is carried out in spoken Spanish and visual interpretations.
GOLEM-II+ 2010-2011
Golem-II+ is the group's newest service robot. Like the previous version, it can guide a poster session. In addition, it is able to recognize pointing gestures, to navigate through busy rooms, and to audio-locate its user; furthermore, its framework now includes the tests of the Robocup@Home competition.
Golem-II+ is based on a cognitive architecture named IOCA (Interaction-Oriented Cognitive Architecture).
DIME-DAMSL is a theory, inspired by DAMSL, about how practical dialogues or task-oriented conversations are structured.
In DIME-DAMSL, a practical dialogue not only transmits information about the task, but also manages the task and the dialogue itself.
Practical dialogues are series of transactions. In each transaction, obligations are created to reach a specific goal of the task. To achieve this, the levels of agreement and understanding between conversational agents are negotiated.
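The bookkeeping behind DIME-DAMSL, in which dialogue acts open "charges" on a conversational structure and later acts "credit" them until the transaction balances, can be sketched in Python as follows. This is a loose illustration, not the tagging scheme itself: the `TransactionLedger` class, its method names, and the particular charges assigned to the utterances are assumptions made for the example.

```python
from collections import defaultdict

class TransactionLedger:
    """Tracks open charges per conversational structure within one transaction."""

    def __init__(self):
        # structure name -> ids of utterances whose charge is still open
        self.open_charges = defaultdict(set)

    def charge(self, structure, utterance_id):
        self.open_charges[structure].add(utterance_id)

    def credit(self, structure, utterance_id):
        # A credit cancels the charge opened by an earlier utterance.
        self.open_charges[structure].discard(utterance_id)

    def balanced(self):
        # The transaction is balanced once no structure has an open charge.
        return all(not pending for pending in self.open_charges.values())

ledger = TransactionLedger()
ledger.charge("obligations", 1)   # u: "can you put the cooker hood on top of the stove"
print(ledger.balanced())          # False: the request is still pending
ledger.credit("obligations", 1)   # s: performs the requested move
ledger.charge("agreement", 4)     # s: "is this okay?"
ledger.credit("agreement", 4)     # u: "yes, it's okay"
print(ledger.balanced())          # True: the next transaction can start
```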
DIME 1998-2009
The DIME group was a research team that focused on the development of a theory of conversation with its computer implementation, an infrastructure for the construction of spoken Spanish recognition systems, and a flexible interaction-oriented architecture that can be embodied on different hardware platforms for the development of applications in diverse domains.
This project gave the name to the Golem group. The main interest of the research team is the development of multimodal interaction systems for fixed and mobile platforms. Golem was the first implementation of a multimodal system in a service robot. It was able to guide a poster session through simple spoken conversation in Spanish, and to move to the selected poster. The system was a set of computational agents, each one representing a modality of information.
Departamento de Ciencias
de la Computación
THE GOLEM GROUP
1998-2011
Luis A. Pineda and the Golem Group
Computer Science Department
Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas
Universidad Nacional Autónoma de México
luis@leibniz.iimas.unam.mx
http://golem.iimas.unam.mx
[Figure: Dialogue Manager with a Main DM and a Subordinated DM (DM = Dialogue Model). Node labels include is, ls1, ls2, ls3, rs1, rs2, rs3, ts1, ts2, ts3 and fs. Arc labels include ε:ra1(tour), ok:ra2([ai,pr,pl]), no:ra3(tour), ai:ra4(ai), pr:ra4(pr), pl:ra4(pl), ε:ra5(ai), ε:ra5(pr), ε:ra5(pl), ok:w, empty:ra1([per,area,proy]), no:ra2(error), per:ra2(per), area:ra2(area), proy:ra2(proy), ε:ra3(per), ε:ra3(area), ε:ra3(proy), no:ra3(ai), no:ra4(ai) and help.]
[Figure: system architecture diagram. Labels: Speech Recognition (example input: "I want to visit the AI poster..."), Intentions Filter, Expected Intentions, Recognized Intention, Situation Interpretation, Dialogue Manager, Multimodal Rhetorical Act, Interpretation of Basic Rhetorical Acts, Basic Rhetorical Acts (Modal Specific), Output Devices, Display (Image or Video), Navigation, Synthesizer (example text: "The developments in AI...").]
Golem: Hello, do you want to play?
Player: Yes, please.
Player: Is it round?
Golem: Yes, it's round.
...
Player: Is it a planet?
Golem: No, sorry, it's not a planet. You can ask me again, but the question session is almost finished.
Player: Is it yellow?
Golem: Yes, it's yellow. Now, show me the card you think it is.
Player: (The user puts the card in front of the camera)
Golem: I can't see very well, but I think your card is this one. (The system displays an image on the screen)
Player: Yes, that's the card I think you chose.
Golem: You won! I chose the card of the Sun.
[Table: sample dialogue tagged with DIME-DAMSL. Columns: utterance; dialogue acts; obligations (charge, credit); common ground, split into agreement (charge, credit) and understanding (charge, credit). Utterances and their dialogue acts:]
1. u: after that <sil> can you put <sil> the cooker hood on the top of the <sil> of the stove (action-dir)
2. s: okay (commit, accept)
3. s: <move-obj> (move-obj)
4. s: is this okay? (info-request)
5. u: yes, it-s okay (answer, accept)
[Figure: Golem's Agents Architecture. Labels: Dialogue Manager & Facilitator, Video Controller, Image Controller, Speech Recognition Agent, Speech Synthesizer Agent, Navigation Agent, Voice.]