1. Speech Commands for Bots
CS4731 Fall ‘07 Final Presentation
Dannon Baker, David McCann, Anand Taralika, and
Reeves Washington
2. Message
An analysis of the limitations of current speech recognition
technology as a communication medium in games suggests that
the player experience can be enriched by designing dynamic
game AI that responds to the player’s emotional state and
voice commands.
3. Problem
• Enrich team game play through voice
commands
• Command a mixed team of bots and
humans in a natural, procedural way
• Enrich entertainment through player
emotion detection
4. Speech in gaming
• Many MMOGs use TeamSpeak for voice
communication between human players
• Speech Recognition
– Konami’s Lifeline
• Large, complicated vocabulary
• “Gimmicky” use of speech control as sole control
mechanism
– Ghost Recon 2 and Rainbow Six 3
• Simple, atomic command structure
6. Emotion Recognition
• Speech API exposes the player’s voice as a raw
WAV-format data stream, from which mood
features are extracted
– Average intensity gives the volume
– Average number of x-intercepts (zero crossings)
over time approximates the pitch
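The two features above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the WAV stream has already been decoded into a list of signed sample values, and the function name `mood_features` is hypothetical.

```python
import math

def mood_features(samples, sample_rate):
    """Return (average intensity, zero crossings per second) for PCM samples.

    Assumption: `samples` is a sequence of signed amplitudes already
    decoded from the raw WAV stream; `sample_rate` is in Hz.
    """
    if not samples:
        return 0.0, 0.0
    # Average intensity: mean absolute amplitude, used as a volume proxy.
    intensity = sum(abs(s) for s in samples) / len(samples)
    # Count sign changes between consecutive samples (the x-intercepts).
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = len(samples) / sample_rate
    return intensity, crossings / duration

# Usage: one second of a synthetic 440 Hz tone sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
intensity, zc_rate = mood_features(tone, rate)
# A pure tone crosses zero twice per cycle, so zc_rate / 2 estimates its pitch.
estimated_pitch = zc_rate / 2
```

Zero-crossing counting is only a rough pitch estimate for real speech, which is noisier than a pure tone, but it is cheap enough to run on every utterance.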
7. Emotion Classification
Mood | Identifier
Angry / Agitated / Excited | High volume, high pitch
Cautious | Low volume, low pitch
Bored | Low pitch, off-topic speech, prolonged silence
Normal | Everything else
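One way to read the mood table as a rule-based classifier is sketched below. The thresholds, the normalized 0 to 1 volume scale, and the choice to treat each "Bored" identifier as sufficient on its own are all assumptions made for illustration; the slides do not give the actual values or the exact rule combination.

```python
def classify_mood(volume, pitch, silence_seconds=0.0, off_topic=False,
                  loud=0.6, quiet=0.2, high=250.0, low=120.0,
                  max_silence=10.0):
    """Map voice features to a mood label from the table.

    All threshold defaults are hypothetical: volume is assumed to be
    normalized to 0..1 and pitch to be given in Hz.
    """
    # High volume and high pitch: angry, agitated, or excited.
    if volume >= loud and pitch >= high:
        return "angry/agitated/excited"
    # Low volume and low pitch: cautious.
    if volume <= quiet and pitch <= low:
        return "cautious"
    # Assumption: any single "Bored" identifier is enough.
    if pitch <= low or off_topic or silence_seconds >= max_silence:
        return "bored"
    # Everything else.
    return "normal"

print(classify_mood(0.9, 300.0))   # loud and high-pitched
print(classify_mood(0.1, 100.0))   # quiet and low-pitched
```

Checking "cautious" before "bored" matters here because both rows share the low-pitch identifier.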
8. Natural Language Parsing
• Microsoft SDK allows for custom context-free grammars (CFGs)
– Structured, small-scale grammar vastly improves
SDK performance
• Framework parses syntax tree, building
individual commands to issue
– Generic rule handling allows for iterative grammar
development
– Add flexibility by using synonyms
• Subsumption architecture handles individual
event execution
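The idea of walking a syntax tree with generic rule handling and synonyms can be sketched as below. The tree shape (nested `(rule, children)` tuples), the rule names, and the synonym table are hypothetical stand-ins; they are not the Microsoft SDK's data structures or the project's grammar.

```python
# Hypothetical synonym table: several spoken words map to one command verb.
SYNONYMS = {
    "go": "move", "run": "move", "walk": "move",
    "shoot": "attack", "fire": "attack",
}

def canonical(word):
    """Return the canonical form of a word, or the word itself."""
    return SYNONYMS.get(word, word)

def build_commands(node):
    """Walk a (rule, children) parse tree and return a list of commands."""
    rule, children = node
    if rule == "command":
        # Leaf rule: children are (slot, word) pairs for one command.
        return [{slot: canonical(word) for slot, word in children}]
    # Generic handling: any other rule just collects its subtrees' commands,
    # so new grammar rules can be added without new handler code.
    commands = []
    for child in children:
        commands.extend(build_commands(child))
    return commands

tree = ("sentence", [
    ("command", [("target", "you"), ("verb", "go"), ("place", "there")]),
    ("command", [("verb", "fire"), ("target", "this")]),
])
commands = build_commands(tree)
```

Because unknown rules fall through to the generic branch, the grammar can grow iteratively while only the leaf rules need explicit handling.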
9. Engine-Level Modifications
• Chat mechanism for directly controlling
bots
• Primitive events to detect command
completion
• Primitive keywords: “me, you, this, there,
here,” etc.
• Map annotations
10. Subsumption Architecture
• Sends chat event to the server based on
priority and completion criteria of incoming
event / command
• Emotional responses are high priority
atomic commands that override current
high-level plan
• Immediate commands override a high-level
plan
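The priority ordering described above can be sketched with a small arbiter. This is an assumption-laden illustration: the class name `Arbiter`, the three layer constants, and the sample chat strings are hypothetical, and the sketch models "override" as jumping the queue rather than discarding the plan. Completion criteria are not modeled; `next_event` is assumed to be called once the current command has finished.

```python
import heapq
import itertools

# Lower value = higher priority (hypothetical layer names).
EMOTION, IMMEDIATE, PLAN = 0, 1, 2

class Arbiter:
    """Choose which pending chat event to send to the server next."""

    def __init__(self):
        self._queue = []
        # Counter breaks ties so events in one layer keep arrival order.
        self._order = itertools.count()

    def submit(self, priority, chat_text):
        heapq.heappush(self._queue, (priority, next(self._order), chat_text))

    def next_event(self):
        """Return the next chat text to send, or None if nothing is pending."""
        if not self._queue:
            return None
        return heapq.heappop(self._queue)[2]

arbiter = Arbiter()
arbiter.submit(PLAN, "go to base")
arbiter.submit(PLAN, "defend base")
arbiter.submit(IMMEDIATE, "follow me")
arbiter.submit(EMOTION, "retreat")
sent = [arbiter.next_event() for _ in range(5)]
```

In this run the emotional response is sent first, then the immediate command, then the two plan steps in their original order.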
11. Conclusion
• Enriched game play was implemented
for Quake3 using speech commands
• Simple atomic commands provide a robust
control mechanism
• Procedural semantics allow the user to
build high-level plans
• Emotional responses enrich the game
12. Future Work
• Use NLP to augment the emotion
classifier, in addition to recognizing
commands
• Use additional mood identifiers such as
facial expressions, heart rate, blood
pressure, and skin conductance
• Many, many vocabulary and engine
primitive extensions