Speech Commands for Bots
CS4731 Fall ‘07 Final Presentation
Dannon Baker, David McCann, Anand Taralika, and
Reeves Washington
Message
Given the limitations of current speech recognition technology as a communication medium in games, the player experience can be enriched by designing dynamic game AI that responds to the player’s voice commands and emotional state.
Problem
• Enrich team game play through voice commands
• Command a mixed team of bots and humans in a natural, procedural way
• Enrich entertainment through player emotion detection
Speech in gaming
• Many MMOGs use TeamSpeak for communication between human players
• Speech recognition
  – Konami’s Lifeline
    • Large, complicated vocabulary
    • “Gimmicky” use of speech control as the sole control mechanism
  – Ghost Recon 2 and Rainbow Six 3
    • Simple, atomic command structure
Design
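[Diagram: system design]
Components: Microsoft Speech API; Command Parsing and Emotion Detection; Subsumption Architecture (High-Level Plan, Current Action); Game Engine
• Microsoft Speech API → Command Parsing and Emotion Detection: recognition events (recognized phrases from the SAPI grammar) and the raw audio stream
• Command Parsing and Emotion Detection → Subsumption Architecture: emotion modifier (Normal/Angry/Bored/Cautious) and a low-level command or high-level plan
• Subsumption Architecture → Game Engine: atomic engine chat commands
• Game Engine → Subsumption Architecture: engine events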
Emotion Recognition
• The Speech API exposes the player’s voice as a raw WAV-format stream, from which mood features are extracted
  – Average intensity gives the volume
  – Average number of x-intercepts (zero crossings) over time approximates the pitch
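
The two features above can be sketched in a few lines. This is a minimal illustration, not the project's code: it assumes the audio has already been decoded into a list of signed sample values, and the function names and the `sine` test helper are hypothetical.

```python
import math
from typing import List, Sequence


def average_intensity(samples: Sequence[float]) -> float:
    """Mean absolute amplitude, used here as a stand-in for volume."""
    if not samples:
        return 0.0
    return sum(abs(s) for s in samples) / len(samples)


def zero_crossing_rate(samples: Sequence[float], sample_rate: int) -> float:
    """Sign changes per second; a rough proxy for pitch."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration_seconds = len(samples) / sample_rate
    return crossings / duration_seconds


def sine(freq_hz: float, sample_rate: int = 8000,
         seconds: float = 1.0, amplitude: float = 1.0) -> List[float]:
    """Hypothetical helper: a pure tone for trying the features out."""
    n = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


if __name__ == "__main__":
    low, high = sine(100), sine(400)
    # A pure tone crosses zero about twice per cycle, so the rate grows
    # with frequency.
    print(zero_crossing_rate(low, 8000), zero_crossing_rate(high, 8000))
    print(average_intensity(sine(100, amplitude=0.5)), average_intensity(low))
```

A pure tone crosses zero roughly twice per cycle, so the crossing rate rises with frequency. Real speech is noisier, which is why this is only a coarse pitch estimate.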
Emotion Classification
Mood                          Identifier
Angry / Agitated / Excited    High volume, high pitch
Cautious                      Low volume, low pitch
Bored                         Low pitch, off-topic speech, prolonged silence
Normal                        Everything else
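
One way to read the table is as a small rule-based classifier. The sketch below makes several assumptions: every threshold value is made up, the `VoiceFeatures` fields are hypothetical, and the slides do not say how the three "Bored" identifiers combine, so the code assumes either prolonged silence or low-pitched off-topic speech is enough.

```python
from dataclasses import dataclass


@dataclass
class VoiceFeatures:
    volume: float                 # e.g. average intensity, normalized to 0..1
    pitch: float                  # e.g. zero crossings per second
    silence_seconds: float = 0.0  # time since the player last spoke
    off_topic: bool = False       # speech matched no command in the grammar


# Hypothetical thresholds; real values would need tuning per player and mic.
HIGH_VOLUME = 0.6
LOW_VOLUME = 0.2
HIGH_PITCH = 300.0
LOW_PITCH = 120.0
LONG_SILENCE = 10.0


def classify_mood(f: VoiceFeatures) -> str:
    """Map features to a mood label, checking the table rows in order."""
    # "Angry" stands for the whole Angry / Agitated / Excited row.
    if f.volume >= HIGH_VOLUME and f.pitch >= HIGH_PITCH:
        return "Angry"
    # Assumption: silence alone, or low-pitched off-topic speech, means bored.
    if f.silence_seconds >= LONG_SILENCE or (f.pitch <= LOW_PITCH and f.off_topic):
        return "Bored"
    if f.volume <= LOW_VOLUME and f.pitch <= LOW_PITCH:
        return "Cautious"
    return "Normal"  # everything else


if __name__ == "__main__":
    print(classify_mood(VoiceFeatures(volume=0.9, pitch=350.0)))
    print(classify_mood(VoiceFeatures(volume=0.1, pitch=100.0)))
```

"Bored" is tested before "Cautious" because both rows mention low pitch; that ordering is a choice made for this sketch, not something the slides specify.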
Natural Language Parsing
• The Microsoft Speech SDK allows custom context-free grammars (CFGs)
  – A structured, small-scale grammar vastly improves SDK performance
• The framework parses the syntax tree, building individual commands to issue
  – Generic rule handling allows for iterative grammar development
  – Synonyms add flexibility
• The subsumption architecture handles individual event execution
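
To illustrate how a small grammar with synonyms can turn a phrase into individual commands, here is a hand-written recursive-descent parser over a toy grammar. It does not use the Microsoft Speech SDK or its grammar format; the grammar, the verb list, the `Command` type, and the use of "then" to chain commands are all assumptions made for this example. Only the reference words come from the slides' primitive keywords.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Toy grammar (hypothetical):
#   plan    := command ("then" command)*
#   command := VERB [REFERENCE]

# Synonyms map several spoken verbs onto one canonical action.
VERBS = {
    "attack": "attack", "shoot": "attack",
    "defend": "defend", "guard": "defend", "protect": "defend",
    "follow": "follow",
    "go": "goto", "move": "goto",
}
REFERENCES = {"me", "you", "this", "there", "here"}
FILLER = {"to", "the", "please"}


@dataclass
class Command:
    action: str
    target: Optional[str] = None


def parse_command(tokens: List[str], pos: int) -> Tuple[Command, int]:
    """Parse one VERB [REFERENCE] command starting at tokens[pos]."""
    if pos >= len(tokens) or tokens[pos] not in VERBS:
        raise ValueError("expected a command verb")
    action = VERBS[tokens[pos]]
    pos += 1
    target = None
    if pos < len(tokens) and tokens[pos] in REFERENCES:
        target = tokens[pos]
        pos += 1
    return Command(action, target), pos


def parse_plan(utterance: str) -> List[Command]:
    """Parse a whole utterance into an ordered list of commands."""
    tokens = [t.strip(",.!?") for t in utterance.lower().split()]
    tokens = [t for t in tokens if t and t not in FILLER]
    plan: List[Command] = []
    command, pos = parse_command(tokens, 0)
    plan.append(command)
    while pos < len(tokens) and tokens[pos] == "then":
        command, pos = parse_command(tokens, pos + 1)
        plan.append(command)
    if pos != len(tokens):
        raise ValueError(f"unexpected word: {tokens[pos]!r}")
    return plan


if __name__ == "__main__":
    print(parse_plan("Follow me, then guard here"))
```

Because every verb goes through the `VERBS` table, adding a synonym is a one-line change, which loosely mirrors the iterative grammar development the slide describes.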
Engine-Level Modifications
• Chat mechanism for directly controlling bots
• Primitive events to detect command completion
• Primitive keywords: “me, you, this, there, here,” etc.
• Map annotations
Subsumption Architecture
• Sends chat events to the server based on the priority and completion criteria of the incoming event or command
• Emotional responses are high-priority atomic commands that override the current high-level plan
• Immediate commands override a high-level plan
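
The priority rules above can be sketched as a small arbiter. The class, the three priority levels, and the method names are hypothetical. The sketch also assumes that an overridden high-level plan resumes once the overriding command completes, which the slides do not state.

```python
from collections import deque
from typing import Deque, Iterable, Optional, Tuple

# Higher number wins. The ordering follows the slide: emotional responses
# and immediate commands both override the high-level plan.
PRIORITY = {"plan": 0, "immediate": 1, "emotion": 2}


class Arbiter:
    """Chooses which command should currently be sent to the engine."""

    def __init__(self) -> None:
        self.plan: Deque[str] = deque()                # remaining plan steps
        self.override: Optional[Tuple[int, str]] = None

    def set_plan(self, steps: Iterable[str]) -> None:
        self.plan = deque(steps)

    def push(self, kind: str, command: str) -> None:
        """Accept an atomic command unless a higher-priority one is active."""
        priority = PRIORITY[kind]
        if self.override is None or priority >= self.override[0]:
            self.override = (priority, command)

    def current_action(self) -> Optional[str]:
        if self.override is not None:
            return self.override[1]
        return self.plan[0] if self.plan else None

    def on_completed(self) -> None:
        """Call when the engine reports the current action as complete."""
        if self.override is not None:
            self.override = None   # assumption: fall back to the plan
        elif self.plan:
            self.plan.popleft()    # advance to the next plan step


if __name__ == "__main__":
    arbiter = Arbiter()
    arbiter.set_plan(["goto base", "defend base"])
    arbiter.push("immediate", "follow me")
    print(arbiter.current_action())
    arbiter.on_completed()
    print(arbiter.current_action())
```

Here `on_completed` plays the role of the engine's completion event: it either clears the active override or advances the plan by one step.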
Conclusion
• Enriched game play was implemented for Quake 3 using speech commands
• Simple atomic commands provide a robust control mechanism
• Procedural semantics allow the user to build high-level plans
• Emotional responses enrich the game
Future Work
• In addition to recognizing commands, also use NLP to augment the emotion classifier
• Use extended mood identifiers such as facial expressions, heart rate, blood pressure, and skin conductance sensors
• Many more vocabulary and engine-primitive extensions