A Common Gesture and Speech Production Framework for
              Virtual and Physical Agents
     Quoc Anh Le - Jing Huang - Catherine Pelachaud
                        CNRS, LTCI
               Telecom-ParisTech, France



    Workshop on Speech and Gesture Production, ICMI 2012, Santa Monica, CA, USA
Introduction
      Motivations

       • Similar approaches between virtual agents and
         humanoid robots
       • Limits of existing systems: agent dependent
      Objectives

       • Common co-verbal gesture generation framework for
         both virtual and physical agents
      Methodologies

       • Based on GRETA system
       • Use
          - same representation languages
          - same algorithm for selecting and planning gestures
          - different algorithms for creating the animation
Architecture Overview
      Common modules (connected through the ActiveMQ central messaging system)
       • Intent Planner: input data (text, audio, video, etc.) -> FML-APML
       • Behavior Planner: FML-APML -> BML
       • Behavior Realizer: BML -> keyframes
      Agent-specific modules
       • Animation Realizer for Greta: keyframes -> FAP-BAP values -> FAP-BAP Player
       • Animation Realizer for Nao: keyframes -> joint values -> Nao built-in proprietary procedures
      Data
       • Intent Lexicon and Behavior Lexicon
       • Baselines for Nao and for Greta
       • Gestuary for Nao and for Greta
       • Animation Lexicon for Greta and for Nao
Behavior Realizer
Behavior Realizer: Outline

      Processes common to all agents
         1.   Create gesture from the gestuary of an agent
         2.   Schedule timing of gesture phases
         3.   Generate keyframes: pair (absolute time, symbolic
              description of hand configuration at this time)
      Different databases
             For Nao
                 Gestuary (for instance, pointing with a fully stretched arm)
                 Velocity profile (empirically determined from Nao)
             For Greta
                 Gestuary (for instance, pointing with one finger)
                 Velocity profile (empirically determined from real humans)
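
The three common steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GRETA implementation: the dictionary layout of the gestuaries, the `Keyframe` class, the `REST` placeholder and the fixed phase durations are all hypothetical. The symbolic values are the ones used in the pointing example of this deck.

```python
from dataclasses import dataclass

# Hypothetical per-agent gestuaries: lexeme -> symbolic stroke description.
GESTUARIES = {
    "nao": {"pointing": {"vertical": "YUpperP", "horizontal": "XEP",
                         "distance": "XFar", "hShape": "OPEN"}},
    "greta": {"pointing": {"vertical": "YP", "horizontal": "XP",
                           "distance": "XMiddle", "hShape": "INDEX"}},
}

REST = {"pose": "REST"}  # hypothetical symbolic rest position


@dataclass(frozen=True)
class Keyframe:
    time: float         # absolute time in seconds
    description: dict   # symbolic hand configuration at that time


def build_keyframes(agent, lexeme, stroke_start, stroke_end,
                    prep_duration=0.4, retract_duration=0.5):
    """Steps 1-3: look up the gesture, time its phases, emit keyframes."""
    stroke = GESTUARIES[agent][lexeme]        # step 1: agent-specific gestuary
    start = stroke_start - prep_duration      # step 2: schedule the phases
    end = stroke_end + retract_duration
    return [                                  # step 3: (time, description) pairs
        Keyframe(start, REST),
        Keyframe(stroke_start, stroke),
        Keyframe(stroke_end, stroke),
        Keyframe(end, REST),
    ]
```

The same call serves both agents; only the gestuary entry differs, for example `build_keyframes("nao", "pointing", 1.0, 1.5)` versus `build_keyframes("greta", "pointing", 1.0, 1.5)`.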


Example: Different pointing gestures

      BML (common input)
         <bml id="bml1">
            <speech xmlns="" id="s1" start="0">
               <text>It is <sync id="tm1"/> over there! <sync id="tm2"/></text>
            </speech>
            <gesture id="g1" lexeme="pointing" start="s1:tm1" end="s1:tm2">
               <description priority="1" type="GRETA">
                  <GRETA:SPC>0.80</GRETA:SPC>
                  <GRETA:TMP>0.50</GRETA:TMP>
                  <GRETA:FLD>-0.62</GRETA:FLD>
                  <GRETA:PWR>0.30</GRETA:PWR>
                  <GRETA:REP>0.00</GRETA:REP>
                  <GRETA:OPE>1.00</GRETA:OPE>
                  <GRETA:TEN>0.20</GRETA:TEN>
               </description>
            </gesture>
         </bml>

      Step 1: create the gesture from the gestuary of the agent
         Nao Gestuary
            ..
            <gesture id="pointing">
               <phase type="stroke">
                  <vertical>YUpperP</vertical>
                  <horizontal>XEP</horizontal>
                  <distance>XFar</distance>
                  <hShape>OPEN</hShape>
               </phase>
            </gesture>
            …
         Greta Gestuary
            ..
            <gesture id="pointing">
               <phase type="stroke">
                  <vertical>YP</vertical>
                  <horizontal>XP</horizontal>
                  <distance>XMiddle</distance>
                  <hShape>INDEX</hShape>
               </phase>
            </gesture>
            …

      Steps 2, 3: schedule the gesture phases and generate keyframes (both agents)
            <keyframe 1 (time, description)>
            <keyframe 2 (time, description)>
            …
            <keyframe N (time, description)>

      Step 4: create animation parameters
            Nao: JOINT VALUES
            Greta: BAP
BR: Synchronization with speech

          Algorithm
          • Compute preparation phase
          • Do not perform gesture if not enough time (strokeEnd(i-1) > strokeStart(i) + duration)
          • Add a hold phase to fit gesture planned duration
          • Co-articulation between several gestures
            - If enough time, retraction phase (i.e. go back to rest position)
              (figure: two separate gestures, each with its own Start and end)
            - Otherwise, go from end of stroke to preparation phase of next
              gesture
              (figure: one Start and one end enclosing two strokes, each marked S-start and S-end)
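
The scheduling rules above can be sketched as follows. This is a minimal illustration, not the authors' code. The `Gesture` fields and the `plan` function are hypothetical names, hold phases are left out, and the "not enough time" test is read here as "the preparation cannot finish between the previous stroke end and the planned stroke start", which is an assumption about the intent of the inequality on the slide.

```python
from dataclasses import dataclass


@dataclass
class Gesture:
    stroke_start: float      # taken from the speech time marker, in seconds
    stroke_end: float
    prep_duration: float     # predicted time needed to reach the stroke position
    retract_duration: float  # predicted time needed to return to rest


def plan(gestures):
    """Return (gesture, action) pairs, where action says how the gesture ends."""
    kept = []
    prev_end = None  # stroke end of the last gesture that was kept
    for g in gestures:
        # Assumed reading of the test: drop the gesture when its preparation
        # cannot fit after the previous stroke.
        if prev_end is not None and prev_end + g.prep_duration > g.stroke_start:
            continue
        kept.append(g)
        prev_end = g.stroke_end

    result = []
    for cur, nxt in zip(kept, kept[1:] + [None]):
        if nxt is None:
            result.append((cur, "retract"))  # last gesture goes back to rest
            continue
        gap = nxt.stroke_start - cur.stroke_end
        needed = cur.retract_duration + nxt.prep_duration
        # Enough time: retract to rest. Otherwise move directly from the end
        # of this stroke into the preparation of the next gesture.
        result.append((cur, "retract" if gap >= needed else "co-articulate"))
    return result
```

With strokes at 1.0-1.5 s, 1.7-2.0 s, 2.2-2.6 s and 4.0-4.5 s, a preparation of 0.4 s and a retraction of 0.5 s, the second gesture is dropped, the first is co-articulated with the third, and the last two end with a retraction.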
BR: Velocity profiles

          Gesture velocity
          • Predict a movement duration using Fitts’ law:
             • Movement Time = a + b * log2(Distance + 1)
          • Threshold of maximal speeds (empirically determined)
          • Stroke phase is different from other phases in velocity and
            acceleration (Quek, 1995)
          Add expressivity
              • Temporal extent (TMP): modulate the duration of the whole gesture
                => change coefficient of Fitts’ Law
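
A small sketch of the duration prediction follows. Only the formula itself comes from the slide. The way TMP scales the slope `b`, the assumed TMP range of [-1, 1] and the `max_speed` clamp are illustrative assumptions, not the values used in the system.

```python
import math


def movement_time(distance, a, b, tmp=0.0, max_speed=None):
    """Fitts' law duration: a + b * log2(distance + 1), in seconds."""
    # Hypothetical TMP mapping: tmp in [-1, 1]; positive values shorten the
    # movement by shrinking the slope b, negative values lengthen it.
    b_eff = b * (1.0 - 0.5 * tmp)
    t = a + b_eff * math.log2(distance + 1.0)
    if max_speed is not None and distance > 0:
        # Respect the empirically determined speed limit of the agent.
        t = max(t, distance / max_speed)
    return t
```

For instance, with `a = 0.1` and `b = 0.2`, a distance of 3 gives 0.1 + 0.2 * log2(4) = 0.5 s, and raising `tmp` to 1.0 under the assumed mapping shortens it to 0.3 s.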




BR: Build coefficients of Fitts’ law
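
Only the title of this slide survives in the transcript. One common way to obtain the coefficients, given here as an assumption rather than the authors' procedure, is an ordinary least-squares fit of measured movement times against log2(Distance + 1). The function name `fit_fitts` is hypothetical.

```python
import math


def fit_fitts(distances, times):
    """Least-squares estimate of (a, b) in T = a + b * log2(D + 1)."""
    xs = [math.log2(d + 1.0) for d in distances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, times))
    b = sxt / sxx            # slope of the regression line
    a = mean_t - b * mean_x  # intercept
    return a, b
```

The measurements would come from the agent itself for Nao and from human motion data for Greta, in line with the velocity profiles listed in the outline.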




Animation Realizer
Implemented expressivity parameters
EXP   Definition                        Nao                                    Greta
TMP   Velocity of movement              Change coefficient of Fitts’ law       Change coefficient of Fitts’ law
SPC   Amplitude of movement             Limited to predefined key positions    Change gesture space scales
PWR   Acceleration of movement          Modulate stroke duration               Modulate stroke acceleration
REP   Number of stroke repetitions      Yes                                    Yes
FLD   Smoothness and continuity         No                                     No
OPN   Relative spatial extent to body   No                                     Elbow swivel angle
TEN   Muscular tension                  No                                     No

   Create animation parameters
         Joint values for Nao
         BAP values for Greta
Create animation parameters
         Discretization of McNeill's (1992) gestural space
         Each symbolic position is translated into concrete values of the agent's joints (for
          instance the 6 joints of Nao in the table below)
            Code   ArmX   ArmY       ArmZ      Joint values (LShoulderPitch, LShoulderRoll, LElbowYaw, LElbowRoll, LWristYaw, Hand)

            000    XEP    YUpperEP   ZNear     (-54.4953, 22.4979, -79.0171, -5.53477, -0.00240423, 1.0)
            001    XEP    YUpperEP   ZMiddle   (-65.5696, 22.0584, -78.7534, -8.52309, -0.178188, 1.0)
            002    XEP    YUpperEP   ZFar      (-79.2807, 22.0584, -78.6655, -8.4352, -0.178188, 1.0)
            010    XEP    YUpperP    ZNear     (-21.0964, 24.2557, -79.4565, -26.8046, 0.261271, 1.0)
            ...    ...    ...        ...       ...
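
The translation table can be represented as a plain lookup keyed by the three symbolic coordinates. The sketch below holds only the four rows shown on the slide; the dictionary name and the `joint_values` helper are hypothetical, and the remaining rows are left out rather than guessed.

```python
# Joint order: LShoulderPitch, LShoulderRoll, LElbowYaw, LElbowRoll, LWristYaw, Hand
NAO_LEFT_ARM = {
    ("XEP", "YUpperEP", "ZNear"):   (-54.4953, 22.4979, -79.0171, -5.53477, -0.00240423, 1.0),
    ("XEP", "YUpperEP", "ZMiddle"): (-65.5696, 22.0584, -78.7534, -8.52309, -0.178188, 1.0),
    ("XEP", "YUpperEP", "ZFar"):    (-79.2807, 22.0584, -78.6655, -8.4352, -0.178188, 1.0),
    ("XEP", "YUpperP", "ZNear"):    (-21.0964, 24.2557, -79.4565, -26.8046, 0.261271, 1.0),
    # ... remaining symbolic positions are not listed on the slide
}


def joint_values(arm_x, arm_y, arm_z):
    """Return the joint tuple for one symbolic hand position."""
    key = (arm_x, arm_y, arm_z)
    if key not in NAO_LEFT_ARM:
        raise KeyError(f"no joint values recorded for {key}")
    return NAO_LEFT_ARM[key]
```

A symbolic keyframe such as (XEP, YUpperP, ZNear) then maps directly to a tuple the robot can be driven to.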



         Translate symbolic keyframes into joint values
         Animation is obtained by interpolating between joint values
             Nao: with the robot's built-in proprietary procedures
             Greta: with Slerp (spherical linear interpolation) and time warping
              (ease-in/ease-out functions)
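
For the Greta side, Slerp combined with a time warp can be sketched as below. The quaternion layout (w, x, y, z) and the choice of a smoothstep curve as the easing function are assumptions made for the example; the slides do not say which easing functions are used.

```python
import math


def ease_in_out(t):
    """Smoothstep time warp on [0, 1]: slow start, slow end."""
    return t * t * (3.0 - 2.0 * t)


def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                 # flip one end to follow the shorter arc
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:              # nearly identical: linear blend is stable
        out = tuple(a + t * (b - a) for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        s1 = math.sin(t * theta) / math.sin(theta)
        out = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    norm = math.sqrt(sum(c * c for c in out))
    return tuple(c / norm for c in out)


def interpolate(q0, q1, t):
    """Slerp evaluated at the warped time, giving an eased rotation."""
    return slerp(q0, q1, ease_in_out(t))
```

Compared with plain Slerp, the warped version lags early in the movement and catches up toward the end, which produces the acceleration and deceleration expected of a natural gesture.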



Greta: Full Body IK
       • Torso IK
       • Analytic method: arm to torso
       • Torso target depends on hand position
Demo: Greta




Demo: Nao




Perceptive Evaluation
         Objective
          • Evaluate how robot’s gestures are perceived by human users
         Procedure
          • Participants (63 French speakers) rate videos of Nao as a
            storyteller
          • Versions displayed to the participants in random order:
          - Gestures with expressivity vs. gestures without expressivity
          - Gestures synchronized with speech vs. gestures not synchronized with speech
         Results (using the ANOVA method)
          • Synchronization:
          - F(1, 124) = 4.94, p < .05
          - 76% agreed that gestures were synchronized with speech for sync version
          • Expressivity:
          - F(1, 124) = 4.43, p < .05
          - 70% agreed that gestures were expressive for expressivity version
State of the art
         Most similar work: Salem et al. (2012)
          • Same idea (based on existing Max virtual agent system)
         Main differences:
          • Our system: re-designed GRETA as a common framework
          • Salem et al.’s system: adjusted Max’s ACE to ASIMO robot

 Features             Our model                                   Salem et al.’s system
 Gesture production   Online from templates, regardless           Automatically generated from a trained,
                      of specific domain                          domain-specific data corpus
 Gesture shapes       Agent-specific parameter                    Original for Max, mapped to ASIMO configurations
 Gesture timing       Agent-specific parameter                    Original for Max, adapted to ASIMO by feedback
 Expressivity         Yes                                         No
 Synchronization      Adapt gesture to speech                     Cross-modal adjustment

Future work

       Short-term plan
        • Human-like gestures: enhance velocity profiles
        • Expressivity: implement fluidity and tension
       Long-term plan
        • Feedback mechanism
        • Study of the coherence between consecutive
          gestures in a G-Unit (Kendon, 2004)

More Related Content

Viewers also liked

فيتامين واو
فيتامين واوفيتامين واو
فيتامين واوkininaful
 
EventRegist(イベントレジスト)概要
EventRegist(イベントレジスト)概要EventRegist(イベントレジスト)概要
EventRegist(イベントレジスト)概要EventRegist Co., Ltd.
 
ACM ICMI Workshop 2012
ACM ICMI Workshop 2012ACM ICMI Workshop 2012
ACM ICMI Workshop 2012Lê Anh
 
Người Ảo
Người ẢoNgười Ảo
Người ẢoLê Anh
 
Cahier de charges
Cahier de chargesCahier de charges
Cahier de chargesLê Anh
 
Automatic vs. human question answering over multimedia meeting recordings
Automatic vs. human question answering over multimedia meeting recordingsAutomatic vs. human question answering over multimedia meeting recordings
Automatic vs. human question answering over multimedia meeting recordingsLê Anh
 
Lap trinh java hieu qua
Lap trinh java hieu quaLap trinh java hieu qua
Lap trinh java hieu quaLê Anh
 

Viewers also liked (8)

فيتامين واو
فيتامين واوفيتامين واو
فيتامين واو
 
EventRegist(イベントレジスト)概要
EventRegist(イベントレジスト)概要EventRegist(イベントレジスト)概要
EventRegist(イベントレジスト)概要
 
ACM ICMI Workshop 2012
ACM ICMI Workshop 2012ACM ICMI Workshop 2012
ACM ICMI Workshop 2012
 
Diftong
DiftongDiftong
Diftong
 
Người Ảo
Người ẢoNgười Ảo
Người Ảo
 
Cahier de charges
Cahier de chargesCahier de charges
Cahier de charges
 
Automatic vs. human question answering over multimedia meeting recordings
Automatic vs. human question answering over multimedia meeting recordingsAutomatic vs. human question answering over multimedia meeting recordings
Automatic vs. human question answering over multimedia meeting recordings
 
Lap trinh java hieu qua
Lap trinh java hieu quaLap trinh java hieu qua
Lap trinh java hieu qua
 

Similar to Common Gesture and Speech Production Framework for Virtual and Physical Agents

SiriusCon 2015 - Breathe Life into Your Designer!
SiriusCon 2015 - Breathe Life into Your Designer!SiriusCon 2015 - Breathe Life into Your Designer!
SiriusCon 2015 - Breathe Life into Your Designer!melbats
 
RSJ2011 OSS Robotics and Tools OpenHRI Intro
RSJ2011 OSS Robotics and Tools OpenHRI IntroRSJ2011 OSS Robotics and Tools OpenHRI Intro
RSJ2011 OSS Robotics and Tools OpenHRI IntroYosuke Matsusaka
 
Affective Computing and Intelligent Interaction (ACII 2011)
Affective Computing and Intelligent Interaction (ACII 2011)Affective Computing and Intelligent Interaction (ACII 2011)
Affective Computing and Intelligent Interaction (ACII 2011)Lê Anh
 
Casing3d opengl
Casing3d openglCasing3d opengl
Casing3d openglgowell
 
Florian adler minute project
Florian adler   minute projectFlorian adler   minute project
Florian adler minute projectDmitry Buzdin
 
Metadata om te creëren / Metadata to create
Metadata om te creëren / Metadata to createMetadata om te creëren / Metadata to create
Metadata om te creëren / Metadata to createvrt-medialab
 

Similar to Common Gesture and Speech Production Framework for Virtual and Physical Agents (8)

SiriusCon 2015 - Breathe Life into Your Designer!
SiriusCon 2015 - Breathe Life into Your Designer!SiriusCon 2015 - Breathe Life into Your Designer!
SiriusCon 2015 - Breathe Life into Your Designer!
 
RSJ2011 OSS Robotics and Tools OpenHRI Intro
RSJ2011 OSS Robotics and Tools OpenHRI IntroRSJ2011 OSS Robotics and Tools OpenHRI Intro
RSJ2011 OSS Robotics and Tools OpenHRI Intro
 
Affective Computing and Intelligent Interaction (ACII 2011)
Affective Computing and Intelligent Interaction (ACII 2011)Affective Computing and Intelligent Interaction (ACII 2011)
Affective Computing and Intelligent Interaction (ACII 2011)
 
Casing3d opengl
Casing3d openglCasing3d opengl
Casing3d opengl
 
Cascon2011_5_rules+owl
Cascon2011_5_rules+owlCascon2011_5_rules+owl
Cascon2011_5_rules+owl
 
Florian adler minute project
Florian adler   minute projectFlorian adler   minute project
Florian adler minute project
 
Metadata om te creëren / Metadata to create
Metadata om te creëren / Metadata to createMetadata om te creëren / Metadata to create
Metadata om te creëren / Metadata to create
 
2
22
2
 

More from Lê Anh

Spark docker
Spark dockerSpark docker
Spark dockerLê Anh
 
Presentation des outils traitements distribues
Presentation des outils traitements distribuesPresentation des outils traitements distribues
Presentation des outils traitements distribuesLê Anh
 
Final report. nguyen ngoc anh.01.07.2013
Final report. nguyen ngoc anh.01.07.2013Final report. nguyen ngoc anh.01.07.2013
Final report. nguyen ngoc anh.01.07.2013Lê Anh
 
Lequocanh
LequocanhLequocanh
LequocanhLê Anh
 
These lequocanh v7
These lequocanh v7These lequocanh v7
These lequocanh v7Lê Anh
 
Applying Computer Vision to Traffic Monitoring System in Vietnam
Applying Computer Vision to Traffic Monitoring System in Vietnam Applying Computer Vision to Traffic Monitoring System in Vietnam
Applying Computer Vision to Traffic Monitoring System in Vietnam Lê Anh
 
Poster WACAI 2012
Poster WACAI 2012Poster WACAI 2012
Poster WACAI 2012Lê Anh
 
Lecture Notes in Computer Science (LNCS)
Lecture Notes in Computer Science (LNCS)Lecture Notes in Computer Science (LNCS)
Lecture Notes in Computer Science (LNCS)Lê Anh
 
IEEE Humanoids 2011
IEEE Humanoids 2011IEEE Humanoids 2011
IEEE Humanoids 2011Lê Anh
 
ACII 2011, USA
ACII 2011, USAACII 2011, USA
ACII 2011, USALê Anh
 
Mid-term thesis report
Mid-term thesis reportMid-term thesis report
Mid-term thesis reportLê Anh
 
Journée Inter-GDR ISIS et Robotique: Interaction Homme-Robot
Journée Inter-GDR ISIS et Robotique: Interaction Homme-RobotJournée Inter-GDR ISIS et Robotique: Interaction Homme-Robot
Journée Inter-GDR ISIS et Robotique: Interaction Homme-RobotLê Anh
 

More from Lê Anh (12)

Spark docker
Spark dockerSpark docker
Spark docker
 
Presentation des outils traitements distribues
Presentation des outils traitements distribuesPresentation des outils traitements distribues
Presentation des outils traitements distribues
 
Final report. nguyen ngoc anh.01.07.2013
Final report. nguyen ngoc anh.01.07.2013Final report. nguyen ngoc anh.01.07.2013
Final report. nguyen ngoc anh.01.07.2013
 
Lequocanh
LequocanhLequocanh
Lequocanh
 
These lequocanh v7
These lequocanh v7These lequocanh v7
These lequocanh v7
 
Applying Computer Vision to Traffic Monitoring System in Vietnam
Applying Computer Vision to Traffic Monitoring System in Vietnam Applying Computer Vision to Traffic Monitoring System in Vietnam
Applying Computer Vision to Traffic Monitoring System in Vietnam
 
Poster WACAI 2012
Poster WACAI 2012Poster WACAI 2012
Poster WACAI 2012
 
Lecture Notes in Computer Science (LNCS)
Lecture Notes in Computer Science (LNCS)Lecture Notes in Computer Science (LNCS)
Lecture Notes in Computer Science (LNCS)
 
IEEE Humanoids 2011
IEEE Humanoids 2011IEEE Humanoids 2011
IEEE Humanoids 2011
 
ACII 2011, USA
ACII 2011, USAACII 2011, USA
ACII 2011, USA
 
Mid-term thesis report
Mid-term thesis reportMid-term thesis report
Mid-term thesis report
 
Journée Inter-GDR ISIS et Robotique: Interaction Homme-Robot
Journée Inter-GDR ISIS et Robotique: Interaction Homme-RobotJournée Inter-GDR ISIS et Robotique: Interaction Homme-Robot
Journée Inter-GDR ISIS et Robotique: Interaction Homme-Robot
 

Recently uploaded

Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 

Recently uploaded (20)

Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx

Common Gesture and Speech Production Framework for Virtual and Physical Agents

  • 1. A Common Gesture and Speech Production Framework for Virtual and Physical Agents Quoc Anh Le - Jing Huang - Catherine Pelachaud CNRS, LTCI Telecom-ParisTech, France Workshop on Speech and Gesture Production, ICMI 2012, Santa Monica, CA, USA
  • 2. Introduction
      Motivations
        • Similar approaches between virtual agents and humanoid robots
        • Limits of existing systems: agent dependent
      Objectives
        • Common co-verbal gesture generation framework for both virtual and physical agents
      Methodologies
        • Based on GRETA system
        • Use
           - same representation languages
           - same algorithm for selecting and planning gestures
           - different algorithms for creating the animation
  • 3. Architecture Overview
      [Diagram: input data (text, audio, video, etc.) feeds three common modules in sequence: Intent Planner, Behavior Planner, Behavior Realizer. They exchange FML-APML, BML and keyframes through an ActiveMQ central messaging system. Keyframes go to two agent-specific Animation Realizers: the Greta one outputs FAP-BAP values to the FAP-BAP Player, the Nao one outputs joint values to Nao's built-in proprietary procedures. Resources: Intent Lexicon, Behavior Lexicon, baselines and gestuaries for Nao and for Greta, and one Animation Lexicon per agent.]
  • 4. Behavior Realizer
      [Architecture diagram repeated, with the Behavior Realizer (common module) in focus: it receives BML and sends keyframes to the agent-specific Animation Realizers.]
  • 5. Behavior Realizer: Outline
      Processes common to all agents
        1. Create the gesture from the gestuary of an agent
        2. Schedule the timing of gesture phases
        3. Generate keyframes: pairs (absolute time, symbolic description of the hand configuration at this time)
      Different databases
        For Nao
          • Gestuary (for instance, pointing with a fully stretched arm)
          • Velocity profile (empirically determined from Nao)
        For Greta
          • Gestuary (for instance, pointing with one finger)
          • Velocity profile (empirically determined from real humans)
  • 6. Example: Different pointing gestures
      BML
        <bml id="bml1">
          <speech xmlns="" id="s1" start="0">
            <text>It is <sync id="tm1"/> over there! <sync id="tm2"/></text>
          </speech>
          <gesture id="g1" lexeme="pointing" start="s1:tm1" end="s1:tm2">
            <description priority="1" type="GRETA">
              <GRETA:SPC>0.80</GRETA:SPC>
              <GRETA:TMP>0.50</GRETA:TMP>
              <GRETA:FLD>-0.62</GRETA:FLD>
              <GRETA:PWR>0.30</GRETA:PWR>
              <GRETA:REP>0.00</GRETA:REP>
              <GRETA:OPE>1.00</GRETA:OPE>
              <GRETA:TEN>0.20</GRETA:TEN>
            </description>
          </gesture>
          …
        </bml>
      Nao Gestuary
        ..
        <gesture id="pointing">
          <phase type="stroke">
            <vertical>YUpperP</vertical>
            <horizontal>XEP</horizontal>
            <distance>XFar</distance>
            <hShape>OPEN</hShape>
          </phase>
        </gesture>
        …
      Greta Gestuary
        ..
        <gesture id="pointing">
          <phase type="stroke">
            <vertical>YP</vertical>
            <horizontal>XP</horizontal>
            <distance>XMiddle</distance>
            <hShape>INDEX</hShape>
          </phase>
        </gesture>
        …
      Processing steps shown on the slide
        1. The BML lexeme "pointing" is looked up in the gestuary of each agent
        2, 3. Each agent gets a sequence of keyframes 1 … N, each a (time, description) pair
        4. Keyframes become joint values for Nao and BAP for Greta
  • 7. BR: Synchronization with speech
      Algorithm
        • Compute the preparation phase
        • Do not perform the gesture if there is not enough time (strokeEnd(i-1) > strokeStart(i) + duration)
        • Add a hold phase to fit the planned gesture duration
        • Co-articulation between several gestures
           - If there is enough time, add a retraction phase (i.e. go back to the rest position)
           - Otherwise, go from the end of the stroke to the preparation phase of the next gesture
      [Timeline figure: start and end of consecutive gestures, with stroke start (S-start) and stroke end (S-end) marked]
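The scheduling rules on this slide can be sketched roughly as follows. This is a minimal illustration, not the GRETA implementation: the `Gesture` fields, the phase labels and the function name are hypothetical, and the feasibility test is written as "the preparation may not start before the previous stroke has ended", which is only one reading of the condition given on the slide.

```python
from dataclasses import dataclass


@dataclass
class Gesture:
    name: str
    stroke_start: float      # imposed by the speech sync point
    stroke_end: float        # imposed by the speech sync point
    prep_duration: float     # assumed to come from a velocity profile
    retract_duration: float  # assumed to come from a velocity profile


def schedule(gestures):
    """Return a list of (gesture name, phase, start, end) tuples."""
    # Step 1: drop gestures whose preparation does not fit in the available time.
    performed = []
    for g in sorted(gestures, key=lambda g: g.stroke_start):
        earliest = performed[-1].stroke_end if performed else 0.0
        if g.stroke_start - g.prep_duration < earliest:
            continue  # not enough time: the gesture is not performed
        performed.append(g)

    # Step 2: lay out the phases and handle co-articulation.
    phases = []
    for i, g in enumerate(performed):
        phases.append((g.name, "preparation", g.stroke_start - g.prep_duration, g.stroke_start))
        phases.append((g.name, "stroke", g.stroke_start, g.stroke_end))
        nxt = performed[i + 1] if i + 1 < len(performed) else None
        if nxt is None:
            phases.append((g.name, "retraction", g.stroke_end, g.stroke_end + g.retract_duration))
            continue
        next_prep_start = nxt.stroke_start - nxt.prep_duration
        if g.stroke_end + g.retract_duration <= next_prep_start:
            # Enough time: go back to the rest position.
            phases.append((g.name, "retraction", g.stroke_end, g.stroke_end + g.retract_duration))
        elif next_prep_start > g.stroke_end:
            # Co-articulation: hold, then move straight into the next preparation.
            phases.append((g.name, "hold", g.stroke_end, next_prep_start))
    return phases
```

In this sketch a skipped gesture simply leaves no phases in the plan, and a hold is only emitted when there is a gap between the end of a stroke and the start of the next preparation.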
  • 8. BR: Velocity profiles
      Gesture velocity
        • Predict a movement duration using Fitts' law:
        • Movement Time = a + b * log2(Distance + 1)
        • Threshold of maximal speeds (empirically determined)
        • The stroke phase differs from the other phases in velocity and acceleration (Quek, 1995)
      Add expressivity
        • Temporal extent (TMP): modulates the duration of the whole gesture by changing the coefficients of Fitts' law
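As a rough illustration of the duration model on this slide, the sketch below evaluates Fitts' law, scales its coefficients with a temporal-extent value and applies a maximal-speed threshold. The default coefficients, the speed limit, the assumed TMP range of [-1, 1] and the linear scaling rule are all placeholders chosen for the example; the slides do not give the actual values or the scaling formula.

```python
import math


def fitts_duration(distance, a=0.1, b=0.2):
    """Movement time = a + b * log2(distance + 1)."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return a + b * math.log2(distance + 1)


def gesture_duration(distance, tmp=0.0, max_speed=10.0, a=0.1, b=0.2):
    """Duration of one movement, modulated by temporal extent (TMP).

    Assumption: tmp lies in [-1, 1] and a positive value speeds the
    gesture up by shrinking both Fitts coefficients.
    """
    scale = 1.0 - 0.5 * tmp
    predicted = fitts_duration(distance, a * scale, b * scale)
    # The movement may not exceed the maximal speed of the agent.
    fastest_allowed = distance / max_speed
    return max(predicted, fastest_allowed)
```

For example, `gesture_duration(1.0, tmp=1.0)` is shorter than `gesture_duration(1.0)`, while a long, fast movement is clamped to the duration allowed by `max_speed`.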
  • 9. BR: Build coefficients of Fitts' law
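The content of this slide did not survive extraction, so the method actually used is unknown. One common way to obtain such coefficients from measured movements is an ordinary least-squares fit of movement time against log2(distance + 1); the sketch below assumes that approach, and the function name and sample format are hypothetical.

```python
import math


def fit_fitts_coefficients(samples):
    """Fit (a, b) in time = a + b * log2(distance + 1).

    samples: list of (distance, measured_movement_time) pairs.
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are needed")
    xs = [math.log2(distance + 1) for distance, _ in samples]
    ys = [time for _, time in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must cover at least two different distances")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope of the regression line
    a = mean_y - b * mean_x  # intercept of the regression line
    return a, b
```

The measurements would come from the agent whose velocity profile is being built, for instance recorded Nao arm movements or human motion data.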
  • 10. Animation Realizer
      [Architecture diagram repeated, with the agent-specific Animation Realizers in focus: each receives keyframes and outputs FAP-BAP values for Greta or joint values for Nao, using its own Animation Lexicon.]
  • 11. Implemented expressivity parameters
      EXP | Definition                       | Nao                                 | Greta
      TMP | Velocity of movement             | Change coefficient of Fitts' law    | Change coefficient of Fitts' law
      SPC | Amplitude of movement            | Limited in predefined key positions | Change gesture space scales
      PWR | Acceleration of movement         | Modulate stroke duration            | Modulate stroke acceleration
      REP | Number of stroke repetitions     | Yes                                 | Yes
      FLD | Smoothness and continuity        | No                                  | No
      OPN | Relative spatial extent to body  | No                                  | Elbow swivel angle
      TEN | Muscular tension                 | No                                  | No
      Create animation parameters
        • Joint values for Nao
        • BAP values for Greta
  • 12. Create animation parameters
      Discretization of the gestural space of McNeill (1992)
      One symbolic position is translated into concrete values of the agent's joints (for instance, 6 joints of Nao as in the table below)
      Code | ArmX | ArmY     | ArmZ    | Joint values (LShoulderPitch, LShoulderRoll, LElbowYaw, LElbowRoll, LWristYaw, Hand)
      000  | XEP  | YUpperEP | ZNear   | (-54.4953, 22.4979, -79.0171, -5.53477, -0.00240423, 1.0)
      001  | XEP  | YUpperEP | ZMiddle | (-65.5696, 22.0584, -78.7534, -8.52309, -0.178188, 1.0)
      002  | XEP  | YUpperEP | ZFar    | (-79.2807, 22.0584, -78.6655, -8.4352, -0.178188, 1.0)
      010  | XEP  | YUpperP  | ZNear   | (-21.0964, 24.2557, -79.4565, -26.8046, 0.261271, 1.0)
      ...  | ...  | ...      | ...     | ...
      Translate symbolic keyframes into joint values
      Animation is obtained by interpolating between joint values
        • for Nao, with the robot's built-in proprietary procedures
        • for Greta, with Slerp (spherical linear interpolation) and time warping (easing in-out functions)
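The translation step on this slide amounts to a table lookup. A minimal sketch follows, using only the four table rows that are visible on the slide; the dictionary layout, the keyframe format and the function name are assumptions made for the example, and the real table covers many more positions.

```python
NAO_LEFT_ARM_JOINTS = (
    "LShoulderPitch", "LShoulderRoll", "LElbowYaw",
    "LElbowRoll", "LWristYaw", "Hand",
)

# (ArmX, ArmY, ArmZ) -> joint values, copied from the rows shown on the slide.
NAO_POSITION_TABLE = {
    ("XEP", "YUpperEP", "ZNear"): (-54.4953, 22.4979, -79.0171, -5.53477, -0.00240423, 1.0),
    ("XEP", "YUpperEP", "ZMiddle"): (-65.5696, 22.0584, -78.7534, -8.52309, -0.178188, 1.0),
    ("XEP", "YUpperEP", "ZFar"): (-79.2807, 22.0584, -78.6655, -8.4352, -0.178188, 1.0),
    ("XEP", "YUpperP", "ZNear"): (-21.0964, 24.2557, -79.4565, -26.8046, 0.261271, 1.0),
}


def keyframes_to_joint_targets(keyframes, table=NAO_POSITION_TABLE):
    """Translate symbolic keyframes into per-joint targets.

    keyframes: list of (time, (arm_x, arm_y, arm_z)) pairs.
    Returns a list of (time, {joint name: value}) pairs.
    """
    targets = []
    for time, position in keyframes:
        if position not in table:
            raise KeyError(f"no joint values for symbolic position {position}")
        targets.append((time, dict(zip(NAO_LEFT_ARM_JOINTS, table[position]))))
    return targets
```

The resulting timed targets would then be handed to the agent-specific interpolation, which for Nao is done by the robot's own procedures rather than by this code.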
  • 13. Greta: Full Body IK
      Torso IK, analytic method (arm to torso): the torso target depends on the hand position
  • 16. Perceptive Evaluation
      Objective
        • Evaluate how the robot's gestures are perceived by human users
      Procedure
        • Participants (63 French speakers) rate videos of the Nao storyteller
        • Versions are displayed to the participants in random order:
           - Gestures with expressivity vs. gestures without expressivity
           - Gesture-speech synchronization vs. gesture-speech asynchronization
      Results (using the ANOVA method)
        • Synchronization:
           - F(1, 124) = 4.94, p < .05
           - 76% agreed that gestures were synchronized with speech for the synchronized version
        • Expressivity:
           - F(1, 124) = 4.43, p < .05
           - 70% agreed that gestures were expressive for the expressive version
  • 17. State of the art
      Most similar work: Salem et al. (2012)
        • Same idea (based on the existing Max virtual agent system)
      Main differences
        • Our system: re-designed GRETA as a common framework
        • Salem et al.'s system: adjusted Max's ACE to the ASIMO robot
      Features        | Our model                                         | Salem et al.'s system
      Gesture Product | Online from templates regardless specific domain | Automatically generated from trained specified domain data corpus
      Gesture Shapes  | Agent specific parameter configurations           | Original for Max and mapped to ASIMO
      Gesture Timing  | Agent specific parameter                          | Original for Max and adapted to ASIMO by feedback
      Expressivity    | Yes                                               | No
      Synchronization | Adapt gesture to speech                           | Cross-Modal Adjustment
  • 18. Future work
      Short-term plan
        • Human-like gestures: enhance velocity profiles
        • Expressivity: implement fluidity and tension
      Long-term plan
        • Feedback mechanism
        • Study of the coherence between consecutive gestures in a G-Unit (Kendon, 2004)

Editor's Notes

  1. Schedule Mechanisme Such as Account Realize Obtain /ob chen/ Architecture /ar ki tec tro/ Exchange /ex s change z/ Twice / wi so/ Table /ta ble/ Creating /cre et ting/ Message /me se/ Virtual /vir tu al/
  2. Give a description of the keyframes: what information do they contain?
  3. Add the missing definitions. "Power": acceleration simulation through slerp (frame interpolation) or trajectory interpolation, using time variation functions (easing in-out functions). Expressive Posture: Volume Editing. Power parameter: torso relative rotation varies with time and gesture target positions due to inertia. Expressive Animated Sequence: Sequential Editing. "Fluidity" and "tension": using TCB splines and noise functions (for trajectory).
  4. Joint rotation interpolation: use Slerp (spherical linear interpolation) with time warping (easing in-out functions). Definition of trajectory parameters: various trajectory paths (line, circle, spiral, etc.). Expressivity: Kochanek-Bartels splines (TCB splines).
  5. For posture generation, we use forward kinematics (FK). FK defines the initial states; IK retargets the postures. Relative torso movement is first generated by using a potential torso target that depends on the positions of both hand gestures (vt1, vl5). We decompose torso movement into horizontal and vertical movements. It depends on the center of both hand targets, and we solve it directly by an analytical method. Head direction is generated by FK, with a trigonometric function for gaze. For arm gestures we use a mass-spring solver, which can apply lightweight shoulder movements by defining the arm chain from the sternoclavicular joint to the wrist. It allows us to model passive shoulder movement.
  6. The system of Salem et al. produces gesture parameters, which can result in mistimed synchronization with the speech affiliate due to physical joint velocity limits. Max: gesture shapes are designed for the virtual agent, hence the mapping solution.
  7. Long-term plan: Mutual synchronization: Adapting phoneme duration to gestures
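The joint rotation interpolation described in note 4 can be sketched as quaternion Slerp driven by an eased time parameter. This is an illustrative sketch only: the (w, x, y, z) tuple layout, the smoothstep easing curve and the function names are assumptions, since the notes do not say which easing functions are used.

```python
import math


def ease_in_out(t):
    """Smoothstep time warp: slow start, fast middle, slow end."""
    return t * t * (3.0 - 2.0 * t)


def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # Take the shorter arc by flipping one quaternion.
        q1 = tuple(-c for c in q1)
        dot = -dot
    dot = min(1.0, dot)
    if dot > 0.9995:
        # Nearly identical rotations: fall back to normalized linear interpolation.
        q = tuple(a + t * (b - a) for a, b in zip(q0, q1))
        norm = math.sqrt(sum(c * c for c in q))
        return tuple(c / norm for c in q)
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return tuple(w0 * a + w1 * b for a, b in zip(q0, q1))


def eased_slerp(q0, q1, t):
    """Slerp with time warping: t in [0, 1] is eased before interpolating."""
    return slerp(q0, q1, ease_in_out(t))
```

Replacing `ease_in_out` with a different curve changes the perceived acceleration of the movement without changing its start and end rotations, which is how the notes describe the "Power" parameter.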