Multiple Categorization by iCub: Learning Relationships between Multiple Modalities and Words
○Akira Taniguchi*1, Tadahiro Taniguchi*1, Angelo Cangelosi*2
*1 Ritsumeikan University, Japan
*2 Plymouth University, UK
IROS Workshop on Machine Learning Methods for High-Level Cognitive Capabilities
in Robotics 2016 (ML-HLCR 2016)
Research background
• Infants can acquire word meanings by estimating the relationships between multiple situations and words.
• For example, if an infant grasps a red ball at hand, the parent may describe the infant's action and the object in a sentence.
In this case, the infant does not know the relationship between the words and the situation because the infant has not yet acquired the word meanings.
The infant cannot determine whether the word “red” indicates an action, an object, a position, or a color.
[Figure: the utterance “grasp front red ball”; the infant cannot tell what each word refers to (ball? grasp? front? red?).]
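[Figure: three situations with the utterances “grasp front red ball”, “look at red apple”, and “right red car”; in each situation the referent of every word is initially unknown (e.g., apple? red? look at? right? car?).]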
We consider that an infant can learn that the word “red” represents the color red by observing the co-occurrence of the word “red” with red objects in multiple situations.
This is called cross-situational learning [Smith et al. 2011], [Fontanari et al. 2009].
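The co-occurrence idea behind cross-situational learning can be illustrated with a small counting sketch. This is only plain counting, not the Bayesian model presented in this talk. The three sentences are taken from the slides, while the feature labels (such as "color:red") and the function names are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy situations: (spoken words, observed features).
# The sentences come from the slides; the feature labels are illustrative assumptions.
SITUATIONS = [
    (["grasp", "front", "red", "ball"],
     {"action:grasp", "position:front", "color:red", "object:ball"}),
    (["look at", "red", "apple"],
     {"action:look at", "color:red", "object:apple"}),
    (["right", "red", "car"],
     {"position:right", "color:red", "object:car"}),
]


def cooccurrence_counts(situations):
    """Count how often each word is heard while each feature is observed."""
    counts = defaultdict(Counter)
    for words, features in situations:
        for word in words:
            counts[word].update(features)
    return counts


def best_referents(counts, word):
    """Return the features that co-occur most often with the word (ties included)."""
    top = max(counts[word].values())
    return sorted(f for f, c in counts[word].items() if c == top)


counts = cooccurrence_counts(SITUATIONS)
print(best_referents(counts, "red"))   # a single candidate remains across situations
print(best_referents(counts, "ball"))  # still ambiguous after one situation
```

A word heard in only one situation, such as “ball”, stays ambiguous among all features of that situation, whereas “red” is disambiguated because only the red color recurs every time it is spoken.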
Related work
• Peniak et al. 2011
Action learning by a multiple time-scale recurrent neural network.
In our study, we perform cross-situational learning, including action learning, with a Bayesian probabilistic model.
• M. Attamimi et al. 2016
Learning word meanings and grammar by multilayered multimodal latent Dirichlet allocation (mMLDA) and a Bayesian HMM.
The relationships between words and multiple concepts are estimated by weighting the learned words according to their mutual information as post-processing.
In our study, the proposed method can estimate multiple categories and the relationships between words and modalities simultaneously.
Research purpose
Multiple categorization (action, object, color, position) and learning relationships between multiple modalities and words
[Figure: a human tutor says “grasp green front cup” to the humanoid iCub robot. The robot observes its own action information and the position, color, and visual features of the objects; which modality each word relates to is unknown (?).]
Overview of the task
1. The robot is in front of the table with objects
on it.
2. The robot selects an object, directs its visual attention to it, and performs an action on it.
– e.g., touch, reach, grasp, look at
3. The human tutor speaks a sentence about the
object and the action of the robot.
4. The robot processes the sentence to discover
the meanings of the words.
This process (steps 1-4) is carried out many times in different situations.
The robot learns word meanings and multiple categories by using visual, tactile,
and proprioceptive information, as well as words.
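As a rough sketch of what one pass through steps 1-4 might produce, the record below groups the observations of a single trial. All class and field names are hypothetical; the slides do not specify a data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ObjectObservation:
    """Visual observations for one object on the table (field names are assumptions)."""
    position: List[float]        # e.g. table coordinates
    color: List[float]           # e.g. a color histogram
    object_feature: List[float]  # e.g. a visual-feature histogram


@dataclass
class Trial:
    """One pass through steps 1-4 of the task."""
    objects: List[ObjectObservation]  # step 1: objects on the table
    attended_object: int              # step 2: index of the selected object
    action_feature: List[float]       # step 2: tactile and proprioceptive data
    words: List[str]                  # step 3: the tutor's sentence


def validate(trial: Trial) -> None:
    """Basic consistency checks before a trial is added to the training set."""
    if not trial.objects:
        raise ValueError("a trial needs at least one object")
    if not 0 <= trial.attended_object < len(trial.objects):
        raise ValueError("attended_object must index into objects")
    if not trial.words:
        raise ValueError("a trial needs a sentence")


# Hypothetical values for illustration only.
trial = Trial(
    objects=[ObjectObservation([0.2, 0.5], [0.1, 0.8, 0.1], [0.3, 0.7])],
    attended_object=0,
    action_feature=[0.4, 0.9],
    words=["grasp", "front", "green", "cup"],
)
validate(trial)  # raises ValueError if the record is inconsistent
```

Learning (step 4) would then operate on a list of such records collected over many situations.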
The proposed method
Multiple categorizations and word meaning learning
• The categorization for each modality is represented by a Gaussian mixture model (GMM).
• F_d is the modality related to each word.
• A_d is the selected object on the table (the object of attention).
[Figure: graphical model of the proposed method, consisting of a GMM for each modality (action, position, color, object feature), the word distribution, the selection of an object A_d, and the selection of the modality F_d.]
Example: for the sentence “grasp front green cup” (words W1 to W4), F_d = (a, p, c, o) and A_d = object1.
a: action, p: position, c: color, o: object feature
M: the number of objects, D: the number of data
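To make the role of a GMM as a categorizer concrete, the sketch below computes the posterior probability of each category for one observation. It assumes diagonal covariances and fixed, hand-picked parameters purely for illustration; the slides do not state the covariance structure, and in the proposed method the parameters are learned rather than given.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance (a simplifying assumption)."""
    density = 1.0
    for xi, mi, vi in zip(x, mean, var):
        density *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2.0 * math.pi * vi)
    return density


def responsibilities(x, weights, means, variances):
    """Posterior probability of each category k given the observation x."""
    joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-category position GMM on a normalized 2-D table.
weights = [0.5, 0.5]
means = [[0.2, 0.2], [0.8, 0.8]]
variances = [[0.01, 0.01], [0.01, 0.01]]

post = responsibilities([0.25, 0.15], weights, means, variances)
print(post)  # the first category dominates for a point near its mean
```

The same computation would apply to each modality (action, position, color, object feature) with its own mixture.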
The proposed method
Generative model
[Figure: graphical model of the proposed method, with the word distribution, the selection of an object, and the selection of the modality labeled.]
L = K^a + K^p + K^o + K^c, where K^a, K^p, K^o, and K^c are the numbers of categories for action, position, object feature, and color.
In equation (1), we assume that a word related to each modality is spoken only once in each sentence.
✔ F_d = (a, p, c, o), ✖ F_d = (o, o, o, o)
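The once-per-sentence assumption restricts F_d to orderings of the four modalities. A minimal sketch, assuming four-word sentences and hypothetical function names, enumerates the candidates and checks the two examples from the slide:

```python
from itertools import permutations

MODALITIES = ("a", "p", "c", "o")  # action, position, color, object feature


def valid_modality_assignments(modalities=MODALITIES):
    """All orderings in which each modality is used exactly once."""
    return set(permutations(modalities))


def is_valid(f_d, modalities=MODALITIES):
    """True if f_d uses every modality exactly once."""
    return sorted(f_d) == sorted(modalities)


candidates = valid_modality_assignments()
print(len(candidates))                 # 4! orderings
print(is_valid(("a", "p", "c", "o")))  # accepted
print(is_valid(("o", "o", "o", "o")))  # rejected
```

With this constraint, the search over F_d for a four-word sentence covers 24 orderings instead of 4^4 = 256 unconstrained assignments.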
Simulator experiment
The procedure for getting and processing data
Action
• Looking at an object of attention
• Reaching for an object
• Grasping with a random degree
Getting action information
• Posture
• Tactile information
• Relative coordinates to the object from the hand
Getting visual information
• Area detection of objects (background subtraction)
• Object feature (SIFT)
• Color (RGB histogram)
• Position (homography)
• k-means & normalization
ID : x, y
1: -0.351, -0.175
2: -0.348, 0.184
3: -0.291, 0.007
Word information: “grasp right green ball” (object of attention: 2)
[Figure: histograms with horizontal axes numbered 1 to 10]
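The “k-means & normalization” step can be sketched as turning a set of local descriptors into a fixed-length histogram. The sketch assumes the centroids were already obtained by k-means and uses tiny 2-D toy vectors in place of real descriptors. Scaling the histogram to sum to 1 is also an assumption, since the slide does not say which normalization is applied at this step.

```python
import math


def nearest_centroid(feature, centroids):
    """Index of the centroid closest to the feature (Euclidean distance)."""
    distances = [math.dist(feature, c) for c in centroids]
    return distances.index(min(distances))


def normalized_histogram(features, centroids):
    """Histogram of nearest-centroid assignments, scaled to sum to 1."""
    counts = [0] * len(centroids)
    for feature in features:
        counts[nearest_centroid(feature, centroids)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(centroids)


# Hypothetical centroids, assumed to come from an earlier k-means run.
centroids = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
# Hypothetical local descriptors extracted from one object region.
features = [[0.1, 0.0], [0.9, 1.1], [0.2, 0.1], [0.1, 0.9]]

hist = normalized_histogram(features, centroids)
print(hist)
```

The resulting vector has one entry per centroid regardless of how many descriptors an object region yields, which is what lets objects of different sizes be compared.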
Simulator experiment
Condition
• 5 categories for each modality
• Normalization to [0,1] for each dimension of data
• Objects: “box”, “ball”, “cup”
• The number of action trials: 20
• The number of objects on the table: 1 to 3
• The number of words for each trial: 4
• The word order for each category was F_d = (a, p, c, o) in all of the sentences.
• The number of kinds of words: 14
– Action: “reach”, “touch”, “grasp”, “look at”
– Position: “front”, “left”, “right”, “far”
– Color: “green”, “red”, “blue”
– Object: “box”, “cup”, “ball”
Examples of teaching sentences (20 trials in total)
action    position  color   object
reach     front     green   box
touch     right     green   cup
look at   right     blue    box
reach     front     blue    ball
grasp     far       red     box
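The per-dimension normalization to [0,1] listed in the conditions can be sketched as follows, assuming min-max scaling (one common choice; the slides do not name the method). The input reuses the three example positions shown on the data-processing slide.

```python
def min_max_normalize(rows):
    """Scale each column (dimension) of rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return normalized


# Example positions (x, y) of objects 1-3 from the slides.
positions = [(-0.351, -0.175), (-0.348, 0.184), (-0.291, 0.007)]
scaled = min_max_normalize(positions)
for row in scaled:
    print(row)
```

Scaling every dimension to the same range keeps one modality's raw units from dominating the Gaussian components of another.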
Experimental results
Word probability distributions 𝜃𝑙 (Multinomial distribution)
[Figure: heat map of the word probability distributions. Columns (words): touch, grasp, look at, reach, far, left, front, right, box, ball, cup, green, red, blue. Rows (categories): a0 to a4, p0 to p4, o0 to o4, c0 to c4.]
Higher probability values are represented by darker shades.
a: action, p: position, o: object feature, c: color
The results show that the proposed method was able to associate each word with its corresponding modality (shown in thick-bordered boxes).
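One way to read such word distributions programmatically is to look up, for each word, the category whose distribution assigns it the highest probability. The sketch below uses made-up toy probabilities, not the experimental results, and a hypothetical dictionary layout in which category names start with the modality letter, following the row labels of the slide.

```python
def modality_of_word(theta, word):
    """Return (category, modality) whose word distribution favors the word most."""
    best = max(theta, key=lambda category: theta[category].get(word, 0.0))
    return best, best[0]  # first letter of the category name is the modality


# Toy word distributions (each row sums to 1); values are illustrative only.
theta = {
    "a0": {"grasp": 0.7, "front": 0.1, "red": 0.1, "cup": 0.1},
    "p0": {"grasp": 0.1, "front": 0.7, "red": 0.1, "cup": 0.1},
    "c0": {"grasp": 0.1, "front": 0.1, "red": 0.7, "cup": 0.1},
    "o0": {"grasp": 0.1, "front": 0.1, "red": 0.1, "cup": 0.7},
}

for word in ["grasp", "front", "red", "cup"]:
    print(word, modality_of_word(theta, word))
```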
Experimental results
Position, object, and color category
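Part of the example of categorization results
[Figure: position categories (p0, p1, p2, and p4 labeled), with the word probabilities of “far”, “left”, “front”, and “right” for p0 to p4.]
[Figure: object categories, with the word probabilities of “box”, “ball”, and “cup” for o0 to o4.]
[Figure: color categories, with the word probabilities of “green”, “red”, and “blue” for c0 to c4.]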
Conclusions
• We have proposed a Bayesian probabilistic model that can learn
multiple categories and the relationships between words and
multiple modalities.
• The experimental results showed that the robot can categorize each modality and estimate the modality related to each word in complex situations.
Future directions
• Experiments using a real iCub
• Learning from uncertain spoken sentences
– Varying the number and order of words
• Action generation task, description task
THANK YOU FOR YOUR KIND ATTENTION.

More Related Content

Similar to Multiple Categorization by iCub: Learning Relationships between Multiple Modalities and Words

Social Media and Online Collaboration ToolsBusiness In.docx
Social Media and Online Collaboration ToolsBusiness In.docxSocial Media and Online Collaboration ToolsBusiness In.docx
Social Media and Online Collaboration ToolsBusiness In.docx
whitneyleman54422
 
Plan ceibal and english immersion program
Plan ceibal and english immersion programPlan ceibal and english immersion program
Plan ceibal and english immersion program
Graciela Bilat
 
Using Scotland's Curriculum for Excellence for Effective Learning
Using Scotland's Curriculum for Excellence for Effective LearningUsing Scotland's Curriculum for Excellence for Effective Learning
Using Scotland's Curriculum for Excellence for Effective Learning
Learning and Teaching Scotland
 
Project-Based Learning for Theory of Knowledge (TOK)
Project-Based Learning for Theory of Knowledge (TOK)Project-Based Learning for Theory of Knowledge (TOK)
Project-Based Learning for Theory of Knowledge (TOK)
jenniferjoelle
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
talha ijaz
 
Daily lesson log in science grade 4 quarter 3
Daily lesson log  in science grade 4 quarter 3Daily lesson log  in science grade 4 quarter 3
Daily lesson log in science grade 4 quarter 3
OmarGalacasSanday
 
CHILD PSYCHOLOGYFall 2019Project OptionsProject Due Date Apr
CHILD PSYCHOLOGYFall 2019Project OptionsProject Due Date  AprCHILD PSYCHOLOGYFall 2019Project OptionsProject Due Date  Apr
CHILD PSYCHOLOGYFall 2019Project OptionsProject Due Date Apr
JinElias52
 
Communicative skills
Communicative skillsCommunicative skills
Communicative skills
KuanyshZHappar
 
100 LET QUESTIONS WITH ANSWER AND RATIONALIZATION.docx
100 LET QUESTIONS WITH ANSWER AND RATIONALIZATION.docx100 LET QUESTIONS WITH ANSWER AND RATIONALIZATION.docx
100 LET QUESTIONS WITH ANSWER AND RATIONALIZATION.docx
xeinyenmoon
 
Gecs talk on assessment learning by design
Gecs talk on assessment learning by designGecs talk on assessment learning by design
Gecs talk on assessment learning by design
Brock Dubbels
 
IALLT April 2017 Presentation - Networked Language Learning
IALLT April 2017 Presentation - Networked Language LearningIALLT April 2017 Presentation - Networked Language Learning
IALLT April 2017 Presentation - Networked Language Learning
Apostolos Koutropoulos
 
Ela look fors
Ela look forsEla look fors
Ela look fors
Jennifer Evans
 
Present kuantan
Present kuantanPresent kuantan
Present kuantan
Laili Farhana M.I.
 
Designs for Active Learning, Cambridge 2017
Designs for Active Learning, Cambridge 2017Designs for Active Learning, Cambridge 2017
Designs for Active Learning, Cambridge 2017
Mike Sharples
 
Covington ElementaryAshley CovingtonProfessor Lori Infants.docx
Covington ElementaryAshley CovingtonProfessor Lori Infants.docxCovington ElementaryAshley CovingtonProfessor Lori Infants.docx
Covington ElementaryAshley CovingtonProfessor Lori Infants.docx
faithxdunce63732
 
Augmented Reality in Education: Present Accomplishments, Future Visions
Augmented Reality in Education: Present Accomplishments, Future VisionsAugmented Reality in Education: Present Accomplishments, Future Visions
Augmented Reality in Education: Present Accomplishments, Future Visions
Julie Evans
 
The Phoenix Firestorm Project: Virtual Worlds, Jokaydia Grid and Second Life;...
The Phoenix Firestorm Project: Virtual Worlds, Jokaydia Grid and Second Life;...The Phoenix Firestorm Project: Virtual Worlds, Jokaydia Grid and Second Life;...
The Phoenix Firestorm Project: Virtual Worlds, Jokaydia Grid and Second Life;...
Touro College
 
Designing Social Interactions in a Teachable Agent
Designing Social Interactions in a Teachable AgentDesigning Social Interactions in a Teachable Agent
Designing Social Interactions in a Teachable Agent
diannepatricia
 
Wingate article critique summary
Wingate article critique summaryWingate article critique summary
Wingate article critique summary
Nicole Wingate
 
Artificial Thinking: can machines reason with analogies?
Artificial Thinking:  can machines reason with analogies? Artificial Thinking:  can machines reason with analogies?
Artificial Thinking: can machines reason with analogies?
Federico Bianchi
 

Similar to Multiple Categorization by iCub: Learning Relationships between Multiple Modalities and Words (20)

Social Media and Online Collaboration ToolsBusiness In.docx
Social Media and Online Collaboration ToolsBusiness In.docxSocial Media and Online Collaboration ToolsBusiness In.docx
Social Media and Online Collaboration ToolsBusiness In.docx
 
Plan ceibal and english immersion program
Plan ceibal and english immersion programPlan ceibal and english immersion program
Plan ceibal and english immersion program
 
Using Scotland's Curriculum for Excellence for Effective Learning
Using Scotland's Curriculum for Excellence for Effective LearningUsing Scotland's Curriculum for Excellence for Effective Learning
Using Scotland's Curriculum for Excellence for Effective Learning
 
Project-Based Learning for Theory of Knowledge (TOK)
Project-Based Learning for Theory of Knowledge (TOK)Project-Based Learning for Theory of Knowledge (TOK)
Project-Based Learning for Theory of Knowledge (TOK)
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Daily lesson log in science grade 4 quarter 3
Daily lesson log  in science grade 4 quarter 3Daily lesson log  in science grade 4 quarter 3
Daily lesson log in science grade 4 quarter 3
 
CHILD PSYCHOLOGYFall 2019Project OptionsProject Due Date Apr
CHILD PSYCHOLOGYFall 2019Project OptionsProject Due Date  AprCHILD PSYCHOLOGYFall 2019Project OptionsProject Due Date  Apr
CHILD PSYCHOLOGYFall 2019Project OptionsProject Due Date Apr
 
Communicative skills
Communicative skillsCommunicative skills
Communicative skills
 
100 LET QUESTIONS WITH ANSWER AND RATIONALIZATION.docx
100 LET QUESTIONS WITH ANSWER AND RATIONALIZATION.docx100 LET QUESTIONS WITH ANSWER AND RATIONALIZATION.docx
100 LET QUESTIONS WITH ANSWER AND RATIONALIZATION.docx
 
Gecs talk on assessment learning by design
Gecs talk on assessment learning by designGecs talk on assessment learning by design
Gecs talk on assessment learning by design
 
IALLT April 2017 Presentation - Networked Language Learning
IALLT April 2017 Presentation - Networked Language LearningIALLT April 2017 Presentation - Networked Language Learning
IALLT April 2017 Presentation - Networked Language Learning
 
Ela look fors
Ela look forsEla look fors
Ela look fors
 
Present kuantan
Present kuantanPresent kuantan
Present kuantan
 
Designs for Active Learning, Cambridge 2017
Designs for Active Learning, Cambridge 2017Designs for Active Learning, Cambridge 2017
Designs for Active Learning, Cambridge 2017
 
Covington ElementaryAshley CovingtonProfessor Lori Infants.docx
Covington ElementaryAshley CovingtonProfessor Lori Infants.docxCovington ElementaryAshley CovingtonProfessor Lori Infants.docx
Covington ElementaryAshley CovingtonProfessor Lori Infants.docx
 
Augmented Reality in Education: Present Accomplishments, Future Visions
Augmented Reality in Education: Present Accomplishments, Future VisionsAugmented Reality in Education: Present Accomplishments, Future Visions
Augmented Reality in Education: Present Accomplishments, Future Visions
 
The Phoenix Firestorm Project: Virtual Worlds, Jokaydia Grid and Second Life;...
The Phoenix Firestorm Project: Virtual Worlds, Jokaydia Grid and Second Life;...The Phoenix Firestorm Project: Virtual Worlds, Jokaydia Grid and Second Life;...
The Phoenix Firestorm Project: Virtual Worlds, Jokaydia Grid and Second Life;...
 
Designing Social Interactions in a Teachable Agent
Designing Social Interactions in a Teachable AgentDesigning Social Interactions in a Teachable Agent
Designing Social Interactions in a Teachable Agent
 
Wingate article critique summary
Wingate article critique summaryWingate article critique summary
Wingate article critique summary
 
Artificial Thinking: can machines reason with analogies?
Artificial Thinking:  can machines reason with analogies? Artificial Thinking:  can machines reason with analogies?
Artificial Thinking: can machines reason with analogies?
 

More from Akira Taniguchi

第8回Language and Robotics研究会20221010_AkiraTaniguchi
第8回Language and Robotics研究会20221010_AkiraTaniguchi第8回Language and Robotics研究会20221010_AkiraTaniguchi
第8回Language and Robotics研究会20221010_AkiraTaniguchi
Akira Taniguchi
 
[IROS2017] Online Spatial Concept and Lexical Acquisition with Simultaneous L...
[IROS2017] Online Spatial Concept and Lexical Acquisition with Simultaneous L...[IROS2017] Online Spatial Concept and Lexical Acquisition with Simultaneous L...
[IROS2017] Online Spatial Concept and Lexical Acquisition with Simultaneous L...
Akira Taniguchi
 
論文紹介 LexToMap: lexical-based topological mapping
論文紹介 LexToMap: lexical-based topological mapping論文紹介 LexToMap: lexical-based topological mapping
論文紹介 LexToMap: lexical-based topological mapping
Akira Taniguchi
 
論文紹介 Semantic Mapping for Mobile Robotics Tasks: A Survey
論文紹介 Semantic Mapping for Mobile Robotics Tasks: A Survey論文紹介 Semantic Mapping for Mobile Robotics Tasks: A Survey
論文紹介 Semantic Mapping for Mobile Robotics Tasks: A Survey
Akira Taniguchi
 
論文紹介 Grounded Spatial Symbols for Task Planning Based on Experience
論文紹介 Grounded Spatial Symbols for Task Planning Based on Experience論文紹介 Grounded Spatial Symbols for Task Planning Based on Experience
論文紹介 Grounded Spatial Symbols for Task Planning Based on Experience
Akira Taniguchi
 
論文紹介  Communication between Lingodroids with different cognitive capabilities
論文紹介  Communication between Lingodroids with different cognitive capabilities論文紹介  Communication between Lingodroids with different cognitive capabilities
論文紹介  Communication between Lingodroids with different cognitive capabilities
Akira Taniguchi
 
Simultaneous Estimation of Self-position and Word from Noisy Utterances and S...
Simultaneous Estimation of Self-position and Word from Noisy Utterances and S...Simultaneous Estimation of Self-position and Word from Noisy Utterances and S...
Simultaneous Estimation of Self-position and Word from Noisy Utterances and S...
Akira Taniguchi
 
Simultaneous Localization, Mapping and Self-body Shape Estimation by a Mobile...
Simultaneous Localization, Mapping and Self-body Shape Estimation by a Mobile...Simultaneous Localization, Mapping and Self-body Shape Estimation by a Mobile...
Simultaneous Localization, Mapping and Self-body Shape Estimation by a Mobile...
Akira Taniguchi
 
論文紹介 A Bayesian framework for word segmentation: Exploring the effects of con...
論文紹介 A Bayesian framework for word segmentation: Exploring the effects of con...論文紹介 A Bayesian framework for word segmentation: Exploring the effects of con...
論文紹介 A Bayesian framework for word segmentation: Exploring the effects of con...
Akira Taniguchi
 
SpCoA: Nonparametric Bayesian Spatial Concept Acquisition
SpCoA: Nonparametric Bayesian Spatial Concept AcquisitionSpCoA: Nonparametric Bayesian Spatial Concept Acquisition
SpCoA: Nonparametric Bayesian Spatial Concept Acquisition
Akira Taniguchi
 

More from Akira Taniguchi (10)

第8回Language and Robotics研究会20221010_AkiraTaniguchi
第8回Language and Robotics研究会20221010_AkiraTaniguchi第8回Language and Robotics研究会20221010_AkiraTaniguchi
第8回Language and Robotics研究会20221010_AkiraTaniguchi
 
[IROS2017] Online Spatial Concept and Lexical Acquisition with Simultaneous L...
[IROS2017] Online Spatial Concept and Lexical Acquisition with Simultaneous L...[IROS2017] Online Spatial Concept and Lexical Acquisition with Simultaneous L...
[IROS2017] Online Spatial Concept and Lexical Acquisition with Simultaneous L...
 
論文紹介 LexToMap: lexical-based topological mapping
論文紹介 LexToMap: lexical-based topological mapping論文紹介 LexToMap: lexical-based topological mapping
論文紹介 LexToMap: lexical-based topological mapping
 
論文紹介 Semantic Mapping for Mobile Robotics Tasks: A Survey
論文紹介 Semantic Mapping for Mobile Robotics Tasks: A Survey論文紹介 Semantic Mapping for Mobile Robotics Tasks: A Survey
論文紹介 Semantic Mapping for Mobile Robotics Tasks: A Survey
 
論文紹介 Grounded Spatial Symbols for Task Planning Based on Experience
論文紹介 Grounded Spatial Symbols for Task Planning Based on Experience論文紹介 Grounded Spatial Symbols for Task Planning Based on Experience
論文紹介 Grounded Spatial Symbols for Task Planning Based on Experience
 
論文紹介  Communication between Lingodroids with different cognitive capabilities
論文紹介  Communication between Lingodroids with different cognitive capabilities論文紹介  Communication between Lingodroids with different cognitive capabilities
論文紹介  Communication between Lingodroids with different cognitive capabilities
 
Simultaneous Estimation of Self-position and Word from Noisy Utterances and S...
Simultaneous Estimation of Self-position and Word from Noisy Utterances and S...Simultaneous Estimation of Self-position and Word from Noisy Utterances and S...
Simultaneous Estimation of Self-position and Word from Noisy Utterances and S...
 
Simultaneous Localization, Mapping and Self-body Shape Estimation by a Mobile...
Simultaneous Localization, Mapping and Self-body Shape Estimation by a Mobile...Simultaneous Localization, Mapping and Self-body Shape Estimation by a Mobile...
Simultaneous Localization, Mapping and Self-body Shape Estimation by a Mobile...
 
論文紹介 A Bayesian framework for word segmentation: Exploring the effects of con...
論文紹介 A Bayesian framework for word segmentation: Exploring the effects of con...論文紹介 A Bayesian framework for word segmentation: Exploring the effects of con...
論文紹介 A Bayesian framework for word segmentation: Exploring the effects of con...
 
SpCoA: Nonparametric Bayesian Spatial Concept Acquisition
SpCoA: Nonparametric Bayesian Spatial Concept AcquisitionSpCoA: Nonparametric Bayesian Spatial Concept Acquisition
SpCoA: Nonparametric Bayesian Spatial Concept Acquisition
 

Recently uploaded

Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
Miro Wengner
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Precisely
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
alexjohnson7307
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
AWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptxAWS Cloud Cost Optimization Presentation.pptx
AWS Cloud Cost Optimization Presentation.pptx
HarisZaheer8
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 

Recently uploaded (20)

Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
 

Multiple Categorization by iCub: Learning Relationships between Multiple Modalities and Words

  • 1. Multiple Categorization by iCub: Learning Relationships between Multiple Modalities and Words. ○Akira Taniguchi*1, Tadahiro Taniguchi*1, Angelo Cangelosi*2. *1 Ritsumeikan University, Japan; *2 Plymouth University, UK. IROS Workshop on Machine Learning Methods for High-Level Cognitive Capabilities in Robotics 2016 (ML-HLCR 2016).
  • 2. Research background. Infants can acquire word meanings by estimating the relationships between multiple situations and words. For example, if an infant grasps a red ball at hand, the parent may describe the infant's action and the object in a sentence such as “grasp front red ball”. In this case, the infant does not know the relationship between the words and the situation, because the infant has not yet acquired the word meanings. The infant cannot determine whether the word “red” indicates an action, an object, a position, or a color.
  • 3. Research background (continued). Infants can acquire word meanings by estimating the relationships between multiple situations and words. Example situations and sentences: “grasp front red ball”, “look at red apple”, “right red car”. We consider that an infant can learn that the word “red” represents the color red by observing the co-occurrence of the word “red” with red objects across multiple situations. This is called cross-situational learning [Smith et al. 2011], [Fontanari et al. 2009].
  • 4. Related work. Peniak et al. 2011: action learning by a multiple time-scale recurrent neural network. In our study, we perform cross-situational learning, including action learning, with a Bayesian probabilistic model. M. Attamimi et al. 2016: learning word meanings and grammar by multilayered multimodal latent Dirichlet allocation (mMLDA) and a Bayesian HMM; the relationships between words and multiple concepts are estimated by weighting the learned words according to their mutual information as post-processing. In our study, the proposed method can estimate multiple categories and the relationships between words and modalities simultaneously.
  • 5. Research purpose. Multiple categorization (action, object, color, position) and learning relationships between multiple modalities and words. [Figure: a human tutor says “grasp green front cup” to the humanoid iCub robot, which has access to the position of objects, the color of objects, the visual features of objects, and its own action information.]
  • 6. Overview of the task. 1. The robot is in front of a table with objects on it. 2. The robot selects an object, directs visual attention to it, and performs an action on it (e.g., touch, reach, grasp, look at). 3. The human tutor speaks a sentence about the object and the robot's action. 4. The robot processes the sentence to discover the meanings of the words. This process (steps 1-4) is carried out many times in different situations. The robot learns word meanings and multiple categories by using visual, tactile, and proprioceptive information, as well as words. (Video clip)
  • 7. The proposed method: multiple categorization and word meaning learning. The categorization for each modality is represented by a Gaussian mixture model (GMM): GMM (action), GMM (position), GMM (color), GMM (object feature). $F_d$ is a modality related to a word (selection of the modality). $A_d$ is an object on the table (selection of an object). Example: for the sentence “grasp front green cup” with words $W_1, W_2, W_3, W_4$, $F_d = (a, p, c, o)$ and $A_d = \text{object 1}$. a: action, p: position, c: color, o: object feature. $M$: the number of objects. $D$: the number of data. [Figure: graphical model with a word distribution and one GMM per modality.]
  • 8. The proposed method: generative model. The word distribution is indexed over $L = K^a + K^p + K^o + K^c$ categories, where the superscripts a, p, o, and c denote action, position, object feature, and color. In equation (1), we assume that a word related to each modality is spoken only once in each sentence: ✔ $F_d = (a, p, c, o)$, ✖ $F_d = (o, o, o, o)$. [Figure: graphical model showing the word distribution, the selection of an object $A_d$, and the selection of the modality $F_d$.]
  • 9. Simulator experiment: the procedure for getting and processing data. Actions: looking at an object of attention, reaching for an object, grasping with a random degree. Getting action information: posture, tactile information, and the coordinates of the object relative to the hand. Getting visual information: area detection of objects (background subtraction), object feature (SIFT), color (RGB histogram), and position (homography). Processing: k-means and normalization. Example object positions (ID: x, y): 1: -0.351, -0.175; 2: -0.348, 0.184; 3: -0.291, 0.007. Word information: “grasp right green ball”. Object of attention: 2.
  • 10. Simulator experiment: conditions. 5 categories for each modality. Each dimension of the data is normalized to [0,1]. Number of action trials: 20. Number of objects on the table: 1 to 3. Number of words per trial: 4. The word order was $F_d = (a, p, c, o)$ in all of the sentences. Vocabulary: 14 words. Action: “reach”, “touch”, “grasp”, “look at”. Position: “front”, “left”, “right”, “far”. Color: “green”, “red”, “blue”. Object: “box”, “cup”, “ball”. Examples of teaching sentences (action, position, color, object) from the 20 trials: “reach front green box”, “touch right green cup”, “look at right blue box”, “reach front blue ball”, “grasp far red box”. [Figure: the objects “box”, “ball”, and “cup”, with the example utterance “grasp front green cup”.]
  • 11. Experimental results: word probability distributions $\theta_l$ (multinomial distributions). [Figure: word probabilities for the words touch, grasp, look at, reach, far, left, front, right, box, ball, cup, green, red, blue over the category indices a0 to a4, p0 to p4, o0 to o4, c0 to c4. Higher probability values are represented by darker shades.] a: action, p: position, o: object feature, c: color. The results show that the proposed method was able to associate each word with its corresponding modality (shown in thick-bordered boxes).
  • 12. Experimental results: position, object, and color categories. [Figure: part of the categorization results. Position category: far, left, front, right over p0 to p4. Object category: box, ball, cup over o0 to o4. Color category: green, red, blue over c0 to c4.]
  • 13. Conclusions. We have proposed a Bayesian probabilistic model that can learn multiple categories and the relationships between words and multiple modalities. The experimental results showed that the robot can categorize each modality and estimate the modality related to each word in complex situations. Future directions: experiments using a real iCub; learning from uncertain spoken sentences (changing the number of words and the word order); action generation task and description task. Thank you for your kind attention.
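
The once-per-sentence assumption on slide 8 and the category counts on slide 10 can be illustrated with a small Python sketch. This is not the authors' inference procedure: it simply enumerates every modality assignment F_d in which each modality appears exactly once and picks the one with the highest word likelihood. The `theta` values, the `categories` indices, and the function names are hypothetical toy choices made only for illustration.

```python
from itertools import permutations

MODALITIES = ("a", "p", "c", "o")  # action, position, color, object feature

# Slide 10 uses 5 categories per modality, so L = K^a + K^p + K^o + K^c = 20.
K = {m: 5 for m in MODALITIES}
L = sum(K.values())


def valid_assignments():
    """All F_d in which each modality is used exactly once per sentence."""
    return list(permutations(MODALITIES))


def best_assignment(words, categories, theta, eps=1e-6):
    """Return the F_d that maximizes the product of word probabilities.

    words:      the four words of one sentence, in spoken order
    categories: modality -> category index assumed for the current scene
    theta:      (modality, category index) -> {word: probability}
                (hypothetical toy values, not learned parameters)
    eps:        small probability for words missing from a distribution
    """
    def score(f_d):
        p = 1.0
        for word, modality in zip(words, f_d):
            p *= theta[(modality, categories[modality])].get(word, eps)
        return p

    return max(valid_assignments(), key=score)


# Hypothetical toy word distributions for one scene.
theta = {
    ("a", 0): {"grasp": 0.9},
    ("p", 1): {"front": 0.9},
    ("c", 2): {"green": 0.9},
    ("o", 3): {"cup": 0.9},
}
categories = {"a": 0, "p": 1, "c": 2, "o": 3}

f_d = best_assignment(["grasp", "front", "green", "cup"], categories, theta)
print(L, len(valid_assignments()), f_d)
```

Restricting F_d to permutations leaves 24 candidate assignments for a four-word sentence and rules out cases such as (o, o, o, o). A full Bayesian treatment would sample F_d jointly with the categories rather than take a single maximum.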