Using Communication to Reduce Locality in Multi-Robot
                     Learning
                         By: Maja J. Mataric

                         Presentation By:
                         Derak Berreyesa

                         UNR, CS, 11/17/04
   Attempt to bridge the fields of machine
    learning, robotics and distributed AI.



 Deals with two key problems:
 Hidden state
 Credit assignment
Hidden State

   Situated agents typically can’t sense all the
    information needed to complete the task and
    to learn to perform it efficiently.
Credit Assignment

   Arises because reinforcement in a
    distributed system is often provided at a global
    level, and must somehow be divided over
    multiple agents whose impact differs and
    varies over time.
Solving the problems

   Apply communication as sensing and as
    reinforcement, in each case through local
    undirected broadcast.

   Demonstrated the idea on two multi-robot
    learning experiments.
Two robots

   A tightly-coupled coordination task (box
    pushing.)
   Communication for sharing sensory data to
    overcome hidden state.
   Communication for sharing reinforcement data
    to overcome the credit assignment problem.
Four Robots

   Loosely-coupled task, learning social rules,
    (yielding and sharing information.)
   Uses communication to bridge the gap
    between global and local payoff.
In both cases

   The main goal is to increase the scope of
    impact of a single agent.
   Clusters agents when they are tightly
    interacting.
   Has the effect of making the system less
    distributed and alleviates the hidden state and
    credit assignment problems.
Communication as Sensing
   Sensors are inaccurate and unreliable.
   Interaction between agents is very important.
   Communication can be used as a form of
    sensing.
   Things that are hard to sense can be
    communicated.
   Agents that broadcast their state learn better.
   There is still inaccuracy with sending the
    messages.
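
A minimal sketch of communication used as a form of sensing, assuming each robot keys its learning on its own reading plus whatever its peer last broadcast. The names `pooled_state` and `UNKNOWN` and the reading labels are hypothetical, chosen for illustration; they are not taken from the paper.

```python
UNKNOWN = "unknown"  # assumed placeholder for a lost or missing peer message


def pooled_state(own_reading, peer_message):
    """Combine local sensing with the peer's broadcast into one state key.

    A lost message (None) is kept as an explicit UNKNOWN value, so the
    learner can still act when communication is unreliable.
    """
    peer_reading = UNKNOWN if peer_message is None else peer_message
    return (own_reading, peer_reading)


# Both readings available: the state covers what neither robot senses alone.
print(pooled_state("light-left", "light-right"))
# Message lost: the robot falls back to its own reading.
print(pooled_state("light-left", None))
```

Treating a lost message as its own state value is one possible design choice; it reflects the point that messages are themselves an imperfect sensor.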
Communication as Reinforcement

   It is hard for multi-agent systems to achieve
    group-level coherence.
   Central controller maintains optimizations over
    state space and sends commands to the
    group.
   The required information is usually not available,
    and the computation can’t be completed in real time.
   Communication poses a bottleneck.
Reinforcement (cont.)

   As multi-agent systems learn, their behavior
    changes, resulting in inconsistencies.
   The credit assignment problem arises at the level
    of the individual because interaction with the
    other agents delays the agent’s payoff.
   It arises at the group level because local
    individual behavior must be associated with
    global outcomes.
Reinforcement (cont.) again.

   Communication as reinforcement enables
    agents to locally share a reward in order to
    overcome the credit assignment problem.
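
One possible reading of locally shared reward, sketched under stated assumptions: a robot that receives reinforcement broadcasts it, and every robot within radio range records the same value. The `Agent` class, the 1-D positions, and the `radius` parameter are hypothetical illustration devices, not details from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    position: float  # 1-D position, for illustration only
    received: list = field(default_factory=list)


def broadcast_reward(sender, reward, agents, radius):
    """Deliver `reward` to every agent within `radius` of the sender.

    The sender is at distance 0 from itself, so it also records the reward.
    The broadcast is undirected: no receiver is addressed by name.
    """
    for agent in agents:
        if abs(agent.position - sender.position) <= radius:
            agent.received.append(reward)


a, b, c = Agent("a", 0.0), Agent("b", 1.0), Agent("c", 10.0)
broadcast_reward(a, 1.0, [a, b, c], radius=2.0)
print(a.received, b.received, c.received)
```

Only nearby agents share the payoff, which is what keeps the scheme local rather than global.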
Communication for Shared Sensing
   2 robots pushing a box.
   Box is too heavy for one robot to push alone.
   Six-legged robots.
   Radio communication mechanisms.
   Whiskers that detect contact with the box.
   5 sensors that detect direction and distance
    from the goal.
   Goal marked with a bright light.
Shared Sensing (cont.)

   Reinforcement learning framework
   Learning a mapping between its sensors and
    pre-programmed behaviors:
    –   Find-box
    –   Push-forward
    –   Push-left
    –   Push-right
    –   Stop
    –   Send-msg
Shared Sensing (cont.) again.

   Algorithm chose “best” action 75% of the time
    and random action 25% of the time.
   Hidden state problem was solved by having the
    two agents pool their sensory resources.
   Credit assignment was solved by each agent
    telling the other what action to perform,
    observing the outcome, and sharing the reward
    or punishment.
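
The 75% / 25% selection rule above can be sketched as follows. This is a minimal illustration, assuming a lookup table of learned values keyed by (state, behavior); the table, the state label, and the function name `choose_action` are hypothetical. Only the behavior names and the 75/25 split come from the slides.

```python
import random

BEHAVIORS = ["find-box", "push-forward", "push-left",
             "push-right", "stop", "send-msg"]


def choose_action(values, state, rng, p_best=0.75):
    """Pick the highest-valued behavior with probability p_best,
    otherwise a uniformly random behavior (which may also be the best one)."""
    if rng.random() < p_best:
        return max(BEHAVIORS, key=lambda b: values.get((state, b), 0.0))
    return rng.choice(BEHAVIORS)


rng = random.Random(0)
values = {("box-ahead", "push-forward"): 1.0}  # assumed learned values
picks = [choose_action(values, "box-ahead", rng) for _ in range(1000)]
print(picks.count("push-forward") / len(picks))
```

The random branch keeps the robot exploring, so a behavior that looked poor early on can still be retried later.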
Shared Sensing (cont.) once more.

   Desired policy was learned by both robots in
    over 85% of the trials.
   It was learned on average in 7.3 minutes.
   There were about 40 trials.
   Each robot learned differently depending on
    which side it was on.
Communication for Shared Reward

   4 robots.
   2 social rules:
    –   Yielding to each other
    –   Sharing information about the location of pucks.
   Infrared and bump sensors.
   Radio communication
   Sensors don’t give information about other
    robots’ external state or behavior.
Shared Reward (cont.)

   Basis behaviors:
    –   Pickup
    –   Drop
    –   Home
    –   Wander
    –   Follow
    –   Send-msg
Shared Reward (cont.) again.

   Goal was to have each robot improve its individual
    collection of pucks while still yielding to others
    and sending messages so that robots know when to
    follow or proceed to the location.
   The “best” behavior was chosen 60% of the time and
    a random one 10%; the robot would follow 30% of the time.
   It is difficult for robots to learn social rules and
    the credit assignment problem was a big
    problem.
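
The 60% / 10% / 30% split above can be sketched as a three-way draw. This is an illustration under assumptions: the function name `choose_behavior` and the way the current best behavior is passed in are hypothetical, and only the percentages and behavior names come from the slides.

```python
import random

BASIS_BEHAVIORS = ["pickup", "drop", "home", "wander", "follow", "send-msg"]


def choose_behavior(best, rng):
    """Return `best` 60% of the time, a uniformly random basis behavior
    10% of the time, and "follow" the remaining 30% of the time."""
    r = rng.random()
    if r < 0.60:
        return best
    if r < 0.70:
        return rng.choice(BASIS_BEHAVIORS)
    return "follow"


rng = random.Random(1)
picks = [choose_behavior("home", rng) for _ in range(1000)]
print(picks.count("home") / len(picks), picks.count("follow") / len(picks))
```

Because the random branch can also return the best behavior or "follow", the observed frequencies sit slightly above 60% and 30%.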
Shared Reward (cont.) once more.
   Locally sharing information was sufficient to enable the
    group to learn social behaviors.
   The social policies of yielding and sharing were learned
    by all robots in over 90% of the trials.
   They were learned in 20 to 25 minutes on average.
   Unlike the first experiment, sensory information was
    individual and reinforcement was shared.
   This took care of the credit assignment problem.
   Radio broadcast communication proved to be robust.
   Lost data was mostly ignored, but did slow down
    learning a little.
Conclusions

   Dealt with two problems of learning in
    multi-agent environments.
   Simple communication over local broadcast
    can be used to address both problems.
   We had fun doing it!!!!
