T9. Trust and reputation in multi-agent systems


Published on

14th European Agent Systems Summer School

Published in: Education, Technology, Spiritual


  1. 1. Trust & Reputation in Multi-Agent Systems. Dr. Jordi Sabater Mir (jsabater@iiia.csic.es), Dr. Javier Carbó (jcarbo@inf.uc3m.es). EASSS 2012, Valencia, Spain
  2. 2. Dr. Jordi Sabater-Mir. IIIA – Artificial Intelligence Research Institute, CSIC – Spanish National Research Council
  3. 3. Outline • Introduction • Approaches to control the interaction • Computational reputation models: eBay, ReGreT • A cognitive perspective to computational reputation models: A cognitive view on Reputation; Repage, a computational cognitive reputation model; [Properly] Integrating a [cognitive] reputation model into a [cognitive] agent architecture; Arguing about reputation concepts
  4. 4. Trust: “A complete absence of trust would prevent [one] even getting up in the morning.” Niklas Luhmann, 1979
  5. 5. Trust. A couple of definitions that I like: “Trust begins where knowledge [certainty] ends: trust provides a basis dealing with uncertain, complex, and threatening images of the future.” (Luhmann, 1979) “Trust is the outcome of observations leading to the belief that the actions of another may be relied upon, without explicit guarantee, to achieve a goal in a risky situation.” (Elofson, 2001)
  6. 6. Trust. Epistemic: “The subjective probability by which an individual, A, expects that another individual, B, performs a given action on which its welfare depends” [Gambetta]; “An expectation about an uncertain behaviour” [Marsh]. Motivational: “The decision and the act of relying on, counting on, depending on [the trustee]” [Castelfranchi & Falcone]
  7. 7. Reputation"After death, a tiger leaves behind his skin, a man his reputation" Vietnamese proverb
  8. 8. Reputation: “What a social entity says about a target regarding his/her behavior.” A social entity is a set of individuals plus a set of social relations among these individuals or properties that identify them as a group in front of its own members and the society at large. • Reputation is always associated with a specific behaviour/property. • The social evaluation linked to the reputation is not necessarily a belief of the issuer. • Reputation cannot exist without communication.
  9. 9. What is reputation good for? • Reputation is one of the elements that allow us to build trust. • Reputation also has a social dimension. It is useful not only for the individual but also for the society, as a mechanism for social order.
  10. 10. But... why do we need computational models of those concepts?
  11. 11. What we are talking about... [figure: Mr. Yellow]
  12. 12. What we are talking about... Two years ago... Trust based on direct experiences. [figure: Mr. Yellow]
  13. 13. What we are talking about... Trust based on third party information. [figure: Mr. Pink, Mr. Yellow]
  14. 14. What we are talking about... Trust based on third party information. [figure: Mr. Green, Mr. Pink, Mr. Yellow]
  15. 15. What we are talking about... Trust based on reputation. [figure: Mr. Yellow]
  16. 16. What we are talking about... [figure: Mr. Yellow]
  17. 17. What we are talking about...?
  18. 18. Characteristics of computational trust and reputation mechanisms • Each agent is a norm enforcer and is also under surveillance by the others. No central authority is needed. • Their nature allows them to reach where laws and central authorities cannot. • Punishment is usually based on ostracism. Therefore, exclusion must be a punishment for the outsider.
  19. 19. Characteristics of computational trust and reputation mechanisms • Bootstrap problem. • Not all kinds of environments are suitable for these mechanisms. A social environment is necessary.
  20. 20. Approaches to control the interaction
  21. 21. Different approaches to control the interaction: Security approach
  22. 22. Different approaches to control the interaction • Security approach: agent identity validation; integrity and authenticity of messages; ...
  23. 23. Different approaches to control the interaction: Institutional approach, Security approach
  24. 24. Different approaches to control the interaction • Institutional approach
  25. 25. Different approaches to control the interaction: Social approach (trust and reputation mechanisms are at this level), Institutional approach, Security approach. They are complementary and cover different aspects of interaction.
  26. 26. Computational reputation models
  27. 27. Classification dimensions • Paradigm type: mathematical approach; cognitive approach • Information sources: direct experiences; witness information; sociological information; prejudice • Visibility types: subjective; global • Model’s granularity: single context; multi context • Agent behaviour assumptions: cheating is not considered; agents can hide or bias the information but they never lie • Type of exchanged information
  28. 28. Subjective vs Global • Global: The reputation is maintained as a centralized resource. All the agents in that society have access to the same reputation values. Advantages: • Reputation information is available even if you are a newcomer, and it does not depend on how well connected you are or how good your informants are. • Agents can be simpler because they don’t need to calculate reputation values, just use them. Disadvantages: • Particular mental states of the agent or its singular situation are not taken into account when reputation is calculated. Therefore, a global view is only possible when we can assume that all the agents think and behave similarly. • It is not always desirable for an agent to make public the information about its direct experiences or to submit that information to an external authority. • Therefore, high trust in the central institution managing reputation is essential.
  29. 29. Subjective vs Global • Subjective: The reputation is maintained by each agent and is calculated according to its own direct experiences, information from its contacts, its social relations... Advantages: • Reputation values can be calculated taking into account the current state of the agent and its individual particularities. Disadvantages: • The models are more complex, usually because they can use extra sources of information. • Each agent has to worry about getting the information to build reputation values. • Less information is available so the models have to be more accurate to avoid noise.
  30. 30. A global reputation model: eBay. Model oriented to support trust between buyer and seller. • Completely centralized. • Buyers and sellers may leave comments about each other after transactions. • Comment: a line of text + numeric evaluation (-1, 0, 1). • Each eBay member has a Feedback score that is the summation of the numerical evaluations.
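
A minimal sketch of the Feedback score described on this slide, assuming comments arrive as (member, evaluation) pairs. The function name `feedback_scores` and the sample members are hypothetical and only illustrate the summation; this is not eBay's actual implementation.

```python
from collections import defaultdict

def feedback_scores(comments):
    """Sum the numeric evaluations (-1, 0, 1) received by each member."""
    scores = defaultdict(int)
    for member, evaluation in comments:
        if evaluation not in (-1, 0, 1):
            raise ValueError(f"invalid evaluation: {evaluation}")
        scores[member] += evaluation
    return dict(scores)

# Hypothetical comments left after transactions (text line omitted).
comments = [("alice", 1), ("alice", 1), ("alice", -1), ("bob", 0), ("bob", 1)]
scores = feedback_scores(comments)
```

Because the score is a plain sum, a single value hides how many evaluations produced it, which is one reason the model depends on a large volume of opinions.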
  31. 31. eBay model
  32. 32. eBay model. Specifically oriented to scenarios with the following characteristics: • A lot of users (we are talking about millions) • Few chances of repeating interaction with the same partner • Easy to change identity • Human oriented • Considers reputation as a global property and uses a single value that is not dependent on the context. • A great number of opinions that “dilute” false or biased information is the only way to increase the reliability of the reputation value.
  33. 33. A subjective reputation model: ReGreT What is the ReGreT system? It is a modular trust and reputation system oriented to complex e-commerce environments where social relations among individuals play an important role.
  34. 34. The ReGreT system [diagram labels: ODB, IDB, SDB; Credibility; Witness reputation; Neighbourhood reputation; System reputation; Direct Trust; Reputation model; Trust]
  35. 35. The ReGreT system [diagram labels: ODB, IDB, SDB; Credibility; Witness reputation; Neighbourhood reputation; System reputation; Direct Trust; Reputation model; Trust]
  36. 36. Outcomes and Impressions. Outcome: the initial contract (to take a particular course of actions, or to establish the terms and conditions of a transaction) AND the actual result of the contract. Example: Contract: Price =c 2000, Quality =c A, Quantity =c 300. Fulfillment: Price =f 2000, Quality =f C, Quantity =f 295.
  37. 37. Outcomes and Impressions. Outcome: Price =c 2000, Quality =c A, Quantity =c 300, Price =f 2000, Quality =f C, Quantity =f 295. [figure: the outcome linked to offers_good_prices and maintains_agreed_quantities]
  38. 38. Outcomes and Impressions. Impression: the subjective evaluation of an outcome from a specific point of view. From the outcome (Price =c 2000, Quality =c A, Quantity =c 300, Price =f 2000, Quality =f C, Quantity =f 295) the agent derives the impressions $Imp(o, \varphi_1)$, $Imp(o, \varphi_2)$, $Imp(o, \varphi_3)$.
  39. 39. The ReGreT system [diagram labels: ODB, IDB, SDB; Credibility; Witness reputation; Neighbourhood reputation; System reputation; Direct Trust; Reputation model; Trust] Reliability of the value based on: • Number of outcomes • Deviation: the greater the variability in the rating values, the more volatile the other agent will be in the fulfillment of its agreements.
  40. 40. Direct Trust. Trust relationship calculated directly from an agent’s outcomes database. $DT_{a \to b}(\varphi) = \sum_{o_i \in ODB^{a,b}_{gr(\varphi)}} \rho(t, t_i) \cdot Imp(o_i, \varphi)$, with $\rho(t, t_i) = \frac{f(t_i, t)}{\sum_{o_j \in ODB^{a,b}_{gr(\varphi)}} f(t_j, t)}$ and $f(t_i, t) = \frac{t_i}{t}$
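
A minimal sketch of the direct trust computation on this slide: a weighted mean of impressions in which recent outcomes weigh more, using the time function f(t_i, t) = t_i / t. The function name, the (time, impression) input format and the assumption that impressions lie in [-1, 1] are illustrative choices, not taken from the slide.

```python
def direct_trust(impressions, now):
    """Weighted mean of impressions; weights are f(t_i, t) = t_i / t, normalised.

    impressions: list of (t_i, imp_i) with 0 < t_i <= now.
    The impression range [-1, 1] is an assumption of this sketch.
    """
    if not impressions:
        raise ValueError("no outcomes recorded for this agent")
    raw = [t_i / now for t_i, _ in impressions]
    total = sum(raw)
    return sum((f / total) * imp for f, (_, imp) in zip(raw, impressions))

# Hypothetical history: an old bad experience and two recent good ones.
history = [(2, -1.0), (8, 1.0), (10, 1.0)]
dt = direct_trust(history, now=10)
```

With this history the normalised weights are 0.1, 0.4 and 0.5, so the old negative outcome counts for little against the two recent positive ones.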
  41. 41. Direct Trust: DT reliability. $DTRL_{a \to b}(\varphi) = No(ODB^{a,b}_{gr(\varphi)}) \cdot (1 - Dv(ODB^{a,b}_{gr(\varphi)}))$, where No is the number-of-outcomes factor and Dv is the deviation: the greater the variability in the rating values, the more volatile the other agent will be in the fulfillment of its agreements. [plot: $No(ODB^{a,b}_{gr(\varphi)})$ for itm = 10]
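
A hedged sketch of the reliability value as the product of a number-of-outcomes factor and one minus the deviation. The slide only plots No for itm = 10 and does not spell out either function, so two assumptions are made here: No grows along a sine curve and saturates at 1 once itm outcomes are available, and Dv is the weighted mean absolute difference between each impression and the direct trust value. All function names are hypothetical.

```python
import math

def number_factor(n_outcomes, itm=10):
    # Assumed shape: rises along a sine curve, saturates at 1 when n_outcomes >= itm.
    if n_outcomes >= itm:
        return 1.0
    return math.sin(math.pi / (2 * itm) * n_outcomes)

def deviation(weighted_impressions, dt):
    # Assumed definition: weighted mean absolute difference to the direct trust value.
    # weighted_impressions: list of (weight, impression), weights summing to 1.
    return sum(w * abs(imp - dt) for w, imp in weighted_impressions)

def dt_reliability(weighted_impressions, dt, itm=10):
    no = number_factor(len(weighted_impressions), itm)
    dv = deviation(weighted_impressions, dt)
    return no * (1.0 - dv)

# Ten identical impressions: many outcomes, no variability.
stable = [(0.1, 0.8)] * 10
# Two opposite impressions: few outcomes, maximal variability.
volatile = [(0.5, 1.0), (0.5, -1.0)]
stable_rl = dt_reliability(stable, dt=0.8)
volatile_rl = dt_reliability(volatile, dt=0.0)
```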
  42. 42. The ReGreT system [diagram labels: ODB, IDB, SDB; Credibility; Witness reputation; Neighbourhood reputation; System reputation; Direct Trust; Reputation model; Trust]
  43. 43. Witness reputation. Reputation that an agent builds on another agent based on the beliefs gathered from society members (witnesses). Problems of witness information: • Can be false. • Can be incomplete. • It may suffer from the “correlated evidence” problem.
  44. 44. [figure: sociogram of the trade relation among agents a1, a2, b1, b2, c1, c2, d1, d2 and u1 to u9]
  45. 45. [figure: sociogram of the cooperation relation among the same agents] Cooperation: big exchange of sincere information and some kind of predisposition to help if it is possible.
  46. 46. [figure: sociogram of the competition relation among the same agents] Competition: agents tend to use all the available mechanisms to take some advantage from their competitors.
  47. 47. Witness reputation. Step 1: Identifying the witnesses • Initial set of witnesses: agents that have had a trade relation with the target agent. [figure: trade sociogram]
  48. 48. Witness reputation. Step 1: Identifying the witnesses • Initial set of witnesses: agents that have had a trade relation with the target agent. Grouping agents with frequent interactions among them and considering each one of these groups as a single source of reputation values: • Minimizes the correlated evidence problem. • Reduces the number of queries to agents that probably will give us more or less the same information. To group agents ReGreT relies on sociograms. [figure: trade sociogram]
  49. 49. Witness reputation. Heuristic to identify groups and the best agents to represent them: 1. Identify the components of the graph. 2. For each component, find the set of cut-points. 3. For each component that does not have any cut-point, select a central point (node with the largest degree). [figure: cooperation sociogram with a cut-point and a central point marked]
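
The three steps of the heuristic can be sketched as follows, assuming the sociogram is an undirected graph stored as a dict from agent to the set of its neighbours. The brute-force cut-point test (remove a node, recount components) is a simplification that is adequate for small sociograms; the function names and the sample graph are hypothetical.

```python
def components(graph):
    """Step 1: connected components of an undirected graph (dict: node -> set)."""
    seen, comps = set(), []
    for start in graph:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def cut_points(graph, comp):
    """Step 2: nodes whose removal disconnects the component (brute force)."""
    cuts = set()
    for candidate in comp:
        rest = comp - {candidate}
        if not rest:
            continue
        sub = {n: graph[n] & rest for n in rest}
        if len(components(sub)) > 1:
            cuts.add(candidate)
    return cuts

def representatives(graph):
    """Cut-points of each component, or (step 3) one node of largest degree."""
    reps = []
    for comp in components(graph):
        cuts = cut_points(graph, comp)
        if cuts:
            reps.append(sorted(cuts))
        else:
            reps.append([max(sorted(comp), key=lambda n: len(graph[n]))])
    return reps

# Hypothetical sociogram: a chain u1-u2-u3 and a triangle u4-u5-u6.
graph = {
    "u1": {"u2"}, "u2": {"u1", "u3"}, "u3": {"u2"},
    "u4": {"u5", "u6"}, "u5": {"u4", "u6"}, "u6": {"u4", "u5"},
}
reps = representatives(graph)
```

In the chain, u2 is the only cut-point; the triangle has none, so one node of largest degree is chosen (ties are broken alphabetically in this sketch).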
  50. 50. Witness reputation. Step 1: Identifying the witnesses • Initial set of witnesses: agents that have had a trade relation with the target agent. • Grouping and selecting the most representative witnesses. [figure: trade sociogram]
  51. 51. Witness reputation. Step 1: Identifying the witnesses • Initial set of witnesses: agents that have had a trade relation with the target agent. • Grouping and selecting the most representative witnesses. [figure: trade sociogram reduced to u2, u3, u5 and b2]
  52. 52. Witness reputation. Step 1: Identifying the witnesses. Step 2: Who can I trust? [figure: witnesses u2, u3 and u5; u2 reports $\langle Trust_{u2 \to b2}(\varphi), TrustRL_{u2 \to b2}(\varphi) \rangle$ and u5 reports $\langle Trust_{u5 \to b2}(\varphi), TrustRL_{u5 \to b2}(\varphi) \rangle$]
  53. 53. The ReGreT system [diagram labels: ODB, IDB, SDB; Credibility; Witness reputation; Neighbourhood reputation; System reputation; Direct Trust; Reputation model; Trust]
  54. 54. Credibility model. Two methods are used to evaluate the credibility of witnesses (witnessCr): social relations (socialCr) and past history (infoCr).
  55. 55. Credibility model • socialCr(a,w,b): credibility that agent a assigns to agent w when w is giving information about b, considering the social structure among w, b and itself. [figure: the possible configurations of competitive and cooperative relations among a (source agent), w (witness) and b (target agent)]
  56. 56. Credibility model. ReGreT uses fuzzy rules to calculate how the structure of social relations influences the credibility of the information. Example: IF coop(w,b) is h THEN socialCr(a,w,b) is vl. [plots: membership functions of the fuzzy sets low (l), moderate (m), high (h) and of very_low (vl), low (l), moderate (m), high (h), very_high (vh)]
  57. 57. The ReGreT system [diagram labels: ODB, IDB, SDB; Credibility; Witness reputation; Neighbourhood reputation; System reputation; Direct Trust; Reputation model; Trust]
  58. 58. Neighbourhood reputation. The trust in the agents that are in the “neighbourhood” of the target agent, together with their relation with it, are the elements used to calculate what we call the Neighbourhood reputation. ReGreT uses fuzzy rules to model this reputation: IF $DT_{a \to n_i}$(offers_good_quality) is X AND coop(b, n_i) ≥ low THEN $R^{n_i}_{a \to b}$(offers_good_quality) is X. IF $DTRL_{a \to n_i}$(offers_good_quality) is X’ AND coop(b, n_i) is Y’ THEN $RL^{n_i}_{a \to b}$(offers_good_quality) is T(X’, Y’).
  59. 59. The ReGreT system [diagram labels: ODB, IDB, SDB; Credibility; Witness reputation; Neighbourhood reputation; System reputation; Direct Trust; Reputation model; Trust]
  60. 60. System reputation. The idea behind the System reputation is to use the common knowledge about social groups and the role that the agent is playing in the society as a mechanism to assign reputation values to other agents. The knowledge necessary to calculate a system reputation is usually inherited from the group or groups to which the agent belongs.
  61. 61. Trust. If the agent has a reliable direct trust value, it will use that as a measure of trust. If that value is not so reliable then it will use reputation. [diagram labels: Witness reputation; Neighbourhood reputation; System reputation; Reputation model; Direct Trust; Trust]
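
A minimal sketch of the selection rule on this slide: use direct trust when it is reliable, otherwise fall back to reputation. The slide does not say what counts as "reliable", so the threshold of 0.5 is an assumption, as are the function name and the (value, reliability) pair format.

```python
def trust(direct, reputation, reliability_threshold=0.5):
    """Pick the trust value from (value, reliability) pairs.

    reliability_threshold is an assumed cut-off, not given in the slides.
    """
    dt_value, dt_reliability = direct
    if dt_reliability >= reliability_threshold:
        return dt_value
    rep_value, _ = reputation
    return rep_value

# Hypothetical values: reliable direct trust wins, unreliable one is ignored.
with_experience = trust(direct=(0.8, 0.9), reputation=(-0.2, 0.6))
newcomer = trust(direct=(0.8, 0.1), reputation=(-0.2, 0.6))
```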
  62. 62. A cognitive perspective to computational reputation models • A cognitive view on Reputation • Repage, a computational cognitive reputation model • [Properly] Integrating a [cognitive] reputation model into a [cognitive] agent architecture • Arguing about reputation concepts
  63. 63. Social evaluation • A social evaluation, as the name suggests, is the evaluation by a social entity of a property related to a social aspect. • Social evaluations may concern physical, mental, and social properties of targets. • A social evaluation includes at least three sets of agents: a set E of agents who share the evaluation (evaluators), a set T of evaluation targets, and a set B of beneficiaries. We can find examples where the different sets intersect totally, partially, etc. e (e in E) may evaluate t (t in T) with regard to a state of the world that is in b’s (b in B) interest, but of which b is not necessarily aware. Example: quality of TV programs during children’s timeshare
  64. 64. Image and Reputation • Both are social evaluations. • They concern other agents’ (targets’) attitudes toward socially desirable behaviour, but whereas image consists of a set of evaluative beliefs about the characteristics of a target, reputation concerns the voice that is circulating about the same target. Reputation in Artificial Societies [Rosaria Conte, Mario Paolucci]
  65. 65. Image: “An evaluative belief; it tells whether the target is good or bad with respect to a given behaviour” [Conte & Paolucci]. It is the result of an internal reasoning on different sources of information that leads the agent to create a belief about the behaviour of another agent. Beliefs: B(φ), where φ is a social evaluation. The agent has accepted φ as something true and its decisions from now on will take this into account.
  66. 66. Reputation • A voice is something that “is said”, a piece of information that is being transmitted. • Reputation: a voice about a social evaluation that is recognised by the members of a group to be circulating among them. Beliefs: B(S(φ)) • The agent believes that the social evaluation φ is communicated. • This does not imply that the agent believes that φ is true.
  67. 67. Reputation. Implications: • An agent that spreads a reputation does not necessarily believe the associated social evaluation, so it takes no responsibility for that social evaluation (the responsibility associated with the action of spreading that reputation is another matter). • This allows reputation to circulate more easily than image (less or no fear of retaliation). • Notice that if an agent believes “what people say”, image and reputation collapse. • This distinction has important advantages from a technical point of view.
  68. 68. Gossip • In order for reputation to exist, it has to be transmitted. We cannot have reputation without communication. • Gossip currently has the meaning of idle talk or rumour, especially about the personal or private affairs of others. It usually has a bad connotation, but in fact it is an essential element of human nature. • The antecedent of gossip is grooming. • Studies from evolutionary psychology have found gossip to be very important as a mechanism to spread reputation [Sommerfeld et al. 07, Dunbar 04]. • Gossip and reputation complement social norms: reputation evolves along with implicit norms to encourage socially desirable conduct, such as benevolence or altruism, and discourage socially unacceptable conduct, like cheating.
  69. 69. Outline • A cognitive view on Reputation • Repage, a computational cognitive reputation model • [Properly] Integrating a [cognitive] reputation model into a [cognitive] agent architecture • Arguing about reputation concepts
  70. 70. RepAge. What is the RepAge model? It is a reputation model evolved from a cognitive theory by Conte and Paolucci. The model is designed with special attention to the internal representation of the elements used to build images and reputations, as well as the inter-relations of these elements.
  71. 71. RepAge memory [figure: memory structure of predicates (P) leading to Img and Rep; each predicate carries a Value and a Strength, e.g. Strength: 0.6]
  72. 72. RepAge memory
  73. 73. Outline • A cognitive view on Reputation • Repage, a computational cognitive reputation model • [Properly] Integrating a [cognitive] reputation model into a [cognitive] agent architecture • Arguing about reputation concepts
  74. 74. What do you mean by “properly”? Current models: the Trust & Reputation system is a reactive black box. [diagram: Agent with Inputs, Planner, Decision mechanism, Comm and the Trust & Reputation system, whose output is marked “?”]
  75. 75. What do you mean by “properly”? Current models: the Trust & Reputation system is a reactive black box. [diagram: Agent with Inputs, Planner, Decision mechanism, Comm and the Trust & Reputation system, whose output is a Value]
  76. 76. What do you mean by “properly”? The next generation? [diagram: Agent with Inputs, Planner, Decision mechanism, Comm and the Trust & Reputation system]
  77. 77. What do you mean by “properly”? The next generation? Not only reactive... proactive. [diagram: Agent with Inputs, Planner, Decision mechanism and Comm]
  78. 78. BDI model • Very popular model in the multiagent community. • It has its origins in the theory of human practical reasoning [Bratman] and the notion of intentional systems [Dennett]. • The main idea is that we can talk about computer programs as if they have a “mental state”. • Specifically, the BDI model is based on three mental attitudes: Beliefs, what the agent thinks is true about the world; Desires, world states the agent would like to achieve; Intentions, world states the agent is putting effort into achieving.
  79. 79. BDI model • The agent is described in terms of these mental attitudes. • The decision-making model underlying the BDI model is known as practical reasoning. • In short, practical reasoning is what allows the agent to go from beliefs, desires and intentions to actions.
  80. 80. Multicontext systems • Logics: declarative languages, each with a set of axioms and a number of rules of inference. • Units: structural entities representing the main architecture components. Each unit has a single logic associated with it. • Bridge rules: rules of inference which relate formulae in different units. • Theories: sets of formulae written in the logic associated with a unit.
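  81. 81. [figure: units U1, U2 and U3 with a bridge rule whose premises are U1:b and U2:d and whose conclusion is U3:a; U2 contains d]
  82. 82. [figure: same units and bridge rule; U1 contains b and U2 contains d]
  83. 83. [figure: same units and bridge rule; U1 contains b and U2 contains d]
  84. 84. [figure: same units and bridge rule; U1 contains b, U2 contains d, and a has been added to U3]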
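
The animation in the four slides above can be sketched as a single function, assuming each unit's theory is a plain set of formula names. The function name `apply_bridge_rule` and this set-based representation are illustrative simplifications; real multicontext systems attach a logic to each unit.

```python
def apply_bridge_rule(units, premises, conclusion):
    """Fire the rule if every (unit, formula) premise holds; add the conclusion."""
    if all(formula in units[unit] for unit, formula in premises):
        target_unit, formula = conclusion
        units[target_unit].add(formula)
        return True
    return False

# State of the last slides: b is in U1 and d is in U2, U3 is still empty.
units = {"U1": {"b"}, "U2": {"d"}, "U3": set()}
fired = apply_bridge_rule(units, [("U1", "b"), ("U2", "d")], ("U3", "a"))
```

If either premise is missing, as in the first slide of the sequence where only d is present, the function returns False and U3 is left unchanged.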
  85. 85. Multicontext
  86. 86. Repage integration in a BDI architecture
  87. 87. BC-LOGIC
  88. 88. Grounding Image and Reputation to BC-Logic
  89. 89. Repage integration in a BDI architecture
  90. 90. Desire and Intention context
  91. 91. Generating Realistic Desires
  92. 92. Generating Intentions
  93. 93. Repage integration in a BDI architecture
  94. 94. Outline • A cognitive view on Reputation • Repage, a computational cognitive reputation model • [Properly] Integrating a [cognitive] reputation model into a [cognitive] agent architecture • Arguing about reputation concepts
  95. 95. Arguing about Reputation Concepts. Goal: allow agents to participate in argumentation-based dialogs regarding reputation elements in order to: - Decide on the acceptance of a communicated social evaluation based on its reliability. “Is the argument associated to a communicated social evaluation (and according to my knowledge) strong enough to consider its inclusion in the knowledge base of my reputation model?” - Help in the process of trust alignment. What we need: • A language that allows the exchange of reputation-related information. • An argumentation framework that fits the requirements imposed by the particular nature of reputation. • A dialog protocol to allow agents to establish information-seeking dialogs.
  96. 96. The language: LRep. LREP: first-order sorted language with special predicates representing the typology of social evaluations we use: Img, Rep, ShV, ShE, DE, Comm. • SF: set of constant formulas. Allows LREP formulas to be nested in communications. • SV: set of evaluative values. Ex 2: Linguistic Labels, f: { 0, 1, 2, 3, 4 }
  97. 97. The reputation argumentation framework • Given the nature of social evaluations (the values of a social evaluation are graded), we need an argumentation framework that allows attacks to be weighted. Example: we have to be able to differentiate between Img(j,seller,VG) being attacked by Img(j,seller,G) and being attacked by Img(j,seller,VB). • Specifically, we instantiate the Weighted Abstract Argumentation Framework defined in P.E. Dunne, A. Hunter, P. McBurney, S. Parsons, and M. Wooldridge, ‘Inconsistency tolerance in weighted argument systems’, in AAMAS’09, pp. 851–858, (2009). • Basically, this framework introduces the notions of strength and inconsistency budgets (defined as the amount of “inconsistency” that the system can tolerate regarding attacks) into a classical Dung framework.
  98. 98. Building Argumentative Theories [diagram: at the reputation-related information level, the reputation theory (the set of ground elements, expressed in LREP, gathered by j through interactions and communications) and the consequence relation (the reputation model), which is specific to each agent; at the argumentation level, the argumentative theory, built from the reputation theory, and a simple shared consequence relation]
  99. 99. Attack and Strength. f: { 0, 1, 2, 3, 4 }. [figure: strength of the attack]
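
One way to read the mapping f onto { 0, ..., 4 } is as an index over five linguistic labels, with the strength of an attack given by the distance between the two indices. This is an assumption made for illustration: the labels VB, G and VG appear in the slides, while B and N, the exact mapping and the distance rule are hypothetical.

```python
# Assumed mapping f from linguistic labels to {0, ..., 4}.
# VB, G and VG appear in the slides; B and N are hypothetical fillers.
SCALE = {"VB": 0, "B": 1, "N": 2, "G": 3, "VG": 4}

def attack_strength(value_a, value_b):
    """Assumed strength: distance between the two evaluative values."""
    return abs(SCALE[value_a] - SCALE[value_b])

# Img(j, seller, VG) attacked by Img(j, seller, G) versus by Img(j, seller, VB).
mild = attack_strength("VG", "G")
severe = attack_strength("VG", "VB")
```

Under this reading, an inconsistency budget would let an agent tolerate the mild attack while still rejecting the severe one.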
  100. 100. Example of argumentative dialog • Agent i: proponent • Agent j: opponent • Roles: seller; Inf (informant); sell(q) (quality); sell(dt) (delivery time) • Each agent is equipped with a Reputation Weighted Argument System
  101. 101. Example of argumentative dialog [figure: dialog between j and i]
  102. 102. Example of argumentative dialog [figure: dialog between j and i, showing the strength of the attack]
  103. 103. Example of argumentative dialog [figure: dialog between j and i]
  104. 104. Example of argumentative dialog [figure: dialog between j and i]
  105. 105. Example of argumentative dialog [figure: dialog between j and i]
  106. 106. Example of argumentative dialog [figure: dialog between j and i]
  107. 107. Using Inconsistency Budgets [figure: dialog between j and i]
  108. 108. Outline. PART II: • Trust Computing Approaches: Security, Institutional, Social • Evaluation of Trust and Reputation Models. EASSS 2010, Saint-Etienne, France
  109. 109. Dr. Javier Carbó. GIAA – Group of Applied Artificial Intelligence, Univ. Carlos III de Madrid
  110. 110. Trust in Information Security: Same Word, Different World. The security approach tackles “hard” problems of trust. It views trust as an objective, universal and verifiable property of agents. Its trust problems have solutions: • False identity • Reading/modification of messages by third parties • Repudiation of messages • Certificates of accomplishing tasks/services according to standards
  111. 111. An example: Public Key Infrastructure [diagram: client, registration authority, certificate authority and LDAP directory; steps: 1. Client identity; 2. Private key sent; 3. Public key sent; 4. Publication of certificate; 5. Certificate sent]
  112. 112. Trust in I&S, limitations. Their trust relies on central entities: – Authorities, Trusted Third Parties (TTPs) – Partially solved using hierarchies of TTPs. They ignore part of the problem: – The top authority must be trusted by some other means. Their scope is far away from real-life trust issues: – lies, defection, collusions, social norm violations, …
  113. 113. Institutional approach. Institutions have successfully regulated human societies for a long time: - they are created to achieve particular goals while complying with norms; - they are responsible for defining the rules of the game (norms), enforcing them and assessing penalties in case of violation. Examples: auction houses, parliaments, stock exchange markets, … The institutional approach is focused on the existence of organizations: • Providing an execution infrastructure • Controlling access to resources • Sanctioning/rewarding agents’ behaviors
  114. 114. An example: e-institutions
  115. 115. Institutional approach, limitations. They view trust as a partially objective, local and verifiable property of agents. Intrusive control on the agents (modification of the execution resources, process killing, …). They require a shared agreement to define what is expected (norm compliance, case laws…). They require a central entity and global supervision: – Repositories and access control entities should be centralised – Low scalability if every agent is observed by the institution. They assume that the institution itself is trusted.
  116. 116. Social approach. The social approach consists in the idea of a self-organized society (Adam Smith’s invisible hand). Each agent has its own evaluation criteria of what is expected: no social norms, just individual norms. Each agent is in charge of rewards and punishments (often in terms of more/less future cooperative interactions). No central entity at all; it consists of a completely distributed social control of malicious agents. Trust as an emergent property. Avoids privacy issues caused by centralized approaches.
  117. 117. Social approach, limitations. Unlimited, but undefined and unexpected trust scope: we view trust as a subjective, local and unverifiable property of agents. Exclusion/isolation is the typical punishment for malicious agents → difficult to enforce in open and dynamic societies of agents. Malicious behaviors may occur; they are supposed to be prevented through the lack of incentives and through punishments. It is difficult to define which domain and society is appropriate to test this social approach.
  118. 118. Ways to evaluate any system: • Integration on real applications • Using real data from public datasets • Using realistic data generated artificially • Using ad-hoc simulated data with no justification/motivation • None of the above
  119. 119. Ways to evaluate T&R in agent systems: • Integration of T&R on real agent applications • Using real T&R data from public datasets • Using realistic T&R data generated artificially • Using ad-hoc simulated data with no justification/motivation • None of the above
  120. 120. Real Applications using T&R in an agent system • What real application are we looking for? • Trust and reputation: – System that uses (for something) and exchanges subjective opinions about other participants → Recommender Systems • Agent System: – Distributed view, no central entity collects, aggregates and publishes a final valuation → ???
  121. 121. Real Applications using T&R in an agent system• Desiderata of application domains: (To be filled by students)
  122. 122. Real data & public datasets • Assuming real agent applications exist, would data be publicly available? – Privacy concerns – Lack of incentives to save data over time – Distribution of data. Heisenberg uncertainty principle: if users knew their subjective opinions would be collected by a central entity, they would not behave as if their opinions had just a private (supposed-to-be friendly) reader. • No agents, no distribution → public datasets from recommender systems
  123. 123. A view on privacy concerns • Anonymity: use of arbitrary/secure pseudonyms • Using concordance: similarity between users within a single context. Mean of differences rating a set of items. Users tend to agree. (Private Collaborative Filtering using estimated concordance measures, N. Lathia, S. Hailes, L. Capra, 2007) • Secure pair-wise comparison of fuzzy ratings (Introducing newcomers into a fuzzy reputation agent system, J. Carbo, J.M. Molina, J. Davila, 2002)
  124. 124. Real Data & Public Datasets• MovieLens, www.grouplens.org: Two datasets: – 100,000 ratings for 1682 movies by 943 users. – 1 million ratings for 3900 movies by 6040 users.• These are the “standard” datasets that many recommendation system papers use in their evaluation
  125. 125. My paper with MovieLens • We selected users among those who had rated 70 or more movies, and we also selected the movies that were evaluated more than 35 times, in order to avoid the sparsity problem. • Finally we had 53 users and 28 movies. • The average number of votes per user is approximately 18, so the sparsity of the selected set of users and movies is under 35%. “Agent-based collaborative filtering based on fuzzy recommendations”, J. Carbó, J.M. Molina, IJWET v1 n4, 2004
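
A hedged sketch of the selection step described on this slide, assuming ratings are (user, movie, score) triples and that both counts are taken on the full dataset before filtering. The function names are hypothetical, and the sparsity definition used here (fraction of empty cells in the user by movie matrix) is an assumption that may differ from the one in the paper.

```python
from collections import Counter

def filter_ratings(ratings, min_user_ratings=70, min_movie_ratings=36):
    """Keep ratings by users with >= 70 ratings on movies rated more than 35 times."""
    per_user = Counter(user for user, _, _ in ratings)
    per_movie = Counter(movie for _, movie, _ in ratings)
    return [
        (user, movie, score)
        for user, movie, score in ratings
        if per_user[user] >= min_user_ratings
        and per_movie[movie] >= min_movie_ratings
    ]

def sparsity(ratings):
    """Assumed definition: share of empty cells in the user x movie matrix."""
    users = {user for user, _, _ in ratings}
    movies = {movie for _, movie, _ in ratings}
    return 1.0 - len(ratings) / (len(users) * len(movies))

# Toy data with lowered thresholds, only to exercise the functions.
ratings = [("a", "m1", 5), ("a", "m2", 3), ("b", "m1", 4), ("c", "m3", 2)]
kept = filter_ratings(ratings, min_user_ratings=2, min_movie_ratings=2)
toy_sparsity = sparsity(ratings)
```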
  126. 126. Real Data & Public DatasetsBookCrossing (BX) dataset:• www.informatik.uni-freiburg.de/~cziegler/BX• collected by Cai-Nicolas Ziegler in a 4-week crawl (August / September 2004) from the Book-Crossing community.• It contains 278,858 users providing 1,149,780 ratings (explicit / implicit) about 271,379 books.
  127. 127. Real Data & Public DatasetsLast.fm Dataset• top artists played by all users: – contains <user, artist-mbid, artist-name, total-plays> – tuples for ~360,000 users about 186,642 artists.• full listening history of 1000 users: – Tuples of <user-id, timestamp, artist-mbid, artist- name, song-mbid, song-title>• Collected by Oscar Celma, Univ. Pompeu Fabra• www.dtic.upf.edu/~ocelma/MusicRecommendationDatas et
  128. 128. Real Data & Public DatasetsJester Joke Data Set:• Ken Goldberg from UC Berkeley released a dataset from Jester Joke Recommender System.• 4.1 million continuous ratings (-10.00 to +10.00) of 100 jokes from 73,496 users.• www.ieor.berkeley.edu/~goldberg/jester-data/• It differentiates itself from other datasets by having a much smaller number of rateable items.
  129. 129. Real Data & Public DatasetsEpinions dataset, collected by P. Massa:• in a 5-week crawl (November/December 2003) from the Epinions.com• Not just ratings about items, also trust statements: – 49,290 users who rated a total of – 139,738 different items at least once, writing 664,824 reviews. – 487,181 issued trust statements.• only positive trust statements and not negative ones
  130. 130. Real Data & Public Datasets: Advogato (www.trustlet.org)• A weighted dataset: opinions are aggregated (centrally) on a 3-level scale: Apprentice, Journeyer, and Master.• Tuples of the form: minami -> polo [level="Journeyer"];• Used to test trust propagation in social networks (assuming trust transitivity).• The trust metric (by P. Massa) uses this information to assign every user a final certification level by aggregating weighted opinions.
  131. 131. Real Data & Public Datasets• MoviePilot dataset (www.moviepilot.com): – It contains information related to concepts from the world of cinema, e.g. single movies, movie universes (such as the world of Harry Potter movies), and upcoming details (trailers, teasers, news, etc.). – RecSysChallenge: a live evaluation session will take place where algorithms trained on offline data will be evaluated online, on real users.• Mendeley dataset (www.mendeley.com): – Recommendations to users about scientific papers that they might be interested in.
  132. 132. Real Data & Public Datasets• No agents, no distribution: these are public datasets from recommender systems.• Authors have to distribute opinions to participants in some way.• Ratings are about items; they are not trust statements.• The ratio of # of ratings to # of items is too low.• The ratio of # of ratings to # of users is too low.• No time-stamps.• Papers intend to be based on real data, but the required transformation from centralized to distributed aggregation distorts the reality of these data.
  133. 133. Realistic Data• We need to generate realistic data to test trust and reputation in agent systems.• Several technical/design problems arise: – How many users, ratings and items do we need? – How dynamic should the society of agents be?• But the hardest part is the psychological/sociological one: – How do individuals take trust decisions? Which types of individuals are there? – How does a real human society trust? How many individuals of each type belong to a real human society?
  134. 134. Realistic Data• Large-scale simulation with NetLogo (http://ccl.northwestern.edu/netlogo/).• Others: MASON (https://mason.dev.java.net/), RePast (http://repast.sourceforge.net/).• But most are ad hoc simulations that are difficult for third parties to repeat.• Many of them use unrealistic agents with a binary altruist/egoist behaviour based on game-theoretic views.
  135. 135. Examples of Ad Hoc Simulations• Convergence of the reputation image to the real behaviour of agents. Static behaviours, no recommendations, agents just consume/provide services. Worst case.• Maximum influence of cooperation. Free and honest recommendations from every agent based on consumed services. Best case.• Inclusion of dynamic behaviours, different % of malicious agents in the society, collusion between recommenders and providers, etc. Compare results with the previous ones.• “Avoiding malicious agents using fuzzy recommendations”, J. Carbo, J. M. Molina, J. Dávila. Journal of Organizational Computing & Electronic Commerce, vol. 17, num. 1
  136. 136. Technical/Design Problems to generate simulated data• Lessons learned from the ART testbed experience.• http://megatron.iiia.csic.es/art-testbed/• A testbed would help to make fair comparisons: “Researchers can perform easily-repeatable experiments in a common environment against accepted benchmarks”.• Relative success: – 3 international competitions held jointly with AAMAS 06-08. – Over 15 participants in each competition. – Several journal and conference publications use it.
  137. 137. Art Domain
  138. 138. the ART testbed
  139. 139. ART Interface• The agent system is displayed as a topology on the left, while two panels on the right show the details of particular agent statistics and of global system statistics.
  140. 140. The ART testbed• The simulation creates opinions according to an error distribution with zero mean and standard deviation $s$: $s = (s^* + \alpha / c_g)\, t$• where $s^*$, unique for each era, is assigned to an appraiser from a uniform distribution.• $t$ is the true value of the painting to be appraised.• $\alpha$ is a hidden value, fixed for all appraisers, that balances opinion-generation cost and final accuracy.• $c_g$ is the cost an appraiser decides to pay to generate an opinion. Therefore, the minimum achievable standard deviation of the error distribution is $s^* \cdot t$.
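A minimal sketch of the opinion-generation rule above, under two assumptions that the slide does not state: the error is Gaussian, and it is added to the true value. The function and parameter names are illustrative, not the testbed's API.

```python
import random


def opinion_std(s_star, alpha, c_g, true_value):
    """Standard deviation of the opinion error: s = (s* + alpha / c_g) * t."""
    if c_g <= 0:
        raise ValueError("the opinion-generation cost c_g must be positive")
    return (s_star + alpha / c_g) * true_value


def generate_opinion(s_star, alpha, c_g, true_value, rng=random):
    """Draw one opinion as the true value plus zero-mean noise.

    Assumption: Gaussian, additive noise (the slide only fixes the
    mean and the standard deviation of the error distribution).
    """
    s = opinion_std(s_star, alpha, c_g, true_value)
    return true_value + rng.gauss(0.0, s)
```

As the cost paid grows, `alpha / c_g` vanishes and `opinion_std` approaches `s_star * true_value`, which is the lower bound stated on the slide.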
  141. 141. The ART testbed• Each appraiser $a$’s actual client share $r_a$ takes into account the appraiser’s client share from the previous timestep: $r_a = q \cdot r_a' + (1 - q) \cdot \tilde{r}_a$• where $r_a'$ is appraiser $a$’s client share in the previous timestep.• $q$ is a value that reflects the influence of previous client share size on next client share size (thus the volatility in client share magnitudes due to frequent accuracy oscillations may be reduced).
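The client-share update above is a simple exponential smoothing and can be sketched as follows. The slide does not define the tilde term, so this sketch assumes it is the share the appraiser would obtain from its current accuracy alone; all names are hypothetical.

```python
def update_client_share(prev_share, prelim_share, q):
    """r_a = q * r_a' + (1 - q) * r~_a.

    prev_share   -- r_a', the share in the previous timestep
    prelim_share -- r~_a, assumed to be the share earned from current accuracy
    q            -- weight of the past share, between 0 and 1
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    return q * prev_share + (1.0 - q) * prelim_share
```

With `q` close to 1 the share barely reacts to the latest accuracy, and with `q` close to 0 it follows it almost immediately.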
  142. 142. 2006 ART Competition• 2006 Competition setup: – Clients per agent: 20, Painting eras: 10, games with 5 agents. – Costs 100/10/1, Sensing-Cost-Accuracy=0.5, Winner: iam from Southampton Univ.• Post-competition discussion notes: – Larger number of agents required, Definition of dummy agents, Relate # of eras with # of agents, Fairer distribution of expertise (just uniform), More abrupt change in # of clients (greater q), Improving expertise over time?
  143. 143. 2006 ART Winner conclusions• “The ART of IAM: The Winning Strategy for the 2006 Competition”, Luke Teacy et al., Trust WS, AAMAS 07.• It is generally more economical for an agent to purchase opinions from a number of third parties than it is to invest heavily in its own opinion.• There is little apparent advantage to reputation sharing. Reputation is most valuable in cases where direct experience is relatively more difficult to acquire.• The final lesson is that although trust can be viewed as a sociological concept, and inspiration for computational models of trust can be drawn from multiple disciplines, the problem of combining estimates of unknown variables (such as trustee behaviour) is fundamentally a statistical one.
  144. 144. 2007 ART Competition• 2007 Competition setup: – Costs 100/10/0.1, All agents have equal sum of expertise values, Painting eras: static but unknown, Expertise assignments may change during the course of the game, Include dummy agents, games with 25 agents.• 2007 Competition discussion notes: – It needs to facilitate reputation exchange. – It doesn’t have to produce all changes at the same time: gradual changes. – Studying barriers to entry: how a new agent joins an existing MAS. Cold start vs. hot start (exploration vs. exploitation). – More competitive dummy agents. – Relationship between opinion generation cost and accuracy.
  145. 145. 2008 ART Competition• 2008 Competition setup: – Each agent is limited in the number of certainty and opinion requests that it can send. – Certainty requests have a cost. – The use of self opinions is denied. – Wider range of expertise values. – Every time step, a number of eras is selected at random to change, and a given amount of positive change (increase in value) is added. For every positive change, a negative change of the same amount is also applied, so that the average expertise of the agent is not modified.
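The mean-preserving expertise change in the 2008 setup can be sketched as below. The slide does not say which eras receive the compensating negative changes, so this sketch assumes they are applied to a disjoint, randomly chosen set of eras and that values are not clamped; the function name and data layout are hypothetical.

```python
import random


def shift_expertise(expertise, n_changes, delta, rng=random):
    """Return a copy of `expertise` (era -> value) where n_changes eras
    gain `delta` and n_changes other eras lose `delta`, keeping the sum
    (and therefore the average) unchanged."""
    if 2 * n_changes > len(expertise):
        raise ValueError("not enough eras for paired changes")
    # Sample without replacement so that raised and lowered eras are disjoint.
    chosen = rng.sample(sorted(expertise), 2 * n_changes)
    updated = dict(expertise)
    for era in chosen[:n_changes]:
        updated[era] += delta
    for era in chosen[n_changes:]:
        updated[era] -= delta
    return updated
```

Pairing every increase with an equal decrease is what keeps the agent's average expertise constant from one time step to the next.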
  146. 146. Evaluation criteria• Lack of criteria on which of the very different trust decisions should be considered, and how.• Conte and Paolucci 02: – Epistemic decisions: those about updating and generating trust opinions from received reputations. – Pragmatic-strategic decisions: those about how to behave with partners using this reputation-based trust. – Memetic decisions: those about how and when to share reputation with others.
  147. 147. Main Evaluation Criteria of The ART testbed• The winning agent is selected as the appraiser with the highest bank account balance in the direct confrontation of appraiser agents repeated X times.• In other words, the appraiser who is able to: – estimate the value of its paintings most accurately – purchase information most prudently.• Where an ART iteration involves 19 steps (11 decisions, 8 interactions) to be taken by an agent.
  148. 148. Trust decisions in ART testbed: 1. How should our agent aggregate reputation information about others? 2. How should our agent update the trust weights of providers and recommenders afterwards? 3. How many agents should our agent ask for reputation information about other agents? 4. How many reputation and opinion requests from other agents should our agent answer? 5. How many agents should our agent ask for opinions about our assigned paintings? 6. How much time (economic value) should our agent spend building requested opinions about the paintings of the other agents? 7. How much time (economic value) should our agent spend building the appraisals of its own paintings? (AUTOPROVIDER!) …
  149. 149. Limitations of Main Evaluation Criteria of ART testbed• From my point of view: – It evaluates all trust decisions jointly: should participants play provider and consumer roles jointly, or just the role of opinion consumers? – Is the direct confrontation of competitor agents the right scenario to compare them?
  150. 150. Providers vs. Consumers• Playing games with two participants of the 2007 competition (iam2 and afras) and 8 other dummy agents.• Dummy agents are implemented ad hoc to be the sole opinion providers; they do not ask the 2007 participants for any service.• Neither of the two 2007 participants ever provides opinions/reputations; they are just consumers.• Result: differences between both agents were much smaller than the official competition indicated (in absolute and relative terms).• “An extension of a fuzzy reputation agent trust model in the ART testbed”, Soft Computing v14, issue 8, 2010
  151. 151. Trust Strategies in Evolutive Agent Societies• An evolutionarily stable strategy (ESS) is a strategy which, if adopted by a population of players, cannot be invaded by any alternative strategy.• An evolutionarily stable trust strategy is a strategy which, if it becomes dominant (adopted by a majority of agents), cannot be defeated by any alternative trust strategy.• Justification: the goal of trust strategies is to establish some kind of social control over malicious/distrustful agents.• Assumption: agents may change their trust strategy. Agents with a failing trust strategy would get rid of it and adopt a successful trust strategy in the future.
  152. 152. An evolutive view of ART games• We consider a failing trust strategy to be the one that lost the last ART game (earning less money than the others).• We consider a successful trust strategy to be the one that won the last ART game (earning more money than the others).• In this way, in consecutive games the participant that lost the previous game is replaced by the one that won it.• We have applied this to the 16 participant agents of the 2007 ART competition.
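The replacement procedure described above can be sketched as a short loop. Here `play_game` is a hypothetical stand-in for running a full ART game: it is assumed to return one earnings figure per participant, in the same order as the population list.

```python
def evolve(population, play_game, n_games):
    """Repeatedly play a game and replace the loser's strategy
    with a copy of the winner's strategy.

    population -- list of strategy names, one per participating agent
    play_game  -- callable(population) -> list of earnings (assumed interface)
    Returns the final population and the (winner, loser) pair of each game.
    """
    population = list(population)
    history = []
    for _ in range(n_games):
        earnings = play_game(population)
        indices = range(len(population))
        winner_idx = max(indices, key=earnings.__getitem__)
        loser_idx = min(indices, key=earnings.__getitem__)
        winner, loser = population[winner_idx], population[loser_idx]
        population[loser_idx] = winner  # the failing strategy is abandoned
        history.append((winner, loser))
    return population, history
```

Counting the strategies left in the returned population after each game shows whether one of them becomes dominant or whether several coexist.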
  153. 153. [Figure: a sequence of ART games starting from the 16 participants in the 2007 competition; each game yields a winner and a loser, and so on…]
  154. 154.
  Game  Winner         Earnings  Loser          Earnings
  1     iam2           17377     xerxes         -8610
  2     iam2           14321     lesmes         -13700
  3     iam2           10360     reneil         -14757
  4     iam2           10447     blizzard       -7093
  5     agentevicente  8975      Rex            -5495
  6     iam2           8512      alatriste      -999
  7     artgente       8994      agentevicente  2011
  8     artgente       10611     agentevicente  1322
  9     artgente       8932      novel          424
  10    iam2           9017      IMM            1392
  11    artgente       7715      marmota        1445
  12    artgente       8722      spartan        2083
  13    artgente       8966      zecariocales   1324
  14    artgente       8372      iam2           2599
  15    artgente       7475      iam2           2298
  16    artgente       8384      UNO            2719
  17    artgente       7639      iam2           2878
  18    iam2           6279      JAM            3486
  19    iam2           14674     artgente       2811
  20    artgente       8035      iam2           3395
  155. 155. Results of repeated games• The 2007 winner is not an Evolutionarily Stable Strategy.• Although the strategy of the winner of the 2007 competition spreads in the society of agents (up to 6 iam2 agents out of 16), it never becomes dominant (no majority of iam2 agents).• The iam2 strategy is defeated by the artgente strategy, which becomes dominant (11 artgente agents out of 16). Therefore the superiority of iam2 as winner of the 2007 competition is, at least, relative.• The equilibrium of trust strategies that forms an evolutionarily stable society is composed of 10-11 artgente agents and 5-6 iam2 agents.
  156. 156.
  CompetitionRank  EvolutionRank  Agent          ExcludedInGame
  6                1              artgente       -
  1                2              iam2           -
  2                3              JAM            18
  7                4              UNO            16
  4                5              zecariocales   13
  5                6              spartan        12
  9                7              marmota        11
  13               8              IMM            10
  10               9              novel          9
  15               10             agentevicente  8
  11               11             alatriste      6
  12               12             rex            5
  3                13             Blizzard       4
  8                14             reneil         3
  14               15             lesmes         2
  16               16             xerxes         1
  157. 157. Other Evaluation Criteria of the ART testbed• The testbed also provides functionality to compute: – the average accuracy of the appraiser’s final appraisals (final appraisal error mean) – the consistency of that accuracy (final appraisal error standard deviation) – the quantities of each type of message passed between appraisers.• Could we take other relevant evaluation criteria into account?
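The first two measures above can be sketched with the standard library. The slide does not say how the testbed defines the appraisal error, so this sketch assumes a relative absolute error per painting and a population standard deviation; the function name is hypothetical.

```python
from statistics import mean, pstdev


def appraisal_error_stats(appraisals, true_values):
    """Return (mean, standard deviation) of the relative appraisal error.

    Assumption: error = |appraisal - true value| / true value per painting.
    """
    if len(appraisals) != len(true_values) or not appraisals:
        raise ValueError("need one true value per appraisal")
    errors = [abs(a - t) / t for a, t in zip(appraisals, true_values)]
    return mean(errors), pstdev(errors)
```

A low mean with a high standard deviation would indicate an appraiser that is accurate on average but inconsistent from one painting to the next.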
  158. 158. Evaluation criteria from the agent-based view• “Characterization and Evaluation of Multi-agent System”, P. Davidsson, S. Johanson, M. Svahnberg. In Software Engineering for Multi-Agent Systems IV, LNCS 3914, 2006.• 9 quality attributes: 1. Reactivity: How fast are opinions re-evaluated when there are changes in expertise? 2. Load balancing: How evenly is the load balanced between the appraisers? 3. Fairness: Are all the providers treated equally? 4. Utilization of resources: Are the available abilities/information utilized as much as possible?
  159. 159. Evaluation criteria from the agent-based view: 5. Responsiveness: How long does it take for the appraisers to get a response to an individual request? 6. Communication overhead: How much extra communication is needed for the appraisals? 7. Robustness: How vulnerable is the agent to the absence of responses? 8. Modifiability: How easy is it to change the behaviour of the agent in very different conditions? 9. Scalability: How good is the system at handling large numbers of providers and consumers?
  160. 160. Evaluation criteria from the agent-based view• “Evaluation of Multi-Agent Systems: The case of Interaction”, H. Joumaa, Y. Demazeau, J.M. Vincent, 3rd Int. Conf. on Information & Communication Technologies: from Theory to Applications. IEEE Computer Society, Los Alamitos (2008).• An evaluation at the interaction level, based on the weight of the information brought by a message.• A function Φ is defined in order to calculate the weight of pertinent messages.
  161. 161. Evaluation criteria from the agent-based view• The relation between the received message m and the effects on the agent is studied in order to calculate the Φ(m) value. According to the model, two kinds of functions are considered: – A function that associates weight to the message according to its type. – A function that associates weight to the message according to the change provoked on the internal state and the actions triggered by its reception.
  162. 162. Consciousness Scale• Too much quantification (AI is not just statistics…)• Compare agents qualitatively: measure their level of consciousness.• A scale of 13 conscious levels according to the cognitive skills of an agent, the “Cognitive Power” of an agent.• The higher the level obtained, the more the behavior of the agent resembles that of humans.• www.consscale.com
  163. 163. Bio-inspired order of Cognitive Skills• From the point of view of emotions (Damasio, 1999): “Emotion” “Feeling” “Feeling of a Feeling” “Fake Emotions”
  164. 164. Bio-inspired order of Cognitive Skills• From the point of view of perception and action (Perner, 1999): “Perception” “Adaptation” “Attention” “Set Shifting” “Planning” “Imagination”
  165. 165. Bio-inspired order of Cognitive Skills• From the point of view of Theory of Mind (Lewis 2003): “I Know” “I Know I Know” “I Know You Know” “I Know You Know I Know”
  166. 166. Consciousness Levels: Super-Conscious, Human-like, Social, Empathic, Self-Conscious, Emotional, Executive, Attentional, Adaptive, Reactive
  167. 167. Evaluating agents with ConsScale
  168. 168. Thank you! EASSS 2010, Saint-Etienne, France