Decision Support Systems 43 (2007) 618–644
A survey of trust and reputation systems for
online service provision
Audun Jøsang a,*, Roslan Ismail b, Colin Boyd b
Distributed Systems Technology Centre, University of Queensland, Level 7, GP South, UQ Qld 4072, Australia
Information Security Research Centre, Queensland University of Technology, Brisbane Qld 4001, Australia
Available online 5 July 2006
Trust and reputation systems represent a significant trend in decision support for Internet mediated service provision. The
basic idea is to let parties rate each other, for example after the completion of a transaction, and use the aggregated ratings about
a given party to derive a trust or reputation score, which can assist other parties in deciding whether or not to transact with that
party in the future. A natural side effect is that it also provides an incentive for good behaviour, and therefore tends to have a
positive effect on market quality. Reputation systems can be called collaborative sanctioning systems to reflect their
collaborative nature, and are related to collaborative filtering systems. Reputation systems are already being used in successful
commercial online applications. There is also a rapidly growing literature around trust and reputation systems, but unfortunately
this activity is not very coherent. The purpose of this article is to give an overview of existing and proposed systems that can be
used to derive measures of trust and reputation for Internet transactions, to analyse the current trends and developments in this
area, and to propose a research agenda for trust and reputation systems.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Trust; Reputation; Transitivity; Collaboration; E-commerce; Security; Decision
* Corresponding author.
E-mail addresses: firstname.lastname@example.org (A. Jøsang), email@example.com (R. Ismail), firstname.lastname@example.org (C. Boyd).

1. Introduction

Online service provision commonly takes place between parties who have never transacted with each other before, in an environment where the service consumer often has insufficient information about the service provider, and about the goods and services offered. This forces the consumer to accept "the risk of prior performance", i.e. to pay for services and goods before receiving them, which can leave him in a vulnerable position. The consumer generally has no opportunity to see and try products, i.e. to "squeeze the oranges", before he buys. The service provider, on the other hand, knows exactly what he gets, as long as he is paid in money. The inefficiencies resulting from this information asymmetry can be mitigated through trust and reputation. The idea is that even if the consumer cannot try the product or service in advance, he can be confident that it will be what he
expects as long as he trusts the seller. A trusted seller therefore has a significant advantage in case the product quality cannot be verified in advance.
This example shows that trust plays a crucial role in computer mediated transactions and processes. However, it is often hard to assess the trustworthiness of remote entities, because computerised communication media are increasingly removing us from familiar styles of interaction. Physical encounter and traditional forms of communication allow people to assess a much wider range of cues related to trustworthiness than is currently possible through computer mediated communication. The time and investment it takes to establish a traditional brick-and-mortar street presence provides some assurance that those who do it are serious players. This stands in sharp contrast to the relative simplicity and low cost of establishing a good-looking Internet presence, which gives little evidence about the solidity of the organisation behind it. The difficulty of collecting evidence about unknown transaction partners makes it hard to distinguish between high and low quality service providers on the Internet. As a result, the topic of trust in open computer networks is receiving considerable attention in the academic community and the e-commerce industry.
There is a rapidly growing literature on the theory and applications of trust and reputation systems, and the main purpose of this document is to provide a survey of the developments in this area. An earlier brief survey of reputation systems has been published by Mui et al. Overviews of agent transaction systems are also relevant because they often relate to reputation systems [25,42,38]. There is considerable confusion around the terminology used to describe these systems, and we will try to describe proposals and developments using a consistent terminology in this study. There also seems to be a lack of coherence in this area, as indicated by the fact that authors often propose new systems from scratch, without trying to extend and enhance previous proposals.
Section 2 attempts to define the concepts of trust and reputation, and proposes an agenda for research into trust and reputation systems. Section 3 describes why trust and reputation systems should be regarded as security mechanisms. Section 4 describes the relationship between collaborative filtering systems and reputation systems, where the latter can also be defined in terms of collaborative sanctioning systems. In Section 5 we describe different trust classes, of which provision trust is the class that refers to service provision. Section 6 describes four categories of reputation and trust semantics that can be used in trust and reputation systems, Section 7 describes centralised and distributed reputation system architectures, and Section 8 describes some reputation computation methods, i.e. how ratings are to be computed to derive reputation scores. Section 9 provides an overview of reputation systems in commercial and live applications. Section 10 describes the main problems in reputation systems, and provides an overview of literature that proposes solutions to these problems. The study is rounded off with a discussion in Section 11.

2. Background for trust and reputation systems

2.1. The notion of trust

Manifestations of trust are easy to recognise because we experience and rely on it every day, but at the same time trust is quite challenging to define because it manifests itself in many different forms. The literature on trust can also be quite confusing because the term is being used with a variety of meanings. Two common definitions of trust, which we will call reliability trust and decision trust respectively, will be used in this study.
As the name suggests, reliability trust can be interpreted as the reliability of something or somebody, and the definition by Gambetta provides an example of how this can be formulated:

Definition 1 (Reliability trust). Trust is the subjective probability by which an individual, A, expects that another individual, B, performs a given action on which its welfare depends.

This definition includes the concept of dependence on the trusted party, and the reliability (probability) of the trusted party, as seen by the trusting party.
However, trust can be more complex than Gambetta's definition indicates. For example, Falcone and Castelfranchi recognise that having high (reliability) trust in a person in general is not necessarily enough to decide to enter into a situation of dependence on that person. They write: "For example
it is possible that the value of the damage per se (in case of failure) is too high to choose a given decision branch, and this independently either from the probability of the failure (even if it is very low) or from the possible payoff (even if it is very high). In other words, that danger might seem to the agent an intolerable risk." In order to capture this broad concept of trust, the following definition inspired by McKnight and Chervany can be used.

Definition 2 (Decision trust). Trust is the extent to which one party is willing to depend on something or somebody in a given situation with a feeling of relative security, even though negative consequences are possible.

The relative vagueness of this definition is useful because it makes it more general. It explicitly and implicitly includes aspects of a broad notion of trust, which are: dependence on the trusted entity or party; the reliability of the trusted entity or party; utility, in the sense that positive utility will result from a positive outcome and negative utility will result from a negative outcome; and finally a certain risk attitude, in the sense that the trusting party is willing to accept the situational risk resulting from the previous elements. Risk emerges, for example, when the value at stake in a transaction is high and the probability of failure is non-negligible (i.e. reliability < 1). Contextual aspects, such as law enforcement, insurance and other remedies in case something goes wrong, are only implicitly included in the definition of trust above, but should nevertheless be considered to be part of trust.
There are only a few computational trust models that explicitly take risk into account. Studies that combine risk and trust include Manchala and Jøsang and Lo Presti. Manchala explicitly avoids expressing measures of trust directly, and instead develops a model around other elements such as transaction values and the transaction history of the trusted party. Jøsang and Lo Presti distinguish between reliability trust and decision trust, and develop a mathematical model for decision trust based on more finely grained primitives, such as agent reliability, utility values, and the risk attitude of the trusting agent.
The difficulty of capturing the notion of trust in formal models in a meaningful way has led some economists to reject it as a computational concept. The strongest expression of this view has been given by Williamson, who argues that the notion of trust should be avoided when modelling economic interactions, because it adds nothing new, and that well known notions such as reliability, utility and risk are adequate and sufficient for that purpose. According to Williamson, the only type of trust that can be meaningful for describing interactions is personal trust. He argues that personal trust applies to emotional and personal interactions such as love relationships, where mutual performance is not always monitored and where failures are forgiven rather than sanctioned. In that sense, traditional computational models would be inadequate, e.g. because of insufficient data and inadequate sanctioning, but also because it would be detrimental to the relationships if the involved parties were to take a computational approach. Non-computational models of trust can be meaningful for studying such relationships according to Williamson, but developing such models should be done within the domains of sociology and psychology, rather than economics.

2.2. Reputation and trust

The concept of reputation is closely linked to that of trustworthiness, but it is evident that there is a clear and important difference. For the purpose of this study, we will define reputation according to the Concise Oxford Dictionary.

Definition 3 (Reputation). Reputation is what is generally said or believed about a person's or thing's character or standing.

This definition corresponds well with the view of social network researchers [20,45] that reputation is a quantity derived from the underlying social network which is globally visible to all members of the network. The difference between trust and reputation can be illustrated by the following perfectly normal and plausible statements:

(1) "I trust you because of your good reputation."
(2) "I trust you despite your bad reputation."

Assuming that the two sentences relate to identical transactions, statement (1) reflects that the relying
party is aware of the trustee's reputation, and bases his trust on that. Statement (2) reflects that the relying party has some private knowledge about the trustee, e.g. through direct experience or an intimate relationship, and that these factors overrule any reputation that a person might have. This observation reflects that trust ultimately is a personal and subjective phenomenon that is based on various factors or evidence, and that some of those carry more weight than others. Personal experience typically carries more weight than second hand trust referrals or reputation, but in the absence of personal experience, trust often has to be based on referrals from others.
Reputation can be considered as a collective measure of trustworthiness (in the sense of reliability) based on the referrals or ratings from members in a community. An individual's subjective trust can be derived from a combination of received referrals and personal experience. In order to avoid dependence and loops, it is required that referrals be based on first hand experience only, and not on other referrals. As a consequence, an individual should only give a subjective trust referral when it is based on first hand evidence or when second hand input has been removed from its derivation base. It is possible to abandon this principle, for example when the weight of the trust referral is normalised or divided by the total number of referrals given by a single entity; the latter principle is applied in Google's PageRank algorithm, described in more detail in Section 9.5.
Reputation can relate to a group or to an individual. A group's reputation can for example be modelled as the average of all its members' individual reputations, or as the average of how the group is perceived as a whole by external parties. Tadelis' study shows that an individual belonging to a given group will inherit an a priori reputation based on that group's reputation. If the group is reputable, all its individual members will a priori be perceived as reputable, and vice versa.

2.3. A research agenda for trust and reputation systems

There are two fundamental differences between traditional and online environments regarding how trust and reputation are, and can be, used.
Firstly, as already mentioned, the traditional cues of trust and reputation that we are used to observing and depending on in the physical world are missing in online environments, so that electronic substitutes are needed. Secondly, communicating and sharing information related to trust and reputation is relatively difficult, and normally constrained to local communities in the physical world, whereas IT systems combined with the Internet can be leveraged to design extremely efficient systems for exchanging and collecting such information on a global scale.
Motivated by this basic observation, the purposes of research in trust and reputation systems should be to:

(a) Find adequate online substitutes for the traditional cues to trust and reputation that we are used to in the physical world, and identify new information elements (specific to a particular online application) which are suitable for deriving measures of trust and reputation.
(b) Take advantage of IT and the Internet to create efficient systems for collecting that information, and for deriving measures of trust and reputation, in order to support decision making and to improve the quality of online markets.

These simple principles invite rigorous research in order to answer some fundamental questions: What information elements are most suitable for deriving measures of trust and reputation in a given application? How can these information elements be captured and collected? What are the best principles for designing such systems from a theoretic and from a usability point of view? Can they be made resistant to attacks of manipulation by strategic agents? How should users include the information provided by such systems into their decision process? What role can these systems play in the business model of commercial companies? Do these systems truly improve the quality of online trade and interactions? These are important questions that need good answers in order to determine the potential for trust and reputation systems in online environments.
According to Resnick et al., reputation systems must have the following three properties to operate at all:

(1) Entities must be long lived, so that with every interaction there is always an expectation of future interactions.
(2) Ratings about current interactions are captured and distributed.
(3) Ratings about past interactions must guide decisions about current interactions.

The longevity of agents means, for example, that it should be impossible or difficult for an agent to change identity or pseudonym for the purpose of erasing the connection to its past behaviour. The second property depends on the protocol with which ratings are provided; this is usually not a problem for centralised systems, but represents a major challenge for distributed systems. The second property also depends on the willingness of participants to provide ratings, for which there must be some form of incentive. The third property depends on the usability of the reputation system, and how people and systems respond to it; this is reflected in the commercial and live reputation systems described in Section 9, but only to a small extent in the theoretic proposals described in Sections 8 and 10.
The basic principles of reputation systems are relatively easy to describe (see Sections 7 and 8). However, because the notion of trust itself is vague, what constitutes a trust system is difficult to describe concisely. A method for deriving trust from a transitive trust path is an element which is normally found in trust systems. The idea behind trust transitivity is that when Alice trusts Bob, and Bob trusts Claire, and Bob refers Claire to Alice, then Alice can derive a measure of trust in Claire based on Bob's referral combined with her trust in Bob. This is illustrated in Fig. 1.
The type of trust considered in this example is obviously reliability trust (and not decision trust). In addition there are semantic constraints for the transitive trust derivation to be valid, i.e. that Alice must trust Bob to recommend Claire for a particular purpose, and that Bob must trust Claire for that same purpose.

Fig. 1. Trust transitivity principle.

The main differences between trust and reputation systems can be described as follows: trust systems produce a score that reflects the relying party's subjective view of an entity's trustworthiness, whereas reputation systems produce an entity's (public) reputation score as seen by the whole community. Secondly, transitivity is an explicit component in trust systems, whereas reputation systems usually only take transitivity implicitly into account. Finally, trust systems usually take subjective and general measures of (reliability) trust as input, whereas information or ratings about specific (and objective) events, such as transactions, are used as input in reputation systems.
There can of course be trust systems that incorporate elements of reputation systems and vice versa, so that it is not always clear how a given system should be classified. The descriptions of the various trust and reputation systems below must therefore be seen in light of this.

3. Security and trust

3.1. Trust and reputation systems as soft security mechanisms

In a general sense, the purpose of security mechanisms is to provide protection against malicious parties. In this sense there is a whole range of security challenges that are not met by traditional approaches. Traditional security mechanisms will typically protect resources from malicious users, by restricting access to only authorised users. However, in many situations we have to protect ourselves from those who offer resources, so that the problem in fact is reversed. Information providers can for example act deceitfully by providing false or misleading information, and traditional security mechanisms are unable to protect against this type of threat. Trust and reputation systems, on the other hand, can provide protection against such threats. The difference between these two approaches to security was first described by Rasmussen and Jansson, who used the term hard security for traditional mechanisms like authentication and access control, and soft security for what they called social control mechanisms in general, of which trust and reputation systems are examples.
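The transitive trust derivation illustrated in Fig. 1 can be made concrete with a small numeric sketch. A common choice in the literature, used here purely for illustration and not prescribed by this survey, is to express reliability trust as a value in [0, 1] and to discount a referral by multiplication; the names and values below are invented:

```python
def discount(trust_in_referrer: float, referred_trust: float) -> float:
    """Discount a referral: the derived trust is the relying party's
    trust in the referrer (as a recommender) multiplied by the
    referrer's trust in the target. All values lie in [0, 1]."""
    return trust_in_referrer * referred_trust

def path_trust(referrals: list) -> float:
    """Derive trust along a transitive path of referrals."""
    derived = 1.0
    for t in referrals:
        derived = discount(derived, t)
    return derived

# Alice trusts Bob 0.9 as a recommender; Bob trusts Claire 0.8
# for the same purpose, so Alice derives a trust of 0.72 in Claire.
print(round(path_trust([0.9, 0.8]), 2))
```

Note that under this discounting choice derived trust can only decrease along the path, matching the intuition that second hand evidence carries less weight than first hand experience.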
3.2. Computer security and trust

Security mechanisms protect systems and data from being adversely affected by malicious and non-authorised parties. The effect of this is that those systems and data can be considered more reliable, and thus more trustworthy. The concepts of Trusted Systems and Trusted Computing Base have been used in the IT security jargon (see e.g. Abrams), but the concept of security assurance level is more standardised as a measure of security.¹ The assurance level can be interpreted as a system's strength to resist malicious attacks, and some organisations require systems with high assurance levels for high risk or highly sensitive applications. In an informal sense, the assurance level expresses a level of public (reliability) trustworthiness of a given system. However, it is evident that additional information, such as warnings about newly discovered security flaws, can carry more weight than the assurance level when people form their own subjective trust in the system.

3.3. Communication security and trust

Communication security includes encryption of the communication channel and cryptographic authentication of identities. Authentication provides so-called identity trust, i.e. a measure of the correctness of a claimed identity over a communication channel. The term "trust provider" is sometimes used in the industry to describe CAs² and other authentication service providers with the role of providing the necessary mechanisms and services for verifying and managing identities. The type of trust that CAs and identity management systems provide is simply identity trust. In case of chained identity certificates, the derivation of identity trust is based on trust transitivity, so in that sense these systems can be called identity trust systems.
However, users are also interested in knowing the reliability of authenticated parties, or the quality of the goods and services they provide. This latter type of trust will be called provision trust in this study, and only trust and reputation systems (i.e. soft security mechanisms) are useful tools for deriving provision trust.
It can be observed that identity trust is a condition for trusting a party behind the identity with anything more than a baseline or default provision trust that applies to all parties in a community. This does not mean that the real world identity of the principal must be known. An anonymous party, who can be recognised from interaction to interaction, can also be trusted for the purpose of providing services.

4. Collaborative filtering and collaborative sanctioning

Collaborative filtering (CF) systems have similarities with reputation systems in that both collect ratings from members in a community. However, they also have fundamental differences. The assumption behind CF systems is that different people have different tastes, and rate things differently according to subjective taste. If two users rate a set of items similarly, they share similar tastes, and are called neighbours in the jargon. This information can be used to recommend items that one participant likes to his or her neighbours, and implementations of this technique are often called recommender systems. This must not be confused with reputation systems, which are based on the seemingly opposite assumption, namely that all members in a community should judge the performance of a transaction partner or the quality of a product or service consistently. In this sense the term "collaborative sanctioning" (CS) has been used to describe reputation systems, because the purpose is to sanction poor service providers, with the aim of giving them an incentive to provide quality services.
CF takes ratings subject to taste as input, whereas reputation systems take ratings assumed insensitive to taste as input. People will for example judge data files containing film and music differently depending on their taste, but all users will judge files containing viruses to be bad. CF systems can be used to select the preferred files in the former case, and reputation systems can be used to avoid the bad files in the latter case. There will of course be cases where CF systems identify items that are invariant to taste, which simply indicates low usefulness of that result for recommen-

¹ See e.g. the UK CESG at http://www.cesg.gov.uk/ or the Common Criteria Project at http://www.commoncriteriaportal.org/.
² Certification authority.
dation purposes. Inversely, there will be cases where ratings that are subject to personal taste are being fed into reputation systems. The latter can cause problems, because most reputation systems will be unable to distinguish between variations in service provider performance, and variations in the observer's taste, potentially leading to unreliable and misleading reputation scores.
Another important point is that CF systems and reputation systems assume an optimistic and a pessimistic world view respectively. To be specific, CF systems assume all participants to be trustworthy and sincere, i.e. to do their job as best they can and to always report their genuine opinion. Reputation systems, on the other hand, assume that some participants will try to misrepresent the quality of services in order to make more profit, and to lie or provide misleading ratings in order to achieve some specific goal. It can be very useful to combine CF and reputation systems, and Amazon.com, described in Section 9.3.3, does this to a certain extent. Theoretic schemes include Damiani et al.'s proposal to separate provider reputation from resource reputation in P2P networks.

5. Trust classes

In order to be more specific about trust semantics, we will distinguish between a set of different trust classes according to Grandison and Sloman's classification.³ This is illustrated in Fig. 2. The highlighting of provision trust in Fig. 2 is done to illustrate that it is the focus of the trust and reputation systems described in this study.

Fig. 2. Trust classes (Grandison and Sloman).

• Provision trust describes the relying party's trust in a service or resource provider. It is relevant when the relying party is a user seeking protection from malicious or unreliable service providers. The Liberty Alliance Project⁴ uses the term "business trust" to describe mutual trust between companies emerging from contract agreements that regulate interactions between them, and this can be interpreted as provision trust. For example, when a contract specifies quality requirements for the delivery of services, then this business trust would be provision trust in our terminology.
• Access trust describes trust in principals for the purpose of accessing resources owned by or under the responsibility of the relying party. This relates to the access control paradigm, which is a central element in computer security. A good overview of access trust systems can be found in Grandison and Sloman.
• Delegation trust describes trust in an agent (the delegate) that acts and makes decisions on behalf of the relying party. Grandison and Sloman point out that acting on one's behalf can be considered a special form of service provision.
• Identity trust⁵ describes the belief that an agent identity is as claimed. Trust systems that derive identity trust are typically authentication schemes such as X.509 and PGP. Identity trust systems have been discussed mostly in the information security community, and a brief overview and analysis can be found in Reiter and Stubblebine.
• Context trust⁶ describes the extent to which the relying party believes that the necessary systems and institutions are in place in order to support the transaction and provide a safety net in case something should go wrong. Factors for this type of trust can for example be critical infrastructures, insurance, the legal system, law enforcement and the stability of society in general.
Trust purpose is an overarching concept that can be used to express any operational instantiation of the trust classes mentioned above. In other words, it defines the specific scope of a given trust relationship. A particular trust purpose can for example be "to be a

³ Grandison and Sloman use the terms service provision trust, resource access trust, delegation trust, certification trust, and infrastructure trust.
⁴ http://www.projectliberty.org/.
⁵ Called "authentication trust" in Liberty Alliance.
⁶ Called "system trust" in McKnight and Chervany.
good car mechanic", which can be grouped under the provision trust class.
Conceptually, identity trust and provision trust can be seen as two layers on top of each other, where provision trust normally cannot exist without identity trust. In the absence of identity trust, it is only possible to have a baseline provision trust in an agent or entity.

6. Categories of trust semantics

The semantic characteristics of ratings, reputation scores and trust measures are important in order for participants to be able to interpret those measures. The semantics of measures can be described in terms of a specificity-generality dimension and a subjectivity-objectivity dimension, as illustrated in Table 1.
A specific measure means that it relates to a specific trust aspect, such as the ability to deliver on time, whereas a general measure is supposed to represent an average of all aspects.
A subjective measure means that an agent provides a rating based on subjective judgement, whereas an objective measure means that the rating has been determined by objectively assessing the trusted party against formal criteria.
• Subjective and specific measures are for example used in survey questionnaires where people are asked to express their opinion over a range of specific issues. A typical question could for example be: "How do you see election candidate X's ability to handle the economy?" and the possible answers could be on a scale 1-5, which could be assumed equivalent to "Disastrous", "Bad", "Average", "Good", and "Excellent". Similar questions could be applied to foreign policy, national security, education and health, so that a person's answers form a subjective vector of his or her trust in candidate X.
• Subjective and general measures are for example used in eBay's reputation system, which is described in detail in Section 9.1. An inherent problem with this type of measure is that it often fails to assign credit or blame to the right aspect or even the right party. For example, if a shipment of an item bought on eBay arrives late or is broken, the buyer might give the seller a negative rating, whereas the post office might have caused the problem.
A general problem with all subjective measures is that it is difficult to protect against unfair ratings. Another potential problem is that the act of referring negative general and subjective trust in an entity can lead to accusations of slander. This is not so much a problem in reputation systems, because rating a particular transaction negatively is less sensitive than referring negative trust in an entity in general.
• Objective and specific measures are for example used in technical product tests where the performance or the quality of the product can be objectively measured. Washing machines can for example be tested according to energy consumption, noise, washing program features, etc. Another example is to rate the fitness of commercial companies based on specific financial measures, such as earnings, profit, investment, R&D expenditure, etc. An advantage with objective measures is that the correctness of ratings can be verified by others, or automatically generated based on automated monitoring of events.
• Objective and general measures can for example be computed based on a vector of objective and specific measures. In product tests, where a range of specific characteristics is tested, it is common to derive a general score which can be a weighted average of the score of each characteristic. Dun & Bradstreet's business credit rating is an example of a measure that is derived from a vector of objectively measurable company performance parameters.
Classification of trust and reputation measures 7. Reputation network architectures
Specific, vector General, synthesised
based The technical principles for building reputation
Subjective Survey eBay, voting systems are described in this and the following
questionnaires section. The network architecture determines how
Objective Product tests Synthesised general score from ratings and reputation scores are communicated
product tests, D&B rating
between participants in a reputation system. The
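The synthesis of an objective and general measure from a vector of specific measures, as described in Section 6 for product tests, can be sketched as a weighted average. The aspect names and weights below are illustrative assumptions, not values from the text:

```python
def general_score(specific_scores, weights):
    # Weighted average of per-aspect scores: a general measure
    # synthesised from a vector of specific measures (Section 6)
    total = sum(specific_scores[k] * weights[k] for k in specific_scores)
    return total / sum(weights.values())

# Hypothetical washing machine test, per-aspect scores on a 1-10 scale:
scores = {"energy": 8.0, "noise": 6.0, "features": 9.0}
weights = {"energy": 0.5, "noise": 0.2, "features": 0.3}
overall = general_score(scores, weights)  # 4.0 + 1.2 + 2.7 = 7.9
```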
7.1. Centralised reputation systems

In centralised reputation systems, information about the performance of a given participant is collected as ratings from other members in the community who have had direct experience with that participant. The central authority (reputation centre) that collects all the ratings typically derives a reputation score for every participant, and makes all scores publicly available. Participants can then use each other's scores, for example, when deciding whether or not to transact with a particular party. The idea is that transactions with reputable participants are likely to result in more favourable outcomes than transactions with disreputable participants.

Fig. 3 below shows a typical centralised reputation framework, where A and B denote transaction partners with a history of transactions in the past, and who consider transacting with each other in the present.

After each transaction, the agents provide ratings about each other's performance in the transaction. The reputation centre collects ratings from all the agents, and continuously updates each agent's reputation score as a function of the received ratings. Updated reputation scores are provided online for all the agents to see, and can be used by the agents to decide whether or not to transact with a particular agent.

The two fundamental aspects of centralised reputation systems are:

(1) Centralised communication protocols that allow participants to provide ratings about transaction partners to the central authority, as well as to obtain reputation scores of potential transaction partners from the central authority.
(2) A reputation computation engine used by the central authority to derive reputation scores for each participant, based on received ratings, and possibly also on other information. This is described in Section 8 below.

7.2. Distributed reputation systems

There are environments where a distributed reputation system, i.e. without any centralised functions, is better suited than a centralised system. In a distributed system there is no central location for submitting ratings or obtaining reputation scores of others. Instead, there can be distributed stores where ratings can be submitted, or each participant simply records the opinion about each experience with other parties, and provides this information on request from relying parties. A relying party, who considers transacting with a given target party, must find the distributed stores, or try to obtain ratings from as many community members as possible who have had direct experience with that target party. This is illustrated in Fig. 4.

Fig. 3. General framework for a centralised reputation system.
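The centralised framework of Fig. 3, combined with the simplest computation engine (Section 8.1), can be sketched as a toy reputation centre. The class and method names below are our own illustration, not an API from the paper:

```python
from collections import defaultdict

class ReputationCentre:
    """Toy sketch of the central authority in Fig. 3: agents submit ratings
    after each transaction, and the centre publishes a running score."""

    def __init__(self):
        self.ratings = defaultdict(list)  # ratee id -> list of (rater, value)

    def submit_rating(self, rater, ratee, value):
        # value: +1 (positive), 0 (neutral), -1 (negative)
        self.ratings[ratee].append((rater, value))

    def reputation_score(self, agent):
        # Simplest engine (Section 8.1): positives minus negatives
        return sum(v for _, v in self.ratings[agent])

centre = ReputationCentre()
centre.submit_rating("A", "B", +1)
centre.submit_rating("F", "B", +1)
centre.submit_rating("C", "B", -1)
score_B = centre.reputation_score("B")  # 1
```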
Fig. 4. General framework for a distributed reputation system.
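The relying party's local computation in the distributed setting of Fig. 4, where private experience may carry a higher weight than received ratings, can be illustrated as follows. The weighting factor of 3.0 is an assumption for illustration, not a value from the paper:

```python
def local_reputation(peer_ratings, private_ratings, private_weight=3.0):
    """Combine ratings received from peers with the relying party's own
    experiences, counting each own experience private_weight times."""
    total = sum(peer_ratings) + private_weight * sum(private_ratings)
    weight = len(peer_ratings) + private_weight * len(private_ratings)
    return total / weight if weight else 0.0

# Peers report mixed experiences, but our own two encounters were positive:
score = local_reputation([1, -1, 1], [1, 1])  # (1 + 6) / (3 + 6) = 7/9
```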
The relying party computes the reputation score based on the received ratings. In case the relying party has had direct experience with the target party, the experience from that encounter can be taken into account as private information, possibly carrying a higher weight than the received ratings.

The two fundamental aspects of distributed reputation systems are:

(1) A distributed communication protocol that allows participants to obtain ratings from other members in the community.
(2) A reputation computation method used by each individual agent to derive reputation scores of target parties based on received ratings, and possibly on other information. This is described in Section 8.

Peer-to-Peer (P2P) networks represent an environment well suited for distributed reputation management. In P2P networks, every node plays the role of both client and server, and is therefore sometimes called a servent. This allows the users to overcome their passive role typical of web navigation, and to engage in an active role by providing their own resources. There are two phases in the use of P2P networks. The first is the search phase, which consists of locating the servent where the requested resource resides. In some P2P networks, the search phase can rely on centralised functions. One such example is Napster, which has a resource directory server. In pure P2P networks like Gnutella and Freenet, also the search phase is distributed. Intermediate architectures also exist, e.g. the FastTrack architecture which is used in P2P networks like KaZaA, Grokster and iMesh. In FastTrack based P2P networks, there are nodes and supernodes, where the latter keep track of other nodes and supernodes that are logged onto the network, and thus act as directory servers during the search phase.

After the search phase, where the requested resource has been located, comes the download phase, which consists of transferring the resource from the exporting to the requesting servent.

P2P networks introduce a range of security threats, as they can be used to spread malicious software, such as viruses and Trojan horses, and easily bypass firewalls. There is also evidence that P2P networks suffer from free riding. Reputation systems are well suited to fight these problems, e.g. by sharing information about rogue, unreliable or selfish participants. P2P networks are controversial because they have been used to distribute copyrighted material such as MP3 music files, and it has been claimed that content poisoning (distributing files with legitimate titles that contain silence or random noise) has been used by the music industry to fight this problem. We do not defend using P2P networks for illegal file sharing, but it is obvious
that reputation systems could be used by distributors of illegal copyrighted material to protect themselves.

Many authors have proposed reputation systems for P2P networks [2,13,14,18,24,36,40]. The purpose of a reputation system in P2P networks is:

(1) To determine which servents are most reliable at offering the best quality resources, and
(2) To determine which servents provide the most reliable information with regard to (1).

In a distributed environment, each participant is responsible for collecting and combining ratings from other participants. Because of the distributed environment, it is often impossible or too costly to obtain ratings resulting from all interactions with a given agent. Instead the reputation score is based on a subset of ratings, usually from the relying party's "neighbourhood".

8. Reputation computation engines

Seen from the relying party's point of view, trust and reputation scores can be computed based on own experience, on second hand referrals, or on a combination of both. In the jargon of economic theory, the term private information is used to describe first hand information resulting from own experience, and public information is used to describe publicly available second hand information, i.e. information that can be obtained from third parties.

Reputation systems are typically based on public information in order to reflect the community's opinion in general, which is in line with Definition 3 of reputation. A party who relies on the reputation score of some remote party is in fact trusting that party through trust transitivity.

Some systems take both public and private information as input. Private information, e.g. resulting from personal experience, is normally considered more reliable than public information, such as ratings from third parties.

This section describes various principles for computing reputation and trust measures. Some of the principles are used in commercial applications, whereas others have been proposed by the academic community.

8.1. Simple summation or average of ratings

The simplest form of computing reputation scores is simply to sum the number of positive ratings and negative ratings separately, and to keep a total score as the positive score minus the negative score. This is the principle used in eBay's reputation forum, which is described in detail in Section 9.1. The advantage is that anyone can understand the principle behind the reputation score; the disadvantage is that it is primitive and therefore gives a poor picture of a participant's reputation, although this is also due to the way ratings are provided, see Sections 10.1 and 10.2.

A slightly more advanced scheme is to compute the reputation score as the average of all ratings, and this principle is used in the reputation systems of numerous commercial web sites, such as Epinions and Amazon, described in Section 9.

Advanced models in this category compute a weighted average of all the ratings, where the rating weight can be determined by factors such as rater trustworthiness/reputation, age of the rating, distance between the rating and the current score, etc.

8.2. Bayesian systems

Bayesian systems take binary ratings as input (i.e. positive or negative), and are based on computing reputation scores by statistical updating of beta probability density functions (PDF). The a posteriori (i.e. the updated) reputation score is computed by combining the a priori (i.e. previous) reputation score with the new rating [29,31,48–51,68]. The reputation score can be represented in the form of the beta PDF parameter tuple (a, b) (where a and b represent the amount of positive and negative ratings respectively), or in the form of the probability expectation value of the beta PDF, optionally accompanied by the variance or a confidence parameter. The advantage of Bayesian systems is that they provide a theoretically sound basis for computing reputation scores; the only disadvantage is that they might be too complex for average persons to understand.

The beta-family of distributions is a continuous family of distribution functions indexed by the two parameters a and b.
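The updating of the (a, b) parameter tuple and the expectation-based reputation score just described can be sketched as follows; the function names are ours, and Eqs. (1) and (2) in this section give the underlying PDF and expectation value:

```python
def update(a, b, positive):
    # Combine the a priori tuple (a, b) with one new binary rating:
    # a counts positive ratings, b counts negative ratings
    return (a + 1, b) if positive else (a, b + 1)

def expectation(a, b):
    # Probability expectation value of beta(p | a, b): E(p) = a / (a + b)
    return a / (a + b)

a, b = 1, 1  # uniform prior beta(p | 1, 1): nothing is known yet
for outcome in [True] * 7 + [False]:  # 7 positive, 1 negative outcome
    a, b = update(a, b, outcome)
score = expectation(a, b)  # (a, b) is now (8, 2), so E(p) = 0.8
```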
The beta PDF denoted by beta(p | a, b) can be expressed using the gamma function Γ as:

beta(p | a, b) = [Γ(a + b) / (Γ(a) Γ(b))] · p^(a−1) · (1 − p)^(b−1),  where 0 ≤ p ≤ 1 and a, b > 0    (1)

with the restriction that the probability variable p ≠ 0 if a < 1, and p ≠ 1 if b < 1. The probability expectation value of the beta distribution is given by:

E(p) = a / (a + b).    (2)

When nothing is known, the a priori distribution is the uniform beta PDF with a = 1 and b = 1 illustrated in Fig. 5a. Then, after observing r positive and s negative outcomes, the a posteriori distribution is the beta PDF with a = r + 1 and b = s + 1. For example, the beta PDF after observing 7 positive and 1 negative outcomes is illustrated in Fig. 5b.

A PDF of this type expresses the uncertain probability that future interactions will be positive. The most natural approach is to define the reputation score as a function of the expectation value. The probability expectation value of Fig. 5b according to Eq. (2) is E(p) = 0.8. This can be interpreted as saying that the relative frequency of a positive outcome in the future is somewhat uncertain, and that the most likely value is 0.8.

Fig. 5. Example beta probability density functions. (a) Uniform PDF beta(p | 1, 1). (b) PDF beta(p | 8, 2).

8.3. Discrete trust models

Humans are often better able to rate performance in the form of discrete verbal statements than as continuous measures. This is also valid for determining trust measures, and some authors, including [1,9,10,44], have proposed discrete trust models.

For example, in the model of Abdul-Rahman and Hailes, the trustworthiness of an agent x can be referred to as Very Trustworthy, Trustworthy, Untrustworthy or Very Untrustworthy. The relying party can then apply his or her own perception about the trustworthiness of the referring agent before taking the referral into account. Look-up tables, with entries for referred trust and referring party downgrade/upgrade, are used to determine derived trust in x. Whenever the relying party has had personal experience with x, this can be used to determine referring party trustworthiness. The assumption is that personal experience reflects x's real trustworthiness, and that referrals about x that differ from the personal experience will indicate whether the referring party underrates or overrates. Referrals from a referring party who is found to overrate will be downgraded, and vice versa.

The disadvantage of discrete measures is that they do not easily lend themselves to sound computational principles. Instead, heuristic mechanisms like look-up tables must be used.

8.4. Belief models

Belief theory is a framework related to probability theory, but where the sum of probabilities over all possible outcomes does not necessarily add up to 1, and the remaining probability is interpreted as uncertainty. Jøsang [29,30] has proposed a belief/trust metric called opinion, denoted by ω_x = (b, d, u, a), which
expresses the relying party A's belief in the truth of statement x. Here b, d, and u represent belief, disbelief and uncertainty respectively, where b, d, u ∈ [0,1] and b + d + u = 1. The parameter a ∈ [0,1], which is called the relative atomicity, represents the base rate probability in the absence of evidence, and is used for computing an opinion's probability expectation value E(ω_x) = b + au, meaning that a determines how uncertainty shall contribute to E(ω_x). When the statement x for example says "David is honest and reliable", then the opinion can be interpreted as reliability trust in David. As an example, let us assume that Alice needs to get her car serviced, and that she asks Bob to recommend a good car mechanic. When Bob recommends David, Alice would like to get a second opinion, so she asks Claire for her opinion about David. This situation is illustrated in Fig. 6.

When trust and trust referrals are expressed as opinions, each transitive trust path Alice→Bob→David, and Alice→Claire→David, can be computed with the discounting operator, where the idea is that the referrals from Bob and Claire are discounted as a function of Alice's trust in Bob and Claire respectively. Finally the two paths can be combined using the consensus operator. These two operators form part of Subjective Logic, and semantic constraints must be satisfied in order for the transitive trust derivation to be meaningful. Opinions can be uniquely mapped to beta PDFs, and in this sense the consensus operator is equivalent to the Bayesian updating described in Section 8.2. This model is thus both belief-based and Bayesian.

Fig. 6. Deriving trust from parallel transitive chains.

Yu and Singh have proposed to use belief theory to represent reputation scores. In their scheme, two possible outcomes are assumed, namely that an agent A is trustworthy (T_A) or not trustworthy (¬T_A), and separate beliefs are kept about whether A is trustworthy or not, denoted by m(T_A) and m(¬T_A) respectively. The reputation score Γ of an agent A is then defined as:

Γ(A) = m(T_A) − m(¬T_A),  where m(T_A), m(¬T_A) ∈ [0, 1] and Γ(A) ∈ [−1, 1].    (3)

Without going into detail, the ratings provided by individual agents are belief measures determined as a function of A's past history of behaviour with individual agents, classified as trustworthy or not trustworthy using predefined threshold values for what constitutes trustworthy and untrustworthy behaviour. These belief measures are then combined using Dempster's rule (the classical operator for combining evidence from different sources), and the resulting beliefs are fed into Eq. (3) to compute the reputation score. Ratings are considered valid if they result from a transitive trust chain of length less than or equal to a predefined limit.

8.5. Fuzzy models

Trust and reputation can be represented as linguistically fuzzy concepts, where membership functions describe to what degree an agent can be described as e.g. trustworthy or not trustworthy. Fuzzy logic provides rules for reasoning with fuzzy measures of this type. The scheme proposed by Manchala described in Section 2, as well as the REGRET reputation system proposed by Sabater and Sierra [59–61], fall in this category. In Sabater and Sierra's scheme, what they call individual reputation is derived from private information about a given agent, what they call social reputation is derived from public information about an agent, and what they call context dependent reputation is derived from contextual information.

8.6. Flow models

Systems that compute trust or reputation by transitive iteration through looped or arbitrarily long chains can be called flow models.
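A generic constant-weight flow computation can be sketched as a simple iteration. This is an illustrative sketch in the spirit of the flow models discussed in Section 8.6, not any specific published algorithm, and it assumes every agent has at least one outgoing trust edge:

```python
def flow_scores(trust, iterations=100):
    """Iterate a flow model: each agent distributes its current score along
    its outgoing trust edges, so outgoing flow decreases an agent's score
    while incoming flow increases it; the community total stays constant.

    trust: dict mapping agent -> dict of {trusted agent: edge weight}
    """
    agents = list(trust)
    score = {x: 1.0 / len(agents) for x in agents}  # constant total weight
    for _ in range(iterations):
        new = {x: 0.0 for x in agents}
        for x in agents:
            out = sum(trust[x].values())
            for y, w in trust[x].items():
                new[y] += score[x] * w / out
        score = new
    return score

# C receives trust from both A and B; scores converge to A=0.4, B=0.2, C=0.4
s = flow_scores({"A": {"B": 1, "C": 1}, "B": {"C": 1}, "C": {"A": 1}})
```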
Some flow models assume a constant trust/reputation weight for the whole community, and this weight can be distributed among the members of the community. Participants can only increase their trust/reputation at the cost of others. Google's PageRank, described in Section 9.5, the Appleseed algorithm, and Advogato's reputation scheme, described in Section 9.2, belong to this category. In general, a participant's reputation increases as a function of incoming flow, and decreases as a function of outgoing flow. In the case of Google, many hyperlinks to a web page contribute to increased PageRank, whereas many hyperlinks from a web page contribute to decreased PageRank for that web page.

Flow models do not always require the sum of the reputation/trust scores to be constant. One such example is the EigenTrust model, which computes agent trust scores in P2P networks through repeated and iterative multiplication and aggregation of trust scores along transitive chains, until the trust scores for all agent members of the P2P community converge to stable values.

9. Commercial and live reputation systems

This section describes the most well known applications of online reputation systems. All analysed systems have a centralised network architecture. The computation is mostly based on the summation or average of ratings, but two systems use the flow model.

9.1. eBay's feedback forum

eBay (http://ebay.com/) is a popular auction site that allows sellers to list items for sale, and buyers to bid for those items. The so-called Feedback Forum on eBay gives buyer and seller the opportunity to rate each other (provide feedback in the eBay jargon) as positive, negative, or neutral (i.e. 1, −1, 0) after completion of a transaction. Buyers and sellers also have the possibility to leave comments like "Smooth transaction, thank you!", which are typical in the positive case, or "Buyers beware!" in the rare negative case. The Feedback Forum is a centralised reputation system, where eBay collects all the ratings and computes the scores. The running total reputation score of each participant is the sum of positive ratings (from unique users) minus the sum of negative ratings (from unique users). In order to provide information about a participant's more recent behaviour, the totals of positive, negative and neutral ratings for three different time windows, (i) past 6 months, (ii) past month, and (iii) past 7 days, are also displayed.

There are many empirical studies of eBay's reputation system; see Resnick et al. for an overview. In general the observed ratings on eBay are surprisingly positive. Buyers provide ratings about sellers 51.7% of the time, and sellers provide ratings about buyers 60.6% of the time. Of all ratings provided, less than 1% is negative, less than 0.5% is neutral and about 99% is positive. It was also found that there is a high correlation between buyer and seller ratings, suggesting a degree of reciprocation of positive ratings and retaliation of negative ratings. This is problematic if obtaining honest and fair ratings is a goal, and a possible remedy could be to not let sellers rate buyers.

The problem of ballot stuffing, i.e. that ratings can be repeated many times, e.g. to unfairly boost somebody's reputation score, seems to be a minor problem on eBay, because participants are only allowed to rate each other after the completion of a transaction, which is monitored by eBay. It is of course possible to create fake transactions, but because eBay charges a fee for listing items, there is a cost associated with this practice. However, unfair ratings for genuine transactions cannot be avoided.

The eBay reputation system is very primitive and can be quite misleading. With so few negative ratings, a participant with 100 positive and 10 negative ratings should intuitively appear much less reputable than a participant with 90 positive and no negatives, but on eBay they would have the same total reputation score. Despite its drawbacks and primitive nature, the eBay reputation system seems to have a strong positive impact on eBay as a marketplace. Any system that facilitates interaction between humans depends on how they respond to it, and people appear to respond well to the eBay system and its reputation component.
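The running-total computation, and why it can be misleading, can be sketched in two lines; the comparison reproduces the 100/10 versus 90/0 example from the text:

```python
def ebay_total(positive, negative):
    # Running total: sum of positive ratings minus sum of negative ratings
    return positive - negative

# A participant with ten negatives gets the same total as one with none,
# which is the misleading case discussed in Section 9.1:
a = ebay_total(100, 10)  # 90
b = ebay_total(90, 0)    # 90
```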
9.2. Expert sites

Expert sites have a pool of individuals who are willing to answer questions in their areas of expertise, and the reputation systems on those sites are there to rate the experts. Depending on the quality of a reply, the person who asked the question can rate the expert on various aspects of the reply, such as clarity and timeliness.

AllExperts provides a free expert service for the public on the Internet, with a business model based on advertising. The reputation system on AllExperts uses the aspects Knowledgeable, Clarity of Response, Timeliness and Politeness, where ratings can be given in the interval [1,10]. The score in each aspect is simply the numerical average of ratings received. The number of questions an expert has received is also displayed, in addition to a General Prestige score that is simply the sum of all average ratings an expert has received. Most experts receive ratings close to 10 on all aspects, so the General Prestige is usually close to 10× the number of questions received. It is also possible to view charts of ratings over periods from 2 months to 1 year.

AskMe (http://www.askmecorp.com/) is an expert site for a closed user group of companies and their employees, and the business model is based on charging a fee for participating in the AskMe network. AskMe does not publicly provide any details of how the system works.

Advogato is a community of open-source programmers. Members rank each other according to how skilled they perceive each other to be, using Advogato's trust scheme, which in essence is a centralised reputation system based on a flow model. The reputation engine of Advogato computes the reputation flow through a network where members constitute the nodes and the edges constitute referrals between nodes. Each member node is assigned a capacity between 800 and 1 depending on the distance from the source node, which is owned by Raph Levien, the creator of Advogato. The source node has a capacity of 800, and the further away from the source node, the smaller the capacity. Members can refer each other with the status of Apprentice (lowest), Journeyer (medium) or Master (highest). A separate flow graph is computed for each type of referral. A member will get the highest status for which there is a positive flow to his or her node. For example, if the flow graph of Master referrals and the flow graph of Apprentice referrals both reach member x, then that member will have Master status, but if only the flow graph of Apprentice referrals reaches member x, then that member will have Apprentice status. The Advogato reputation system does not have any direct purpose other than to boost the ego of members, and to be a stimulant for social and professional networking within the Advogato community.

9.3. Product review sites

Product review sites have a pool of individual reviewers who provide information for consumers for the purpose of making better purchase decisions. The reputation systems on those sites apply to products as well as to the reviewers themselves.

9.3.1. Epinions

Epinions, founded in 1999, is a product and shop review site with a business model mainly based on so-called cost-per-click online marketing, which means that Epinions charges product manufacturers and online shops by the number of clicks consumers generate as a result of reading about their products on Epinions' web site. Epinions also provides product reviews and ratings to other web sites for a fee.

Epinions has a pool of members who write product and shop reviews. Anybody from the public can become a member simply by signing up. The product and shop reviews written by members consist of prose text and quantitative ratings from 1 to 5 stars for a set of aspects such as Ease of Use, Battery Life, etc. in the case of products, and Ease of Ordering, Customer Service, On-Time Delivery and Selection in the case of shops. Other members can rate reviews as Not Helpful, Somewhat Helpful, Helpful, and Very Helpful, and thereby contribute to determining how prominently the review will be placed, as well as to giving the reviewer a higher status. A member can obtain the status Advisor, Top Reviewer or Category Lead (highest) as a function of the accumulated ratings on all his or her reviews over a period.
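Returning to Advogato's flow scheme (Section 9.2), the assignment of capacities that shrink with distance from the source node can be sketched as follows. The per-level capacity values other than the source's 800 are illustrative assumptions, since the text only states that capacities range from 800 down to 1:

```python
from collections import deque

def advogato_capacities(referrals, source, caps=(800, 200, 50, 12, 4, 2, 1)):
    """Assign a flow capacity to every member reachable from the source,
    decreasing with breadth-first distance; nodes beyond the listed
    levels get the minimum capacity of 1."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in referrals.get(x, []):
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return {x: caps[min(d, len(caps) - 1)] for x, d in dist.items()}

# Raph Levien's source node refers a and b; a refers c:
caps = advogato_capacities({"raph": ["a", "b"], "a": ["c"]}, "raph")
# {"raph": 800, "a": 200, "b": 200, "c": 50}
```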
It takes considerable reviewing effort to obtain a status above member, and most members do not have any status.

Category Leads are selected at the discretion of Epinions staff each quarter, based on nominations from members. Top Reviewers are automatically selected every month based on how well their reviews are rated, as well as on the Epinions Web of Trust (see below), where a member can Trust or Block another member. Advisors are selected in the same way as Top Reviewers, but with a lower threshold for review ratings. Epinions does not publish the exact thresholds for becoming Top Reviewer or Advisor, in order to discourage members from trying to manipulate the system.

The Epinions Web of Trust is a simple scheme whereby members can decide to either trust or block another member. A member's list of trusted members represents that member's personal Web of Trust. As already mentioned, the Web of Trust influences the automated selection of Top Reviewers and Advisors. The number of members (and their status) who trust a given member will contribute to that member getting a higher status. The number of members (and their status) who block another member will have a negative impact on that member getting a higher status.

Epinions has an incentive system for reviewers called the Income Share Program, whereby members can earn money. Income Share is automatically determined based on general use of reviews by consumers. Reviewers can potentially earn as much for helping someone make a buying decision with a positive review as for helping someone avoid a purchase with a negative review. This is important in order not to give an incentive to write biased reviews just for profit. As stated on the Epinions FAQ pages: "Epinions wants you to be brutally honest in your reviews, even if it means saying negative things". The Income Share pool is a portion of Epinions' income. The pool is split among all members based on the utility of their reviews. Authors of more useful reviews earn more than authors of less useful reviews.

The Income Share formula is not specified in detail, in order to discourage attempts to defraud the system. Highly rated reviews will generate more revenue than poorly rated reviews, because the former are more prominently placed, so that they are more likely to be read and used by others. Category Leads will normally earn more than Top Reviewers, who in turn will normally earn more than Advisors, because their reviews per definition are rated and listed in that order.

Providing high quality reviews is Epinions' core value proposition to consumers, and the reputation system is instrumental in achieving that. The reputation system can be characterised as highly sophisticated because of the revenue based incentive mechanism. Where other reputation systems on the Internet only provide immaterial incentives like status or karma, the Epinions system can provide hard cash.

9.3.2. BizRate

BizRate runs a Customer Certified Merchant scheme whereby consumers who buy at a BizRate listed store are asked to rate site navigation, selection, prices, shopping options and how satisfied they were with the shopping experience. Consumers participating in this scheme become registered BizRate members. A Customer Certificate is granted to a merchant if a sufficient number of surveys over a given period are positive, and this allows the merchant to display the BizRate Customer Certified seal of approval on its web site. As an incentive to fill out survey forms, BizRate members get discounts at listed stores. This scheme does not capture the frustrated customers who give up before they reach the checkout, and therefore tends to provide a positive bias in favour of web stores. This is understandable from a business perspective, because it provides an incentive for stores to participate in the Customer Certificate scheme.

BizRate also runs a product review service similar to Epinions, but which uses a much simpler reputation system. Members can write product reviews on BizRate, and anybody can become a member simply by signing up. Users, including non-members, who browse BizRate for product reviews can vote on reviews as being helpful, not helpful or off topic, and the reputation system stops there. Reviews are ordered according to the ratio of helpful over total votes, where the reviews with the highest ratios are listed first. It is also possible to have the reviews sorted by rating, so that the best reviews are listed first.
first. Reviewers do not get any status and they cannot earn money by writing reviews for BizRate. There is thus less incentive for writing reviews on BizRate than there is on Epinions, but it is uncertain how this influences the quality of the reviews. The fact that anybody can sign up to become a member and write reviews, and that anybody including non-members can vote on the helpfulness of reviews, makes this reputation scheme highly vulnerable to attack. A simple attack could consist of writing many positive reviews for a product and ballot stuffing them so that they get presented first and result in a high average score for that product.

9.3.3. Amazon

Amazon is mainly an online bookstore that allows members to write book reviews. Amazon's reputation scheme is quite similar to the one BizRate uses. Anybody can become a member simply by signing up. Reviews consist of prose text and a rating in the range 1 to 5 stars. The average of all ratings gives a book its average rating. Users, including non-members, can vote on reviews as being helpful or not helpful. The number of helpful votes as well as the total number of votes are displayed with each review. The order in which the reviews are listed can be chosen by the user according to criteria such as "newest first", "most helpful first" or "highest rating first".

As a function of the number of helpful votes each reviewer has received, as well as other parameters not publicly revealed, Amazon determines each reviewer's rank, and those reviewers who are among the 1000 highest get assigned the status of Top 1000, Top 500, Top 100, Top 50, Top 10 or #1 Reviewer. Amazon has a system of Favourite People, where each member can choose other members as favourite reviewers, and the number of other members who have a specific reviewer listed as favourite person also influences that reviewer's rank. Apart from giving some members status as top reviewers, Amazon does not give any financial incentives. However, there are obviously other financial incentives external to Amazon that can play an important role. It is for example easy to imagine why publishers would want to pay people to write good reviews for their books on Amazon.

There are many reports of attacks on the Amazon review scheme where various types of ballot stuffing have artificially elevated reviewers to top reviewer, or various types of "bad mouthing" have dethroned top reviewers. This is not surprising, because users can vote without becoming a member. For example, the Amazon #1 Reviewer usually is somebody who posts more reviews than any living person possibly could if that person were required to read each book, indicating that the combined effort of a group of people, presented as a single person's work, is needed to get to the top. Also, reviewers who have reached the Top 100 rank have reported a sudden increase in negative votes, which reflects that a cat fight is taking place over getting into the ranks of top reviewers. In order to reduce the problem, Amazon only allows one vote per registered cookie for any given review. However, deleting that cookie or switching to another computer will allow the same user to vote on the same review again. There will always be new types of attacks, and Amazon needs to be vigilant and respond to new types of attacks as they emerge. Due to the vulnerability of the review scheme, it cannot be described as a robust scheme.

9.4. Discussion fora

9.4.1. Slashdot

Slashdot was started in 1997 as a "news for nerds" message board. More precisely, it is a forum for posting articles and comments to articles. In the early days when the community was small, the signal to noise ratio was very high. As is the case with all mailing lists and discussion fora whose membership grows rapidly, spam and low quality postings emerged to become a major problem, and this forced Slashdot to introduce moderation. To start with there was a team of 25 moderators, which after a while grew to 400 moderators to keep pace with the growing number of users and the amount of spam that followed. In order to create a more democratic and healthy moderation scheme, automated moderator se-
lection was introduced, and the Slashdot reputation system forms an integral part of that, as explained below. The moderation scheme actually consists of two moderation layers, where M1 is for moderating comments to articles, and M2 is for moderating M1 moderators.

The articles posted on Slashdot are selected at the discretion of the Slashdot staff based on submissions from the Slashdot community. Once an article has been posted, anyone can give comments to that article.

Users of Slashdot can be Logged In Users or just anonymous persons browsing the web. Anybody can become a Logged In User simply by signing up. Reading articles and comments as well as writing comments to articles can be done anonymously. Because anybody can write comments, they need to be moderated, and only Logged In Users are eligible to become moderators.

Regularly (typically every 30 min), Slashdot automatically selects a group of M1 moderators among long time regular Logged In Users, and gives each moderator 3 days to spend a given number of (typically 5) moderation points. Each moderation point can be spent moderating 1 comment by giving it a rating selected from a list of negative (offtopic, flamebait, troll, redundant, overrated) or positive (insightful, interesting, informative, funny, underrated) adjectives. An integer score in the range [−1, 5] is maintained for each comment. The initial score is normally 1 but can also be influenced by the comment provider's Karma as explained below. A moderator rating a comment positively causes a 1 point increase in the comment's score, and a moderator rating a comment negatively causes a 1 point decrease, but within the range [−1, 5].

Each Logged In User maintains a Karma which can take one of the discrete values Terrible, Bad, Neutral, Positive, Good and Excellent. New Logged In Users start with neutral Karma. Positive moderation of a user's comments contributes to higher Karma, whereas negative moderation of a user's comments contributes to lower Karma for that user. Comments by users with very high Karma will get initial score 2, whereas comments by users with very low Karma will get initial score 0 or even −1. High Karma users will get more moderation points and low Karma users will get fewer moderation points to spend when they are selected as moderators.

The purpose of the comment score is to be able to filter the good comments from the bad, and to allow users to set thresholds when reading articles and postings on Slashdot. A user who only wants to read the best comments can set the threshold to 5, whereas a user who wants to read everything can set the threshold to −1.

To address the issue of unfair moderations, Slashdot has introduced a metamoderation layer called M2 (the moderation layer described above is called M1) with the purpose of moderating the M1 moderators. Any longstanding Logged In User can metamoderate several times per day if he or she so wishes. A user who wants to metamoderate will be asked to moderate the M1 ratings on 10 randomly selected comment postings. The metamoderator decides if a moderator's rating was fair, unfair, or neither. This moderation affects the Karma of the M1 moderators, which in turn influences their eligibility to become M1 moderators in the future.

The Slashdot reputation system recognises that a moderator's taste can influence how he or she rates a comment. Having one set of positive ratings and one set of negative ratings, each with different types of taste dependent rating choices, is aimed at solving that problem. The idea is that moderators with different taste can give different ratings (e.g. insightful or funny) to a comment that has merit, but every rating will still be uniformly positive. Similarly, moderators with different taste can give different ratings (e.g. offtopic or overrated) to a comment without merit, but every rating will still be uniformly negative. Slashdot staff are also able to spend arbitrary amounts of moderation points, making these people omnipotent and thereby able to manually stabilise the system in case Slashdot were attacked by extreme volumes of spam and unfair ratings.

The Slashdot reputation system directs and stimulates the massive collaborative effort of moderating thousands of postings every day. The system is constantly being tuned and modified, and can be described as an ongoing experiment in search of the best practical way to promote quality postings, discourage noise, and make Slashdot as readable and useful as possible for a large community.
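The M1 scoring mechanics described above can be sketched as follows. This is a minimal illustration of the described rules, not Slashdot's actual code; the exact mapping from Karma levels to initial scores is our assumption, since the article only says "very high" and "very low" Karma.

```python
# Sketch of Slashdot's M1 comment scoring: an integer score clamped to
# [-1, 5], with the initial score determined by the author's Karma, and
# each spent moderation point moving the score by +1 or -1.
# Function names and the Karma-to-score mapping are illustrative only.

KARMA_LEVELS = ["Terrible", "Bad", "Neutral", "Positive", "Good", "Excellent"]

def initial_score(karma: str) -> int:
    """Initial comment score as a function of the author's Karma."""
    if karma in ("Good", "Excellent"):   # very high Karma
        return 2
    if karma == "Terrible":              # very low Karma
        return -1                        # the article says "0 or even -1"
    if karma == "Bad":
        return 0
    return 1                             # the normal case

def moderate(score: int, positive: bool) -> int:
    """Spend one moderation point: +1 or -1, clamped to [-1, 5]."""
    return max(-1, min(5, score + (1 if positive else -1)))

score = initial_score("Neutral")         # -> 1
score = moderate(score, positive=True)   # e.g. rated "insightful" -> 2
score = moderate(score, positive=False)  # e.g. rated "offtopic"  -> 1
```

A reader threshold of 5 would then show only comments that accumulated enough positive moderations to reach the top of the clamped range.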
9.4.2. Kuro5hin

Kuro5hin (http://www.kuro5hin.org/) is a web site for discussion of technology and culture, started in 1999. It allows members to post articles and comments similarly to Slashdot. The reputation system on Kuro5hin is called Mojo. It underwent major changes in October 2003 because it was unable to effectively counter noise postings from throwaway accounts, and because attackers rated down comments of targeted members in order to make them lose their reputation scores. The changes introduced in Mojo to solve these problems include only letting a comment's score influence a user's Mojo (i.e. reputation score) when there are at least six ratings contributing to it, and only letting one rating count from any single IP address.

It is possible that the problems experienced by Kuro5hin could have been avoided had they used Slashdot's principle of only allowing longstanding members to moderate, because throwaway accounts would then have been less effective as an attack tool.

9.5. Google's web page ranking system

The early web search engines such as Altavista simply presented every web page that matched the keywords entered by the user, which often resulted in too many and often irrelevant pages being listed in the search results. Altavista's proposal for handling this problem was to offer advanced ways to combine keywords based on binary logic. This was too complex for users and therefore did not represent a good solution.

PageRank, proposed by Page et al., represents a way of ranking the best search results based on a page's reputation. Roughly speaking, PageRank ranks a page according to how many other pages are pointing at it. This can be described as a reputation system, because the collection of hyperlinks to a given page can be seen as public information that can be combined to derive a reputation score. A single hyperlink to a given web page can be seen as a positive rating of that web page. Google's search engine (http://www.google.com/) is based on the PageRank algorithm, and the rapidly rising popularity of Google at the cost of Altavista was obviously caused by the superior search results that the PageRank algorithm delivered.

The definition of PageRank from Page et al. is given below:

Definition 4. Let P be a set of hyperlinked web pages and let u and v denote web pages in P. Let N−(u) denote the set of web pages pointing to u and let N+(v) denote the set of web pages that v points to. Let E be some vector over P corresponding to a source of rank. Then, the PageRank of a web page u is:

    R(u) = cE(u) + c Σ_{v ∈ N−(u)} R(v) / |N+(v)|        (4)

where c is chosen such that Σ_{u ∈ P} R(u) = 1.

It is recommended that E be chosen such that Σ_{u ∈ P} E(u) = 0.15. The first term cE(u) in Eq. (4) gives rank value based on initial rank. The second term c Σ_{v ∈ N−(u)} R(v) / |N+(v)| gives rank value as a function of the hyperlinks pointing at u.

According to Definition 4 above, R(u) ∈ [0, 1], but the PageRank values that Google provides to the public are scaled to the range [0, 10] in increments of 0.25. We will denote the public PageRank of a page u as PR(u). This public PageRank measure can be viewed for any web page using Google's toolbar, which is a plug-in to the MS Explorer browser. Although Google does not specify exactly how the public PageRank is computed, the source rank vector E can be defined over the root web pages of all domains, weighted by the cost of buying each domain name. Assuming that the only way to improve a page's PageRank is to buy domain names, Clausen shows that there is a lower bound to the cost of obtaining an arbitrarily good PR.

Without specifying many details, Google states that the PageRank algorithm it uses also takes other elements into account, with the purpose of making it difficult or expensive to deliberately influence PageRank.

In order to provide a semantic interpretation of a PageRank value, a hyperlink can be seen as a positive referral of the page it points to. Negative referrals do not exist in PageRank, so it is impossible to blacklist web pages with the PageRank algorithm of Eq. (4) alone. Before Google with its PageRank algorithm entered the search engine market, some webmasters would promote web sites in a spam-like
fashion by filling web pages with large amounts of commonly used search keywords as invisible text or as metadata, in order for the page to have a high probability of being picked up by a search engine no matter what the user searched for. Although this still can occur, PageRank seems to have reduced that problem, because a high PR is also needed in addition to matching keywords in order for a page to be presented to the user.

PageRank applies the principle of trust transitivity to the extreme, because rank values can flow through looped or arbitrarily long hyperlink chains. Some theoretic models, including [36,39,73], also allow looped and/or infinite transitivity.

9.6. Supplier reputation systems

Many suppliers and subcontractors have established a web presence in order to get a broader and more global exposure to potential contract partners. However, as described in Section 1, the problem of information asymmetry and uncertainty about supplier reliability can make it risky to establish supply chain and subcontract agreements online. Reputation systems have the potential to alleviate this problem by providing the basis for making more informed decisions and commitments about suppliers and subcontractors.

Open Ratings (http://openratings.com/) is a company that sells Past Performance reports about supply chain subcontractors based on ratings provided by past contract partners. Ratings are provided on a 1–100 scale on the following 9 aspects: Reliability, Cost, Order Accuracy, Delivery/Timeliness, Quality, Business Relations, Personnel, Customer Support, and Responsiveness, and a supplier's score is computed as a function of recently received ratings. The reports also contain the number and business categories of contract partners that provided the ratings.

9.7. Scientometrics

Scientometrics is the study of measuring research output and impacts thereof based on the scientific literature. Scientific papers cite each other, and each citation can be seen as a referral of other scientific papers, their authors and the journals where the papers are published. The basic principle for ranking scientific papers is to simply count the number of times each scientific paper has been cited by another paper, and rank them accordingly. Journals can be ranked in a similar fashion by summing up citations of all articles published in each journal and ranking the journals accordingly. Similarly to Google's PageRank algorithm, only positive referrals are possible with cross citations. This means that papers that, for example, are known to be plagiarisms or to contain falsified results cannot easily be sanctioned with scientometric methods.

As pointed out by Makino, even though scientometrics normally provides reasonable indicators of quality and reputation, it can sometimes give misleading results.

There is an obvious similarity between hyperlinked web pages and literature cross references, and it would be interesting to apply the concepts of PageRank to scientific cross citations in order to derive a new way of ranking authors and journals. We do not know of any attempt in this direction.

10. Problems and proposed solutions

Numerous problems exist in all practical and academic reputation systems. This section describes problems that have been identified and some proposed solutions.

10.1. Low incentive for providing rating

Ratings are typically provided after a transaction has taken place, and the transaction partners usually have no direct incentive for providing ratings about the other party. For example, when the service provider's capacity is limited, participants may not want to share the resource with others and therefore do not want to give referrals about it. Another example is when buyers withhold negative ratings because they are "nice" or because they fear retaliation from the seller. Even without any of these specific motives, a rater does not benefit directly from providing a rating. It serves the community to provide ratings, and the potential for free-riding (i.e. letting the others provide the ratings) therefore exists.
Despite this fact, many do provide ratings. In their study, Resnick and Zeckhauser found that 60.7% of the buyers and 51.7% of the sellers on eBay provided ratings about each other. A possible explanation for these relatively high values is, for example, that providing reciprocal ratings simply is an expression of politeness. However, lack of incentives for providing ratings is a general problem that needs special attention and that might require specific incentive mechanisms.

Miller et al. have proposed a scheme for eliciting honest feedback based on financial rewards. Jurca and Faltings have proposed a similar incentive scheme for providing truthful ratings based on payments.

10.2. Bias toward positive rating

There is often a positive bias when ratings are provided. In Resnick and Zeckhauser, it was found that only 0.6% of all the ratings provided by buyers and only 1.6% of all the ratings provided by sellers were negative, which seems too low to reflect reality. Possible explanations for the positive rating bias are that a positive rating simply represents an exchange of courtesies (Resnick and Zeckhauser 2002), that positive ratings are given in the hope of getting a positive rating in return (Chen and Singh 2001), or alternatively that negative ratings are avoided because of fear of retaliation from the other party (Resnick and Zeckhauser 2002). After all, nobody is likely to be offended by an unfairly positive rating, but badmouthing and unfairly negative ratings certainly have the potential of provoking retaliation or even a lawsuit.

An obvious method for avoiding positive bias can consist of providing anonymous reviews. A cryptographic scheme for anonymous ratings is proposed by Ismail et al.

10.3. Unfair ratings

Finding ways to avoid or reduce the influence of unfairly positive or unfairly negative ratings is a fundamental problem in reputation systems where ratings from others are taken into account. This is because the relying party cannot control the sincerity of the ratings when they are provided on a subjective basis. Authors proposing methods to counter this problem include [2,6,7,11,13–15,47,64,58,68,69,71]. The methods of avoiding bias from unfair ratings can broadly be grouped into the two categories described below.

10.3.1. Endogenous discounting of unfair ratings

This category covers methods that exclude or give low weight to presumed unfair ratings based on analysing and comparing the rating values themselves. The assumption is that unfair ratings can be recognised by their statistical properties.

Dellarocas and Whitby et al. have proposed two different schemes for detecting and excluding ratings that are likely to be unfair when judged by statistical analysis. Chen and Singh have proposed a scheme that uses elements from collaborative filtering for grouping raters according to the ratings they give to the same objects.

10.3.2. Exogenous discounting of unfair ratings

This category covers methods where the externally determined reputation of the rater is used to determine the weight given to ratings. The assumption is that raters with low reputation are likely to give unfair ratings, and vice versa.

Private information, e.g. resulting from personal experience, is normally considered more reliable than public information such as ratings from third parties. If the relying party has private information, then this information can be compared to public information in order to give an indication of the reliability of the public information.

Buchegger and Le Boudec have proposed a scheme based on a Bayesian reputation engine and a deviation test that is used to classify raters as trustworthy or not trustworthy. Cornelli et al. have described a reputation scheme to be used on top of the Gnutella (http://www.gnutella.com) P2P network. Ekstrom and Bjornson have proposed a scheme and built a prototype called TrustBilder for rating subcontractors in the Architecture Engineering Construction (AEC) industry. Yu and Singh have proposed to use a variant of the Weighted Majority Algorithm to determine the weights given to each rater.
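The general idea of exogenous discounting can be sketched as follows: aggregate a subject's score as a weighted average of ratings, with each rating weighted by its rater's reputation. This is our own minimal illustration of the idea, not the algorithm of any of the schemes cited above.

```python
# Minimal sketch of exogenous discounting: each rating is weighted by the
# rater's own reputation in [0, 1], so ratings from disreputable raters
# barely move the aggregated score. Illustrative only; not a cited scheme.

def discounted_score(ratings: list[tuple[float, float]]) -> float:
    """ratings: list of (rating_value, rater_reputation) pairs,
    with rating values in [0, 1] and reputations in [0, 1]."""
    total_weight = sum(rep for _, rep in ratings)
    if total_weight == 0:
        return 0.5  # no trusted evidence: fall back to a neutral prior
    return sum(value * rep for value, rep in ratings) / total_weight

# An unfair rating from a low-reputation rater has little effect:
fair = [(1.0, 0.9), (0.9, 0.8)]    # two reputable raters
attack = fair + [(0.0, 0.05)]      # one unfair rating, rater reputation 0.05
assert discounted_score(attack) > 0.9
```

The same structure also shows why discrimination (Section 10.6) is hard: if the victim's reputation as a rater is unfairly driven down, the weight of his truthful ratings shrinks accordingly.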
10.4. Change of identities

Reputation systems are based on the assumption that identities and pseudonyms are long lived, allowing ratings about a particular party from the past to be related to the same party in the future. In case a party has suffered significant loss of reputation, it might be in his interest to change identity or pseudonym in order to cut with the past and start afresh. However, this practice is not in the general interest of the community and should be prevented or discouraged. Authors proposing methods to counter this practice include Zacharia, Moukas and Maes. Their reputation scheme, which we call the ZMM-scheme, was used in the 1996–1999 MIT based Kasbah multiagent C2C transaction system. Upon completion of a transaction, both parties were able to rate how well the other party behaved. The Kasbah agents used the resulting reputation score when negotiating future transactions. A main goal in the design of the ZMM-scheme was to discourage users from changing identities, and the ZMM-scheme was deliberately designed to penalise newcomers. This approach has the disadvantage that it can be difficult to distinguish between good and bad newcomers.

10.5. Quality variations over time

Economic theory indicates that there is a balance between the cost of establishing a good reputation and the financial benefit of having a good reputation, leading to an equilibrium [37,62]. Variations in the quality of services or goods can be a result of deliberate management decisions or uncontrolled factors, and whatever the cause, the changes in quality will necessarily lead to variations in reputation. Although a theoretic equilibrium exists, there will always be fluctuations, and it is possible to characterise the conditions under which oscillations can be avoided or converge towards the equilibrium. In particular, discounting of the past is shown to be a condition for convergence towards an equilibrium. Discounting of the past can be implemented in various ways, and authors use different names to describe what is basically the same thing. Past ratings can be discounted by a forgetting factor, aging factor or fading factor. Inversely, a longevity factor can be used to determine a rating's time to live. Yet another way to describe it is by reinforcement learning. The discounting of the past can be a function of time or of the frequency of transactions, or a combination of both.

10.6. Discrimination

Discriminatory behaviour can occur both when providing services and when providing ratings. A seller can for example provide good quality to all buyers except one single buyer. Ratings about that particular seller will indicate that he is trustworthy, except for the ratings from the buyer victim. The filtering techniques described in Section 10.3.1 will give false positives, i.e. judge the buyer victim to be unfair, in such situations. Only systems that are able to recognise the buyer victim as trustworthy, and thereby give weight to his ratings, would be able to handle this situation well. Some of the techniques described in Section 10.3.2 would theoretically be able to protect against this type of discrimination, but no simulations have been done to prove this.

Discrimination can also take the form of a single rater giving fair ratings except when dealing with a specific partner. The filtering techniques described in Sections 10.3.1 and 10.3.2 are designed to handle this type of discrimination.

10.7. Ballot box stuffing

Ballot stuffing means that more than the legitimate number of ratings is provided. This problem is closely related to unfair ratings, because ballot stuffing usually consists of too many unfair ratings. In traditional voting schemes, such as political elections, ballot stuffing means that too many votes are cast in favour of a candidate, but in online reputation systems, ballot stuffing can also happen with negative votes. This is a common problem in many of the online reputation systems described in Section 9, and they usually have poor protection against it. Among the commercial and live reputation systems, eBay's Feedback forum seems to provide adequate protection against ballot stuffing, because ratings can only be provided after transaction completion. Because eBay charges a fee for each transaction, ballot stuffing would be expensive. Epinions' and Slashdot's reputation systems also provide some degree of protection, because only registered
members can vote in a controlled way on the merit of reviews and comments.

11. Discussion and conclusion

The purpose of this work has been to describe and analyse the state of the art in trust and reputation systems. Dingledine et al. have proposed the following set of basic criteria for judging the quality and soundness of reputation computation engines.

(1) Accuracy for long-term performance. The system must reflect the confidence of a given score. It must also have the capability to distinguish between a new entity of unknown quality and an entity with poor long-term performance.
(2) Weighting toward current behaviour. The system must recognise and reflect recent trends in entity performance. For example, an entity that has behaved well for a long time but suddenly goes downhill should be quickly recognised as untrustworthy.
(3) Robustness against attacks. The system should resist attempts of entities to manipulate reputation scores.
(4) Smoothness. Adding any single rating should not influence the score significantly.

The criteria (1), (2) and (4) are easily satisfied by most reputation engines, except for the most primitive, such as taking a rating score as the sum of positive minus negative ratings, as in eBay's feedback forum. Criterion (3), on the other hand, will probably never be satisfied completely, because there will always be new and unforeseen attacks for which solutions will have to be found.

The problems of unfair ratings and ballot stuffing are probably the hardest to solve in any reputation system that is based on subjective ratings from participants, and a large number of researchers in the academic community are working on this. Instead of one solution that works well in all situations, there will be multiple techniques with advantages, disadvantages and trade-offs. Lack of incentives to provide ratings is also a fundamental problem, because there often is no rational reason for providing feedback. Among the commercial and online reputation systems that take ratings from users into account, financial incentives are only provided by Epinions (hard cash) and BizRate (price discounts); all the other web sites only provide immaterial incentives in the form of status or rank.

Given that reputation systems used in commercial and online applications have serious vulnerabilities, it is obvious that the reliability of these systems sometimes is questionable. Assuming that reputation systems give unreliable scores, why then are they used? A possible answer to this question is that in many situations the reputation systems do not need to be robust because their value lies elsewhere. Resnick and Zeckhauser consider two explanations in relation to eBay's reputation system: (a) even though a reputation system is not robust, it might serve its purpose of providing an incentive for good behaviour if the participants think it works, and (b) even though the system might not work well in the statistical normative sense, it may function successfully if it swiftly reacts against bad behaviour (called "stoning") and if it imposes costs for a participant to get established (called "label initiation dues").

Given that some online reputation systems are far from being robust, it is obvious that the organisations that run them have a business model that is relatively insensitive to their robustness. It might be that the reputation system serves as a kind of social network to attract more people to a web site, and if that is the case, then having simple rules for participating is more important than having strict rules for controlling participants' behaviour. Any reputation system with user participation will depend on how people respond to it, and must therefore be designed with that in mind. Another explanation is that, from a business perspective, having a reputation system that is not robust can be desirable if it generally gives a positive bias. After all, commercial web stores are in the business of selling, and positively biased ratings are more likely to promote sales than negative ratings.

Whenever the robustness of a reputation system is crucial, the organisation that runs it should take measures to protect the stability of the system and its robustness against attacks. This can for example be done by including routine manual control as part of the scheme, such as in Epinions' case when selecting Category Lead reviewers, or in Slashdot's case where Slashdot staff are omnipotent moderators. Ex-
ceptional manual control will probably always be References
needed, should the system come under heavy attack.
Another important element is to keep the exact details  A. Abdul-Rahman, S. Hailes, Supporting trust in virtual com-
of the computation algorithm and how the system is munities, Proceedings of the Hawaii International Conference
on System Sciences, Maui, Hawaii, 4–7 January 2000, 2000.
implemented confidential (called bsecurity by obscu-
 K. Aberer, Z. Despotovic, Managing trust in a peer-2-peer
rity), such as in the case of Epinions, Slashdot and information system, in: Henrique Paques, Ling Liu, David
Google. Ratings are usually based on subjective Grossman (Eds.), Proceedings of the Tenth International Con-
judgement, which opens up the Pandora’s box of ference on Information and Knowledge Management
unfair ratings, but if ratings can be based on objective (CIKM01), ACM Press, 2001, pp. 10 – 317.
 M.D. Abrams, Trusted system concepts, Computers and Se-
criteria it would be much simpler to achieve high
curity 14 (1) (1995) 45 – 56.
robustness.  E. Adar, B.A. Huberman, Free riding on Gnutella, First Mon-
The trust and reputation schemes presented in this day (Peer-reviewed Journal on the Internet) 5 (10) (2000
study cover a wide range of application and are based (October)) 8.
on many different types mechanisms, and there is no  S. Boeyen, et al., Liberty trust models guidelines, in: J. Linn
(Eds.), Liberty Alliance Project. Liberty Alliance, Draft Ver-
single solution that will be suitable in all contexts and
sion 1.0–15 Edition, 25 July, 2003.
applications. When designing or implementing new  S. Braynov, T. Sandholm, Incentive compatible mechanism for
systems, it is necessary to consider the constraints and trust revelation, Proceedings of the First Int. Joint Conference
the type of information that can be used as input on Autonomous Agents and Multiagent Systems (AAMAS),
ratings. July, 2002.
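As a purely illustrative sketch (not a mechanism from any of the surveyed systems), the interplay between input ratings and robustness can be seen in a toy scorer that discounts each rating by an assumed rater-credibility weight, so that a positively biased rater with low credibility has little influence. The function name, the (value, credibility) input format, and all numbers are hypothetical.

```python
def reputation_score(ratings, _unused=None):
    """Toy reputation score: credibility-weighted average of ratings.

    `ratings` is a list of (value, credibility) pairs, where `value` is a
    rating in [0, 1] and `credibility` in [0, 1] reflects how much the
    rater's judgement is trusted. Real systems would derive credibility
    from rating histories or social networks; here it is simply given.
    """
    if not ratings:
        return None  # no evidence: distinguish "unknown" from "badly rated"
    weighted = sum(value * cred for value, cred in ratings)
    total = sum(cred for _, cred in ratings)
    if total == 0:
        return None  # only zero-credibility raters: still no evidence
    return weighted / total

# Two credible raters give low ratings; a low-credibility rater then
# injects a maximal rating, which only slightly raises the score.
honest = [(0.2, 0.9), (0.3, 0.8)]
print(reputation_score(honest))
print(reputation_score(honest + [(1.0, 0.1)]))
```

Returning `None` when there is no evidence keeps newcomers distinguishable from badly rated parties, a distinction the schemes surveyed above must also make in one form or another.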
The rich literature growing around trust and reputation systems for Internet transactions, as well as the implementation of reputation systems in successful commercial applications, gives a strong indication that this is an important technology. The commercial and live implementations seem to have settled around relatively simple schemes, whereas a multitude of different systems with advanced features are being proposed by the academic community. A general observation is that the proposals from the academic community so far lack coherence. The systems being proposed are usually designed from scratch, and only in very few cases do authors build on proposals by other authors. The period we are in can therefore be seen as a period of pioneers, and we hope that the near future will bring consolidation around a set of sound and well-recognised principles for building trust and reputation systems, and that these will find their way into practical and commercial applications.

Acknowledgement

The work reported in this paper has been funded in part by the Cooperative Research Centre for Enterprise Distributed Systems Technology (DSTC) through the Australian Federal Government's CRC Program (Department of Education, Science, and Training).
Audun Jøsang is the research leader of the Security Unit at the Distributed Systems Technology Centre in Brisbane. His research focuses on trust and reputation systems in addition to information security. Audun received his PhD from the Norwegian University of Science and Technology in 1998, and has an MSc in Information Security from Royal Holloway College, University of London, and a BSc in Telematics from the Norwegian Institute of Technology.

Roslan Ismail is a senior lecturer at the Malaysian National Tenaga University and a PhD student at the Information Security Research Centre at Queensland University of Technology. His research interests are in computer security, reputation systems, e-commerce security, forensic security, security in mobile agents and trust management in general. He has an MSc in Computer Science from The Malaysian University of Technology, and a BSc from Pertanian University.
Colin Boyd is an Associate Professor at Queensland University of Technology and Deputy Director of the Information Security Research Centre there. His research interests are in the theory and applications of cryptography. He has authored over 100 fully refereed publications including a recent book on protocols for authentication and key establishment. Colin received the B.Sc. and Ph.D. degrees in mathematics from the University of Warwick in 1981 and