Rating System Algorithms Document

A case study on Star Ranking Algorithm
Department of Computer Applications, Sikkim University
Chapter 1: INTRODUCTION
Consumer-generated ratings are now an essential part of many platforms. For instance,
users on Yelp and TripAdvisor rate restaurants, hotels, attractions, and various types of
service offerings. Every major online marketplace (eBay, Amazon, Flipkart) uses online ratings
as a way of recognizing good products and rewarding honest behaviour by vendors, because
buyers frequently look at reviews before buying a product or choosing a vendor.
With the rapid development of the World Wide Web, our lives rely more and more
on the Internet. Online systems allow a large number of users to interact with each other
and offer thousands of movies, millions of books and billions of web pages to
choose from. To approximately judge the quality of a certain object, a user can refer to the
historical ratings the object has received. The most straightforward way to rank objects is
by their average ratings (we refer to this as the mean method). However, such methods
are very sensitive to noisy information and manipulation. In these rating systems,
some users may give unreasonable ratings because they are not serious about the rating
or are simply not familiar with the related field. In addition, the system may contain
malicious spammers who deliberately give high ratings to low-quality
objects.
However, it is not possible to predict the true trustworthiness of users who have given only
a few ratings. For example, a user with only a few ratings, all of which are highly accurate,
may be a fraudulent shill account building an initial reputation, or may be a genuine user.
Similarly, the true quality of products that have received only a few ratings is
uncertain. We propose a Bayesian solution to address these cold start problems.
Additionally, the rating behaviour of users and products is often very indicative of their nature. For instance,
unusually rapid or regular behaviour has been associated with fraudulent entities, such as
fake accounts, sybils and bots. Similarly, unusually bursty ratings received by a product
may be indicative of fake reviews. Therefore, we propose a Bayesian technique to
incorporate users’ and products’ rating behaviour in the formulation, by penalizing
unusual behaviour.
Combining the network, cold start treatment and behavioural properties, this
paper presents the FairJudge formulation and an iterative algorithm to find the fairness,
goodness and reliability scores of all entities together, and then compares products
with one another using a modified version of the Elo algorithm. The FairJudge paper [1] proves that
FairJudge has linear time complexity and is guaranteed to converge in a bounded
number of iterations.
Overall, the paper makes the following contributions:
• Algorithm: We propose three novel metrics called fairness, goodness and reliability to
rank users, products and ratings, respectively. The paper proposes Bayesian approaches to
address cold start problems and incorporate behavioural properties. The paper proposes
the FairJudge algorithm to iteratively compute these metrics, and takes the output of
FairJudge as the input to a modified Elo algorithm for comparing two products
(as the final result contributes to the market value of companies).
• Effectiveness: The combined algorithm suggested by this paper has not yet been
implemented in a real-world scenario, so its effectiveness is still unknown.
1.1 OBJECTIVE:
The objectives of this research are:
• To study the rating systems currently used by established companies in the
online market for rating products.
• To study how rating systems are implemented in various sub-domains
of the market.
• To understand the pros and cons companies face when applying
a rating system.
• To overcome the cons (flaws) of the rating system.

Chapter 2: LITERATURE SURVEY
2.1 Iterative ranking algorithm with reputation redistribution:
This method eliminates noisy information during the iterations, so as to improve the accuracy of
the object quality ranking. The rating system can be naturally described by a weighted bipartite
network. The users are denoted by the set U and the objects (e.g. books, movies or others)
by the set O. To better distinguish the different types of nodes in the bipartite network, we use Latin
letters for users and Greek letters for objects. The rating given by user i to object α is the
weight of the link, denoted by $r_{i\alpha}$. The degrees of users and objects are $k_i$ and $k_\alpha$, respectively.
Moreover, we define the set of objects selected by user i as $O_i$ and the set of users selecting
object α as $U_\alpha$. We use $Q_\alpha$ and $R_i$ to denote the quality of object α and the reputation of user i,
respectively. The initial reputation of each user is set to $R_i = k_i/M$ (where M is the number
of objects). The quality of an object depends on users’ ratings and is calculated as the
reputation-weighted average of the ratings given to this object. Mathematically, it reads
$$Q_\alpha = \frac{\sum_{i \in U_\alpha} R_i \, r_{i\alpha}}{\sum_{i \in U_\alpha} R_i} \tag{1.1}$$
In each iteration, both $Q_\alpha$ and $R_i$ are updated. To calculate the reputation $R_i$ of user i at a
certain step, we first calculate the Pearson correlation coefficient between the rating vector of
user i and the quality vector of the corresponding objects, as the temporal reputation ($TR_i$):
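$$TR_i = \frac{1}{k_i} \sum_{\alpha \in O_i} \left(\frac{r_{i\alpha} - \bar{r}_i}{\sigma_{r_i}}\right)\left(\frac{Q_\alpha - \bar{Q}_i}{\sigma_{Q_i}}\right), \qquad R_i = TR_i^{\theta} \, \frac{\sum_j TR_j}{\sum_j TR_j^{\theta}} \tag{1.2}$$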
where θ is a tunable parameter. The method reduces to the mean method and the CR method when
θ = 0 and θ = 1, respectively. The obtained $R_i$ is then used as the reputation of user i to
calculate the quality of objects in Eq. 1.1. With this reputation redistribution process, the reputation of users
with high $TR_i$ is amplified, and vice versa. By reducing the weight of users with low
$TR_i$, we can eliminate noisy information in the iterative process. This effect
accumulates in each iterative step, and finally leads to a large improvement in the accuracy
of object quality estimation. The basic idea of the reputation redistribution process is similar to the
well-known k-nearest neighbours (KNN) algorithm, which eliminates noise by entirely
dropping the information of nodes outside the k nearest neighbours. The KNN algorithm is widely
used in recommender systems. Here, we design a smooth way to apply the idea to object
quality ranking. Though the modification of the method seems small, the improvement
is substantial (see the following analysis). Users’ reputation and objects’ quality will be
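updated in each step. The iteration stops when the change in quality between two successive steps,
$$|Q - Q'| = \frac{1}{M} \sum_{\alpha \in O} \left(Q_\alpha - Q'_\alpha\right)^2 \tag{1.3}$$
falls below a small threshold, where Q' is the quality from the previous step.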
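The iteration described in this section can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the redistribution rule used here (R_i proportional to TR_i raised to the power θ, rescaled so that the total reputation equals the sum of the TR values), the clipping of negative correlations to zero, the stopping tolerance and the toy data are all assumptions made for the sketch, and the function names `pearson` and `iterative_ranking` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

def iterative_ranking(ratings, theta=1.0, tol=1e-4, max_iter=100):
    """ratings maps (user, obj) -> rating; returns (quality, reputation) dicts."""
    users = sorted({u for u, _ in ratings})
    objects = sorted({o for _, o in ratings})
    by_user = {u: [(o, r) for (v, o), r in ratings.items() if v == u] for u in users}
    by_obj = {o: [(u, r) for (u, p), r in ratings.items() if p == o] for o in objects}
    m = len(objects)
    rep = {u: len(by_user[u]) / m for u in users}  # initial R_i = k_i / M
    quality = {o: 0.0 for o in objects}
    for _ in range(max_iter):
        new_quality = {}
        for o in objects:
            weight = sum(rep[u] for u, _ in by_obj[o])
            if weight > 0:
                new_quality[o] = sum(rep[u] * r for u, r in by_obj[o]) / weight
            else:  # every rater has zero reputation: fall back to the plain mean
                new_quality[o] = sum(r for _, r in by_obj[o]) / len(by_obj[o])
        change = sum((new_quality[o] - quality[o]) ** 2 for o in objects) / m
        quality = new_quality
        if change < tol:
            break
        # Temporal reputation: correlation between a user's ratings and object
        # quality, with negative correlations clipped to 0 (assumption).
        tr = {u: max(0.0, pearson([r for _, r in by_user[u]],
                                  [quality[o] for o, _ in by_user[u]]))
              for u in users}
        norm = sum(t ** theta for t in tr.values())
        total = sum(tr.values())
        rep = {u: (tr[u] ** theta) * total / norm if norm > 0 else 0.0 for u in users}
    return quality, rep

# Hypothetical toy data on a 1-5 scale; user "s" rates against the consensus.
toy = {("a", "x"): 5, ("a", "y"): 3, ("a", "z"): 1,
       ("b", "x"): 5, ("b", "y"): 4, ("b", "z"): 1,
       ("c", "x"): 4, ("c", "y"): 3, ("c", "z"): 2,
       ("s", "x"): 1, ("s", "y"): 3, ("s", "z"): 5}
quality, reputation = iterative_ranking(toy)
```

On this toy data the user who rates against the consensus ends with zero reputation, so the object qualities are driven by the three consistent users only.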
2.2 Fair Judgment algorithm:
In this section, we present the FairJudge algorithm [1], which jointly models the rating network
and behavioural properties.
We first present the three novel metrics (Fairness, Reliability and Goodness), which
measure intrinsic properties of users, ratings and products, respectively, building on our prior
work. We show how to incorporate Bayesian priors to address user and product cold start
problems, and how to incorporate behavioural features of the users and products. We then
prove that our algorithm has several desirable theoretical guarantees.
Prerequisites. We model the rating network as a bipartite network where user u gives a rating (u,p) to product p.
Let the rating score be represented as score(u,p).
Let U, R and P represent the set of all users, ratings and products, respectively, in a given
bipartite network. We assume that all rating scores are scaled to be between −1 and +1, i.e.
score(u,p) ∈ [−1,1] ∀(u,p) ∈ R. Let Out(u) be the set of ratings given by user u and In(p) be
the set of ratings received by product p, so that |Out(u)| and |In(p)| represent their respective counts.
2.2.1 Fairness, Goodness and Reliability:
Users, ratings and products have the following characteristics:
• Users vary in fairness. Fair users rate products without bias, i.e. they give high scores to high
quality products, and low scores to bad products. On the other hand, users who frequently
deviate from the above behavior are ‘unfair’. For example, fraudulent users often create
multiple accounts to boost ratings of unpopular products and bad-mouth good products of
their competitors.

• Products vary in terms of their quality, which we measure by a metric called goodness. The
quality of a product determines how it would be rated by a fair user. Intuitively, a good product
would get several high positive ratings from fair users, and a bad product would receive high
negative ratings from fair users. The goodness G(p) of a product p ranges from −1 (a very low
quality product) to +1 (a very high quality product) ∀p ∈P.
• Finally, ratings vary in terms of reliability. This measure reflects how trustworthy the
specific rating is. The reliability R(u,p) of a rating (u,p) ranges from 0 (an untrustworthy
rating) to 1 (a trustworthy rating) ∀(u,p) ∈ R.
The reader may wonder: isn’t the rating
reliability identical to the user’s fairness? The answer is ‘no’. Consider Figure 1, which
shows the rating reliability distribution of the top 1000 fair and top 1000 unfair users in the
Flipkart network, as identified by the FairJudge algorithm (explained later). Notice that, while
most ratings by fair users have high reliability, some of their ratings have low reliability,
indicating personal opinions that disagree with the majority (see green arrow). Conversely, unfair
users give some high-reliability ratings (red arrow), probably to camouflage themselves as fair
users. Thus, having reliability as a rating-specific metric allows us to more accurately
characterize this distribution.
Figure 1: Reliability of user ratings.
While most ratings of fair users have high reliability, some have low reliability
(green arrow). Conversely, unfair users also give some high-reliability ratings (red arrow),
but most of their ratings have low reliability.

Given a bipartite user-product graph, we do not know the values of these three metrics for any
user, product or rating. Clearly, these scores are mutually interdependent, as the following example shows.
Figure 2: Toy example showing products (P1, P2, P3), users (UA, UB, UC, UD, UE and UF),
and rating scores provided by the users to the products.
In Figure 2, user UF always disagrees with the consensus, so UF is unfair.
Example 1 (Running Example). Figure 2 shows a simple example in which there are 3
products, P1 to P3, and 6 users, UA to UF. Each review is denoted as an edge from a user to
a product, with a rating score between −1 and +1. Note that this is a rescaled version of the
traditional 5-star rating scale, where 1, 2, 3, 4 and 5 stars correspond to −1, −0.5, 0, +0.5
and +1, respectively.
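The rescaling from stars to scores is linear, so it can be written as a one-line helper. The sketch below is only an illustration; the function name `star_to_score` is hypothetical.

```python
def star_to_score(stars):
    """Map a 1-5 star rating linearly onto the [-1, +1] score range."""
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")
    return (stars - 3) / 2.0

scores = [star_to_score(s) for s in range(1, 6)]
```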
One can immediately see that UF’s ratings are inconsistent with those of UA, UB, UC, UD
and UE. UF gives poor ratings to P1 and P2, which the others all agree are very good.
UF also gives a high rating to P3, which the others all agree is very bad. We will
use this example to motivate our formal definitions below.
Fairness of users: Intuitively, fair users are those who give reliable ratings, and unfair users
mostly give unreliable ratings. So we simply define a user’s fairness score to be the average
reliability score of their ratings:
$$F(u) = \frac{\sum_{(u,p) \in \mathrm{Out}(u)} R(u,p)}{|\mathrm{Out}(u)|} \tag{2.1}$$
Goodness of products: When a product receives rating scores via ratings with different
reliability, clearly, more importance should be given to ratings that have higher reliability.
Hence, to estimate a product’s goodness, we weight the ratings by their reliability scores,
giving higher weight to reliable ratings and little importance to low-reliability ratings:
$$G(p) = \frac{\sum_{(u,p) \in \mathrm{In}(p)} R(u,p) \cdot \mathrm{score}(u,p)}{|\mathrm{In}(p)|} \tag{2.2}$$
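Returning to our running example, the ratings given by users UA, UB, UC, UD, UE and UF
to product P1 are +1, +1, +1, +1, +1 and −1, respectively. So we have:
$$G(P1) = \frac{1}{6}\Big(R(UA,P1) + R(UB,P1) + R(UC,P1) + R(UD,P1) + R(UE,P1) - R(UF,P1)\Big) \tag{2.2.1}$$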
Reliability of ratings: A rating (u,p) should be considered reliable if (i) it is given by a
generally fair user u, and (ii) its score is close to the goodness score of product p. The first
condition makes sure that ratings by fair users are trusted more, in light of the user's established
reputation. The second condition leverages the ‘wisdom of crowds’, by making sure that ratings
that deviate from p's goodness have low reliability. This deviation is measured as the
normalized absolute difference, |score(u,p) − G(p)|/2. Together:
$$R(u,p) = \frac{1}{2}\left(F(u) + \left(1 - \frac{|\mathrm{score}(u,p) - G(p)|}{2}\right)\right) \tag{2.3}$$
In our running example, for the rating by user UF to P1:
$$R(UF,P1) = \frac{1}{2}\left(F(UF) + \left(1 - \frac{|-1 - G(P1)|}{2}\right)\right) \tag{2.3.1}$$
Similar equations can be associated with every edge (rating) in the graph of Figure 2.
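To make the three definitions concrete, the sketch below evaluates them once for product P1 of the running example, starting from all fairness and reliability scores equal to 1. It assumes that fairness is the mean reliability of a user's ratings, that goodness is the reliability-weighted mean score, and that reliability is the equal-weight average of the user's fairness and one minus the normalized deviation; the equal weighting and the function names are assumptions of this sketch.

```python
def goodness(rated):
    """rated: list of (reliability, score) pairs received by one product."""
    return sum(r * s for r, s in rated) / len(rated)

def reliability(user_fairness, score, product_goodness):
    """Average of the user's fairness and the rating's closeness to the goodness."""
    deviation = abs(score - product_goodness) / 2.0  # normalized to [0, 1]
    return 0.5 * (user_fairness + (1.0 - deviation))

def fairness(reliabilities):
    """Mean reliability of the ratings given by one user."""
    return sum(reliabilities) / len(reliabilities)

# Ratings received by P1 in the running example.
p1_scores = {"UA": 1, "UB": 1, "UC": 1, "UD": 1, "UE": 1, "UF": -1}

# First pass: every reliability and fairness score starts at 1.
g_p1 = goodness([(1.0, s) for s in p1_scores.values()])  # 4/6
r_ua = reliability(1.0, p1_scores["UA"], g_p1)
r_uf = reliability(1.0, p1_scores["UF"], g_p1)
```

After this single pass the rating by UF is already less reliable than the rating by UA, because UF's score lies further from the goodness of P1.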

Figure 3: Formula for user fairness, product goodness and rating reliability.
Figure 3 shows the set of mutually recursive definitions of fairness, reliability and goodness for
the FairJudge algorithm. The yellow shaded part addresses the cold start problems
and the gray shaded part incorporates the behavioural properties.
2.3 Cold Start Problem:
If a user u has given only a few ratings, we have very little information about his true behavior.
Say all of u’s ratings are very accurate – it is hard to tell whether it is a fraudster that is
camouflaging and building reputation by giving genuine ratings, or it is actually a benign user.
Conversely, if all of u’s ratings are very deviant, it is hard to tell whether the user is a
fraudulent shill account or simply a normal user whose rating behavior is unusual at first but
stabilizes in the long run. Due to the lack of sufficient information about the ratings given by
the user, little can be said about his fairness. Similarly, for products that have only been rated
a few times, it is hard to accurately determine their true quality, as they may be targets of
fraud. This uncertainty due to insufficient information about less active users and products is the
cold start problem. It is addressed by assigning a Bayesian prior to each user’s fairness
score as follows:
$$F(u) = \frac{0.5 \, \alpha_1 + \sum_{(u,p) \in \mathrm{Out}(u)} R(u,p)}{\alpha_1 + |\mathrm{Out}(u)|} \tag{3.1}$$
Here, α1 is a non-negative integer constant that sets the relative importance of the prior
compared to the rating reliability: the lower (higher) the value of α1, the more (less) the
fairness score depends on the reliability of the ratings. The 0.5 score is the default global prior
belief of every user’s fairness, which is the midpoint of the fairness range [0,1]. If a user gives
only a few ratings, then the fairness
score of the user is close to the default score 0.5. The more ratings the user gives,
the more the fairness score moves towards the user’s rating reliabilities. This way, shills with
few ratings have little effect on product scores. Similarly, the Bayesian prior in a product’s
goodness score is incorporated as:
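$$G(p) = \frac{\sum_{(u,p) \in \mathrm{In}(p)} R(u,p) \cdot \mathrm{score}(u,p)}{\alpha_2 + |\mathrm{In}(p)|} \tag{3.2}$$
Here the prior goodness is 0, the midpoint of the goodness range [−1,1], so it contributes nothing to the numerator.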
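The effect of the priors is easiest to see numerically. The sketch below assumes a prior fairness of 0.5 weighted by a constant alpha1, and a prior goodness of 0 (taken here as the midpoint of the [−1, 1] range) weighted by alpha2; the function names and the example values are illustrative only.

```python
def fairness_with_prior(reliabilities, alpha1, prior=0.5):
    """Fairness pulled towards `prior`; alpha1 acts like alpha1 pseudo-ratings."""
    if alpha1 + len(reliabilities) == 0:
        return prior  # no evidence and no prior weight: fall back to the prior
    return (prior * alpha1 + sum(reliabilities)) / (alpha1 + len(reliabilities))

def goodness_with_prior(rated, alpha2):
    """rated: list of (reliability, score); a prior goodness of 0 is assumed."""
    if alpha2 + len(rated) == 0:
        return 0.0
    return sum(r * s for r, s in rated) / (alpha2 + len(rated))

new_user = fairness_with_prior([1.0, 1.0], alpha1=2)      # only two ratings
veteran = fairness_with_prior([1.0] * 50, alpha1=2)       # fifty ratings
silent = fairness_with_prior([], alpha1=2)                # no ratings yet
new_product = goodness_with_prior([(1.0, 1.0)], alpha2=2) # one perfect rating
```

With alpha1 = 2, two perfectly reliable ratings lift a new user only to 0.75, while fifty of them bring the score close to 1, which is the damping behaviour described above.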
2.4 Elo Ranking Algorithm:
The Elo rating algorithm is a widely used rating algorithm for ranking players in many
competitive games. A player with a higher Elo rating has a higher probability of winning a
game than a player with a lower Elo rating [5]. After each game, the Elo ratings of the players are
updated. If the player with the higher Elo rating wins, only a few points are transferred from the
lower-rated player. However, if the lower-rated player wins, then the points transferred from the
higher-rated player are far greater.
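$$E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}} \tag{4}$$
where $E_A$ is the expected score (winning probability) of a player rated $R_A$ against a player rated $R_B$.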
Approach:
P1: Probability of winning of the player with rating1.
P2: Probability of winning of the player with rating2.
P1 = 1.0 / (1.0 + pow(10, (rating2 - rating1) / 400));
P2 = 1.0 / (1.0 + pow(10, (rating1 - rating2) / 400));
Obviously, P1 + P2 = 1.
The rating of each player is updated using the formula given below:
rating1 = rating1 + K * (Actual Score - Expected Score);
In most games, the “Actual Score” is either 0 or 1, meaning the player either loses or wins, and
the “Expected Score” is the player's winning probability (P1 for the first player). K is
a constant. If K has a low value, the rating changes by a small amount, but if K has
a high value, the changes in the rating are significant. Different organizations set
different values of K.

Figure 4: Probability of losing or winning the game.
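Demonstration of the Elo rating algorithm with a small example:
Suppose there is a live match on chess.com between two players:
rating1 = 1200, rating2 = 1000;
P1 = 1.0 / (1.0 + pow(10, (1000 - 1200) / 400)) = 0.76
P2 = 1.0 / (1.0 + pow(10, (1200 - 1000) / 400)) = 0.24
Assume the constant K = 30.
Case 1: Suppose Player 1 wins:
rating1 = rating1 + K * (actual - expected) = 1200 + 30 * (1 - 0.76) = 1207.2;
rating2 = rating2 + K * (actual - expected) = 1000 + 30 * (0 - 0.24) = 992.8;
Case 2: Suppose Player 2 wins:
rating1 = rating1 + K * (actual - expected) = 1200 + 30 * (0 - 0.76) = 1177.2;
rating2 = rating2 + K * (actual - expected) = 1000 + 30 * (1 - 0.24) = 1022.8;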
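The Elo update for both outcomes of this match can be checked with a short Python sketch. The function names `expected_score` and `update_ratings` are illustrative, K = 30 is the value assumed in the example, and the results differ slightly from hand calculation because the probabilities are not rounded to two decimals.

```python
def expected_score(rating_a, rating_b):
    """Probability that the player rated rating_a beats the player rated rating_b."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_ratings(rating1, rating2, player1_won, k=30):
    """Return the new (rating1, rating2) after one game."""
    p1 = expected_score(rating1, rating2)
    p2 = expected_score(rating2, rating1)
    actual1 = 1.0 if player1_won else 0.0
    actual2 = 1.0 - actual1
    return rating1 + k * (actual1 - p1), rating2 + k * (actual2 - p2)

p1 = expected_score(1200, 1000)                            # about 0.76
case1 = update_ratings(1200, 1000, player1_won=True)       # favourite wins
case2 = update_ratings(1200, 1000, player1_won=False)      # underdog wins
```

The points gained by one player equal the points lost by the other, so the sum of the two ratings is unchanged by the update.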
We therefore modify the Elo rating algorithm, changing some
of its parameters and constants, in order to compare
products of different companies and help the market gauge the value of each product.
First, we treat products, instead of players, as the rated entities. If a user i purchases the same kind of
product twice, he/she is asked to mark one item as better than the other.
E.g. if user i buys a phone from the MI company and, after a certain period of time,
buys a phone from another company, he/she is asked to mark the better brand after
using the new phone for a certain period of time.

Chapter 3: METHODOLOGY
After the FairJudge algorithm has calculated the fairness of each user, taking into account
behaviour over time (using the IBIRDNEST factor), fair users are invited to mark
the companies' products (if they purchase a similar kind of product from the same platform,
such as Flipkart or Amazon), and based on their marks the market value of the products is
calculated using the custom Elo algorithm.
The steps of the FairJudge and Elo rating algorithms are given below, and Figure 5
shows how the two algorithms are merged to produce the output.
3.1 Combining the FairJudge and Elo rating algorithms:
• Initially, the average method is used to rate each product.
• FairJudge is used to separate fair users from unfair users.
• GLICKO/Elo rating is used to compare the reliability.
• The ranking is sorted.
Figure 5. Integration model of Algorithm.
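One possible reading of this integration is sketched below: pairwise "better product" marks are accepted only from users whose fairness score clears a threshold, each accepted mark is treated as one Elo game between the two products, and the products are then sorted by rating. The threshold `min_fairness`, the starting rating of 1000, K = 30, the function name and the sample data are all hypothetical choices for illustration; the document does not fix them.

```python
def rank_products(votes, fairness, min_fairness=0.7, k=30, start=1000.0):
    """votes: list of (user, preferred_product, other_product) marks."""
    ratings = {}
    for user, winner, loser in votes:
        if fairness.get(user, 0.0) < min_fairness:
            continue  # marks from users judged unfair are ignored
        rw = ratings.setdefault(winner, start)
        rl = ratings.setdefault(loser, start)
        expected_w = 1.0 / (1.0 + 10 ** ((rl - rw) / 400.0))
        gain = k * (1.0 - expected_w)
        ratings[winner] = rw + gain
        ratings[loser] = rl - gain
    # Highest rated product first.
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)

# Hypothetical fairness scores and marks.
fairness = {"u1": 0.9, "u2": 0.8, "shill": 0.2}
votes = [("u1", "A", "B"), ("u2", "A", "B"), ("shill", "B", "A")]
ranking = rank_products(votes, fairness)
unfiltered = rank_products(votes, fairness, min_fairness=0.0)
```

In this example the mark from the low-fairness account is dropped, so product A keeps a larger lead than it would if every mark were counted.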

3.2 ALGORITHMS:
3.2.1 FairJudge Algorithm
Steps of the FairJudge algorithm:
1: Input: Rating network (U, R, P), α1, α2, β1, β2
2: Output: Fairness, Reliability and Goodness scores, given α1, α2, β1 and β2
3: Initialize F^0(u) = 1, R^0(u,p) = 1 and G^0(p) = 1, ∀u ∈ U, (u,p) ∈ R, p ∈ P.
4: Calculate IBIRDNEST_IRTDU(u) ∀u ∈ U and IBIRDNEST_IRTDP(p) ∀p ∈ P.
5: t = 0
6: do
7: t = t + 1
8: Update goodness of products using the goodness formula of Figure 3, ∀p ∈ P
9: Update reliability of ratings using the reliability formula of Figure 3, ∀(u,p) ∈ R
10: Update fairness of users using the fairness formula of Figure 3, ∀u ∈ U
11: error = largest absolute change in any fairness, reliability or goodness score since iteration t − 1
12: while error > ε (a small convergence threshold)
13: Return F^(t+1)(u), R^(t+1)(u,p), G^(t+1)(p), ∀u ∈ U, (u,p) ∈ R, p ∈ P
14: Stop.
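The loop above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the behavioural IBIRDNEST terms (and hence β1, β2) are left out because they are not defined in this document, fairness uses a 0.5 prior weighted by alpha1, goodness uses a 0 prior weighted by alpha2, reliability is the equal-weight average of fairness and closeness to goodness, and the toy ratings are a hypothetical completion of the running example.

```python
def fairjudge(ratings, alpha1=1, alpha2=1, eps=1e-6, max_iter=100):
    """ratings maps (user, product) -> score in [-1, 1]; returns (F, G, R)."""
    out_edges, in_edges = {}, {}
    for (u, p) in ratings:
        out_edges.setdefault(u, []).append((u, p))
        in_edges.setdefault(p, []).append((u, p))
    F = {u: 1.0 for u in out_edges}   # fairness of users
    G = {p: 1.0 for p in in_edges}    # goodness of products
    R = {e: 1.0 for e in ratings}     # reliability of ratings
    for _ in range(max_iter):
        # Goodness: reliability-weighted scores, damped by the prior weight alpha2.
        new_G = {p: sum(R[e] * ratings[e] for e in edges) / (alpha2 + len(edges))
                 for p, edges in in_edges.items()}
        # Reliability: user fairness averaged with closeness to product goodness.
        new_R = {(u, p): 0.5 * (F[u] + 1.0 - abs(s - new_G[p]) / 2.0)
                 for (u, p), s in ratings.items()}
        # Fairness: mean reliability, pulled towards the 0.5 prior by alpha1.
        new_F = {u: (0.5 * alpha1 + sum(new_R[e] for e in edges)) / (alpha1 + len(edges))
                 for u, edges in out_edges.items()}
        error = max(
            max(abs(new_F[u] - F[u]) for u in F),
            max(abs(new_G[p] - G[p]) for p in G),
            max(abs(new_R[e] - R[e]) for e in R),
        )
        F, G, R = new_F, new_G, new_R
        if error <= eps:
            break
    return F, G, R

# Hypothetical ratings: UA-UE agree with the consensus, UF always disagrees.
toy = {(u, p): s
       for u in ["UA", "UB", "UC", "UD", "UE"]
       for p, s in [("P1", 1.0), ("P2", 1.0), ("P3", -1.0)]}
toy.update({("UF", "P1"): -1.0, ("UF", "P2"): -1.0, ("UF", "P3"): 1.0})
F, G, R = fairjudge(toy)
```

On this toy network the scores settle with UF clearly below the other users in fairness, P1 and P2 with positive goodness and P3 with negative goodness.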

3.2.2 Elo Algorithm
Steps for the Elo algorithm:
Input: 1) rating1 = current rating of product P1
2) rating2 = current rating of product P2
3) K = constant controlling how far a rating moves per comparison
Output: New ratings assigned to the two products based on the Elo rating algorithm.
Step 1: prob1 = 1 / (1 + pow(10, (rating2 - rating1) / 400));   // probability that P1 is preferred
Step 2: prob2 = 1 / (1 + pow(10, (rating1 - rating2) / 400));   // probability that P2 is preferred
Step 3: if (P1 receives more points, i.e. is marked as the better product)
Step 3.1: rating1 = rating1 + K * (1 - prob1);
Step 3.2: rating2 = rating2 + K * (0 - prob2);
Step 3.3: else
Step 3.4: rating1 = rating1 + K * (0 - prob1);
Step 3.5: rating2 = rating2 + K * (1 - prob2);
Step 4: Store the updated ratings of P1 and P2.
Step 5: Stop.

Chapter 4: RESULT
4.1 Codes:
Figure 6: Implementation of algorithm in Python Language.

4.2 Outcome:
Figure 7: Initial Rating for a Product from various users.

Figure 8: The output console calculating the fairness, reliability of the user and goodness of
the product.

CONCLUSION
We presented a case study of the FairJudge algorithm and the Elo rating algorithm to address the
problem of identifying fraudulent users in rating networks and filtering out their ratings. We
also took the output of the FairJudge algorithm as the input to the Elo
algorithm in order to compare a product with another product of the same kind. So far, this paper makes the following
contributions:
• Algorithm: We presented the three mutually recursive metrics of the FairJudge algorithm:
fairness of users, goodness of products and reliability of ratings. The paper [1] extended the
metrics to incorporate Bayesian solutions to the cold start problem and behavioural properties.
Along with this algorithm, Elo rating [5] provides the capability to compare two
products.
• Theoretical guarantees: The new concept of merging these two algorithms has not yet been
implemented in a real-world scenario, so it is still unknown whether it can optimize the
rating system.
REFERENCES
[1] Srijan Kumar, Bryan Hooi, Disha Makhija, “FairJudge: Trustworthy User Prediction in
Rating Platforms”
[2] Hao Liao, An Zeng, Rui Xiao, “Ranking Reputation and Quality in Online Rating Systems”
[3] Rajat Sharma, Gautam Nagpal, Amit Kanwar, “Algorithm for Ranking Consumer
Reviews on Ecommerce Websites”
[4] Marius Stănescu, “Rating systems with multiple factors”, University of Edinburgh
[5] https://www.geeksforgeeks.org/elo-rating-algorithm/
