Over the last decade, the scientific community has increasingly sought to define axioms for rating the quality of centrality measures such as PageRank, HITS, SALSA, betweenness, or even the classic degree metric.
This is a review of a WWW conference paper authored by Paolo Boldi, Alessandro Luongo and Sebastiano Vigna.
Rank Monotonicity in Centrality Measures (a report on quality guarantees for search engine results)
1. Presentation of the Paper "Rank Monotonicity in Centrality Measures"
Paolo Boldi, Alessandro Luongo, Sebastiano Vigna
Presentation prepared by Mahdi Cherif for the Course Graph Mining.
Lectured by Andrea Marino
UniFI 2018-2019
17th of July, 2019
2. Synopsis of the Paper
• The authors present three axioms: the Score-Monotonicity Axiom, the Rank-Monotonicity Axiom, and the Strict Rank-Monotonicity Axiom.
• Eleven "centrality" measures for graphs are presented and examined with respect to the three properties defined by these axioms.
3. The Paper (Methodology and type)
• The paper adopts a deductive methodology together with a case-study method: counter-examples are used to refute a property for a given measure, on general graphs or on strongly connected graphs.
• The paper builds on an axiomatic approach.
• The paper can be classified under three categories: survey, reprise, and progress work.
4. Significance of the Paper
• Strict rank monotonicity is proved for PageRank and Katz's index. We understand this to be the main contribution of the paper.
• Additionally, a set of results on non-Markovian measures and various insightful counter-examples are included for both Markovian and non-Markovian measures.
• The paper and its results are interesting for two reasons:
  ◦ The web and its community constantly seek better user-tailored results, and the Strict Rank-Monotonicity Axiom can be seen as a theoretical quality guarantee for a search engine's scoring model.
  ◦ They give a better understanding of how search engines react to changes in the structure of the web.
5. Context of the Paper
• There is a continuous quest for better search engine results in response to users' queries.
• Search engines tend to use heuristics to score the very large volume of data available on the Web.
• Scoring is inherently subjective, given that the importance of a web document is a matter of personal taste and perspective.
• Nonetheless, there have been multiple attempts to formalize such scores and make them deterministic.
6. Context of the Paper (The Dilemma)
• The legal framework is increasingly restrictive, e.g., the right to be forgotten, data protection by design, data privacy, users' ownership of their data, restrictions on indiscriminate profiling.
• The size of the Web has grown out of all proportion.
• Dark (opaque) scores, e.g., Google's "sauce"!
7. Authors' Answer (The Axioms)
• The authors build on their own previous work and that of other researchers and present three axioms (definitions); a scoring measure is then characterized as satisfying each property (axiom) or not.
• The Score-Monotonicity Axiom
• The Rank-Monotonicity Axiom
• The Strict Rank-Monotonicity Axiom
8. The Axioms (The Pledge)
• They offer a formal guarantee of the quality of a scoring measure.
• They provide a cross-score guarantee, i.e., they allow an impartial, factual judgement of any centrality measure.
• They aim to characterize how the scores on a graph G behave after the addition of one arc x → y.
• The authors had to address each measure twice: a proof or counter-example for general graphs, and a separate proof or counter-example for strongly connected graphs.
9. Definition 1 (Score-Monotonicity Axiom)
• A centrality measure satisfies the score-monotonicity axiom if, for every graph G and every pair of nodes x, y such that x ↛ y (i.e., G has no arc from x to y), adding the arc x → y to G increases the centrality of y.
Note that previous work by other researchers focused on score monotonicity, which they proved for PageRank.
10. Definition 2 (Rank-Monotonicity Axiom)
• A centrality measure satisfies the rank-monotonicity axiom if, for every graph G and every pair of nodes x, y such that x ↛ y, when we add x → y to G the following happens for every node z:
  ◦ If the score of z was strictly smaller than the score of y, this remains true after adding x → y;
  ◦ If the score of z was smaller than or equal to the score of y, this remains true after adding x → y.
Note that an equivalent formulation of the above definition is as follows:
  ◦ If the score of z was strictly smaller than the score of y, this remains true after adding x → y;
  ◦ If the score of z was equal to the score of y, it remains equal or becomes smaller after adding x → y.
11. Definition 3 (Strict Rank-Monotonicity Axiom)
• A centrality measure satisfies the strict rank-monotonicity axiom if, for every graph G and every pair of nodes x, y such that x ↛ y, when we add x → y to G the following happens:
  ◦ If the score of z ≠ y was smaller than or equal to the score of y, after adding x → y the score of z becomes strictly smaller than the score of y.
Note that the only difference between the last two definitions is the behavior on ties (nodes with the same score as y): if a measure is strictly rank monotone, adding an arc x → y will break all ties with other nodes in favor of y.
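The three definitions above can be written as small predicates over the scores before and after adding the arc x → y. The following is a minimal sketch, not taken from the paper: the function names, the toy graph, and the use of indegree as the measure are all illustrative assumptions.

```python
from typing import Dict, Hashable

Scores = Dict[Hashable, float]


def is_score_monotone(before: Scores, after: Scores, y: Hashable) -> bool:
    """Definition 1: the score of y strictly increases."""
    return after[y] > before[y]


def is_rank_monotone(before: Scores, after: Scores, y: Hashable) -> bool:
    """Definition 2: strict and weak dominance of y over z are both preserved."""
    for z in before:
        if z == y:
            continue
        if before[z] < before[y] and not after[z] < after[y]:
            return False
        if before[z] <= before[y] and not after[z] <= after[y]:
            return False
    return True


def is_strictly_rank_monotone(before: Scores, after: Scores, y: Hashable) -> bool:
    """Definition 3: every z != y that y weakly dominated is now strictly dominated."""
    return all(
        after[z] < after[y]
        for z in before
        if z != y and before[z] <= before[y]
    )


def indegree(nodes, arcs) -> Scores:
    """Indegree centrality: number of arcs entering each node."""
    return {v: sum(1 for (_, t) in arcs if t == v) for v in nodes}


# Hypothetical toy graph: x -> z, z -> y. We then add the arc x -> y.
nodes = ["x", "y", "z"]
arcs = {("x", "z"), ("z", "y")}
before = indegree(nodes, arcs)
after = indegree(nodes, arcs | {("x", "y")})
```

In this toy graph y and z are tied before the addition, and the tie is broken in favor of y afterwards, which is exactly the behavior on ties that separates Definition 3 from Definition 2.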
12. The Measures (Centrality Measures Survey)
• In section 3.1 the authors identify eleven measures labeled as "centrality measures".
• A sub-category is recognized for Markovian measures; in this paper such measures are called "spectral measures".
• Non-spectral measures are:
  ◦ Indegree
  ◦ Closeness
  ◦ Lin's index
  ◦ Harmonic centrality
  ◦ Betweenness
13. The Measures (Centrality Measures Survey)
• Spectral measures are:
  ◦ The dominant left eigenvector
  ◦ Seeley's index
  ◦ Katz's index
  ◦ PageRank
  ◦ HITS
  ◦ SALSA
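14. The Results
Centrality   | SMN (general G) | RMN (general G) | SMN (SC G) | RMN (SC G)
-------------+-----------------+-----------------+------------+-----------
Harmonic     | yes             | yes*            | yes        | yes*
Degree       | yes             | yes*            | yes        | yes*
Katz         | yes             | yes*            | yes        | yes*
PageRank     | yes             | yes*            | yes        | yes*
Dominant     | no              | no              | yes        | yes*
Seeley       | no              | no              | yes        | yes
Lin          | no              | no              | yes        | yes
Closeness    | no              | no              | yes        | yes
HITS         | no              | no              | no         | no
SALSA        | no              | no              | no         | no
Betweenness  | no              | no              | no         | no
SMN: score monotonicity; RMN: rank monotonicity; SC G: strongly connected graphs.
yes*: strict rank monotonicity is satisfied.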
15. Overview of the Results
• Strict rank monotonicity is proved for PageRank and Katz's index, a result of clear value for the Web mining scientific and industrial communities.
• Proofs and counter-examples are given for the other spectral measures, besides PageRank and Katz's index.
• A set of results on non-Markovian measures and various insightful counter-examples are included for both Markovian and non-Markovian measures.
16. Strict Rank Monotonicity for PageRank and Katz's index
• Theorem 3. Let $M$ and $M'$ be two nonnegative matrices such that $M' - M = \chi_x^T \delta$ (i.e., the matrices differ only on the x-th row and $\delta$ is the corresponding row difference). Let also $v$ be a nonnegative preference vector and $0 \le \alpha < \min(1/\rho(M), 1/\rho(M'))$; let $r$ and $r'$ be the damped spectral rankings associated with $M$ and $M'$ respectively. Assume further that:
1. There is exactly one y such that $\delta_y > 0$;
2. $r_y \neq 0$;
3. $r_y \le r'_y$.
Then, if $r_z \le r_y$ we have $r'_z - r_z \le r'_y - r_y$. As a consequence, $r_z \le r_y$ implies $r'_z \le r'_y$, whereas $r_z < r_y$ implies $r'_z < r'_y$.
17. Strict Rank Monotonicity for PageRank and Katz's index (A note about r)
• The paper defines a generic damped spectral ranking given by
$$r = v \sum_{k \ge 0} (\alpha M)^k = v\,(1 - \alpha M)^{-1}$$
(here 1 denotes the identity matrix), with
$M$: transition matrix
$\alpha$: damping factor
$v$: preference vector
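The two expressions for the damped spectral ranking (the power series and the closed form with the matrix inverse) can be compared numerically. The sketch below assumes NumPy, a hypothetical 3-node graph, a uniform preference vector and $\alpha = 0.85$; none of these values come from the paper. Note that it computes the unnormalised ranking as defined on the slide, without the $(1 - \alpha)$ factor often used for PageRank.

```python
import numpy as np


def damped_spectral_ranking(M: np.ndarray, v: np.ndarray, alpha: float) -> np.ndarray:
    """Closed form r = v (I - alpha M)^{-1}, with v treated as a row vector."""
    n = M.shape[0]
    # r (I - alpha M) = v  is equivalent to  (I - alpha M)^T r^T = v^T.
    return np.linalg.solve((np.eye(n) - alpha * M).T, v)


def truncated_series(M: np.ndarray, v: np.ndarray, alpha: float, terms: int = 200) -> np.ndarray:
    """Partial sum of r = v * sum_{k >= 0} (alpha M)^k."""
    r = np.zeros_like(v, dtype=float)
    term = v.astype(float)
    for _ in range(terms):
        r += term
        term = alpha * (term @ M)
    return r


# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0, with row-normalised adjacency matrix.
A = np.array([[0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
M = A / A.sum(axis=1, keepdims=True)
v = np.full(3, 1.0 / 3.0)
alpha = 0.85

r_closed = damped_spectral_ranking(M, v, alpha)
r_series = truncated_series(M, v, alpha)
```

Because this $M$ is row-stochastic, its spectral radius is 1, so any $\alpha < 1$ keeps the series convergent.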
18. Strict Rank Monotonicity for PageRank and Katz's index (A note about r)
• PageRank formalization (from the literature)
$$R = cAR + (1 - c)\,v$$
• R is the dominant right eigenvector of A
• The authors of PageRank suggest computing R by repeatedly applying A
• PageRank formalization (PageRank authors)
Let u be a web page. Let $F_u$ be the set of pages u points to and $B_u$ be the set of pages that point to u. Let $N_u = |F_u|$ be the number of links from u.
Let E(u) be some vector over the Web pages that corresponds to a source of rank. Then, the PageRank of a set of Web pages is an assignment, R', to the Web pages which satisfies
$$R'(u) = c \sum_{v \in B_u} \frac{R'(v)}{N_v} + c\,E(u)$$
such that c is maximized and $\|R'\|_1 = 1$ ($\|R'\|_1$ denotes the $L_1$ norm of R').
We have $R' = c(AR' + E)$. Since $\|R'\|_1 = 1$, we can rewrite this as $R' = c(A + E \times \mathbf{1})R'$, where $\mathbf{1}$ is the vector consisting of all ones. So R' is an eigenvector of $(A + E \times \mathbf{1})$.
19. Other Results for Spectral Measures
• Theorem 4. Condition (3) of Theorem 3 can be replaced by the following two hypotheses (which imply it):
1. $1 - \alpha M$ is (strictly) diagonally dominant;
2. $\sum_z \delta_z \ge 0$.
Note that this gives concrete conditions under which Condition 3 holds. This is useful because they can be verified before computing the new $r'$.
20. Other Results for Spectral Measures
• Theorem 5 is a strict version of Theorem 3, used for proving strict rank monotonicity.
• Corollary 1. PageRank satisfies the strict rank-monotonicity axiom, for any graph, damping factor and preference vector, provided all scores are nonzero. The latter condition is always true if the preference vector is everywhere nonzero or if the graph is strongly connected.
• Corollary 2. Katz's index satisfies the strict rank-monotonicity axiom, for any graph, attenuation factor and preference vector, provided all scores are nonzero. The latter condition is always true if the preference vector is everywhere nonzero or if the graph is strongly connected.
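Corollary 1 can be illustrated with a small experiment. The sketch below is only a sanity check under stated assumptions, not the paper's proof or code: it assumes NumPy, a hypothetical 4-node graph in which every node has at least one outgoing arc, a uniform (everywhere nonzero) preference vector, and $\alpha = 0.85$. It tries every missing arc x → y and tests Definition 3 on the resulting PageRank scores.

```python
import numpy as np


def pagerank(arcs, n, alpha=0.85):
    """PageRank as a damped spectral ranking with uniform preference vector.

    Assumes every node has at least one outgoing arc, so each row can be normalised.
    """
    A = np.zeros((n, n))
    for s, t in arcs:
        A[s, t] = 1.0
    M = A / A.sum(axis=1, keepdims=True)
    v = np.full(n, 1.0 / n)
    # Row-vector convention: r = (1 - alpha) v (I - alpha M)^{-1}.
    return (1.0 - alpha) * np.linalg.solve((np.eye(n) - alpha * M).T, v)


def strictly_rank_monotone(before, after, y):
    """Every z != y with before[z] <= before[y] must end strictly below y."""
    return all(
        after[z] < after[y]
        for z in range(len(before))
        if z != y and before[z] <= before[y]
    )


# Hypothetical graph: a directed 4-cycle plus the chord 0 -> 2.
n = 4
arcs = {(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)}
before = pagerank(arcs, n)

results = {}
for x in range(n):
    for y in range(n):
        if x != y and (x, y) not in arcs:
            after = pagerank(arcs | {(x, y)}, n)
            results[(x, y)] = strictly_rank_monotone(before, after, y)
```

A passing run on one graph proves nothing in general; the value of the corollary is precisely that no such experiment can ever produce a violation.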
21. Other Results for Spectral Measures
• Through short proofs and counter-examples, the authors demonstrate that:
  ◦ Seeley's index is not rank monotone on general graphs
  ◦ Seeley's index is not score monotone on general graphs
  ◦ Seeley's index is score monotone and rank monotone on strongly connected graphs, but not strictly rank monotone
  ◦ SALSA is neither score monotone nor rank monotone
  ◦ Another counter-example (Fig. 10) shows that the dominant left eigenvector is not rank monotone on general graphs
  ◦ HITS is not rank monotone on strongly connected graphs
  ◦ HITS is not score monotone on strongly connected graphs
22. Other Results for Spectral Measures
• The dominant left eigenvector is strictly rank monotone on strongly connected graphs
• The dominant left eigenvector is score monotone on strongly connected graphs
23. Results for non-spectral Measures
• Harmonic centrality is score monotone on all graphs
• Harmonic centrality is strictly rank monotone on all graphs
• Closeness satisfies neither score monotonicity nor rank monotonicity on general graphs
• Closeness is score monotone on strongly connected graphs
• Lin's index satisfies neither score monotonicity nor rank monotonicity on general graphs
• Lin's index is equivalent to closeness on strongly connected graphs, so it satisfies both score monotonicity and rank monotonicity, but not strict rank monotonicity
• Betweenness satisfies none of the axioms, even on strongly connected graphs
24. The Proofs
• The proofs are quite elegant.
• For the damped spectral measures, the authors use several fundamental theorems and properties of linear algebra to carry out each step of the proofs.
• They also rely on an innovative device, the "update vector", in these proofs.
• For the other proofs, they rely on elementary mathematics as well as counter-examples built from insightful graphs.
26. Proofs for non-spectral Measures
• Harmonic centrality. The reciprocal of a denormalized harmonic mean:
$$\sum_{y \neq x} \frac{1}{d(y,x)}$$
Lemma 1. Let G be a graph with distance function d, and let d' be the distance function of G with an additional arc x → y. Then, for every node $w \neq y$ and $z \neq w$ we have
$$\frac{1}{d'(w,z)} - \frac{1}{d(w,z)} \le \frac{1}{d'(w,y)} - \frac{1}{d(w,y)}.$$
Moreover, if $d'(w,z) < d(w,z)$,
$$\frac{1}{d'(w,z)} - \frac{1}{d(w,z)} < \frac{1}{d'(w,y)} - \frac{1}{d(w,y)}.$$
27. Proofs for non-spectral Measures
• Proof. The first part is obvious if $d'(w,z) = d(w,z)$.
• Otherwise, with the notation of Figure 1 ($p = d(w,x)$, $q = d(y,z)$, $s = d(w,z)$, $t = d(w,y)$), the hypothesis $d'(w,z) < d(w,z)$ yields $s > p + 1 + q$ (which implies $p, q < \infty$). Note that in this case $t > p + 1$, as otherwise $s > p + 1 + q \ge t + q$, contradicting the triangular inequality $s \le t + q$.
We conclude that
$$\frac{1}{d'(w,z)} - \frac{1}{d(w,z)} = \frac{1}{p+1+q} - \frac{1}{s} < \frac{1}{p+1} - \frac{1}{t} = \frac{1}{d'(w,y)} - \frac{1}{d(w,y)},$$
since for $s, t < \infty$
$$\frac{1}{p+1+q} - \frac{1}{s} - \left(\frac{1}{p+1} - \frac{1}{t}\right) = \frac{p+1-p-1-q}{(p+1+q)(p+1)} + \frac{s-t}{st} < -\frac{q}{st} + \frac{q}{st} = 0.$$
If s or t is infinite, the result holds.
28. Proofs for non-spectral Measures
• Theorem 1. Harmonic centrality satisfies strict rank monotonicity on all graphs.
• Proof. With the notation of Lemma 1, we assume that for a node $z \neq y$
$$\sum_{w \neq z} \frac{1}{d(w,z)} \le \sum_{w \neq y} \frac{1}{d(w,y)}.$$
Adding the latter inequality to that of Lemma 1, for every $w \neq y, z$, we obtain
$$\sum_{w \neq z,y} \frac{1}{d'(w,z)} + \frac{1}{d(y,z)} \le \sum_{w \neq z,y} \frac{1}{d'(w,y)} + \frac{1}{d(z,y)}.$$
Since $d'(y,z) = d(y,z)$ and $d'(z,y) \le d(z,y)$, the new score of z is at most the new score of y. The inequality is in fact strict: either $z \neq x$, in which case at least for $w = x$ we are adding a strict inequality, or $z = x$, in which case $d'(z,y) < d(z,y)$.
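Theorem 1 can be sanity-checked by brute force on a small graph. The sketch below is an illustration under stated assumptions, not the authors' method: the 5-node graph is hypothetical (and deliberately not strongly connected), distances come from a breadth-first search on reversed arcs, and `fractions.Fraction` is used so that ties are detected exactly.

```python
from collections import deque
from fractions import Fraction


def distances_to(target, nodes, arcs):
    """d(w, target) for every w that can reach target (BFS on reversed arcs)."""
    preds = {v: [] for v in nodes}
    for s, t in arcs:
        preds[t].append(s)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        u = queue.popleft()
        for p in preds[u]:
            if p not in dist:
                dist[p] = dist[u] + 1
                queue.append(p)
    return dist


def harmonic(nodes, arcs):
    """Harmonic centrality: sum of 1/d(w, x) over w != x; unreachable w contribute 0."""
    scores = {}
    for x in nodes:
        dist = distances_to(x, nodes, arcs)
        scores[x] = sum((Fraction(1, d) for w, d in dist.items() if w != x), Fraction(0))
    return scores


# Hypothetical graph: a 3-cycle 0 -> 1 -> 2 -> 0, plus 3 -> 0 and 3 -> 4.
nodes = list(range(5))
arcs = {(0, 1), (1, 2), (2, 0), (3, 0), (3, 4)}
before = harmonic(nodes, arcs)

pairs_checked = 0
violations = []
for x in nodes:
    for y in nodes:
        if x == y or (x, y) in arcs:
            continue
        after = harmonic(nodes, arcs | {(x, y)})
        pairs_checked += 1
        for z in nodes:
            # Definition 3: z weakly below y before must be strictly below y after.
            if z != y and before[z] <= before[y] and not after[z] < after[y]:
                violations.append((x, y, z))
```

An empty `violations` list is what the theorem predicts for every graph; exact rational arithmetic avoids spurious ties or tie-breaks caused by floating-point summation order.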
29. Proofs for non-spectral Measures
• Closeness. Bavelas introduced closeness in 1948; the closeness of x is defined by
$$\frac{1}{\sum_{y} d(y,x)}$$
The graph must be strongly connected, or some of the summands will be $\infty$. To correct this, closeness is often patched by eliminating the infinite summands in the denominator; this patched version is the one used in the paper. Even so, the authors exhibit a counter-example for closeness on general graphs.
31. Proofs for non-spectral Measures
• Lemma 2. Let G be a graph with distance function d, and let d' be the distance function of G with an additional new arc x → y. Then, for every node w and z,
$$d(w,z) - d'(w,z) \le d(w,y) - d'(w,y).$$
Proof. If $d(w,z) = d'(w,z)$ the result holds. Otherwise, looking at Figure 1, we have $s \le t + q$ by the triangular inequality. Thus,
$$d(w,z) - d'(w,z) \le s - p - 1 - q \le t - p - 1 \le d(w,y) - d'(w,y).$$
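Lemma 2 lends itself to an exhaustive check on a small strongly connected graph. The following sketch assumes a hypothetical graph (a directed 5-cycle plus one chord) and plain breadth-first search for distances; it tests the inequality for every missing arc x → y and every pair (w, z).

```python
from collections import deque
from itertools import product


def all_distances(nodes, arcs):
    """Shortest-path distance d[src, tgt] for every reachable ordered pair."""
    succ = {v: [] for v in nodes}
    for s, t in arcs:
        succ[s].append(t)
    d = {}
    for src in nodes:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in succ[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for tgt, k in dist.items():
            d[src, tgt] = k
    return d


# Hypothetical strongly connected graph: directed 5-cycle plus the chord 0 -> 2.
nodes = list(range(5))
arcs = {(i, (i + 1) % 5) for i in nodes} | {(0, 2)}
d = all_distances(nodes, arcs)

pairs_checked = 0
lemma2_holds = True
for x, y in product(nodes, repeat=2):
    if x == y or (x, y) in arcs:
        continue
    d_new = all_distances(nodes, arcs | {(x, y)})
    pairs_checked += 1
    for w, z in product(nodes, repeat=2):
        # The gain in distance towards z never exceeds the gain towards y.
        if d[w, z] - d_new[w, z] > d[w, y] - d_new[w, y]:
            lemma2_holds = False
```

Strong connectivity guarantees that every distance is finite, so every lookup in `d` and `d_new` is defined.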
32. Proofs for non-spectral Measures
• Theorem 2. Closeness satisfies rank monotonicity on strongly connected graphs.
• Proof. With the notation of Lemma 2, we assume that for a node z
$$\frac{1}{\sum_{w} d(w,z)} \le \frac{1}{\sum_{w} d(w,y)}.$$
Equivalently,
$$\sum_{w} d(w,y) \le \sum_{w} d(w,z),$$
and, adding for all w the inequalities of Lemma 2,
$$\sum_{w} d'(w,y) \le \sum_{w} d'(w,z).$$
The same deduction is true if we start from a strict inequality. Taking reciprocals of the sums completes the proof.
34. Proofs for non-spectral Measures
• Lin's index. A repaired definition of closeness for graphs with infinite distances. Lin's index for a node x is
$$\frac{|\{y : d(y,x) < \infty\}|^2}{\sum_{y \neq x,\; d(y,x) < \infty} d(y,x)}.$$
  ◦ For strongly connected graphs the measure is equivalent to closeness.
  ◦ For general graphs, the paper exhibits a graph, parameterized by k, that serves as a counter-example (figure not reproduced here).
35. Proofs for non-spectral Measures
The Lin centrality of y and z is $(k+1)^2/k$.
After adding the arc, the centrality of y becomes $(k+5)^2/(k+9)$, which is smaller for $k > 3$.
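The comparison between the two values can be made explicit. The short derivation below uses only the two expressions stated above and assumes $k > 0$:

```latex
\frac{(k+1)^2}{k} - \frac{(k+5)^2}{k+9}
  = \frac{(k+9)(k+1)^2 - k(k+5)^2}{k(k+9)}
  = \frac{k^2 - 6k + 9}{k(k+9)}
  = \frac{(k-3)^2}{k(k+9)} .
```

The difference is strictly positive whenever $k \neq 3$, so for $k > 3$ the score of y strictly decreases after the arc is added, which contradicts score monotonicity.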
36. Proofs for Centrality Measures
Betweenness. Let $\sigma_{yz}$ be the number of shortest paths going from y to z, and let $\sigma_{yz}(x)$ be the number of such paths passing through x. We define the betweenness of x as
$$\sum_{y,z \neq x,\; \sigma_{yz} \neq 0} \frac{\sigma_{yz}(x)}{\sigma_{yz}}.$$
Counter-example (figure): in G, the score of x and y is zero. But when adding x → y, a new shortest path arises through x, raising its score to 1/3 while the score of y remains zero.
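The same phenomenon can be reproduced on an even smaller graph. The sketch below does not use the paper's figure: it assumes a hypothetical 3-node graph with the single arc a → x, in which adding x → y raises the betweenness of x (to 1 here, not 1/3) while y stays at zero. Path counts come from a standard breadth-first search.

```python
from collections import deque
from fractions import Fraction


def bfs_counts(nodes, arcs):
    """Return (dist, sigma): shortest-path lengths and counts for reachable ordered pairs."""
    succ = {v: [] for v in nodes}
    for s, t in arcs:
        succ[s].append(t)
    dist, sigma = {}, {}
    for src in nodes:
        d, c = {src: 0}, {src: 1}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in succ[u]:
                if w not in d:
                    d[w], c[w] = d[u] + 1, 0
                    queue.append(w)
                if d[w] == d[u] + 1:
                    c[w] += c[u]  # every shortest path to u extends to w
        for tgt in d:
            dist[src, tgt], sigma[src, tgt] = d[tgt], c[tgt]
    return dist, sigma


def betweenness(nodes, arcs):
    """Sum over y, z != x with sigma_yz != 0 of sigma_yz(x) / sigma_yz."""
    dist, sigma = bfs_counts(nodes, arcs)
    scores = {}
    for x in nodes:
        total = Fraction(0)
        for y in nodes:
            for z in nodes:
                if x in (y, z) or y == z or (y, z) not in sigma:
                    continue
                on_shortest = (
                    (y, x) in dist and (x, z) in dist
                    and dist[y, x] + dist[x, z] == dist[y, z]
                )
                if on_shortest:
                    total += Fraction(sigma[y, x] * sigma[x, z], sigma[y, z])
        scores[x] = total
    return scores


# Hypothetical graph: a -> x only. We then add the arc x -> y.
nodes = ["a", "x", "y"]
arcs = {("a", "x")}
before = betweenness(nodes, arcs)
after = betweenness(nodes, arcs | {("x", "y")})
```

Before the addition x and y are tied at zero; afterwards x overtakes y, so both score monotonicity and rank monotonicity fail on this toy graph.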
37. Proofs for spectral Measures
• Lemma 3. Let $M$ be a nonnegative matrix, $0 \le \alpha < 1/\rho(M)$ a damping factor, and $v$ a nonnegative preference vector. Let
$$r = v \sum_{k \ge 0} (\alpha M)^k$$
be the associated damped spectral ranking and let $C = (1 - \alpha M)^{-1}$. Then, given y and z such that $c_{yz} > 0$ and letting $q = c_{yy}/c_{yz}$, we have $c_{wy} \le q \cdot c_{wz}$ for all w. In particular, if $r_y \neq 0$:
  ◦ If $r_z \le r_y$, then $c_{yz} \le c_{yy}$;
  ◦ If $r_z < r_y$, then $c_{yz} < c_{yy}$.
Note: the authors observe that both PageRank and Katz's index are special instances of the damped spectral ranking.
38. Proofs for spectral Measures
• Proof. The first claim is a restatement of the property (Willoughby, 1977) that for all y, z and w
$$c_{wz} \ge \frac{c_{wy}\, c_{yz}}{c_{yy}},$$
so $q \cdot c_{wz} \ge c_{wy}$.
Note now that if $c_{yy} < c_{yz}$, then $q < 1$, and
$$r_y = \sum_{w} v_w c_{wy} \le q \sum_{w} v_w c_{wz} < \sum_{w} v_w c_{wz} = r_z,$$
which proves the first item by contraposition (the strict inequality is due to the assumption $r_y \neq 0$).
If $c_{yy} \le c_{yz}$, then $q \le 1$, and the second item follows similarly.
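The Willoughby inequality and the two items of Lemma 3 can be checked numerically on a small instance. This is a sketch under stated assumptions (NumPy, a hypothetical row-stochastic 3×3 matrix, a uniform preference vector, $\alpha = 0.85$, and a small tolerance for floating-point equality cases), not a substitute for the proof.

```python
import numpy as np

alpha = 0.85
# Hypothetical row-normalised matrix for the graph 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
M = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
n = M.shape[0]
C = np.linalg.inv(np.eye(n) - alpha * M)
tol = 1e-12

# Willoughby's property: c_wz * c_yy >= c_wy * c_yz for all w, y, z.
willoughby_holds = all(
    C[w, z] * C[y, y] >= C[w, y] * C[y, z] - tol
    for w in range(n) for y in range(n) for z in range(n)
)

# Lemma 3, first item: r_z <= r_y (with r_y != 0 and c_yz > 0) implies c_yz <= c_yy.
v = np.full(n, 1.0 / n)
r = v @ C
lemma3_holds = all(
    C[y, z] <= C[y, y] + tol
    for y in range(n) for z in range(n)
    if C[y, z] > 0 and r[y] != 0 and r[z] <= r[y]
)
```

In this instance $c_{12} > c_{11}$, which is consistent with the lemma because node 2 scores strictly higher than node 1.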
39. Proofs for spectral Measures
• Proof of Theorem 3 (the main theorem). In this proof, as in the Lemma, $C = (1 - \alpha M)^{-1}$. First, given the hypotheses, $1 - \alpha M$ and $1 - \alpha M'$ are M-matrices, so they both have positive determinants. Since $M'$ is obtained from $M$ by a rank-one correction $M' = M + \chi_x^T \delta$, applying the matrix determinant lemma we have
$$\det(1 - \alpha M') = \det(1 - \alpha M - \alpha \chi_x^T \delta) = \left(1 - \alpha\,\delta (1 - \alpha M)^{-1} \chi_x^T\right) \det(1 - \alpha M).$$
Therefore,
$$1 - \alpha\,\delta (1 - \alpha M)^{-1} \chi_x^T > 0.$$
40. Proofs for spectral Measures
• Proof of Theorem 3 (continued)
By the Sherman-Morrison formula, the inverse of $1 - \alpha M'$ can be written as a function of $1 - \alpha M$:
$$(1 - \alpha M')^{-1} = (1 - \alpha (M + \chi_x^T \delta))^{-1} = (1 - \alpha M - \alpha \chi_x^T \delta)^{-1} = (1 - \alpha M)^{-1} + \frac{(1 - \alpha M)^{-1}\, \alpha \chi_x^T \delta\, (1 - \alpha M)^{-1}}{1 - \alpha\,\delta (1 - \alpha M)^{-1} \chi_x^T}.$$
Then, after multiplication by the preference vector $v$, the explicit spectral-rank correction is obtained from
$$r' = v (1 - \alpha M')^{-1}.$$
Thus $r' - r = \kappa\,\delta (1 - \alpha M)^{-1}$, with $\kappa$ a positive constant.
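The determinant lemma and the Sherman-Morrison update used in this proof can be verified numerically. The sketch below assumes NumPy and a hypothetical 3-node example in which the arc 1 → 0 is added and row 1 is renormalised. The explicit value $\kappa = \alpha r_x / (1 - \alpha\,\delta C \chi_x^T)$ used in the code is obtained here by multiplying the Sherman-Morrison expression by $v$; it is a derivation step added for illustration and is not quoted from the paper.

```python
import numpy as np

alpha = 0.85
# Hypothetical graph 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0 (row-normalised).
M = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
x = 1
M_new = M.copy()
M_new[x] = [0.5, 0.0, 0.5]      # add the arc 1 -> 0 and renormalise row x
delta = M_new[x] - M[x]         # update vector: one positive entry (y = 0), sums to zero
chi = np.zeros(3)
chi[x] = 1.0                    # characteristic vector of x

I = np.eye(3)
C = np.linalg.inv(I - alpha * M)
denom = 1.0 - alpha * (delta @ C @ chi)

# Direct inverse versus the rank-one (Sherman-Morrison) update.
C_new_direct = np.linalg.inv(I - alpha * M_new)
C_new_sm = C + alpha * np.outer(C @ chi, delta @ C) / denom

# Matrix determinant lemma.
det_direct = np.linalg.det(I - alpha * M_new)
det_lemma = denom * np.linalg.det(I - alpha * M)

# Rank correction: r' - r = kappa * delta C.
v = np.full(3, 1.0 / 3.0)
r, r_new = v @ C, v @ C_new_direct
kappa = alpha * r[x] / denom
```

Here `denom` is the quantity shown to be positive by the determinant argument, which is what makes $\kappa$ positive whenever $r_x > 0$.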
41. Proofs for spectral Measures
Note that if
$$[\delta (1 - \alpha M)^{-1}]_z \le 0,$$
the thesis follows from the hypothesis $r_y \le r'_y$. This holds true, in particular, if $c_{yz} = 0$, as in that case
$$[\delta (1 - \alpha M)^{-1}]_z = -\sum_{w \neq y} |\delta_w|\, c_{wz} \le 0.$$
If $c_{yz} > 0$, since $r_y \neq 0$ we know from Lemma 3 that $q = c_{yy}/c_{yz} \ge 1$, and for all w we have $q \cdot c_{wz} \ge c_{wy}$. It follows that
$$[\delta (1 - \alpha M)^{-1}]_y = \delta_y c_{yy} - \sum_{w \neq y} |\delta_w|\, c_{wy} \ge \delta_y\, q\, c_{yz} - \sum_{w \neq y} q\, |\delta_w|\, c_{wz} = q \Big( \delta_y c_{yz} - \sum_{w \neq y} |\delta_w|\, c_{wz} \Big) = q\, [\delta (1 - \alpha M)^{-1}]_z \ge [\delta (1 - \alpha M)^{-1}]_z.$$
This concludes the proof.
42. Proofs for other spectral Measures
• SALSA. The dominant left eigenvector of a normalised version of $A^T A$.
The following illustration refutes score monotonicity and rank monotonicity on a strongly connected graph for SALSA.
Counter-example (figure): before adding x → y, all scores were equal to 1/8; after the addition, the score of y decreases to 3/28 and the score of z increases to 3/14.
43. Proofs for other spectral Measures
• HITS. This rank is the dominant left eigenvector of $A^T A$.
The authors present two counter-examples for HITS: the first shows that the measure is not rank monotone on strongly connected graphs, and the second shows that HITS is not score monotone on strongly connected graphs.
First counter-example (figure): before adding x → y, there is a unique dominant left eigenvector that is zero on all nodes except for the 3-clique. After the addition of the arc, z has a rank greater than y.
44. Proofs for other spectral Measures
• Second counter-example (figure): HITS is not score monotone on strongly connected graphs, since the score of y remains zero after the addition of x → y.
45. Recommendations
• Establish, for damped spectral rankings, a result or a theoretical framework for preference vectors that are not strictly positive.
• Obtain experimental results as part of future work.
46. Implementation of PageRank and Preference vectors (The TSPR concept)
It is important to handle the case of zero entries in the preference vector. The Perron-Frobenius theorem guarantees convergence to the dominant eigenvector only for strongly connected graphs; nodes with zero preference will therefore be discarded from the calculation, and the graph will lose some of its expressiveness.
48. Recommendations for the proofs
• Consider proposing new versions of the rejected measures that would satisfy the axioms.
49. Criticism
• The results:
We believe that experimental results would have greatly increased the impact of the paper.
• The proofs:
The hypotheses of the theorems and lemmas are very strong. It is not uncommon to meet zero-valued entries in a preference vector, a non-zero sum of the entries of the update vector, or multiple strictly positive entries in the update vector. The application cases mentioned at the end of the proof of Theorem 3 and in Corollary 1 (section 5.5.2 of the article), where exactly one node has a positive entry in the update vector while all others are negative or zero and the entries sum to zero, are not explained in depth.
50. Criticism
• The authors define a spectral measure as the dominant left eigenvector of some matrix derived from the adjacency matrix A of the graph. Given that PageRank is usually presented as a right eigenvector rather than a left one, this definition would benefit from further justification in the paper.
51. Presentation of the Proofs
• The proofs are too succinct. We also encountered some ambiguity in the notation for vectors, identity vectors and matrices.
52. Possible improvements
• The proofs: explore a generalization of Theorem 3 to p successive applications of update vectors, i.e., $\delta_1, \dots, \delta_p$.
• Presentation of the proofs:
  ◦ We suggest presenting vectors and matrices in a clearly distinguishable manner, i.e., not merely in bold font; this is especially true for identity vectors.
  ◦ The explanatory examples for the proof of Theorem 3 and Corollary 1 of section 5.5.2, in which the update vector entries sum to zero, deserve further detail.
53. Conclusion
• Valuable paper with very instructive insights
• Important contribution, but it has to be tested against the Web