Done reread the effect of new links on google pagerank

362
-1

Published on

Published in: Technology, News & Politics
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
362
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
1
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Done reread the effect of new links on google pagerank

  1. 1. INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE The Effectof New Links on Google PageRank Konstantin Avrachenkov —Nelly Litvak N° 5256 July 2004apport de rechercheThème COMUnité de recherche INRIA Sophia Antipolis 2004, route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex (France)Téléphone : +33 4 92 38 77 77 — Télécopie : +33 4 92 38 77 65¢¤¦¨©§ " %¤0124562@BD2E ¤IQTUVXYBbdXUTQhpUqsuw¤hyy ƒy„U s ‰‘„”–™5fhqkl„”Xop”–‘‘tvpw z{}~  bqqb}‘Uvl qq lXw‘ww‘‡UQpvv©f}éè¤vvv“”2 }vX ¤s„T‘vv Q}p„v‘6tv‘™vdléé ‘l éè‘ o{èl{t–pop{ ‘è‘–l0‘tww£hvv £{v‘q¥¦ }vXspvvX}‘£v ¥èh{{‘llX vdd lXpéé£vqhtl«{董¦évvwhh w}é ‘v¨k‘ll vé ”lhé«"lé„o{wl‘® p‘‘t}{«~–}¯¯¦°évvps¦®klé ‘2{‘bUX²¥vV‘ 6o{„T{„ ” è‘q p³£hvv 2Q}p„v‘I¦” qt{oé{{Eé}®oh{é Eé}p2v³ophl{v¯è{£Q}vX}éU®blè‘ {‘–pkh0q{}ltvé} qkt"‘{Tht q2l ”éè2vé ‘«{èpé{éT£k‘T¨èw‘‡ ‘q‘{è鰄‘oé{ {{ ·©„ué}p°}é ¦è{{ ‘„èph p{™ ¦l{66 }Q}p„v‘¸vl‘„© q¸‘v„º‘‘ll‘„l0plp¯¥ l‘T¼l T£l‘„l–oq k{£}³vq{è6}pè‘h ‘¤klwT{pvb¦0ov o é q{éT™°¦©é}p „‘oé{ {v è‘q2èélt qEè{–©„½op02‘‘è~u}é upul‘¤}lé2 }é u l{ Tvp–è ‘qUévèEl‘ ¦·évvX"vé ¤l‘„覸v02‘é«{èX ¸hI{2 Tpv £hvv vspvvX}‘Idb}‘¦®é0bU éTlpz®‘l 0vw¯ ‘hèé¸hlwTl„vps¸}{vTw } é ¯”swXÍX²YV²Y{ÍÓè”Õ wٔksÕ~”²s͔dTTw”~ßw}ÍX”}²y¯wvÍXQsÓY”~YßwX”spYow”Õk{YÕ² ²¯kÕ~oÍ ”d~Íkwyٖ d¤zUX””~V͔p²Íwo¯QQ°XÓÓ ²²¥w”͔pTY~”Ù™pYowXÕl{YÕ²™Upp Q6”””~VUw™p²²b²Íwo ”” ²wo²²p¥ sdüý„²””ߦsYÕ}²ÕÕl§²²w”Dd²„Íÿ”~° h”~Õ²Õkk°p ‘¤²X ²¥²²X TÍwXk² Ó§§wÕ© " XYwٔ~”²§pY²””ßX ՔyßX sXokyÍ ü£wIhV~oͲhp~hzs””tt {ÍhÕv}w~” üü wIU! d}v‘ ¥"IX%„&w²²bp· ”Yٔ~”²pdX sÓÍXkyßw””k}Ó§§wÕ©‘ h ww({Yh XüV~oͲ ”The Effect of New Links on Google PageRank Konstantin Avrachenkov* , Nelly LítvakïThème COM Í Systèmes communicants Projet MAESTRORapport de recherche n° 5256 Í July 2004 p 12 pagesAbstract: PageRank is one of the principle criteria according to which Google ranks Web pages. PageRank can beinterpreted as a frequency of visiting a Web page by a random surfer and thus it reflects the popularity of a Web page. Westudy the effect of newly created links on Google PageRank. We discuss to what extend a page can control its PageRank.Using the asymptotic analysis WG provide simple conditions that show if new links bring benefits to a Web page and itsneighbors in terms of PageRank or they do not. Furthermore, We show that there exists an optimal linking strategy. Weconclude that a Web page benefits from links inside its Web community and on the other hand irrelevant links penalize theWeb pages and their Web communities.Key-words: Google, PageRank, Rank One Update, Optimal Linking Strategy, Markov chainsThe authors gratefully acknowledge the financial support of the Netherlands Organization for Scientific Research (NWO)under the grant VGP 61-520 and the French Organization EGIDE under Van Gogh grant no.05433UD.* INRIA Sophia Antipolis, 2004, Route des Lucioles, B.P. 93, 06902, France, e-mail: K.Avrachenkov@sophia.inria.fr lUniversity of Twente, Dep. Appl. Math., Faculty of EEMCS, P.O. Box 217, 7500 AE Enschede, The Netherlands, e-mail:N.Litvak@n1ath.utwente.nl° ~ƒ©"£¤ ¥§¨ %o¤ @ƒ2 –Eµ¤ 622o Q}p„v‘®X~Q" ‘” q„s‘{è o é}qlèl„lXsl v „{h‘tQ£hpv wvw qp‘‘ „sévv„¦dQQvv„b}‘” „q$ol{ hl{‘&o&®op0”® Xpéé® qXhtkèlX ( ‘‘£é}p¦°év¥é ‘l ètl}l„‘})„T{v lppléo{£ qpé® –p‘‘t}{«&£ 0 ‘é£ }v®©„¯2pé$o{é q vé"" U q„ év‘p„}‘è ézl‘z Q}vX}é qb£hvv v péz qtlqlpéU~é{p3 h‘„q
  2. 2. pèhw‘‘ }v"Uq 0q qèé¥kp”Q}p„v‘Isz6‘l ètlvh5 } } hlvlh”‘l}{ h‘p}‘péQQvél‘t{kpéQèXwopé qèl vé l”éèXh‘¯0vhl{hbl¯ „‘p‘vX}q¤è éb}‘Uvll„pb ‘„”T}hw}vX$0‘‘évv™¦¸obvq }vXbpvtk ‘Xzé „¯‘p飔phl{vé£h0Qoqt~{–‘é0klwT&pè–vq{è6} 0 ( {v‘èt{k„”„h™ q éh‘¯élv‘l™}q¤élvv„{‰”pvh‰ qX‰è }·k„è· qq ‘v02‘ }q&vqobh0º{T}éwé ‘„‰ è„é é}‘élp‘l)„"6évètlhèX‰ql èt{Tl„‘w q·©„º脑wov0–‘é}‘„„ X 9 ° £hvpèphQ}p„v‘Ihb}‘¤®‘£I‘T{v‘h{{}pè®vq{è6vè™ ( o{v‘ {k„0h" q é„ ¥éAü‘X q2¸}{vTyX Y9 é{hy ºopè锦vX²{v£}¥p‘„„2‰‘0{T¢vX²{v¦ {Tltk}‘I&Ùs0l‘kk „„pè TT‰2hhU{è ‘6èlº‘lpé}éè «~n£}é ~‘0é"l {vé qv évv èl°élpé}‘ «~|z|}}lév}s„}6Ub p{{‘{olX6v"k{T{èpé}{2élpé}‘ «~ {éT{‘l‘kk „‰ }é}p¦w Ó¦vw q®lºv„ é©èluopé~w}h®0q qè }l vé£vwl‘”¦u~{léol‘{vd£hpv 2‘I‘T{„ è{®Q}vX}é¤T®èXvkbvé „b0vh{¯vw qèé–{°‘‘éèto ¸””vèt}éèè qq v{6T{èp¸£hpv kl zékXXk 0‘”UT"™«{wT{èpé®l¸op”éql”lé°pvvX}‘I0hpw}Q‘{vUp{}t3 }shsqh XqUv}UXhU2GzYk„®v l2}6hl k v£k‘{v„2év „¥h6dv‘vh è bvé E·„v© G(é”vblXoh{è U„¸‘q®®v{"v{ ElEp wTl™{‘2Q}p„v‘¤v0‘qwT{èp¯‰‘ blXkX}wwº ‘è{„olvv v¸   {h}w q„ ¦v–– lhkl„¦ pèh2vvh ©® vhlw}{vp‘„l°"6w}v¤¸él2Uv h2}}h } {l”{”vék"l‘££ pè Tèé2h‘„klvé2Q¨‘º q”‘ƒ ‘qU‘‘‰l‘£¦ºé}pp{v ‘tw {‘¦6véT{V ¨‘ q·{‘°évvX™™ lp {‘E{}06©„ op”–‘‘è~³U‘‘22 {vÿl‘Eè ‘lXT{èp„|~"l‘„l™pql 6}¯ èéh ‘6klwTl„vr ‰‘¥é}Ud sv{pv‘è„„ £v¯¯v èT2¨Ó–h„ol v2‘„"klé qq h‘X~{èp™ll é}sh{é  é}p „}vhl{v}è{pQ}p„v‘Id‰‘2èè {‘„él‘èé®h„ol v3¥hévhlèè ®{‘él„è 0èévl£}évèql vdh„olvE™èl6{‘‘„èE}dh‘„l6}rÓ·pl{tkp¤Í¦hq q‘‘{{ ‘I ‘T{葑 v{2‘tt }é 6 ‘{èpl‘ q‘lXll vEvpé¨pvvX}‘¤Tül„lév ‘ qèlv·vp‘„ ‘q—ÓÓ hX²{èpº6 k ‘°vlh0qlvlt vé} qkt¯""vqw} éT{‘w}hov q«{èpédlé}Ql‘T%èUé o{„T{„ ‘‘ sU‘ot}hvp‘v„‚Ó q„²{èp·–"‰ q0v ~{{}l£lé}l‘„l™htk{}ºv‘l 0v¯ è‘h ‘6~{{}l„vvzp é} èpq¥™‘{Th ‘ véè k vé" S h„ol v·‘sDGé²²²¢¤¦¨©§ "$&(0234$629@B§DF©G02p Epo– q‘k0 „{2vu{‘RÓhl„léo {„h‘„p{è¦él¤l„v{wuév ‘„l¸é %é}p„{T{ k} hèé¸l‘„è0h‘{v bT¥„v„„pl‘„l™}{®~hé „} è6h‘é ‘lX ‘"v"l‘pé{}é ‘"vdl„è„T}h" }vX‰””vèt}éè®v°{‘™©„¯ ‰hé„w kl ‘³l‘„ è ³élp „–p{ q„2t0©lé v"véév¤Íl{èht}{vlI ‰‘ºv{èpèévt q„ v‰£hvp薑{„lhlX ³ V9X°h`"l ©}b 4w ®{º t~k évv„™pop{ ‘è‘El¤l‘„èQvv„b}‘ éw·{oéX²{ p‘‘t}{«~¤}w0é}pv"‰‘2Q}p„v‘°tb q鑄 · ºl‘‘ v èT ‘6‰”ve®é}l hg l‘El}w}¥hé2U2vé}p„„pul‘¤©„½}é% ‘o鑰l‘hpq¨hhU{è ‘¦6Tl{èg5v pè T„ h‘‘Upl·l T°évvt2 vvx€upqlpv ‘uè ‘q„ ‰‘„‚„„ ˆí9 è%t6v‘¸v {‘³pqlpv ‘% è‘q¤}é ˆ„„ p”%}{‘{ lvxÙ2 é}p³ qh„E‘}E ”vv pqlpv ‘%è ‘q„l‘ élpé}‘ «~°tt l‘{„v º}0p‘6} ¯é}p„vs{‘‘ ¦¯ é}0 vD •–Gíz—Ó·p{ q„‰l66}p™l‘ hhU{è ‘ºvw}‘‘ v‘‘X²{„ VU«£ £v{l‘0„ ºl T£°{vé qv l‘kk „bphXX èl³lv0élpé}‘ «~{°}¸}{‘èlw}{6©„¸é}p™«{·l‘é‘«« plD qt~{l ‘q{èp¯¥‰hé„qlé2Q}vX}é° b qoéé„ ¸v ºk{T{èpé}{· q kl{ ‘qlv©}‰º¸vlpT¤wév詑pl0~wTl0lév2tb{‘6k£v¥} Q©„© }vX vé El‘™{{vékèl vº6T{lèEt xdfdh§‚íFFíG&ln ~V é{qtb06}l{«º‘pll } dh{l „vll Xp }V{6v‘pru ‰{‘h‘– „„}z¦¸évv„„‘} `xtr24bt2l‘¤‘lpé}éè «~©} ‘vb~‘0‘ ‘³l©©{vé qvévv`º è0 –w‘pl%h%£hpv 6l Uvr XX²°‰‘E£hpv –6T{lè § 2~{hw vklt}dv „l q q vsvé ¦è{lX qéè‘ vskºlé{6oq k{ é‘h‘v {Tƒp„²{vbukéw¤l T ydz{zThe Eßfect of New Links on Google PageRank 31 IntroductionSurfers on the Internet frequently use search engines to find pages satisfying their query. However, there are typicallyhundreds or thousands of relevant pages available on the Web. Thus, listing them in a proper order is a crucial and non-trivial task. The original idea of Google presented in 1998 by Brin et al [3] is to list pages according to their PageRankwhich reflects popularity of a page. The PageRank is defined in the following Way. Denote by n the total number of pageson the Web and define the n unrecognized text n hyperlink matrix P as follows. Suppose that page 2 has k > O outgoinglinks. Then pû : 1/ls if j is one of the outgoing links and PU unrecognized text O otherwise. If a page does not have outgoinglinks, the probability is spread among all pages of the Web, namely, pi, : 1 In order to make the hyperlink graph connected,it is assumed that a random surfer goes with some probability to an arbitrary Web page with the uniform distribution. Thus,the PageRank is defined as a stationary distribution of a Markov chain Whose state space is the set of all Web pages, andthe transition matrix is unrecognized text (1) Where E is a matrix Whose all entries are equal to one, n is the number of Webpages, and c 6 (0,1) is the probability of~not jumping to a random page (it is chosen by Google to be 0.85). The Googlematrix P is stochastic, aperiodic, and irreducible, so there exists a unique row vector vr such that N 7rP:7r, frlrl, (2)
  3. 3. Where L is a column vector of ones. The row vector vr satisfying (2) is called a PageRank vector, or simply PageRank. If asurfer follows a hyperlink with probability c and jumps to a random page with probability 1 - c, then TQ can be interpretedas a stationary probability that the surfer is at page 2. In order to keep up with constant modifications of the Web structure,Google updates its PageRank at least once per month. According to publicly available information Google still uses simplepower iterations to compute the PageRank. Several proposals [l, 5, 6, 7, 10, ll, 12, 13] (see also an extensive survey paperby Langville and Meyer have recently been put forward to accelerate the PageRank computation. This research directioncan be regarded as a system point of view. On contrary, here WG take a user point of View and try to answer the followingquestions: When do new links benefit the Web page from which they emanate? When do the pages from the same Webcommunity benefit from the link creation? Is there optimal linking strategy? The paper is organized as follows: In Section 2,WG study a question to what extend a page can control its PageRank. Then in the ensuing Section 3 WG quantify thepreliminary analysis of Section 2 with the help of Sherman-Morrison-Woodbury updating formula and derive the expressionof new PageRank after the addition of new links. In Section 4 using asymptotic analysis we obtain natural conditions thatshow if a newly created link is beneficial or not. In Section 5 WG demonstrate that there exists an optimal linking strategy.Finally, WG provide conclusions in Section 6.f xX2"hz é{ giq‘vlXVlé é qtT{vVVéé²{èp¯Ùp d„pk®lblwlé}dléT} ‘! • ¯l‘"ov q«{èpé} q X²{}l v¦}p v v³lé}} l‘6 ‘èlt}wk{}l6t{6do • 0{‘6‘{vév‘è è~·l·lXvw©l‘ k{}l° i v{™vélv{ql vº«pléèé«{ vd~wT{¤ bw®k ‘0lé~{lp‘°¸}{vT6‘{vUl~vq"å véTB„k{v‘ lE{‘™™ v Tè‘0 qXov0Upl«{èp¤{„l‘è„ tw Aü¨ wy ¢¤ 6DF©Grvv ¤2X&–92hf 4‚G"„i }} §‚Fd &xX2"hz Y9 ¯t ÙÙpè T¥¥ lp tXlé} §‚u EE § ”p º "2 "22p2½²s ƒD2¤ ‰é‰Qvv„b}‘ qé‘X – Y9QèX}{è ‘Ué ‘wv” vl”èév0 ‘‘ }é –vq{vp董 è‘qzvU£évvv‰hé„T{‘„pk „kz"”™è pw™é}p{wé}‘p¥è{w{v‘h葙tplé 0q qè lévq{vv ‘£è ‘q„‚" T "°lé} l‘T {éT2l‘EQvv„b}‘³„}uU°lèk{ p·‘{h ‘é²2v‰l‘{0{{0‘{6v‘ p‘6{{¦ ‘Ué ‘”p vq{vp葸 ‘qgٖ è‰} T {³„kl 6T{°l©éT”ohl„é ³é}p„}ºovh{lpéè{Q}p„v‘Ips2{‘t¥„é Vq"b {k"{„„} é{‘££ v èT ‘–ék ‘Ioq‘{„{k v°° v"l‘ Qvv„b}‘tÕqré¯2í `§g! 3‚"$&¤í X do)V2 q„‘}{}g l0w 0„pvs{‘{ 6Tl{è1 2 3‚"$ 3 2b}0„èpq¥™é”p 2• ©5 3‚"$ 3 9 l xX @ hz é Dé{¦©©t¥l‘qDÍlºov ‘0Ev¯lé£t q„p{«~66}l{«7U2TT q qo ‘£” qtlllY{è0™}élv{‘ ‘ }{vT–w } 9A¢Dzd0922F®èlEl‘™k{T{®lévGV0—2 FHph‘{{{vél«{èpéwUo~"„ {‘‘ ~wT{„¦92h³vl£ov qé²{„ ¤h0l‘™6T{lèR"0q}éElé™k{}l 2t‰}élv{‘ ‘éQ¯P {‘0 h‘2Uvphtkè{"{6~wTl yX2"h¦Uo2 v{0 }élv{q{èp¯pév{6} èp B R U3X`bdV@$ze&” D—! 3‚"$3i§‚F yX é‘ll‘„„¯"0é”vh j ¢ y06 ¦k F dvé ©ék ‘ºlé0kl{vé·¸vlpT·‘lp „k~pV¥ „}ºlèl££ p}hh 3¢ m ˆ0 3¢ D3H n $0 ” 7$r ¤7$9b 4o F ¤ 7 F p¯ü¢¤¦¨F9" r4© ¢&(Gí2 To what extend a, page can control its PageRank?The PageRank defined in (2) clearly depends on both incoming and outgoing links of a page. Thus, the easiest Way for apage to change its ranking is to modify the outgoing links. Below We shall show that the PageRank can be Written as a
  4. 4. product of three terms Where only one term depends on outgoing links. It will allow us to estimate to what extend a pagecan control its PageRank. To this end, WG first recall the following useful expression for the PageRank [2, 9, 13]:Where ek is the lc-th column of the identity matrix I . Now, define a discrete-time absorbing Markov chain unrecognizedtext t : O, 1, . . with the state space unrecognized text 1 . _ _ , unrecognized text Where transitions between the states1, . . . ,n are conducted by the matrix CP, and the state O is absorbing. Let unrecognized text be the number of visits to statej : 1, . . . , n before absorption. Formally,Where unrecognized text denotes the indicator function. It is easy to see that the value zij is the conditional expectation ofNj given that the initial state is 2. Let qm be the probability to reach the state j before absorption if the initial state is 2.Using the strong Markov property, WG can now establish the following decomposition result.Proposition 1 The PageRank of page 2 : 1, . . .,n is given byProof: It follows from (3) thatINRIAl4 èl | vé é{n` f 0 t®l‘0 {k£{T‡}"6Tl{è 0 Ó èXíQl‘°vqlévwbél0‘I ‘T{è‘ pl–‘t}–l·v wT{2{‘6pvvXX v0‘qwT{èp¯  }‘‘"" {„kl{ ol ‘ºv‘wk„èp„blºlé0„vl–v w}é”p‘£‘U éTlpq¥£}{£v‘ l0Ull v{ 20v{®v0‘{‘„élèp®} } hltQ‰‘£‘oh‰l‘„v{élTht q„"‘I ‘T{èé00 v{2‘t}®®v‰{‘Q}vXX  }é6 0h{„ sDGé²²²¢¤¦¨©§ "$&(0234$629@B§DF©G0¥pél„h‘h{èphh v}hn l yX2F° ¤£ˆ ‘"™é”v F ¥z& 7$0‚5 F & ¢d‚5 F q‘éklèlq{è‘0{‘‘ tvk‰XhéT{èp¤ yX""™è00„ ‘ }lEpq{vè‚YX² ‰‘‘ qXop”Uplèl v¤¤ v{2‘thÍ9¥{‘{„lh{lépvvXX  }‘0vsévv wv–‘{h ‘é²}sl‘{ –‘èl ‘è w0‘„lºv‘ulé·l{ ‘Ué ‘°v {‘¸v‘lvpèé³ è‘q0v®évv|{f„éoph w }‘p葔{‘v‘lvpèé6è ‘qU0é}pv·vhl{vd«wbQ}p„v‘°‘·{E02‘èl ‘yp²lp) @ G F6„ 6 92h6ˆ¨9 ¤ ÙV‘{ @ r¨ Q ™E‘lpé}éè «~¤lº{olél vw¤l|‰k{vk{èé æ {v {§b}{–{éTå {‘6‘‘U2 p‘é h¦p¨4 ¤y}é‘l”q 6Tl„è`é ¤¤ pq XXbtt év{ ©v wT{‘é} hll ‘è™{°vw‘ p®U„„}éll ¤{‘tpkk ”é}p Uv pw‰l0évvX"lé}b}{£ ‘hè‘ p‘6l whéw·0 pètoE0vvXb}é ¤è{‘„èph p{"tkp }l„ Ep {v {‘« lX~vs{‘é ©„¯ Ùévv`6 qhX°‘v¤é”p¸vq{vv ‘ ‘q‰{‘ 2t¤v{ èhkp l {‘³ T"°Uv‘ ‡v l éo–l‘„l”t£vè0h~£‘ºw }élº{olél³¯Eèwo‘è"vé} qktz{–klé ‘” ¤ ‘o{vè léb q鑄éo£}¯évvXwèl‘pq¥pqlpv ‘ è‘q yèX”vXh ‘}év 葔‘qqXvEl‘‘ Q}p„v‘0}s–©„ºv02鑫~pz‰‘‘ }q{‘vwv Õí¯v”®l {‘°{}0”véè k v©lé}2 ‘}év 葷vél„„·opékt q„{v‘è–èhl™u{v‘hèéé0‰‘”q v{2‘t‚Y9 é é"l°hévp{«b °l‘t p{ ¦6‘}{0lé}puE{‘{o³vv l v‘è vh® ”élTp0h„Vlè o6£hpv 20„pkélX{‘6pvvX}‘·p©° vh}{«{‘0 –l„} h XíÍ Ó¤l飄él‘èé”l„ol vé""£l‘Tƒ‘T2¦ºé}p£l‘p‘ ¤él®è{{v{b{„lv‘woXw{– élXvl è{bpvvX}‘I –E§E ¤¤ s@2B562ÿƒ2 –E doQépov kt qQ¦–é}pèl vt hh„l èéhp} ‘„èlXT{„ ™hh „l èéh„d¨èlévq p{Q}Iv„‘w} è~vT"‰pll‘0"lé}wléé}p"èl0‘„ ‘qzévQ é q¤} 2{‘évv„p{T"v{ é éw·‘„B è‘qvl™Uv hl„ ºé”p™èé ‘ „‰‰ {v –{v pw‰‘¯‘l‘2p ‘ qèl vº}Q‘¼ è‘q „}º {h}w q„ ¤v‰p‘™{v‘6‘I ‘T{{ }pl‘‘ hh„l ‘°6T{lè x fkThe Eßfect of New Links on Google PageRank 5Substituting the last equation in (6) WG immediately obtain El The decomposition formula (5) represents the PageRank ofpage 2 as a product of three multipliers Where only the term zn depends on the outgoing links of page 2. Hence, bychanging the outgoing links, a page can control its PageRank up to a multiple factor ZM : 1/(1 - qü) 6 [l, (1 - unrecognizedtext wher@ qu G (0, C2] is a probability to return back to 2 starting from 2. Note that the upper bound (1 - unrecognizedtext (approximately 3.6 for c : .85) is hard or rather not possible to achieve because in this case a page 2 points to pages thatare linking only to 2. Such a policy makes 2 and its neighbors isolated from the rest of the Web. If page 2 does not haveoutgoing links, then ZM is very close to the lower bound 1, since there is almost no chance to return from 2 back to 2 beforeabsorption. Bianchini et al. [2] used a circuit analysis to study in detail the influence of pages Without outgoing links(leaves, dangling nodes) on the PageRank of a Web community. The authors of [2] came to the same conclusion thatdangling causes a considerable loss in ranking. The formula (5) helps to quantify this loss. We note that even a threefoldincrease of the PageRank might not be considered as a significant improvement, since Google measures the PageRank on a
  5. 5. logarithmic scale [15]. In the ensuing sections WG show how a Web page should use its scarce resources to increase itsPageRank.3 Rank One update of Google PageRankLet us consider a Web page with lc@ old hyperlinks and kl newly created hyperlinks. Without loss of generality, we assumethat the page with new links has index 1 and the pages towards which new links are pointed have indices from 2 to klunrecognized text 1. Then, the addition of new links can be regarded as one rank update of the hyperlink matrixand Where ls : lc@ unrecognized text kl, pf is the first row of matrix P. In [10] the authors use updating formulae toaccelerate the PageRank computation. By restricting ourselves to the case of rank one update, WG are able to perform amore comprehensive analysis. The next theorem provides updating formulae for the PageRank elements.0r @ hz kvV ¦n Tvèé}l™‘h g"$ 3 ®e v yX @ hz p qg$&¤ d)X r d¤ ˆ§k„X r d¤ ˆ§§‚ 3‚"$ 3 vé ºov kXpéhl v dP ‚j ) qèé – UV¥”é”p3 „ V‘„l¢ tb{‘º"Ó~£lT‡v6Tl{è & V} éé h ) o § " vw‰hé„q"t vo g$ ¤%}é ºélèé tXoq¥Vpo ` v$ee"¤22"hz X ¯t b‘‘ hè‘0{‘2h‘„l6v¤Ó¸v{ltkp¤Ù©hq q‘‘{”‘U éTl ‘0h pl–‘ | G¯{ q ¦ 3 ‘¥ „}ºlèl g ¦ 3 2 g$ ¤ ˆ§g 3‚"$ 3 ‰é¯‘él„2‘èl ‘ h ‘”lé}UTv£„héT{èp¤h )& 2• 63‚"$ 3|j§‚ q‚"$ 3)§g q‚"$ 3q‚"$ 3
  6. 6. VD¢ h "z ¨" •(023¤ 4}4t( ©f¥ ¤2X§1F ©9—h ¤"et " µn"„ 3hB} 6DF©Gr¢G ©3©©9‚í„ nt e"G(@4z(D 94t( " ¢¤@X$Vb3‚"$ 3 g$ ¤V‚ 6 § 4 q‘éklèlq{葷{‘°}UTv–oq‘{„{k v³ vq n$"$ 3 9l` 92@ hz¯ hl ~XX£}é ~pVo¯¥ pq{vè‚ D‰}é g D² i‰‘6{„l‘è{™ %‰‘pl„ 6}{” uè ‘6èl¦¦ v{2é ‚YX²nىé}p·6‘U éTlX™«wwv‘lvpèé ‘qq{‘¦è³l‘0 qXov0Upl«{èp¸¸v{2é °‚ v¦ v‘ ºl‘0l„opé q –‘èl ‘è ™ èQU”}UX²{„ V q p¯ü¢¤¦¨F9" r4© ¢&(GíTheorem 1 Let kl new links emanating from page 1 be added. Then, the elements of new PageRank uector are given by thefollowing updating foimulaeProof: Applying the Sherman-Morrison-Woodbury updating formula [4] to [I - unrecognized text WeSubstituting the above expression for unrecognized text - unrecognized text : 1, ...,11, into (10) and (ll), We obtain (8) and EThe results in Theorem 1 are in line with formula If page 1 updates its outgoing links then in the decomposition formula forTrl only the second multiplier will be affected.INRIAV• B 9 k2X Ó qX V‘èp{‘t ‘„hévèè~°‘p ‘l‘„·l‘ ‘E‘I ‘}lh°évvnn „p ‘"l6‘ vé‘{v }‘ èèl „ v"{„pw‘ ‘·é}pB¸ lp@vh©é}p”lé}2év2·éT{©{R htºé}ptvE‰hé„s ¦. v{2‘t YX² {‘°{‘ { %–‘èl ‘è ” è"è o{„vl°} ul‘ºkXopé%2é«{è‘ 2 ¥„«{‘”è o{„vl°p2{6} {‘ž lv0™ ‘Ué q ‘0vº‘l‘„‰l‘„l™t–hhU{è ‘°é}l¤ lp 0épw0lq6 v{k ‘–l‘{v‘p }vBp b}{lé}èVlpw}‘ ‘qw}{v ‘ ‘„ 0h– }vB‰{‘6è¥0 vhév‘U0l T¥lv0évv„ {„ v£‘ è ‘q} E hpl£º{v‘hè‘0T‰{‘lv0®{è0phéw·kèlé}l vºqoé{‰‘„º”‘„ év0èé– èé° lXT{„ El 2‘è{v‘pék 6èl¸k„v„{v vl‘„" ‘qq‘twº pèh‰l°kè{l„è„T}h£ }vXQ‰‘pkh0qlvlt™}évèql " ¤{‘‘ ‘hbl„ol v·} èT"él”” ‘ll‘„„  ot}{«« 6{‘t {l‘v §© Es™p 2¤26s dob}‘éè¤l‘–vlh0q{}{ vé} qkt"l66 pl–‘t}n D}é tX"‘„`™ ®k6è„hl ¤o pl™l p‘}é 2léb¸}{vTwévè” é q鄔hh l‘hhU{è ‘6T{lè Q {lX qéo ‘ vQ‰‘vlh0qlvlt sDGé²²²2 b bév l‘–é}p° éo{„pkXè{®{v‘hè‘6é è®{oo „{{l°évvXl T£vl–wé}wvol{ „ ¤hº éèp¦Tvè‘0vp2 E‰‘Xk6–é~U0lé6 é}p„£l T{oo l·é}p 6vT „p~~ U v‘·{·l‘ {}0¦Ev02鑫~p‚b{{h”©„Eop”–‘‘è~–"0„v°k}¯¦6 }vXQl T"2k‘ll „}ºlXvwET lp1v‘™{6}‘vl‘„è·0l„ }l v„è°l6} Vhé2U}zkl do™ébéT¢ov kt qb{ v{2‘t tXo£p {k„I"–l{éT£lé”qèI{é2Uo~"{ l‘”vt „ } {‘™‘ {v‘hèé–vsévv60t‰‘lp pk{èpé} l b}l‘w} vqhh „l èé6l {鄥& lp1évv„ èl½‘èp w}‘h‘‘ é”pE¸v{„}l„è0ép²”v vl‘„2évv„„³é‘k{‘{0v{vplé¤Q}p„v‘©v }v§6 élXvl„"«fr kX9 é wº „h‘ ”vè„h"{{{olél”l2é}p" b~w}ll ‘™„ {v l‘twévvpvtwv v„ h b§ b "bpo¥l‘£‘I ‘T{葖– v{2é vtXo2}{{éTv„ovw q ‘‘ {h Dz{‘£{v‘hè‘2}dévv£ éo{„pkX é e3Ó0{‘‘„¨kèlé}l v¯plé‘{vév‘è è~ bq‘éklèlq{è‘0{‘toq‘{„{k v¤è bb
  7. 7. ¢¤¦¨©§ "$&(0234$629@B§DF©G0DVD§bThe Eßfect of New Links on Google PageRank 7In the new situation, the probability §11 tO return to page 1 starting from this page, is givenunrecognized text q11 1 Z qfl > 1:2 Hence, the page 1 increases its ranking when it refers to pages that are characterizedby a high value of qm. These must be the pages that refer to page 1 or at least belong to the same Web community. Here bya Web community WG mean a set of Web pages that a surfer can reach from one to another in a relatively small number ofsteps. Let us now consider formula First, WG see that the difference between the old and the new ranking of page j isproportional to vh. Naturally, hyperlink references from pages with high ranking have a greater impact on other pages.Furthermore, the PageRank of(13) 1JÕ V E zij > Z E F2 Indeed, if this inequality holds then the link update by page 1 leads to higher probabilities ofreaching page j from any page that has a path to j via page 1. Thus, in formula (5), the third multiplier will increase and thesecond multiplier will either increase or remain the same depending on Whether there is a hyperlink path from j back to jpassing through page 1. Note that if several links are added by page 1 then it might happen that some pages receive newlinks and loose in ranking at the same time. Such situation occurs when a new incoming link is created simultaneously withseveral other links, which point to “irrelevantÓ pages. The asymptotic analysis in the next section allows us to further clarifythis issue.4 Asymptotic analysisLet us apply the asymptotic analysis to formulae (8) and (9) when c is sufliciently close to one and the Markov chaininduced by the hyperlink matrix P is irreducible. The asymptotictG p} ¢™k °è„p{è¤o pl®{0v‘p‘v‰„h‘ T} hl v‘‘ vv} B°l °o hl ¤èhk£l0„{é e hB9 l é{ xv‘‘{ppwE} èT¥é"{0 ‘{èp£lè0‘ ™éT{‘w}Vopé qèl vé"{éTl‘TƒèQ–©„·é}p®èlº‘ lXT{„ ¤è ‘q}é ¤è{‘„èphp{"U‘‘‘{v {‘„l™‘¼è ‘q„ ¢ h ° " `£ "„t p" íq G¨q4 $@"$rD ¤íh ¨" (rV` 9¤2X" ¥¤2@ @ gH ne ¤G¨t ¥ © 4©©" 4( ( ¥0 G © ¤ fri •(0 nG §bx ‚ F¥0&"@A ¢¤" B¤©G ¤—"G( Gz(D 4 4(t 4r© —q bG ©¢t iF 4t 4h }‚•(rV D ¤2@ h (}dF í" ¤¦6DF©GrRh ¤2Xb fF¥ t q"X¥¤2@ @ d ! q # %z ¤qdF Gl G ¦¨" •(rV$X ¤2@ $ n(}dF í" ¤¦6DF©GrRh ¤2X § q B&2 lí D£tí&xe5 F0H xe
  8. 8. ©t‰{‘{vq qt£‘{}~„ol v·}é A t‰l‘– qhtTl vº6T{lè¤vs{‘–¸vlpT0w } é q o„ 0h 0Qh é¥vélt q£ ‘«{bk{T{®¸vlpTwé}¯}{‘bv Tp‰l{ „Qépww‘p¤Í„{ wv ‘èé"vQvhv„lpév dos ¯‘T¦‘{TvQlézéw~sk{}l„”„hd}hléwlév{·Ùdd‘p‘v™{bl‘T³{éTz¥vé qèl v6 pé}w}h{„l T e— ¤ dF Gl GgB¨"f•(02 X r @ v(}dF í" ¤¦6DF©GrRh ¤2X r" (f d§ &¤i) R Bt 41X m¤G2X l fF¥ rD6 $¤2Xbq §6x” ¯t p{k„s¥06}p”·wév‘v”}‰T}{ v‘è G F§3„ls«{4q º ³{‘00 v{2é EE v qtGv¥¥ v èT2 qtG‚2 3‚"$3§ 3e56btG b$3§xh 50 nq t 0&&¤í ‰é¯Q"°él6l‘¤lXkpèph2dv‘{h2k„l „oqé} k vuž p&l‘·¸}{vT©wé} %v„‘wT{vvyl p è é鉑pl„‘Õq G qtG‚2 q‚"$ ¤ xFe6„f n3 t „U 3 –he5„f8@„ 5 F0Hd he50@d@bg @ I„ Hp¯ü¢¤¦¨F9" r4© ¢&(Gíapproach allows us to derive simple natural conditions that show if a Web page with newly created links and its neighborsbenefit from these new links.Theorem 2 Let c be sujjîciently close to one and let page 1 has kl new links to pages unrecognized text ..., kl unrecognizedtext Assume that the Markov chain induced by the hyperlink matmï P ls lnreduclble. Then, we have the followlngcondltlons:the PageRank of page 1,whepe mij is the mean fipst passage time fpom page 2 to page j ifc : 1.Proof: First, WG make a change of variable c : 1/(1 + p), with ,0 > O in the formula for Z (C) as follows:Then, we use the resolvent Laurent series expansion for the Markov chain generator (see e.g., [14, Theorem 8.2.3DWhere H : yr is the ergodic projection and H is the deviation matrix of the Markov chain induced by P. Since WG considera finite state Markov chain, the above series has a non-zero radius of convergence. Let us now prove the first statementofthe theorem. It is enough to show that Condition 1guarantees that unrecognized text +1unrecognized text for all c sufñciently close to one, or equivalently, for all ,0 sufñciently close to zero.

×