PageRank
Advisor: 許慶昇
Graduate student: 鍾聖彥
Reference
• Liu, Bing (2011). Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data. 2nd Ed., Springer-Verlag Berlin Heidelberg.
• Page, Lawrence, et al. "The PageRank citation ranking: Bringing order to the web." (1999).
• Brin, Sergey, and Lawrence Page. "The anatomy of a large-scale hypertextual Web search engine." Computer Networks and ISDN Systems 30.1 (1998): 107-117.
• PageRank, http://en.wikipedia.org/wiki/PageRank
• 台灣搜尋引擎優化與行銷研究院 (Taiwan Search Engine Optimization and Marketing Research Institute) - SEO & SEM Blog, http://seo.dns.com.tw/
• Croft, W. Bruce, Donald Metzler, and Trevor Strohman. Search Engines: Information Retrieval in Practice. Reading: Addison-Wesley, 2010.
Why This Order?
PageRank was developed by Larry Page and Sergey Brin.
The name refers to Larry Page (hence Page-Rank), not to page 1, 2, 3, ...
Larry Page
• Born March 26, 1973 (age 40)
• American computer scientist
• Co-founder and CEO of Google Inc.
• As of 2013, Page's personal wealth is estimated to be US$20.3 billion, ranking him #13 on the Forbes 400 list of the 400 richest Americans.
• Inventor of PageRank, the foundation of Google's search ranking algorithm.
Sergey Brin
• Born August 21, 1973
• Russian-born American
• Co-founder of Google
• As of 2013, his personal wealth was estimated to be US$24.4 billion.
What is PageRank?
PageRank is able to order search results so that more important and central Web pages are given preference (Brin and Page, 1998).
It is a method for rating the importance of web pages objectively and mechanically, using the link structure of the web (Brin and Page, 1998).
The year 1998 was an important year for web link analysis and web search (Liu, 2011).
Applications
PageRank has applications in search, browsing, and traffic estimation (Brin and Page, 1998).
To test the utility of PageRank for search, its authors built a web search engine called Google (Brin and Page, 1998).
Link Structure of the Web
Backlinks and forward links:
• A and B are C's backlinks.
• C is a forward link of both A and B.
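The relationship between forward links and backlinks can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the helper name invert_links are assumptions, and the graph contains just the two links named above (A to C and B to C).

```python
from collections import defaultdict

# Forward links: page -> pages it links to (only the two links named above).
forward_links = {"A": ["C"], "B": ["C"], "C": []}


def invert_links(forward):
    """Return page -> sorted list of the pages that link to it (its backlinks)."""
    backlinks = defaultdict(list)
    for source, targets in forward.items():
        for target in targets:
            backlinks[target].append(source)
    return {page: sorted(backlinks[page]) for page in forward}


backlinks = invert_links(forward_links)
print(backlinks["C"])  # ['A', 'B']
```

Storing forward links and deriving backlinks on demand keeps a single source of truth for the graph.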
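Definition of PageRank
The PageRank formula is designed so that "a page's PageRank value is the sum, over all pages that link to it, of each linking page's PageRank value divided by that linking page's number of outgoing links."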
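Definition of PageRank
This formula is a computation that converges.
Using the example above, assume every page starts with the same PageRank value. The calculation then proceeds as follows (each step uses the values from the previous step):

1. PR(A) = PR(B) = PR(C) = 1/3 = 0.33
2. PR(A) = 0.33, PR(B) = 0.33/2 = 0.17, PR(C) = 0.33/2 + 0.33 = 0.5
3. PR(A) = 0.5, PR(B) = 0.33/2 = 0.17, PR(C) = 0.33/2 + 0.17 = 0.33
4. PR(A) = 0.33, PR(B) = 0.5/2 = 0.25, PR(C) = 0.5/2 + 0.17 = 0.42
5. And so on.

The values eventually approach:
PR(A) = 0.4, PR(B) = 0.2, PR(C) = 0.4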
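The steps above can be reproduced with a short Python sketch. The link diagram itself is not part of the text, so the graph used here is inferred from the arithmetic (A links to B and C, B links to C, C links to A) and should be read as an assumption; the helper name step and the iteration count are likewise illustrative.

```python
# Link graph inferred from the arithmetic above (an assumption):
# A -> B, A -> C, B -> C, C -> A.
forward_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}


def step(ranks, forward):
    """One update: each page splits its current rank evenly among its forward links."""
    new_ranks = {page: 0.0 for page in forward}
    for page, targets in forward.items():
        share = ranks[page] / len(targets)
        for target in targets:
            new_ranks[target] += share
    return new_ranks


ranks = {page: 1 / 3 for page in forward_links}  # step 1: equal starting values
history = [ranks]
for _ in range(60):
    ranks = step(ranks, forward_links)
    history.append(ranks)

# Show the first four steps, then the value the iteration settles on.
for number, values in enumerate(history[:4], start=1):
    print(number, {page: round(value, 2) for page, value in values.items()})
print("limit", {page: round(value, 2) for page, value in ranks.items()})
```

Each step reads only the previous step's values, which is why the new values go into a separate dictionary rather than being updated in place.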
Convergence Properties
Definition of PageRank
A Simple Version of PageRank
• u: a web page
• B_u: the set of u's backlinks
• N_v: the number of forward links of page v
• c: the normalization factor that makes ||R||_1 = 1, where ||R||_1 = |R_1 + ... + R_n| (so that the total rank of all web pages is constant)
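In Page et al. (1999), these symbols define the simplified ranking R(u) = c · Σ R(v) / N_v, summed over every v in B_u. A minimal Python sketch of that iteration follows. The four-page graph, the function name simple_pagerank, and the fixed iteration count are assumptions made for illustration. Page D has no forward links, so some rank is lost in every round, and c rescales the result so that the total stays at 1.

```python
def simple_pagerank(forward, iterations=200):
    """Iterate R(u) = c * sum(R(v) / N_v for v in B_u), rescaling so sum(R) == 1."""
    ranks = {page: 1 / len(forward) for page in forward}
    c = 1.0
    for _ in range(iterations):
        raw = {page: 0.0 for page in forward}
        for v, targets in forward.items():
            for u in targets:                      # v is a backlink of u
                raw[u] += ranks[v] / len(targets)  # R(v) / N_v
        # Normalization factor; assumes at least one page has forward links.
        c = 1 / sum(raw.values())
        ranks = {page: c * value for page, value in raw.items()}
    return ranks, c


# Hypothetical graph: D has no forward links, so its rank is not passed on.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A", "D"], "D": []}
ranks, c = simple_pagerank(graph)
print({page: round(value, 3) for page, value in ranks.items()}, round(c, 3))
```

Because rank leaks through D, c settles above 1 here; on a graph where every page has forward links, c stays at 1.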
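A Problem with Simplified PageRank
Consider two pages that link to each other but to no other page, while some third page links to one of them. During each iteration, this loop accumulates rank but never distributes any rank to other pages. Page et al. (1999) call such a loop a rank sink.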
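A small Python sketch makes the problem visible. The graph is hypothetical: P and Q link to each other, Q also links into a loop S1 <-> S2, and nothing links back out of that loop. The page names and the helper step are assumptions for illustration.

```python
# Hypothetical graph: P <-> Q, Q -> S1, and a closed loop S1 <-> S2.
forward_links = {"P": ["Q"], "Q": ["P", "S1"], "S1": ["S2"], "S2": ["S1"]}


def step(ranks, forward):
    """One simplified PageRank update without any correction for rank sinks."""
    new_ranks = {page: 0.0 for page in forward}
    for page, targets in forward.items():
        for target in targets:
            new_ranks[target] += ranks[page] / len(targets)
    return new_ranks


ranks = {page: 0.25 for page in forward_links}
for _ in range(100):
    ranks = step(ranks, forward_links)

outside = ranks["P"] + ranks["Q"]    # rank left outside the loop
inside = ranks["S1"] + ranks["S2"]   # rank trapped inside the loop
print(f"outside loop: {outside:.6f}, inside loop: {inside:.6f}")
```

After enough iterations almost all rank sits inside the loop, even though P and Q are legitimately linked pages.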
Random Surfer Model
PageRank corresponds to the standing probability distribution of a random walk on the graph of the web.
The random surfer simply keeps clicking successive links at random.
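The random walk can be simulated directly. The sketch below is illustrative only: it assumes a three-page graph (A links to B and C, B links to C, C links to A), and the function name surf, the step count, and the fixed seed are all choices made for the example. On this graph the visit frequencies should land close to 0.4, 0.2, and 0.4.

```python
import random
from collections import Counter

# Assumed graph: A -> B, A -> C, B -> C, C -> A.
forward_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}


def surf(forward, steps, seed=0):
    """Follow random links for `steps` clicks and return visit frequencies."""
    rng = random.Random(seed)           # fixed seed keeps the run repeatable
    page = rng.choice(sorted(forward))  # start on an arbitrary page
    visits = Counter()
    for _ in range(steps):
        page = rng.choice(forward[page])  # click one outgoing link at random
        visits[page] += 1
    return {p: visits[p] / steps for p in forward}


frequencies = surf(forward_links, steps=200_000)
print({page: round(freq, 2) for page, freq in frequencies.items()})
```

This simulation only works when every page has at least one forward link; a page without links would leave the surfer stuck.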
Modified Version of PageRank
E(u): the additional factor E models how a real surfer escapes a loop of pages: the surfer periodically "gets bored" and jumps to a random page chosen according to the distribution in E.
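One common way to write this idea in code uses the damping-factor form from Brin and Page (1998), PR(u) = (1 - d) · E(u) + d · Σ PR(v) / N_v, rather than the normalization-factor form of Page et al. (1999). The sketch below makes several assumptions: E is uniform, d = 0.85 (the value Brin and Page report usually using), the graph is a hypothetical one containing a rank sink, and every page has at least one forward link.

```python
def pagerank(forward, d=0.85, iterations=100):
    """PR(u) = (1 - d) * E(u) + d * sum(PR(v) / N_v), with uniform E(u) = 1/n."""
    n = len(forward)
    jump = {page: 1 / n for page in forward}  # E: where a bored surfer lands
    ranks = dict(jump)
    for _ in range(iterations):
        new_ranks = {page: (1 - d) * jump[page] for page in forward}
        for v, targets in forward.items():
            for u in targets:
                new_ranks[u] += d * ranks[v] / len(targets)
        ranks = new_ranks
    return ranks


# Hypothetical graph with a rank sink: nothing links out of S1 <-> S2.
graph = {"P": ["Q"], "Q": ["P", "S1"], "S1": ["S2"], "S2": ["S1"]}
ranks = pagerank(graph)
print({page: round(value, 3) for page, value in ranks.items()})
```

With the random jump in place, the loop still ranks highest, but P and Q keep a nonzero share because every page receives at least (1 - d) / n in each round.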
Searching with PageRank