Investigating the Impact of the Blogosphere: Using PageRank to Determine the Distribution of Attention

Much has been written in recent years about the blogosphere and its impact on political, educational and scientific debates. Lately the issue has also received significant attention from industry. As the blogosphere continues to grow, even doubling in size every six months, this paper investigates its apparent impact on the overall Web itself. We use the popular Google PageRank algorithm, which employs a model of Web use, to measure the distribution of user attention across sites in the blogosphere. The paper is based on an analysis of the PageRank distribution for 8.8 million blogs in 2005 and 2006.

This paper by Lars Kirchhoff, Axel Bruns, and Thomas Nicolai for the Association of Internet Researchers conference in Vancouver, 17-20 Oct. 2007, addresses the following key questions: How is PageRank distributed across the blogosphere? Does it indicate the existence of measurable, visible effects of blogs on the overall mediasphere? Can we compare the distribution of attention to blogs as characterised by the PageRank with the situation for other forms of Web content? Has there been a growth in the impact of the blogosphere on the Web over the two years analysed here? Finally, it will also be necessary to examine the limitations of a PageRank-centred approach.
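To make the "model of Web use" mentioned above more concrete, here is a minimal sketch of the textbook random-surfer formulation of PageRank, computed by power iteration. It is an illustration only: the slides themselves note that Google's production algorithm is not entirely known, and the study retrieved published PageRank values rather than computing them. The function name `pagerank`, the damping factor of 0.85, the iteration count and the three-page toy graph are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration (illustrative, not Google's system).

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a baseline share from random jumps.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outlinks spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: a and b both link to c, and c links back to a.
toy_ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph, page c collects links from both other pages and ends up with the highest score, while b, which nobody links to, keeps only the baseline share. That is the sense in which PageRank can be read as a proxy for attention.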

    1. Investigating the impact of the blogosphere: Using PageRank to determine the distribution of attention (Lars Kirchhoff | Axel Bruns | Thomas Nicolai, 18.10.2007)
    2. What are the questions?
       • Is the impact of the blogosphere different to other forms of online media?
       • How is PageRank distributed across the blogosphere?
       • Does it indicate the existence of measurable, visible effects of blogs on the overall mediasphere?
       • Has there been a growth in the impact of the blogosphere on the Web over the two years analysed here?
    3. What have we done?
       • 2005
         • ~15m profiles from blogger.com
         • ~8.871m unique blog URLs extracted
         • Retrieved Google PageRank
       • 2006
         • Same profiles
         • Slightly more unique blog URLs extracted (~8.888m)
         • Retrieved Google PageRank
    4. Why PageRank?
       • Available for almost any web page
       • Easy to gather
       • Global property that takes the whole Web into account
       • Search is the most common way to look for information
    5. What do we have? [Figure: Blogosphere PageRank Distribution 2005; axes: number of blogs, PageRank]
    6. What do we have? [Figure: Blogosphere PageRank Distribution 2006; axes: number of blogs, PageRank]
    7. What has happened? [Figure: Increase and Decrease (%) of PageRank from 2005 to 2006; axes: percent, PageRank]
    8. What does this mean?
       • Strong decline at PageRank 1, 2 and 7-10
       • Lower end: effect of focus on Blogger?
         • Blogger as sandbox → high attrition?
       • Higher end: shrinking A-list?
         • Churn away from Blogger?
         • Harder to achieve high PageRank in a larger, more diverse Web?
       • Need to track trajectories
         • e.g. how many low-PR blogs rose from 2005 to 2006?
         • e.g. how many PR7+ blogs survived from 2005 to 2006?
    9. What are the limitations?
       • Coarse values
       • Algorithm is not entirely known
       • Updates for Google PageRank are random
    10. What next?
       • Use more blogs
       • Measure PageRank more frequently
       • Use other indicators/measures (Alexa, Technorati, Bloglines)
       • Discuss different metrics
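The comparison in slides 5 to 8 can be sketched in a few lines of Python. This is not the authors' code. It assumes each snapshot is a mapping from blog URL to an integer PageRank from 0 to 10, and it reads the chart in slide 7 as the percent change in the number of blogs at each PageRank value, which is one plausible interpretation. The function names and the example URLs and values are made up for illustration and are not data from the study.

```python
from collections import Counter


def distribution(ranks):
    """Count how many blogs hold each PageRank value from 0 to 10."""
    counts = Counter(ranks.values())
    return {pr: counts.get(pr, 0) for pr in range(11)}


def percent_change(before, after):
    """Percent change in blog count per PageRank value.

    Returns None for a value that had no blogs in the earlier snapshot,
    because a percentage is undefined in that case.
    """
    return {
        pr: None if before[pr] == 0
        else 100.0 * (after[pr] - before[pr]) / before[pr]
        for pr in before
    }


def trajectories(ranks_before, ranks_after):
    """Count (earlier PR, later PR) pairs for blogs present in both snapshots."""
    shared = ranks_before.keys() & ranks_after.keys()
    return Counter((ranks_before[url], ranks_after[url]) for url in shared)


# Made-up toy snapshots, not data from the study.
pr_2005 = {"a.example": 1, "b.example": 1, "c.example": 7, "d.example": 3}
pr_2006 = {"a.example": 2, "b.example": 1, "c.example": 6, "d.example": 3,
           "e.example": 0}

change = percent_change(distribution(pr_2005), distribution(pr_2006))
moves = trajectories(pr_2005, pr_2006)

# The two trajectory questions from slide 8, asked of the toy data.
rose = sum(n for (old, new), n in moves.items() if new > old)
survived_pr7 = sum(n for (old, new), n in moves.items()
                   if old >= 7 and new >= 7)
```

Keeping the per-blog pairs in `trajectories` rather than only the two aggregate distributions is what makes the questions from slide 8 answerable: the aggregate counts alone cannot tell a blog that dropped out of PR7+ from one that was replaced by a newcomer.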
