Quantitative Analysis of User-Generated Content on the Web
Xavier Ochoa, ESPOL, Ecuador
Erik Duval, KULeuven, Belgium
Web Science Workshop at World Wide Web Conference 2008
This presentation reports the results of measuring user contribution on nine UGC websites: Furl, Digg, Slideshare, FanFiction, Scribd, Revver, Merlot, Amazon Reviews, and LibraryThing.

Published in: Technology, Business
  • Quantitative Analysis of User-Generated Content on the Web

    1. Quantitative Analysis of User-Generated Content on the Web
       Xavier Ochoa, ESPOL, Ecuador
       Erik Duval, KULeuven, Belgium
    2. Topics
         • Why?
         • Studies
         • Findings
         • Implications of the Findings
         • Conclusions
         • Further Work
    3. Why?
         • UGC economy:
             • Supply: Users publishing their content
             • Demand: Users viewing content from others
             • Currency: Attention
    4. Why?
         • Demand (Popularity) is relatively well understood:
           How a ‘hit’ is born (S Sinha, RK Pan, 2006)
         • But Supply (Publication) is not...
    5. Studies
    6. Studies
         • Descriptive Statistics
         • Distribution Fitting
         • Concentration Analysis
    7. Findings
         • Distribution of supply is not Normal
    8. Findings
         • Distribution of supply has a heavy tail
    9. Findings
         • Lotka (“fat-tail”)
         • Weibull (“fat-belly”)
    10. Implications of the Findings
         • There is no such thing as an “average user”
    11. Low Class, Middle Class, High Class
    12. Implications of the Findings
         • The production of different UGC types is similar, but not the same.
    13. Implications of the Findings
         • The Pareto Rule (80/20) applies to UGC (but it is no substitute for measuring)
    14. Implications of the Findings
         • “Fat-tail” UGC production is similar to professional production.
    15. Implications of the Findings
         • The distribution is not affected by site size or production effort
    16. Implications of the Findings
         • Make your bet: head or tail?
    17. 50% of Content is generated here
    18. 50% of Content is generated here
    19. Implications of the Findings
         • Informetrics can help us to understand UGC production (and vice versa)
    20. Conclusions
         • Measuring is our only way to test our hypotheses about how the Web works
         • If you admin a UGC-based site, measure production to gain insight into the other side of your economy
         • Inequality of contribution of UGC is real and should be dealt with in all its variations.
    21. Further Work
         • Modeling Production of UGC
         • Integrate UGC inside the Informetrics / Scientometrics / Webometrics framework
         • Expand the data collection and analysis
             • Measure growth (size and contributors)
             • Measure production rate
             • Use at least 3 examples for each type of UGC
    22. Xie xie (thank you). Questions?
        Xavier Ochoa – [email_address]
        Erik Duval – [email_address]
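
The slides name concentration analysis, the Pareto (80/20) rule, and the point at which 50% of content is generated, but the transcript does not show how these were computed. The following is a minimal sketch of one way such measures could be calculated from a list of per-user publication counts. The function names and the sample counts are hypothetical, made up for illustration; they are not the authors' code or data.

```python
from typing import Sequence


def top_share(counts: Sequence[int], fraction: float = 0.2) -> float:
    """Share of all items published by the most productive `fraction` of users."""
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * fraction))  # number of users in the "head"
    return sum(ranked[:k]) / sum(ranked)


def users_for_share(counts: Sequence[int], share: float = 0.5) -> int:
    """Smallest number of top users whose combined output reaches `share` of the total."""
    ranked = sorted(counts, reverse=True)
    target = share * sum(ranked)
    running = 0
    for n, c in enumerate(ranked, start=1):
        running += c
        if running >= target:
            return n
    return len(ranked)


def gini(counts: Sequence[int]) -> float:
    """Gini coefficient: 0 means equal contribution, values near 1 mean high concentration."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical per-user publication counts (illustrative only, not measured data).
sample = [50, 20, 10, 5, 5, 3, 3, 2, 1, 1]

print(top_share(sample, 0.2))        # share produced by the top 20% of users
print(users_for_share(sample, 0.5))  # users needed to reach 50% of the content
print(gini(sample))                  # overall inequality of contribution
```

On this made-up sample, the top two of ten users account for 70% of the items and a single user reaches the 50% mark, which is the kind of head-versus-tail split the slides describe. Real sites would need their own measurements, as the deck itself stresses.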