Quantitative Analysis of User-Generated Content on the Web


Web Science Workshop at World Wide Web Conference 2008
This presentation reports the results of measuring user contribution to nine UGC websites: Furl, Digg, Slideshare, FanFiction, Scribd, Revver, Merlot, Amazon Reviews and LibraryThing.

Published in: Technology, Business

  1. Quantitative Analysis of User-Generated Content on the Web
     Xavier Ochoa, ESPOL, Ecuador
     Erik Duval, KULeuven, Belgium
  2. Topics
     • Why?
     • Studies
     • Findings
     • Implications of the Findings
     • Conclusions
     • Further Work
  3. Why?
     • UGC economy:
         • Supply: Users publishing their content
         • Demand: Users viewing content from others
         • Currency: Attention
  4. Why?
     • Demand (Popularity) is relatively well understood:
       How a ‘hit’ is born (S Sinha, RK Pan, 2006)
     • But Supply (Publication) is not...
  5. Studies
  6. Studies
     • Descriptive Statistics
     • Distribution Fitting
     • Concentration Analysis
  7. Findings
     • Distribution of supply is not Normal
  8. Findings
     • Distribution of supply has a heavy tail
  9. Findings
     • Lotka (“fat-tail”)
     • Weibull (“fat-belly”)
  10. Implications of the Findings
     • There is no such thing as an “average user”
  11. Low Class, Middle Class, High Class
  12. Implications of the Findings
     • The production of different UGC types is similar, but not the same.
  13. Implications of the Findings
     • The Pareto Rule (80/20) applies to UGC (but it is no substitute for measuring)
  14. Implications of the Findings
     • “Fat-tail” UGC production is similar to professional production.
  15. Implications of Findings
     • The distribution is not affected by site size or production effort
  16. Implications of the Findings
     • Make your bet, head or tail?
  17. 50% of Content is generated here
  18. 50% of Content is generated here
  19. Implications of the Findings
     • Informetrics can help us to understand UGC production (and vice versa)
  20. Conclusions
     • Measuring is our only way to test our hypotheses about how the Web works
     • If you administer a UGC-based site, measure production to gain insight into the other side of your economy
     • Inequality of contribution in UGC is real and should be dealt with in all its variations.
  21. Further Work
     • Modeling Production of UGC
     • Integrate UGC inside the Informetrics / Scientometrics / Webometrics framework
     • Expand the data collection and analysis
         • Measure growth (size and contributors)
         • Measure production rate
         • Use at least 3 examples for each type of UGC
  22. Xie xie, questions?
     Xavier Ochoa – [email_address]
     Erik Duval – [email_address]
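
The concentration analysis behind slides 13, 17 and 18 (the 80/20 rule and "50% of Content is generated here") can be illustrated with a minimal sketch. This is not the authors' code: the function names `top_share` and `users_needed` and the sample counts are assumptions made up for illustration, and the real per-site data is not reproduced here.

```python
def top_share(counts, fraction=0.2):
    """Share of all items published by the most productive `fraction` of users.

    Assumes `counts` is a non-empty list of per-user publication counts.
    """
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * fraction))  # number of users in the "head"
    return sum(ranked[:k]) / sum(ranked)


def users_needed(counts, share=0.5):
    """Smallest fraction of users (most productive first) whose combined
    output reaches `share` of all items."""
    ranked = sorted(counts, reverse=True)
    target = share * sum(ranked)
    running = 0
    for i, c in enumerate(ranked, start=1):
        running += c
        if running >= target:
            return i / len(ranked)
    return 1.0


# Hypothetical per-user publication counts, invented for this example.
counts = [50, 20, 10, 5, 5, 3, 3, 2, 1, 1]

print(top_share(counts, 0.2))     # share of content from the top 20% of users
print(users_needed(counts, 0.5))  # fraction of users producing half the content
```

On this invented sample the top 20% of users account for 70% of the items, and a single user out of ten produces half of them. Real sites would have to be measured to see how far they depart from 80/20, which is the point slide 13 makes.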