Power Laws Popularity And Interestingness

Slides of Mathias Lux from the Barcamp Klagenfurt 2008

Published in: Technology, Business

  • Power Laws Popularity And Interestingness

    1. ITEC, Klagenfurt University, Austria
    2. What is a Hype and Where Can I Get One?
       Mathias Lux [email_address]
       Department for Information Technology, Klagenfurt University, Austria
    3. What is this about …
       • Power Laws & Pareto Distributions
       • Just a Theory?
       • Conclusions
       Image credit: betta_design, http://www.flickr.com/photos/betta_design/2200198472/
    4. The Long Tail
       • Common for certain distributions
           • Zipf's Law
           • Power Law
           • Pareto Distribution
       • In the Web 2.0 context
           • Chris Anderson …
       Image credit: maitland 82, http://www.flickr.com/photos/maitland82/346065497/
    5. Zipf's Law
       • Few events occur often, many occur rarely
           • P_n ~ 1/n^a, where P_n is the frequency of the n-th ranked item and a is close to 1
       • Prominent examples
           • Ranking of words in documents
           • Ranking of cities by size
           • Ranking of movies by cinema tickets sold
           • … and many more
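The relation P_n ~ 1/n^a on this slide can be turned into a proper probability distribution once the number of ranked items is fixed. The following is a minimal sketch under that assumption; the function name `zipf_probabilities` and the catalogue size of 1000 items are illustrative and not taken from the slides.

```python
def zipf_probabilities(n_items, a=1.0):
    """Return P_n proportional to 1/n**a for ranks 1..n_items, normalized to sum to 1."""
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    weights = [1.0 / (n ** a) for n in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical catalogue of 1000 ranked items with exponent a = 1.
probs = zipf_probabilities(1000, a=1.0)

# Share of all events that falls on the ten top-ranked items.
top_10_share = sum(probs[:10])
print(f"Top 10 of 1000 items account for {top_10_share:.1%} of all events")
```

With a = 1 the item at rank 1 is exactly twice as frequent as the item at rank 2, which is the "few events occur often" half of the statement; the remaining 990 items share the rest.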
    6. Zipf's Law
       • Plot of the word frequency in Wikipedia
           • Most popular: the, of, and
       • From http://en.wikipedia.org/wiki/Zipf's_law
    7. Pareto Distribution
       • 80:20 Rule
       • Economics
       • Continuous (Zipf is discrete)
       • Practical issues
           • Time Management, …
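The 80:20 rule can be checked on any list of values by measuring how much of the total is held by the largest 20 percent. The sketch below does that, and also evaluates the analytic top share of a Pareto distribution, which is fraction^(1 - 1/alpha) for a shape parameter alpha > 1. The function names and the toy data are assumptions made for illustration.

```python
import math


def top_share(values, fraction=0.2):
    """Share of the total held by the largest `fraction` of the values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values, reverse=True)
    k = max(1, round(fraction * len(ordered)))
    return sum(ordered[:k]) / sum(ordered)


def pareto_top_share(alpha, fraction=0.2):
    """Analytic top share for a Pareto distribution with shape alpha > 1."""
    if alpha <= 1:
        raise ValueError("the mean is infinite for alpha <= 1")
    return fraction ** (1.0 - 1.0 / alpha)


# Shape parameter for which the top 20% hold exactly 80% (about 1.16).
alpha_80_20 = math.log(5) / math.log(4)

# Toy data: one large value among five, so the top 20% is a single item.
toy_share = top_share([1, 1, 1, 1, 16])
print(round(alpha_80_20, 3), pareto_top_share(alpha_80_20), toy_share)
```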
    8. Power Law
       • Made famous by Albert-László Barabási
           • Scale-free networks (web, power supply, …)
           • In-degree of web sites, etc.
       • Actually defines a class of distributions
           • f(x)=a*x^b + e
       • Pareto and Zipf belong to this class
    9. How to detect a power law?
       • Simple empirical tests
           • Draw points on a log-log plot
           • Is it a "straight line"?
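The "straight line" check can be made slightly less subjective by fitting a least-squares line to the points in log-log space and reading off the slope and the goodness of fit. This is only a sketch of the informal test named on the slide, not a rigorous method; the function name `loglog_fit` and the synthetic rank-frequency data are assumptions.

```python
import math


def loglog_fit(xs, ys):
    """Least-squares line through (log x, log y); returns (slope, intercept, r_squared)."""
    pts = [(math.log(x), math.log(y)) for x, y in zip(xs, ys) if x > 0 and y > 0]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points with positive coordinates")
    mean_x = sum(p[0] for p in pts) / n
    mean_y = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in pts)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in pts)
    syy = sum((p[1] - mean_y) ** 2 for p in pts)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic data that follows f(x) = 1000 * x^-1 exactly.
ranks = list(range(1, 101))
freqs = [1000.0 / r for r in ranks]
slope, intercept, r2 = loglog_fit(ranks, freqs)
print(f"slope={slope:.3f}, a={math.exp(intercept):.1f}, R^2={r2:.4f}")
```

For f(x) = a*x^b the slope estimates b and exp(intercept) estimates a. A high R^2 on a log-log plot is a necessary hint, not proof: other heavy-tailed distributions can also look nearly straight over a limited range.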
    10. How to detect a power law?
       • Statistical Means
           • E.g. KS-Test, Chi-Square Test
           • Open research issue …
               • See e.g. Clauset, A., Shalizi, C.R., Newman, M.E.J.: Power-law distributions in empirical data. arXiv:0706.1062v1 (2007)
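As a rough illustration of the statistical route, the sketch below estimates the exponent of a continuous power law by maximum likelihood, alpha = 1 + n / sum(ln(x_i / x_min)), and then measures the Kolmogorov-Smirnov distance between the data and the fitted model. It assumes that x_min is already known and it does not compute a p-value, so it covers only part of the procedure discussed in the cited paper. All function names and the synthetic sample are assumptions.

```python
import math
import random


def fit_alpha(samples, x_min):
    """Maximum-likelihood exponent of a continuous power law for x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0:
        raise ValueError("need samples strictly above x_min")
    return 1.0 + len(tail) / log_sum


def ks_distance(samples, x_min, alpha):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    tail = sorted(x for x in samples if x >= x_min)
    n = len(tail)
    if n == 0:
        raise ValueError("no samples at or above x_min")
    distance = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / x_min) ** (1.0 - alpha)
        # Compare against the empirical CDF just before and just after x.
        distance = max(distance, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return distance


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values by inverse-transform sampling (synthetic test data)."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


rng = random.Random(42)
data = sample_power_law(5000, alpha=2.5, x_min=1.0, rng=rng)
alpha_hat = fit_alpha(data, x_min=1.0)
d = ks_distance(data, x_min=1.0, alpha=alpha_hat)
print(f"estimated alpha={alpha_hat:.3f}, KS distance={d:.4f}")
```

A small KS distance says that the fitted curve is close to the data; deciding whether it is close enough still requires a significance test, which is the open issue the slide points to.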
    11. A note on plots …
       Image taken from phun.org, thanks to Enzo Nadrag
    12. A note on statistical means …
       • http://www.phun.org/newspics/funny_friday/2538.jpg, thanks to Enzo Nadrag
    13. Zipf, Pareto & Power Law: Conclusions
       • They emerge when people are involved
       • They have interesting characteristics
           • The mean carries virtually no information
           • Area under the curve (cf. Amazon's long tail strategy)
       • Power laws emerge somehow …
           • Multiple generative models (preferential attachment, memory kernels, etc.)
           • No one knows for sure
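One of the generative models named on this slide, preferential attachment, can be imitated with a very small simulation: each new view either goes to a brand-new item or to an existing item chosen with probability proportional to the views it already has. The sketch below is a simplified rich-get-richer process written for illustration; the parameter `p_new`, the step count and the function name are assumptions. It also shows why the mean says little about a typical item.

```python
import random
import statistics


def simulate_preferential_attachment(steps, p_new=0.1, seed=0):
    """Return view counts per item after `steps` rich-get-richer updates."""
    rng = random.Random(seed)
    tokens = [0]  # one entry per view; item 0 starts with a single view
    n_items = 1
    for _ in range(steps):
        if rng.random() < p_new:
            tokens.append(n_items)  # a new item enters with its first view
            n_items += 1
        else:
            # A uniformly chosen past view selects its item, so items are
            # picked proportionally to their current number of views.
            tokens.append(rng.choice(tokens))
    counts = [0] * n_items
    for item in tokens:
        counts[item] += 1
    return counts


counts = simulate_preferential_attachment(steps=20000, p_new=0.1, seed=1)
mean_views = statistics.mean(counts)
median_views = statistics.median(counts)
print(f"items={len(counts)}, mean={mean_views:.1f}, "
      f"median={median_views}, max={max(counts)}")
```

In runs of this kind a handful of early items collect most of the views while the typical item has very few, so the mean sits far above the median and describes neither group well.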
    14. Is this just theory?
       • Basically: YES!
       • But there are related practical questions
           • Are you using Flickr?
               • How many “interesting” photos did you publish?
               • How many views do your photos have?
           • Imagine you publish a video on YouTube
               • What are the chances that your video is a big hit?
               • How to “help out” the process of getting a big hit?
               • Can one distinguish between a hit and a flop?
    15. Is this just theory? (2)
       • More related practical questions
           • Do you have a website?
               • How to “flatten out” resource popularity?
               • How to select popular resources (e.g. for caching, adaptation, preprocessing)?
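When popularity is heavily skewed, selecting resources for caching can be as simple as counting requests and keeping the most requested ones until a target share of all traffic is covered. The following is a minimal sketch of that idea; the access log, the 80 percent target and the function name `select_for_cache` are hypothetical.

```python
from collections import Counter


def select_for_cache(access_log, coverage=0.8):
    """Smallest set of most-requested resources that covers `coverage` of all requests."""
    counts = Counter(access_log)
    total = sum(counts.values())
    selected = []
    covered = 0
    for resource, hits in counts.most_common():
        if total == 0 or covered / total >= coverage:
            break
        selected.append(resource)
        covered += hits
    return selected


# Hypothetical log: one resource receives 80 of 100 requests.
log = ["a"] * 80 + ["b"] * 10 + ["c"] * 5 + ["d"] * 5
print(select_for_cache(log, coverage=0.8))
print(select_for_cache(log, coverage=0.9))
```

Under a power-law-like popularity distribution the selected set tends to stay small even for high coverage targets, which is what makes caching the head of the distribution attractive.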
    16. Big hits on YouTube
       © 2007 by Aigner Thomas and Oraze Manuel
    17. Getting popular …
       • Starting with the first view (user)
       • Some other users find the same resource
       • They point others to it
           • Blogging, Digging, word of mouth
           • Multiplier of information (cf. Metcalfe’s law)
       • Number of views (users) “explodes”
    18. Some graphs …
       • Data from del.icio.us
       • Shows
           • bookmarks / day
           • relative user count
    19. Observations
       • There is an initial bend in the curve
       • The mean number of users at the bend is rather small
           • Around 50
       • There are outliers
           • Google Video was doomed to be a success
    20. Conclusions
       • If there is a bend …
           • Chances are better for a big hit.
       • Time is still an issue
           • Slow start, long vs. short hype, etc.
       • Resources without this bend:
           • Better chances that they are shelf warmers
           • Decision support for portfolio adaptation
    21. The Flickr way
       • Flickr defined “Interestingness”
           • Patented
           • Combining views, comments, age, etc.
       • Interesting photos are presented
           • Users see new photos
           • Not all photos (2,000 new per minute, checked Feb. 1, 2008, ~11:00 UTC)
           • They have no “big hit”
       Kudos given to Horst Gutmann and Marian Kogler
    22. The YouTube way
       • Smaller resource database than Flickr
           • Around 45 videos a minute (65,000 a day)
       • But a lot more views (data from Feb. 1, 2008)
           • 73,245,607 for "Evolution of Dance"
           • The 20 most viewed have > 30M views
       • No obvious counter strategy
           • Might not (yet) be necessary
    23. Digg
       • Assumption: Diggs also follow a power law
           • Quite reasonable …
       • How to avoid the Digg effect?
           • Digg has a mirror …
    24. Thanks ...
       • ... for your attention
       • You are interested?
       • Then talk to me …
       Image credit: Gexydaf, http://www.flickr.com/photos/gexydaf/2208215419/
    25. Mathias Lux
       • Affiliation
           • Klagenfurt University, ITEC
       • Contact
           • mathias @ juggle.at
           • http://www.semanticmetadata.net
