TECHNICAL REPORT
                          YL-2010-008




EDISCOPE: SOCIAL ANALYTICS FOR ONLINE NEWS

                    Yury Lifshits
                Santa Clara, CA 95054
            {lifshits@yahoo-inc.com}

                      December 20, 2010




         Bangalore • Barcelona • Haifa • Montreal • New York
                      Santiago • Silicon Valley
Yahoo! Labs Technical Report No. YL-2010-008

ABSTRACT: We present Ediscope, a system for measuring social engagement around online
news articles. Ediscope collects signals from Twitter, Facebook and Bit.ly. Using our link spotter
and social crawler we address a number of questions. What is the lifespan of a typical news story?
What are the typical engagement numbers per pageview? Can social signals be used for pageview
estimates? How much improvement can social optimization bring to a news source? Our first
results indicate that less than 20% of an article’s activity happens after its first 24 hours. On average,
a story has 5-20 social actions per 1000 pageviews. For most feeds, the top 7 stories of a week capture
65% of Facebook actions and 25% of retweets. The correlation between pageviews and social signals
is surprisingly low. Our measurements indicate a double-digit improvement potential for social
optimizations.
1. Introduction
    Online news is on its way to becoming our primary source of information. In order to win the
competition and delight users, the editors of online news have to constantly optimize their content
strategy. Content strategy is a new applied discipline that addresses the following questions:
What should we write about? How many articles per day? How to allocate coverage shares between
main topics? How to discover breaking stories? Which stories to promote within a website?
What is the most effective navigation structure for our content? Next to content strategy, there is
the emerging field of social media optimization (SMO): How to maximize engagement? How to
maximize secondary traffic from social sources (Facebook, Twitter)? How to grow the number of
followers, subscribers and fans?
    To solve the problems of content strategy and social media optimization one needs both art
and science. As web news is inherently more measurable than print news, the role of science
is increasing. Until recently, most solutions were based on click-through rates, time spent, eye
tracking and pageviews. This information is typically available only for website owners. Therefore,
it was hard to create generic measurement and optimization solutions. Fortunately, in the last couple
of years, social signals emerged as a universal and public feedback mechanism. In this paper, we
present a study based on Facebook likes, links in Twitter, and clicks on Bit.ly links. The availability
of social signals for content strategy problems created the new research direction of social media
analytics [1].

Questions we address in this study: For how long does an average article receive user attention?
Can we guess pageview counts from social signals? Can social signals be used to promote the
best stories? Should editors focus on producing better content or on producing more content? How
much improvement can it bring?

Contribution. Our first contribution is the data engineering infrastructure we built for the project.
The Ediscope system has modules for link discovery, signal monitoring, statistical analysis and
visualization. Ediscope data and a lookup tool are available at http://ediscope.labs.yahoo.net.
Currently, the Ediscope toolkit is available on request, as it is subject to third-party API rate limits. Feel
free to contact Yury at lifshits@yahoo-inc.com to use Ediscope for your project or to order a custom
report on your favorite news source.
    Our most surprising finding is the low correlation between social signals and the actual pageview
counts. The gap is especially large for non-top news, where the Pearson coefficient is around 0.5.
To understand the role of these low correlations we introduce a simple user experience model. Under
this model we demonstrate a potential for double-digit improvement at Gawker, Business Insider,
Change.org and Forbes blogs.
    On average we see around 10 Facebook/Twitter actions per 1000 pageviews. Correlation between
social activities is higher for the top news than in the average case. Mainstream sources have much
more Facebook activity than mentions on Twitter. Tech media show the opposite pattern. Facebook
actions are much more skewed toward top news. Finally, Twitter signals have slightly better correlation
with pageview counts.
Our results show that almost universally across news sources less than 20% of activity happens
after the first 24 hours. Feeds and frontpages drive attention to the latest content units. Search brings
traffic to “evergreen” content like Wikipedia. But there is no driver for materials with mid-range
(few weeks – few months) lifespan. Perhaps, we need a new promotion mechanism for this type of
content.

Remark on focus. When scientists work with real world data there are two mindsets. One can
focus on hard/intelligent tasks like model fitting and parameter predictions. This approach makes it
easy to judge the project by comparing accuracy of results to the previous work. The other method is
to measure the raw signals and turn them into actionable insights for domain experts. In this case, the
findings can be judged by novelty of measurements and importance of resulting recommendations.
This study follows the second approach. Here are our takeaway lessons for editors and product
managers of online news:

Create new promotion mechanisms for in-depth content. At the moment there is no middle ground
     between breaking news and reference content. Perhaps we need a dedicated feed, section and
     frontpage module that highlight articles of mid-range lifespan.
Use social signals for content optimization. There is a serious gap between which content units are
     most liked and which content units receive the most pageviews. In other words, user experience
     can be improved by using Facebook likes and retweet counts to promote the most popular
     content.
Check your engagement scores. If you see fewer than 10-20 social actions per 1000 pageviews,
     your sharing functionality can be improved. Typically, it is as simple as making the buttons
     the right size, putting them in the right place and minimizing the number of clicks needed to
     share your content.
Check your head/tail structure. If you have a heavy head, improvements in quality and promotion
     mechanisms should be your priorities. If you have a heavy tail, your best opportunity is in
     expanding content production. According to our measurements, one has a heavier-than-typical
     head if over 75% of weekly Facebook actions or over 45% of weekly retweets are concentrated
     in the top 7 articles.

1.1. Related work
    Social signals (Facebook likes, retweet counters, Bit.ly click counters) are relatively new
phenomena. In particular, the Facebook Like button was introduced in May 2010, just 6 months prior to
this paper. Until now, social analytics research was centered around text-based signals [6, 7, 8]. To
our knowledge, we present the first temporal study of Facebook like counts.
    Before social signals, researchers were looking into comment counts, Digg counts and Youtube
viewcounts. Tsagkias, Weerkamp and de Rijke developed several algorithms to predict the total
volume of comments shortly after publication [12, 13]. Paul Ogilvie measured and modeled total
comment counts across various RSS feeds as a part of the FeedHub project [9]. Cha, Kwak, Rodriguez,
Ahn, and Moon performed a long tail analysis of Youtube and Daum videos [3]. Avramova,
Wittevrongel, Bruneel and De Vleeschauwer developed a classifier that distinguishes videos with
exponential and power law popularity decays [2]. Salman and Rangwala showed how to predict the total
Digg count shortly after publication [10]. Spiliotopoulos studied correlations between Digg counts
and comment counts for the most popular stories [11].
    The key advantage of social signals compared to comment/Digg/Youtube counts is their
universality. Only now can one develop optimization/prediction/recommendation systems that will be
applicable to any news source on the Web.

2. Overview of Ediscope System
2.1. Architecture of Ediscope.
     For our study we implemented a new social analytics system called Ediscope. It has four primary
components. The link spotting tool takes RSS feeds as input and checks them regularly to spot
new links. In many cases, RSS feeds present proxy links in order to measure clicks from RSS
readers. In particular, Feedburner and Pheedo do that. In these cases we convert proxy links to
the original ones. The second component is the signal crawler. It takes news URLs and calls public
APIs (Facebook, Bit.ly, TweetMeme) to retrieve the current numbers for a given story. We also
implemented custom scraping for pageview counts. The third component is the monitor, which
re-crawls active links in our database regularly (by default, every hour). Ediscope’s monitor
computes the deltas to the previous crawl to measure activity over the last interval. Monitoring
functionality is used for temporal analysis of social engagement. Finally, we call the Google Chart API
for dynamic visualization of results on Ediscope’s website.
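
The delta computation performed by the monitor can be sketched as follows. This is a minimal illustration and not Ediscope's actual code: the function name `compute_deltas`, the dictionary layout of the counters, and the choice to clamp negative differences to zero (a counter can shrink when users undo an action) are all assumptions made for the example.

```python
from typing import Dict, Optional

# Cumulative counters for one URL, e.g. {"facebook": 120, "twitter": 30, "bitly": 75}.
Counts = Dict[str, int]


def compute_deltas(previous: Optional[Counts], current: Counts) -> Counts:
    """Return per-signal activity since the previous crawl (hypothetical helper)."""
    if previous is None:
        # First crawl of a link: there is no baseline yet, so no activity
        # can be attributed to the last interval.
        return {signal: 0 for signal in current}
    # Assumption: clamp at zero so a shrinking counter is not reported
    # as negative activity.
    return {
        signal: max(0, value - previous.get(signal, 0))
        for signal, value in current.items()
    }


if __name__ == "__main__":
    last_hour = {"facebook": 100, "twitter": 20, "bitly": 60}
    this_hour = {"facebook": 130, "twitter": 25, "bitly": 58}
    print(compute_deltas(last_hour, this_hour))
```

Storing one such delta record per link per crawl is enough to rebuild an hourly activity profile later.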
     In its current form, Ediscope has certain limitations. First of all, the APIs we use have strict rate
limits. In particular, TweetMeme only allows 250 requests per 60-minute period. This forced
us to focus on smaller datasets. Secondly, the same news article can be represented by several
URLs. Sometimes Facebook, Bit.ly or Twitter fail to recognize these links as the same object. As
a result, APIs return lower engagement numbers, missing likes, clicks and retweets on non-canonical
versions of an article. For example, the Wall Street Journal has different URLs for a story when you visit it
directly vs. when you visit it from the frontpage. Next, many top websites do not have RSS feeds,
or their feeds do not work properly. For example, Yahoo’s Today module, the central piece of its
frontpage, does not have a feed. In these cases, one has to use manual lookups or scraping. Finally,
Ediscope uses a pull mechanism to discover new stories. By the time we add an article to our
system, around 15% of its social activity has already happened. In the future, push mechanisms
such as PubSubHubbub can be used to address this issue.
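
To see how the rate limit constrains dataset size, the following sketch computes how many links can be re-crawled on a fixed schedule without exceeding a quota. It assumes one API request per link per crawl, which the report does not state; the function name is hypothetical.

```python
def max_tracked_links(requests_per_window: int,
                      window_minutes: int,
                      recrawl_minutes: int) -> int:
    """Largest number of links that fits in the quota (hypothetical helper).

    Assumption: every re-crawl of a link costs exactly one API request.
    """
    if recrawl_minutes <= 0 or window_minutes <= 0:
        raise ValueError("intervals must be positive")
    # Number of times each link is crawled within one quota window.
    crawls_per_window = window_minutes / recrawl_minutes
    return int(requests_per_window // crawls_per_window)


if __name__ == "__main__":
    # TweetMeme quota from the text: 250 requests per 60 minutes,
    # combined with the default hourly re-crawl.
    print(max_tracked_links(250, 60, 60))
    # The same quota with a 10-minute re-crawl interval.
    print(max_tracked_links(250, 60, 10))
```

Under these assumptions the hourly schedule leaves room for at most 250 active links per quota window, which is consistent with the small datasets used in this study.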
     There are several commercial systems in the space of social analytics. Postrank is a proprietary
article ranking algorithm that takes social signals into account. BackType is a lookup system that
retrieves the current values of social metrics. Unlike Ediscope, it does not have the fully accessible
temporal profiles or pageview extractor modules. Klout is using social signals to rate news sources
and Twitter personalities.
2.2. Datasets.
    We created three datasets for our study: a temporal set, a pageview set and a head/tail set. For temporal
analysis we selected 10 RSS feeds from major US news sources. We used our link spotting module
to discover 20 articles per source. The link spotter checked RSS feeds every 10 minutes in order
to discover articles almost immediately after publishing. Then, we used our monitoring tool to
update social counts every hour and compute the corresponding delta values. As a result, we
obtained temporal social profiles for 20 articles at each of 10 sources. For pageview analysis we considered four
major content networks that explicitly show view counts on their articles: Business Insider, Gawker,
Forbes Blogs and Change.org. For every network, we picked three RSS feeds, launched our link
spotting module and kept it live until we spotted around 50-75 articles per network. Then we waited
for several days until the total social counts were close to their final values. Then, we used our crawler
to measure social counts and pageview counts for every article in our dataset. For head/tail analysis
we looked at RSS feeds of several major news sources. For every publisher, we used the link spotter to
get all articles from a one-week period (around 200 articles per feed). Then we crawled them once
to collect social counts.

3. Empirical Study
3.1. Article Lifespan
     In our temporal study we track 20 articles from each of the following sources: Washington Post,
Gizmodo, CNN, MSNBC, HuffingtonPost, Yahoo News, New York Times, Engadget, Mashable,
and TechCrunch. On average, every story has 901 Facebook actions (likes, shares and Facebook
comments), 221 retweets and 660 clicks on Bitly-shortened links. The following table shows
the percentage of activity in the first, second, third, fourth and fifth 24-hour interval after
publication. Note that the total share of activity is significantly less than 100%. This is due to
activity in the interval between the time a story was published and the time Ediscope discovered
it. Of all sources, Engadget articles have the slowest decay of activity and Yahoo News has the
sharpest decay.

Share of total activity (%) in each 24-hour interval

Average over all sources    Day 1    Day 2    Day 3    Day 4    Day 5
  Facebook                  73.94    11.57    2.83     1.29     0.48
  Twitter                   70.71    5.11     1.72     0.69     0.37
  Bitly                     73.27    8.07     2.49     1.06     1.01
Engadget
  Facebook                  56.13    24.40    9.03     4.35     1.99
  Twitter                   71.27    9.24     4.12     1.28     0.71
  Bitly                     76.53    10.02    4.12     1.54     0.86
Yahoo News
  Facebook                  85.49    6.69     1.01     0.38     0.10
  Twitter                   84.80    4.21     0.33     0.13     0.00
  Bitly                     33.88    2.08     0.40     0.21     0.05
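
Percentages like those in the table above can be derived from the monitor's hourly deltas. The sketch below is an illustration under assumptions: the function name and the input layout (a list of hourly deltas starting at the first crawl, plus the final cumulative count) are hypothetical, and the numbers in the usage example are made up.

```python
from typing import List


def daily_shares(hourly_deltas: List[int], final_total: int, days: int = 5) -> List[float]:
    """Percent of final_total observed in each consecutive 24-hour window.

    hourly_deltas[i] is the activity measured in the i-th hour of tracking.
    final_total is the cumulative count at the end, which also includes
    activity that happened before the link was discovered.
    """
    if final_total <= 0:
        raise ValueError("final_total must be positive")
    shares = []
    for day in range(days):
        window = hourly_deltas[day * 24:(day + 1) * 24]
        shares.append(round(100.0 * sum(window) / final_total, 2))
    return shares


if __name__ == "__main__":
    # Made-up profile: 10 actions per hour on day one, 2 per hour on day two,
    # nothing afterwards, and 12 actions that predate discovery.
    profile = [10] * 24 + [2] * 24 + [0] * 72
    print(daily_shares(profile, final_total=300))
```

Because the denominator is the final cumulative count, the shares sum to less than 100% whenever some activity happened before the first crawl, which matches the remark about the table.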




    Figure 1: Average activity of an Engadget article during the first 68 hours of tracking.
    Deep blue represents Facebook, light blue represents Twitter, yellow represents Bit.ly.

    Figure 2: Social activity of the Engadget article “BlackBerry users running out of
    loyalty”

Here are our main observations:

   • The majority (typically, over 80%) of social activity happens during the first 24 hours.
   • Monotonicity. The majority of activity curves are monotone, or monotone after daytime
     correction (bump-next-morning effect).
   • Twitter is geeky. While mainstream sources like NYT, Yahoo, CNN, MSNBC and Washington
     Post have up to 10 Facebook actions per retweet, TechCrunch and Mashable have more
     retweets than Facebook signals. The Facebook advantage over Twitter in mainstream news
     indicates that it can be a more reliable signal for content optimization solutions.
   • Non-original content has lower activity. HuffingtonPost has two patterns: one for original
     posts, another for aggregated content. Five links from the TechCrunch feed are re-posts from
     CrunchGear and TechCrunch.EU and have much lower counts than TC-proper articles.
   • User experience flaws. The sharing functionality can have a serious effect on the total amount of
     activity. In particular, at the New York Times, Twitter buttons do not directly tweet the story, but
     instead ask the reader to use Twitter for logging into NYT.

   The fact that most activity happens during the first day has serious implications for editors
and product managers of online news. As our study shows, the currently used mechanisms for
promotion (feeds, frontpage promotions, cross-linking) are only capable of driving the first-day
audience. In such an environment, weekly/analytic/evergreen content is highly discouraged and
unsustainable. Thus, if a publisher wants to produce longer-lifespan articles, it should depart
from existing content promotion strategies. On the positive side, we feel that the opportunity for
high-quality weekly/monthly analytic content is wide open in almost every vertical.

3.2. Per-pageview Statistics
    Several online content networks display actual pageview counts. This allows us to compute
average amounts of social activity per 1000 pageviews. In some cases the top few stories have a
different activity pattern than the rest of the site. To get more robust results we compute averages
both for the full sets of articles and for the sets excluding the top 10 articles.

     Network             Facebook     Twitter   Bit.ly   FB (non-top)    TW (non-top)      BT (non-top)
     Gawker              24.59        4.66      13.36    11.55           4.74              2.65
     Forbes blogs        4.61         9.16      41.41    5.13            11.86             29.00
     Business Insider    3.08         6.40      34.37    3.90            28.99             106.47
     Change.org          4.43         2.74      3.54     8.69            4.12              6.25

   Then we look at the Pearson correlation coefficient between social signals and the actual pageview
counts. We also compute correlations between Facebook and Twitter signals and between Bit.ly and
Twitter signals.

     PV = pageviews, FB = Facebook, TW = Twitter, BT = Bit.ly

     Network                    FB / PV    TW / PV     BT / PV     FB / TW     BT / TW
     All articles
     Gawker                     0.92       0.95        0.93        0.95        0.95
     Forbes blogs               0.35       0.40        0.63        0.34        0.63
     Business Insider           0.93       0.54        0.65        0.65        0.87
     Change.org                 -0.01      0.45        0.05        0.34        0.65
     Excluding top 10 news
     Gawker                     0.47       0.63        0.41        0.47        0.35
     Forbes blogs               0.12       0.34        0.55        0.31        0.56
     Business Insider           0.34       0.43        0.53        0.50        0.80
     Change.org                 0.67       0.50        -0.09       0.47        0.75
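
For reference, the Pearson coefficient used in the table above can be computed as in the following sketch. It is a plain textbook implementation, not the code used in the study, and the sample counts in the usage example are made up.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and centered squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up pageview and retweet counts for five articles.
    pageviews = [12000, 8000, 5000, 3000, 1000]
    retweets = [90, 40, 70, 10, 20]
    print(round(pearson(pageviews, retweets), 2))
```

Applying the same function to the article list with the top 10 items removed would give the second half of the table; how the top 10 are chosen (for example, by pageviews) is not specified in the text.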

    To get a visual sense of the correlations we present plots for Gawker and Change.org. Absolute
values are scaled to fit in the same space. The top-right point in the Gawker plot is in fact far outside of
the chart (Gawker has one outstandingly popular story).




            Figure 3: Correlation between retweets and pageviews at Gawker network

    Let us make some observations from the above tables:

   • On average articles have around 10 Facebook/Twitter actions per 1000 pageviews.
   • With the exception of Facebook signals at Gawker, the top news have fewer Facebook and
     Twitter actions per pageview than the average stories.
   • For the non-top news, correlation between social signals and pageviews is around 0.5. Recall
     that the Pearson coefficient ranges from -1 (perfectly negatively correlated) through 0 (no
     linear correlation) to 1 (perfectly positively correlated). Thus, a value of 0.5 means that social
     signals are as close to perfect correlation as they are to no correlation at all.
   • In 6 cases out of 8, retweets have higher correlation to pageviews than Facebook actions.
   • Change.org shows negative correlations in some cases. An article is more likely to get
     Facebook activity if it has fewer pageviews. It turns out that the “Social Entrepreneurship” section has
     many more pageviews but the same (or even slightly lower) Facebook counts. Once we
     remove articles from this section, the correlation returns to a positive value.
   • As expected, Bit.ly clicks are better correlated with retweets than Facebook signals are.

          Figure 4: Correlation between pageviews and Facebook (dark blue), Twitter
          (light blue) and Bit.ly (yellow) signals at Change.org. The gap in pageviews
          represents the difference in popularity between different sections of the portal.


    Looking at our per-pageview results, one can try to reconstruct pageview counts for the rest of
the Web. The baseline guess would be around the Facebook count (or Twitter count) times 100. As our
measurements show, it is easier to accurately predict the pageviews for a top story than
for an average article. And based on our lifespan study, we recommend Facebook over
Twitter as the primary signal for mainstream sources.
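
The baseline guess amounts to inverting the per-1000 engagement rate. A minimal sketch, with a hypothetical function name and a default rate of 10 actions per 1000 pageviews taken from the observations above:

```python
def estimate_pageviews(social_count: int, actions_per_1000: float = 10.0) -> float:
    """Rough pageview estimate from one social counter (hypothetical helper).

    With the default rate of 10 actions per 1000 pageviews this reduces to
    the "count times 100" rule of thumb.
    """
    if actions_per_1000 <= 0:
        raise ValueError("actions_per_1000 must be positive")
    return social_count * 1000.0 / actions_per_1000


if __name__ == "__main__":
    # A story with 250 Facebook actions under the default rate.
    print(estimate_pageviews(250))
    # The same story under Gawker's measured Facebook rate (24.59 per 1000).
    print(round(estimate_pageviews(250, actions_per_1000=24.59)))
```

Given the low correlations reported above, such an estimate should be read as an order-of-magnitude guess, and a source-specific rate is preferable when one is known.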
    What lessons can one learn from these measurements? At the moment the role of social traffic
in overall article success seems to be very small. For an average story there is a very low correlation
between social signals and its pageview count. When we include top stories in the picture, social
activity per pageview actually goes down. These observations hint that factors other than
likability and social cascades play the leading role in pageview success. As a result, traffic is
allocated to not-so-likable stories.
    Let us do the following thought experiment. Assume for a moment that the Facebook count or Twitter
count represents the actual reader satisfaction score. Then we can compute the total user satisfaction
score as the sum, over all articles, of the product of an article’s pageviews and its Facebook/Twitter
count. Now, let us reallocate pageview counts so that the top pageview value corresponds to the top
Facebook count, the second top corresponds to the second top, and so on. Then we can calculate the
“optimal” user satisfaction score. In other words, we want to check how much user benefit
promotion-by-likability can bring to existing content networks. Below is the table of our results,
expressed as the ratio of the optimal score to the actual one.

     Network             FB increase    TW increase     FB increase (non-top)     TW increase (non-top)
     Gawker              1.019          1.026           1.330                     1.181
     Forbes blogs        1.566          1.403           1.796                     1.341
     Business Insider    1.047          1.342           1.402                     1.227
     Change.org          2.346          1.245           1.109                     1.110

    As we see, all networks have double-digit potential to improve the user experience for non-top
stories. Forbes blogs and Change.org can significantly increase the overall user experience. Again,
the Change.org results look unusual because the site has two clusters of very different articles. Articles
in one cluster are more likable; articles in the other get more pageviews. So the overall experience can be
improved significantly. Once we remove the top news (one cluster), the rest of the site can only achieve
a 10-11% increase. Of course, our model for the user satisfaction score is an oversimplification, but it
can be used as a first-order approximation of possible improvement based on social signals.
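
The reallocation experiment can be written down in a few lines. The sketch below follows the description in the text (satisfaction as a sum of pageview-times-count products, and an "optimal" allocation that pairs the largest pageview value with the largest social count); the function names and the three-article example are made up for illustration.

```python
from typing import Sequence


def satisfaction(pageviews: Sequence[float], social: Sequence[float]) -> float:
    """Total satisfaction: sum over articles of pageviews times social count."""
    return sum(p * s for p, s in zip(pageviews, social))


def optimal_increase(pageviews: Sequence[float], social: Sequence[float]) -> float:
    """Ratio of the reallocated ("optimal") score to the actual score."""
    if len(pageviews) != len(social):
        raise ValueError("need one pageview value and one social count per article")
    actual = satisfaction(pageviews, social)
    if actual == 0:
        raise ValueError("actual satisfaction score is zero")
    # Pair the largest pageview value with the largest social count,
    # the second largest with the second largest, and so on.
    optimal = satisfaction(sorted(pageviews, reverse=True),
                           sorted(social, reverse=True))
    return optimal / actual


if __name__ == "__main__":
    # Made-up network where the most viewed article is the least liked.
    pageviews = [1000, 500, 100]
    likes = [1, 5, 10]
    print(optimal_increase(pageviews, likes))
```

By the rearrangement inequality, pairing the two sequences in the same sorted order maximizes the sum of products, so the ratio is never below 1; it equals 1 exactly when traffic is already allocated in the order of likability.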

3.3. Head vs. Tail Analysis
     In our final experiment we collect links from RSS feeds at several US news sources over the
course of one week. These feeds have from 64 to 226 items per week. Then, for every source we
retrieve and sort the social counts for discovered articles. We compute the percentage of weekly
social activity that corresponds to the top story, top 7 stories and all stories outside top 7. We use
the constant 7 as a reflection of one-story-per-day strategy.

 Feed                  Articles tracked    Top item: FB / TW     Top 7: FB / TW      The rest: FB / TW
 TechCrunch            182                 32.3 / 4.6            61.5 / 16.8         38.5 / 83.2
 Mashable              162                 23.1 / 2.1            47.1 / 13.2         52.9 / 86.8
 Wired                 120                 9.9 / 4.8             41.4 / 24.9         58.6 / 75.1
 Engadget              200                 44.3 / 18.9           68.7 / 27.5         31.3 / 72.5
 Wall Street Journal   201                 36.6 / 5.8            65.4 / 18.5         34.6 / 81.5
 Vanity Fair           64                  21.8 / 11.4           70.5 / 44.7         29.5 / 55.3
 Yahoo! Upshot         109                 28.8 / 26.0           75.7 / 59.1         24.3 / 40.9
 Yahoo! Top News       226                 20.9 / 9.1            45.6 / 29.6         54.4 / 70.4
 All Things D          139                 66.2 / 17.2           89.2 / 41.5         10.8 / 58.5
 Gizmodo               82                  36.1 / 5.2            70.0 / 21.1         30.0 / 78.9
 Aol News              78                  19.2 / 11.4           85.1 / 44.4         14.9 / 55.6

   One can make several immediate observations:

   • Typically, around 65% of Facebook actions and 25% of retweets happen around the top 7 stories.
   • Facebook activity is much more heavy-headed than retweets.
   • Yahoo! Upshot is the most heavy-headed blog in our study. Only 40% of retweets and 25% of
     Facebook actions happen outside the top 7 articles. Perhaps this is because Upshot has very few
     dedicated readers, and the majority of activity corresponds to a few Yahoo-wide promoted stories.
     AllThingsD is also fairly heavy-headed.
   • Mashable and Wired have the heaviest tails. Both have over 75% of retweets and over
     50% of Facebook actions outside the top 7 stories.
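
The shares in the table above, and the heavier-than-typical-head rule from the introduction (over 75% of weekly Facebook actions or over 45% of weekly retweets in the top 7 articles), can be sketched as follows. Function names and the sample counts are hypothetical.

```python
from typing import Sequence, Tuple


def head_tail_shares(counts: Sequence[int], head_size: int = 7) -> Tuple[float, float, float]:
    """Return (top item %, top head_size %, rest %) of the weekly total."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one social action")
    ranked = sorted(counts, reverse=True)
    top_item = 100.0 * ranked[0] / total
    head = 100.0 * sum(ranked[:head_size]) / total
    return top_item, head, 100.0 - head


def is_heavy_head(facebook_head: float, twitter_head: float) -> bool:
    """Thresholds from the text: over 75% (Facebook) or over 45% (retweets)."""
    return facebook_head > 75.0 or twitter_head > 45.0


if __name__ == "__main__":
    # Made-up weekly Facebook counts for nine articles of one feed.
    weekly = [50, 10, 10, 10, 5, 5, 5, 3, 2]
    print(head_tail_shares(weekly))
    # Top-7 shares from the table: Yahoo! Upshot and Mashable.
    print(is_heavy_head(75.7, 59.1), is_heavy_head(47.1, 13.2))
```

With the table values, Yahoo! Upshot is classified as heavy-headed and Mashable is not, in line with the observations above.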

    Let us offer an interpretation from a content optimization perspective. The heavy head of social
activity means that the total user satisfaction can be improved by improving the quality of the tail
content or by finding better ways to promote it. The heavy tail indicates that the tail content
has its own audience and is well promoted. Thus, the best opportunity for heavy-tail websites lies
in expanding their content production. For more accurate interpretation, one should track individual
consumption patterns. Goel, Broder, Gabrilovich and Pang have recently shown that the purpose of
the tail inventory is not only to capture new users but also to better serve users who like
some-of-the-top and some-of-the-niche [5].

4. Roadmap for Social Analytics
     There are a number of natural next steps for the Ediscope framework. First, we can turn our
measurements into rankings of news sources and individual writers by their engagement scores and
the lifespans of their content. It is also informative to compare the signals for the same story covered
at different destinations. Then, one can do in-depth factor analysis to find what features of content
and audience increase the overall success of an article. In particular, what is the role of frontpages
and other in-site promotions? Another important step is to release datasets for the research community.
The pageviews vs. social signals spreadsheet is likely to be published first. As we identified the problem
with content of mid-range lifespan, one should take a closer look at this area. Videos and products
have a longer lifespan and should be studied through social signals. And, of course, Ediscope should
collect larger datasets to make its findings more robust.
     The general direction of using social signals for content management is wide open. Here is an
overview of the key areas.

Data engineering. Ediscope establishes the basic architecture for news analytics systems. For
a typical study, one needs content discovery, signal crawler, monitoring, statistical analysis and
visualization components. Looking to the future, the research community will benefit from a shared
public stack of these tools. We do not want to recreate the same code again and again. The Ediscope
platform can be extended in a number of ways. Of course, we need more signals: StumbleUpon,
Delicious, Yahoo Site Explorer, Digg, Spinn3r, comment counts, signals from public and private
hit counters. In its future versions, Ediscope can incorporate content metadata: author, publisher,
keywords, topics, headlines, tags, full text, content type, date and time, staff/guest/sponsored. User
data can be harder to add due to privacy concerns, but eventually it will be a part of analytics
systems. We need real-time content discovery and signal stream processing. Higher rate limits
should be negotiated with API providers. Then, there should be a way to add prediction, ranking
and optimization algorithms on top of the basic infrastructure.

Measurements and modeling. Every category of web content can be a subject of social ana-
lytics: video, products, movies, books, websites, blogs, newspapers, magazines, TV shows, and
content farms. One can focus either on a particular vertical or on a content network (Yahoo, MSN,
Aol). A number of metrics can be created based on social signals: content lifespan, engagement
score, engagement-per-visit, share of social traffic to overall pageviews. Once we focus on a certain
content source and a metric, it is time for factor analysis. How do features of content, audience
and user interface affect the social success of a published material? Then, we need comprehensive
industry studies: the baseline numbers for social engagement and leaderboards. Finally, one can
create a taxonomy of engagement scenarios of content units.

Content optimization. Of course, the ultimate goal of social analytics is not just to collect data
and compute some metrics and rankings. The real impact is in using social insights for making
better publishing choices. Every online publisher faces the following issues: Choose stories and
topics to cover. Balance recency and importance in news coverage. Optimize headlines. Optimize
article length. Optimize in-network promotion. Rank its own stream of news [4] and make the best
selection for the frontpage. Find and fix underperforming areas. Optimize user interface. Make the
best content easy to discover. To conclude, the future of Ediscope and other analytics systems is to
recommend choices that maximize social engagement.

Acknowledgement. The author thanks Benjamin Moseley and Silvio Lattanzi for fruitful discussions
at the early stage of this project.

References
 [1] Workshop on Social Media Analytics, 2010. http://snap.stanford.edu/soma2010.
 [2] Z. Avramova, S. Wittevrongel, H. Bruneel, and D. De Vleeschauwer. Analysis and model-
     ing of video popularity evolution in various online video content systems: power-law versus
     exponential decay. In INTERNET’09.
 [3] M. Cha, H. Kwak, P. Rodriguez, Y.-Y. Ahn, and S. Moon. Analyzing the video popular-
     ity characteristics of large-scale user generated content systems. IEEE/ACM Trans. Netw.,
     17(5):1357–1370, 2009.
 [4] G.M. Del Corso, A. Gullí, and F. Romani. Ranking a stream of news. In WWW’05.
 [5] S. Goel, A. Broder, E. Gabrilovich, and B. Pang. Anatomy of the long tail: Ordinary people
     with extraordinary tastes. In WSDM’10.
 [6] J. Leskovec, L. Backstrom, and J. Kleinberg. Meme-tracking and the dynamics of the news
     cycle. In KDD’09.
 [7] M. Mathioudakis and N. Koudas. TwitterMonitor: Trend detection over the Twitter stream. In
     SIGMOD’10.
 [8] M. Mendoza, B. Poblete, and C. Castillo. Twitter under crisis: Can we trust what we RT? In
     SOMA’10.
 [9] P. Ogilvie. Modeling blog post comment counts, 2008.
     http://livewebir.com/blog/2008/07/modeling-blog-post-comment-counts/.
[10] J. Salman and H. Rangwala. Digging Digg: Comment mining, popularity prediction, and
     social network analysis. In WISM’09.
[11] T. Spiliotopoulos. Votes and comments in recommender systems: The case of Digg, 2010.
     http://hci.uma.pt/courses/socialweb/projects/2009.digg.paper.pdf.
[12] M. Tsagkias, W. Weerkamp, and M. de Rijke. News comments: Exploring, modeling, and
     online prediction. In ECIR’10.
[13] M. Tsagkias, W. Weerkamp, and M. de Rijke. Predicting the volume of comments on online
     news stories. In CIKM’09.

Yahoo! Engagement Study

  • 1. TECHNICAL REPORT YL-2010-008 EDISCOPE: SOCIAL ANALYTICS FOR ONLINE NEWS Yury Lifshits Santa Clara, CA 95054 {lifshits@yahoo-inc.com} December 20, 2010 Bangalore • Barcelona • Haifa • Montreal • New York Santiago • Silicon Valley
  • 2. Yahoo! Labs Technical Report No. YL-2010-008
  • 3. EDISCOPE: SOCIAL ANALYTICS FOR ONLINE NEWS Yury Lifshits Santa Clara, CA 95054 {lifshits@yahoo-inc.com} December 20, 2010 ABSTRACT: We present Ediscope — an system for measuring social engagement around online news articles. Ediscope collects signals from Twitter, Facebook and Bit.ly. Using our link spotter and social crawler we address a number of questions. What is a lifespan of a typical news story? What are the typical engagement numbers per-pageview? Can social signals be used for pageview estimates? How much improvement a social optimization can bring to a news source? Our first results indicate that less than 20% of activity happens to an article after its first 24 hours. In average a story has 5-20 social actions per 1000 pageviews. For most feeds, top 7 stories a week capture 65% Facebook actions and 25% retweets. The correlation between pageviews and social signals is surprisingly low. Our measurements indicate a double digit improvement potential for social optimizations.
  • 4. 1. Introduction Online news are on the way to become our primary source of information. In order to win the competition and delight the users, the editors of online news have to constantly optimize their con- tent strategy. Content strategy is a new applied discipline that addresses the following questions: What should we write about? How many articles per day? How to allocate coverage shares be- tween main topics? How to discover breaking stories? Which stories to promote within a website? What is the most effective navigation structure for our content? Next to content strategy, there is the emerging field of social media optimization (SMO): How to maximize engagement? How to maximize secondary traffic from social sources (Facebook, Twitter)? How to grow the number of followers, subscribers and fans? To solve the problems of content strategy and social media optimization one needs both art and science. As web news are inherently more measurable than print news, the role of science is increasing. Until recently, most solutions were based on click-through rates, time spent, eye tracking and pageviews. This information is typically available only for website owners. Therefore, it was hard to create generic measurement and optimization solutions. Fortunately, in the last couple of years, social signals emerged as a universal and public feedback mechanism. In this paper, we present a study based on Facebook likes, links in Twitter, and clicks on Bit.ly links. The availability of social signals for content strategy problems created the new research direction of social media analytics [1]. Questions we address in this study: For how long an average article receives user attention? Can we guess the pageviews counts from social signals? Can social signals be used to promote best stories? Should editors focus on producing better content or on producing more content? How much improvement can it bring? Contribution. 
Our first contribution is the data engineering infrastructure we build for the project. Ediscope system has modules for link discovery, signal monitoring, statistical analysis and visual- ization. Ediscope data and lookup tool are available at http://ediscope.labs.yahoo.net. Currently, Ediscope toolkit is available on request as it is subject to third-party API rate limits. Feel free to contact Yury at lifshits@yahoo-inc.com to use Ediscope for your project or to order a custom report on your favorite news source. Our most surprising finding is the low correlation between social signals and the actual pageivew counts. The gap is especially large for non-top new, in that case Pearson coefficient approaches 0.5. To understand the role of these low correlations we introduce a simple user experience model. Under this model we demonstrate a potential for double digit improvement at Gawker, Business Insider, Change.org and Forbes blogs. In average we see around 10 Facebook/Twitter actions per 1000 pageviews. Correlation between social activities is higher for the top news than in the average case. Mainstream sources have much more Facebook activity than mentions on Twitter. Tech media has the opposite situation. Facebook actions are much more skewed to top news. Finally, Twitter signals have slightly better correlation to pageviews counts.
  • 5. Our results show that almost universally across news sources less than 20% of activity happens after the first 24 hours. Feeds and frontpages drive attention to the latest content units. Search brings traffic to “evergreen” content like Wikipedia. But there is no driver for materials with mid-range (few weeks – few months) lifespan. Perhaps, we need a new promotion mechanism for this type of content. Remark on focus. When scientists work with real world data there are two mindsets. One can focus on hard/intelligent tasks like model fitting and parameter predictions. This approach makes it easy to judge the project by comparing accuracy of results to the previous work. The other method is to measure the raw signals and turn them into actionable insights for domain experts. In this case, the findings can be judged by novelty of measurements and importance of resulting recommendations. This study follows the second approach. Here are our takeaway lessons for editors and product managers of online news: Create new promotion mechanisms for in-depth content. At the moment there is no middle place between breaking news and reference content. Perhaps, we need dedicated feed, section and frontpage module that highlight articles of mid-range lifespan. Use social signals for content optimization. There is a serious gap between what content units are most liked and what content units receive the most pageviews. In other words, user experience can be improved by using Facebook likes and retweet counts to promote the most popular content. Check your engagement scores. If you see less than 10-20 social actions per 1000 pageviews, your sharing functionality can be improved. Typically, it is as simple as getting the buttons of the right size, at the right place and minimize the number of clicks to share your content. Check your head/tail structure. If you have heavy head, improvements in quality and promotion mechanisms should be your priorities. 
If you have heavy tail, your best opportunity is in expanding content production. According to our measurements, one has heavier-than-typical head if over 75% of weekly Facebook actions or over 45% of weekly retweets is concentrated in top 7 articles. 1.1. Related work Social signals (Facebook likes, Retweet counters, Bit.ly click counters) are relatively new phe- nomena. In particular, Facebook Like button was introduced in May 2010, just 6 months prior to this paper. Until now, social analytics research was centered around text-based signals [6, 7, 8]. To our knowledge, we present the first temporal study of Facebook like counts. Before social signals, researchers were looking into comment counts, Digg counts and Youtube viewcounts. Tsagkias, Weerkamp and de Rijke developed several algorithms to predict the total volume of comments shortly after publication [12, 13]. Paul Ogilvie measured and modeled total comment counts across various RSS feeds as a part of FeedHub project [9]. Cha, Kwak, Rodriguez,
  • 6. Ahn, and Moon performed a long tail analysis of Youtube and Daum videos [3]. Avramova, Wit- tevrongel, Bruneel and De Vleeschauwer developed classifier that distinguishes videos with expo- nential and power law popularity decays [2]. Salman and Rangwala showed how to predict a total Digg count shortly after publication [10]. Spiliotopoulos studied correlations between Digg counts and comment counts for most popular stories [11]. The key advantage of social signals comparing to comment/Digg/Youtube counts is their uni- versality. Only now one can develop optimization/prediction/recommendation systems that will be applicable to any news source on the Web. 2. Overview of Ediscope System 2.1. Architecture of Ediscope. For our study we implemented a new social analytics system called Ediscope. It has four primary components. Link spotting tool is taking RSS feeds as an input and check them regularly to spot new links. In many cases, RSS feeds present proxy links in order to measure clicks from RSS readers. In particular, Feedburner and Pheedo do that. In this cases we convert proxy links to the original ones. The second component is signal crawler. It takes news URLs and calls public APIs (Facebook, Bit.ly, TweetMeme) to retrieve the current numbers for a given story. We also implemented custom scraping for pageview counts. After that, we have monitoring component that re-crawl active links in our database regularly (by default, every hour). Ediscope’s monitor computes the deltas to the previous crawl for measuring activity over the last interval. Monitoring functionality is used for temporal analysis of social engagement. Finally, we call Google Chart API for dynamic visualization of results at Ediscope’s website. In its current form, Ediscope has certain limitations. First of all, APIs we use have strict rate limits. In particular, TweetMeme only allows 250 requests per 60 minute time period. This forced us to focus on smaller datasets. 
Secondly, the same news article can be represented by several URLs. Sometimes Facebook, Bit.ly or Twitter fail to recognize these links as the same object. As a result, the APIs return lower engagement numbers, missing likes, clicks and retweets on non-canonical versions of an article. For example, the Wall Street Journal has different URLs for a story when you visit it directly vs. when you visit it from the frontpage. Next, many top websites do not have RSS feeds, or their feeds do not work properly. For example, Yahoo’s Today module, the central piece of its frontpage, does not have a feed. In these cases, one has to use manual lookups or scraping. Finally, Ediscope uses a pull mechanism to discover new stories. By the time we add an article to our system, around 15% of its social activity has already happened. In the future, push mechanisms such as PubSubHubbub can be used to address this issue.

There are several commercial systems in the space of social analytics. PostRank is a proprietary article ranking algorithm that takes social signals into account. BackType is a lookup system that retrieves the current values of social metrics. Unlike Ediscope, it does not have fully accessible temporal profiles or pageview extractor modules. Klout uses social signals to rate news sources and Twitter personalities.
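
The monitoring step described in this section (a regular re-crawl followed by a delta against the previous crawl) can be sketched as follows. This is a minimal illustration and not Ediscope's actual code: the function name `compute_deltas`, the signal keys and the sample counts are assumptions made for the example, and clamping negative differences to zero is one possible design choice among several.

```python
from typing import Dict

Counts = Dict[str, int]  # signal name -> cumulative count at crawl time


def compute_deltas(previous: Counts, current: Counts) -> Counts:
    """Return per-signal activity since the previous crawl.

    Assumption: if a cumulative counter goes down between two crawls,
    the difference is clamped to zero instead of being reported as
    negative activity. A signal that is missing from the previous
    crawl is treated as starting at 0.
    """
    return {
        signal: max(0, value - previous.get(signal, 0))
        for signal, value in current.items()
    }


# Hypothetical cumulative counts for one article at two consecutive hourly crawls.
crawl_at_10h = {"facebook": 120, "twitter": 45, "bitly": 300}
crawl_at_11h = {"facebook": 150, "twitter": 52, "bitly": 340}

hourly_activity = compute_deltas(crawl_at_10h, crawl_at_11h)
print(hourly_activity)  # {'facebook': 30, 'twitter': 7, 'bitly': 40}
```

Storing one such delta record per article and per crawl is enough to rebuild the temporal profile of a story later.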
2.2. Datasets. We created three datasets for our study: a temporal set, a pageview set and a head/tail set.

For temporal analysis we selected 10 RSS feeds from major US news sources. We used our link spotting module to discover 20 articles per source. The link spotter checked RSS feeds every 10 minutes in order to discover articles almost immediately after publication. Then we used our monitoring tool to update social counts every hour and compute the corresponding delta values. As a result, we obtained temporal social profiles for 20 articles at each of 10 sources.

For pageview analysis we consider four major content networks that explicitly show view counts on their articles: Business Insider, Gawker, Forbes Blogs and Change.org. For every network, we picked three RSS feeds, launched our link spotting module and kept it live until we had spotted around 50-75 articles per network. Then we waited for several days until the total social counts were close to their final values. Finally, we used our crawler to measure social counts and pageview counts for every article in our dataset.

For head/tail analysis we looked at RSS feeds of several major news sources. For every publisher, we used the link spotter to get all articles from a one-week period (around 200 articles per feed). Then we crawled them once to collect social counts.

3. Empirical Study

3.1. Article Lifespan

In our temporal study we track 20 articles from each of the following sources: Washington Post, Gizmodo, CNN, MSNBC, HuffingtonPost, Yahoo News, New York Times, Engadget, Mashable, and TechCrunch. On average, every story has 901 Facebook actions (likes, shares and Facebook comments), 221 retweets and 660 clicks from Bitly-shortened links. The following table presents the percentages of activity for the first, second, third, fourth and fifth 24-hour intervals after publication. Note that the total share of activity is significantly less than 100%.
This is due to activity in the interval between the time a story was published and the time Ediscope discovered it. Out of all sources, Engadget articles have the slowest decay of activity and Yahoo News articles have the sharpest.

Signals (% of total activity)   Day 1   Day 2   Day 3   Day 4   Day 5

Signals in average
  Facebook                      73.94   11.57    2.83    1.29    0.48
  Twitter                       70.71    5.11    1.72    0.69    0.37
  Bitly                         73.27    8.07    2.49    1.06    1.01

Engadget signals
  Facebook                      56.13   24.40    9.03    4.35    1.99
  Twitter                       71.27    9.24    4.12    1.28    0.71
  Bitly                         76.53   10.02    4.12    1.54    0.86

Yahoo News signals
  Facebook                      85.49    6.69    1.01    0.38    0.10
  Twitter                       84.80    4.21    0.33    0.13    0.00
  Bitly                         33.88    2.08    0.40    0.21    0.05

Figure 1: Average activity of Engadget article during the first 68 hours of tracking. Deep blue represents Facebook, light blue represents Twitter, yellow represents Bit.ly.

Figure 2: Social activity of Engadget article “BlackBerry users running out of loyalty”
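
Per-day percentages like those in the table above can be derived from hourly deltas by grouping them into 24-hour windows and dividing by the final total count. The sketch below shows one way to do this. The function name `daily_shares` and the sample numbers are hypothetical, and the sketch assumes that hour 0 is the first hour of tracking, so activity that happened before discovery appears only in the final total.

```python
from typing import List


def daily_shares(hourly_deltas: List[int], final_total: int, days: int = 5) -> List[float]:
    """Percentage of final_total that falls into each 24-hour window.

    hourly_deltas[i] is the activity observed in hour i of tracking.
    final_total is the cumulative count at the end of tracking, which
    also includes activity that happened before tracking started.
    """
    if final_total <= 0:
        return [0.0] * days
    shares = []
    for day in range(days):
        window = hourly_deltas[day * 24:(day + 1) * 24]
        shares.append(round(100.0 * sum(window) / final_total, 2))
    return shares


# Hypothetical article: 150 actions before discovery, then 30 per hour
# on day one, 5 per hour on day two and 1 per hour on day three.
deltas = [30] * 24 + [5] * 24 + [1] * 24
total = 150 + sum(deltas)  # 150 + 720 + 120 + 24 = 1014

shares = daily_shares(deltas, total)
missed = round(100.0 - sum(shares), 2)  # share that happened before discovery
print(shares, missed)
```

In this made-up example the five daily shares add up to about 85%, and the remainder is the activity that took place before the article was discovered, which mirrors why the rows of the table do not sum to 100%.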
Here are our main observations:

• The majority (typically, over 80%) of social activity happens during the first 24 hours.
• Monotonicity. The majority of activity curves are monotone, or monotone after daytime correction (the bump-next-morning effect).
• Twitter is geeky. While mainstream sources like NYT, Yahoo, CNN, MSNBC and Washington Post have up to 10 Facebook actions per retweet, TechCrunch and Mashable have more retweets than Facebook signals. The Facebook advantage over Twitter in mainstream news indicates that it can be a more reliable signal for content optimization solutions.
• Non-original content has lower activity. HuffingtonPost has two patterns: one for original posts, another for aggregated content. Five links from the TechCrunch feed are re-posts from CrunchGear and TechCrunch.EU and have much lower counts than TC-proper articles.
• User experience flaws. The sharing functionality can have a serious effect on the total amount of activity. In particular, at the New York Times, Twitter buttons do not directly tweet the story, but instead ask the reader to use Twitter for logging into NYT.

The fact that most activity happens during the first day has serious implications for editors and product managers of online news. As our study shows, the currently used mechanisms for promotion (feeds, frontpage promotions, cross-linking) are only capable of driving a first-day audience. In such an environment, weekly/analytic/evergreen content is highly discouraged and unsustainable. Thus, if a publisher wants to produce longer-lifespan articles, it should depart from existing content promotion strategies. On the positive side, we feel that the opportunity for high-quality weekly/monthly analytic content is wide open in almost every vertical.

3.2. Per-pageview Statistics

Several online content networks display actual pageview counts. This allows us to compute average amounts of social activity per 1000 pageviews.
In some cases, several top stories have a different activity pattern than the rest of the site. To get more robust results, we compute averages both for the full sets of articles and for the sets excluding the top 10 articles. All values are social actions per 1000 pageviews.

Network            Facebook   Twitter   Bit.ly   FB (non-top)   TW (non-top)   BT (non-top)
Gawker                24.59      4.66    13.36          11.55           4.74           2.65
Forbes blogs           4.61      9.16    41.41           5.13          11.86          29.00
Business Insider       3.08      6.40    34.37           3.90          28.99         106.47
Change.org             4.43      2.74     3.54           8.69           4.12           6.25

Then we look at the Pearson correlation coefficient between social signals and the actual pageview counts. We also compute correlations between Facebook and Twitter signals and between Bit.ly and Twitter signals.

Network            FB / PV   TW / PV   BT / PV   FB / TW   BT / TW

All articles
Gawker                0.92      0.95      0.93      0.95      0.95
Forbes blogs          0.35      0.40      0.63      0.34      0.63
Business Insider      0.93      0.54      0.65      0.65      0.87
Change.org           -0.01      0.45      0.05      0.34      0.65

Excluding top 10 news
Gawker                0.47      0.63      0.41      0.47      0.35
Forbes blogs          0.12      0.34      0.55      0.31      0.56
Business Insider      0.34      0.43      0.53      0.50      0.80
Change.org            0.67      0.50     -0.09      0.47      0.75

To get a visual sense of the correlations we present plots for Gawker and Change.org. Absolute values are scaled to fit in the same space. The top-right point of the Gawker plot is in fact far outside of the chart (Gawker has one outstandingly popular story).

Figure 3: Correlation between retweets and pageviews at Gawker network

Let us make some observations from the above tables:

• On average, articles have around 10 Facebook/Twitter actions per 1000 pageviews.
• With the exception of Facebook signals at Gawker, the top news have fewer social actions per pageview than the average stories.
• For the non-top news, the correlation between social signals and pageviews is around 0.5. Recall that the Pearson coefficient ranges from -1 (perfectly negatively correlated) through 0 (no linear correlation) to 1 (perfectly positively correlated). Thus, a value of 0.5 means that social signals are as close to perfect correlation as they are to no correlation at all.
• In 6 cases out of 8, retweets have a higher correlation with pageviews than Facebook actions do.
• Change.org shows negative correlations in some cases. An article is more likely to get Facebook activity if it has fewer pageviews. It turns out that the “Social Entrepreneurship” section has many more pageviews but the same (or even slightly lower) Facebook counts. Once we remove articles from this section, the correlation returns to a positive value.
• As expected, Bit.ly clicks are better correlated with retweets than Facebook signals are.
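
The two statistics used in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code behind the tables: the function names and the sample numbers are hypothetical, and the per-1000 rate is computed here as a ratio of totals, which is only one possible reading of "average amounts of social activity per 1000 pageviews".

```python
import math
from typing import Sequence


def per_thousand_pageviews(actions: Sequence[int], pageviews: Sequence[int]) -> float:
    """Total social actions per 1000 pageviews over a set of articles.

    Assumption: a ratio of totals, not an average of per-article ratios.
    """
    return 1000.0 * sum(actions) / sum(pageviews)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-article numbers, not taken from the study.
pageviews = [1000, 2000, 3000, 4000, 50000]
facebook = [12, 18, 35, 38, 300]

rate = per_thousand_pageviews(facebook, pageviews)
r_all = pearson(facebook, pageviews)                # all articles
r_non_top = pearson(facebook[:-1], pageviews[:-1])  # excluding the top article
print(rate, r_all, r_non_top)
```

Computing the coefficient both with and without the most popular articles shows how much a single outlier can contribute to it, which is the reason the tables above report both variants.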
Figure 4: Correlation between pageviews and Facebook (dark blue), Twitter (light blue) and Bit.ly (yellow) signals at Change.org. The gap in pageviews represents the difference in popularity between different sections of the portal.

Looking at our per-pageview results, one can try to reconstruct pageview counts for the rest of the Web. The baseline guess would be around the Facebook count (or Twitter count) times 100. As our measurements show, one is more likely to accurately predict the pageviews of a top story than those of an average article. And looking at our lifespan study, we recommend Facebook over Twitter as the primary signal for mainstream sources.

What lessons can one learn from these measurements? At the moment the role of social traffic in overall article success seems to be very small. For an average story there is a very low correlation between social signals and its pageview count. When we include top stories in the picture, social activity per pageview actually goes down. These observations hint that factors other than likability and social cascades play the leading role in pageview success. As a result, traffic is allocated to not-so-likable stories.

Let us do the following thought experiment. Assume for a moment that the Facebook count or Twitter count represents the actual reader satisfaction score. Then we can compute the total user satisfaction score as the sum, over all articles, of the product of pageviews and the Facebook/Twitter count. Now, let us reallocate pageview counts in such a way that the top pageview value corresponds to the top Facebook count, the second top corresponds to the second top, and so on. Then we can calculate the “optimal” user satisfaction score. In other words, we want to check how much user benefit promotion-by-likability can bring to existing content networks. Below is the table of our results.

Network            FB increase   TW increase   FB increase (non-top)   TW increase (non-top)
Gawker                   1.019         1.026                   1.330                   1.181
Forbes blogs             1.566         1.403                   1.796                   1.341
Business Insider         1.047         1.342                   1.402                   1.227
Change.org               2.346         1.245                   1.109                   1.110

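
The reallocation experiment can be sketched as follows. The function names and the three-article sample are hypothetical, and reporting the result as the ratio of the "optimal" score to the actual score is our reading of the "increase" columns, since the text does not spell out the formula. Pairing both lists in descending order maximizes the sum of products (this is the rearrangement inequality).

```python
from typing import Sequence


def satisfaction(pageviews: Sequence[int], social: Sequence[int]) -> int:
    """Sum over articles of pageviews times social count."""
    return sum(p * s for p, s in zip(pageviews, social))


def reallocation_gain(pageviews: Sequence[int], social: Sequence[int]) -> float:
    """Ratio of the 'optimal' satisfaction score to the actual one.

    The optimal score pairs the largest pageview value with the largest
    social count, the second largest with the second largest, and so on.
    """
    actual = satisfaction(pageviews, social)
    if actual == 0:
        raise ValueError("actual satisfaction score is zero")
    optimal = satisfaction(
        sorted(pageviews, reverse=True),
        sorted(social, reverse=True),
    )
    return optimal / actual


# Hypothetical network with three articles.
pageviews = [5000, 3000, 1000]
likes = [10, 40, 20]

# actual  = 5000*10 + 3000*40 + 1000*20 = 190000
# optimal = 5000*40 + 3000*20 + 1000*10 = 270000
gain = reallocation_gain(pageviews, likes)
print(round(gain, 3))  # 1.421
```

A value of 1.0 means that traffic is already allocated in order of likability; the further the ratio is above 1.0, the more the ordering of pageviews disagrees with the ordering of social counts.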
As we see, all networks have double-digit potential to improve user experience for non-top stories. Forbes blogs and Change.org can significantly improve the overall user experience. Again, the Change.org results look a bit unusual because the site has two clusters of very different articles: one is more likable, the other gets more pageviews. So the overall experience can be improved significantly. Once we remove the top news (one cluster), the rest of the site can only achieve a 10-11% increase. Of course, our model for the user satisfaction score is an oversimplification, but it can be used as a first-order approximation of the possible improvement based on social signals.

3.3. Head vs. Tail Analysis

In our final experiment we collect links from RSS feeds at several US news sources over the course of one week. These feeds have from 64 to 226 items per week. Then, for every source we retrieve and sort the social counts for the discovered articles. We compute the percentage of weekly social activity that corresponds to the top story, the top 7 stories and all stories outside the top 7. We use the constant 7 as a reflection of a one-story-per-day strategy.

Feed                  Articles tracked   Top item: FB / TW (%)   Top 7: FB / TW (%)   The rest: FB / TW (%)
TechCrunch                         182             32.3 /  4.6          61.5 / 16.8             38.5 / 83.2
Mashable                           162             23.1 /  2.1          47.1 / 13.2             52.9 / 86.8
Wired                              120              9.9 /  4.8          41.4 / 24.9             58.6 / 75.1
Engadget                           200             44.3 / 18.9          68.7 / 27.5             31.3 / 72.5
Wall Street Journal                201             36.6 /  5.8          65.4 / 18.5             34.6 / 81.5
Vanity Fair                         64             21.8 / 11.4          70.5 / 44.7             29.5 / 55.3
Yahoo! Upshot                      109             28.8 / 26.0          75.7 / 59.1             24.3 / 40.9
Yahoo! Top News                    226             20.9 /  9.1          45.6 / 29.6             54.4 / 70.4
All Things D                       139             66.2 / 17.2          89.2 / 41.5             10.8 / 58.5
Gizmodo                             82             36.1 /  5.2          70.0 / 21.1             30.0 / 78.9
Aol News                            78             19.2 / 11.4          85.1 / 44.4             14.9 / 55.6

One can make several immediate observations:

• Typically, around 65% of Facebook actions and 25% of retweets go to the top 7 stories.
• Facebook activity is much more heavy-headed than retweets.
• Yahoo!
Upshot is the most heavy-headed blog in our study. Only 40% of retweets and 25% of Facebook actions happen outside the top 7 articles. Perhaps this is because Upshot has very few dedicated readers, and the majority of activity corresponds to a few stories promoted Yahoo-wide. AllThingsD is also fairly heavy-headed.
• Mashable and Wired have the heaviest tails. Both have over 75% of retweets and over 50% of Facebook actions outside the top 7 stories.

Let us offer an interpretation from a content optimization perspective. The heavy head of social activity means that the total user satisfaction can be improved by improving the quality of the tail
content or by finding better ways to promote it. The heavy tail indicates that the tail content has its own audience and is well promoted. Thus, the best opportunity for heavy-tail websites lies in expanding their content production. For a more accurate interpretation, one should track individual consumption patterns. Goel, Broder, Gabrilovich and Pang have recently shown that the purpose of the tail inventory is not only to capture new users but also to better serve users who like some-of-the-top and some-of-the-niche [5].

4. Roadmap for Social Analytics

There are a number of natural next steps for the Ediscope framework. First, we can turn our measurements into rankings of news sources and individual writers by their engagement scores and the lifespans of their content. It is also informative to compare the signals for the same story covered at different destinations. Then, one can do in-depth factor analysis to find which features of content and audience increase the overall success of an article. In particular, what is the role of frontpages and other in-site promotions? Another important step is to release datasets for the research community. The pageviews vs. social signals spreadsheet is likely to be published first. As we identified the problem with content of mid-range lifespan, one should have a closer look at this area. Videos and products have a longer lifespan and should be studied through social signals. And, of course, Ediscope should collect larger datasets to make its findings more robust.

The general direction of using social signals for content management is wide open. Here is an overview of the key areas.

Data engineering. Ediscope establishes the basic architecture for news analytics systems. For a typical study, one needs content discovery, signal crawler, monitoring, statistical analysis and visualization components. Looking into the future, the research community will benefit from a shared public stack of these tools.
We do not want to recreate the same code again and again. The Ediscope platform can be extended in a number of ways. Of course, we need more signals: StumbleUpon, Delicious, Yahoo Site Explorer, Digg, Spinn3r, comment counts, and signals from public and private hit counters. In its future versions, Ediscope can incorporate content metadata: author, publisher, keywords, topics, headlines, tags, full text, content type, date and time, staff/guest/sponsored. User data can be harder to add due to privacy concerns, but eventually it will be a part of analytics systems. We need real-time content discovery and signal stream processing. Higher rate limits should be negotiated with API providers. Then, there should be a way to add prediction, ranking and optimization algorithms on top of the basic infrastructure.

Measurements and modeling. Every category of web content can be a subject of social analytics: video, products, movies, books, websites, blogs, newspapers, magazines, TV shows, and content farms. One can focus either on a particular vertical or on a content network (Yahoo, MSN, Aol). A number of metrics can be created based on social signals: content lifespan, engagement score, engagement-per-visit, share of social traffic in overall pageviews. Once we focus on a certain content source and a metric, it is time for factor analysis. How do features of content, audience
and user interface affect the social success of published material? Then, we need comprehensive industry studies: the baseline numbers for social engagement and leaderboards. Finally, one can create a taxonomy of engagement scenarios of content units.

Content optimization. Of course, the ultimate goal of social analytics is not just to collect data and compute some metrics and rankings. The real impact is in using social insights for making better publishing choices. Every online publisher faces the following issues:

• Choose stories and topics to cover.
• Balance recency and importance in news coverage.
• Optimize headlines.
• Optimize article length.
• Optimize in-network promotion.
• Rank its own stream of news [4] and make the best selection for the frontpage.
• Find and fix underperforming areas.
• Optimize user interface.
• Make the best content easy to discover.

To conclude, the future of Ediscope and other analytics systems is to recommend choices that maximize social engagement.

Acknowledgement. The author thanks Benjamin Moseley and Silvio Lattanzi for fruitful discussions at the early stage of this project.

References

[1] Workshop on Social Media Analytics, 2010. http://snap.stanford.edu/soma2010.
[2] Z. Avramova, S. Wittevrongel, H. Bruneel, and D. De Vleeschauwer. Analysis and modeling of video popularity evolution in various online video content systems: power-law versus exponential decay. In INTERNET’09.
[3] M. Cha, H. Kwak, P. Rodriguez, Y.-Y. Ahn, and S. Moon. Analyzing the video popularity characteristics of large-scale user generated content systems. IEEE/ACM Trans. Netw., 17(5):1357–1370, 2009.
[4] G.M. Del Corso, A. Gullí, and F. Romani. Ranking a stream of news. In WWW’05.
[5] S. Goel, A. Broder, E. Gabrilovich, and B. Pang. Anatomy of the long tail: Ordinary people with extraordinary tastes. In WSDM’10.
[6] J. Leskovec, L. Backstrom, and J. Kleinberg. Meme-tracking and the dynamics of the news cycle. In KDD’09.
[7] M. Mathioudakis and N. Koudas.
TwitterMonitor: Trend detection over the Twitter stream. In SIGMOD’10.
[8] M. Mendoza, B. Poblete, and C. Castillo. Twitter under crisis: Can we trust what we RT? In SOMA’10.
[9] P. Ogilvie. Modeling blog post comment counts, 2008. http://livewebir.com/blog/2008/07/modeling-blog-post-comment-counts/.
[10] J. Salman and H. Rangwala. Digging Digg: Comment mining, popularity prediction, and social network analysis. In WISM’09.
[11] T. Spiliotopoulos. Votes and comments in recommender systems: The case of Digg, 2010. http://hci.uma.pt/courses/socialweb/projects/2009.digg.paper.pdf.
[12] M. Tsagkias, W. Weerkamp, and M. de Rijke. News comments: Exploring, modeling, and online prediction. In ECIR’10.
[13] M. Tsagkias, W. Weerkamp, and M. de Rijke. Predicting the volume of comments on online news stories. In CIKM’09.