Yahoo! Engagement Study



TECHNICAL REPORT YL-2010-008

EDISCOPE: SOCIAL ANALYTICS FOR ONLINE NEWS

Yury Lifshits
Yahoo! Labs Technical Report No. YL-2010-008
Santa Clara, CA 95054
December 20, 2010

Bangalore • Barcelona • Haifa • Montreal • New York • Santiago • Silicon Valley

ABSTRACT: We present Ediscope, a system for measuring social engagement around online news articles. Ediscope collects signals from Twitter, Facebook and Bitly. Using our link spotter and social crawler we address a number of questions. What is the lifespan of a typical news story? What are the typical engagement numbers per pageview? Can social signals be used for pageview estimates? How much improvement can social optimization bring to a news source? Our first results indicate that less than 20% of an article's activity happens after its first 24 hours. On average, a story has 5-20 social actions per 1000 pageviews. For most feeds, the top 7 stories of a week capture 65% of Facebook actions and 25% of retweets. The correlation between pageviews and social signals is surprisingly low. Our measurements indicate a double-digit improvement potential for social optimizations.
1. Introduction

Online news is on its way to becoming our primary source of information. To win the competition and delight users, editors of online news have to constantly optimize their content strategy. Content strategy is a new applied discipline that addresses the following questions: What should we write about? How many articles per day? How should coverage be allocated among the main topics? How can breaking stories be discovered? Which stories should be promoted within a website? What is the most effective navigation structure for our content? Next to content strategy, there is the emerging field of social media optimization (SMO): How to maximize engagement? How to maximize secondary traffic from social sources (Facebook, Twitter)? How to grow the number of followers, subscribers and fans?

Solving the problems of content strategy and social media optimization takes both art and science. As web news is inherently more measurable than print news, the role of science is increasing. Until recently, most solutions were based on click-through rates, time spent, eye tracking and pageviews. This information is typically available only to website owners, so it was hard to create generic measurement and optimization solutions. Fortunately, in the last couple of years, social signals have emerged as a universal and public feedback mechanism. In this paper, we present a study based on Facebook likes, links in Twitter, and clicks on Bitly links. The availability of social signals for content strategy problems created the new research direction of social media analytics [1].

Questions we address in this study: For how long does an average article receive user attention? Can we guess pageview counts from social signals? Can social signals be used to promote the best stories? Should editors focus on producing better content or on producing more content? How much improvement can it bring?

Contribution.
Our first contribution is the data engineering infrastructure we built for the project. The Ediscope system has modules for link discovery, signal monitoring, statistical analysis and visualization. Ediscope data and the lookup tool are available at [...]. The Ediscope toolkit is available on request, as it is subject to third-party API rate limits. Feel free to contact Yury at [...] to use Ediscope for your project or to order a custom report on your favorite news source.

Our most surprising finding is the low correlation between social signals and actual pageview counts. The gap is especially large for non-top news, where the Pearson coefficient is around 0.5. To understand the role of these low correlations we introduce a simple user experience model. Under this model we demonstrate a potential for double-digit improvement at Gawker, Business Insider, and Forbes blogs.

On average we see around 10 Facebook/Twitter actions per 1000 pageviews. Correlations among social signals are higher for top news than for average stories. Mainstream sources have much more Facebook activity than mentions on Twitter; tech media shows the opposite. Facebook actions are much more skewed toward top news. Finally, Twitter signals correlate slightly better with pageview counts.
Our results show that, almost universally across news sources, less than 20% of activity happens after the first 24 hours. Feeds and frontpages drive attention to the latest content units. Search brings traffic to “evergreen” content like Wikipedia. But there is no driver for material with a mid-range lifespan (a few weeks to a few months). Perhaps we need a new promotion mechanism for this type of content.

Remark on focus. When scientists work with real-world data there are two mindsets. One can focus on hard, intelligent tasks like model fitting and parameter prediction. This approach makes it easy to judge a project by comparing the accuracy of its results to previous work. The other method is to measure the raw signals and turn them into actionable insights for domain experts. In this case, the findings can be judged by the novelty of the measurements and the importance of the resulting recommendations. This study follows the second approach. Here are our takeaway lessons for editors and product managers of online news:

Create new promotion mechanisms for in-depth content. At the moment there is no middle place between breaking news and reference content. Perhaps we need a dedicated feed, section and frontpage module that highlight articles with a mid-range lifespan.

Use social signals for content optimization. There is a serious gap between which content units are most liked and which content units receive the most pageviews. In other words, user experience can be improved by using Facebook likes and retweet counts to promote the most popular content.

Check your engagement scores. If you see fewer than 10-20 social actions per 1000 pageviews, your sharing functionality can be improved. Typically, it is as simple as making the buttons the right size, putting them in the right place and minimizing the number of clicks needed to share your content.

Check your head/tail structure. If you have a heavy head, improvements in quality and promotion mechanisms should be your priorities.
If you have a heavy tail, your best opportunity is in expanding content production. According to our measurements, a source has a heavier-than-typical head if over 75% of its weekly Facebook actions or over 45% of its weekly retweets are concentrated in the top 7 articles.

1.1. Related work

Social signals (Facebook likes, retweet counters, click counters) are a relatively new phenomenon. In particular, the Facebook Like button was introduced in May 2010, just 6 months prior to this paper. Until now, social analytics research has centered on text-based signals [6, 7, 8]. To our knowledge, we present the first temporal study of Facebook like counts.

Before social signals, researchers looked into comment counts, Digg counts and YouTube view counts. Tsagkias, Weerkamp and de Rijke developed several algorithms to predict the total volume of comments shortly after publication [12, 13]. Paul Ogilvie measured and modeled total comment counts across various RSS feeds as part of the FeedHub project [9]. Cha, Kwak, Rodriguez,
Ahn, and Moon performed a long-tail analysis of YouTube and Daum videos [3]. Avramova, Wittevrongel, Bruneel and De Vleeschauwer developed a classifier that distinguishes videos with exponential and power-law popularity decays [2]. Salman and Rangwala showed how to predict the total Digg count shortly after publication [10]. Spiliotopoulos studied correlations between Digg counts and comment counts for the most popular stories [11].

The key advantage of social signals compared to comment, Digg and YouTube counts is their universality. Only now can one develop optimization, prediction and recommendation systems that are applicable to any news source on the Web.

2. Overview of Ediscope System

2.1. Architecture of Ediscope

For our study we implemented a new social analytics system called Ediscope. It has four primary components. The link spotting tool takes RSS feeds as input and checks them regularly to spot new links. In many cases, RSS feeds present proxy links in order to measure clicks from RSS readers; Feedburner and Pheedo, in particular, do that. In these cases we convert proxy links to the original ones. The second component is the signal crawler. It takes news URLs and calls public APIs (Facebook, Bitly, TweetMeme) to retrieve the current numbers for a given story. We also implemented custom scraping for pageview counts. The third component is the monitor, which re-crawls active links in our database regularly (by default, every hour). Ediscope’s monitor computes the deltas against the previous crawl to measure activity over the last interval. Monitoring is used for temporal analysis of social engagement. Finally, we call the Google Chart API for dynamic visualization of results on Ediscope’s website.

In its current form, Ediscope has certain limitations. First of all, the APIs we use have strict rate limits. In particular, TweetMeme only allows 250 requests per 60-minute period. This forced us to focus on smaller datasets.
Secondly, the same news article can be represented by several URLs. Sometimes Facebook, Bitly, or Twitter fail to recognize these links as the same object. As a result, the APIs return lower engagement numbers, missing likes, clicks and retweets on non-canonical versions of an article. For example, the Wall Street Journal has different URLs for a story when you visit it directly and when you visit it from the frontpage. Next, many top websites do not have RSS feeds, or their feeds do not work properly. For example, Yahoo’s Today module, the central piece of its frontpage, does not have a feed. In these cases, one has to use manual lookups or scraping. Finally, Ediscope uses a pull mechanism to discover new stories. By the time we add an article to our system, around 15% of its social activity has already happened. In the future, push mechanisms such as PubSubHubbub can be used to address this issue.

There are several commercial systems in the space of social analytics. PostRank is a proprietary article ranking algorithm that takes social signals into account. BackType is a lookup system that retrieves the current values of social metrics. Unlike Ediscope, it does not have fully accessible temporal profiles or a pageview extractor module. Klout uses social signals to rate news sources and Twitter personalities.
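The monitoring step of Section 2.1 can be illustrated with a minimal sketch. It assumes each crawl yields a dictionary of cumulative counts per signal; the names compute_deltas and max_links_per_cycle, the dictionary layout, and the choice to clamp negative differences to zero are assumptions made for illustration, not details of Ediscope's implementation.

```python
from typing import Dict

# Hypothetical layout: signal name -> cumulative count, e.g. {"facebook": 120}
Counts = Dict[str, int]


def compute_deltas(previous: Counts, current: Counts) -> Counts:
    """Return the activity gained per signal since the previous crawl."""
    deltas: Counts = {}
    for signal, value in current.items():
        # A signal not seen in the previous crawl is treated as starting at 0.
        # Assumption: if an API reports a lower number than before, the
        # delta is clamped to 0 rather than recorded as negative activity.
        deltas[signal] = max(0, value - previous.get(signal, 0))
    return deltas


def max_links_per_cycle(requests_per_window: int, calls_per_link: int) -> int:
    """How many links fit into one re-crawl cycle under an API rate limit."""
    if calls_per_link <= 0:
        raise ValueError("calls_per_link must be positive")
    return requests_per_window // calls_per_link
```

With TweetMeme's limit of 250 requests per 60 minutes, an hourly re-crawl that spends one request per link can monitor at most max_links_per_cycle(250, 1) = 250 links, which is consistent with the focus on smaller datasets.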
2.2. Datasets

We created three datasets for our study: a temporal set, a pageview set and a head/tail set.

For temporal analysis we selected 10 RSS feeds from major US news sources. We used our link spotting module to discover 20 articles per source. The link spotter checked RSS feeds every 10 minutes in order to discover articles almost immediately after publication. Then we used our monitoring tool to update social counts every hour and compute the corresponding delta values. As a result we obtained temporal social profiles for 20 articles at each of 10 sources.

For pageview analysis we consider four major content networks that explicitly show view counts on their articles: Business Insider, Gawker, Forbes Blogs and [...]. For every network, we picked three RSS feeds, launched our link spotting module and kept it live until we had spotted around 50-75 articles per network. Then we waited for several days until the total social counts were close to their final values, and used our crawler to measure social counts and pageview counts for every article in the dataset.

For head/tail analysis we looked at RSS feeds of several major news sources. For every publisher, we used the link spotter to get all articles from a one-week period (between 64 and 226 articles per feed). Then we crawled them once to collect social counts.

3. Empirical Study

3.1. Article Lifespan

In our temporal study we track 20 articles from each of the following sources: Washington Post, Gizmodo, CNN, MSNBC, HuffingtonPost, Yahoo News, New York Times, Engadget, Mashable, and TechCrunch. On average, every story has 901 Facebook actions (likes, shares and Facebook comments), 221 retweets and 660 clicks on Bitly-shortened links. The following table presents the percentages of activity in the first, second, third, fourth and fifth 24-hour intervals after publication. Note that the total share of activity is significantly less than 100%. This is due to activity in the interval between the time a story was published and the time Ediscope discovered it. Of all sources, Engadget articles have the slowest decay of activity and Yahoo News has the sharpest.
Share of total activity (%) in each 24-hour interval

Signal        Day 1   Day 2   Day 3   Day 4   Day 5

Average over all sources
Facebook      73.94   11.57    2.83    1.29    0.48
Twitter       70.71    5.11    1.72    0.69    0.37
Bitly         73.27    8.07    2.49    1.06    1.01

Engadget
Facebook      56.13   24.40    9.03    4.35    1.99
Twitter       71.27    9.24    4.12    1.28    0.71
Bitly         76.53   10.02    4.12    1.54    0.86

Yahoo News
Facebook      85.49    6.69    1.01    0.38    0.10
Twitter       84.80    4.21    0.33    0.13    0.00
Bitly         33.88    2.08    0.40    0.21    0.05

Figure 1: Average activity of an Engadget article during the first 68 hours of tracking. Deep blue represents Facebook, light blue represents Twitter, yellow represents Bitly.

Figure 2: Social activity of the Engadget article “BlackBerry users running out of loyalty”
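The per-day shares in the table above can be derived from hourly monitoring deltas roughly as follows. This is a sketch under stated assumptions: daily_shares is a hypothetical helper, hourly_deltas[0] is taken to be the first hour of tracking, and final_total is the final cumulative count, which also includes activity that happened before the article was discovered.

```python
from typing import List


def daily_shares(hourly_deltas: List[int], final_total: int, days: int = 5) -> List[float]:
    """Percentage of final_total observed in each consecutive 24-hour window."""
    if final_total <= 0:
        # No activity at all: every share is zero by convention (assumption).
        return [0.0] * days
    shares = []
    for day in range(days):
        # Slicing past the end of the list yields an empty window (sum 0).
        window = hourly_deltas[day * 24:(day + 1) * 24]
        shares.append(round(100.0 * sum(window) / final_total, 2))
    return shares


# Made-up example: 400 actions in total, 112 of which happened
# before tracking started and therefore appear in no window.
example = daily_shares([10] * 24 + [2] * 24, final_total=400)
```

In this made-up example the shares sum to 72%, not 100%, for the same reason the rows of the table do: the missing part is the activity that preceded discovery.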
Here are our main observations:

  • The majority (typically, over 80%) of social activity happens during the first 24 hours.
  • Monotonicity. The majority of activity curves are monotone, or monotone after daytime correction (a bump the next morning).
  • Twitter is geeky. While mainstream sources like NYT, Yahoo, CNN, MSNBC and Washington Post have up to 10 Facebook actions for one retweet, TechCrunch and Mashable have more retweets than Facebook signals. The Facebook advantage over Twitter in mainstream news indicates that it can be a more reliable signal for content optimization solutions.
  • Non-original content has lower activity. HuffingtonPost has two patterns: one for original posts, another for aggregated content. Five links from the TechCrunch feed are re-posts from CrunchGear and TechCrunch.EU and have much lower counts than TC-proper articles.
  • User experience flaws. The sharing functionality can have a serious effect on the total amount of activity. In particular, at the New York Times, Twitter buttons do not directly tweet the story but instead ask the reader to use Twitter to log into NYT.

The fact that most activity happens during the first day has serious implications for editors and product managers of online news. As our study shows, the currently used promotion mechanisms (feeds, frontpage promotions, cross-linking) are only capable of driving a first-day audience. In such an environment, weekly, analytic and evergreen content is strongly discouraged and unsustainable. Thus, if a publisher wants to produce longer-lifespan articles, it should depart from existing content promotion strategies. On the positive side, we feel that the opportunity for high-quality weekly/monthly analytic content is wide open in almost every vertical.

3.2. Per-pageview Statistics

Several online content networks display actual pageview counts. This allows us to compute average amounts of social activity per 1000 pageviews.
In some cases several top stories have a different activity pattern than the rest of the site. To get more robust results we compute averages both for the full sets of articles and for the sets excluding the top 10 articles.

Social actions per 1000 pageviews

Network            Facebook  Twitter   Bitly  FB (non-top)  TW (non-top)  BT (non-top)
Gawker                24.59     4.66   13.36         11.55          4.74          2.65
Forbes blogs           4.61     9.16   41.41          5.13         11.86         29.00
Business Insider       3.08     6.40   34.37          3.90         28.99        106.47
[...]                  4.43     2.74    3.54          8.69          4.12          6.25

Then we look at the Pearson correlation coefficient between social signals and actual pageview counts. We also compute correlations between Facebook and Twitter signals and between Bitly and Twitter signals.
Pearson correlation coefficients (PV = pageviews)

Network            FB / PV  TW / PV  BT / PV  FB / TW  BT / TW

All articles
Gawker                0.92     0.95     0.93     0.95     0.95
Forbes blogs          0.35     0.40     0.63     0.34     0.63
Business Insider      0.93     0.54     0.65     0.65     0.87
[...]                -0.01     0.45     0.05     0.34     0.65

Excluding top 10 news
Gawker                0.47     0.63     0.41     0.47     0.35
Forbes blogs          0.12     0.34     0.55     0.31     0.56
Business Insider      0.34     0.43     0.53     0.50     0.80
[...]                 0.67     0.50    -0.09     0.47     0.75

To get a visual sense of correlations we present plots for Gawker and [...]. Absolute values are scaled to fit in the same space. The top-right point in the Gawker plot is in fact far outside the chart (Gawker has one outstandingly popular story).

Figure 3: Correlation between retweets and pageviews at Gawker network

Let us make some observations from the above tables:

  • On average, articles have around 10 Facebook/Twitter actions per 1000 pageviews.
  • With the exception of Facebook signals at Gawker, top news has fewer Facebook and Twitter actions per pageview than average stories.
  • For non-top news, the correlation between social signals and pageviews is around 0.5. Recall that the Pearson coefficient ranges from -1 (perfectly negatively correlated) through 0 (no linear correlation) to 1 (perfectly positively correlated). Thus, a value of 0.5 means that social signals are as close to perfect correlation as they are to no correlation.
  • In 6 cases out of 8, retweets have a higher correlation with pageviews than Facebook actions do.
  • [...] shows negative correlations in some cases. An article is more likely to get Facebook activity if it has fewer pageviews. It turns out that the “Social Entrepreneurship” section has many more pageviews but the same (or even slightly lower) Facebook counts. Once we remove articles from this section, the correlation returns to a positive value.
  • As expected, Bitly clicks are better correlated with retweets than Facebook signals are.
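The two statistics behind the tables above can be sketched in a few lines. The report does not say whether the per-1000 figure is a ratio of totals or a mean of per-article ratios, nor how the top 10 articles are ranked; the sketch below assumes a ratio of totals and ranking by pageviews. All function names are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Article = Tuple[int, int]  # (pageviews, social_count), an assumed layout


def actions_per_1000(articles: Sequence[Article]) -> float:
    """Social actions per 1000 pageviews, computed as a ratio of totals."""
    total_pageviews = sum(pv for pv, _ in articles)
    total_actions = sum(count for _, count in articles)
    if total_pageviews == 0:
        raise ValueError("no pageviews in the sample")
    return 1000.0 * total_actions / total_pageviews


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def drop_top(articles: Sequence[Article], k: int = 10) -> List[Article]:
    """Remove the k articles with the most pageviews (assumed ranking)."""
    return sorted(articles, key=lambda a: a[0], reverse=True)[k:]
```

Calling pearson on the pageview and social-count columns of drop_top(articles) would give the analogue of the "excluding top 10 news" rows for a dataset of one's own.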
Figure 4: Correlation between pageviews and Facebook (dark blue), Twitter (light blue) and Bitly (yellow) signals at [...]. The gap in pageviews represents the difference in popularity between different sections of the portal.

Looking at our per-pageview results, one can try to reconstruct pageview counts for the rest of the Web. The baseline guess would be around the Facebook count (or Twitter count) times 100, since we see around 10 actions per 1000 pageviews. As our measurements show, the pageviews of a top story can be predicted more accurately than those of an average article. And looking at our lifespan study, we recommend Facebook over Twitter as the primary signal for mainstream sources.

What lessons can one learn from these measurements? At the moment the role of social traffic in overall article success seems to be very small. For an average story there is a very low correlation between social signals and its pageview count. When we include top stories in the picture, social activity per pageview actually goes down. These observations hint that factors other than likability and social cascades play the leading role in pageview success. As a result, traffic is allocated to not-so-likable stories.

Let us do the following thought experiment. Assume for a moment that the Facebook count or Twitter count represents the actual reader satisfaction score. Then we can compute the total user satisfaction score as the sum, over all articles, of the product of an article's pageviews and its Facebook/Twitter count. Now, let us reallocate pageview counts so that the top pageview value corresponds to the top Facebook count, the second top to the second top, and so on. Then we can calculate the “optimal” user satisfaction score. In other words, we want to check how much user benefit promotion-by-likability can bring to existing content networks. Below is the table of our results.
Ratio of “optimal” to actual user satisfaction score

Network            FB increase  TW increase  FB increase (non-top)  TW increase (non-top)
Gawker                   1.019        1.026                  1.330                  1.181
Forbes blogs             1.566        1.403                  1.796                  1.341
Business Insider         1.047        1.342                  1.402                  1.227
[...]                    2.346        1.245                  1.109                  1.110
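The reallocation experiment that produces these ratios can be written down directly. The following is a minimal sketch with hypothetical function names; it assumes the two lists are aligned per article and that the social count is the satisfaction score, as the thought experiment stipulates.

```python
from typing import Sequence


def satisfaction(pageviews: Sequence[int], social: Sequence[int]) -> int:
    """Total score: sum over articles of pageviews times social count."""
    if len(pageviews) != len(social):
        raise ValueError("inputs must describe the same articles")
    return sum(pv * s for pv, s in zip(pageviews, social))


def improvement_ratio(pageviews: Sequence[int], social: Sequence[int]) -> float:
    """Ratio of the 'optimal' score to the actual one.

    The optimal allocation pairs the largest pageview value with the
    largest social count, the second largest with the second largest,
    and so on, which maximizes the sum of products.
    """
    actual = satisfaction(pageviews, social)
    if actual == 0:
        raise ValueError("actual score is zero, ratio is undefined")
    optimal = satisfaction(
        sorted(pageviews, reverse=True), sorted(social, reverse=True)
    )
    return optimal / actual
```

For example, three articles with pageviews [100, 50, 10] and social counts [1, 5, 10] have an actual score of 450 and an optimal score of 1260, a ratio of 2.8; if traffic already follows likability, the ratio is exactly 1.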
As we see, all networks have double-digit potential to improve user experience for non-top stories. Forbes blogs and [...] can significantly increase the overall user experience. Again, the results for [...] look a bit unusual because it has two clusters of very different articles: one is more likable, the other gets more pageviews. So the overall experience can be improved significantly. Once we remove the top news (one cluster), the rest of the site can only achieve a 10-11% increase. Of course, our model of the user satisfaction score is an oversimplification, but it can be used as a first-order approximation of the possible improvement based on social signals.

3.3. Head vs. Tail Analysis

In our final experiment we collect links from RSS feeds at several US news sources over the course of one week. These feeds have from 64 to 226 items per week. Then, for every source we retrieve and sort the social counts for the discovered articles. We compute the percentage of weekly social activity that corresponds to the top story, the top 7 stories and all stories outside the top 7. We use the constant 7 as a reflection of a one-story-per-day strategy.

Share of weekly social activity (%), Facebook / Twitter

Feed                  Articles tracked  Top item: FB / TW  Top 7: FB / TW  The rest: FB / TW
TechCrunch                         182        32.3 /  4.6     61.5 / 16.8        38.5 / 83.2
Mashable                           162        23.1 /  2.1     47.1 / 13.2        52.9 / 86.8
Wired                              120         9.9 /  4.8     41.4 / 24.9        58.6 / 75.1
Engadget                           200        44.3 / 18.9     68.7 / 27.5        31.3 / 72.5
Wall Street Journal                201        36.6 /  5.8     65.4 / 18.5        34.6 / 81.5
Vanity Fair                         64        21.8 / 11.4     70.5 / 44.7        29.5 / 55.3
Yahoo! Upshot                      109        28.8 / 26.0     75.7 / 59.1        24.3 / 40.9
Yahoo! Top News                    226        20.9 /  9.1     45.6 / 29.6        54.4 / 70.4
All Things D                       139        66.2 / 17.2     89.2 / 41.5        10.8 / 58.5
Gizmodo                             82        36.1 /  5.2     70.0 / 21.1        30.0 / 78.9
Aol News                            78        19.2 / 11.4     85.1 / 44.4        14.9 / 55.6

One can make several immediate observations:

  • Typically, around 65% of Facebook actions and 25% of retweets happen around the top 7 stories.
  • Facebook activity is much more heavy-headed than retweets.
  • Yahoo! Upshot is the most heavy-headed blog in our study: only about 40% of retweets and 25% of Facebook actions happen outside its top 7 articles. Perhaps this is because Upshot has very few dedicated readers, and the majority of activity corresponds to a few Yahoo-wide promoted stories. AllThingsD is also fairly heavy-headed.
  • Mashable and Wired have the heaviest tails. Both have over 75% of retweets and over 50% of Facebook actions outside the top 7 stories.

Let us offer an interpretation from a content optimization perspective. A heavy head of social activity means that total user satisfaction can be improved by improving the quality of the tail
content or by finding better ways to promote it. A heavy tail indicates that the tail content has its own audience and is well promoted. Thus, the best opportunity for heavy-tail websites lies in expanding their content production. For a more accurate interpretation, one should track individual consumption patterns. Goel, Broder, Gabrilovich and Pang have recently shown that the purpose of the tail inventory is not only to capture new users but also to better serve users who like some of the top and some of the niche [5].

4. Roadmap for Social Analytics

There are a number of natural next steps for the Ediscope framework. First, we can turn our measurements into rankings of news sources and individual writers by their engagement scores and the lifespans of their content. It is also informative to compare the signals for the same story covered at different destinations. Then, one can do in-depth factor analysis to find which features of content and audience increase the overall success of an article. In particular, what is the role of frontpages and other in-site promotions? Another important step is to release datasets to the research community. A spreadsheet of pageviews vs. social signals is likely to be published first. As we identified the problem with content of mid-range lifespan, one should take a closer look at this area. Videos and products have a longer lifespan and should be studied through social signals. And, of course, Ediscope should collect larger datasets to make its findings more robust.

The general direction of using social signals for content management is wide open. Here is an overview of the key areas.

Data engineering. Ediscope establishes the basic architecture for news analytics systems. For a typical study, one needs content discovery, signal crawler, monitoring, statistical analysis and visualization components. Looking into the future, the research community will benefit from a shared public stack of these tools.
We do not want to recreate the same code again and again. The Ediscope platform can be extended in a number of ways. Of course, we need more signals: StumbleUpon, Delicious, Yahoo Site Explorer, Digg, Spinn3r, comment counts, and signals from public and private hit counters. In future versions, Ediscope can incorporate content metadata: author, publisher, keywords, topics, headlines, tags, full text, content type, date and time, and staff/guest/sponsored status. User data can be harder to add due to privacy concerns, but eventually it will be a part of analytics systems. We need real-time content discovery and signal stream processing. Higher rate limits should be negotiated with API providers. Then, there should be a way to add prediction, ranking and optimization algorithms on top of the basic infrastructure.

Measurements and modeling. Every category of web content can be a subject of social analytics: video, products, movies, books, websites, blogs, newspapers, magazines, TV shows, and content farms. One can focus either on a particular vertical or on a content network (Yahoo, MSN, Aol). A number of metrics can be created based on social signals: content lifespan, engagement score, engagement per visit, and the share of social traffic in overall pageviews. Once we focus on a certain content source and a metric, it is time for factor analysis. How do features of content, audience
and user interface affect the social success of published material? Then, we need comprehensive industry studies: baseline numbers for social engagement and leaderboards. Finally, one can create a taxonomy of engagement scenarios for content units.

Content optimization. Of course, the ultimate goal of social analytics is not just to collect data and compute some metrics and rankings. The real impact is in using social insights to make better publishing choices. Every online publisher faces the following tasks: choose stories and topics to cover; balance recency and importance in news coverage; optimize headlines; optimize article length; optimize in-network promotion; rank its own stream of news [4] and make the best selection for the frontpage; find and fix underperforming areas; optimize the user interface; make the best content easy to discover. To conclude, the future of Ediscope and other analytics systems is to recommend choices that maximize social engagement.

Acknowledgement. The author thanks Benjamin Moseley and Silvio Lattanzi for fruitful discussions at the early stage of this project.

References

[1] Workshop on Social Media Analytics, 2010.
[2] Z. Avramova, S. Wittevrongel, H. Bruneel, and D. De Vleeschauwer. Analysis and modeling of video popularity evolution in various online video content systems: power-law versus exponential decay. In INTERNET’09.
[3] M. Cha, H. Kwak, P. Rodriguez, Y.-Y. Ahn, and S. Moon. Analyzing the video popularity characteristics of large-scale user generated content systems. IEEE/ACM Trans. Netw., 17(5):1357–1370, 2009.
[4] G.M. Del Corso, A. Gullí, and F. Romani. Ranking a stream of news. In WWW’05.
[5] S. Goel, A. Broder, E. Gabrilovich, and B. Pang. Anatomy of the long tail: Ordinary people with extraordinary tastes. In WSDM’10.
[6] J. Leskovec, L. Backstrom, and J. Kleinberg. Meme-tracking and the dynamics of the news cycle. In KDD’09.
[7] M. Mathioudakis and N. Koudas. TwitterMonitor: Trend detection over the Twitter stream. In SIGMOD’10.
[8] M. Mendoza, B. Poblete, and C. Castillo. Twitter under crisis: Can we trust what we RT? In SOMA’10.
[9] P. Ogilvie. Modeling blog post comment counts, 2008.
[10] J. Salman and H. Rangwala. Digging Digg: Comment mining, popularity prediction, and social network analysis. In WISM’09.
[11] T. Spiliotopoulos. Votes and comments in recommender systems: The case of Digg, 2010.
[12] M. Tsagkias, W. Weerkamp, and M. de Rijke. News comments: Exploring, modeling, and online prediction. In ECIR’10.
[13] M. Tsagkias, W. Weerkamp, and M. de Rijke. Predicting the volume of comments on online news stories. In CIKM’09.