
Daily Deals - Cornell University


Daily Deals: Prediction, Social Diffusion, and Reputational Ramifications

Published on September 7th, 2011

John W. Byers
Computer Science Dept.
Boston University

Michael Mitzenmacher
School of Eng. Appl. Sci.
Harvard University

Georgios Zervas
Computer Science Dept.
Boston University


arXiv:1109.1530v1 [cs.SI] 7 Sep 2011

ABSTRACT

Daily deal sites have become the latest Internet sensation, providing discounted offers to customers for restaurants, ticketed events, services, and other items. We begin by undertaking a study of the economics of daily deals on the web, based on a dataset we compiled by monitoring Groupon and LivingSocial sales in 20 large cities over several months. We use this dataset to characterize deal purchases; glean insights about operational strategies of these firms; and evaluate customers' sensitivity to factors such as price, deal scheduling, and limited inventory. We then marry our daily deals dataset with additional datasets we compiled from Facebook and Yelp users to study the interplay between social networks and daily deal sites. First, by studying user activity on Facebook while a deal is running, we provide evidence that daily deal sites benefit from significant word-of-mouth effects during sales events, consistent with results predicted by cascade models. Second, we consider the effects of daily deals on the longer-term reputation of merchants, based on their Yelp reviews before and after they run a daily deal. Our analysis shows that while the number of reviews increases significantly due to daily deals, average rating scores from reviewers who mention daily deals are 10% lower than scores of their peers on average.

1. INTRODUCTION

Groupon and LivingSocial are websites offering various deals-of-the-day, with localized deals for major geographic markets. Groupon in particular has been one of the fastest growing Internet sales businesses in history, with tens of millions of registered users and 2011 sales expected to exceed 1 billion dollars.

We briefly describe how daily deal sites work; additional details relevant to our measurement methodology will be given subsequently. In each geographic market, or city, there are one or more deals of the day. Generally, one deal in each market is the featured deal of the day, and receives the prominent position on the primary webpage targeting that market. The deal provides a coupon for some product or service at a substantial discount (generally 40-60%) to the list price. Deals may be available for one or more days. We use the term size of a deal to represent the number of coupons sold, and the term revenue of a deal to represent the number of coupons multiplied by the price per coupon. Groupon retains approximately half the revenue from the discounted coupons [10], and provides the rest to the seller, as does LivingSocial. Deals each have a minimum threshold size that must be reached for the deal to take hold, and sellers may also set a maximum threshold size to limit the number of coupons sold.

Daily deal sites represent a change from recent Internet advertising trends. While large-scale e-mail distributions for sale offers are commonplace (generally in the form of spam) and coupon sites have long existed on the Internet, Groupon and LivingSocial have achieved notable success with their emphasis on higher quality localized deals, as well as their marketing savvy both with respect to buyers and sellers (merchants). This paper represents an attempt to gain insight into the success of this business model, using a combination of data analysis and modeling.

The contributions of the paper are as follows:

• We compile and analyze datasets we gathered monitoring Groupon over a period of six months and LivingSocial over a period of three months in 20 large US markets. Our datasets will be made publicly available (on publication of this paper).

• We consider how the price elasticity of demand, as well as what we call "soft incentives", affect the size and revenue of Groupon and LivingSocial deals empirically. Soft incentives include deal aspects other than price, such as whether a deal is featured and what days of the week it is available.

• We study the predictability of the size of Groupon deals, based on deal parameters and on temporal progress. We show that deal sizes can be predicted with moderate accuracy based on a small number of parameters, and with substantially better accuracy shortly after a deal goes live.

• We examine dependencies between the spread of Groupon deals and social networks by cross-referencing our Groupon dataset with Facebook data tracking the frequency with which users "like" Groupon deals. We offer evidence that propagation of Groupon deals is consistent with predictions of social spreading made by cascade models.

• We examine the change in reputation of merchants based on their Yelp reviews before and after they run a Groupon deal. We find that reviewers mentioning daily deals are significantly more negative than their peers on average, and the volume of their reviews materially lowers Yelp scores in the months after a daily deal offering.

We note that we presented preliminary findings based on a single month of Groupon data that focused predominantly on the issue of soft incentives in a technical report [3]. The current paper enriches that study in several ways, both in its consideration of LivingSocial as a comparison point, and especially in our use of social network data sources, such as Facebook and Yelp, to study deal sites. Indeed, we believe this use of multiple disparate data sources, while not novel as a research methodology, appears original in this context of gaining insight into deal sites.

∗E-mail: {byers, zg} Supported in part by Adverplex, Inc. and by NSF grant CNS-1040800.
†E-mail: Supported in part by NSF grants CCF-0915922 and CNS-0721491, and in part by a research grant from Yahoo!
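The size and revenue definitions above, together with Groupon's roughly 50% revenue share, reduce to simple arithmetic. A minimal sketch of that bookkeeping follows; the deal figures and the exact 50% split are illustrative assumptions, not values from the paper's dataset:

```python
def deal_revenue(coupons_sold, price_per_coupon):
    # Revenue of a deal: the number of coupons sold times the price per coupon.
    return coupons_sold * price_per_coupon

def merchant_payout(coupons_sold, price_per_coupon, site_share=0.5):
    # Groupon retains approximately half the revenue [10]; the remainder goes
    # to the seller. The 50% split here is an approximation, not an exact rate.
    revenue = deal_revenue(coupons_sold, price_per_coupon)
    return revenue * (1 - site_share)

# A hypothetical deal: 1,000 coupons at $20 each (50% off a $40 list price).
print(deal_revenue(1000, 20))     # 20000
print(merchant_payout(1000, 20))  # 10000.0
```

Note that the deal's size (1,000 coupons here) and its revenue are distinct quantities; the paper's analyses treat them separately.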
Before continuing, we acknowledge that a reasonable question is why we gathered data ourselves, instead of asking Groupon for data; such data (if provided) would likely be more accurate and possibly more comprehensive. We offer several justifications. First, by gathering our own data, we can make it public, for others to use and to verify our results. Second, by relying on a deals site as a source for data, we would be limited to data they were willing to provide, as opposed to data we thought we needed (and was publicly available). Gathering our own data also motivated us to gather and compare data from multiple sources. Finally, due to fortuitous timing, Groupon's recent S-1 filing [10] allowed us to validate several aggregate measures of the datasets we collected.

Related Work on Daily Deals: To date, there has been little previous work examining Groupon and LivingSocial specifically. Edelman et al. consider the benefits and drawbacks of using Groupon from the side of the merchant, modeling whether the advertising and price discrimination effects can make such discounts profitable [9]. Dholakia polls businesses to determine their experience of providing a deal with Groupon [8], and Arabshahi examines their business model [2]. Several works have studied other online group buying schemes that arose before Groupon, and that utilize substantially different dynamic pricing schemes [1, 12]. Ye et al. recently provide a stochastic "tipping point" model for sales from daily deal sites that incorporates social network effects [21]. They provide supporting evidence for their model using a Groupon data set they collected that is similar to, but less comprehensive than, ours, but they do not measure social network activity.

2. THE DAILY DEALS LANDSCAPE

In this section, we describe the current landscape of daily deal sites exemplified by Groupon and LivingSocial. We start by describing the measurement methodology we employed to collect longitudinal data from these sites, and provide additional background on how these sites operate. We then describe basic insights that can be gleaned directly from our datasets, including revenue and sales broken out by week, by deal, by geographic location, and by deal type. Moving on, we observe that given an offering, daily deal sites can optimize the performance of the offering around various parameters, most obviously price, but also day-of-week, duration, etc. We explore these through the lens of our datasets.

2.1 Measurement Methodology

We collected longitudinal data from the top two group buying sites, Groupon and LivingSocial, as well as from Facebook and Yelp. Our datasets are complex and we describe them in detail below.

2.1.1 Deal data

We collected data from Groupon between January 3rd and July 3rd, 2011. We monitored – to the best of our knowledge – all deals offered in 20 different cities during this period. Our criteria for city selection were population and geographic distribution. Specifically, our list of cities includes: Atlanta, Boston, Chicago, Dallas, Detroit, Houston, Las Vegas, Los Angeles, Miami, New Orleans, New York, Orlando, Philadelphia, San Diego, San Francisco, San Jose, Seattle, Tallahassee, Vancouver, and Washington DC. In total, our data set contains statistics for 16,692 deals.

Each Groupon deal is associated with a set of features: the deal description, the retail and discounted prices, the start and end dates, the threshold number of sales required for the deal to be activated, the number of coupons sold, whether the deal was available in limited quantities, and if it sold out. Each deal is also associated with a category such as "Restaurants", "Nightlife", or "Automotive". From these basic features we compute further quantities of interest such as the revenue derived by each deal, the deal duration, and the percentage discount.

With each Groupon deal, we collected intraday time-series data which monitors two time-varying parameters: cumulative sales, and whether or not a given deal is currently featured. To compile these time-series, we monitored each deal at roughly ten-minute intervals and downloaded the value of the sales counter. Occasionally some of our requests failed and therefore some gaps are present in our time-series data, but this does not materially affect our conclusions. The second parameter we monitored was whether a deal was featured or not, with featured deals being those deals that are presented in the subject line of daily subscriber e-mails while being given prominent presentation in the associated city's webpage. For example, visiting a city's Groupon webpage, one notices that a single deal occupies a significant proportion of the screen real-estate, while the rest of the deals which are concurrently active are summarized in a smaller sidebar.

Although Groupon has a public API through which one can obtain some basic deal information, we decided also to monitor the Groupon website directly. Our primary rationale was that certain deal features, such as whether a link to reviews for the merchant offering the deal was present, were not available through the Groupon API. We used the API to obtain a category for each deal and to validate the sales data we collected. Observed discrepancies were infrequent and small: we used the API-collected data as the ground truth in these cases. We did not use the API to collect time-series data.

We collected data from LivingSocial between March 21st and July 3rd, 2011 for the same set of 20 cities. In total, our LivingSocial dataset contains 2,609 deals. LivingSocial deals differ from their Groupon counterparts in that they have no tipping point, and in that they do not explicitly indicate whether they are available in limited quantities (although they do sell out occasionally). LivingSocial runs two types of deals: one featured deal per day, and a secondary "Family Edition" deal, which offers family-friendly activities and receives less prominent placement on the LivingSocial website. For LivingSocial deals we only collected data on their outcomes; we did not collect time-series data.

2.1.2 Facebook data

Both Groupon and LivingSocial display a Facebook Like button for each deal, where the Like button is associated with a counter representing the number of Facebook users who have clicked the button to express their positive feedback. We refer to the value of the counter as the number of likes a deal has received, and we collected this value for each Groupon and LivingSocial deal in our dataset.

As a technical aside, we mention that Groupon and LivingSocial have different implementations of the Facebook Like button that necessitated our collecting data from them in different ways. Within each deal page, Groupon embeds code that dynamically renders the Facebook Like button. It does so by sending Facebook a request that contains a unique identifier associated with the corresponding deal page. We extracted the unique identifier from Groupon deal pages and directly contacted Facebook to obtain the number of likes for every deal. LivingSocial instead hard-codes the Like button and its associated counter within each page. As we could not obtain the identifier associated with each LivingSocial deal, we could not query Facebook to independently obtain the number of likes, and thus we collected the hard-coded number from LivingSocial deal pages.

2.1.3 Yelp data

Groupon occasionally displays reviews for the merchant offering the deal in the form of a star-rating, as well as selected reviewer comments. The reviews are sourced from major review sites such as Yelp, Citysearch, and TripAdvisor. For Groupon deals that were linked to Yelp reviews, we collected the individual reviewer ratings and comments left by customers on Yelp. We collected this dataset during the first week of September 2011. In total, our dataset contains 56,048 reviews, for 2,332 merchants who ran 2,496 deals on Groupon during our monitoring period. Yelp has implemented one measure to discourage the automated collection of reviews which directly affected our data collection: it hides a set of user reviews for each merchant. To see the hidden reviews, one has to solve a CAPTCHA. We did not attempt to circumvent CAPTCHAs, and we do not know whether the hidden reviews are randomly selected, or are selected by other criteria. Since Yelp reports the total number of reviews available per merchant, we ascertained that approximately 23% of all reviews for merchants in our dataset were hidden from our collection.

[Figure 1: Weekly revenue and sales in 20 selected cities (2011). (a) Groupon, with notable deals annotated: Superbowl ads, FTD offer, Barnes & Noble, Body Shop, Old Navy. (b) LivingSocial, with notable deals annotated: 5 Nights in Puerto Vallarta, 5 Days in Cabo San Lucas, 7 Nights in the Caribbean.]

[Figure 2: Revenue and coupons sold per deal week-over-week. (a) Groupon. (b) LivingSocial.]

2.2 Operational Insights

Figure 1 serves as an overview of insights we are able to gain using our dataset. It displays the weekly revenue as well as the weekly sales of coupons across all 20 cities we monitored for Groupon and LivingSocial, respectively. Notably, while both Groupon and LivingSocial are widely regarded as companies enjoying extremely rapid growth, our first takeaway from these plots is that sales and revenue in these 20 established markets are relatively flat across the time period. We conjecture that much of the reported growth is in newer markets.

By happenstance, Groupon's recent S-1 filing [10] provided financial information that allowed us to validate some of the aggregate revenue data that we collected indepen-