Big Data Hype (or Reality)

This presentation gives an analysis of the article, "Big Data Hype (or Reality)" by Gregory Piatetsky-Shapiro.
It mentions two major insights relevant to a manager in India.
1. Randomness inherent in human behavior is the limiting factor to consumer modeling success.
2. Big data analytics can improve predictions, but the biggest effects of big data will be in creating wholly new areas.

  1. Analysis of the article "Big Data Hype (or Reality)" by Gregory Piatetsky-Shapiro
  2. Gregory Piatetsky-Shapiro: Gregory I. Piatetsky-Shapiro is a data scientist and co-founder of the KDD conferences and of ACM SIGKDD, the Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining.
  3. Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves.
  4. Big data offers unprecedented awareness of phenomena, particularly of consumers’ actions and attitudes. Three areas where better prediction of consumer behavior would clearly be valuable: 1) film ratings, 2) churn prediction, 3) web advertising response.
  5. Case #1: Film Ratings. “Film ratings are critical for a company that thrives when people consume more content.” This is a prediction challenge.
  6. Netflix launched a competition to improve on the Cinematch algorithm it had developed over many years. It released a record-large (for 2007) dataset, with about 480,000 anonymized users, 17,770 movies, and user/movie ratings ranging from 1 to 5 (stars).
  7. The error of Netflix’s own algorithm was about 0.95 (measured as root-mean-square error, RMSE), meaning that its predictions tended to be off by almost a full “star.” The Netflix Prize of $1 million would go to the first algorithm to reduce that error by just 10%, to about 0.86. It took about three years before the BellKor’s Pragmatic Chaos team managed to win the prize with an RMSE of 0.8567. The winning algorithm was a very complex ensemble of many different approaches, so complex that Netflix never implemented it.
  8. Case #2: Churn Prediction. If predictive analytics drawing on big data could accurately point to who in particular was about to jump ship, direct marketing dollars could be efficiently deployed to intervene, perhaps by offering those wavering customers new benefits or discounts.
  9. The lift of a target group identified by churn analysis reflects the higher proportion of customers in that group who actually drop the service, compared with the population of customers as a whole. If, typically, 2 percent of customers drop the service per month and, within the group identified as “churners,” 8 percent drop the service, the “lift” is 4.
  10. Case #3: Web Advertising Response. The challenge is predicting the click-through rate (CTR %) of an online ad, clearly a valuable thing to get right given the sums changing hands in that business. We should exclude search advertising, where the ad is always related to user intent, and focus on the rates for display ads.
  11. The average CTR % for display ads has been reported to be as low as 0.1-0.2%. Researchers have reported up to seven-fold improvements, but seven times 0.2% amounts to only 1.4%. “Today’s best targeted advertising is ignored 98.6% of the time.”
  12. Relevant insights for a manager
  13. Insight #1: Randomness inherent in human behavior is the limiting factor to consumer modeling success. When an activity is driven by consumers’ whims, no amount of ingenuity can produce the ability to know what will happen.
  14. Predictive analytics can figure out how to land on Mars, but not who will buy a Mars bar.
  15. Insight #2: Big data analytics can improve predictions, but the biggest effects of big data will be in creating wholly new areas.
  16. The success of the Facebook, Twitter, and LinkedIn social networks depends on their scale, and big data tools and analytics will be required for them to keep growing.
  17. “If you’re counting on Big Data to make people much more predictable, you’re expecting too much.”
  18. Thank You
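
The arithmetic behind the three cases can be checked with a short script. The Python sketch below is not part of the presentation: the function names `rmse` and `lift` and the toy ratings are illustrative assumptions, and only the figures 0.95, 10%, 2%, 8%, 0.2%, and seven-fold are taken from the slides.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def lift(target_rate, base_rate):
    """Ratio of the event rate in a targeted group to the overall rate."""
    return target_rate / base_rate


# Case #1: a 10% reduction of the baseline error of about 0.95.
baseline_rmse = 0.95
prize_target = baseline_rmse * (1 - 0.10)  # the slides round this to "about 0.86"

# Hypothetical ratings, made up only to exercise rmse(); not Netflix data.
toy_predicted = [4.0, 3.0, 5.0, 2.0]
toy_actual = [5.0, 3.0, 4.0, 2.0]
toy_error = rmse(toy_predicted, toy_actual)

# Case #2: 8% monthly churn in the flagged group versus 2% overall.
churn_lift = lift(target_rate=8.0, base_rate=2.0)

# Case #3: a seven-fold improvement on a 0.2% display-ad CTR.
base_ctr_percent = 0.2
best_ctr_percent = base_ctr_percent * 7
ignored_percent = 100 - best_ctr_percent

print(f"Prize target RMSE: {prize_target:.3f}")
print(f"Toy RMSE:          {toy_error:.4f}")
print(f"Churn lift:        {churn_lift:.1f}")
print(f"Best CTR:          {best_ctr_percent:.1f}%")
print(f"Ads ignored:       {ignored_percent:.1f}%")
```

Percentages are kept as plain numbers (0.2 means 0.2%) so the results read the same way as on the slides, and the printed values are rounded because floating-point multiplication does not give exact decimals.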