The Myths of Big Data



Although analyzing “big data” has the power to transform your business, the ease of doing so has been overstated. In reality, harnessing big data is still a messy and labor-intensive business. We are incredibly excited by what we can do with data but also think some of the hype is doing brands a disservice, because it creates a false expectation of how easy this work is going to be. Most things in life that are important and worthwhile are difficult, and the analysis of big data is no different. Don’t believe these commonly heard myths…

Published in: Business, Technology
  • nice overview and well presented, thanks.

    a little thing I noticed on summary slide 25: under #3 the text of #1 is used.

  • Well, the first thing is… Big Data isn’t big. Not only is “Big Data” poor English, it’s also very misleading. What we’re talking about is a large volume of data points, updated at high frequency, with a short lag to the actual event (real time or near real time).
  • But it’s very granular, i.e., small individual data. It’s individual transaction data: a certain credit card, paying for a certain amount of gas, at a certain gas station. Big Data is actually lots and lots of very small data. So much SMALL DATA actually lies at the very heart of the Big Data OPPORTUNITY and the CHALLENGE.
  • Big Data Myth #2: Big Data analytics is an automated process! We hear “real-time analytics” a lot, when in reality model-building and analytics are anything but real time. It’s a messy, manual business of getting data aligned, pictures tagged correctly and so on, which happens episodically to update a model, not in “real time”.
  • True picture of searches for BIG DATA, if you take a slightly longer-term view. Disaggregate, real-time data is not always good for marketers. One data point about a guy at a gas station convenience store buying nappies and beer together does not necessarily mean a marketer should use that to invent a great new co-promotion. Disaggregate and real time can be really misleading. Disaggregating across regions, store types and so on CAN give more granular data, but can make for very noisy data, where the noise drowns the signal. Indeed, real time is not as good as tracking changes over time. The interesting bit of analytics is looking at changes: which customers are now buying different products than before? The other interesting bit of analytics is predicting, and we need to look at a degree of history to make predictions about the future.
  • So, what big data analytics means for marketers is that they have to learn to be patient and get analyses more slowly than the big data hype suggests. Most of all, more data means more conflicting evidence, so they have to get really good at analysing the analytics and weighing up the evidence. This is a new skill set they need to develop and nurture, and one that their partners in research and consulting can help them with. This requires Analytics with a HUMAN EDGE. Thank you for listening. Any questions?
  • Intro to Prophet value prop
  • The Myths of Big Data
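
The note above that disaggregated data can be so noisy that "the noise drowns the signal" can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are synthetic, the noise is assumed to be independent and Gaussian, and the names `simulate_noise` and `aggregate_mean` are hypothetical helpers, not anything from the presentation. It shows that averaging daily values into weekly values shrinks the random noise, which is why a more aggregated view can be easier to read.

```python
import random
import statistics


def simulate_noise(n_points, noise_sd=30.0, seed=42):
    """Generate synthetic, independent Gaussian noise (an assumption for illustration)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, noise_sd) for _ in range(n_points)]


def aggregate_mean(series, window):
    """Average consecutive, non-overlapping windows; leftover points are dropped."""
    usable = len(series) - len(series) % window
    return [
        statistics.fmean(series[i:i + window])
        for i in range(0, usable, window)
    ]


# Ten years of hypothetical daily noise, then the same noise averaged by week.
daily_noise = simulate_noise(3640)
weekly_noise = aggregate_mean(daily_noise, 7)

daily_sd = statistics.pstdev(daily_noise)
weekly_sd = statistics.pstdev(weekly_noise)

print(f"daily noise spread:  {daily_sd:.1f}")
print(f"weekly noise spread: {weekly_sd:.1f}")
```

With independent noise, the spread of a seven-day average is roughly the daily spread divided by the square root of seven. Real marketing data is rarely this well behaved, so the right level of aggregation still needs judgment.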

    1. Proprietary and confidential. Do not distribute. The Myths of Big Data. Prepared for Internal Prophet Team.
    2. What is Big Data? Big Volume: With better hosting and computation capabilities, Big Data is getting bigger and bigger. Companies can now track every single click on every webpage for every visit. Big Velocity: Velocity refers to frequency of data generation or frequency of data delivery. With sensor and web data coming in real time, ability to handle velocity is a core feature of Big Data. Big Variety: Big data is made up of several data sources that need to be integrated to run useful analytics. Big Value: The ability to access and utilize the Big Data for business advantage. ‘Terabytes’, ‘Transaction Level Data’, ‘Data Warehouse’, ‘Web Data’, ‘Real time data feed’, ‘Streaming Data’, ‘40 Ms online ad response’, ‘Twitter’, ‘ETL’, ‘Data Integration’, ‘Blog’, ‘Click stream’, ‘85% Ineffective by 2015’, ‘40% growth’, ‘175Bn by 2015’.
    3. Myths of Big Data
    4. BIG DATA MYTH #1: Big Data is Big
    5. Big data is not one big chunk of data, it’s a collection of several different types of data feeds in its entirety that makes it big: Mobile / iPad, Future Channels, Transaction Data, Loyalty Card, Price/Promotion, Website Data, Demographics.
    6. BIG DATA MYTH #2: Big Data Analytics Are Automated Processes
    7. Detailed Message / Key Message: “We make a change to our search algorithms at least once a day but these are manual updates” - Matt Cutts, Google Web Spam Team. Big Data Analytics requires a lot of “dirty” data cleaning, handling and modeling. A complete Big Data project often involves unstructured data such as flat files sitting on someone’s laptop. The very nature of Big Data makes it difficult to have standardized automated processes for all clients. Big Data projects are just like other analytics projects. They have a beginning, middle and the end. The impression of Big Data Analytics being a super intelligent artificial intelligence based analytics is false. The majority of time in Big Data Analytics is taken up by data scientists cleaning up the messy data. Advanced modeling and statistics is a very small percentage. The pre-modeling and analysis work in any Big Data project makes it difficult to achieve automation. The process of enacting and understanding “big data” is a very manual process, built up of many layers.
    8. BIG DATA MYTH #3: The More Granular The Data, The Better
    9. You’ll miss the forest for seeing the trees. The first quarter of a football game doesn’t predict how a whole game plays out. Real-time data can be too close to the action. Sometimes, you need to pull back for the long shot to reveal what’s really going on. Big data is encumbered by a huge amount of white noise. The noise as a proportion of the total signal increases with higher resolution, for example, data by minute rather than by week, or data at a town level rather than state. Do not confuse precision with accuracy. Big Data, in its raw disaggregate form, can be misleading. There needs to be an appropriate level of aggregation for all the white noise to cancel out. So, all those grains of sand need to be aggregated appropriately to make any sense of them. When viewed on a short horizon of time, the left chart suggests that Big Data interest is slowing down, but when looking at the broader picture (or more aggregated view), the message changes.
    10. BIG DATA MYTH #4: Big Data is Good, Clean Data
    11. Big Data is dirty and messy. DISTINCTION: There is a distinction between a lot of data and a lot of good data. Poor quality data has lots of errors, lots of missing data that can be misleading. Big Data is inherently messy and dirty and it takes a smart model or analyst time to make sense of data and clean it. In fact, a major proportion of data has to be thrown away. ANALYZE: To analyze Big Data, one of the first things you have to figure out is what data to include in your analysis, and what you need to throw away. Bad data can lead you off the right track. It can sap countless weeks or months of imputation, definitions and realignment. Identifying and focusing on the most useful data can get you ahead. FOCUS: Analyzing messy data may lead companies to lose focus from being consumer centric to data centric. For example: if a consumer tweets that she is sad because she can’t find a specific pair of Nike shorts she’s been searching for, certain sentiment trackers will plot that as a “negative” proof point, when in actuality this is a brand-loyal customer providing “positive” sentiment for Nike.
    12. BIG DATA MYTH #5: Big Data Means That Analysts Become All-important
    13. Marketers need to be empowered to do their own analyses. Analysts will just become important from a data readiness perspective, but the new age of analytics will still be about marketers having access to sophisticated, fast and easy to use tools that can aid them in decision making. SPEED: The speed and quantity of data means that analysts are becoming enablers. They are helping marketers use smart modeling tools themselves in order to analyze data and make marketing decisions. CLEAN: Although big data requires a whole new set of analyst talents to clean and churn data, this cleaning and churning process actually represents a pre-analytics stage when data can be cleaned to a stage relevant for modeling. WORKING MODEL: The working model will change. No longer will analysts present, run a model and then come back and present again. Work between analysts and marketing leadership will be interactive and ongoing.
    14. BIG DATA MYTH #6: Big Data Gives You Concrete, Black And White Answers
    15. We can never do away with human judgment and context. The more data you have, and the more analyses you run, the more likely you are to have contradictions and ambiguities that require resolution. More data gives you more witnesses, but doesn’t get you closer to the truth until you leverage experienced human judgment to reconcile conflicting evidence. The future of analytics is all about combining, weighing and judging multiple sources of information and different analyses. The role of the analyst, and especially the role of the marketer, is about weighing the evidence. The future is all about evidence-based marketing. Analyzing Big Data to derive marketing insights is just like analyzing small data. The model/analysis results have to be weighed against business objectives and context. The complicated and dirty nature of Big Data makes this task even more important.
    16. BIG DATA MYTH #7: Big Data is A Magic 8-Ball
    17. Well, yes, but you need to ask the question in exactly the right way. WISHES: It’s a bit like when a genie gives you your three wishes. You have to phrase your wishes very carefully. QUESTIONS: Applying analytics with a lack of precision or detailed hypothesis creation in advance, when applied to complex data sets such as cell phone or calling network data, can actually lead you astray and give an incorrect answer. You need to ask your questions very carefully of the “Big Data” crystal ball. INTER-RELATED: We get questions all the time about optimizing SKU mix, or optimizing pricing, or optimizing promotions. Asking the Big Data tools to optimize SKU mix without looking at changing pricing at the same time will give you very wrong answers. If you ask a model these questions “one at a time,” without thinking a bit more deeply about how they are inter-related, you’ll get the wrong answer.
    18. BIG DATA MYTH #8: Big Data Can Create Self-Learning Algorithms
    19. It can, but they’re unreliable. Marketers must be careful about the false insights from rogue data. For example, call center call volume prediction from direct response TV ads can be factually incorrect, just like rogue data. We have seen this recently with a call center, where their optimization models were easily thrown by being too sensitive to the most recent data points. This means that there are quite a lot of limits to the marketing purposes of automated models. Rogue data from a Super Bowl weekend could distort an auto-update algorithm. There are some exceptions, and there are some great examples of auto-analytics. Cell phone operators have demonstrated good use of non-marketing data for marketing. They know who your friends are, they can guess your age, they know the parts of town where you hang out, they know what websites you visit, what apps you use, and when. Insurance companies can use telematics for obtaining data for marketing, not just underwriting.
    20. BIG DATA MYTH #9: Big Data Makes Big Companies All Powerful
    21. Yes, they know a lot. But do they know what to do with their knowledge? It is true that Big Data might mean companies know a lot about their customers. For example, a cell phone company could know what websites you’re looking at, what part of town you’re in and at what time of night. They could also take a stab at guessing your sexual orientation, if you’re pregnant, or if you’re lonely. But the problem arises when we need to aggregate the findings for a single customer to a group towards which marketers can direct their marketing tools. That is not an easy or well defined process. In a lot of cases, Big Data feeds are publicly and easily available. For example, it is easy to look at cell phone usage in a neighborhood using government data. This means that Big Data actually democratizes markets and removes the exclusivity of statisticians and in-house modelers that many big companies are so proud of.
    22. BIG DATA MYTH #10: It’s The Math That Matters
    23. It’s not the math that matters, it’s the people and the process. To make analytics effective, there is a lot of non-math that you need to get right. It’s crucial to have an organizational structure with proper roles and responsibilities, use of tools, and creation of the correct process (e.g. for planning when and how to take a price discount promotion at a fashion retailer). Although evidence-based marketing is replacing guesswork, a decision maker needs to mull over multiple, often conflicting intelligence reports. Even if they conflict, it is better than just guesswork and instinct. The outcomes of different analyses and different data sets will often conflict with, not confirm, one another. The marketer must become more comfortable with understanding the true nature of big data analytics, and being happy to dance with ambiguity.
    24. All 10 of 10.
    25. Big Data MYTHS. #1 BIG DATA IS BIG: Big data is not one big chunk of data, it’s a collection of several different types of data feeds in its entirety that makes it big. #2 BIG DATA ANALYTICS ARE AUTOMATED PROCESSES: The process of enacting and understanding “big data” is a very manual process, built up of many layers. #3 THE MORE GRANULAR THE DATA, THE BETTER: You’ll miss the forest for seeing the trees. #4 BIG DATA IS GOOD, CLEAN DATA: Big data is dirty and messy. #5 BIG DATA MEANS THAT ANALYSTS BECOME ALL-IMPORTANT: Marketers need to be empowered to do their own analyses. #6 BIG DATA GIVES YOU CONCRETE, BLACK AND WHITE ANSWERS: We can never do away with human judgment and context. #7 BIG DATA IS A MAGIC 8-BALL: Well, yes, but you need to ask the question in exactly the right way. #8 BIG DATA CAN CREATE SELF-LEARNING ALGORITHMS: It can, but they’re unreliable. #9 BIG DATA MAKES BIG COMPANIES ALL POWERFUL: Yes, they know a lot. But do they know what to do with their knowledge? #10 IT’S THE MATH THAT MATTERS: It’s not the math that matters, it’s the people and the process.
    26. About Prophet
    27. Prophet is a strategic brand and marketing consultancy. WE HELP CLIENTS WIN BY DEVELOPING INSPIRED AND ACTIONABLE IDEAS
    28. We are uniquely skilled in a full range of capabilities. BRAND: Brand Positioning & Identity; Brand Portfolio & Architecture; Brand Activation & Management; Brand Voice & Naming. DESIGN: Logos, Visual Systems, & Guidelines; Digital Design & Communications; Customer Experience Design; Retail Design & Prototyping. DIGITAL: Digital strategy, audit, & roadmap; Digital customer experience; Digital activation; Digital measurement and effectiveness. MARKETING: Growth Strategy; Value Propositions; Customer Experience Strategy; Marketing Organization & Capabilities. INNOVATION: Ideation & Rapid Concepting; Portfolio & Product Development; Business of the Future; Innovative Organization. INSIGHTS & ANALYTICS: Consumer & Shopper Insights; Pricing, SKU & Promotions Optimization; Marketing Analytics & Mix Modeling; Customer and CRM Analytics.
    29. For more information contact: James Walker, Senior Partner, J_walker@prophet.com. New York: 160 Fifth Avenue, Fifth Floor, New York, NY
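
Slide 19 warns that auto-updating models can be "too sensitive to the most recent data points" and that rogue data from a single weekend could distort them. The sketch below illustrates that point with made-up call volumes; it is not the call center's actual model. The function names `ewma_forecast` and `median_forecast`, the smoothing weight of 0.8, and all the numbers are assumptions chosen for illustration. It compares a forecast that leans heavily on the latest value with one based on the median of the last week.

```python
import statistics


def ewma_forecast(history, alpha):
    """Exponentially weighted forecast; a high alpha leans on the newest points."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def median_forecast(history, window=7):
    """Forecast with the median of the last `window` points, which resists outliers."""
    return statistics.median(history[-window:])


# Hypothetical daily call volumes, followed by one rogue spike (e.g. a big TV event).
normal_calls = [100, 102, 98, 101, 99, 103, 97, 100]
with_spike = normal_calls + [400]

sensitive = ewma_forecast(with_spike, alpha=0.8)
robust = median_forecast(with_spike, window=7)

print(f"recency-heavy forecast after spike: {sensitive:.0f}")
print(f"median-based forecast after spike:  {robust:.0f}")
```

A robust summary is only one possible guard. In line with the deck's argument, a person who knows the spike came from a one-off event is often the better safeguard, since a median would also be slow to react to a genuine change in demand.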