Although analyzing “big data” has the power to transform your business, the ease of doing so has been overstated. In reality, harnessing big data is still a messy, labor-intensive business. We are incredibly excited by what we can do with data, but we also think some of the hype is doing brands a disservice because it creates a false expectation of how easy this work is going to be. Most things in life that are important and worthwhile are difficult, and the analysis of big data is no different. Don’t believe these commonly heard myths…
The Myths of Big Data
1. Proprietary and confidential. Do not distribute.
Prepared for Internal Prophet Team
What is Big Data?
Big Volume
With better hosting and computation capabilities, Big Data is getting bigger and bigger. Companies can now track every single click on every webpage for every visit.
Big Velocity
Velocity refers to the frequency of data generation or data delivery. With sensor and web data coming in real time, the ability to handle velocity is a core feature of Big Data.
Big Variety
Big Data is made up of several data sources that need to be integrated to run useful analytics.
Big Value
The ability to access and utilize Big Data for business advantage.
‘Terabytes’, ‘Transaction Level Data’, ‘Data Warehouse’, ‘Web Data’, ‘Real time data feed’, ‘Streaming Data’, ‘40 ms online ad response’, ‘Twitter’, ‘ETL’, ‘Data Integration’, ‘Blog’, ‘Click stream’, ‘85% Ineffective by 2015’, ‘40% growth’, ‘175Bn by 2015’
BIG DATA MYTH #1
Big Data is Big
Big Data is not one big chunk of data; it is the collection of several different types of data feeds, taken in its entirety, that makes it big.
Mobile / iPad
Future Channels
Transaction Data
Loyalty Card
Price/Promotion
Website Data
Demographics
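As a minimal sketch of what combining such feeds involves, the example below joins three hypothetical feeds (transactions, loyalty cards, demographics) on a shared customer ID. The field names, IDs and values are invented for illustration; real feeds rarely share a clean key, which is part of why this work is labor-intensive.

```python
# Hypothetical feeds keyed by a shared customer ID (all names and values are made up)
transactions = [
    {"customer_id": "c1", "amount": 42.50},
    {"customer_id": "c2", "amount": 13.00},
    {"customer_id": "c1", "amount": 7.25},
]
loyalty_cards = {"c1": {"tier": "gold"}, "c2": {"tier": "basic"}}
demographics = {"c1": {"age_band": "25-34"}}  # c2 is missing, as often happens


def integrate(transactions, loyalty_cards, demographics):
    """Combine the feeds into one record per customer."""
    combined = {}
    # Roll transaction-level rows up to one spend total per customer
    for tx in transactions:
        cid = tx["customer_id"]
        record = combined.setdefault(
            cid, {"total_spend": 0.0, "tier": None, "age_band": None}
        )
        record["total_spend"] += tx["amount"]
    # Attach attributes from the other feeds; gaps stay as None
    for cid, record in combined.items():
        record["tier"] = loyalty_cards.get(cid, {}).get("tier")
        record["age_band"] = demographics.get(cid, {}).get("age_band")
    return combined


customers = integrate(transactions, loyalty_cards, demographics)
for cid, record in customers.items():
    print(cid, record)
```

Even in this toy case one customer ends up with a missing attribute, so the integrated view already needs a decision about how to treat gaps.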
BIG DATA MYTH #2
Big Data Analytics Are Automated Processes
The process of enacting and understanding “big data” is a very manual process, built up of many layers.
“We make a change to our search algorithms at least once a day but these are manual updates”
- Matt Cutts, Google Web Spam Team
Big Data analytics requires a lot of “dirty” data cleaning, handling and modeling.
A complete Big Data project often involves unstructured data such as flat files sitting on someone’s laptop.
The very nature of Big Data makes it difficult to have standardized, automated processes for all clients.
Big Data projects are just like other analytics projects. They have a beginning, a middle and an end.
The impression that Big Data analytics is a form of super-intelligent, AI-based analytics is false.
The majority of time in Big Data analytics is taken up by data scientists cleaning up messy data. Advanced modeling and statistics account for a very small percentage.
The pre-modeling and analysis work in any Big Data project makes it difficult to achieve automation.
BIG DATA MYTH #3
The More Granular The Data, The Better
You’ll miss the forest for seeing the trees.
The first quarter of a football game doesn’t predict how the whole game plays out. Real-time data can be too close to the action. Sometimes you need to pull back for the long shot to reveal what’s really going on.
Big Data is encumbered by a huge amount of white noise.
Noise as a proportion of the total signal increases with resolution: for example, data by minute rather than by week, or data at town level rather than state level.
Do not confuse precision with accuracy. Big Data in its raw, disaggregated form can be misleading.
There needs to be an appropriate level of aggregation for the white noise to cancel out. All those grains of sand need to be aggregated appropriately before you can make any sense of them.
Viewed over a short time horizon, the left chart suggests that interest in Big Data is slowing down, but the broader (more aggregated) picture changes the message.
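The point about noise cancelling out under aggregation can be illustrated with a small simulation. This is a sketch under stated assumptions, not real client data: we invent a constant true signal of 100 units per hour, add random white noise, and compare the relative noise (standard deviation divided by mean) at hourly and weekly resolution. All numbers and function names are hypothetical.

```python
import random
import statistics


def relative_noise(values):
    """Standard deviation as a fraction of the mean (coefficient of variation)."""
    return statistics.stdev(values) / statistics.mean(values)


def aggregate(values, block_size):
    """Average consecutive, non-overlapping blocks of `block_size` readings."""
    return [
        statistics.mean(values[i:i + block_size])
        for i in range(0, len(values) - block_size + 1, block_size)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
HOURS_PER_WEEK = 24 * 7

# Hypothetical data: a true signal of 100 units per hour plus white noise
hourly = [100 + rng.gauss(0, 50) for _ in range(HOURS_PER_WEEK * 52)]
weekly = aggregate(hourly, HOURS_PER_WEEK)

print(f"hourly relative noise: {relative_noise(hourly):.3f}")
print(f"weekly relative noise: {relative_noise(weekly):.3f}")
```

The underlying signal is identical in both views. Only the level of aggregation changes, and with it how much of what you see is noise.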
BIG DATA MYTH #4
Big Data is Good, Clean Data
Big Data is dirty and messy
DISTINCTION
There is a distinction between a lot of data and a lot of good data. Poor-quality data has lots of errors and lots of missing values, and it can be misleading. Big Data is inherently messy and dirty, and it takes a smart model or an analyst’s time to make sense of it and clean it. In fact, a major proportion of the data has to be thrown away.
ANALYZE
To analyze Big Data, one of the first things you have to figure out is what data to include in your analysis and what you need to throw away. Bad data can lead you off the right track. It can consume countless weeks or months of imputation, definitions and realignment. Identifying and focusing on the most useful data can get you ahead.
FOCUS
Analyzing messy data may lead companies to shift their focus from being consumer-centric to being data-centric. For example, if a consumer tweets that she is sad because she can’t find a specific pair of Nike shorts she’s been searching for, certain sentiment trackers will plot that as a “negative” proof point, when in actuality this is a brand-loyal customer providing “positive” sentiment for Nike.
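To see how a tracker can get this wrong, here is a deliberately naive keyword-based scorer. It is a hypothetical illustration, not how any particular sentiment product works; the word list and the tweet text are invented.

```python
# Hypothetical lexicon of "negative" words used by a naive tracker
NEGATIVE_WORDS = {"sad", "angry", "hate", "disappointed"}


def naive_sentiment(text):
    """Label a text 'negative' if it contains any word from the lexicon."""
    words = {w.strip(".,!?'\"").lower() for w in text.split()}
    return "negative" if words & NEGATIVE_WORDS else "neutral"


tweet = "So sad, I can't find the Nike shorts I've been searching for!"
print(naive_sentiment(tweet))  # prints "negative"
```

The scorer reacts to the word “sad” and has no way of knowing that the customer is actively looking for the product. That context has to come from human judgment or a far more careful model.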
BIG DATA MYTH #5
Big Data Means That Analysts Become All-important
Marketers need to be empowered to do their own analyses
Analysts will become important mainly from a data-readiness perspective, but the new age of analytics will still be about marketers having access to sophisticated, fast and easy-to-use tools that can aid them in decision making.
SPEED
The speed and quantity of data means that analysts are becoming enablers. They are helping marketers use smart modeling tools themselves in order to analyze data and make marketing decisions.
CLEAN
Although Big Data requires a whole new set of analyst talents to clean and churn data, this cleaning and churning actually represents a pre-analytics stage, in which data is brought to a state suitable for modeling.
WORKING MODEL
The working model will change. No longer will analysts present, run a model and then come back and present again. Work between analysts and marketing leadership will be interactive and ongoing.
BIG DATA MYTH #6
Big Data Gives You Concrete, Black And White Answers
We can never do away with human judgment and context
1. The more data you have, and the more analyses you run, the more likely you are to have contradictions and ambiguities that require resolution. More data gives you more witnesses, but doesn’t get you closer to the truth until you leverage experienced human judgment to reconcile conflicting evidence.
2. The future of analytics is all about combining, weighing and judging multiple sources of information and different analyses.
3. The role of the analyst, and especially the role of the marketer, is about weighing the evidence. The future is all about evidence-based marketing.
Analyzing Big Data to derive marketing insights is just like analyzing small data. The model and analysis results have to be weighed against business objectives and context. The complicated and dirty nature of Big Data makes this task even more important.
BIG DATA MYTH #7
Big Data is A Magic 8-Ball
Well, yes, but you need to ask the question in exactly the right way.
WISHES
It’s a bit like when a genie gives you your three wishes. You have to phrase your wishes very carefully.
QUESTIONS
Applying analytics to complex data sets, such as cell phone or calling-network data, without precision or detailed hypotheses formed in advance can lead you astray and give an incorrect answer. You need to ask your questions very carefully of the “Big Data” crystal ball.
INTER-RELATED
We get questions all the time about optimizing SKU mix, or optimizing pricing, or optimizing promotions. Asking Big Data tools to optimize SKU mix without looking at changing pricing at the same time will give you very wrong answers. If you ask a model these questions “one at a time,” without thinking a bit more deeply about how they are inter-related, you’ll get the wrong answer.
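A toy example makes the “one at a time” trap concrete. The profit function below is entirely made up: it simply includes an interaction term so that the best number of SKUs depends on the price. It is a sketch of the general idea, not a real pricing or assortment model.

```python
def profit(price, sku_count):
    """Toy profit surface with a price x SKU interaction term (purely illustrative)."""
    dp, ds = price - 10, sku_count - 5
    return -dp**2 - ds**2 + 1.5 * dp * ds


PRICES = range(1, 21)
SKU_COUNTS = range(1, 11)
CURRENT_PRICE = 6  # hypothetical price in force today

# One question at a time: optimize SKU mix while holding today's price fixed
skus_alone = max(SKU_COUNTS, key=lambda s: profit(CURRENT_PRICE, s))

# Both questions together: search price and SKU mix jointly
best_price, best_skus = max(
    ((p, s) for p in PRICES for s in SKU_COUNTS),
    key=lambda ps: profit(*ps),
)

print("SKU-only answer:", skus_alone, "profit", profit(CURRENT_PRICE, skus_alone))
print("Joint answer:", (best_price, best_skus), "profit", profit(best_price, best_skus))
```

Under these assumptions the SKU-only question recommends a much smaller range than the joint search does, because it never considers that the price itself should move.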
BIG DATA MYTH #8
Big Data Can Create Self-Learning Algorithms
It can, but they’re unreliable.
1. Marketers must be careful about false insights from rogue data. For example, predictions of call center volume from direct response TV ads can be simply wrong when rogue data gets in. We have seen this recently with a call center whose optimization models were easily thrown by being too sensitive to the most recent data points.
2. This means that there are quite a lot of limits to the marketing uses of automated models. Rogue data from a Super Bowl weekend could distort an auto-update algorithm.
3. There are some exceptions, and there are some great examples of auto-analytics. Cell phone operators have demonstrated good use of non-marketing data for marketing. They know who your friends are, they can guess your age, they know the parts of town where you hang out, they know what websites you visit, what apps you use, and when. Insurance companies can use telematics to obtain data for marketing, not just underwriting.
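The sensitivity to recent data points can be sketched in a few lines. The example assumes a hypothetical series of daily call volumes ending in one Super Bowl weekend spike, and compares an auto-updating forecast (simple exponential smoothing with a high weight on the latest point) with a median of the recent window. The median is shown only as one possible robust alternative, not as the method used in the call center case above.

```python
import statistics


def exponential_forecast(history, alpha):
    """Simple exponential smoothing: each new point pulls the level by `alpha`."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def median_forecast(history, window):
    """Median of the last `window` points; a single outlier barely moves it."""
    return statistics.median(history[-window:])


# Hypothetical daily call volumes; the last value is a Super Bowl weekend spike
calls = [100, 102, 98, 101, 99, 100, 500]

reactive = exponential_forecast(calls, alpha=0.8)
robust = median_forecast(calls, window=7)
print(f"auto-updating forecast: {reactive:.0f} calls")
print(f"median-based forecast:  {robust:.0f} calls")
```

A model that updates itself this eagerly would staff the call center for several times the normal volume on the strength of a single unusual day, which is why someone still needs to review what an automated model has learned.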
BIG DATA MYTH #9
Big Data Makes Big Companies All Powerful
Yes, they know a lot. But do they know what to do with their knowledge?
It is true that Big Data might mean companies know a lot about their customers. For example, a cell phone company could know what websites you’re looking at, what part of town you’re in and at what time of night. They could also take a stab at guessing your sexual orientation, whether you’re pregnant, or whether you’re lonely.
But the problem arises when we need to aggregate findings about single customers into groups at which marketers can direct their marketing tools. That is not an easy or well-defined process.
In a lot of cases, Big Data feeds are publicly and easily available. For example, it is easy to look at cell phone usage in a neighborhood using government data.
This means that Big Data actually democratizes markets and removes the exclusivity of the statisticians and in-house modelers that many big companies are so proud of.
BIG DATA MYTH #10
It’s The Math That Matters
It’s not the math that matters, it’s the people and the process
1. To make analytics effective, there is a lot of non-math that you need to get right. It’s crucial to have an organizational structure with proper roles and responsibilities, the right use of tools, and the correct process (e.g. for planning when and how to take a price discount promotion at a fashion retailer).
2. Although evidence-based marketing is replacing guesswork, a decision maker needs to mull over multiple, often conflicting intelligence reports. Even if they conflict, it is better than just guesswork and instinct.
3. The outcomes of different analyses and different data sets will often conflict with, not confirm, one another. The marketer must become more comfortable with the true nature of Big Data analytics, and be happy to dance with ambiguity.
All 10 of 10.
#1 BIG DATA IS BIG
Big Data is not one big chunk of data; it is the collection of several different types of data feeds, taken in its entirety, that makes it big.
#2 BIG DATA ANALYTICS ARE AUTOMATED PROCESSES
The process of enacting and understanding “big data” is a very manual process, built up of many layers.
#3 THE MORE GRANULAR THE DATA, THE BETTER
You’ll miss the forest for seeing the trees.
#4 BIG DATA IS GOOD, CLEAN DATA
Big Data is dirty and messy.
#5 BIG DATA MEANS THAT ANALYSTS BECOME ALL-IMPORTANT
Marketers need to be empowered to do their own analyses.
#6 BIG DATA GIVES YOU CONCRETE, BLACK AND WHITE ANSWERS
We can never do away with human judgment and context.
#7 BIG DATA IS A MAGIC 8-BALL
Well, yes, but you need to ask the question in exactly the right way.
#8 BIG DATA CAN CREATE SELF-LEARNING ALGORITHMS
It can, but they’re unreliable.
#9 BIG DATA MAKES BIG COMPANIES ALL POWERFUL
Yes, they know a lot. But do they know what to do with their knowledge?
#10 IT’S THE MATH THAT MATTERS
It’s not the math that matters, it’s the people and the process.
Big Data MYTHS
Prophet is a strategic brand and marketing consultancy
WE HELP CLIENTS WIN BY DEVELOPING
INSPIRED AND ACTIONABLE IDEAS
BRAND
Brand Positioning & Identity
Brand Portfolio & Architecture
Brand Activation & Management
Brand Voice & Naming
DESIGN
Logos, Visual Systems, & Guidelines
Digital Design & Communications
Customer Experience Design
Retail Design & Prototyping
DIGITAL
Digital strategy, audit, & roadmap
Digital customer experience
Digital activation
Digital measurement and effectiveness
MARKETING
Growth Strategy
Value Propositions
Customer Experience Strategy
Marketing Organization & Capabilities
INNOVATION
Ideation & Rapid Concepting
Portfolio & Product Development
Business of the Future
Innovative Organization
INSIGHTS & ANALYTICS
Consumer & Shopper Insights
Pricing, SKU & Promotions Optimization
Marketing Analytics & Mix Modeling
Customer and CRM Analytics
We are uniquely skilled in a full range of capabilities
For more information contact:
James Walker
Senior Partner
J_walker@prophet.com
New York
160 Fifth Avenue
Fifth Floor
New York, NY 10010
www.prophet.com
Editor's Notes
Well, the first thing is… Big Data isn’t big. Not only is “Big Data” poor English, it is also very misleading. What we’re talking about is a large volume of data points, updated at high frequency, with a short lag to the actual event (real or near-real time).
But it’s very granular, i.e. small, individual data. It’s individual transaction data: a certain credit card, paying for a certain amount of gas, at a certain gas station. Big Data is actually lots and lots of very small data. So much SMALL DATA actually lies at the very heart of the Big Data OPPORTUNITY and CHALLENGE.
Big Data Myth #2: Big Data analytics is an automated process! We hear “real-time analytics” a lot, when in reality model-building and analytics are anything but real time. It’s a messy, manual business of getting data aligned, pictures tagged correctly and so on, and it happens episodically to update a model, not in “real time”.
True picture of searches for BIG DATA, if you take a slightly longer-term view. Disaggregated, real-time data is not always good for MARKETERS. One data point about a guy at a gas station convenience store buying nappies and beer together does not necessarily mean a marketer should use that to invent a great new co-promotion. Disaggregated, real-time data can be really misleading. Disaggregating across regions, store types and so on CAN give more granular data, but can make for very noisy data, where the noise drowns the signal.
Indeed, real time is not as good as tracking changes over time. The interesting bit of analytics is looking at changes: which customers are now buying different products than before?
The other interesting bit of analytics is predicting, and we need to look at a degree of history to make predictions about the future.
So, what Big Data analytics means for marketers is that they have to learn to be patient and get analyses more slowly than the Big Data hype suggests. Most of all, more data means more conflicting evidence, so they have to get really good at analysing the analytics and weighing up the evidence. This is a new skill set they need to develop and nurture, and one that their partners in research and consulting can help them with. This requires Analytics with a HUMAN EDGE.
Thank you for listening. Any questions?