How to Avoid the 10 Big Data Blunders - Best Practices for Success in 2021
Speakers
Dr. Michael Stonebraker, Co-Founder, Tamr
Anthony Deighton, Chief Product Officer, Tamr
Blunder #1
Not Planning to Move Most EVERYTHING to the Cloud
It may take a decade, but it is the right thing to do
● Dewitt vignette
● Hamilton vignette
● Elasticity!!!
● Data will move more easily than applications -- decision support first
YABUT...
Security
● Cloud security is likely better than yours
● Misconfiguration, rogue employees
Cost
● Likely that you are cheating
Geographic Restrictions
● Cloud guys respect this
Legal Restrictions
● Hopefully a short-term problem
Other Restrictions
● Your CEO doesn’t approve (see item 11 to come)
YABUT...
Where does App run?
● Decision support: move the app
● Other stuff:
○ Start with local deployment; move to remote data (SLOWLY!!!)
○ Migrate to cloud-native as you have resources, starting with the most costly ones
○ This may be a lot of work and may take a decade or more
○ Issue is legacy code/hardware
Blunder #2
Not Planning for AI/ML to be Disruptive
ML (whether deep or conventional) is getting much better
● Will displace workers with easy-to-explain jobs
● Think autonomous vehicles, automatic checkout, drone delivery, actuarial calculations
Likely to be disruptive
● You can be a disruptor or get disrupted - Your choice
● Think Uber/Lyft or taxis
So what to do?
Pay up to get some AI/ML experts
● They are in short supply and very expensive
● Don’t contract this out (See Blunder #8)
Get going on the coming arms race
● You will be a winner or a loser in a winner-take-all sweepstakes
Blunder #3
Not Solving your REAL Data Science Problem
Typical data scientist spends 90+% of their time on data discovery, data integration and data cleaning
● iRobot vignette
● Merck vignette
Nobody quotes less than 80%!!!
● Without clean data ML is worthless!!!
○ More accurately without “clean enough” data, ML is worthless
Obvious directive: Get a strategy in place to do this
● Start by giving Chief Data Officer (CDO) read access to ALL enterprise data!
Blunder #4
Belief that Traditional Data Integration
Techniques Will Solve Issue #3
Extract, Transform and Load (ETL)
(Available from a variety of vendors)
Master Data Management (MDM)
(Also available from the usual suspects)
ETL
What’s attempted:
● Decide what data sources to integrate (top down)
● Build a global data model (up front)
● For each data source
○ Send a programmer to interview the data set owner
○ The programmer then builds an extractor and data cleaning routines (in a proprietary scripting language)
○ And loads data into the global schema
Why it doesn’t work:
● I have never seen this technique work for more than 20 data sources
○ Too human intensive
● Building a global schema upfront is way too difficult at scale
○ Remember enterprise wide data models from 15-20 years ago...
● Most enterprises I know have way more than 20 data sources
○ Merck has 4000+/- Oracle databases
○ A data lake
○ Countless files
○ And data from the web is also important
MDM
● Once you have run ETL, you need “match/merge”
● MDM suggests building “golden records” by
○ Implementing match rules (e.g. two entities are the same if they have the same address)
○ Implementing merge rules (e.g. take the most recent value and ignore older ones)
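To make the two kinds of rules concrete, here is a minimal Python sketch using the slide's own examples: match on identical address, merge by taking the most recent value. The record layout (`name`, `address`, `phone`, `updated`) and the sample data are hypothetical, and real MDM products express rules in their own languages.

```python
from collections import defaultdict

def normalize_address(address):
    """Crude normalization so trivial formatting differences still match."""
    return " ".join(address.lower().replace(",", " ").split())

def match(records):
    """Match rule: two records are the same entity if their addresses agree."""
    clusters = defaultdict(list)
    for record in records:
        clusters[normalize_address(record["address"])].append(record)
    return list(clusters.values())

def merge(cluster):
    """Merge rule: for each field, keep the most recently updated non-empty value."""
    golden = {}
    # ISO dates sort chronologically as strings, so later records come last.
    for record in sorted(cluster, key=lambda r: r["updated"]):
        for field, value in record.items():
            if value not in (None, ""):
                golden[field] = value  # later records overwrite earlier ones
    return golden

# Hypothetical sample data: the first two rows describe the same company.
records = [
    {"name": "Acme Corp", "address": "1 Main St, Boston", "phone": "", "updated": "2019-03-01"},
    {"name": "ACME Corporation", "address": "1 main st boston", "phone": "555-0100", "updated": "2020-06-15"},
    {"name": "Widget Inc", "address": "9 Elm Rd, Austin", "phone": "555-0199", "updated": "2020-01-10"},
]
golden_records = [merge(cluster) for cluster in match(records)]
```

Each rule is short, but every new formatting quirk or exception needs another one, which is where the human effort goes.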
Doesn’t Scale!
● GE classification problem: 20M spend transactions to be classified into a pre-built hierarchy
● 500 rules classified only 10% of the spend transactions
So what to do?
At scale, you need a solution that leverages ML and statistics
● OK to use rules to generate training data
● That’s what Tamr did on the GE problem
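One way to read "use rules to generate training data" is the following Python sketch: a handful of keyword rules label the transactions they cover, and a simple classifier trained on those labels handles the rest. Everything here is an assumption for illustration (the categories, the keywords, the sample descriptions, and the choice of a tiny naive Bayes model); it is not a description of Tamr's actual system.

```python
import math
from collections import Counter, defaultdict

# Hypothetical keyword rules: precise, but they cover only some transactions.
RULES = {"laptop": "IT", "server": "IT", "flight": "Travel", "hotel": "Travel"}

def label_with_rules(description):
    """Return a category if any rule keyword appears, else None."""
    for token in description.lower().split():
        if token in RULES:
            return RULES[token]
    return None

def train(labeled):
    """Fit a tiny multinomial naive Bayes model on (description, category) pairs."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for description, category in labeled:
        class_counts[category] += 1
        word_counts[category].update(description.lower().split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary

def predict(model, description):
    """Pick the category with the highest log-probability (add-one smoothing)."""
    word_counts, class_counts, vocabulary = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for category in class_counts:
        score = math.log(class_counts[category] / total)
        denominator = sum(word_counts[category].values()) + len(vocabulary)
        for token in description.lower().split():
            score += math.log((word_counts[category][token] + 1) / denominator)
        if score > best_score:
            best, best_score = category, score
    return best

# Hypothetical spend descriptions; the last two contain no rule keyword.
transactions = [
    "dell laptop docking station",
    "rack server memory upgrade",
    "flight boston to chicago airfare",
    "hotel chicago two nights lodging",
    "docking station memory upgrade",
    "chicago airfare lodging",
]
# Step 1: rules produce training labels for the transactions they cover.
labeled = [(t, label_with_rules(t)) for t in transactions if label_with_rules(t)]
# Step 2: the model trained on those labels classifies what the rules missed.
model = train(labeled)
unlabeled = [t for t in transactions if label_with_rules(t) is None]
predictions = {t: predict(model, t) for t in unlabeled}
```

The rules never mention "docking" or "airfare", but the model picks those words up from the rule-labeled examples, so coverage extends beyond what the rules alone reach.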
Blunder #5
Belief that Data Warehouses will Solve all your Problems
Data warehouses are good at customer-facing structured data FROM A FEW DATA SOURCES
● But not text, images, video, …
● Use the technology for what it is good for
○ Do not perform unnatural acts!
○ And get rid of the “high price spread”, if you bought into it
○ And remember that your warehouse will move to the cloud (see Blunder #1)
Blunder #6
Belief that Hadoop/Spark will Solve all your Problems
● Hadoop/Spark is not very good at anything
○ E.g. Spark/SQL is not competitive (but getting better)
○ E.g. Spark/Streaming is not competitive (last time I looked)
● Use “best of breed” not “lowest common denominator” -- at least for your “secret sauce”
○ This is a universal blunder -- desire to use only one vendor
○ Hadoop/Spark is not very good at anything
● And…
○ Spark/Hadoop is useless on Blunders #3 and #4 (i.e. data integration)
So what to do with your Hadoop/Spark cluster?
● Repurpose it for a Data Lake
● Repurpose it for Data Integration
● Throw it Away
○ Hardware lifetime is 3 years (maybe)
○ Remember Blunder #1
Blunder #7
Belief that Data Lakes will Solve all your Problems
Conventional Wisdom
Just load all your data into a “data lake” and you will be able to correlate all data sets
Important Fact (Tattoo this on your Brain):
Independently constructed data sets are never “plug compatible”
Why?
● Schemas don’t match
○ You call it salary; I call it wages
● Units don’t match
○ You use Euros; I use $$$
● Semantics don’t match
○ My salaries are gross before taxes; yours are net after taxes with a lunch allowance
● Time granularity doesn't match
○ You have annual data; I have monthly data
● Data is dirty
○ 99 means null (sometimes)
○ Null means “data missing” or “data not allowed” or...
● Duplicates must be removed
○ And there are no keys
○ I am Mike Stonebraker in one data set; M.R. Stonebreaker in a second one
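A minimal Python sketch of what reconciling just two such data sets involves, following the examples in the list. All specifics are assumptions for illustration: the field names, the fixed EUR-to-USD rate, treating 99 as the null marker in every row, and the 0.8 similarity threshold. Note what it cannot do: no mechanical step converts net-after-tax pay to gross, which is the semantic mismatch in the list.

```python
from difflib import SequenceMatcher

EUR_TO_USD = 1.2      # assumed fixed rate, for illustration only
NULL_SENTINEL = 99    # assumed: this source always uses 99 to mean "missing"

def from_source_a(row):
    """Source A: field 'salary', annual, in US dollars."""
    return {"name": row["name"], "annual_usd": float(row["salary"])}

def from_source_b(row):
    """Source B: field 'wages', monthly, in euros, with 99 as a null marker."""
    wages = row["wages"]
    if wages == NULL_SENTINEL:
        annual_usd = None                       # dirty data: sentinel means null
    else:
        annual_usd = wages * 12 * EUR_TO_USD    # fix granularity, then units
    return {"name": row["name"], "annual_usd": annual_usd}

def same_person(name_a, name_b, threshold=0.8):
    """No shared key exists, so fall back to fuzzy matching on last names."""
    last_a = name_a.lower().split()[-1]
    last_b = name_b.lower().split()[-1]
    return SequenceMatcher(None, last_a, last_b).ratio() >= threshold

# Hypothetical rows from the two sources.
a = from_source_a({"name": "Mike Stonebraker", "salary": "120000"})
b = from_source_b({"name": "M.R. Stonebreaker", "wages": 8000})
c = from_source_b({"name": "Jane Doe", "wages": 99})

# The two spellings are similar enough to be flagged as a likely duplicate.
is_duplicate = same_person(a["name"], b["name"])
```

Even this toy version needed a separate decision for every bullet, and a fuzzy match like `same_person` will produce both false matches and misses on real data.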
The Net Result
● Your analytics will be garbage
○ “GIGO”
● Your ML models will fail
○ I.e. produce garbage
○ Again “GIGO”
So what to do?
● You don’t have a data lake; you have a data swamp
● Need a data curation system
○ Which will solve the aforementioned problems
○ And this will not be trivial!!
● Traditional technology likely to fail (See Blunder #4)
● This is an 800 pound gorilla
○ Make sure you put your best people on it!!!!
○ Chances are your in-house solution is crap
○ Use modern technology (from startups) not your “home brew”
● If you want the best technology, you have to deal with startups!!!!
Blunder #8
Outsourcing your new stuff to Palantir, IBM, Mu Sigma
● Typical enterprise spends 95% of its IT resources keeping current (legacy) code running
○ i.e. Maintenance
○ Most are dug in pretty deep
○ Often have the best people “keeping the lights on”
● “Shiny new stuff” gets outsourced
○ Often because there is no appropriate talent internally
This is a Catch-22
● Your maintenance is boring!
○ So creative people quit
○ So there is no good talent to work on the new stuff
○ And you can’t hire great talent (Takes great people to hire great people)
● Your new stuff is your “secret sauce” over the next decade or so…
○ Please don’t outsource it. This is long-term suicide
○ Instead outsource the diddly-crap (e-mail et al.)
○ Software is your secret sauce -- invest in your own people
So what to do?
1. Start by solving Blunder #2
(Not planning for AI/ML to change most everything)
2. Outsource the boring maintenance
3. Cancel the Palantir contract
Blunder #9
Succumbing to the “Innovator’s Dilemma”
● Must-read book by Clayton Christensen
● Steam shovel example
○ Cable steam shovels - big payload
○ Hydraulics - much safer, but low payload
● Used for “small jobs”
○ Payloads increased and hydraulics won
○ Cable guys went out of business
Net-Net
● Have to be willing to give up your current business model
● And reinvent yourself
● Possibly losing some current customers in the process
○ Otherwise, you go out of business in the long run
○ Taxi licenses in Cambridge have gone from $700k to $10k
Blunder #10
Not Paying Up for a Few “Rocket Scientists”
● They will be your guiding light to avoiding these blunders
● They will be “off scale”
○ Your HR folks won’t like what you have to pay
● Chances are they will be weird
○ E.g. no shoes, no socks, no tie, feet on the table, ...
● Please don’t drive them away!
○ As Citibank did to one of my Berkeley students a while ago
Blunder #11 (Bonus)
Working for a Company That is not Trying to
do Something about the “Sins of the Past”
If you work for a company that is succumbing to even one of these blunders, then:
1. You should be fixing it
a. Be part of the solution, not part of the problem
2. Or looking for a new employer
a. Tamr is hiring!
Questions?
Thank You!
To learn more about Tamr visit tamr.com
You’ll receive the 10 Big Data Analytics Blunders Infographic via email.

MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

More from DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Recently uploaded

Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationBoston Institute of Analytics
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...shivangimorya083
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 

Recently uploaded (20)

Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project Presentation
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
Full night 🥵 Call Girls Delhi New Friends Colony {9711199171} Sanya Reddy ✌️o...
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 

Slides: How to Avoid the 10 Big Data Analytics Blunders — Best Practices for Success in 2021

  • 1. How to Avoid the 10 Big Data Blunders - Best Practices for Success in 2021
  • 2. Speakers
      ● Dr. Michael Stonebraker, Co-Founder, Tamr
      ● Anthony Deighton, Chief Product Officer, Tamr
  • 3. Blunder #1: Not Planning to Move Most EVERYTHING to the Cloud
  • 4. It may take a decade, but it is the right thing to do
      ● Dewitt vignette
      ● Hamilton vignette
      ● Elasticity!!!
      ● Data will move easier than applications -- decision support first
  • 5. YABUT...
      Security
      ● Cloud security is likely better than yours
      ● Misconfiguration, rogue employees
      Cost
      ● Likely that you are cheating
      Geographic Restrictions
      ● Cloud guys respect this
      Legal Restrictions
      ● Hopefully a short term problem
      Other Restrictions
      ● Your CEO doesn’t approve (see item 11 to come)
  • 6. YABUT... Where does App run?
      ● Decision support: move the app
      ● Other stuff:
          ○ Start with local deployment; move to remote data (SLOWLY!!!)
          ○ Migrate to cloud-native as you have resources, starting with the most costly ones
          ○ This may be a lot of work and may take a decade or more
          ○ Issue is legacy code/hardware
  • 7. Blunder #2: Not Planning for AI/ML to be Disruptive
  • 8. Blunder #2: Not Planning for AI/ML to be Disruptive
      ML (whether deep or conventional) is getting much better
      ● Will displace workers with easy-to-explain jobs
      ● Think autonomous vehicles, automatic checkout, drone delivery, actuary calculations
      Likely to be disruptive
      ● You can be a disruptor or get disrupted - Your choice
      ● Think Uber/Lyft or taxis
  • 9. So what to do?
      Pay up to get some AI/ML experts
      ● They are in short supply and very expensive
      ● Don’t contract this out (See Blunder #8)
      Get going on the coming arms race
      ● You will be a winner or a loser in a winner-take-all sweepstakes
  • 10. Blunder #3: Not Solving your REAL Data Science Problem
  • 11. Blunder #3: Not Solving your REAL Data Science Problem
      Typical data scientist spends 90+% of his/her time on data discovery, data integration and data cleaning
      ● iRobot vignette
      ● Merck vignette
      Nobody quotes less than 80%!!!
      ● Without clean data ML is worthless!!!
          ○ More accurately, without “clean enough” data, ML is worthless
      Obvious directive: Get a strategy in place to do this
      ● Start by giving Chief Data Officer (CDO) read access to ALL enterprise data!
  • 12. Blunder #4: Belief that Traditional Data Integration Techniques Will Solve Issue #3
  • 13. Blunder #4: Belief that Traditional Data Integration Techniques Will Solve Issue #3
      ● Extract, Transform, and Load (ETL) (Available from a variety of vendors)
      ● Master Data Management (MDM) (Also available from the usual suspects)
  • 14. ETL
      What’s attempted:
      ● Decide what data sources to integrate (top down)
      ● Build a global data model (up front)
      ● For each data source
          ○ Send a programmer to interview the data set owner
          ○ He then builds an extractor, data cleaning routines (in a proprietary scripting language)
          ○ And loads data into the global schema
      Why it doesn’t work:
      ● I have never seen this technique work for more than 20 data sources
          ○ Too human intensive
      ● Building a global schema upfront is way too difficult at scale
          ○ Remember enterprise wide data models from 15-20 years ago...
      ● Most enterprises I know have way more than 20 data sources
          ○ Merck has 4000+/- Oracle databases
          ○ A data lake
          ○ Countless files
          ○ And data from the web is also important
  • 15. MDM
      ● Once you have run ETL, you need “match/merge”
      ● MDM suggests building “golden records” by
          ○ Implementing match rules (e.g. two entities are the same if they have the same address)
          ○ Implementing merge rules (e.g. take the most recent value and ignore older ones)
      Doesn’t Scale!
      ● GE classification problem: 20M spend transactions to be classified into a pre-built hierarchy
      ● 500 rules classified only 10% of the spend transactions
  • 16. So what to do?
      At scale, you need a solution that leverages ML and statistics
      ● OK to use rules to generate training data
      ● That’s what Tamr did on the GE problem
  • 17. Blunder #5: Belief that Data Warehouses will Solve all your Problems
  • 18. Blunder #5: Belief that Data Warehouses will Solve all your Problems
      Data warehouses are good at customer facing structured data FROM A FEW DATA SOURCES
      ● But not text, images, video, …
      ● Use the technology for what it is good for
          ○ Do not perform unnatural acts!
          ○ And get rid of the “high price spread”, if you bought into it
          ○ And remember that your warehouse will move to the cloud (see Blunder #1)
  • 19. Blunder #6: Belief that Hadoop/Spark will Solve all your Problems
  • 20. Blunder #6: Belief that Hadoop/Spark will Solve all your Problems
      ● Hadoop/Spark is not very good at anything
          ○ E.g. Spark/SQL is not competitive (but getting better)
          ○ E.g. Spark/Streaming is not competitive (last time I looked)
      ● Use “best of breed” not “lowest common denominator” -- at least for your “secret sauce”
          ○ This is a universal blunder -- desire to use only one vendor
          ○ Hadoop/Spark is not very good at anything
      ● And…
          ○ Spark/Hadoop is useless on Blunders #3 and #4 (i.e. data integration)
  • 21. So what to do with your Hadoop/Spark cluster?
      ● Repurpose it for a Data Lake
      ● Repurpose it for Data Integration
      ● Throw it Away
          ○ Hardware lifetime is 3 years (maybe)
          ○ Remember Blunder #1
  • 22. Blunder #7: Belief that Data Lakes will Solve all your Problems
  • 23. Blunder #7: Belief that Data Lakes will Solve all your Problems
      Conventional Wisdom: Just load all your data into a “data lake” and you will be able to correlate all data sets
      Important Fact (Tattoo this on your Brain): Independently constructed data sets are never “plug compatible”
  • 24. Why?
      ● Schemas don’t match
          ○ You call it salary; I call it wages
      ● Units don’t match
          ○ You use Euros; I use $$$
      ● Semantics don’t match
          ○ My salaries are gross before taxes; yours are net after taxes with a lunch allowance
      ● Time granularity doesn't match
          ○ You have annual data; I have monthly data
      ● Data is dirty
          ○ 99 means null (sometimes)
          ○ Null means “data missing” or “data not allowed” or...
      ● Duplicates must be removed
          ○ And there are no keys
          ○ I am Mike Stonebraker in one data set; M.R. Stonebreaker in a second one
  • 25. The Net Result
      ● Your analytics will be garbage
          ○ “GIGO”
      ● Your ML models will fail
          ○ I.e. produce garbage
          ○ Again “GIGO”
  • 26. So what to do?
      ● You don’t have a data lake; you have a data swamp
      ● Need a data curation system
          ○ Which will solve the aforementioned problems
          ○ And this will not be trivial!!
      ● Traditional technology likely to fail (See Blunder #4)
      ● This is an 800 pound gorilla
          ○ Make sure you put your best people on it!!!!
          ○ Chances are your in-house solution is crap
          ○ Use modern technology (from startups) not your “home brew”
      ● If you want the best technology, you have to deal with startups!!!!
  • 27. Blunder #8: Outsourcing your new stuff to Palantir, IBM, Mu Sigma
  • 28. Blunder #8: Outsourcing your new stuff to Palantir, IBM, Mu Sigma
      ● Typical enterprise spends 95% of its IT resources keeping current (legacy) code running
          ○ i.e. Maintenance
          ○ Most are dug in pretty deep
          ○ Often have the best people “keeping the lights on”
      ● “Shiny new stuff” gets outsourced
          ○ Often because there is no appropriate talent internally
  • 29. This is a catch-22
      ● Your maintenance is boring!
          ○ So creative people quit
          ○ So there is no good talent to work on the new stuff
          ○ And you can’t hire great talent (Takes great people to hire great people)
      ● Your new stuff is your “secret sauce” over the next decade or so…
          ○ Please don’t outsource it. This is long-term suicide
          ○ Instead outsource the diddly-crap (e-mail et al.)
          ○ Software is your secret sauce -- invest in your own people
  • 30. So what to do?
      1. Start by solving Blunder #2 (Not planning for AI/ML to change most everything)
      2. Outsource the boring maintenance
      3. Cancel the Palantir contract
  • 31. Blunder #9: Succumbing to the “Innovator’s Dilemma”
  • 32. Blunder #9: Succumbing to the “Innovator’s Dilemma”
      ● Must-read book by Clayton Christensen
      ● Steam shovel example
          ○ Cable steam shovels - big payload
          ○ Hydraulics - much safer, but low payload
      ● Used for “small jobs”
          ○ Payloads increased and hydraulics won
          ○ Cable guys went out of business
  • 33. Net-Net
      ● Have to be willing to give up your current business model
      ● And reinvent yourself
      ● Possibly losing some current customers in the process
          ○ Otherwise, you go out of business in the long run
          ○ Taxi licenses in Cambridge have gone from $700k to $10k
  • 34. Blunder #10: Not Paying Up for a Few “Rocket Scientists”
  • 35. Blunder #10: Not Paying Up for a Few “Rocket Scientists”
      ● They will be your guiding light to avoiding these blunders
      ● They will be “off scale”
          ○ Your HR folks won’t like what you have to pay
      ● Chances are they will be weird
          ○ E.g. no shoes, no socks, no tie, feet on the table, ...
      ● Please don’t drive them away!
          ○ As Citibank did to one of my Berkeley students a while ago
  • 36. Blunder #11 (Bonus): Working for a Company That is not Trying to do Something about the “Sins of the Past”
  • 37. Blunder #11 (Bonus): Working for a Company That is not Trying to do Something about the “Sins of the Past”
      If you work for a company that is succumbing to (even one) of these blunders then:
      1. You should be fixing it
          a. Be part of the solution, not part of the problem
      2. Or looking for a new employer
          a. Tamr is hiring!
  • 38. Questions?
  • 39. Thank You! To learn more about Tamr, visit tamr.com. You’ll receive the 10 Big Data Analytics Blunders Infographic via email.
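
For readers who want to see what the rule-based match/merge approach on slide 15 looks like in practice, here is a minimal Python sketch. It is not from the deck: the `Record` fields, the `normalize` helper, and the sample records are all hypothetical. It applies the two example rules the slide names, matching entities that share an address and merging by keeping the most recent value. The slide's point is that rule sets like this do not scale, so treat it as an illustration of the technique being criticized, not as a recommendation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    # Hypothetical record layout, invented for this illustration.
    name: str
    address: str
    updated: date


def normalize(address: str) -> str:
    # Assumed normalization: lower-case and collapse runs of whitespace.
    return " ".join(address.lower().split())


def match(records):
    """Match rule: records with the same normalized address are one entity."""
    groups = {}
    for rec in records:
        groups.setdefault(normalize(rec.address), []).append(rec)
    return list(groups.values())


def merge(group):
    """Merge rule: keep the most recently updated record, ignore older ones."""
    return max(group, key=lambda rec: rec.updated)


def golden_records(records):
    # One "golden record" per matched group.
    return [merge(group) for group in match(records)]


# Made-up sample data: the first two rows differ in spelling and spacing.
records = [
    Record("Pat Example", "12 Elm Street", date(2019, 5, 1)),
    Record("P. Exampel", "12  elm street", date(2020, 8, 15)),
    Record("Sam Sample", "7 Oak Avenue", date(2020, 1, 10)),
]
golden = golden_records(records)
```

Even in this toy case a single rule decides everything: any address variation that `normalize` does not anticipate (an abbreviation such as "St", a missing apartment number) silently splits one entity into two, which hints at why hundreds of hand-written rules covered so little of the GE data.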