Data Scientists: Myths &
Mathemagical Powers
      James Kobielus
James Kobielus shoots down
10 myths about Data Scientists



      “Data Scientists: Myths and Mathemagical Powers,”
    James Kobielus, Thinking Inside the Box, June 29, 2012
Myth #1




Data scientists are mythical
 beings, like the unicorns.
IBMbigdatahub.com
Myth #2




 Data scientists are an elite
bunch of precious eggheads.
Data scientists get their fingernails
  dirty dumping piles of data into
 analytical sandboxes, cleansing,
  and sifting through it for useful
patterns that may or may not exist.
  Then, they do it all over again.



It’s often mind-numbingly detailed grunt work, not the sport of armchair data philosophers.

Reality #2

Myth #3




Data scientists are a nouveau
   fad that will soon fade.
The term “data scientist” has been
around for years, and the various
   advanced analytics specialties
  that fall under it are even older.
Recently, the term has been used
 in the convergence of disciplines
    that have become super-hot.


Steady growth in job listings and academic curricula is undeniable. This is no fad.

Reality #3

Myth #4




Data scientists are all just
  PhD statisticians who
 failed to make tenure.
Many data scientists acquired
 their quantitative and statistical
   modeling skills in college, but
   pursued degrees in business
  administration, economics and
engineering. They actually know
    about business problems.


Many data scientists you’ll encounter in the working world are business domain specialists!

Reality #4

Myth #5




  Data scientists are just BI
specialists with fancier titles.
Many longtime BI power users
 are, in fact, data scientists of a
 sort. They are business domain
  specialists whose jobs involve
multivariate analysis, forecasting,
what-if modeling, and simulation.



Career development may stall out if they don’t stay up to speed on topics like Hadoop and predictive modeling.

Reality #5

Myth #6




 Data scientists aren’t really
scientists in any meaningful
     sense of the word.
Statistical controls are the
  bedrock of true science—the core
responsibility of the data scientist. If
 data scientists are confirming their
 findings through statistical controls
and real-world experiments, they’re
     scientists, plain and simple.


True science is nothing without observational data.

Reality #6

Myth #7




 Data scientists need fancy,
 expensive statistical power
tools to get their work done.
The job of the data scientist is to
 look for hidden patterns. They can
accomplish this through user-friendly
  visualization tools, search-driven
 BI tools and other approaches that
   don’t require a deep mastery of
          statistical analysis.


The market for cost-effective exploratory BI tools has many vendors, including IBM Cognos.

Reality #7

Myth #8




Data scientists simply pour
data into Hadoop and pull
out mind-blowing insights.
The data scientist will be the
first to tell you that Hadoop is
just another platform for deep
      exploration into data.




There isn’t a magic Ouija board through which the big data spirits speak to us mere mortals.

Reality #8

Myth #9




 Data scientists are analytics
junkies who couldn’t care less
 about business applications.
If you spend time with any real-
  world data scientist, they’ll bend
    your ear discussing how they
tackled a specific business problem,
 such as reducing customer churn,
  targeting offers across channels,
    and mitigating financial risks.


Most data scientists aren’t nerds. They know people regard all this big data lingo as confusing jargon.

Reality #9

Myth #10




Data scientists don’t have any
responsibilities that force them
   out of their ivory towers.
That used to be the case. However,
 as next best action and real-world
experiments become ubiquitous, the
  data scientist is evolving into the
  role that stokes, tweaks and fuels
        the operational engine.



Data scientists test the analytic-centric models at the heart of agile business processes.

Reality #10

For more from James Kobielus and
  other big data thought leaders,
     visit The Big Data Hub at
       IBMbigdatahub.com

More Related Content

What's hot

A brief history of "big data"
A brief history of "big data"A brief history of "big data"
A brief history of "big data"Nicola Ferraro
 
Big data
Big dataBig data
Big datahsn99
 
Business Data Lake Best Practices
Business Data Lake Best PracticesBusiness Data Lake Best Practices
Business Data Lake Best PracticesCapgemini
 
Data centric business and knowledge graph trends
Data centric business and knowledge graph trendsData centric business and knowledge graph trends
Data centric business and knowledge graph trendsAlan Morrison
 
Large Scale Geospatial Indexing and Analysis on Apache Spark
Large Scale Geospatial Indexing and Analysis on Apache SparkLarge Scale Geospatial Indexing and Analysis on Apache Spark
Large Scale Geospatial Indexing and Analysis on Apache SparkDatabricks
 
Big data Presentation
Big data PresentationBig data Presentation
Big data PresentationAswadmehar
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadhMithlesh Sadh
 
Big data ppt
Big data pptBig data ppt
Big data pptYash Raj
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data LiteracyDATAVERSITY
 
How to Become a Data Scientist
How to Become a Data ScientistHow to Become a Data Scientist
How to Become a Data Scientistryanorban
 
Data Modeling for Big Data
Data Modeling for Big DataData Modeling for Big Data
Data Modeling for Big DataDATAVERSITY
 

What's hot (20)

Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
 
A brief history of "big data"
A brief history of "big data"A brief history of "big data"
A brief history of "big data"
 
Big data
Big dataBig data
Big data
 
Big Data
Big DataBig Data
Big Data
 
Big data
Big dataBig data
Big data
 
Data modelling 101
Data modelling 101Data modelling 101
Data modelling 101
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Business Data Lake Best Practices
Business Data Lake Best PracticesBusiness Data Lake Best Practices
Business Data Lake Best Practices
 
Data centric business and knowledge graph trends
Data centric business and knowledge graph trendsData centric business and knowledge graph trends
Data centric business and knowledge graph trends
 
Large Scale Geospatial Indexing and Analysis on Apache Spark
Large Scale Geospatial Indexing and Analysis on Apache SparkLarge Scale Geospatial Indexing and Analysis on Apache Spark
Large Scale Geospatial Indexing and Analysis on Apache Spark
 
Big data
Big dataBig data
Big data
 
Big data Presentation
Big data PresentationBig data Presentation
Big data Presentation
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
How to Become a Data Scientist
How to Become a Data ScientistHow to Become a Data Scientist
How to Become a Data Scientist
 
Data Modeling for Big Data
Data Modeling for Big DataData Modeling for Big Data
Data Modeling for Big Data
 
Big Data
Big DataBig Data
Big Data
 
Big Data & Privacy
Big Data & PrivacyBig Data & Privacy
Big Data & Privacy
 
Data science
Data scienceData science
Data science
 

Viewers also liked

Artificial Intelligence Presentation
Artificial Intelligence PresentationArtificial Intelligence Presentation
Artificial Intelligence Presentationlpaviglianiti
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in PythonImry Kissos
 
A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)Prof. Dr. Diego Kuonen
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data ScientistDaniel Tunkelang
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The PeopleDaniel Tunkelang
 
Hadoop and Machine Learning
Hadoop and Machine LearningHadoop and Machine Learning
Hadoop and Machine Learningjoshwills
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning SystemsXavier Amatriain
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Data Science London
 
A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013Philip Zheng
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDevashish Shanker
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningVarad Meru
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...Sebastian Raschka
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesPier Luca Lanzi
 
Tutorial on Deep learning and Applications
Tutorial on Deep learning and ApplicationsTutorial on Deep learning and Applications
Tutorial on Deep learning and ApplicationsNhatHai Phan
 
Tips for data science competitions
Tips for data science competitionsTips for data science competitions
Tips for data science competitionsOwen Zhang
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networksSi Haem
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningLars Marius Garshol
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural networkDEEPASHRI HK
 
10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle CompetitionsDataRobot
 

Viewers also liked (20)

Artificial Intelligence Presentation
Artificial Intelligence PresentationArtificial Intelligence Presentation
Artificial Intelligence Presentation
 
Hands-on Deep Learning in Python
Hands-on Deep Learning in PythonHands-on Deep Learning in Python
Hands-on Deep Learning in Python
 
A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)A Statistician's View on Big Data and Data Science (Version 1)
A Statistician's View on Big Data and Data Science (Version 1)
 
How to Interview a Data Scientist
How to Interview a Data ScientistHow to Interview a Data Scientist
How to Interview a Data Scientist
 
Data By The People, For The People
Data By The People, For The PeopleData By The People, For The People
Data By The People, For The People
 
Hadoop and Machine Learning
Hadoop and Machine LearningHadoop and Machine Learning
Hadoop and Machine Learning
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?Big Data [sorry] & Data Science: What Does a Data Scientist Do?
Big Data [sorry] & Data Science: What Does a Data Scientist Do?
 
A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013A tutorial on deep learning at icml 2013
A tutorial on deep learning at icml 2013
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine Learning
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
 
Machine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification RulesMachine Learning and Data Mining: 12 Classification Rules
Machine Learning and Data Mining: 12 Classification Rules
 
Tutorial on Deep learning and Applications
Tutorial on Deep learning and ApplicationsTutorial on Deep learning and Applications
Tutorial on Deep learning and Applications
 
Tips for data science competitions
Tips for data science competitionsTips for data science competitions
Tips for data science competitions
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networks
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine Learning
 
Artificial neural network
Artificial neural networkArtificial neural network
Artificial neural network
 
10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions
 
Robots
RobotsRobots
Robots
 

Similar to Myths and Mathemagical Superpowers of Data Scientists

Myths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsMyths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsIBM Analytics
 
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...Inside Analysis
 
Big Data for Beginners
Big Data for BeginnersBig Data for Beginners
Big Data for BeginnersMichael Perez
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)mark madsen
 
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...Garrett Teoh Hor Keong
 
Understanding the New World of Cognitive Computing
Understanding the New World of Cognitive ComputingUnderstanding the New World of Cognitive Computing
Understanding the New World of Cognitive ComputingDATAVERSITY
 
Big Data; Big Potential: How to find the talent who can harness its power
Big Data; Big Potential: How to find the talent who can harness its powerBig Data; Big Potential: How to find the talent who can harness its power
Big Data; Big Potential: How to find the talent who can harness its powerLucas Group
 
Who is a data scientist
Who is a data scientist  Who is a data scientist
Who is a data scientist prateek kumar
 
20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big data20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big dataRiver11river
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data scienceVipul Kalamkar
 
Ch1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptxCh1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptxAbderrahmanABID2
 
Realism credai dec 2010 article
Realism credai dec 2010 articleRealism credai dec 2010 article
Realism credai dec 2010 articlerealism.IN
 
Top 10 areas of expertise in data science
Top 10 areas of expertise in data scienceTop 10 areas of expertise in data science
Top 10 areas of expertise in data scienceGlobalTechCouncil
 
Big Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin MalhotraBig Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin MalhotraVin Malhotra
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactDr. Sunil Kr. Pandey
 
Data Scientist - Good Rebels -
Data Scientist - Good Rebels -Data Scientist - Good Rebels -
Data Scientist - Good Rebels -Good Rebels
 

Similar to Myths and Mathemagical Superpowers of Data Scientists (20)

Myths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data ScientistsMyths and Mathemagical Superpowers of Data Scientists
Myths and Mathemagical Superpowers of Data Scientists
 
Data science
Data scienceData science
Data science
 
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and...
 
Data scientist
Data scientistData scientist
Data scientist
 
Big Data for Beginners
Big Data for BeginnersBig Data for Beginners
Big Data for Beginners
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
 
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
Big Data World Singapore 2017 - Moving Towards Digitization & Artificial Inte...
 
Understanding the New World of Cognitive Computing
Understanding the New World of Cognitive ComputingUnderstanding the New World of Cognitive Computing
Understanding the New World of Cognitive Computing
 
Big Data; Big Potential: How to find the talent who can harness its power
Big Data; Big Potential: How to find the talent who can harness its powerBig Data; Big Potential: How to find the talent who can harness its power
Big Data; Big Potential: How to find the talent who can harness its power
 
Who is a data scientist
Who is a data scientist  Who is a data scientist
Who is a data scientist
 
Data science
Data scienceData science
Data science
 
20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big data20 Emerging influencers in 2020 for big data
20 Emerging influencers in 2020 for big data
 
Embracing data science
Embracing data scienceEmbracing data science
Embracing data science
 
Big Data Challenges
Big Data ChallengesBig Data Challenges
Big Data Challenges
 
Ch1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptxCh1IntroductiontoDataScience.pptx
Ch1IntroductiontoDataScience.pptx
 
Realism credai dec 2010 article
Realism credai dec 2010 articleRealism credai dec 2010 article
Realism credai dec 2010 article
 
Top 10 areas of expertise in data science
Top 10 areas of expertise in data scienceTop 10 areas of expertise in data science
Top 10 areas of expertise in data science
 
Big Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin MalhotraBig Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin Malhotra
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
 
Data Scientist - Good Rebels -
Data Scientist - Good Rebels -Data Scientist - Good Rebels -
Data Scientist - Good Rebels -
 

More from David Pittman

Cloud Infrastructure & IT Optimization Expo Highlights
Cloud Infrastructure & IT Optimization Expo HighlightsCloud Infrastructure & IT Optimization Expo Highlights
Cloud Infrastructure & IT Optimization Expo HighlightsDavid Pittman
 
Data, Analytics and the Insurance Industry
Data, Analytics and the Insurance IndustryData, Analytics and the Insurance Industry
Data, Analytics and the Insurance IndustryDavid Pittman
 
Big Data & Analytics and the Retail Industry: Luxottica
Big Data & Analytics and the Retail Industry: Luxottica Big Data & Analytics and the Retail Industry: Luxottica
Big Data & Analytics and the Retail Industry: Luxottica David Pittman
 
Seattle Children's Hospital turns Big Data into better care
Seattle Children's Hospital turns Big Data into better careSeattle Children's Hospital turns Big Data into better care
Seattle Children's Hospital turns Big Data into better careDavid Pittman
 
First Tennessee Bank: applying analytics to drive higher ROI from market prog...
First Tennessee Bank: applying analytics to drive higher ROI from market prog...First Tennessee Bank: applying analytics to drive higher ROI from market prog...
First Tennessee Bank: applying analytics to drive higher ROI from market prog...David Pittman
 
Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...
Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...
Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...David Pittman
 
Infographic: Big Data Exploration
Infographic: Big Data ExplorationInfographic: Big Data Exploration
Infographic: Big Data ExplorationDavid Pittman
 
Big Data in Retail - Examples in Action
Big Data in Retail - Examples in ActionBig Data in Retail - Examples in Action
Big Data in Retail - Examples in ActionDavid Pittman
 
Analytics: The Real-world Use of Big Data
Analytics: The Real-world Use of Big DataAnalytics: The Real-world Use of Big Data
Analytics: The Real-world Use of Big DataDavid Pittman
 

More from David Pittman (9)

Cloud Infrastructure & IT Optimization Expo Highlights
Cloud Infrastructure & IT Optimization Expo HighlightsCloud Infrastructure & IT Optimization Expo Highlights
Cloud Infrastructure & IT Optimization Expo Highlights
 
Data, Analytics and the Insurance Industry
Data, Analytics and the Insurance IndustryData, Analytics and the Insurance Industry
Data, Analytics and the Insurance Industry
 
Big Data & Analytics and the Retail Industry: Luxottica
Big Data & Analytics and the Retail Industry: Luxottica Big Data & Analytics and the Retail Industry: Luxottica
Big Data & Analytics and the Retail Industry: Luxottica
 
Seattle Children's Hospital turns Big Data into better care
Seattle Children's Hospital turns Big Data into better careSeattle Children's Hospital turns Big Data into better care
Seattle Children's Hospital turns Big Data into better care
 
First Tennessee Bank: applying analytics to drive higher ROI from market prog...
First Tennessee Bank: applying analytics to drive higher ROI from market prog...First Tennessee Bank: applying analytics to drive higher ROI from market prog...
First Tennessee Bank: applying analytics to drive higher ROI from market prog...
 
Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...
Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...
Acquire, grow and retain customers with IBM Big Data & Analytics - Client Exa...
 
Infographic: Big Data Exploration
Infographic: Big Data ExplorationInfographic: Big Data Exploration
Infographic: Big Data Exploration
 
Big Data in Retail - Examples in Action
Big Data in Retail - Examples in ActionBig Data in Retail - Examples in Action
Big Data in Retail - Examples in Action
 
Analytics: The Real-world Use of Big Data
Analytics: The Real-world Use of Big DataAnalytics: The Real-world Use of Big Data
Analytics: The Real-world Use of Big Data
 


Myths and Mathemagical Superpowers of Data Scientists

  • 1. Data Scientists: Myths & Mathemagical Powers, James Kobielus
  • 2. James Kobielus shoots down 10 myths about Data Scientists “Data Scientists: Myths and Mathemagical Powers,” James Kobielus, Thinking Inside the Box, June 29, 2012
  • 3. Myth #1 Data scientists are mythical beings, like the unicorns.
  • 6. Myth #2 Data scientists are an elite bunch of precious eggheads.
  • 7. Data scientists get their fingernails dirty dumping piles of data into analytical sandboxes, cleansing, and sifting through it for useful patterns that may or may not exist. Then, they do it all over again. Reality #2 IBMbigdatahub.com
  • 8. It’s often mind-numbingly detailed grunt work, not the sport of armchair data philosophers. Reality #2
  • 9. Myth #3 Data scientists are a nouveau fad that will soon fade.
  • 10. The term “data scientist” has been around for years, and the various advanced analytics specialties that fall under it are even older. Recently, the term has been used in the convergence of disciplines that have become super-hot. Reality #3
  • 11. Steady growth in job listings and academic curricula is undeniable. This is no fad. Reality #3
  • 12. Myth #4 Data scientists are all just PhD statisticians who failed to make tenure.
  • 13. Many data scientists acquired their quantitative and statistical modeling skills in college, but pursued degrees in business administration, economics and engineering. They actually know about business problems. Reality #4
  • 14. Many data scientists you’ll encounter in the working world are business domain specialists! Reality #4
  • 15. Myth #5 Data scientists are just BI specialists with fancier titles.
  • 16. Many longtime BI power users are, in fact, data scientists of a sort. They are business domain specialists whose jobs involve multivariate analysis, forecasting, what-if modeling, and simulation. Reality #5
  • 17. Career development may stall out if they don’t stay up to speed on topics like Hadoop and predictive modeling. Reality #5
  • 18. Myth #6 Data scientists aren’t really scientists in any meaningful sense of the word.
  • 19. Statistical controls are the bedrock of true science—the core responsibility of the data scientist. If data scientists are confirming their findings through statistical controls and real-world experiments, they’re scientists, plain and simple. Reality #6
  • 20. True science is nothing without observational data. Reality #6
  • 21. Myth #7 Data scientists need fancy, expensive statistical power tools to get their work done.
  • 22. The job of data scientists is to look for hidden patterns. They can accomplish this through user-friendly visualization tools, search-driven BI tools and other approaches that don’t require a deep mastery of statistical analysis. Reality #7
  • 23. The market for cost-effective exploratory BI tools has many vendors, including IBM Cognos. Reality #7
  • 24. Myth #8 Data scientists simply pour data into Hadoop and pull out mind-blowing insights.
  • 25. The data scientist will be the first to tell you that Hadoop is just another platform for deep exploration into data. Reality #8
  • 26. There isn’t a magic Ouija board through which the big data spirits speak to us mere mortals. Reality #8
  • 27. Myth #9 Data scientists are analytics junkies who couldn’t care less about business applications.
  • 28. If you spend time with any real-world data scientist, they’ll bend your ear discussing how they tackled a specific business problem, such as reducing customer churn, targeting offers across channels, and mitigating financial risks. Reality #9
  • 29. Most data scientists aren’t nerds. They know people regard all this big data lingo as confusing jargon. Reality #9
  • 30. Myth #10 Data scientists don’t have any responsibilities that force them out of their ivory towers.
  • 31. That used to be the case. However, as next best action and real-world experiments become ubiquitous, the data scientist is evolving into the role that stokes, tweaks and fuels the operational engine. Reality #10
  • 32. Data scientists test the analytic-centric models at the heart of agile business processes. Reality #10
  • 33. For more from James Kobielus and other big data thought leaders, visit The Big Data Hub at IBMbigdatahub.com