Data Sciences and Statistics
What is the intersection?
Ron Wasserstein, Executive Director
American Statistical Association
[Slides 2 to 9: a series of diagrams, each labeled “Data Science” and “Statistics”]
Fire up your mobile devices
• #ronwnyc
• ron@amstat.org
• Vocal cords
Feedback request #1
• In a short phrase, like the way
someone might describe their work
(firefighter, lawyer, bank teller,
secretary, etc.), please describe the
work that you do.

ron@amstat.org #ronwnyc
Feedback request #2
• What is your formal job title?

Now imagine we had 100,000,000 of
these responses

• WWADSD?
• WWASD?
Talent Analytics Brief: Four Functional
Clusters of Analytics Professionals
• Pasha Roberts, Greta Roberts
• July 2013
• http://www.talentanalytics.com/wp-content/uploads/2014/01/TA_AP_ResearchBrief.pdf

• Drop me a note and I’ll send you this URL

Four functional clusters of
analytics professionals
• Data Preparation Analysts
• Analytics Programmers
• Analytics Managers
• Analytics Generalists
Feedback request #3
• Without waiting for the descriptions
of these things, which one best
describes how you view yourself
currently? (“None” is also an option)
– Data Preparation Analysts
– Analytics Programmers
– Analytics Managers
– Analytics Generalists

Feedback request #4
• Does this set of clusters adequately
describe the data science world, in
your experience? Why or why not?
– Data Preparation Analysts
– Analytics Programmers
– Analytics Managers
– Analytics Generalists

Clustering data scientists
• Data developer
• Data researcher
• Data creative
• Data businessperson

Feedback request #5
• Without waiting for the descriptions
of these things, which one best
describes how you view yourself?
(“None” is also an option)
– Data developer
– Data researcher
– Data creative
– Data businessperson

Feedback request #6
• Does this set of clusters adequately
describe the data science world, in
your experience? Why or why not?
– Data developer
– Data researcher
– Data creative
– Data businessperson

http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Tomasz Tunguz (Redpoint)
Which of the Five Types of Data
Science Does Your Startup Need?
• Quantitative, exploratory data scientists
• Operational data scientists
• Product data scientists
• Marketing data scientists
• Research data scientists

• http://www.linkedin.com/today/post/article/20131002174328-4444200-which-of-the-five-types-of-data-science-does-your-startup-need
Sean Owen's blog (Cloudera) today
• those practicing “investigative analytics” and
• those implementing “operational analytics.”
• blog.cloudera.com/blog/2014/03/why-apache-spark-is-a-crossover-hit-for-data-scientists
Shaped for success
• I-shaped
• “dash” shaped
• T-shaped
• Π-shaped (or even “comb-shaped”)

Feedback request #7
• What is the one skill you’ve most
needed that you did NOT learn in
school?

What are people hiring data scientists
saying?
Can you get to the middle?
©American Statistical Association 2014

Statisticians bring
integrity to the
processes and
data that fuel
innovation and
have real impact
on our world.
Final feedback
• What else can you tell me that will
help the statistics community better
understand the data science
community?


Nyc open data meetup wasserstein presentation
