Evidence-based data analysis @jtleek
Data science as a Science (DSaaS) @jtleek
“Data science is as much art as it is science.”
“Wouldn’t it be amazing if we got 2,000 people to learn statistics!”
-Jeff Leek, 7/17/12
date: 7/19/12
from: jtleek@gmail.com
Roger let me know you gave him a
ballpark figure for the number of
students registered for his course
"Computing for Data Analysis". Could
you give me an idea of how many have
registered for my course "Data
Analysis"?
date: 7/19/12
from: pangwei@coursera.org
Hi Jeff,
7,000 students! It's pretty awesome.
(You'll be able to check this out yourself
next week, once the class sites are up.)
date: 7/19/12
from: rdpeng@gmail.com
You are f**ed.
-roger
9 classes
1 month long
Always open
Data Science Specialization
Total Enrollments: 3,815,890
Total Completions: 409,712
Genomic Data Science Specialization
Total Enrollments: 173,495
Total Completions: 10,826
Executive Data Science Specialization
Total Enrollments: 62,076
A theoretical model
Data
Y = some outcome
X = some covariate
D = (X,Y)
lm(Y ~ X)
Leek and Peng, Nature 2015
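As a minimal sketch of the setup above, the following R code simulates a data set D = (X, Y) and fits `lm(Y ~ X)`. The sample size, true slope, and noise level are illustrative assumptions, not values from the talk.

```r
# Simulated illustration; n, the true slope, and the noise level are assumptions.
set.seed(1)
n <- 200
X <- rnorm(n)                   # some covariate
Y <- 1 + 2 * X + rnorm(n)       # some outcome, generated with true slope 2
D <- data.frame(X = X, Y = Y)   # D = (X, Y)

fit <- lm(Y ~ X, data = D)      # the model named on the slide
beta_hat <- unname(coef(fit)["X"])
print(beta_hat)                 # estimated slope for X
```

With this much data the estimated slope should land close to the value used to generate Y.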
$\mathcal{F}_0 \subseteq \mathcal{F}_0(S) \subseteq \mathcal{F}_0(Y)$
Fithian, Sun and Taylor arXiv 2015
σ-algebra: “what we know”
$\mathcal{F}_0$: “we’ve done nothing”
$\mathcal{F}_0(S)$: “we did model selection”
$\mathcal{F}_0(Y)$: “we looked at all the data”
$E[\beta \mid \mathcal{F}_0] \neq E[\beta \mid \mathcal{F}_0(S)]$
Population
Question
Hypothesis
Experimental Design
Experimenter
Data
Analysis Plan
Analyst
Code
Estimate
Claim
Patil, Peng and Leek bioRxiv 2016
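$\mathcal{F}_0 \subseteq \mathcal{F}_0(1_{P,Q}(H)) \subseteq \mathcal{F}_0(1_{ED}(E)) \subseteq \mathcal{F}_0(1_{ED;E}(D)) \subseteq \mathcal{F}_0(1_{AP;A}(C)) \subseteq \mathcal{F}_0(1_{C}(A^*))$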
A theoretical model
Data
Who?
What?
When?
Why?
Where?
How?
Slide courtesy Hadley Wickham
Where Ingo is working
Base R
Lasso
dplyr
googlesheets
ppt
Bad life choices?
Sparsity!
David Robinson told me
Spreadsheets
Hegemony
Cleveland and McGill JASA 1984
Leek & Peng 2015 PNAS
Experiment
Leek and Peng, Science 2015
Population
Question
Hypothesis
Experimental Design
Experimenter
Data
Analysis Plan
Analyst
Code
Estimate
Claim
$E[S \mid \mathcal{F}_0(1_{c}(W))]$
We take a random sample of individuals in a
population and identify whether they smoke
and if they have cancer. We observe that there
is a strong relationship between whether a
person in the sample smoked or whether they
have lung cancer. We claim that smoking is
related to lung cancer in the larger population.
Inferential: 79% vs. Causal: 17% (n=47,141)
We take a random sample of individuals in a
population and identify whether they smoke
and if they have cancer. We observe that there
is a strong relationship between whether a
person in the sample smoked or whether they
have lung cancer. We claim that smoking is
related to lung cancer in the larger population.
We explain we think that the reason for this
relationship is because cigarette smoke
contains known carcinogens such as benzene,
which make cells in lungs become cancerous.
Inferential: 65% vs. Causal: 32% (n=47,141)
Experiment
Population
Question
Hypothesis
Experimental Design
Experimenter
Data
Analysis Plan
Analyst
Code
Estimate
Claim
$E[\mathrm{Est} \mid \mathcal{F}_0(1_{c}(A))]$
69% vs 40%
n=1,985
Experiment
$E[\mathrm{Claim} \mid \mathcal{F}_0(1_{\mathrm{set(base)}}(A))] - E[\mathrm{Claim} \mid \mathcal{F}_0(1_{\mathrm{set(ggplot2)}}(A))]$
Population
Question
Hypothesis
Experimental Design
Experimenter
Data
Analysis Plan
Analyst
Code
Estimate
Claim
1. Make a plot that answers the question: what is the
relationship between mean covered charges
(Average.Covered.Charges) and mean total payments
(Average.Total.Payments) in New York?
2. Make a plot (possibly multi-panel) that answers the
question: how does the relationship between mean
covered charges (Average.Covered.Charges) and mean
total payments (Average.Total.Payments) vary by
medical condition (DRG.Definition) and the state in which
care was received (Provider.State)?
Use only the [ggplot2/base R] graphics system (not
base R or lattice) to make your figure.
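For orientation, a participant's answer to question 1 might look roughly like the sketch below. The data frame `payments` is a hypothetical stand-in generated at random (the study's actual data are not reproduced here), and only the column names come from the assignment text. The base R version runs as written; the ggplot2 version is left in comments because it needs the ggplot2 package installed.

```r
# Hypothetical stand-in data: values are simulated, only the column names
# (Provider.State, Average.Covered.Charges, Average.Total.Payments) are from the task.
set.seed(42)
payments <- data.frame(
  Provider.State = rep(c("NY", "CA"), each = 50),
  Average.Covered.Charges = runif(100, 5000, 80000)
)
payments$Average.Total.Payments <-
  0.2 * payments$Average.Covered.Charges + rnorm(100, sd = 2000)

# Question 1 restricts attention to New York
ny <- subset(payments, Provider.State == "NY")

# Base R graphics version
plot(
  ny$Average.Covered.Charges, ny$Average.Total.Payments,
  xlab = "Mean covered charges",
  ylab = "Mean total payments",
  main = "Covered charges vs. total payments, New York"
)
abline(lm(Average.Total.Payments ~ Average.Covered.Charges, data = ny), col = "red")

# ggplot2 version (requires the ggplot2 package):
# library(ggplot2)
# ggplot(ny, aes(Average.Covered.Charges, Average.Total.Payments)) +
#   geom_point() + geom_smooth(method = "lm")
```

Both versions draw the same scatterplot with a fitted line; the experiment asks whether the choice of graphics system changes how well the resulting figures answer the question.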
“Does the plot clearly show the
relationship between mean covered
charges (Average.Covered.Charges)
and mean total payments
(Average.Total.Payments) in New
York?”
G: 5/22 (23%) vs. B: 5/12 (42%)
“Does the plot clearly show how the relationship
between mean covered charges
(Average.Covered.Charges) and mean total
payments (Average.Total.Payments) varies by
medical condition (DRG.Definition) and the state
in which care was received (Provider.State)?”
G: 7/22 (32%) vs. B: 5/12 (42%)
“Is the plot visually pleasing?”
G: 21/22 (95%) vs. B: 10/12 (83%)
G: 20/22 (91%) vs. B: 8/12 (67%)
“Do the plot text and labels use full
words instead of abbreviations?”
G: 21/22 (95%) vs. B: 12/12 (100%)
G: 11/22 (50%) vs. B: 5/12 (42%)
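The reported percentages can be checked directly from the counts. The sketch below uses the first comparison (5/22 vs. 5/12) and reads G as the ggplot2 group and B as the base R group, which the slides do not spell out. The Fisher's exact test at the end is an added illustration of how such small counts could be compared; it is not an analysis reported in the talk.

```r
# Counts as reported on the slide; the G = ggplot2, B = base R reading is an assumption.
yes <- c(G = 5, B = 5)
total <- c(G = 22, B = 12)

pct <- round(100 * yes / total)
print(pct)   # G: 23, B: 42, matching the slide

# 2x2 table of yes/no by group; Fisher's exact test is one option for counts this small.
tab <- rbind(yes = yes, no = total - yes)
res <- fisher.test(tab)
print(res$p.value)
```

The same three lines of arithmetic apply to the other comparisons by swapping in their counts.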
A theoretical model
Data
Data science
as a
Science
(DSaaS) @jtleek
University of Maribor
 

Data science as a science