Explore, Analyze and Present your data
Guillaume Calmettes
“Bonjour”, I am Guillaume!
Sacre Bleu!

Bordeaux

gcalmettes@mednet.ucla.edu
Office: MRL 3645
Disclaimer

I am not a
statistician
Statistics are scary

Statistics

(You at the beginning of the talk)
Statistics are scary
not so

Statistics

(You at the middle of the talk)
Statistics are scary
cool

Statistics

(You at the end of the talk)
Statistics are scary
cool

We have to deal with
them anyways, so we
had better enjoy them!

Statistics

(You at the end of...
Press the 	

t-test button and
you’ll be done!

Did you check
the normality of
your data first?
Why should you care about statistics?

http://www.nature.com/nature/authors/gta/2e_Statistical_checklist.pdf
Why should you care about statistics?
Advances in Physiology Education

“Explorations in Statistics” series (2008-prese...
Why should you care about statistics?
“Statistical Perspectives” series (2011-present)	

(Gordon Drummond)
The Journal of ...
Why should you care about statistics?

Importance of being uncertain – September 2013

How samples are used to estimate po...
Why should you care about statistics?

“Journals […] fail to exert sufficient scrutiny over the results
that they publish”
...
Look at

your data
A picture is worth a thousand words

John Snow
(1813-1858)
Location of deaths in the 1854 London Cholera Epidemic
Why visualize your data?
The Anscombe’s quartet example
Dataset #1

Dataset #2

Dataset #3

Dataset #4

x

y

x

y

x

y

...
Why visualize your data?
The Anscombe’s quartet example
Property in each case

Value

Mean of x

9 (exact)

Variance of x
...
Why visualize your data?
The Anscombe’s quartet example
Dataset #1

Dataset #2

Dataset #3

Dataset #4

Anscombe, F. J. (1...
Visualize your data in their raw form!
Aim for revelation rather than mere summary
A great graphic with raw data will reve...
If you are still not convinced …
Mean: 16 / Stdv: 5
If you are still not convinced …
Mean: 16 / Stdv: 5
e
WBM secondary transplantation
(16 weeks)

Daniel’s Journal Club pape...
Avoid making bar graphs
“To maintain the highest level of trustworthiness of data,
we are encouraging authors to display d...
Avoid making bar graphs

Error bars

Different types, different meanings
100
SORRY
,
WE JUST

75

YOU...

• descriptive st...
Avoid making bar graphs

Error bars

Different types, different meanings

• descriptive statistics (Range, SD)
• inferenti...
Avoid making bar graphs
Mean and Standard deviation are only useful in the	

context of a “normal distribution”
95%

µ

95...
Avoid making bar graphs
symmetrical
distribution

skewed
distribution

Data presentation to reveal the distribution of the...
Avoid making bar graphs
symmetrical
distribution

skewed
distribution

• First set: Gaussian (or normal) distribution (sym...
Avoid making bar graphs
Don't tell me no one warned you before!

Bar graph

Dynamite plunger
Summary
Why visualize your data?

For others ...
Providing a narrative for the reader
But primarily for you ...
Looking fo...
Choose the right descriptor for

your data
Averages can be misleading
Is the mean always a good descriptor?
# of children per household in China (2012)

• mean: 1.35

http://www.globalhealthfa...
Is the mean always a good descriptor?
# of children per household in China (2012)

• mean: 1.35	

• median: 1
more represe...
Any measure is wrong!
“Whenever you make a measurement, you must
know the uncertainty otherwise it is meaningless”
Walter ...
The Bootstrap: origin

Modern electronic computation has encouraged a host of new statistical methods
that require fewer d...
Computing the bootstrap 95% CI
A0 (m0)
a1 a4
a5 a2
a3 an

Calmettes G. and al. (2012), “Making do with what we have: use y...
Computing the bootstrap 95% CI
A0 (m0)
a1 a4
a5 a2
a3 an
A1 A2
a4 a5
a3 a2
a1 an
a2 a1
a2 a3
a1 a5
mA1 mA2

A2
an
a1
an
a1...
Analyze

your data
Choose your statistical test wisely
Authors Guidelines
Every paper that contains statistical testing should state
[...] a ...
The simple case (How to)
mean/std	

135.9 ± 19.0
Female

mean/std	

187.0 ± 19.8
Male
The simple case (How to)
Distribution of the data?

mean/std	

135.9 ± 19.0
Female

mean/std	

187.0 ± 19.8
Male
The simple case (How to)
Distribution of the data?
difference/ci	

51.2 [50.4, 51.9]
mean/std	

135.9 ± 19.0
Female

mean/...
The simple case (How to)
Distribution of the data?
difference/ci	

51.2 [50.4, 51.9]
mean/std	

135.9 ± 19.0
Female

visua...
The simple case (How to)
difference/ci	

51.2 [50.4, 51.9]
mean/std	

135.9 ± 19.0
Female

mean/std	

187.0 ± 19.8
Male

D...
Usually it is not so simple
The “not so simple” case

S1

S2
The “not so simple” case
S1

S2

S1

S2
The “not so simple” case
S1

S2

Shapiro-Wilk test:
S1 p-value: 7.4e-05
S2 p-value: 6.7e-06

S1

S2
What to do?
What to do?
For the t-test:	

!

Non parametric
alternatives

• Mann-Whitney U	

(independent)

!

• Wilcoxon	


(dependa...
Choose a new statistical hero
Bootstrapman

t-test
Computing the bootstrap p-value
Are the two samples different?
Observed difference = 0.44
Computing the bootstrap p-value
Are the two samples different?
Observed difference = 0.44

If the two samples were from th...
Computing the bootstrap p-value
A0
a1 a4
a5 a2
a3 an

D0 = mA-mB
(0.44)

B0
b2 b3 b1
b4 b5 bn
Computing the bootstrap p-value
A0
a1 a4
a5 a2
a3 an

D0 = mA-mB
(0.44)

B0
b2 b3 b1
b4 b5 bn

a4 b5 bn
b3 a b2 an b4
1b
a...
Computing the bootstrap p-value
A0
a1 a4
a5 a2
a3 an

D0 = mA-mB
(0.44)

B0
b2 b3 b1
b4 b5 bn

MW: p = 0.0169
171	

 = 0.0...
Summary
What do my data look like?
Distribution?

• visual inspection (hist. / QQ plot)
• normality test

What do I want to...
The dark side of the

p-value
Statistical significance
“The effect of the drug was statistically significant.”
Statistical significance
“The effect of the drug was statistically significant.”

so what?
Statistical significance (example)
“The percentage of neurons showing cue-related activity
increased with training in the ...
P-values do not convey information
Mean: 16
SD: 5

Mean: 20
SD: 5
Difference = 4
p-value = 0.1090
P-values do not convey information
Mean: 16
SD: 5

Mean: 20
SD: 5
Difference = 4
p-value = 0.1090
0.0367
P-values do not convey information
Mean: 16
SD: 5

Mean: 20
SD: 5
Difference = 4
p-value = 0.1090
0.0367
0.0009
P-values do not convey information
Fact: Most applied scientists use p-values as a measure of evidence
and of the size of ...
Report effect size and CIs instead
P-value is function of the sample size
Measured Effect Size:
difference = 0.018 mV
Amplitude (mV)

Control

Atropine

0.5 ...
P-value is function of the sample size
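[Figure: P (t-test) on a log-scaled y-axis from 10^0 down to 10^-4, with “not significant” and “significant” regions, against a log-scaled x-axis from 10^1 to 10^3]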

Hedges' g

0....
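
To make the dependence on sample size concrete, here is a minimal sketch that keeps the effect fixed (group means of 16 and 20 with SD 5, the values shown on the "P-values do not convey information" slides) and changes only n. It assumes a two-sided z-test with a known, shared SD, chosen so that the example needs nothing beyond the Python standard library. This is not the t-test used in the slides, so the numbers will not reproduce the slide values. The function name `p_value_for_n` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def p_value_for_n(mean_a, mean_b, sd, n_per_group):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumption: both groups have the same known SD and the same size.
    """
    standard_error = sd * sqrt(2.0 / n_per_group)
    z = abs(mean_b - mean_a) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Same difference (4) and same SD (5); only the sample size changes.
for n in (5, 10, 20, 50):
    print(f"n = {n:3d} per group -> p = {p_value_for_n(16, 20, 5, n):.4f}")
```

The effect size never changes in this loop, yet the p-value crosses the 0.05 threshold once n is large enough, which is why the effect size and its confidence interval are the more informative numbers to report.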
Bootstrap effect size and 95% CIs
a1 a2 a4
a5 a3 an

a5
a1
a5
a3

a3
a7
a1
a4

a2
a2
a9
a1

a6
a3
a4
a3

A

b1 b2 b4
b5 b3...
Bootstrap effect size and 95% CIs
Do the 95% confidence intervals of
the observed effect size include
zero (no difference)...
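
The question on this slide can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not the author's implementation: the effect size is taken to be the raw difference in means (the slides also mention Hedges' g), the interval is a simple percentile interval, each group is resampled from itself with replacement, and the data and the function name `bootstrap_difference_ci` are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_difference_ci(a, b, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the difference in means, mean(a) - mean(b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    differences = []
    for _ in range(n_resamples):
        # Resample each group from itself, with replacement, keeping its size.
        pseudo_a = rng.choices(a, k=len(a))
        pseudo_b = rng.choices(b, k=len(b))
        differences.append(fmean(pseudo_a) - fmean(pseudo_b))
    differences.sort()
    lower = differences[round(alpha / 2 * n_resamples)]
    upper = differences[round((1 - alpha / 2) * n_resamples) - 1]
    return fmean(a) - fmean(b), lower, upper


# Hypothetical measurements, for illustration only.
group_a = [10.2, 11.1, 9.8, 10.5, 10.9, 11.3, 10.0, 10.7]
group_b = [8.1, 7.9, 8.6, 8.3, 7.5, 8.8, 8.0, 8.4]
effect, low, high = bootstrap_difference_ci(group_a, group_b)
print(f"difference = {effect:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
print("CI includes zero:", low <= 0.0 <= high)
```

If the interval excludes zero, the data are hard to reconcile with "no difference", and the interval also shows how large the effect plausibly is.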
Statistical vs Biological

significance
Statistical vs Biological significance
“The P value reported by tests is a probabilistic significance, not a
biological on...
Statistical vs Biological significance
Statistical significance has a meaning in a specific context

No change
Small chang...
Statistical vs Biological significance
[Figure: “Good enough” solutions. Panel labels: AB, PD, LP, PY, LP 1, LP 2. Axis ticks: 0.60, 0.50 and 1,600. Axis label: mRNA copy num...]
Statistical vs Biological significance

Madhvani R.V. et al. (2011) "Shaping a new Ca2+ conductance to suppress early afte...
Statistical vs Biological significance
Breast cancer study	

Difference in cancer returning between control vs
low-fat die...
Beware of false positives

(from the authors)
Bennett C. et al. (2010) “Neural Correlates of Interspecies Perspective Taki...
Beware of false positives

Bennett C. et al. (2010) “Neural Correlates of Interspecies Perspective Taking in the Post-Mort...
Beware of false positives

2012
Bennett C. et al. (2010) “Neural Correlates of Interspecies Perspective Taking in the Post...
Beware of false positives

http://xkcd.com/882/
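
A short calculation shows why running many tests invites false positives. The sketch below assumes the tests are independent and that every null hypothesis is true. The choice of 20 tests is illustrative, and the function names are hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the familywise error rate at or below alpha."""
    return alpha / n_tests


for n_tests in (1, 5, 20, 100):
    print(
        f"{n_tests:3d} tests: P(at least one false positive) = "
        f"{familywise_error_rate(n_tests):.3f}, "
        f"Bonferroni threshold = {bonferroni_threshold(n_tests):.4f}"
    )
```

With 20 tests at a threshold of 0.05, the chance of at least one spurious "significant" result is about 64%, so one hit among many comparisons is weak evidence on its own.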
Present

your data
Know your audience
Know your audience
Who?
Why?
What?
How?
Know your audience
who is my audience? level of understanding?
Who? what do they already know?

Why?
What?
How?
Know your audience
who is my audience? level of understanding?
Who? what do they already know?
why am I presenting?
Why? w...
Color blindness is a common disease
Males: one in 12 (8%) / Females: one in 200 (0.5%)
Color blindness is a common disease
“Anyone who needs to be convinced that making scientific
images more accessible is a w...
Making figures for color blind people

Wong, B. (2011). "Points of view: Color blindness". Nature Methods 8, 441
Making figures for color blind people

http://colororacle.org/
Telling stories with data
“The Martini Glass Structure”

http://vis.stanford.edu/files/2010-Narrative-InfoVis.pdf
Telling stories with data
“The Martini Glass Structure”
GUIDED
START

!

EXPLORE

NARRATIVE

http://vis.stanford.edu/files...
Aesthetic minimalism

Suda B. (2010). "A practical guide to Designing with Data"
Common mistakes in data reporting

Welcome to the FOX “Dishonest Charts” gallery
Common mistakes in data reporting
Common mistakes in data reporting
E. Tufte’s “Lie Factor”
Make things appear to be “better” than they are
by fiddling with...
Common mistakes in data reporting
Common mistakes in data reporting
Fig 1I

“We found that relative to WT mice, the luminal
microbiota of Il10−/− mice exhib...
Common mistakes in data reporting
A
B
C
D
E
Common mistakes in data reporting
A
B
C
D
E

20%

20%
20%

20%
20%
Common mistakes in data reporting
Common mistakes in data reporting
Common mistakes in data reporting
[Figure: “Percent Return on Investment”, y-axis 0 to 40, x-axis year1 to year4; legend: Group A ...]
Thank you!

“The important thing is not to stop questioning.
Curiosity has its own reason for existing”
- Albert Einstein-
Explore, Analyze and Present your data

Published in: Education, Technology
Transcript of "Explore, Analyze and Present your data"

  1. 1. Explore, Analyze and Present your data. Guillaume Calmettes
  2. 2. “Bonjour”, I am Guillaume! Sacre Bleu! Bordeaux gcalmettes@mednet.ucla.edu Office: MRL 3645
  3. 3. Disclaimer I am not a statistician
  4. 4. Statistics are scary Statistics (You at the beginning of the talk)
  5. 5. Statistics are scary not so Statistics (You at the middle of the talk)
  6. 6. Statistics are scary cool Statistics (You at the end of the talk)
  7. 7. Statistics are scary cool We have to deal with them anyways, so we had better enjoy them! Statistics (You at the end of the talk)
  8. 8. Press the t-test button and you’ll be done! Did you check the normality of your data first?
  9. 9. Why should you care about statistics? http://www.nature.com/nature/authors/gta/2e_Statistical_checklist.pdf
  10. 10. Why should you care about statistics? Advances in Physiology Education “Explorations in Statistics” series (2008-present) (Douglas Curran-Everett)
  11. 11. Why should you care about statistics? “Statistical Perspectives” series (2011-present) (Gordon Drummond) The Journal of Physiology Experimental Physiology The British Journal of Pharmacology Microcirculation The British Journal of Nutrition http://jp.physoc.org/cgi/collection/stats_reporting
  12. 12. Why should you care about statistics? Importance of being uncertain – September 2013
 How samples are used to estimate population statistics and what this means in terms of uncertainty. Error Bars – October 2013
 The use of error bars to represent uncertainty and advice on how to interpret them. Significance, P values and t-tests – November 2013
 Introduction to the concept of statistical significance and the one-sample t-test. http://blogs.nature.com/methagora/2013/08/giving_statistics_the_attention_it_deserves.html
  13. 13. Why should you care about statistics? “Journals […] fail to exert sufficient scrutiny over the results that they publish” “Nature research journals will introduce editorial measures to address the problem by improving the consistency and quality of reporting in life-sciences articles” “We will examine statistics more closely and encourage authors to be transparent, for example by including their raw data”
  14. 14. Look at your data
  15. 15. A picture is worth a thousand words John Snow (1813-1858) Location of deaths in the 1854 London Cholera Epidemic
  16. 16. Why visualize your data? The Anscombe’s quartet example
          Dataset #1      Dataset #2      Dataset #3      Dataset #4
           x      y        x      y        x      y        x      y
          10   8.04       10   9.14       10   7.46        8   6.58
           8   6.95        8   8.14        8   6.77        8   5.76
          13   7.58       13   8.74       13  12.74        8   7.71
           9   8.81        9   8.77        9   7.11        8   8.84
          11   8.33       11   9.26       11   7.81        8   8.47
          14   9.96       14   8.1        14   8.84        8   7.04
           6   7.24        6   6.13        6   6.08        8   5.25
           4   4.26        4   3.1         4   5.39       19  12.5
          12  10.84       12   9.13       12   8.15        8   5.56
           7   4.82        7   7.26        7   6.42        8   7.91
           5   5.68        5   4.74        5   5.73        8   6.89
          Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17–21
  17. 17. Why visualize your data? The Anscombe’s quartet example
          Property in each case       Value
          Mean of x                   9 (exact)
          Variance of x               11 (exact)
          Mean of y                   7.5
          Variance of y               4.122 or 4.127
          Correlation of x and y      0.816
          Linear regression line      y = 3.00 + 0.500x
          Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17–21
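
The summary values on this slide can be checked directly from the quartet data listed on the previous slide. The sketch below uses only the Python standard library. The helper name `summarize` and the dictionary layout are choices made for this illustration.

```python
from statistics import mean

X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "Dataset #1": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "Dataset #2": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74]),
    "Dataset #3": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "Dataset #4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
                   [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.5, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return the summary statistics listed on the slide for one dataset."""
    n = len(x)
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),   # sample variance
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "r": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


for name, (x, y) in QUARTET.items():
    s = summarize(x, y)
    print(f"{name}: mean_y={s['mean_y']:.2f} var_y={s['var_y']:.3f} "
          f"r={s['r']:.3f} fit: y = {s['intercept']:.2f} + {s['slope']:.3f}x")
```

All four datasets print nearly identical summaries, which is the point of the example: only a plot reveals how different they are.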
  18. 18. Why visualize your data? The Anscombe’s quartet example Dataset #1 Dataset #2 Dataset #3 Dataset #4 Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17–21
  19. 19. Why visualize your data? The Anscombe’s quartet example Dataset #1 Dataset #2 Dataset #3 Dataset #4 Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17–21
  20. 20. Visualize your data in their raw form! Aim for revelation rather than mere summary. A great graphic with raw data reveals unexpected patterns and invites us to make comparisons we might not have thought of beforehand.
  21. 21. If you are still not convinced … Mean: 16 / Stdv: 5
  22. 22. If you are still not convinced … Mean: 16 / Stdv: 5
  23. 23. If you are still not convinced … Mean: 16 / Stdv: 5 [Figure, panel e, from Daniel’s Journal Club paper: WBM secondary transplantation (16 weeks). Y-axis: donor engraftment (%), 0 to 80. Groups: flDMR/+, DMR/+, mH19. P < 0.05]
  24. 24. Avoid making bar graphs “To maintain the highest level of trustworthiness of data, we are encouraging authors to display data in their raw form and not in a fashion that conceals their variance. Presenting data as columns with error bars (dynamite plunger plots) conceals data. We recommend that individual data be presented as dot plots shown next to the average for the group with appropriate error bars (Figure 1).” Rockman H.A. (2012). "Great expectations". J Clin Invest 122 (4): 1133
  25. 25. Avoid making bar graphs Error bars Different types, different meanings • descriptive statistics (Range, SD) • inferential statistics (SE, CI) [Figure: bar graph with error bars, y-axis 0 to 100; overlaid caption fragments: “SORRY, WE JUST”, “YOU...”] Cumming, G. et al. (2007). "Error bars in experimental biology". J Cell Biol 177 (1): 7–11
  26. 26. Avoid making bar graphs Error bars Different types, different meanings • descriptive statistics (Range, SD) • inferential statistics (SE, CI) Often, they also imply a symmetrical distribution of the data. Cumming, G. et al. (2007). "Error bars in experimental biology". J Cell Biol 177 (1): 7–11
  27. 27. Avoid making bar graphs Mean and Standard deviation are only useful in the context of a “normal distribution”: about 95% of a normal distribution lies within two standard deviations (σ) of the mean (µ)
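
The "95% within two standard deviations" rule of thumb can be checked numerically. This is a small sketch using the standard library's `statistics.NormalDist`. The helper name `coverage_within` is hypothetical.

```python
from statistics import NormalDist


def coverage_within(k_sigma):
    """Fraction of a normal distribution within k_sigma standard deviations of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k_sigma) - std_normal.cdf(-k_sigma)


for k in (1.0, 1.96, 2.0, 3.0):
    print(f"within {k} SD of the mean: {coverage_within(k):.4f}")
```

Two standard deviations cover slightly more than 95% (about 95.45%), and exactly 95% corresponds to about 1.96 standard deviations. Neither figure applies once the data stop being normally distributed.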
  28. 28. Avoid making bar graphs symmetrical distribution skewed distribution Data presentation to reveal the distribution of the data • Display data in their raw form. • A dot plot is a good start. • “Dynamite plunger plots” conceal data. • Check the pattern of distribution of the values.
  29. 29. Avoid making bar graphs symmetrical distribution skewed distribution • First set: Gaussian (or normal) distribution (symmetrically distributed) • Second set: right skewed, lognormal (few large values) “This type of distribution of values is quite common in biology (ex: plasma concentrations of immune or inflammatory mediators)” “Plunger plots only: who would know that the values were skewed ... and that the common statistical tests would be inappropriate?”
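
To illustrate why mean and SD can mislead for skewed data, the following sketch simulates one symmetric and one right-skewed (lognormal) sample and compares their descriptors. The distribution parameters, the sample size and the seed are arbitrary choices made for this illustration. They do not come from the slides.

```python
import random
import statistics


def describe(values):
    """Return (mean, median, standard deviation) of a sample."""
    return statistics.mean(values), statistics.median(values), statistics.stdev(values)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
symmetric = [rng.gauss(10.0, 2.0) for _ in range(1000)]
skewed = [rng.lognormvariate(2.0, 0.8) for _ in range(1000)]

for name, sample in (("symmetric", symmetric), ("skewed", skewed)):
    m, med, sd = describe(sample)
    print(f"{name:9s}: mean={m:6.2f} median={med:6.2f} sd={sd:6.2f} "
          f"mean-2sd={m - 2 * sd:7.2f} min={min(sample):5.2f}")
```

In the skewed sample the mean sits well above the median, and "mean minus two SD" is negative even though every value is positive, which a bar with a symmetric error bar would hide.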
  30. 30. Avoid making bar graphs Don't tell me no one warned you before! Bar graph Dynamite plunger
  31. 31. Summary Why visualize your data? For others ... Providing a narrative for the reader But primarily for you ... Looking for patterns and relationships Summarize complex data structures Help avoid erroneous conclusions based upon questionable or unexpected data
  32. 32. Choose the right descriptor for your data
  33. 33. Averages can be misleading
  34. 34. Averages can be misleading
  35. 35. Averages can be misleading
  36. 36. Averages can be misleading
  37. 37. Is the mean always a good descriptor? # of children per household in China (2012) • mean: 1.35 http://www.globalhealthfacts.org/data/topic/map.aspx?ind=87
  38. 38. Is the mean always a good descriptor? # of children per household in China (2012) • mean: 1.35 • median: 1 more representative of the “typical” family (One child policy) http://www.globalhealthfacts.org/data/topic/map.aspx?ind=87
  39. 39. Any measure is wrong! “Whenever you make a measurement, you must know the uncertainty otherwise it is meaningless” Walter Lewin (MIT) 183.3cm 185.7cm http://www.youtube.com/watch?v=JUxHebuXviM
  40. 40. Any measure is wrong! “Whenever you make a measurement, you must know the uncertainty otherwise it is meaningless” Walter Lewin (MIT) The same concept applies when you report your data! Provide the uncertainty of your descriptor hint: this is NOT the standard deviation
  41. 41. Any measure is wrong! “Whenever you make a measurement, you must know the uncertainty otherwise it is meaningless” Walter Lewin (MIT) The same concept applies when you report your data! Provide the uncertainty of your descriptor hint: this is NOT the standard deviation Report the Confidence Interval of your descriptor
  42. 42. The Bootstrap: origin Modern electronic computation has encouraged a host of new statistical methods that require fewer distributional assumptions than their predecessors and can be applied to more complicated statistical estimators. These methods allow [...] to explore and describe data and draw valid statistical inferences without the usual concerns for mathematical tractability. Efron B. and Tibshirani R. (1991), Science, Jul 26;253(5018):390-5
  43. 43. Computing the bootstrap 95% CI. Original sample A0 = {a1, a2, a3, a4, a5, ..., an}, with mean m0. Calmettes G. et al. (2012), “Making do with what we have: use your bootstrap”, J Physiol, 590(15):3403-3406
  44. 44. Computing the bootstrap 95% CI. Pseudo-samples are drawn from A0 with replacement and the mean of each is computed: A1 = {a4, a5, a3, a2, a1, an} with mean mA1; A2 = {a2, a1, a2, a3, a1, a5} with mean mA2; A3 = {an, a1, an, a1, a3, a4} with mean mA3; A4 = {a4, a3, an, a5, a1, a3} with mean mA4; ... Calmettes G. et al. (2012), “Making do with what we have: use your bootstrap”, J Physiol, 590(15):3403-3406
  47. 47. Computing the bootstrap 95% CI. Same pseudo-samples A1 ... A4 with means mA1 ... mA4; result reported as estimate [95% CI]: 5.18 [4.91, 4.47] Calmettes G. et al. (2012), “Making do with what we have: use your bootstrap”, J Physiol, 590(15):3403-3406
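
The resampling scheme on these slides can be sketched as follows. This is a minimal illustration, not the code from the cited paper. It assumes the simple percentile method for the interval, and the data values and the function name `bootstrap_mean_ci` are hypothetical.

```python
import random
from statistics import fmean


def bootstrap_mean_ci(sample, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Each pseudo-sample has the size of the original and is drawn with replacement.
    pseudo_means = sorted(fmean(rng.choices(sample, k=n)) for _ in range(n_resamples))
    lower = pseudo_means[round(alpha / 2 * n_resamples)]
    upper = pseudo_means[round((1 - alpha / 2) * n_resamples) - 1]
    return fmean(sample), lower, upper


# Hypothetical measurements, for illustration only.
data = [4.8, 5.6, 5.1, 4.9, 5.4, 5.0, 5.7, 5.2, 4.7, 5.3]
estimate, low, high = bootstrap_mean_ci(data)
print(f"mean = {estimate:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

The same function works for other descriptors (a median, for instance) by swapping `fmean` for another statistic, which is one reason the bootstrap needs fewer distributional assumptions.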
  48. 48. Analyze your data
  49. 49. Choose your statistical test wisely Authors Guidelines Every paper that contains statistical testing should state [...] a justification for the use of that test (including, for example, a discussion of the normality of the data when the test is appropriate only for normal data), [...], whether the tests were one-tailed or two-tailed, and the actual P value for each test (not merely "significant" or "P < 0.05"). http://www.nature.com/nature/authors/gta/#a5.6
  50. 50. The simple case (How to) mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male
  51. 51. The simple case (How to) Distribution of the data? mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male
  52. 52. The simple case (How to) Distribution of the data? difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male
  53. 53. The simple case (How to) Distribution of the data? difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male • fit of the histogram
  54. 54. The simple case (How to) Distribution of the data? difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male • fit of the histogram
  55. 55. The simple case (How to) Distribution of the data? difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male • fit of the histogram • QQ plot: the i-th ordered point A(i) is plotted against the theoretical quantile of the distribution, $\Phi^{-1}\left(\frac{i - 3/8}{n + 1/4}\right)$
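
The QQ-plot construction on this slide can be written out as a short sketch: sort the sample, then pair the i-th ordered value A(i) with the standard normal quantile at position (i − 3/8) / (n + 1/4). The function name `normal_qq_points` and the example values are hypothetical, and plotting is left out to keep the example within the standard library.

```python
from statistics import NormalDist


def normal_qq_points(sample):
    """Return (theoretical quantile, ordered observation) pairs for a normal QQ plot."""
    ordered = sorted(sample)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal, mean 0 and SD 1
    theoretical = [
        std_normal.inv_cdf((i - 3 / 8) / (n + 1 / 4)) for i in range(1, n + 1)
    ]
    return list(zip(theoretical, ordered))


# Hypothetical measurements, for illustration only.
for q, value in normal_qq_points([136.2, 121.5, 158.9, 140.3, 129.8, 147.1, 133.4]):
    print(f"theoretical quantile {q:6.3f} -> observed {value:6.1f}")
```

If the data are close to normal, these pairs fall near a straight line, and systematic curvature suggests skew or heavy tails.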
  56. 56. The simple case (How to) Distribution of the data? difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male • fit of the histogram • QQ plot not “normal”
  57. 57. The simple case (How to) Distribution of the data? difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 • fit of the histogram • QQ plot Female Male Male
  58. 58. The simple case (How to) Distribution of the data? difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female visual inspection mean/std 187.0 ± 19.8 • fit of the histogram • QQ plot Female Male Male
  59. 59. The simple case (How to) Distribution of the data? difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female visual inspection mean/std test 187.0 ± 19.8 Male • fit of the histogram • QQ plot • Shapiro-Wilk test
  60. 60. The simple case (How to) Distribution of the data? difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female visual inspection mean/std test 187.0 ± 19.8 Male • fit of the histogram • QQ plot • Shapiro-Wilk test Null Hypothesis for the SW test: Data are normally distributed Female p-value: 0.9195 Male p-value: 0.3866
  61. 61. The simple case (How to) difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male Distribution of the data? Normally distributed
  62. 62. The simple case (How to) difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male Distribution of the data? Normally distributed
  63. 63. The simple case (How to) difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male Distribution of the data? Normally distributed
  64. 64. The simple case (How to) difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male Distribution of the data? Normally distributed Statistical test? t-test
  65. 65. The simple case (How to) difference/ci 51.2 [50.4, 51.9] mean/std 135.9 ± 19.0 Female mean/std 187.0 ± 19.8 Male Distribution of the data? Normally distributed Statistical test? t-test Null Hypothesis for the t-test: Data belong to the same population t-test p-value < 2.2e-16
  66. 66. Usually it is not so simple
  67. 67. The “not so simple” case S1 S2
  68. 68. The “not so simple” case S1 S2
  69. 69. The “not so simple” case S1 S2 S1 S2
  70. 70. The “not so simple” case S1 S2 Shapiro-Wilk test: S1 p-value: 7.4e-05 S2 p-value: 6.7e-06 S1 S2
  71. 71. What to do?
  72. 72. What to do? For the t-test: non-parametric alternatives • Mann-Whitney U (independent samples) • Wilcoxon (dependent samples)
  73. 73. Choose a new statistical hero Bootstrapman t-test
  74. 74. Computing the bootstrap p-value Are the two samples different? Observed difference = 0.44
  75. 75. Computing the bootstrap p-value Are the two samples different? Observed difference = 0.44 If the two samples were from the same population, what would be the probability of observing a difference at least this large by chance alone?
  76. 76. Computing the bootstrap p-value A0 a1 a4 a5 a2 a3 an D0 = mA-mB (0.44) B0 b2 b3 b1 b4 b5 bn
  77. 77. Computing the bootstrap p-value [Diagram: samples A0 (a1 ... an) and B0 (b1 ... bn) with observed difference D0 = mA - mB = 0.44; the values of both samples are pooled together.]
  78. 78. Computing the bootstrap p-value [Diagram: pseudo-samples A1 and B1 are drawn from the pool; their means mA1 and mB1 give the pseudo-difference D1 = mA1 - mB1.]
  79. 79. Computing the bootstrap p-value [Diagram: pseudo-samples A1 and B1, D1 = mA1 - mB1.] D0 = 0.44, D1 = -0.83
  80. 80. Computing the bootstrap p-value [Diagram: pseudo-samples A2 and B2, D2 = mA2 - mB2.] D0 = 0.44, D1 = -0.83, D2 = 0.84
  81. 81. Computing the bootstrap p-value Repeat 10000 times (D1 ... D10000)
  82. 82. Computing the bootstrap p-value Repeat 10000 times (D1 ... D10000). How many pseudo-differences are greater than or equal to the observed difference D0 (0.44)?
  83. 83. Computing the bootstrap p-value Repeat 10000 times (D1 ... D10000). How many pseudo-differences are greater than or equal to the observed difference D0 (0.44)? 9829 < D0, 171 > D0
  84. 84. Computing the bootstrap p-value Repeat 10000 times (D1 ... D10000). How many pseudo-differences are greater than or equal to the observed difference D0 (0.44)? 9829 < D0, 171 > D0. p = 171/10000 = 0.0171 (one-tailed)
  85. 85. Computing the bootstrap p-value Repeat 10000 times (D1 ... D10000). How many pseudo-differences are greater than or equal to the observed difference D0 (0.44)? 9829 < D0, 171 > D0. p = 171/10000 = 0.0171 (one-tailed). MW: p = 0.0169
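The resampling procedure on these slides can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's own code: the function name, the fixed seed, and the example data are hypothetical, and drawing the pseudo-samples with replacement is an assumption based on the repeated values shown in the diagram.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pooled_bootstrap_p_value(a, b, n_resamples=10000, seed=0):
    """One-tailed p-value for mean(a) - mean(b), resampling from the pooled data."""
    rng = random.Random(seed)
    d0 = mean(a) - mean(b)        # observed difference D0
    pool = list(a) + list(b)      # under "no difference", group labels carry no information
    count = 0
    for _ in range(n_resamples):
        # Draw pseudo-samples A_i and B_i from the pool (assumed: with replacement)
        pseudo_a = rng.choices(pool, k=len(a))
        pseudo_b = rng.choices(pool, k=len(b))
        if mean(pseudo_a) - mean(pseudo_b) >= d0:
            count += 1            # pseudo-difference at least as large as D0
    return d0, count / n_resamples


# Hypothetical measurements, for illustration only
sample_a = [2.1, 2.9, 3.4, 3.8, 4.0, 4.6]
sample_b = [1.8, 2.2, 2.7, 3.0, 3.1, 3.9]
d0, p = pooled_bootstrap_p_value(sample_a, sample_b)
print(f"D0 = {d0:.2f}, one-tailed p = {p:.4f}")
```

The returned p-value is the fraction of pseudo-differences greater than or equal to D0, which mirrors the 171/10000 count on the slide.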
  86. 86. Summary What do my data look like? Distribution: visual inspection (histogram / QQ plot), normality test. What do I want to compare? Right statistical test: parametric test, non-parametric test, resampling statistics.
  87. 87. The dark side of the p-value
  88. 88. Statistical significance “The effect of the drug was statistically significant.”
  89. 89. Statistical significance “The effect of the drug was statistically significant.” so what?
  90. 90. Statistical significance (example) “The percentage of neurons showing cue-related activity increased with training in the mutant mice (P<0.05) but not in the control mice (P>0.05).”
  91. 91. Statistical significance (example) “The percentage of neurons showing cue-related activity increased with training in the mutant mice (P<0.05) but not in the control mice (P>0.05).” Training has a larger effect in the mutant mice than in the control mice!
  93. 93. Statistical significance (example) “The percentage of neurons showing cue-related activity increased with training in the mutant mice (P<0.05) but not in the control mice (P>0.05).” Extreme scenario: training-induced activity barely reaches significance in mutant mice (e.g., 0.049) and barely fails to reach significance in control mice (e.g., 0.051). This comparison does not test whether the training effect for mutant mice differs statistically from that for control mice. [Figure: activity in the - and + conditions for control and mutant mice, with a significance star (*).]
  94. 94. Statistical significance (example) “The percentage of neurons showing cue-related activity increased with training in the mutant mice (P<0.05) but not in the control mice (P>0.05).” When making a comparison between two effects, always report the statistical significance of their difference rather than the difference between significance levels. Nieuwenhuis S. et al. (2011), “Erroneous analyses of interactions in neuroscience: a problem of significance”, Nat Neurosci, 14(9):1105-1107
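The extreme scenario can be made concrete with a short sketch. It assumes two independent groups whose effect estimates are approximately normal; the effect sizes and standard errors are hypothetical numbers chosen to land just on either side of 0.05, and the z-test is one simple way to test the difference, not the analysis used in the cited paper.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def compare_effects(effect_1, se_1, effect_2, se_2):
    """p-values for each effect on its own and for the difference between them."""
    p_1 = two_sided_p(effect_1 / se_1)
    p_2 = two_sided_p(effect_2 / se_2)
    # Independent estimates: the variances of the two effects add
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    p_diff = two_sided_p((effect_1 - effect_2) / se_diff)
    return p_1, p_2, p_diff


# Hypothetical training effects (in units of their standard error)
p_mutant, p_control, p_difference = compare_effects(1.97, 1.0, 1.95, 1.0)
print(f"mutant: p = {p_mutant:.3f}")
print(f"control: p = {p_control:.3f}")
print(f"mutant vs control difference: p = {p_difference:.3f}")
```

One effect is "significant" and the other is not, yet the test of their difference gives no evidence that the two effects differ.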
  95. 95. P-values do not convey information Mean: 16 SD: 5 Mean: 20 SD: 5 Difference = 4 p-value = 0.1090
  96. 96. P-values do not convey information Mean: 16 SD: 5 Mean: 20 SD: 5 Difference = 4 p-value = 0.1090 0.0367
  97. 97. P-values do not convey information Mean: 16 SD: 5 Mean: 20 SD: 5 Difference = 4 p-value = 0.1090 0.0367 0.0009
  98. 98. P-values do not convey information Fact: Most applied scientists use p-values as a measure of evidence and of the size of the effect. - The probability of hypotheses depends on much more than just the p-value. - This topic has renewed importance with the advent of the massive multiple testing often seen in genomics studies. [Figure: “Manhattan plot”, -log10(P) from 0 to 8 against positions 1 to 20.] Ioannidis JP (2005) PLoS Med 2(8):e124
  99. 99. Report effect size and CIs instead
  100. 100. P-value is a function of the sample size Measured effect size: difference = 0.018 mV. [Figure: control and atropine example traces (scale bars 0.5 mV, 100 ms) and amplitude (mV) for control (n=6777) and atropine (n=5272).] Hentschke, H. et al. (2011). "Computation of measures of effect size for neuroscience data sets". Eur J Neurosci. 34(12):1887–94
  101. 101. P-value is a function of the sample size Measured effect size: difference = 0.018 mV, p = 10^-5. [Figure: control and atropine example traces (scale bars 0.5 mV, 100 ms) and amplitude (mV) for control (n=6777) and atropine (n=5272).] Hentschke, H. et al. (2011). "Computation of measures of effect size for neuroscience data sets". Eur J Neurosci. 34(12):1887–94
  102. 102. P-value is a function of the sample size [Figure: two panels against sample size (10^1 to 10^3): P (t-test), from 10^0 down to 10^-4, with "not significant" and "significant" regions; and Hedges' g, from -0.4 to 0.4, for the 0.018 mV difference.] Hentschke, H. et al. (2011). "Computation of measures of effect size for neuroscience data sets". Eur J Neurosci. 34(12):1887–94
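A small sketch can show why the p-value tracks sample size while the effect size does not. It reuses the means (16 and 20) and SD (5) from the earlier slides; the sample sizes are hypothetical, the p-value uses a normal approximation rather than an exact t-test, and Hedges' g uses the common small-sample correction 1 - 3/(4·df - 1).

```python
import math


def hedges_g(mean_1, mean_2, sd_1, sd_2, n_1, n_2):
    """Standardized mean difference with a small-sample bias correction."""
    df = n_1 + n_2 - 2
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df)
    correction = 1 - 3 / (4 * df - 1)
    return (mean_1 - mean_2) / pooled_sd * correction


def approx_p_value(mean_1, mean_2, sd_1, sd_2, n_1, n_2):
    """Two-sided p-value from a normal approximation (not an exact t-test)."""
    se = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    z = (mean_1 - mean_2) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Same means and SDs every time; only the (hypothetical) group size changes
results = []
for n in (10, 100, 1000):
    g = hedges_g(20, 16, 5, 5, n, n)
    p = approx_p_value(20, 16, 5, 5, n, n)
    results.append((n, g, p))
    print(f"n = {n:4d} per group: Hedges' g = {g:.2f}, p = {p:.2e}")
```

The effect size stays close to 0.8 across all three sample sizes, while the p-value drops by many orders of magnitude.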
  103. 103. Bootstrap effect size and 95% CIs [Diagram: samples A (a1 ... an) and B (b1 ... bn) are each resampled with replacement 10000 times, giving bootstrap means mA1 ... mA10000 and mB1 ... mB10000 and effect sizes E1 = mA1 - mB1, E2 = mA2 - mB2, ..., E10000 = mA10000 - mB10000.]
  104. 104. Bootstrap effect size and 95% CIs [Same diagram, with the observed effect size (0.44) marked.]
  106. 106. Bootstrap effect size and 95% CIs [Same diagram, with the observed effect size (0.44) and the 250th and 9750th of the 10000 ordered effect sizes marked.]
  107. 107. Bootstrap effect size and 95% CIs Does the 95% confidence interval of the observed effect size include zero (no difference)? Effect size = 0.44, 95% CI [0.042, 0.853] (250th and 9750th bootstrap values).
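The percentile bootstrap described on these slides can be sketched as follows. The function name, seed, and data are hypothetical; the 250th and 9750th ordered values are used as interval bounds, as on the slide, when 10000 resamples are drawn.

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_effect_ci(a, b, n_resamples=10000, seed=0):
    """Observed mean difference and its 95% percentile bootstrap interval."""
    rng = random.Random(seed)
    effects = []
    for _ in range(n_resamples):
        # Resample each group separately, with replacement
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        effects.append(mean(resampled_a) - mean(resampled_b))
    effects.sort()
    # Ranks of the 2.5th and 97.5th percentiles (250th and 9750th of 10000)
    lower_rank = max(1, n_resamples * 25 // 1000)
    upper_rank = max(1, n_resamples * 975 // 1000)
    return mean(a) - mean(b), effects[lower_rank - 1], effects[upper_rank - 1]


# Hypothetical measurements, for illustration only
sample_a = [2.1, 2.9, 3.4, 3.8, 4.0, 4.6]
sample_b = [1.8, 2.2, 2.7, 3.0, 3.1, 3.9]
effect, low, high = bootstrap_effect_ci(sample_a, sample_b)
print(f"effect size = {effect:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
print("CI includes zero" if low <= 0 <= high else "CI excludes zero")
```

Unlike the pooled procedure used for the p-value, each group is resampled on its own here, so the interval describes the uncertainty of the observed difference rather than the null hypothesis.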
  108. 108. Statistical vs Biological significance
  109. 109. Statistical vs Biological significance “The P value reported by tests is a probabilistic significance, not a biological one.” “Statistical significance suggests but does not imply biological significance.” Krzywinski M and Altman N (2013) "Points of significance: Significance, P values and t-tests”. Nature Methods 10, 1041–1042
  110. 110. Statistical vs Biological significance Statistical significance has a meaning in a specific context No change Small change Large change Biological consequences?
  111. 111. Statistical vs Biological significance “Good enough” solutions. [Figure: stomatogastric ganglion neurons (AB, PD, LP, PY; LP 1 and LP 2); conductances at +15 mV (µS/nF, 0 to 0.60) for Kd, KCa and A-type currents, and mRNA copy number (0 to 1,600) for shab, BK-KC and shal.] Schulz D.J. et al. (2006) "Variable channel expression in identified single and electrically coupled neurons in different animals". Nat Neurosci. 9: 356–362
  112. 112. Statistical vs Biological significance Madhvani R.V. et al. (2011) "Shaping a new Ca2+ conductance to suppress early afterdepolarizations in cardiac myocytes". J Physiol 589(Pt 24):6081-92
  113. 113. Statistical vs Biological significance Breast cancer study: difference in cancer returning between control and low-fat diet groups. Authors' conclusion: people with low-fat diets had a 25% lower chance of cancer returning.
  114. 114. Statistical vs Biological significance Breast cancer study: difference in cancer returning between control and low-fat diet groups. Authors' conclusion: people with low-fat diets had a 25% lower chance of cancer returning. Actual return rates: - control: 12.4% - low-fat diet: 9.8%. Difference: 2.6%. 2.6 / 9.8 = 26.5%
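The arithmetic behind the headline figure is easy to reproduce. The sketch below uses the two return rates from the slide; the function name is hypothetical, and the ratio relative to the control rate is added only for comparison and does not appear on the slide.

```python
def risk_summary(control_rate, treated_rate):
    """Absolute and relative differences between two rates given in percent."""
    absolute = control_rate - treated_rate
    return {
        "absolute_difference": absolute,
        # Ratio shown on the slide: difference divided by the low-fat rate
        "relative_to_treated": absolute / treated_rate * 100,
        # Alternative baseline (assumption, not on the slide): the control rate
        "relative_to_control": absolute / control_rate * 100,
    }


summary = risk_summary(control_rate=12.4, treated_rate=9.8)
for name, value in summary.items():
    print(f"{name}: {value:.1f}%")
```

The same data yield an absolute difference of 2.6 percentage points or a relative figure above 20%, depending on how the result is framed.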
  115. 115. Beware of false positives (from the authors) Bennett C. et al. (2010) “Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon: An Argument For Proper Multiple Comparisons Correction”. JSUR, 2010. 1(1):1-5
  116. 116. Beware of false positives Bennett C. et al. (2010) “Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon: An Argument For Proper Multiple Comparisons Correction”. JSUR, 2010. 1(1):1-5
  117. 117. Beware of false positives 2012 Bennett C. et al. (2010) “Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic Salmon: An Argument For Proper Multiple Comparisons Correction”. JSUR, 2010. 1(1):1-5
  118. 118. Beware of false positives http://xkcd.com/882/
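The reason uncorrected multiple comparisons produce false positives can be shown with a short calculation. This sketch assumes independent tests that are all truly null; the choice of 20 tests and the Bonferroni correction are illustrative and are not taken from the cited study.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among independent null tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the familywise error rate at or below alpha."""
    return alpha / n_tests


alpha = 0.05
n_tests = 20  # hypothetical number of comparisons
uncorrected = familywise_error_rate(alpha, n_tests)
corrected = familywise_error_rate(bonferroni_threshold(alpha, n_tests), n_tests)
print(f"uncorrected: {uncorrected:.3f}")
print(f"with Bonferroni threshold: {corrected:.3f}")
```

With 20 uncorrected tests at 0.05, at least one spurious "significant" result is more likely than not.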
  119. 119. Present your data
  120. 120. Know your audience
  121. 121. Know your audience Who? Why? What? How?
  122. 122. Know your audience Who? who is my audience? level of understanding? what do they already know? Why? What? How?
  123. 123. Know your audience Who? who is my audience? level of understanding? what do they already know? Why? why am I presenting? what does my audience want to achieve? What? How?
  124. 124. Know your audience Who? who is my audience? level of understanding? what do they already know? Why? why am I presenting? what does my audience want to achieve? What? what do I want my audience to know? which story will captivate the audience? How?
  125. 125. Know your audience Who? who is my audience? level of understanding? what do they already know? Why? why am I presenting? what does my audience want to achieve? What? what do I want my audience to know? which story will captivate the audience? How? what medium will support the message the best? what format/layout will appeal to the audience?
  126. 126. Color blindness is a common disease Males: one in 12 (8%) / Females: one in 200 (0.5%)
  127. 127. Color blindness is a common disease “Anyone who needs to be convinced that making scientific images more accessible is a worthwhile task [...]: if your next grant or manuscript submission contains color figures, what if some of your reviewers are color blind? Will they be able to appreciate your figures? Considering the competition for funding and for publication, can you afford the possibility of frustrating your audience? The solution is at hand.” Clarke, M. (2007). "Making figures comprehensible for color-blind readers". Nature blog (http://blogs.nature.com/nautilus/2007/02/post_4.html)
  128. 128. Making figures for color blind people Wong, B. (2011). "Points of view: Color blindness". Nature Methods 8, 441
  129. 129. Making figures for color blind people http://colororacle.org/
  131. 131. Telling stories with data “The Martini Glass Structure” http://vis.stanford.edu/files/2010-Narrative-InfoVis.pdf
  132. 132. Telling stories with data “The Martini Glass Structure” [Diagram: START, then GUIDED NARRATIVE, then EXPLORE.] http://vis.stanford.edu/files/2010-Narrative-InfoVis.pdf
  133. 133. Aesthetic minimalism Suda B. (2010). "A practical guide to Designing with Data"
  139. 139. Common mistakes in data reporting Welcome to the FOX “Dishonest Charts” gallery
  140. 140. Common mistakes in data reporting
  141. 141. Common mistakes in data reporting E. Tufte’s “Lie Factor” Make things appear to be “better” than they are by fiddling with the scales of things
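Tufte's Lie Factor is the size of the effect shown in the graphic divided by the size of the effect in the data, so a faithful chart scores close to 1. The sketch below is a minimal illustration; the function names and the bar-chart numbers are hypothetical.

```python
def relative_change(first, second):
    """Change from first to second, as a fraction of first."""
    return (second - first) / first


def lie_factor(data_first, data_second, graphic_first, graphic_second):
    """Effect shown in the graphic divided by the effect present in the data."""
    shown = relative_change(graphic_first, graphic_second)
    actual = relative_change(data_first, data_second)
    return shown / actual


# Hypothetical chart: the value rises from 10 to 12 (+20%),
# but the bars are drawn 1 cm and 3 cm tall (+200%) because the axis is truncated
factor = lie_factor(data_first=10, data_second=12, graphic_first=1, graphic_second=3)
print(f"Lie Factor = {factor:.1f}")
```

A value well above 1 means the graphic exaggerates the change; a value well below 1 means it understates it.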
  147. 147. Common mistakes in data reporting [Figure: Fig. 1I of the cited paper.] “We found that relative to WT mice, the luminal microbiota of Il10−/− mice exhibited a ~100-fold increase in E. coli (Fig. 1I)” Arthur et al. (2012) Science 5;338(6103):120-3
  148. 148. Common mistakes in data reporting A B C D E
  149. 149. Common mistakes in data reporting A B C D E 20% 20% 20% 20% 20%
  152. 152. Common mistakes in data reporting [Figure: two versions of the same chart, Percent Return on Investment (0 to 40) for Group A and Group B over year1 to year4.]
  153. 153. Thank you! “The important thing is not to stop questioning. Curiosity has its own reason for existing” - Albert Einstein-