1. Explore, analyze, present your data
Guillaume Calmettes
2. “Bonjour”, I am Guillaume!
Sacre Bleu!
Bordeaux
gcalmettes@mednet.ucla.edu
Office: MRL 3645
3. Disclaimer
I am not a
statistician
4. Statistics are scary
Statistics
(You at the beginning of the talk)
5. Statistics are scary
not so
Statistics
(You at the middle of the talk)
6. Statistics are scary
cool
Statistics
(You at the end of the talk)
7. Statistics are scary
cool
We have to deal with
them anyways, so we
had better enjoy them!
Statistics
(You at the end of the talk)
8. Press the
t-test button and
you’ll be done!
Did you check
the normality of
your data first?
9. Why should you care about statistics?
http://www.nature.com/nature/authors/gta/2e_Statistical_checklist.pdf
10. Why should you care about statistics?
Advances in Physiology Education
“Explorations in Statistics” series (2008–present)
(Douglas Curran-Everett)
11. Why should you care about statistics?
“Statistical Perspectives” series (2011–present)
(Gordon Drummond)
The Journal of Physiology
Experimental Physiology
The British Journal of Pharmacology
Microcirculation
The British Journal of Nutrition
http://jp.physoc.org/cgi/collection/stats_reporting
12. Why should you care about statistics?
Importance of being uncertain – September 2013
How samples are used to estimate population statistics and what this means in terms of
uncertainty.
Error Bars – October 2013
The use of error bars to represent uncertainty and advice on how to interpret them.
Significance, P values and t-tests – November 2013
Introduction to the concept of statistical significance and the one-sample t-test.
http://blogs.nature.com/methagora/2013/08/giving_statistics_the_attention_it_deserves.html
13. Why should you care about statistics?
“Journals […] fail to exert sufficient scrutiny over the results
that they publish”
“Nature research journals will introduce editorial measures to
address the problem by improving the consistency and quality of
reporting in life-sciences articles”
“We will examine statistics more closely and encourage authors
to be transparent, for example by including their raw data”
14. Look at
your data
15. A picture is worth a thousand words
John Snow
(1813–1858)
Location of deaths in the 1854 London Cholera Epidemic
16. Why visualize your data?
The Anscombe’s quartet example
Dataset #1       Dataset #2       Dataset #3       Dataset #4
 x      y         x      y         x      y         x      y
10    8.04       10    9.14       10    7.46        8    6.58
 8    6.95        8    8.14        8    6.77        8    5.76
13    7.58       13    8.74       13   12.74        8    7.71
 9    8.81        9    8.77        9    7.11        8    8.84
11    8.33       11    9.26       11    7.81        8    8.47
14    9.96       14    8.10       14    8.84        8    7.04
 6    7.24        6    6.13        6    6.08        8    5.25
 4    4.26        4    3.10        4    5.39       19   12.50
12   10.84       12    9.13       12    8.15        8    5.56
 7    4.82        7    7.26        7    6.42        8    7.91
 5    5.68        5    4.74        5    5.73        8    6.89
Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17–21
17. Why visualize your data?
The Anscombe’s quartet example
Property in each case        Value
Mean of x                    9 (exact)
Variance of x                11 (exact)
Mean of y                    7.5
Variance of y                4.122 or 4.127
Correlation of x and y       0.816
Linear regression line       y = 3.00 + 0.500x
Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17–21
18. Why visualize your data?
The Anscombe’s quartet example
[Figure: scatter plots of Dataset #1, Dataset #2, Dataset #3 and Dataset #4]
Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17–21
19. Why visualize your data?
The Anscombe’s quartet example
[Figure: scatter plots of Dataset #1, Dataset #2, Dataset #3 and Dataset #4]
Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician 27 (1): 17–21
20. Visualize your data in their raw form!
Aim for revelation rather than mere summary
A great graphic with raw data reveals
unexpected patterns and invites us to
make comparisons we might not have
thought of beforehand.
21. If you are still not convinced …
Mean: 16 / Stdv: 5
22. If you are still not convinced …
Mean: 16 / Stdv: 5
23. If you are still not convinced …
Mean: 16 / Stdv: 5
Daniel’s Journal Club paper
[Figure, panel e: bar graph of WBM secondary transplantation (16 weeks); y-axis: donor engraftment (%), 0 to 80; groups: flDMR/+ and DMR/+ (mH19); P < 0.05]
24. Avoid making bar graphs
“To maintain the highest level of trustworthiness of data,
we are encouraging authors to display data in their raw
form and not in a fashion that conceals their variance.
Presenting data as columns with error bars (dynamite
plunger plots) conceals data. We recommend that
individual data be presented as dot plots shown next to
the average for the group with appropriate error bars
(Figure 1).”
Rockman H.A. (2012). "Great expectations". J Clin Invest 122 (4): 1133
25. Avoid making bar graphs
Error bars
Different types, different meanings
• descriptive statistics (Range, SD)
• inferential statistics (SE, CI)
[Figure: cartoon bar graph, axis 0 to 100; partial caption text: “SORRY, WE JUST” / “YOU...”]
Cumming, G. et al. (2007). "Error bars in experimental biology". J Cell Biol 177 (1): 7–11
26. Avoid making bar graphs
Error bars
Different types, different meanings
• descriptive statistics (Range, SD)
• inferential statistics (SE, CI)
Often, they also imply a
symmetrical distribution of the
data.
Cumming, G. et al. (2007). "Error bars in experimental biology". J Cell Biol 177 (1): 7–11
27. Avoid making bar graphs
Mean and Standard deviation are only useful in the
context of a “normal distribution”
95%
µ
About 95% of a normal distribution lies within two
standard deviations (σ) of the mean (µ)
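The "95% within two standard deviations" rule of thumb can be checked numerically. The sketch below uses only the Python standard library; the function name `coverage_within` is a hypothetical helper, not something from the talk.

```python
from statistics import NormalDist


def coverage_within(k_sd: float) -> float:
    """Fraction of a normal distribution lying within k_sd standard deviations of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Area between -k_sd and +k_sd under the standard normal curve.
    return std_normal.cdf(k_sd) - std_normal.cdf(-k_sd)


if __name__ == "__main__":
    for k in (1.0, 1.96, 2.0):
        print(f"within {k} SD: {coverage_within(k):.4f}")
```

Strictly, 95% corresponds to about 1.96 standard deviations; two standard deviations cover slightly more.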
28. Avoid making bar graphs
symmetrical
distribution
skewed
distribution
Data presentation to reveal the distribution of the data
• Display data in their raw form.
• A dot plot is a good start.
• “Dynamite plunger plots” conceal data.
• Check the pattern of distribution of the values.
29. Avoid making bar graphs
symmetrical
distribution
skewed
distribution
• First set: Gaussian (or normal) distribution (symmetrically distributed)
• Second set: right skewed, lognormal (few large values)
“This type of distribution of values is quite common in biology (e.g. plasma concentrations
of immune or inflammatory mediators)”
“Plunger plots only: who would know that the values were skewed – ...
... and that the common statistical tests would be inappropriate?”
30. Avoid making bar graphs
Don't tell me no one warned you before!
Bar graph
Dynamite plunger
31. Summary
Why visualize your data?
For others ...
Providing a narrative for the reader
But primarily for you ...
Looking for patterns and relationships
Summarize complex data structures
Help avoid erroneous conclusions based upon questionable or
unexpected data
32. Choose the right descriptor for
your data
33. Averages can be misleading
34. Averages can be misleading
35. Averages can be misleading
36. Averages can be misleading
37. Is the mean always a good descriptor?
# of children per household in China (2012)
• mean: 1.35
http://www.globalhealthfacts.org/data/topic/map.aspx?ind=87
38. Is the mean always a good descriptor?
# of children per household in China (2012)
• mean: 1.35
• median: 1
more representative of the
“typical” family (One child policy)
http://www.globalhealthfacts.org/data/topic/map.aspx?ind=87
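A minimal sketch of why the median can describe the "typical" case better than the mean. The household counts below are hypothetical values chosen only so that the mean matches the 1.35 quoted on the slide; they are not the survey data.

```python
from statistics import mean, median

# Hypothetical number of children in 20 households (illustration only,
# NOT the real data behind the slide).
children = [0] * 1 + [1] * 13 + [2] * 4 + [3] * 2

average = mean(children)    # pulled upward by the few larger families
typical = median(children)  # the value found in the "middle" household

if __name__ == "__main__":
    print(f"mean = {average:.2f}, median = {typical}")
```

No household actually has 1.35 children, whereas the median corresponds to a family that exists in the data.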
39. Any measure is wrong!
“Whenever you make a measurement, you must
know the uncertainty, otherwise it is meaningless”
Walter Lewin (MIT)
183.3cm
185.7cm
http://www.youtube.com/watch?v=JUxHebuXviM
40. Any measure is wrong!
“Whenever you make a measurement, you must
know the uncertainty, otherwise it is meaningless”
Walter Lewin (MIT)
The same concept applies when you
report your data!
Provide the uncertainty of your descriptor
hint: this is NOT the standard deviation
41. Any measure is wrong!
“Whenever you make a measurement, you must
know the uncertainty, otherwise it is meaningless”
Walter Lewin (MIT)
The same concept applies when you
report your data!
Provide the uncertainty of your descriptor
hint: this is NOT the standard deviation
Report the Confidence Interval of your descriptor
42. The Bootstrap: origin
Modern electronic computation has encouraged a host of new statistical methods
that require fewer distributional assumptions than their predecessors and
can be applied to more complicated statistical estimators. These methods allow
[...] to explore and describe data and draw valid statistical inferences without the
usual concerns for mathematical tractability.
Efron B. and Tibshirani R. (1991), Science, Jul 26;253(5018):390–5
43. Computing the bootstrap 95% CI
A0 (m0)
a1 a4
a5 a2
a3 an
Calmettes G. et al. (2012), “Making do with what we have: use your bootstrap”, J Physiol, 590(15):3403–3406
44. Computing the bootstrap 95% CI
[Diagram: the original sample A0 (mean m0; values a1 ... an) is resampled with replacement into pseudo-samples A1, A2, A3, A4, ...; each pseudo-sample gives a mean mA1, mA2, mA3, mA4, ...]
Calmettes G. et al. (2012), “Making do with what we have: use your bootstrap”, J Physiol, 590(15):3403–3406
45. Computing the bootstrap 95% CI
[Diagram: the original sample A0 (mean m0; values a1 ... an) is resampled with replacement into pseudo-samples A1, A2, A3, A4, ...; each pseudo-sample gives a mean mA1, mA2, mA3, mA4, ...]
Calmettes G. et al. (2012), “Making do with what we have: use your bootstrap”, J Physiol, 590(15):3403–3406
46. Computing the bootstrap 95% CI
[Diagram: the original sample A0 (mean m0; values a1 ... an) is resampled with replacement into pseudo-samples A1, A2, A3, A4, ...; each pseudo-sample gives a mean mA1, mA2, mA3, mA4, ...]
Calmettes G. et al. (2012), “Making do with what we have: use your bootstrap”, J Physiol, 590(15):3403–3406
47. Computing the bootstrap 95% CI
[Diagram: the original sample A0 (mean m0; values a1 ... an) is resampled with replacement into pseudo-samples A1, A2, A3, A4, ...; each pseudo-sample gives a mean mA1, mA2, mA3, mA4, ...]
5.18 [4.91, 4.47]
Calmettes G. et al. (2012), “Making do with what we have: use your bootstrap”, J Physiol, 590(15):3403–3406
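The resampling scheme pictured on these slides can be sketched in a few lines. This is a minimal percentile-bootstrap example under stated assumptions, not the code from the cited paper: the function name `bootstrap_ci`, its parameters, and the data are all hypothetical.

```python
import random


def bootstrap_ci(sample, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`.

    Returns (observed mean, lower bound, upper bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Each pseudo-sample has the same size as the original and is drawn
    # WITH replacement; we keep only its mean.
    boot_means = sorted(
        sum(rng.choices(sample, k=n)) / n for _ in range(n_boot)
    )
    # With n_boot = 10000 and alpha = 0.05 these are the 250th and 9750th
    # sorted values (indices 249 and 9749).
    lo_idx = max(0, round(n_boot * alpha / 2) - 1)
    hi_idx = min(n_boot - 1, round(n_boot * (1 - alpha / 2)) - 1)
    return sum(sample) / n, boot_means[lo_idx], boot_means[hi_idx]


if __name__ == "__main__":
    data = [4.8, 5.1, 5.5, 4.9, 5.3, 5.0, 5.6, 5.2]  # hypothetical measurements
    m, lower, upper = bootstrap_ci(data)
    print(f"{m:.2f} [{lower:.2f}, {upper:.2f}]")
```

The same function works for other descriptors (for example the median) by replacing the mean computed inside the loop.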
48. Analyze
your data
49. Choose your statistical test wisely
Authors Guidelines
Every paper that contains statistical testing should state
[...] a justification for the use of that test (including, for
example, a discussion of the normality of the data when the
test is appropriate only for normal data), [...], whether the
tests were one-tailed or two-tailed, and the actual P value
for each test (not merely "significant" or "P < 0.05").
http://www.nature.com/nature/authors/gta/#a5.6
50. The simple case (How to)
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
51. The simple case (How to)
Distribution of the data?
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
52. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
53. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
• fit of the histogram
54. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
• fit of the histogram
55. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
• fit of the histogram
• Q-Q plot
Male
i-th point of the Q-Q plot: the i-th ordered value A(i) plotted against the theoretical quantile of the distribution,
$\Phi^{-1}\left(\frac{i - 3/8}{n + 1/4}\right)$
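As a sketch of how the Q-Q plot coordinates on this slide could be computed: each ordered value A(i) is paired with the theoretical normal quantile Φ⁻¹((i − 3/8)/(n + 1/4)). The function name `normal_qq_points` is hypothetical; only the standard library is used.

```python
from statistics import NormalDist


def normal_qq_points(values):
    """Return (theoretical quantile, ordered value) pairs for a normal Q-Q plot."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal: mean 0, SD 1
    return [
        # i runs from 1 to n, as in the formula on the slide.
        (std_normal.inv_cdf((i - 3 / 8) / (n + 1 / 4)), a_i)
        for i, a_i in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    for q, a in normal_qq_points([3.0, 1.0, 2.0]):
        print(f"{q:+.3f}  {a}")
```

If the data are close to normal, these points fall roughly on a straight line when plotted.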
56. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
• fit of the histogram
• Q-Q plot
not “normal”
57. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
• fit of the histogram
• Q-Q plot
Female
Male
Male
58. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
visual
inspection
mean/std
187.0 ± 19.8
• fit of the histogram
• Q-Q plot
Female
Male
Male
59. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
visual
inspection
mean/std
test
187.0 ± 19.8
Male
• fit of the histogram
• Q-Q plot
• Shapiro-Wilk test
60. The simple case (How to)
Distribution of the data?
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
visual
inspection
mean/std
test
187.0 ± 19.8
Male
• fit of the histogram
• Q-Q plot
• Shapiro-Wilk test
Null Hypothesis for the SW test:
Data are normally distributed
Female
p-value: 0.9195
Male
p-value: 0.3866
61. The simple case (How to)
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Distribution of the data?
Normally distributed
62. The simple case (How to)
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Distribution of the data?
Normally distributed
63. The simple case (How to)
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Distribution of the data?
Normally distributed
64. The simple case (How to)
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Distribution of the data?
Normally distributed
Statistical test?
t-test
65. The simple case (How to)
difference/ci
51.2 [50.4, 51.9]
mean/std
135.9 ± 19.0
Female
mean/std
187.0 ± 19.8
Male
Distribution of the data?
Normally distributed
Statistical test?
t-test
Null Hypothesis for the t-test:
Both groups come from populations with the same mean
t-test
p-value < 2.2e-16
66. Usually it is not so simple
67. The “not so simple” case
S1
S2
68. The “not so simple” case
S1
S2
69. The “not so simple” case
S1
S2
S1
S2
70. The “not so simple” case
S1
S2
Shapiro-Wilk test:
S1 p-value: 7.4e-05
S2 p-value: 6.7e-06
S1
S2
71. What to do?
72. What to do?
For the t-test:
Non-parametric
alternatives
• Mann-Whitney U
(independent)
• Wilcoxon
(dependent)
73. Choose a new statistical hero
Bootstrapman
t-test
74. Computing the bootstrap p-value
Are the two samples different?
Observed difference = 0.44
75. Computing the bootstrap p-value
Are the two samples different?
Observed difference = 0.44
If the two samples were from the same population,
what would be the probability that a difference at least as large as
the observed one arose from chance alone?
82. Computing the bootstrap p-value
[Diagram: samples A0 (a1 ... an) and B0 (b1 ... bn) give the observed difference D0 = mA − mB (0.44). All values are pooled, and pseudo-samples A1 and B1 are drawn from the pool with replacement; their means mA1 and mB1 give the pseudo-difference D1 = mA1 − mB1.]
Repeat 10000 times (D1 ... D10000)
How many pseudo-differences are greater than or equal to the observed difference D0 (0.44)?
83. Computing the bootstrap p-value
[Diagram: samples A0 (a1 ... an) and B0 (b1 ... bn) give the observed difference D0 = mA − mB (0.44). All values are pooled, and pseudo-samples A1 and B1 are drawn from the pool with replacement; their means mA1 and mB1 give the pseudo-difference D1 = mA1 − mB1.]
Repeat 10000 times (D1 ... D10000)
How many pseudo-differences are greater than or equal to the observed difference D0 (0.44)?
9829 < D0
171 > D0
84. Computing the bootstrap p-value
[Diagram: samples A0 (a1 ... an) and B0 (b1 ... bn) give the observed difference D0 = mA − mB (0.44). All values are pooled, and pseudo-samples A1 and B1 are drawn from the pool with replacement; their means mA1 and mB1 give the pseudo-difference D1 = mA1 − mB1.]
Repeat 10000 times (D1 ... D10000)
How many pseudo-differences are greater than or equal to the observed difference D0 (0.44)?
9829 < D0
171 > D0
p = 171 / 10000 = 0.0171 (one-tailed)
85. Computing the bootstrap p-value
[Diagram: samples A0 (a1 ... an) and B0 (b1 ... bn) give the observed difference D0 = mA − mB (0.44). All values are pooled, and pseudo-samples A1 and B1 are drawn from the pool with replacement; their means mA1 and mB1 give the pseudo-difference D1 = mA1 − mB1.]
Repeat 10000 times (D1 ... D10000)
How many pseudo-differences are greater than or equal to the observed difference D0 (0.44)?
9829 < D0
171 > D0
p = 171 / 10000 = 0.0171 (one-tailed)
MW: p = 0.0169
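The procedure on these slides (pool the two samples, draw pseudo-samples with replacement, count how often the pseudo-difference reaches the observed one) could look roughly like this. It is a sketch under assumptions: the function name `bootstrap_p_value` and the example data are hypothetical, and the result is one-tailed, as on the slide.

```python
import random


def bootstrap_p_value(a, b, n_boot=10000, seed=0):
    """One-tailed bootstrap p-value for the difference in means, mean(a) - mean(b).

    Returns (observed difference D0, p-value).
    """
    rng = random.Random(seed)
    d0 = sum(a) / len(a) - sum(b) / len(b)  # observed difference D0
    pooled = list(a) + list(b)              # null hypothesis: one common population
    count = 0
    for _ in range(n_boot):
        a1 = rng.choices(pooled, k=len(a))  # pseudo-sample A1, drawn with replacement
        b1 = rng.choices(pooled, k=len(b))  # pseudo-sample B1, drawn with replacement
        d1 = sum(a1) / len(a1) - sum(b1) / len(b1)
        if d1 >= d0:                        # pseudo-difference at least as large as D0
            count += 1
    return d0, count / n_boot


if __name__ == "__main__":
    group_a = [5.1, 5.3, 4.9, 5.2, 5.0, 5.4]  # hypothetical data
    group_b = [4.8, 5.0, 4.7, 5.1, 4.6, 4.9]  # hypothetical data
    d0, p = bootstrap_p_value(group_a, group_b)
    print(f"D0 = {d0:.2f}, one-tailed p = {p:.4f}")
```

Counting pseudo-differences whose absolute value is at least |D0| would give a two-tailed version.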
86. Summary
What do my data look like?
Distribution?
• visual inspection (hist. / Q-Q plot)
• normality test
What do I want to compare?
Right statistical test?
• parametric test
• non-parametric test
• resampling statistics
87. The dark side of the
p-value
88. Statistical significance
“The effect of the drug was statistically significant.”
89. Statistical significance
“The effect of the drug was statistically significant.”
so what?
90. Statistical significance (example)
“The percentage of neurons showing cue-related activity
increased with training in the mutant mice (P<0.05) but
not in the control mice (P>0.05).”
91. Statistical significance (example)
“The percentage of neurons showing cue-related activity
increased with training in the mutant mice (P<0.05) but
not in the control mice (P>0.05).”
Training has a larger effect in the mutant
mice than in the control mice!
92. Statistical significance (example)
“The percentage of neurons showing cue-related activity
increased with training in the mutant mice (P<0.05) but
not in the control mice (P>0.05).”
Training has a larger effect in the mutant
mice than in the control mice!
93. Statistical significance (example)
“The percentage of neurons showing cue-related activity
increased with training in the mutant mice (P<0.05) but
not in the control mice (P>0.05).”
Extreme scenario:
training-induced activity barely reaches
significance in mutant mice (e.g., 0.049) and
barely fails to reach significance for control
mice (e.g., 0.051)
[Figure: bar graph of activity for control and mutant mice, two bars per group (the second labeled +), with one comparison marked *]
Does not test whether training effect for mutant mice differs
statistically from that for control mice.
94. Statistical significance (example)
“The percentage of neurons showing cue-related activity
increased with training in the mutant mice (P<0.05) but
not in the control mice (P>0.05).”
When making a comparison between two
effects, always report the statistical
significance of their difference rather than
the difference between significance levels.
Nieuwenhuis S. et al. (2011), “Erroneous analyses of interactions in neuroscience: a problem of significance”,
Nat Neuroscience, 14(9):1105–1107
95. P-values do not convey information
Mean: 16
SD: 5
Mean: 20
SD: 5
Difference = 4
p-value = 0.1090
96. P-values do not convey information
Mean: 16
SD: 5
Mean: 20
SD: 5
Difference = 4
p-value = 0.1090
0.0367
97. P-values do not convey information
Mean: 16
SD: 5
Mean: 20
SD: 5
Difference = 4
p-value = 0.1090
0.0367
0.0009
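One way to see how the same difference (4) and SD (5) can produce very different p-values is to vary the sample size. The sketch below is an illustration only: it uses a large-sample normal approximation rather than the t-test, and the sample sizes are hypothetical (the slides do not state them), so it will not reproduce the p-values shown above exactly. The function name `approx_two_sided_p` is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def approx_two_sided_p(diff, sd, n_per_group):
    """Normal-approximation two-sided p-value for a difference in means between
    two groups of equal size and equal SD (NOT an exact t-test)."""
    standard_error = sd * sqrt(2.0 / n_per_group)  # SE of the difference in means
    z = abs(diff) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(z))


if __name__ == "__main__":
    # Same difference and SD as on the slide; hypothetical sample sizes.
    for n in (10, 20, 50):
        print(f"n = {n:3d} per group: p ~ {approx_two_sided_p(4.0, 5.0, n):.4f}")
```

The effect is identical in every row; only the amount of data changes, and the p-value shrinks with it.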
98. P-values do not convey information
Fact: Most applied scientists use p-values as a measure of evidence
and of the size of the effect
• The probability of hypotheses depends on much more than just the p-value.
• This topic has renewed importance with the advent of the massive multiple
testing often seen in genomics studies
[Figure: “Manhattan plot”; y-axis: −log10(P), 0 to 8; x-axis labeled 1 to 20]
Ioannidis JP, (2005) PLoS Med 2(8):e124
99. Report effect size and CIs instead
100. P-value is a function of the sample size
Measured Effect Size:
difference = 0.018 mV
[Figure: example traces for Control and Atropine (scale bars: 0.5 mV, 100 ms); amplitude (mV), 0 to 0.4, for control (n=6777) and atropine (n=5272)]
Hentschke, H. et al. (2011). "Computation of measures of effect size for neuroscience data sets". Eur J Neurosci. 34(12):1887–94
101. P-value is a function of the sample size
Measured Effect Size:
difference = 0.018 mV
p = 10^-5
[Figure: example traces for Control and Atropine (scale bars: 0.5 mV, 100 ms); amplitude (mV), 0 to 0.4, for control (n=6777) and atropine (n=5272)]
Hentschke, H. et al. (2011). "Computation of measures of effect size for neuroscience data sets". Eur J Neurosci. 34(12):1887–94
102. P-value is a function of the sample size
[Figure, top panel: P (t-test) on a log scale (10^0 to 10^-4; regions labeled “not significant” and “significant”) against sample size (10^1 to 10^3). Bottom panel: Hedges' g (−0.4 to 0.4; 0.018 mV marked) against sample size (10^1 to 10^3)]
Hentschke, H. et al. (2011). "Computation of measures of effect size for neuroscience data sets". Eur J Neurosci. 34(12):1887–94
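For reference, a minimal sketch of Hedges' g, the standardized effect size named on this slide: the difference in means divided by the pooled standard deviation, multiplied by the commonly used approximate small-sample correction 1 − 3/(4(n1 + n2) − 9). The function name and the example values are hypothetical, and this is not the toolbox from the cited paper.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(a, b):
    """Hedges' g for two independent samples (approximate bias correction)."""
    n1, n2 = len(a), len(b)
    # Pooled standard deviation from the two sample variances.
    pooled_sd = sqrt(
        ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    )
    d = (mean(a) - mean(b)) / pooled_sd               # Cohen's d
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)  # small-sample bias correction
    return d * correction


if __name__ == "__main__":
    print(hedges_g([2, 4, 6], [1, 3, 5]))  # hypothetical values
```

Unlike the p-value, this quantity does not systematically shrink or grow as more data are collected; it only becomes more precisely estimated.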
107. Bootstrap effect size and 95% CIs
Do the 95% confidence intervals of
the observed effect size include
zero (no difference)?
Eff. size = 0.44 [0.042, 0.853]
[Figure: samples A and B; the 250th and 9750th sorted bootstrap values mark the limits of the 95% CI]
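A sketch of how a bootstrap 95% CI for the difference between two groups could be obtained. Here each group is resampled from itself (no pooling), since the aim is to estimate the effect rather than to test a null hypothesis. The function name `bootstrap_diff_ci` and the data are hypothetical.

```python
import random


def bootstrap_diff_ci(a, b, n_boot=10000, seed=0):
    """Percentile bootstrap 95% CI for mean(a) - mean(b).

    Returns (observed difference, lower bound, upper bound).
    """
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    diffs = sorted(
        # Resample each group with replacement, within the group itself.
        sum(rng.choices(a, k=len(a))) / len(a)
        - sum(rng.choices(b, k=len(b))) / len(b)
        for _ in range(n_boot)
    )
    # For n_boot = 10000: the 250th and 9750th sorted values.
    lower = diffs[max(0, round(0.025 * n_boot) - 1)]
    upper = diffs[min(n_boot - 1, round(0.975 * n_boot) - 1)]
    return observed, lower, upper


if __name__ == "__main__":
    group_a = [5.1, 5.3, 4.9, 5.2, 5.0, 5.4]  # hypothetical data
    group_b = [4.8, 5.0, 4.7, 5.1, 4.6, 4.9]  # hypothetical data
    d, lo, hi = bootstrap_diff_ci(group_a, group_b)
    print(f"{d:.2f} [{lo:.2f}, {hi:.2f}]")
```

If the interval excludes zero, the data are not compatible with "no difference" at the 95% level, and the interval also shows how large or small the effect could plausibly be.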
108. Statistical vs Biological
significance
109. Statistical vs Biological significance
“The P value reported by tests is a probabilistic significance, not a
biological one.”
“Statistical significance suggests but does not imply biological
significance.”
Krzywinski M and Altman N (2013) "Points of significance: Significance, P values and t-tests".
Nature Methods 10, 1041–1042
110. Statistical vs Biological significance
Statistical significance has a meaning in a specific context
No change
Small change
Large change
Biological consequences?
111. Statistical vs Biological significance
Stomatogastric ganglion
“Good enough” solutions
[Figure: cells labeled AB, PD, LP, PY, with LP 1 and LP 2; conductances at +15 mV (µS/nF), 0 to 0.60, for Kd, KCa and A-type; mRNA copy number, 0 to 1,600, for shab, BKKC and shal]
Schulz D.J. et al. (2006) "Variable channel expression in identified single and electrically coupled neurons
in different animals". Nat Neurosci. 9: 356–362
112. Statistical vs Biological significance
Madhvani R.V. et al. (2011) "Shaping a new Ca2+ conductance to suppress early afterdepolarizations in
cardiac myocytes". J Physiol 589(Pt 24):6081–92
113. Statistical vs Biological significance
Breast cancer study
Difference in cancer returning between control vs
low-fat diet groups.
Authors’ conclusions:
People with low-fat diets had a 25% lower chance of cancer returning
114. Statistical vs Biological significance
Breast cancer study
Difference in cancer returning between control vs
low-fat diet groups.
Authors’ conclusions:
People with low-fat diets had a 25% lower chance of cancer returning
Actual return rates:
• control: 12.4%
• low-fat diet: 9.8%
Difference: 2.6% (absolute); 2.6 / 9.8 = 26.5% (relative)
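The arithmetic behind the absolute and relative figures on this slide, as a short sketch. The two rates come from the slide; the second relative figure (using the control group as baseline) is an addition for comparison and does not appear in the talk.

```python
control_rate = 12.4  # % of cancers returning, control group (from the slide)
low_fat_rate = 9.8   # % of cancers returning, low-fat diet group (from the slide)

# Absolute difference, in percentage points.
absolute_difference = control_rate - low_fat_rate

# Relative difference as computed on the slide (baseline: low-fat group).
relative_to_low_fat = 100 * absolute_difference / low_fat_rate

# Alternative baseline (control group); an addition, not from the slide.
relative_to_control = 100 * absolute_difference / control_rate

if __name__ == "__main__":
    print(f"absolute: {absolute_difference:.1f} percentage points")
    print(f"relative (vs low-fat): {relative_to_low_fat:.1f}%")
    print(f"relative (vs control): {relative_to_control:.1f}%")
```

Either way, a headline figure above 20% corresponds to a change of only 2.6 percentage points in the actual return rate.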
115. Beware of false positives
(from the authors)
Bennett C. et al. (2010) “Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic
Salmon: An Argument For Proper Multiple Comparisons Correction”. JSUR, 2010. 1(1):1–5
116. Beware of false positives
Bennett C. et al. (2010) “Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic
Salmon: An Argument For Proper Multiple Comparisons Correction”. JSUR, 2010. 1(1):1–5
117. Beware of false positives
2012
Bennett C. et al. (2010) “Neural Correlates of Interspecies Perspective Taking in the Post-Mortem Atlantic
Salmon: An Argument For Proper Multiple Comparisons Correction”. JSUR, 2010. 1(1):1–5
118. Beware of false positives
http://xkcd.com/882/
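Why multiple comparisons produce false positives can be shown with a two-line calculation. The sketch assumes independent tests, each run at the same significance level; the function names and the choice of 20 tests are illustrative assumptions.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among n_tests independent
    tests when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    n = 20  # hypothetical number of comparisons
    print(f"P(at least one false positive) = {familywise_error_rate(n):.2f}")
    print(f"Bonferroni per-test threshold  = {bonferroni_threshold(n):.4f}")
```

Bonferroni is only one of several possible corrections, and a conservative one; it is used here because it is the simplest to state.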
119. Present
your data
120. Know your audience
121. Know your audience
Who?
Why?
What?
How?
122. Know your audience
Who? who is my audience? level of understanding? what do they already know?
Why?
What?
How?
123. Know your audience
Who? who is my audience? level of understanding? what do they already know?
Why? why am I presenting? what does my audience want to achieve?
What?
How?
124. Know your audience
Who? who is my audience? level of understanding? what do they already know?
Why? why am I presenting? what does my audience want to achieve?
What? what do I want my audience to know? which story will captivate the audience?
How?
125. Know your audience
Who? who is my audience? level of understanding? what do they already know?
Why? why am I presenting? what does my audience want to achieve?
What? what do I want my audience to know? which story will captivate the audience?
How? what medium will support the message the best? what format/layout will appeal to the audience?
126. Color blindness is a common disease
Males: one in 12 (8%) / Females: one in 200 (0.5%)
127. Color blindness is a common disease
“Anyone who needs to be convinced that making scientific
images more accessible is a worthwhile task [...]: if your next
grant or manuscript submission contains color figures, what if
some of your reviewers are color blind? Will they be able to
appreciate your figures? Considering the competition for funding
and for publication, can you afford the possibility of frustrating
your audience? The solution is at hand.”
Clarke, M. (2007). "Making figures comprehensible for colorblind readers" Nature blog
(http://blogs.nature.com/nautilus/2007/02/post_4.html)
128. Making figures for color blind people
Wong, B. (2011). "Points of view: Color blindness". Nature Methods 8, 441
129. Making figures for color blind people
http://colororacle.org/
130. Making figures for color blind people
http://colororacle.org/
131. Telling stories with data
“The Martini Glass Structure”
http://vis.stanford.edu/files/2010NarrativeInfoVis.pdf
132. Telling stories with data
“The Martini Glass Structure”
[Diagram: START, then GUIDED NARRATIVE, then EXPLORE]
http://vis.stanford.edu/files/2010NarrativeInfoVis.pdf
133. Aesthetic minimalism
Suda B. (2010). "A practical guide to Designing with Data"
134. Aesthetic minimalism
Suda B. (2010). "A practical guide to Designing with Data"
135. Aesthetic minimalism
Suda B. (2010). "A practical guide to Designing with Data"
136. Aesthetic minimalism
Suda B. (2010). "A practical guide to Designing with Data"
137. Aesthetic minimalism
Suda B. (2010). "A practical guide to Designing with Data"
138. Aesthetic minimalism
Suda B. (2010). "A practical guide to Designing with Data"
139. Common mistakes in data reporting
Welcome to the FOX “Dishonest Charts” gallery
140. Common mistakes in data reporting
141. Common mistakes in data reporting
E. Tufte’s “Lie Factor”
Make things appear to be “better” than they are
by fiddling with the scales of things
142. Common mistakes in data reporting
143. Common mistakes in data reporting
144. Common mistakes in data reporting
145. Common mistakes in data reporting
146. Common mistakes in data reporting
147. Common mistakes in data reporting
Fig 1I
“We found that relative to WT mice, the luminal
microbiota of Il10−/− mice exhibited a ~100-fold
increase in E. coli (Fig. 1I)”
Arthur et al. (2012), Science, 338(6103):120–3
148. Common mistakes in data reporting
A
B
C
D
E
149. Common mistakes in data reporting
A
B
C
D
E
20%
20%
20%
20%
20%
150. Common mistakes in data reporting
151. Common mistakes in data reporting
152. Common mistakes in data reporting
[Figure: two charts of Percent Return on Investment (0 to 40) for Group A and Group B over year1 to year4]
153. Thank you!
“The important thing is not to stop questioning.
Curiosity has its own reason for existing”
Albert Einstein