# Statistics for Non-Statisticians

Presented March 10, 2018 at Analytics>Forward. A very brief introduction to some statistical ideas for an audience of data analysts who are not statisticians.

### Statistics for Non-Statisticians

1. Statistics for Non-Statisticians, Gerald Belton
2. Statistics: the science of collecting, presenting, analyzing, and interpreting numerical data.
3. The science of statistics is based on probability.
4. Discrete distributions describe data that can only take specific values.
5. A coin toss is an example of a Bernoulli distribution. [Chart: "Bernoulli Distribution", probability of Heads and Tails]
6. A Binomial Distribution results from multiple coin tosses. [Chart: "Tossing a coin 10 times", probability by number of heads: 0: 0.001, 1: 0.0098, 2: 0.0439, 3: 0.1172, 4: 0.2051, 5: 0.2461, 6: 0.2051, 7: 0.1172, 8: 0.0439, 9: 0.0098, 10: 0.001]
7. Rolling one die can be described with a Uniform distribution. [Chart: "Rolling one die", probability 0.167 for each number rolled, 1 through 6] [Chart: "Rolling two dice", probability by number rolled: 2: 0.028, 3: 0.056, 4: 0.083, 5: 0.111, 6: 0.139, 7: 0.167, 8: 0.139, 9: 0.111, 10: 0.083, 11: 0.056, 12: 0.028]
8. Continuous distributions describe data that can take infinitely many values.
9. Rainfall amounts follow an exponential distribution.
10. The Normal Distribution is a very special continuous distribution. $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
11. Lots of real-world measures are “sort of” normally distributed.
12. Here’s an idealized normal distribution.
13. Here’s an idealized normal distribution.
14. Here’s an idealized normal distribution: 68% of values lie within one σ of the mean.
15. Here’s an idealized normal distribution: 95% of values lie within 2σ of the mean, and 99.7% within 3σ.
16. The Central Limit Theorem makes other distributions “act normal.”
17. Descriptive statistics tell us about the world.
18. Visualizations quickly convey information.
19. Census Map of NC
20. Florence Nightingale
21. Florence Nightingale
22. Minard’s Map
23. Numerical descriptions provide more detail.
24. Location: Mean, Median, Mode
25. Spread: Variance, Std Dev
26. Five-number summary: `summary(GaltonFathers$father)` returns Min. 62.00, 1st Qu. 68.00, Median 69.50, Mean 69.32, 3rd Qu. 71.00, Max. 78.50.
27. We have tools for looking at the relationship between variables.
28. Correlation
29. Not Causation!
30. Spurious Correlation Example
31. Statistical Inference uses properties of a sample to explain a population. [Diagram: a sampling technique draws a Sample from the Population; sample Statistics are used for Inference about population Parameters]
32. Sampling is extremely important.
33. Online Survey Example
34. Simple Random Sample vs. Stratified Random Sample
35. Sample Size vs Precision
36. We use data to build models of reality.
37. Confidence Intervals, Hypothesis Testing, p-value
    - Null Hypothesis: What we are hoping to disprove.
    - Alternative Hypothesis: What we hope to prove.
    - P-value: The probability of observing results at least as extreme as these, if the null hypothesis is true.
38. When we get it wrong: Type I error (α, rejecting a true null hypothesis) and Type II error (β, failing to reject a false null hypothesis)
39. Another way to remember it
40. Significance is important, but significant results might not be.
41. Significant ≠ Important
42. P-hacking: false significance. Goodhart’s Law: when a measure becomes a target, it is no longer a good measure.
43. Measuring Weirdness
44. Measuring Weirdness
45. Measuring Weirdness in two dimensions
46. Probability, Descriptive Statistics, Inference. Questions?
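
The probabilities charted on the coin-toss and dice slides (slides 6 and 7) can be reproduced with a few lines of code. The following is a minimal sketch in Python, using only the standard library; the function names `binomial_pmf` and `two_dice_pmf` are illustrative and do not come from the presentation.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def binomial_pmf(n: int, k: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_dice_pmf() -> dict[int, Fraction]:
    """Exact distribution of the sum of two fair six-sided dice,
    found by counting all 36 equally likely outcomes."""
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(count, 36) for total, count in sorted(counts.items())}


if __name__ == "__main__":
    # Ten tosses of a fair coin: probability of each number of heads.
    for heads in range(11):
        print(heads, round(binomial_pmf(10, heads, 0.5), 4))
    # Two dice: probability of each total.
    for total, prob in two_dice_pmf().items():
        print(total, round(float(prob), 3))
```

A single die is uniform because each face has probability 1/6, but the sum of two dice is not: there are six ways to roll a 7 and only one way to roll a 2 or a 12.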
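
The 68%, 95%, and 99.7% figures on the idealized normal distribution slides (slides 14 and 15) follow from the normal cumulative distribution function. This is a small sketch using Python's standard `statistics.NormalDist`; the helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist


def share_within(k: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Fraction of a normal distribution that lies within
    k standard deviations of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"within {k} sigma: {share_within(k):.1%}")
```

The result does not depend on the particular mean or standard deviation, which is why the rule can be stated once for every normal distribution.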
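
The Central Limit Theorem claim on slide 16 can be explored by simulation. The sketch below assumes a fair die as the underlying (uniform, clearly non-normal) distribution and averages 30 rolls at a time; the sample size, number of samples, seed, and the name `sample_means` are all illustrative choices rather than anything specified in the talk.

```python
import random
from statistics import fmean, pstdev


def sample_means(sample_size: int, n_samples: int, seed: int = 0) -> list[float]:
    """Roll a fair die sample_size times, record the average,
    and repeat n_samples times."""
    rng = random.Random(seed)
    return [
        fmean(rng.randint(1, 6) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    means = sample_means(sample_size=30, n_samples=5000)
    center = fmean(means)
    spread = pstdev(means)
    # Share of sample means within one standard deviation of their center.
    share = sum(abs(m - center) <= spread for m in means) / len(means)
    print(f"mean of sample means: {center:.3f}")
    print(f"std dev of sample means: {spread:.3f}")
    print(f"share within one std dev: {share:.1%}")
```

Although single rolls are uniformly distributed, the averages cluster around 3.5, and the share within one standard deviation should land close to the 68% expected of a normal distribution.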
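
The p-value definition on slide 37 can be made concrete with the coin-toss example from slide 6. This sketch assumes a null hypothesis that the coin is fair and uses one common convention for "at least as extreme": every outcome that is no more likely than the observed one. The function names and the example counts are hypothetical, chosen only for illustration.

```python
from math import comb


def binomial_pmf(n: int, k: int, p: float) -> float:
    """Probability of exactly k heads in n tosses when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_p_value(n: int, observed: int, p: float = 0.5) -> float:
    """Exact two-sided p-value under the null hypothesis P(heads) = p:
    total probability of all outcomes no more likely than the observed one."""
    observed_prob = binomial_pmf(n, observed, p)
    # Small relative tolerance so that ties survive floating-point rounding.
    threshold = observed_prob * (1 + 1e-9)
    return sum(
        prob
        for k in range(n + 1)
        if (prob := binomial_pmf(n, k, p)) <= threshold
    )


if __name__ == "__main__":
    for heads in (8, 9):
        print(f"{heads} heads in 10 tosses: p = {two_sided_p_value(10, heads):.4f}")
```

Under these assumptions, 9 heads in 10 tosses gives a p-value of 22/1024 (about 0.021), while 8 heads gives 112/1024 (about 0.109), so only the first would be called significant at the conventional 0.05 level.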
