Statistics for Non-Statisticians

Presented March 10, 2018 at Analytics>Forward. A very brief introduction to some statistical ideas for an audience of data analysts who are not statisticians.

Published in: Data & Analytics

  1. 1. Statistics for Non-Statisticians, by Gerald Belton
  2. 2. Statistics: the science of collecting, presenting, analyzing, and interpreting numerical data.
  3. 3. The science of statistics is based on probability.
  4. 4. Discrete distributions describe data that can only take specific values.
  5. 5. A coin toss is an example of a Bernoulli distribution. [Chart: "Bernoulli Distribution"; probability of Heads vs. Tails]
  6. 6. A Binomial Distribution results from multiple coin tosses. [Chart: "Tossing a coin 10 times"; x-axis: Number of heads, y-axis: Probability. Values for 0 through 10 heads: 0.001, 0.0098, 0.0439, 0.1172, 0.2051, 0.2461, 0.2051, 0.1172, 0.0439, 0.0098, 0.001]
  7. 7. Rolling one die can be described with a Uniform distribution. [Chart: "Rolling one die"; x-axis: Number Rolled, y-axis: Probability. Each of 1 through 6 has probability 0.167] [Chart: "Rolling two dice"; x-axis: Number Rolled, y-axis: Probability. Values for totals 2 through 12: 0.028, 0.056, 0.083, 0.111, 0.139, 0.167, 0.139, 0.111, 0.083, 0.056, 0.028]
  8. 8. Continuous distributions describe data that can take any value within a range.
  9. 9. Rainfall amounts are often modeled with an exponential distribution.
  10. 10. The Normal Distribution is a very special continuous distribution. $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
  11. 11. Lots of real-world measures are “sort of” normally distributed.
  12. 12. Here’s an idealized normal distribution.
  13. 13. Here’s an idealized normal distribution.
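  14. 14. Here’s an idealized normal distribution. About 68% of values fall within one σ of the mean.
  15. 15. Here’s an idealized normal distribution. About 95% of values fall within 2σ of the mean, and about 99.7% within 3σ.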
  16. 16. The Central Limit Theorem makes averages of samples from other distributions “act normal.”
  17. 17. Descriptive statistics tell us about the world.
  18. 18. Visualizations quickly convey information.
  19. 19. Census Map of NC
  20. 20. Florence Nightingale
  21. 21. Florence Nightingale
  22. 22. Minard’s Map
  23. 23. Numerical descriptions provide more detail.
  24. 24. Location: Mean, Median, Mode
  25. 25. Spread: Variance, Standard Deviation
  26. 26. Five-number summary (R's summary() also reports the mean)
        > summary(GaltonFathers$father)
           Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
          62.00   68.00   69.50   69.32   71.00   78.50
  27. 27. We have tools for looking at the relationship between variables.
  28. 28. Correlation
  29. 29. Not Causation!
  30. 30. Spurious Correlation Example
  31. 31. Statistical Inference uses properties of a sample to explain a population. [Diagram labels: Population (Parameters), Sample (Statistics), Sampling Technique, Inference]
  32. 32. Sampling is extremely important.
  33. 33. Online Survey Example
  34. 34. Simple Random Sample vs. Stratified Random Sample
  35. 35. Sample Size vs Precision
  36. 36. We use data to build models of reality.
  37. 37. Confidence Intervals, Hypothesis Testing, p-value
        • Null Hypothesis: What we are hoping to disprove.
        • Alternative Hypothesis: What we hope to prove.
        • P-value: The probability of observing results at least as extreme as these, if the null hypothesis is true.
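  38. 38. When we get it wrong: α is the probability of a Type I error (false positive); β is the probability of a Type II error (false negative).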
  39. 39. Another way to remember it
  40. 40. Significance is important, but significant results might not be.
  41. 41. Significant <> Important
  42. 42. P-hacking: false significance. Goodhart’s Law: when a measure becomes a target, it is no longer a good measure.
  43. 43. Measuring Weirdness
  44. 44. Measuring Weirdness
  45. 45. Measuring Weirdness in two dimensions
  46. 46. Probability, Descriptive Statistics, Inference. Questions?
  47. 47. Contact me: email: gerald.belton@gmail.com; website: http://www.geraldbelton.com; LinkedIn: https://www.linkedin.com/in/beltongerald/
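
The probabilities charted on slides 6 and 7, and the 68% / 95% / 99.7% labels on slides 14 and 15, can be recomputed from first principles. The following is a minimal sketch using only the Python standard library, not code from the talk (the deck's one snippet, on slide 26, is R). The function names `binomial_pmf`, `two_dice_pmf`, and `normal_coverage` are hypothetical helpers introduced here for illustration, and a fair coin and fair six-sided dice are assumed.

```python
from fractions import Fraction
from itertools import product
from math import comb, erf, sqrt


def binomial_pmf(n, p=0.5):
    """Return [P(k successes) for k = 0..n] over n independent trials.

    Assumes each trial succeeds with the same probability p
    (p = 0.5 models a fair coin, which is an assumption here).
    """
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def two_dice_pmf():
    """Return {total: probability} for the sum of two fair six-sided dice."""
    counts = {}
    # Enumerate all 36 equally likely (first die, second die) outcomes.
    for a, b in product(range(1, 7), repeat=2):
        counts[a + b] = counts.get(a + b, 0) + 1
    return {total: Fraction(c, 36) for total, c in sorted(counts.items())}


def normal_coverage(k):
    """Probability that a normally distributed value lies within
    k standard deviations of its mean."""
    return erf(k / sqrt(2))


coin = binomial_pmf(10)                                  # slide 6
dice = two_dice_pmf()                                    # slide 7
coverage = {k: normal_coverage(k) for k in (1, 2, 3)}    # slides 14-15

if __name__ == "__main__":
    print([round(x, 4) for x in coin])
    print({t: round(float(pr), 3) for t, pr in dice.items()})
    print({k: round(c, 4) for k, c in coverage.items()})
```

Rounded to the precision used on the slides, the binomial and two-dice values should agree with the chart labels, and the three coverage values correspond to the 68%, 95%, and 99.7% bands. Exact fractions are used for the dice so that the probabilities sum to exactly 1.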