Basics in Epidemiology & Biostatistics
Hashem Alhashemi MD, MPH, FRCPC
Assistant Professor, KSAU-HS
Basics in Epidemiology & Biostatistics 2 RSS6 2014

  1. Basics in Epidemiology & Biostatistics. Hashem Alhashemi MD, MPH, FRCPC, Assistant Professor, KSAU-HS
  2. Parametric data: • Large samples (> 30). • Normally distributed. • Descriptive statistics: range, mean, SD. Non-parametric data: • For small samples and variables that are not normally distributed. • No basic assumptions (distribution free). • Descriptive statistics: range, rank, median, and the interquartile range (the middle 50% = Q3 − Q1). • The median is the middle number in a ranked list of numbers (regardless of its frequency).
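The non-parametric summaries named on slide 2 can be sketched in a few lines. The data below are hypothetical, and the quartile rule (`method="inclusive"`) is an assumption: several quartile conventions exist and they give slightly different Q1 and Q3 on small samples.

```python
import statistics

def describe_nonparametric(values):
    """Return the range, median, quartiles and IQR of a list of numbers."""
    ranked = sorted(values)  # the median and quartiles work on ranked data
    # Assumption: "inclusive" quartiles; other conventions shift Q1/Q3 slightly.
    q1, _, q3 = statistics.quantiles(ranked, n=4, method="inclusive")
    return {
        "range": ranked[-1] - ranked[0],
        "median": statistics.median(ranked),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,  # the middle 50% of the data
    }

# Hypothetical, skewed data (for example, lengths of hospital stay in days).
stays = [2, 3, 3, 4, 5, 6, 40]
summary = describe_nonparametric(stays)
print(summary)
print("mean:", statistics.mean(stays))  # pulled upward by the single value 40
```

The median (4) and IQR (2.5) barely notice the extreme value, while the mean (9) does, which is why these summaries suit data that are not normally distributed.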
  3. Non-parametric data
  4. The Mean: • It sums all the values (a great numerical summary). • But it is affected by extreme values, so it is not a good summary if your data are not normal (symmetrical bell shape). • The differences of the data above and below the mean sum to 0. Arabic proverb: "Love of extremes is excess; the best of things is the middle."
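Both properties of the mean on slide 4 can be checked numerically. This is a small sketch; the value 70 is a hypothetical extreme observation swapped in for illustration.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7]
with_extreme = values[:-1] + [70]  # hypothetical: replace 7 with an extreme value

mean_plain = statistics.mean(values)            # 28 / 7 = 4
mean_extreme = statistics.mean(with_extreme)    # 91 / 7 = 13
median_extreme = statistics.median(with_extreme)  # still 4

# Differences above and below the mean cancel out exactly.
deviation_sum = sum(v - mean_plain for v in values)

print(mean_plain, mean_extreme, median_extreme, deviation_sum)
```

One extreme value moves the mean from 4 to 13 while the median stays at 4, and the deviations from the mean sum to 0.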
  5. Standard Deviation: the average of the differences from the mean (squared: the sum of squares, SS). Sample set: 1, 2, 3, 4, 5, 6, 7. X = 28/7 = 4. Number of differences = 6 (n − 1). The standard deviation is the unit of deviation of the data from the mean.
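The slide's sample set can be worked through step by step. The sketch below assumes the usual sample formula, S = √(SS / (n − 1)), and cross-checks the result against the standard library.

```python
import math
import statistics

data = [1, 2, 3, 4, 5, 6, 7]
n = len(data)

mean = sum(data) / n                               # 28 / 7 = 4.0
sum_of_squares = sum((x - mean) ** 2 for x in data)  # 9+4+1+0+1+4+9 = 28.0
variance = sum_of_squares / (n - 1)                # 28 / 6, about 4.67
sd = math.sqrt(variance)                           # about 2.16

print(f"mean={mean}, SS={sum_of_squares}, variance={variance:.3f}, SD={sd:.3f}")

# Cross-check against the library's sample standard deviation.
assert math.isclose(sd, statistics.stdev(data))
```

Dividing by n − 1 rather than n reflects the "number of differences = 6" on the slide.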
  6. Differences?
  7. Similar: within ±1𝛔. Slightly different: within ±2𝛔. Very different: beyond ±2𝛔 (0.02). Extremely different: beyond ±3𝛔 (0.001).
  8. Z distribution (parametric data, population): • The Z distribution is a hypothetical population (model) with a 𝛍 of 0 and a 𝛔 of 1. • Six 𝛔 (±3𝛔) make up 0.997 of the area under the curve.
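The areas quoted for the Z distribution can be reproduced with the standard library's `NormalDist`. This is a sketch; the helper name `area_within` is introduced here for illustration. The figures 0.02 and 0.001 on slide 7 match the area in one tail beyond 2𝛔 and 3𝛔.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # the Z distribution: mean 0, SD 1

def area_within(k):
    """Area under the curve between -k and +k standard deviations."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    one_tail = 1 - z.cdf(k)  # area beyond +k SD on one side only
    print(f"within ±{k}σ: {area_within(k):.3f}   one tail beyond: {one_tail:.4f}")
# within ±1σ is about 0.683, ±2σ about 0.954, ±3σ about 0.997
```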
  9. • God knows everything. • Does not need to take samples. • Commits no mistakes.
  10. Central Limit Theorem: • The mean of all possible sample means will be approximately equal to the mean of the population. • The distribution of all possible sample means will be approximately normal when the samples are large enough. • If you limit your prediction to the center, you will be OK (averages are normally distributed). Carl Friedrich Gauss (1777–1855). Arabic proverb: "Love of extremes is excess; the best of things is the middle."
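A small simulation can illustrate the theorem. It is a sketch under stated assumptions: the population is a hypothetical, strongly skewed exponential distribution with mean 1 and SD 1, and it draws many random samples rather than all possible samples.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN = 1.0     # mean of the hypothetical exponential population
POP_SD = 1.0       # SD of the same population
SAMPLE_SIZE = 30
N_SAMPLES = 5000

def sample_mean(n):
    """Draw one sample of size n from the skewed population and average it."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(N_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = POP_SD / math.sqrt(SAMPLE_SIZE)  # about 0.183

print(f"mean of sample means: {mean_of_means:.3f} (population mean {POP_MEAN})")
print(f"SD of sample means:   {sd_of_means:.3f} (sigma / sqrt(n) = {expected_se:.3f})")
```

Even though the individual values are skewed, the sample means cluster around the population mean, and their spread is close to 𝛔 / √n.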
  11. t distribution (parametric data, sample, sampling distribution): • The t distribution is a hypothetical population (model) with a 𝛍 of 0 and a shape set by the degrees of freedom (n − 1). • For large n it approaches the Z distribution (𝛔 of 1), where six 𝛔 (±3𝛔) make up 0.997 of the area under the curve; for small n the tails are heavier.
  12. Similar: within ±1 SE. Slightly different: within ±2 SE. Very different: beyond ±2 SE (0.02). Extremely different: beyond ±3 SE (0.001).
  13. Standard Error: SE is the unit for error in estimating the population mean. SE is the unit for deviation of all possible sample means from the population mean. SE is the unit for the average difference of all possible sample means from the population mean. SE = S / √n (√n because S is the square root of the variance).
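As a worked example, the standard error can be computed for the sample set used on the standard deviation slide (1 to 7). The sketch assumes the usual formula SE = S / √n; the function name `standard_error` is introduced here for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

data = [1, 2, 3, 4, 5, 6, 7]
s = statistics.stdev(data)   # about 2.16
se = standard_error(data)    # about 0.816

print(f"S = {s:.3f}, n = {len(data)}, SE = {se:.3f}")
```

S describes how far individual values sit from the sample mean; SE is smaller because it describes how far a sample mean is expected to sit from the population mean.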
  14. The Average Idea. X (mean): the average. S (standard deviation): a unit of deviation of the data from the sample mean; a unit for the average of differences of the data from the sample mean. SE (standard error): a unit for error in estimating the population mean; a unit for deviation of all possible sample means from the population mean; a unit for the average of differences of all possible sample means from the population mean.
  15. Biostatistics: a fancy world made of %s and averages.
  16. Sample size Estimate Calculate Calculate (SE) ? ? Estimate
  17. 95% Confidence Interval (C.I.). General formula: estimate ± margin of error, where the margin of error is 2 SE (SE = standard error). Estimates and the parameters they target: X for μ, P for π, OR for Ω, rate for λ.
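The general formula on slide 17 can be sketched as one small function applied to a mean and to a proportion. The multiplier 2 follows the slide (it rounds 1.96). The proportion data (40 of 100) are hypothetical, and the SE formula for a proportion, √(P(1 − P)/n), is the usual large-sample one and is an assumption here, since the slide does not spell it out.

```python
import math
import statistics

def confidence_interval(estimate, se, multiplier=2):
    """Return (lower, upper) as estimate ± multiplier × SE."""
    margin = multiplier * se
    return estimate - margin, estimate + margin

# Mean: the sample set 1..7 from the standard deviation slide.
data = [1, 2, 3, 4, 5, 6, 7]
mean = statistics.mean(data)
se_mean = statistics.stdev(data) / math.sqrt(len(data))
ci_mean = confidence_interval(mean, se_mean)

# Proportion: hypothetical 40 events among 100 subjects.
events, n = 40, 100
p = events / n
se_p = math.sqrt(p * (1 - p) / n)  # assumed large-sample SE of a proportion
ci_p = confidence_interval(p, se_p)

print(f"mean {mean}: 95% C.I. about {ci_mean[0]:.2f} to {ci_mean[1]:.2f}")
print(f"proportion {p}: 95% C.I. about {ci_p[0]:.3f} to {ci_p[1]:.3f}")
```

With only 7 observations, a t-based multiplier would be noticeably larger than 2, so the interval for the mean is shown only to illustrate the arithmetic.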
  18. SD vs SE: • Standard deviation calculates the variability of the data within a sample in relation to the sample mean. So, it helps identify the % of data above and below a certain measurement. • Standard error estimates the variability of all possible sample means in relation to the population mean. So, it helps identify the degree of error in your estimation.
  19. Biostatistics: a fancy world made of averages and %s.
  20. Difference between studying populations and samples. Population (descriptive): • Calculate the mean μ (measures). • Calculate the proportion 𝛑 (counts). • Calculate the standard deviation σ. • Calculate parameters: μ and 𝛑. Sample (inferential): • Estimate the sample size. • Calculate the mean X. • Calculate the standard deviation S. • Calculate the standard error SE and the 95% C.I. (confidence interval). • Calculate statistics. • Estimate parameters: μ and 𝛑.
  21. END
  22. Parametric data: • Large samples (> 30). • Normally distributed. • Descriptive statistics: range, mean, SD. Non-parametric data: • For small samples and variables that are not normally distributed. • No basic assumptions (distribution free). • Descriptive statistics: range, rank, median, and the interquartile range (the middle 50% = Q3 − Q1). • The median is the middle number in a ranked list of numbers.
  23. Non-parametric data: • For small samples and variables that are not normally distributed. • No basic assumptions (distribution free). • Descriptive statistics: range, rank, median, and the interquartile range (the middle 50% = Q3 − Q1).
  24. Quantitative data. Discrete (count): binomial (binary): sex; multinomial: 1. categorical: race, 2. ordinal: education, 3. numerical: number of pregnancies/residents. Continuous (measure): ratio (real zero) / interval (no zero): temperature/BP.
  26. Differences?
  27. Objectives: • Definitions. • Types of data. • Data summaries. • Mean X, standard deviation S. • Standard error SE, confidence interval (C.I.) of μ.
  28. Quantitative data. Discrete (non-parametric data): dichotomous (binary): sex; multichotomous: 1. no order: race, 2. ordinal: education; numerical: number of pregnancies/residents. Continuous: ratio (real zero) / interval (no zero): temperature/BP.
  29. Types of data (quantitative data). Discrete: categorical: 1. dichotomous: sex, 2. multichotomous: race, education (non-parametric data); numerical (count): number of pregnancies/residents (parametric data). Continuous: ratio (real zero) / interval (no zero): temperature/BP (parametric data).
  30. Summaries. Measures (any value): visual: histogram; numerical: X, 𝛍, s, 𝛔. Counts (categories): visual: bar and pie charts; numerical: P, 𝛑, s, 𝛔.
  31. Data Presentation: %
  32. Normality & Approximation to Normality: Why?
  33. Approximation to Normality: • If choices are equally likely to happen • and are repeated a large number of times, • the result will look normal, • whether it is a coin or a die (dichotomous or multichotomous).
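The die case can be simulated directly. This is a sketch with hypothetical settings: each trial sums 10 rolls of a fair die, repeated 20,000 times. For a normal curve about 68% of values fall within 1 SD of the mean; the sum of dice is discrete, so the simulated share is only close to that figure.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

ROLLS_PER_TRIAL = 10   # equally likely choices (faces 1..6), repeated
TRIALS = 20000         # a large number of repetitions

totals = [
    sum(random.randint(1, 6) for _ in range(ROLLS_PER_TRIAL))
    for _ in range(TRIALS)
]

mean_total = statistics.fmean(totals)   # theory: 10 × 3.5 = 35
sd_total = statistics.stdev(totals)     # theory: sqrt(10 × 35/12), about 5.4

# Share of trials that land within one SD of the mean.
within_one_sd = sum(abs(t - mean_total) <= sd_total for t in totals) / TRIALS

print(f"mean={mean_total:.2f}, SD={sd_total:.2f}, within ±1 SD: {within_one_sd:.2%}")
```

Replacing the die with a coin (`random.randint(0, 1)`) gives the dichotomous version of the same experiment.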
  34. Normality & Approximation to Normality: Clinical Relevance?
  35. Choices equally likely to happen, i.e. the probability of the outcome of interest is unknown (research ethics). Repeated a large number of times, i.e. a large sample size. The normality assumption helps us predict the probability of our outcome.
  36. The Bell / Normal curve. Standard deviation (SD): sample curve. True error (SE): population curve. • It was first discovered by Abraham de Moivre in 1733. • The one who was able to reproduce it and identify it as the normal distribution (error curve) was Gauss in 1809.
  37. De Moivre had hoped for a chair of mathematics, but foreigners were at a disadvantage, so although he was free from religious discrimination, he still suffered discrimination as a Frenchman in England. Born 1667 in Champagne, France. Died 1754 in London, England.
  38. Largest Value - Smallest Value SD estimate