# Descriptive statistics

Published in: Health & Medicine

### Transcript

• 1. Prepared by: Dr. Anees AlSaadi, Community Medicine Department, December 2013.
• 2. Data Summarization. Descriptive statistics: • Continuous data description: – Measures of data center: • Mean, median and mode (definitions). • Practical exercise. – Measures of data variability: • Standard deviation (variance), range. • Practical exercise. – Normal distribution curve.
• 3. Measures of Center: • Synonyms: – Measures of central tendency. – Measures of location. • They identify the center of the distribution of observations, that is, the middle, average, or typical value.
• 4. Measures of Center: Mean: the arithmetic average of all observations. Median: the middle observation of the ordered data. Mode: the most frequently observed value(s).
• 5. Measures of Center: Sample Mean: • The most commonly used measure of location. • Also called the arithmetic average.
• 6. Measures of Center: How to Calculate Sample Mean: • Add up the data, then divide by the sample size (n). • n is the number of observations.
• 7. Measures of Center: How to Calculate Sample Mean. Example: these are systolic blood pressures (mmHg): 120, 80, 90, 110, 95. X1 = 120, X2 = 80, …, X5 = 95. The mean is calculated by adding up the five values and dividing by 5.
• 8. Measures of Center: How to Calculate Sample Mean. $\bar{X} = \frac{120 + 80 + 90 + 110 + 95}{5} = 99$ mmHg.
• 9. Measures of Center: Sample Mean. Example: calculate the sample mean for the number of open-heart surgeries done by 7 cardiothoracic surgeons in Hamad hospital during the last month, where Dr. A did 4, Dr. B 3, Dr. C 6, Dr. D 5, Dr. E 4, Dr. F 3 and Dr. G 5. Answer: 30/7 ≈ 4.29 surgeries.
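
The two mean calculations above can be sketched in Python. This is a minimal illustration rather than part of the original slides; the function name `sample_mean` is an assumption chosen for this example.

```python
def sample_mean(values):
    """Add up the observations and divide by the sample size n."""
    n = len(values)  # n is the number of observations
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / n


# Data taken from the slides.
systolic_bp = [120, 80, 90, 110, 95]  # systolic blood pressure, mmHg
surgeries = [4, 3, 6, 5, 4, 3, 5]     # open-heart surgeries per surgeon

print(sample_mean(systolic_bp))          # 495 / 5 = 99.0
print(round(sample_mean(surgeries), 2))  # 30 / 7, rounded to two decimals
```
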
• 10. Measures of Center: Sample Mean. Feature: the most important feature of the mean is its sensitivity to extreme values (outliers).
• 11. Measures of Center: Sample Median. The middle number of the ordered data, also called the 50th percentile.
• 12. Measures of Center: How to Identify Sample Median • Order the observations from smallest to largest. • Find the observation in the middle of the ordered data; this is the median. • With an even number of observations, the median is the average of the two middle observations.
• 13. Measures of Center: How to Identify Sample Median. Example: identify the median for the following set of observations: 90, 80, 200, 95, 110. Answer: 95.
• 14. Measures of Center: How to Identify Sample Median. Example: identify the median for the following set of observations: 90, 80, 120, 95, 125, 110. Position: (n+1)/2 = 3.5, so the median is the average of the 3rd and 4th ordered values: (95 + 110)/2 = 102.5.
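
The median procedure (order the data, then take the middle observation, or the average of the two middle observations when n is even) can be sketched as follows. The function name `sample_median` is an assumption for illustration; the data are the two example sets from the slides.

```python
def sample_median(values):
    """Return the middle observation of the ordered data."""
    ordered = sorted(values)  # smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd n: a single middle observation
    # even n: average the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(sample_median([90, 80, 200, 95, 110]))      # 95
print(sample_median([90, 80, 120, 95, 125, 110])) # (95 + 110) / 2 = 102.5
```
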
• 15. Measures of Center: Sample Median Features: • Not affected by extreme values. • Less efficient for summarizing the data statistically.
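
A short sketch of why the mean is sensitive to extreme values while the median is not. It uses Python's standard `statistics` module; the first dataset is the one from slide 13, and the second is a hypothetical variant (the value 100 is an assumption, not from the slides) in which the extreme value 200 is replaced.

```python
from statistics import mean, median

with_outlier = [90, 80, 200, 95, 110]     # slide 13 data; 200 is the extreme value
without_outlier = [90, 80, 100, 95, 110]  # hypothetical: 200 replaced by 100

for label, data in (("with outlier", with_outlier),
                    ("without outlier", without_outlier)):
    # The mean shifts with the extreme value; the median stays put.
    print(label, "mean =", mean(data), "median =", median(data))
```

The mean moves from 95 to 115 when the extreme value is present, while the median is 95 in both cases.
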
• 16. Measures of Center: Sample Mode • The most commonly occurring value in the dataset. • Not all datasets have a mode. • Unimodal distribution: one mode in the dataset. • Bimodal distribution: two modes in the dataset.
• 17. Measures of Center: How to Identify Sample Mode • Arrange the data from the smallest to the largest value. • The most frequently repeated value is the sample mode.
• 18. Measures of Center: How to Identify Sample Mode. Example: {15, 33, 65, 32, 78, 94, 33, 110, 11, 46, 33}; ordered: {11, 15, 32, 33, 33, 33, 46, 65, 78, 94, 110}. The mode is 33.
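
Finding the mode amounts to counting how often each value occurs. The sketch below uses `collections.Counter`; the function name `sample_modes` is an assumption, as is the convention of returning an empty list when no value repeats (one way to express "not all datasets have a mode").

```python
from collections import Counter


def sample_modes(values):
    """Return all most frequently occurring values, smallest first."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # assumption: no repeated value means no mode
    return sorted(v for v, c in counts.items() if c == highest)


data = [15, 33, 65, 32, 78, 94, 33, 110, 11, 46, 33]  # from the slide
print(sample_modes(data))             # [33] -> unimodal
print(sample_modes([1, 1, 2, 2, 3]))  # [1, 2] -> bimodal (hypothetical data)
```
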
• 19. Measures of Center: Sample Mode Features: • Not affected by extreme values. • Less efficient for summarizing the data statistically.
• 20. Practical Exercise: this dataset is the number of hysterectomies performed by female doctors in HMC: {44, 37, 86, 50, 20, 25, 28, 25, 31, 33, 85, 59, 27, 34, 36}. Find the mean, median and mode.
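
One way to check hand-worked answers to this exercise is Python's standard `statistics` module. This is a sketch for self-checking, not the slide author's solution; the variable names are assumptions. `multimode` requires Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Exercise data from the slide.
hysterectomies = [44, 37, 86, 50, 20, 25, 28, 25, 31, 33, 85, 59, 27, 34, 36]

mean_value = mean(hysterectomies)      # sum of the values divided by n = 15
median_value = median(hysterectomies)  # 8th value of the ordered data
modes = multimode(hysterectomies)      # list of the most frequent value(s)

print("mean:", round(mean_value, 2))
print("median:", median_value)
print("mode(s):", modes)
```
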
• 21. Data Summarization. Descriptive statistics: • Continuous data description: – Measures of data center: • Mean, median and mode (definitions). • Practical exercise. – Measures of data variability (dispersion): • Standard deviation (variance), range, interquartile range. • Practical exercise. – Normal distribution curve.
• 22. Measures of Data Dispersion • Data dispersion = data spread. • Measures of data dispersion: – Range. – Interquartile range. – Variance. – Standard deviation.
• 23. Measures of Data Dispersion: Range • Equal to the largest (maximum) value minus the smallest (minimum) value. • Easy to calculate, but it gives no idea about the values between the maximum and the minimum.
• 24. Measures of Data Dispersion: Range. Example: calculate the range for the following dataset: {40, 28, 42, 30, 31, 38, 100, 20, 48, 50, 51, 30}. The range is 100 - 20 = 80.
• 25. Measures of Data Dispersion: Range Feature: the range is affected by extreme values.
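
The range and its sensitivity to extreme values can be sketched in a few lines. The function name `data_range` is an assumption; the data are from slide 24, and the second call simply drops the extreme value 100 to show the effect.

```python
def data_range(values):
    """Largest (maximum) value minus smallest (minimum) value."""
    if not values:
        raise ValueError("the range of an empty sample is undefined")
    return max(values) - min(values)


data = [40, 28, 42, 30, 31, 38, 100, 20, 48, 50, 51, 30]  # from the slide
trimmed = [x for x in data if x != 100]  # same data without the extreme value

print(data_range(data))     # 100 - 20 = 80
print(data_range(trimmed))  # 51 - 20 = 31
```
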
• 26. Measures of Data Dispersion: Interquartile Range • Quartiles: the 25th, 50th and 75th percentiles of the data. • The interquartile range is the distance between the 25th and 75th percentiles.
• 27. Measures of Data Dispersion: Interquartile Range • The maximum, minimum, 1st and 3rd quartiles and the median (the five-number summary) are used to draw a box plot. [Figure: box plot labelled, from top to bottom, Max, 75th percentile, Median (50th percentile), 25th percentile, Min.]
• 28. Measures of Data Dispersion: Interquartile Range • Quartiles are numbers that divide the dataset into four quarters, with 25% of the observations in each quarter. • Q1, the lower quartile: 25% of observations lie below it and 75% above it. • Q2, the median: 50% of observations lie on each side of it. • Q3, the upper quartile: 25% of observations lie above it and 75% below it.
• 29. Measures of Data Dispersion: How to Find Interquartile Range • Arrange the data from the smallest to the largest. • Divide the data into two halves. • Define Q1 as the median of the lower half of the data. • Define Q3 as the median of the upper half of the data. • The interquartile range is Q3 - Q1.
• 30. Measures of Data Dispersion: How to Find Interquartile Range. Example: ordered data {20, 28, 30, 30, 31, 38, 40, 42, 48, 50, 51, 100}. Lower half: {20, 28, 30, 30, 31, 38}; upper half: {40, 42, 48, 50, 51, 100}. Q1 = 25th percentile = 30. Q3 = 75th percentile = 49. Interquartile range (IQR) = Q3 - Q1 = 19.
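
The median-of-halves procedure can be sketched as below, with Q1 as the median of the lower half and Q3 as the median of the upper half. The function names are assumptions, and so is the handling of odd sample sizes (the middle observation is left out of both halves); other quartile conventions exist and can give slightly different values.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n the middle observation belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


def five_number_summary(values):
    """Min, Q1, median, Q3, max: the values a box plot is drawn from."""
    q1, q2, q3 = quartiles(values)
    return min(values), q1, q2, q3, max(values)


data = [20, 28, 30, 30, 31, 38, 40, 42, 48, 50, 51, 100]  # from the slide
q1, q2, q3 = quartiles(data)
print("Q1 =", q1, "Q3 =", q3, "IQR =", q3 - q1)  # Q1 = 30, Q3 = 49, IQR = 19
print(five_number_summary(data))
```
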
• 31. Practical Exercise
• 32. Measures of Data Dispersion: Variance • The average squared deviation from the mean. • The units of measurement are those of the original data, squared. • Notation: s² (sample) or σ² (population).
• 33. Measures of Data Dispersion: Variance (formula slide; not captured in the transcript).
• 34. Measures of Data Dispersion: Standard Deviation • The square root of the variance (s or σ).
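
The formulas shown on the slides did not survive in the transcript. As an assumption, the standard textbook definitions are given here: the sample versions divide by n - 1, and the population versions divide by N.

```latex
% Sample variance and standard deviation (n observations, sample mean \bar{x})
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}

% Population variance and standard deviation (N values, population mean \mu)
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2,
\qquad
\sigma = \sqrt{\sigma^2}
```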
• 35. Practical Exercise
• 36. Measures of Data Dispersion: Standard Deviation • Best used when the mean is used as the measure of center. • A standard deviation of 0 indicates no spread: all the data have the same value. • Affected by extreme observations.
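
A minimal sketch of the sample variance and standard deviation, assuming the usual n - 1 divisor (the slides' own formula is not in the transcript). The function names are assumptions. The blood pressure data come from slide 7 and the outlier data from slide 13; the constant dataset is hypothetical.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_sd(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(values))


systolic_bp = [120, 80, 90, 110, 95]      # mmHg, from slide 7
print(sample_variance(systolic_bp))       # 1020 / 4 = 255.0 (mmHg squared)
print(round(sample_sd(systolic_bp), 2))   # back in mmHg

constant = [95, 95, 95, 95]               # hypothetical: all values equal
print(sample_sd(constant))                # 0.0 -> no spread

with_outlier = [90, 80, 200, 95, 110]     # slide 13 data
without_outlier = [90, 80, 95, 110]       # same data with 200 dropped
print(sample_sd(with_outlier) > sample_sd(without_outlier))  # True
```
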
• 37. Measures of Data Dispersion: Standard Deviation (formula slide; not captured in the transcript).
• 38. Choosing Measures of Center and Spread • If the distribution is normal or symmetrical: use the mean and standard deviation. • If the distribution is skewed or has large outliers: use the median and the range or IQR. • If the distribution is bimodal: use the mode and range, or find out whether the two modes represent two different groups and separate them.
• 39. Characteristics of Measures of Spread • Range: simple; not resistant to extreme values. • IQR: resistant to extreme values; used with the median; IQR = 0 does not mean there is no spread. • Standard deviation: not resistant to extreme values; used with the mean; good for symmetrical distributions with no outliers; a standard deviation of 0 means there is no spread.
• 40. Practical Exercise