# Descriptive Statistics with R

Published in: Education

1. Descriptive Statistics with R. 2012-10-12 @HSPH. Kazuki Yoshida, M.D., MPH-CLE student
2. Group website: http://rpubs.com/kaz_yos/useR_at_HSPH
3. Previously in this group: Introduction to R; Reading Data into R (1); Reading Data into R (2). Group website: http://rpubs.com/kaz_yos/useR_at_HSPH
4. Menu: mean and sd; median, quantiles, IQR, max, min, and range; skewness and kurtosis; smarter ways of doing these
5. Ingredients. Statistics: summary statistics for continuous data (normal data, non-normal data, normality check). Programming: vector and data frame; DATA\$VAR extraction; indexing by [row,col]; various functions: skewness(), kurtosis(), summary(), describe(), describeBy()
6. Data loaded. What’s next? http://echrblog.blogspot.com/2011/04/statistics-on-states-with-systemic-or.html
7. Descriptive Statistics http://www.ehow.com/info_8650637_descriptive-statistical-methods.html
8. Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. http://en.wikipedia.org/wiki/Descriptive_statistics
9. Open RStudio
11. Read in BONEDEN.DAT.txt. Name it bone.
12. Accessing a single variable in a data set: DATA\$VAR, where DATA is the dataset name and VAR is the variable name, e.g., mean(bone\$age)
13. vector
14. ? http://healthy-india.org/enviromentalhealth/direct_indirect2.html
15. DATA\$VAR is a vector, like a string with values attached: 1 2 3 4 5 6 7 8 or “A” “B” “C” “D” “E” “F” “G” “H”
16. DATA is a data frame: multiple vectors of the same length tied together, e.g., 1 2 3 4 5 6 7 8 tied to “A” “B” “C” “D” “E” “F” “G” “H”
17. Indexing: extraction of data from a data frame. bone[1:15, 1:12] extracts the 1st to 15th rows and the 1st to 12th columns. Put a colon in between each pair of numbers, and don’t forget the comma.
18. age vector within bone data frame
19. bone\$age: extracted as a vector
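The extraction and indexing steps on the slides above can be tried without the course data. The sketch below uses a small made-up data frame, `bone_demo`, as a stand-in for `bone`; its column names and values are assumptions for illustration, not the contents of BONEDEN.DAT.txt.

```r
# Hypothetical stand-in for the bone data frame (NOT the real BONEDEN data)
bone_demo <- data.frame(
  id  = 1:8,
  age = c(27, 42, 39, 56, 61, 48, 33, 70),
  grp = c("A", "B", "C", "D", "E", "F", "G", "H")
)

# DATA$VAR extracts one column as a vector
age_vec <- bone_demo$age
is.vector(age_vec)

# DATA[rows, cols]: the comma separates rows from columns,
# and the colon builds a sequence such as 1:3
first_rows <- bone_demo[1:3, 1:2]
dim(first_rows)

# A function can be applied directly to the extracted vector
mean_age <- mean(bone_demo$age)
mean_age
```

With the real data loaded as `bone`, the same pattern would read `bone$age` and `bone[1:15, 1:12]`.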
20. mean: mean(x, trim = 0, na.rm = FALSE)
22. sd: sd(x, na.rm = FALSE)
24. median: median(x, na.rm = FALSE)
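As a minimal sketch of how the `na.rm` and `trim` arguments in these signatures behave, the example below uses a made-up `age` vector with one missing value. The values are assumptions for illustration only.

```r
# Hypothetical ages with one missing value (not the course data)
age <- c(27, 42, 39, 56, 61, 48, 33, 70, NA)

# With the default na.rm = FALSE, a single NA makes the result NA
mean_with_na <- mean(age)

# na.rm = TRUE drops missing values before computing
mean_age   <- mean(age, na.rm = TRUE)
sd_age     <- sd(age, na.rm = TRUE)      # sample SD, n - 1 denominator
median_age <- median(age, na.rm = TRUE)

# trim = 0.125 drops 12.5% of observations from each end
# (here one value per end) before averaging
trimmed_age <- mean(age, trim = 0.125, na.rm = TRUE)

c(mean = mean_age, sd = sd_age, median = median_age, trimmed = trimmed_age)
```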
26. quantile: quantile(x, probs = seq(0, 1, 0.25), na.rm = FALSE, names = TRUE, type = 7) gives the 0th, 25th, 50th, 75th, and 100th percentiles by default
27. Your turn (adapted from Hadley Wickham): What are the 25th and 75th percentiles of age?
28. IQR: IQR(x, na.rm = FALSE, type = 7) is the 75th percentile - 25th percentile
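The relationship between `quantile()` and `IQR()` can be checked on a small example. This is a sketch on a made-up `age` vector, so the numbers are not the answer for the real `bone$age`.

```r
# Hypothetical ages (not the course data)
age <- c(27, 42, 39, 56, 61, 48, 33, 70)

# Default probs give the 0th, 25th, 50th, 75th, and 100th percentiles
five_numbers <- quantile(age)
five_numbers

# Ask for specific percentiles through probs
q25_q75 <- quantile(age, probs = c(0.25, 0.75))
q25_q75

# IQR() is the 75th percentile minus the 25th percentile
iqr_age    <- IQR(age)
iqr_manual <- unname(q25_q75[2] - q25_q75[1])
c(iqr_age, iqr_manual)
```

For the exercise on the slide, the same call would be `quantile(bone$age, probs = c(0.25, 0.75))`.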
30. max: max(..., na.rm = FALSE)
31. min: min(..., na.rm = FALSE)
33. range: range(..., na.rm = FALSE)
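A short sketch of these three functions on a made-up `age` vector (values assumed for illustration). Note that `range()` returns the minimum and maximum as a two-element vector, not their difference.

```r
# Hypothetical ages with one missing value (not the course data)
age <- c(27, 42, 39, 56, 61, 48, 33, 70, NA)

max_age   <- max(age, na.rm = TRUE)
min_age   <- min(age, na.rm = TRUE)
range_age <- range(age, na.rm = TRUE)   # c(minimum, maximum)

# The width of the range is the difference of the two elements
range_width <- diff(range_age)

max_age
min_age
range_age
range_width
```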
35. We now resort to external packages
36. Install and load e1071, psych
37. To load a package by command: library(package), with the package name inside the parentheses; the double quotes “” can be omitted
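The loading step can be sketched as follows. To keep the example runnable without an internet connection, it loads `splines`, a package that ships with R, as a stand-in; for the slides' packages, substitute `e1071` or `psych` after installing them.

```r
# One-time installation of the packages named on the slide
# (commented out here because it needs internet access):
# install.packages(c("e1071", "psych"))

# Both forms load the same package; the quotes are optional.
# splines is used only as a stand-in that is always available.
library(splines)
library("splines")

# Confirm the package is attached to the search path
splines_attached <- "package:splines" %in% search()
splines_attached
```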
38. Assessment of normality
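Skewness and kurtosis, listed on the menu slide, are the usual numeric starting point for a normality check. The sketch below computes simple moment-based versions by hand so that it runs without extra packages. The helper names `moment_skewness()` and `moment_kurtosis()` are hypothetical, and the `age` values are made up. The `skewness()` and `kurtosis()` functions in e1071 offer several estimator variants through a `type` argument, so their results may differ slightly from these.

```r
# Hypothetical helpers (not from e1071): moment-based estimators
moment_skewness <- function(x) {
  d  <- x - mean(x)
  m2 <- mean(d^2)          # second central moment
  m3 <- mean(d^3)          # third central moment
  m3 / m2^1.5
}

moment_kurtosis <- function(x) {
  d  <- x - mean(x)
  m2 <- mean(d^2)
  m4 <- mean(d^4)          # fourth central moment
  m4 / m2^2 - 3            # excess kurtosis: 0 for a normal distribution
}

# Hypothetical ages (not the course data)
age <- c(27, 42, 39, 56, 61, 48, 33, 70)
skew_age <- moment_skewness(age)
kurt_age <- moment_kurtosis(age)
c(skewness = skew_age, kurtosis = kurt_age)

# If e1071 happens to be installed, its versions can be compared
if (requireNamespace("e1071", quietly = TRUE)) {
  print(e1071::skewness(age))
  print(e1071::kurtosis(age))
}
```

Values near 0 for both measures are consistent with a roughly normal shape; clearly positive or negative skewness points to a long right or left tail.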