Categorical data with R

Published in: Education
  • Categorical data with R

    1. Tabulating data with R. 2012-10-22 @HSPH. Kazuki Yoshida, M.D., MPH-CLE student. FREEDOM TO KNOW
    2. Group website: http://rpubs.com/kaz_yos/useR_at_HSPH
    3. Previously in this group: Introduction; Reading Data into R (1); Reading Data into R (2); Descriptive statistics. Group website: http://rpubs.com/kaz_yos/useR_at_HSPH
    4. Menu: Categorical data; How to tabulate; Get sums and proportions
    5. Ingredients. Epi/Stat: Tables; Cross tables; Stratified tables; Creating categorical variables. Programming: data(); table(), summary(); prop.table(); addmargins(); xtabs(), ftable(); gmodels::CrossTable(); epiR::epi.2by2()
    6. Categorical data: country, race, gender, ethnicity, cancer stage, education level, disease severity
    7. Open RStudio
    8. Install and load vcd and epiR
    9. We will use the “Arthritis” dataset in the vcd package. Load the built-in dataset named “Arthritis”: data(Arthritis)
    10. Indexing: extraction of data from a data frame. Arthritis[1:17, ] extracts the 1st to 17th rows (colon in between) and shows all columns (don’t forget the comma)
    11. Treatment vector in the Arthritis data frame. Five vectors of the same length tied together
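
The indexing rule on the slides (rows before the comma, columns after it) can be tried without installing anything. The block below is a minimal sketch that assumes a small hypothetical data frame called toy; it is a stand-in for illustration, not the Arthritis data.

```r
# Hypothetical stand-in for Arthritis: three vectors of the same length tied together
toy <- data.frame(
  ID        = 1:6,
  Treatment = factor(c("Placebo", "Placebo", "Placebo",
                       "Treated", "Treated", "Treated")),
  Age       = c(27, 46, 59, 32, 63, 55)
)

first_rows <- toy[1:3, ]             # rows 1 to 3, all columns (note the comma)
age_column <- toy[, "Age"]           # all rows, one column, returned as a vector
two_cols   <- toy[, c("ID", "Age")]  # all rows, two columns, still a data frame

print(first_rows)
print(age_column)
```

Leaving the row position empty selects every row, and leaving the column position empty selects every column, which is why the comma cannot be dropped.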
    12. Summary of the whole dataset. summary(): summary(Arthritis)
    13. Your turn (adopted from Hadley Wickham): summary(Arthritis)
    14. Accessing a single variable in a data set: Arthritis$Treatment (dataset name, then $, then variable name)
    15. Arthritis$Treatment: factor levels (categories)
    16. Check the factor levels of a vector. levels(): levels(Arthritis$Treatment)
    17. Your turn (adopted from Hadley Wickham): Arthritis$Improved; levels(Arthritis$Improved)
    18. This is an ordered factor
    19. factor
    20. A factor is a categorical variable in R
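
To make the difference between a plain factor and an ordered factor concrete, here is a short sketch. The two vectors treatment and improved are hypothetical and only borrow the level names shown on the slides.

```r
# Plain (unordered) factor: levels default to alphabetical order
treatment <- factor(c("Placebo", "Treated", "Treated", "Placebo"))

# Ordered factor: the order of the levels is stated explicitly
improved <- factor(c("None", "Marked", "Some", "None"),
                   levels = c("None", "Some", "Marked"),
                   ordered = TRUE)

print(levels(treatment))      # categories of the unordered factor
print(levels(improved))       # categories, lowest to highest
print(is.ordered(treatment))
print(is.ordered(improved))

# Comparisons such as "less than" are only meaningful for ordered factors
below_marked <- improved < "Marked"
print(below_marked)
```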
    21. Create a single-variable summary table. table(): table(Arthritis$Improved)
    22. Your turn (adopted from Hadley Wickham): table(Arthritis$Improved)
    23. Convert tables to proportions. prop.table(): prop.table(table.object)
    24. Your turn (adopted from Hadley Wickham): Improved.cat <- table(Arthritis$Improved); prop.table(Improved.cat)
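
The same two steps, counting and then converting to proportions, are sketched below on a hypothetical vector named improved, so the arithmetic can be checked by hand.

```r
# Hypothetical outcome vector with ten observations
improved <- factor(c("None", "None", "Some", "None", "Marked",
                     "Marked", "Some", "Marked", "None", "Marked"),
                   levels = c("None", "Some", "Marked"))

improved_counts <- table(improved)             # counts per category
improved_props  <- prop.table(improved_counts) # counts divided by the total

print(improved_counts)
print(improved_props)
```

With 4, 2, and 4 observations out of 10, the proportions are 0.4, 0.2, and 0.4, and they sum to 1.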
    25. Create cross tables. xtabs(): xtabs(formula = ~ , data = Arthritis)
    26. Your turn (adopted from Hadley Wickham): xtabs(~ Treatment + Improved, Arthritis); xtabs(~ Treatment + Improved + Sex, Arthritis)
    27. 1st dimension, 2nd dimension, 3rd dimension
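
The order of the variables in the xtabs() formula decides which one becomes the 1st, 2nd, and 3rd dimension. The sketch below shows this on a hypothetical ten-row data frame named toy, which reuses the Arthritis column names but is not the real dataset.

```r
# Hypothetical data frame that mimics three Arthritis columns
toy <- data.frame(
  Treatment = factor(rep(c("Placebo", "Treated"), each = 5)),
  Sex       = factor(c("Female", "Female", "Male", "Male", "Female",
                       "Female", "Male", "Female", "Male", "Female")),
  Improved  = factor(c("None", "None", "Some", "None", "Marked",
                       "Marked", "Some", "Marked", "None", "Marked"),
                     levels = c("None", "Some", "Marked"))
)

# Two-way table: Treatment is dimension 1 (rows), Improved is dimension 2 (columns)
tab2way <- xtabs(~ Treatment + Improved, data = toy)

# Three-way table: Sex becomes dimension 3, printed as one slice per level
tab3way <- xtabs(~ Treatment + Improved + Sex, data = toy)

print(tab2way)
print(dim(tab3way))              # number of levels along each dimension
print(names(dimnames(tab3way)))  # variable attached to each dimension
```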
    28. Add margins to tables. addmargins(): addmargins(table.object)
    29. Your turn (adopted from Hadley Wickham): tab1 <- xtabs(~ Treatment + Improved, Arthritis); addmargins(tab1)
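
A small sketch of what addmargins() adds, using a hypothetical 2x3 table typed in by hand (the counts are made up for illustration and are not the Arthritis counts):

```r
# Hypothetical 2x3 table of counts
tab1 <- as.table(matrix(c(3, 1, 1,
                          1, 1, 3),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(Treatment = c("Placebo", "Treated"),
                                        Improved  = c("None", "Some", "Marked"))))

with_all_margins <- addmargins(tab1)     # adds a "Sum" row and a "Sum" column
with_sum_row     <- addmargins(tab1, 1)  # sums over dimension 1: adds a "Sum" row only

print(with_all_margins)
print(with_sum_row)
```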
    30. Create flat tables (good for ≥ 3 dimensions). ftable(): ftable(..., exclude = c(NA, NaN), row.vars = NULL, col.vars = NULL)
    31. Your turn (adopted from Hadley Wickham): tab2 <- xtabs(~ Treatment + Improved + Sex, Arthritis); ftable(tab2)
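
The row.vars argument controls which variables are stacked in the rows of the flat table. The following sketch assumes a hypothetical data frame named toy with the same column names as Arthritis; the layout chosen for flat_by_sex is just one possible arrangement.

```r
# Hypothetical data frame that mimics three Arthritis columns
toy <- data.frame(
  Treatment = factor(rep(c("Placebo", "Treated"), each = 5)),
  Sex       = factor(c("Female", "Female", "Male", "Male", "Female",
                       "Female", "Male", "Female", "Male", "Female")),
  Improved  = factor(c("None", "None", "Some", "None", "Marked",
                       "Marked", "Some", "Marked", "None", "Marked"),
                     levels = c("None", "Some", "Marked"))
)

tab2 <- xtabs(~ Treatment + Improved + Sex, data = toy)

# Default layout: the last variable (Sex) goes in the columns, the rest in the rows
flat_default <- ftable(tab2)

# Alternative layout: Sex and Treatment in the rows, Improved in the columns
flat_by_sex <- ftable(tab2, row.vars = c("Sex", "Treatment"))

print(flat_default)
print(flat_by_sex)
```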
    32. Proportions again. prop.table(): prop.table(cross.table.object)
    33. Your turn (adopted from Hadley Wickham):
        tab3 <- xtabs(~ Treatment + Improved, Arthritis)
        prop.table(tab3)    # proportion of the total
        prop.table(tab3, 1) # proportion of the row sum (1st dimension)
        prop.table(tab3, 2) # proportion of the column sum (2nd dimension)
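
One way to check which denominator each call uses is to confirm what the result sums to. This sketch uses a hypothetical hand-typed table rather than the Arthritis data.

```r
# Hypothetical 2x3 table of counts
tab3 <- as.table(matrix(c(3, 1, 1,
                          1, 1, 3),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(Treatment = c("Placebo", "Treated"),
                                        Improved  = c("None", "Some", "Marked"))))

prop_total <- prop.table(tab3)     # each cell divided by the grand total
prop_row   <- prop.table(tab3, 1)  # each cell divided by its row sum
prop_col   <- prop.table(tab3, 2)  # each cell divided by its column sum

print(sum(prop_total))    # the whole table sums to 1
print(rowSums(prop_row))  # every row sums to 1
print(colSums(prop_col))  # every column sums to 1
```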
    34. Chi-squared test. chisq.test(): chisq.test(cross.table.object)
    35. Fisher’s exact test. fisher.test(): fisher.test(cross.table.object)
    36. Your turn (adopted from Hadley Wickham): tab3 <- xtabs(~ Treatment + Improved, Arthritis); chisq.test(tab3); fisher.test(tab3)
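
Both tests return objects whose pieces can be extracted by name. The sketch below assumes a hypothetical 2x3 table with made-up counts; the numbers are for illustration only and are not results from the Arthritis data.

```r
# Hypothetical counts, typed in by hand for illustration
tab3 <- as.table(matrix(c(30, 8, 6,
                          12, 8, 20),
                        nrow = 2, byrow = TRUE,
                        dimnames = list(Treatment = c("Placebo", "Treated"),
                                        Improved  = c("None", "Some", "Marked"))))

chi_result    <- chisq.test(tab3)   # Pearson's chi-squared test of independence
fisher_result <- fisher.test(tab3)  # Fisher's exact test, also works beyond 2x2

print(chi_result$statistic)  # chi-squared statistic
print(chi_result$parameter)  # degrees of freedom: (rows - 1) * (columns - 1)
print(chi_result$p.value)
print(chi_result$expected)   # expected counts under independence
print(fisher_result$p.value)
```

Inspecting the expected counts is a common way to decide between the two tests, since the chi-squared approximation is less reliable when expected counts are small.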
    37. SAS-like cross tables, available in the gmodels package. CrossTable(): CrossTable(tab.2d)
    38. Your turn (adopted from Hadley Wickham): tab3 <- xtabs(~ Treatment + Improved, Arthritis); CrossTable(tab3)
    39. 2x2 table with RR, RD, and OR, available in the epiR package. epi.2by2(): epi.2by2(tab.2by2)
    40. Your turn (adopted from Hadley Wickham): tab.2by2 <- xtabs(~ Sex + Treatment, Arthritis); epi.2by2(tab.2by2, units = 1)
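
For readers who want to see what RR, RD, and OR mean, the three point estimates can be computed by hand in base R. This is a sketch under stated assumptions, not how epiR computes them: the table and its counts are hypothetical, exposure is assumed to be in the rows and the outcome in the columns, and no confidence intervals are produced. The table layout that epi.2by2() itself expects should be checked in the epiR documentation.

```r
# Hypothetical 2x2 table: rows = exposure, columns = outcome
tab <- matrix(c(20, 80,
                10, 90),
              nrow = 2, byrow = TRUE,
              dimnames = list(Exposure = c("Exposed", "Unexposed"),
                              Outcome  = c("Yes", "No")))

# Risk of the outcome within each exposure group
risk_exposed   <- tab["Exposed", "Yes"]   / sum(tab["Exposed", ])
risk_unexposed <- tab["Unexposed", "Yes"] / sum(tab["Unexposed", ])

risk_ratio      <- risk_exposed / risk_unexposed  # RR
risk_difference <- risk_exposed - risk_unexposed  # RD

# Odds ratio: cross-product of the 2x2 cells
odds_ratio <- (tab["Exposed", "Yes"] * tab["Unexposed", "No"]) /
              (tab["Exposed", "No"]  * tab["Unexposed", "Yes"])

print(c(RR = risk_ratio, RD = risk_difference, OR = odds_ratio))
```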
    41. Creating a factor
    42. Data in Excel: factor, factor, integer
    43. To convert a number vector to a factor vector: dat$Stage <- factor(dat$Stage)
    44. To convert back to a number: dat$Stage <- as.numeric(as.character(dat$Stage))
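
The as.character() step in the conversion back to a number matters because as.numeric() applied directly to a factor returns the internal level codes, not the original values. A minimal sketch, assuming a hypothetical data frame dat with a Stage column:

```r
# Hypothetical data: cancer stage stored as numbers
dat <- data.frame(Stage = c(2, 4, 3, 4, 2))

dat$Stage <- factor(dat$Stage)  # number vector to factor vector
print(levels(dat$Stage))        # the levels are stored as text

codes_only <- as.numeric(dat$Stage)                # internal codes of the levels
back_again <- as.numeric(as.character(dat$Stage))  # original values recovered

print(codes_only)
print(back_again)
```

Here the levels are "2", "3", and "4", so the direct conversion yields the codes 1, 3, 2, 3, 1, while going through as.character() returns 2, 4, 3, 4, 2.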
