Categorical data with R
FREEDOM TO KNOW
DRAFT
Group website: http://rpubs.com/kaz_yos/useR_at_HSPH
Previously in this group
- Introduction
- Reading Data into R (1)
- Reading Data into R (2)
- Descriptive statistics
Menu
- Categorical data
- How to summarize
Ingredients
Statistics
- Summary statistics for categorical data
Programming
- Creating categorical variables
- table(), summary()
- xtabs(), ftable()
- prop.table()
- margin.table()
- CrossTable()
Categorical data
Examples: gender, country, race, ethnicity, cancer stage, disease severity, education level
Open RStudio
Install and load the vcd package
We will use the “Arthritis” dataset in the vcd package
Load the built-in dataset named “Arthritis”:
data(Arthritis)
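The install-and-load step above can be sketched as follows. The CRAN mirror URL is an assumption; any mirror should work.

```r
# Install vcd only if it is missing (the mirror URL is an assumed choice)
if (!requireNamespace("vcd", quietly = TRUE)) {
  install.packages("vcd", repos = "https://cloud.r-project.org")
}

library(vcd)      # load the package for this session
data(Arthritis)   # load the built-in Arthritis data frame

dim(Arthritis)    # number of rows and columns
head(Arthritis)   # first six rows
```

Installation is needed once per machine; library() is needed in every new session.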
Indexing: extracting data from a data frame
Arthritis[1:17, ]
- Before the comma: extract the 1st to 17th rows (colon in between)
- After the comma: left empty, so all columns are shown
- Don’t forget the comma
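A short sketch of the indexing rule, assuming vcd is installed. It also shows what happens when the comma is left out: R then selects columns rather than rows.

```r
library(vcd)
data(Arthritis)

# Rows 1 to 17, all columns (nothing after the comma)
first17 <- Arthritis[1:17, ]
dim(first17)

# Rows 1 to 3, two named columns
Arthritis[1:3, c("Treatment", "Improved")]

# Without the comma, the index selects COLUMNS: this returns
# the first two columns with all rows, not the first two rows
no_comma <- Arthritis[1:2]
dim(no_comma)
```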
Treatment vector in the Arthritis data frame
The data frame is five vectors of the same length tied together
summary(object)
summary.factor(vector)
Your turn
- Run summary() on Arthritis
Adapted from Hadley Wickham
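One possible way to work this exercise, assuming vcd is installed. For a factor column, summary() gives counts per level; for a numeric column, it gives quantiles and the mean.

```r
library(vcd)
data(Arthritis)

# Summary of every column in the data frame
summary(Arthritis)

# Summary of a single factor: counts per level
improved_summary <- summary(Arthritis$Improved)
improved_summary
```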
Accessing a single variable in a data set
DATA$VAR (dataset name, then $, then variable name)
Arthritis$Treatment
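A minimal sketch of the DATA$VAR pattern, assuming vcd is installed. The two alternative spellings are included only for comparison.

```r
library(vcd)
data(Arthritis)

# Extract one column as a vector
treatment <- Arthritis$Treatment
class(treatment)   # "factor"
head(treatment)

# Equivalent ways to get the same vector
same_double_bracket <- identical(treatment, Arthritis[["Treatment"]])
same_by_name        <- identical(treatment, Arthritis[, "Treatment"])
```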
factor levels (categories)
levels
levels(x)
Check factor levels of a vector
Your turn
- Check the levels of the Improved vector
- Also show the contents of the Improved vector
This is an ordered factor
A factor is a categorical variable in R
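A sketch of checking levels and orderedness, assuming vcd is installed. Treatment is shown alongside Improved as an unordered factor for contrast.

```r
library(vcd)
data(Arthritis)

# Levels (categories) of the Improved vector
improved_levels <- levels(Arthritis$Improved)
improved_levels

# Contents of the vector; the last printed line shows the
# levels joined by "<", which marks an ordered factor
Arthritis$Improved

# Ordered vs. unordered factor
is.ordered(Arthritis$Improved)    # TRUE
is.ordered(Arthritis$Treatment)   # FALSE
```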
length
length(x)
Check length of a vector
Your turn
- Check the length of the Improved vector
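A short sketch for this exercise, assuming vcd is installed. Note that length() on a whole data frame counts columns, not rows.

```r
library(vcd)
data(Arthritis)

# Length of one vector equals the number of rows in the data frame
n_improved <- length(Arthritis$Improved)
n_improved
nrow(Arthritis)

# length() of the data frame itself is the number of columns
n_columns <- length(Arthritis)
n_columns
```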
table(..., exclude = if (useNA == "no") c(NA, NaN),
useNA = c("no", "ifany","always"), dnn =
list.names(...), deparse.level = 1)
Your turn
- Create a one-variable table of the Improved vector
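One way to build the one-variable table, assuming vcd is installed. The useNA argument from the signature above is shown as well. Improved has no missing values, so it adds no extra column here.

```r
library(vcd)
data(Arthritis)

# Frequency table: one count per level of Improved
Improved.cat <- table(Arthritis$Improved)
Improved.cat

# Same table, but an NA column would be added if any values were missing
table(Arthritis$Improved, useNA = "ifany")
```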
prop.table
prop.table(x, margin = NULL)
Your turn
- Improved.cat <- table(Arthritis$Improved)
- Do prop.table(Improved.cat)
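A sketch of turning counts into proportions, assuming vcd is installed. Rounding to percentages is an optional extra, not part of the exercise.

```r
library(vcd)
data(Arthritis)

Improved.cat <- table(Arthritis$Improved)

# Proportions: each count divided by the total
Improved.prop <- prop.table(Improved.cat)
Improved.prop

# As percentages, rounded to one decimal place
round(Improved.prop * 100, 1)
```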
xtabs(formula = ~., data = parent.frame(),
subset, sparse = FALSE, na.action, exclude =
c(NA, NaN), drop.unused.levels = FALSE)
Create cross tables
Your turn
- Do xtabs(~ Treatment + Improved, Arthritis)
- Further stratify by Sex
1st dimension
2nd dimension
3rd dimension
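A sketch of two-way and three-way cross tables, assuming vcd is installed. Each variable on the right-hand side of the formula becomes one dimension, in the order written.

```r
library(vcd)
data(Arthritis)

# Two-way table: Treatment (rows, 1st dim) by Improved (columns, 2nd dim)
table2d <- xtabs(~ Treatment + Improved, data = Arthritis)
table2d

# Three-way table: one Treatment-by-Improved slice per Sex (3rd dim)
table3 <- xtabs(~ Treatment + Improved + Sex, data = Arthritis)
table3

dim(table3)               # size of each dimension
names(dimnames(table3))   # which variable sits on which dimension
```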
Create flat tables
Good for tables with 3 or more dimensions
ftable(..., exclude = c(NA, NaN),
row.vars = NULL, col.vars = NULL)
Your turn
- table3 <- xtabs(~ Treatment + Improved + Sex, Arthritis)
- Do ftable(table3)
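A sketch of flattening the three-way table, assuming vcd is installed. The second call uses row.vars from the signature above to choose which variables go on the rows. That particular layout is only an illustration.

```r
library(vcd)
data(Arthritis)

table3 <- xtabs(~ Treatment + Improved + Sex, data = Arthritis)

# Default layout: the last variable (Sex) goes on the columns,
# all other variables are combined on the rows
flat_default <- ftable(table3)
flat_default

# Alternative layout: Sex and Treatment on the rows, Improved on the columns
flat_by_sex <- ftable(table3, row.vars = c("Sex", "Treatment"))
flat_by_sex
```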
Your turn
- table2d <- xtabs(~ Treatment + Improved, Arthritis)
- prop.table(table2d)
- prop.table(table2d, 1)
- prop.table(table2d, 2)
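A sketch of the three prop.table() variants, assuming vcd is installed. margin.table(), listed in the menu, is included to show the totals that the row proportions are divided by.

```r
library(vcd)
data(Arthritis)

table2d <- xtabs(~ Treatment + Improved, data = Arthritis)

# No margin: each cell divided by the grand total (all cells sum to 1)
cell_prop <- prop.table(table2d)

# margin = 1: each cell divided by its row total (each row sums to 1)
row_prop <- prop.table(table2d, 1)

# margin = 2: each cell divided by its column total (each column sums to 1)
col_prop <- prop.table(table2d, 2)

cell_prop
row_prop
col_prop

# Row totals and column totals of the counts
row_totals <- margin.table(table2d, 1)
col_totals <- margin.table(table2d, 2)
row_totals
col_totals
```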
CrossTable
SAS-like cross tables, available in the gmodels package
CrossTable(x, y, digits = 3, max.width = 5, expected = FALSE,
  prop.r = TRUE, prop.c = TRUE, prop.t = TRUE,
  prop.chisq = TRUE, chisq = FALSE, fisher = FALSE,
  mcnemar = FALSE, resid = FALSE, sresid = FALSE, ...)
Your turn
- table2d <- xtabs(~ Treatment + Improved, Arthritis)
- CrossTable(table2d)
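A sketch of this exercise, assuming both vcd and gmodels are installed. The CRAN mirror URL is an assumed choice. The second call passes two vectors instead of a table and switches on the chi-square test, as one illustration of the arguments in the signature above.

```r
# Install gmodels only if it is missing (the mirror URL is an assumed choice)
if (!requireNamespace("gmodels", quietly = TRUE)) {
  install.packages("gmodels", repos = "https://cloud.r-project.org")
}

library(gmodels)
library(vcd)
data(Arthritis)

table2d <- xtabs(~ Treatment + Improved, data = Arthritis)

# SAS-like cross table: counts plus row, column, and total proportions
result <- CrossTable(table2d)

# Two vectors instead of a table, with a chi-square test added
# and the per-cell chi-square contributions hidden
CrossTable(Arthritis$Treatment, Arthritis$Improved,
           prop.chisq = FALSE, chisq = TRUE)
```

CrossTable() prints its output and also returns the underlying tables invisibly, so the result can be stored for later use.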