Descriptive Statistics with R
 

Usage Rights

© All Rights Reserved

    Descriptive Statistics with R: Presentation Transcript

    • Descriptive Statistics with R. 2012-10-12 @HSPH. Kazuki Yoshida, M.D., MPH-CLE student. FREEDOM TO KNOW
    • Group Website is at: http://rpubs.com/kaz_yos/useR_at_HSPH
    • Previously in this group: Introduction to R; Reading Data into R (1); Reading Data into R (2). Group Website: http://rpubs.com/kaz_yos/useR_at_HSPH
    • Menun mean and sdn median, quantiles, IQR, max, min, and rangen skewness and kurtosisn smarter ways of doing these
    • Ingredients. Statistics: summary statistics for continuous data; normal data; non-normal data; normality check. Programming: vector and data frame; DATA$VAR extraction; indexing by [row,col]; various functions; skewness(), kurtosis(); summary(); describe(), describeBy()
    • Data loaded. What’s next? http://echrblog.blogspot.com/2011/04/statistics-on-states-with-systemic-or.html
    • Descriptive Statistics http://www.ehow.com/info_8650637_descriptive-statistical-methods.html
    • Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data http://en.wikipedia.org/wiki/Descriptive_statistics
    • Open RStudio
    • Download the comma-separated and Excel files. Put them in a folder. BONEDEN.DAT.txt http://www.cengage.com/cgi-wadsworth/course_products_wp.pl?fid=M20bI&product_isbn_issn=9780538733496
    • Read in BONEDEN.DAT.txt. Name it bone.
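Reading the file in might look like the sketch below. BONEDEN.DAT.txt itself is not bundled here, so the example first writes a tiny stand-in file to a temporary location; its column names and values are hypothetical, and the comma delimiter is assumed from the "comma-separated" download mentioned on the previous slide.

```r
# Stand-in for BONEDEN.DAT.txt: hypothetical columns and values
tmp <- tempfile(fileext = ".txt")
writeLines(c("id,age,zyg",
             "1,27,1",
             "2,42,1",
             "3,45,2"), tmp)

# Read the comma-separated file and name the result bone
bone <- read.csv(tmp, header = TRUE)

str(bone)   # one column per variable, with its type
head(bone)  # first few rows
```

With the real file, the path given to read.csv() would point at BONEDEN.DAT.txt in the folder chosen above.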
    • Accessing a single variable in a data set: DATA$VAR (DATA = dataset name, VAR = variable name), e.g., mean(bone$age)
    • vector
    • ? http://healthy-india.org/enviromentalhealth/direct_indirect2.html
    • DATA$VAR is a vector: 1 2 3 4 5 6 7 8 or “A” “B” “C” “D” “E” “F” “G” “H”, like strings with values attached
    • DATA is a data frame: multiple vectors of the same length tied together, e.g., 1 2 3 4 5 6 7 8 and “A” “B” “C” “D” “E” “F” “G” “H”
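As a small illustration of the vector and data frame idea, the sketch below ties two vectors of equal length into a data frame. The names num, chr, and DATA and all values are hypothetical.

```r
# Two vectors of the same length (hypothetical values)
num <- 1:8
chr <- c("A", "B", "C", "D", "E", "F", "G", "H")

# Tie them together into a data frame
DATA <- data.frame(num = num, chr = chr)

is.vector(DATA$num)   # a column comes back out as a vector
is.data.frame(DATA)   # the tied-together object is a data frame
dim(DATA)             # number of rows, number of columns
```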
    • Indexing: extraction of data from a data frame. bone[1:15 , 1:12] extracts the 1st to 15th rows and the 1st to 12th columns. Put a colon in between; don’t forget the comma.
    • age vector within bone data frame
    • bone$age: extracted as a vector
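The indexing rules can be tried on a small stand-in. The data frame below is hypothetical (it only borrows the names bone, age, and zyg from the slides), so the row and column ranges are shortened to fit it.

```r
# Hypothetical stand-in for the bone data frame
bone <- data.frame(
  id  = 1:8,
  age = c(27, 42, 45, 49, 52, 56, 61, 68),
  zyg = c(1, 1, 1, 1, 2, 2, 2, 2)
)

bone[1:3, 1:2]   # rows 1 to 3, columns 1 to 2 (still a data frame)
bone[1:3, ]      # rows 1 to 3, all columns: the comma is still required
bone[, -1]       # all rows, every column except the first
bone$age         # a single column, extracted as a vector
```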
    • mean: mean(x, trim = 0, na.rm = FALSE)
    • Your turn (adopted from Hadley Wickham): What is the mean of age?
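A sketch of how the arguments of mean() behave, using a hypothetical age vector rather than the real data (with the data set loaded, the call would be mean(bone$age)):

```r
age <- c(27, 42, 45, 49, 52, 56, 61, 68)   # hypothetical ages
mean(age)                                  # arithmetic mean

age_na <- c(age, NA)                       # same ages plus one missing value
mean(age_na)                               # NA: missing values propagate
mean(age_na, na.rm = TRUE)                 # drop NA before averaging

# Trimmed mean: with 8 values, trim = 0.125 drops the lowest and the highest
mean(age, trim = 0.125)
```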
    • sd: sd(x, na.rm = FALSE)
    • Your turn (adopted from Hadley Wickham): What is the sd of age?
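sd() returns the sample standard deviation, that is, with an n - 1 denominator. The sketch below checks this against the written-out formula on a hypothetical age vector.

```r
age <- c(27, 42, 45, 49, 52, 56, 61, 68)   # hypothetical ages

sd(age)          # sample standard deviation
sqrt(var(age))   # same value: square root of the sample variance

# Written out: squared deviations from the mean, divided by n - 1
sd_by_hand <- sqrt(sum((age - mean(age))^2) / (length(age) - 1))
all.equal(sd(age), sd_by_hand)
```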
    • median: median(x, na.rm = FALSE)
    • Your turn (adopted from Hadley Wickham): What is the median of age?
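One reason to report the median for non-normal data is that a single extreme value moves it far less than it moves the mean. A minimal sketch with hypothetical ages, including one deliberately implausible value:

```r
age <- c(27, 42, 45, 49, 52, 56, 61, 68)   # hypothetical ages
median(age)            # even n: average of the two middle values

age_out <- c(age, 120) # add one implausibly large hypothetical value
mean(age_out)          # pulled upward by the extreme value
median(age_out)        # shifts only to the next middle value
```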
    • quantile: quantile(x, probs = seq(0, 1, 0.25), na.rm = FALSE, names = TRUE, type = 7). Returns the 0th, 25th, 50th, 75th, and 100th percentiles by default.
    • Your turn (adopted from Hadley Wickham): What are the 25th and 75th percentiles of age?
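The probs argument selects which percentiles are returned. A short sketch on a hypothetical age vector, using the default type = 7:

```r
age <- c(27, 42, 45, 49, 52, 56, 61, 68)   # hypothetical ages

q_default <- quantile(age)                        # 0%, 25%, 50%, 75%, 100%
q_quart   <- quantile(age, probs = c(0.25, 0.75)) # only the 25th and 75th

q_default
q_quart
quantile(age, probs = 0.9, names = FALSE)         # plain number, no "90%" label
```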
    • IQR: IQR(x, na.rm = FALSE, type = 7). 75th percentile - 25th percentile.
    • Your turn (adopted from Hadley Wickham): What is the IQR of age?
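That IQR() is the 75th percentile minus the 25th percentile can be checked directly. The age vector below is hypothetical.

```r
age <- c(27, 42, 45, 49, 52, 56, 61, 68)   # hypothetical ages

iqr_age <- IQR(age)
q <- quantile(age, probs = c(0.25, 0.75), names = FALSE)

iqr_age
q[2] - q[1]    # 75th percentile minus 25th percentile: same value as IQR(age)
```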
    • max: max(..., na.rm = FALSE)
    • min: min(..., na.rm = FALSE)
    • Your turn (adopted from Hadley Wickham): What are the minimum and maximum of age?
    • range: range(..., na.rm = FALSE)
    • Your turn (adopted from Hadley Wickham): What is the range of age?
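min(), max(), and range() fit together as sketched below on a hypothetical age vector. Note that range() returns the two end points, not their difference.

```r
age <- c(27, 42, 45, 49, 52, 56, 61, 68)   # hypothetical ages

min(age)
max(age)
range(age)         # c(min, max) in one call
diff(range(age))   # width of the range, if a single number is wanted

max(c(age, NA))                 # NA unless missing values are removed
max(c(age, NA), na.rm = TRUE)
```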
    • We now resort to external packages
    • Install and Load e1071, psych
    • To load a package by command: library(package), with the package name inside the parentheses. The double quotes “” can be omitted.
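A sketch of checking for, installing, and loading packages. The install line is left commented out because it needs an internet connection, and the built-in stats package is used to show that library() accepts the name with or without quotes.

```r
pkgs <- c("e1071", "psych")

# Which of the packages are already installed? (does not attach them)
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
installed

# Install whatever is missing (uncomment to run; needs internet access)
# install.packages(pkgs[!installed])

# Loading: the quotes are optional, shown here with a built-in package
library(stats)
library("stats")
```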
    • Assessment of normality
    • Load e1071 package
    • skewness: library(e1071); skewness(x, na.rm = FALSE, type = 3). type = 2: SAS; type = 1: Stata
    • kurtosis: library(e1071); kurtosis(x, na.rm = FALSE, type = 3). type = 2: SAS; type = 1: Stata
    • Your turn (adopted from Hadley Wickham): What are the skewness and kurtosis of age by the Stata-method?
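To make type = 1 concrete, the sketch below writes out the moment-based skewness and kurtosis in base R on a hypothetical age vector, then calls e1071 only if it is installed. One caveat, stated as an assumption to check against the Stata manual: kurtosis() in e1071 subtracts 3 for every type (excess kurtosis), whereas Stata's summarize appears to report kurtosis without that subtraction, so the two may differ by 3.

```r
age <- c(27, 42, 45, 49, 52, 56, 61, 68)   # hypothetical ages

# Central moments with an n denominator
n  <- length(age)
m2 <- sum((age - mean(age))^2) / n
m3 <- sum((age - mean(age))^3) / n
m4 <- sum((age - mean(age))^4) / n

skew_type1 <- m3 / m2^(3 / 2)   # type 1 skewness
kurt_type1 <- m4 / m2^2 - 3     # type 1 excess kurtosis
skew_type1
kurt_type1

# The same quantities via e1071, if the package is installed
if (requireNamespace("e1071", quietly = TRUE)) {
  library(e1071)
  print(skewness(age, type = 1))
  print(kurtosis(age, type = 1))
}
```

With the real data the calls would be skewness(bone$age, type = 1) and kurtosis(bone$age, type = 1).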
    • Multiple variables at once
    • summary: summary(object, ...)
    • Your turn (adopted from Hadley Wickham): Try summary on the dataset (data frame).
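summary() works on a single vector and on a whole data frame. The data frame below is a hypothetical stand-in for bone.

```r
# Hypothetical stand-in for the bone data frame
bone <- data.frame(
  id  = 1:8,
  age = c(27, 42, 45, 49, 52, 56, 61, 68),
  zyg = c(1, 1, 1, 1, 2, 2, 2, 2)
)

age_summary <- summary(bone$age)   # min, quartiles, median, mean, max of one vector
age_summary

summary(bone)                      # the same measures, column by column
```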
    • describe: various summary measures. library(psych); describe(x, na.rm = TRUE, interp = FALSE, skew = TRUE, ranges = TRUE, trim = .1, type = 3). type = 2: SAS; type = 1: Stata
    • Your turn (adopted from Hadley Wickham): describe(bone[,-1], type = 2)
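A sketch of describe() on a hypothetical stand-in for bone (the wt column and all values are invented for illustration). The psych call runs only if the package is installed; the base-R lines give a cross-check for two of the columns that describe() reports.

```r
# Hypothetical stand-in for the bone data frame
bone <- data.frame(
  id  = 1:8,
  age = c(27, 42, 45, 49, 52, 56, 61, 68),
  wt  = c(54, 61, 58, 70, 65, 72, 63, 77)
)

# Base-R cross-check: mean and sd of every column except the first
col_means <- sapply(bone[, -1], mean)
col_sds   <- sapply(bone[, -1], sd)
col_means
col_sds

# Full table of summary measures, SAS-type skewness and kurtosis
if (requireNamespace("psych", quietly = TRUE)) {
  library(psych)
  print(describe(bone[, -1], type = 2))
}
```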
    • describeBy: groupwise summary. library(psych); describeBy(x, group = NULL, mat = FALSE, type = 3, ...). type = 2: SAS; type = 1: Stata
    • Your turn (adopted from Hadley Wickham): describeBy(bone[ , c(-1)] , bone$zyg , type = 2). First argument: bone data frame without the 1st column; second argument: zyg vector for grouping; type = 2: SAS method for skewness and kurtosis.
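A sketch of a groupwise summary on a hypothetical stand-in for bone (the wt column and all values are invented). In this toy example the grouping column is left out of the summarized columns, because it is constant within each group; the slide's own call keeps every column except the first. The base-R tapply() lines give a cross-check of the group means, and the psych call runs only if the package is installed.

```r
# Hypothetical stand-in for the bone data frame
bone <- data.frame(
  id  = 1:8,
  age = c(27, 42, 45, 49, 52, 56, 61, 68),
  wt  = c(54, 61, 58, 70, 65, 72, 63, 77),
  zyg = c(1, 1, 1, 1, 2, 2, 2, 2)
)

# Base-R cross-check: mean age within each zyg group
age_by_zyg <- tapply(bone$age, bone$zyg, mean)
age_by_zyg

# Full groupwise table, SAS-type skewness and kurtosis
if (requireNamespace("psych", quietly = TRUE)) {
  library(psych)
  print(describeBy(bone[, c("age", "wt")], group = bone$zyg, type = 2))
}
```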
    • Ingredients. Statistics: summary statistics for continuous data; normal data; non-normal data; normality check. Programming: vector and data frame; DATA$VAR extraction; indexing by [row,col]; various functions; skewness(), kurtosis(); summary(); describe(), describeBy()