Session III:
SAS Analysis Procedures
Introduction to SAS Procedures
• SAS data set information
– PROC CONTENTS
– PROC PRINT
• Descriptive statistics
– PROC MEANS
– PROC UNIVARIATE
– PROC FREQ
– PROC CORR
What does my SAS
Data Set Contain?
• How many observations?
• How many variables?
• What kinds of variables?
PROC CONTENTS
• Provides information about the contents of
a SAS data set.
• Example Syntax:
PROC CONTENTS <options>;
TITLE 'Contents Listing';
RUN;
PROC CONTENTS - Options
• DATA = <SAS data file>
• OUT = <SAS data file>
• DETAILS / NODETAILS
• ORDER=
• VARNUM
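A minimal sketch of two of these options in use, assuming the sas libref from this session's examples is already assigned. The output data set name work.varinfo is hypothetical, and NOPRINT is an additional option not listed above. VARNUM lists variables in their creation order instead of alphabetically; OUT= writes the variable attributes to a data set.

```sas
/* List variables in creation order instead of alphabetically */
PROC CONTENTS DATA = sas.example VARNUM;
TITLE 'Contents listing in variable order - example data set';
RUN;

/* Save the variable attributes to a data set (work.varinfo is a */
/* hypothetical name) and suppress the printed listing           */
PROC CONTENTS DATA = sas.example OUT = work.varinfo NOPRINT;
RUN;

/* Look at a few of the attribute columns that OUT= creates */
PROC PRINT DATA = work.varinfo NOOBS;
VAR name type length varnum label;
TITLE 'Variable attributes saved by PROC CONTENTS';
RUN;
```

In the OUT= data set, TYPE is stored as a number (1 = numeric, 2 = character).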
PROC CONTENTS
• Key items to look for:
• Data set name
• # of observations
• # of variables
• Date data set was created and last modified
• List of variables with type, format and label
PROC CONTENTS –
Example Program
*** This shows an example of PROC CONTENTS using the ***
*** example data set ***;
PROC CONTENTS DATA = sas.example;
TITLE 'Contents listing - Example data set';
RUN;
*** A TITLE statement consists of the keyword TITLE followed by ***
*** text in single or double quotes that gives the output a ***
*** meaningful title. Best practice is to include the name of ***
*** the procedure and the data set being used ***;
PROC CONTENTS – Example Output
Data Set Name SAS.EXAMPLE Observations 203
Member Type DATA Variables 17
Engine V9 Indexes 0
Created Wed, Feb 25, 2015 11:32:49 AM Observation Length 104
Last Modified Wed, Feb 25, 2015 11:32:49 AM Deleted Observations 0
Protection Compressed NO
Data Set Type Sorted NO
Label
Data Representation WINDOWS_64
Encoding wlatin1 Western (Windows)
Engine/Host Dependent Information
Data Set Page Size 12288
Number of Data Set Pages 2
First Data Page 1
Max Obs per Page 117
Obs in First Data Page 90
Number of Data Set Repairs 0
Filename C:\Users\schmid\Desktop\Random SAS stuff\example.sas7bdat
Release Created 9.0301M2
Host Created X64_7PRO
PROC CONTENTS – Example Output
Alphabetic List of Variables and Attributes
# Variable Type Len Format Informat Label
1 ID Num 8 ID
5 age Num 8 age
2 date Num 8 DATE9. DATE9. date
4 gender Num 8 gender
14 livewdad Num 8 livewdad
13 livewmom Num 8 livewmom
12 momcode Char 1 $1. $1. momcode
11 momed Num 8 momed
17 nontrad Char 1 $1. $1. nontrad
3 race Num 8 race
6 sensation_seeking Char 1 $1. $1. sensation seeking
7 senseek1 Num 8 senseek1
8 senseek2 Num 8 senseek2
9 senseek3 Num 8 senseek3
10 senseek4 Num 8 senseek4
15 totfam Char 1 $1. $1. totfam
16 tradfam Char 1 $1. $1. tradfam
PROC PRINT
• PROC PRINT -> prints a list of observations
in a SAS data set.
• Example syntax:
PROC PRINT <options>;
WHERE condition;
VAR variable list;
BY variable list;
SUM variable list;
TITLE 'Print';
RUN;
PROC PRINT - Options
• DATA = <SAS Data Set>
• BLANKLINE = (n)
• DOUBLE
• HEADING
• LABEL
• NOOBS
• ROUND
• ROWS = PAGE
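A short sketch combining a few of these options, assuming the example data set used throughout this session. The OBS= data set option and the label text 'Student ID' are illustrative additions, not part of the original slides.

```sas
/* NOOBS drops the Obs column; LABEL prints variable labels as   */
/* column headings. (OBS = 10) is a data set option that limits  */
/* the listing to the first 10 observations.                     */
PROC PRINT DATA = sas.example (OBS = 10) NOOBS LABEL;
VAR id gender race;
LABEL id = 'Student ID';   /* hypothetical label text */
TITLE 'Print list - first 10 observations - example data set';
RUN;
```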
PROC PRINT – VAR statement
• Lists the variables to be printed.
• The VAR statement is optional.
• If omitted all the variables in the data set will
be printed.
• Variables are printed in the order listed in the
VAR statement.
Example: PROC PRINT syntax
*** This shows an example of PROC PRINT ***
*** using the VAR statement, printing only ***
*** student ID, gender and race ***
*** using the example data set ***;
PROC PRINT DATA = sas.example NOOBS;
VAR id gender race;
TITLE 'Print list - studentid gender and race - example data set';
RUN;
Example: PROC PRINT output
ID gender race
1188 1 1
1214 0 1
1146 0 2
1203 0 1
1218 1 2
1101 1 1
1102 1 1
1103 0 2
1104 0 2
1105 1 2
1106 0 1
1108 0 1
1109 1 2
1110 0 1
1112 0 2
1114 1 1
1115 1 1
1116 0 2
1118 1 3
PROC PRINT – BY Statement
• Prints data separately for each group in the
BY variable.
• The BY statement is optional.
• When using the BY statement, the data
must first be sorted by the variable(s) listed
in the BY statement.
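One way to meet the sorting requirement without reordering the permanent data set is to sort into a temporary copy. This is a sketch only; work.example_by_age is a hypothetical data set name.

```sas
/* OUT= writes the sorted rows to a WORK copy and leaves */
/* sas.example in its original order                     */
PROC SORT DATA = sas.example OUT = work.example_by_age;
BY age;
RUN;

/* The BY statement can now be used on the sorted copy */
PROC PRINT DATA = work.example_by_age;
BY age;
VAR senseek1--senseek4;
TITLE 'Print list - senseek by age - sorted copy';
RUN;
```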
PROC PRINT – BY syntax
*** This is an example of a PROC PRINT using the BY statement ***;
PROC SORT DATA = sas.example;
BY age;
RUN;
PROC PRINT DATA= sas.example;
VAR senseek1--senseek4;
BY age;
TITLE 'PRINT LIST - senseek by age';
RUN;
PROC PRINT – BY OUTPUT
• Age = 11
Obs senseek1 senseek2 senseek3 senseek4
38 5 3 2 2
39 5 5 1 5
40 4 4 2 4
• Age = 12
Obs senseek1 senseek2 senseek3 senseek4
41 4 5 3 3
42 5 3 1 4
43 4 1 1 3
44 4 3 2 3
45 5 2 3 4
46 2 2 3 3
47 4 4 2 4
48 3 4 4 3
49 5 5 5 5
PROC PRINT – WHERE statement
• The WHERE statement can be used to display a
subset of the data set.
• The WHERE statement works in PROC steps
as well as in the DATA step.
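As a sketch of the DATA step use mentioned above, the same condition can create a subset data set that later procedures read. The name work.age14 is hypothetical.

```sas
/* WHERE in a DATA step: only observations with age = 14 */
/* are read from sas.example into the new data set       */
DATA work.age14;
SET sas.example;
WHERE age = 14;
RUN;

PROC PRINT DATA = work.age14 NOOBS;
VAR age senseek1--senseek4;
TITLE 'PRINT - Age 14 subset created in a DATA step';
RUN;
```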
PROC PRINT – WHERE syntax
*** This is an example of a PROC PRINT using a WHERE ***
*** statement using the example data set ***;
DATA one;
SET sas.example;
RUN;
PROC PRINT DATA = one;
WHERE age = 14;
VAR age senseek1--senseek4;
TITLE 'PRINT - Age 14 - sensation seeking using example data';
RUN;
PROC PRINT – WHERE OUTPUT
Obs age senseek1 senseek2 senseek3 senseek4
199 14 4 4 5 5
200 14 5 4 3 4
201 14 1 1 4 1
202 14 5 5 1 5
203 14 1 1 1 1
How To Obtain
Descriptive Statistics
• PROC MEANS
• PROC UNIVARIATE
• PROC FREQ
• PROC CORR
PROC MEANS
• Example Syntax:
PROC MEANS <options> <statistic keyword list>;
WHERE condition;
VAR variable list;
CLASS variable list;
BY variable list;
OUTPUT <OUT = SAS dataset>;
RUN;
PROC MEANS - Options
• DATA =
• Classification levels control
• Output control
• Output dataset control
• Statistical analysis control
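A sketch of output data set control, assuming the example data set. The data set name work.senseek_stats is hypothetical, and the NOPRINT and AUTONAME options are additions not shown on the slides.

```sas
/* NOPRINT suppresses the printed table; the OUTPUT statement   */
/* saves the statistics instead. AUTONAME builds names such as  */
/* senseek1_Mean and senseek1_StdDev automatically.             */
PROC MEANS DATA = sas.example NOPRINT;
CLASS age;
VAR senseek1--senseek4;
OUTPUT OUT = work.senseek_stats MEAN = STD = / AUTONAME;
RUN;

PROC PRINT DATA = work.senseek_stats;
TITLE 'Means and standard deviations saved by PROC MEANS';
RUN;
```

With a CLASS statement, the output data set also contains _TYPE_ and _FREQ_; the row with _TYPE_ = 0 holds the overall statistics.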
PROC MEANS – STATISTIC KEYWORDS:
DEFAULT
• Statistics printed by default
• N – Number of observations
• MEAN – mean
• STD – standard deviation
• MIN – minimum value
• MAX – maximum value
PROC MEANS EXAMPLE 1
*** This program is a standard PROC ***
*** MEANS looking at four sensation ***
*** seeking items in the example dataset ***;
PROC MEANS DATA = sas.example;
VAR senseek1--senseek4;
TITLE 'Standard output for means using example
data';
RUN;
PROC MEANS EXAMPLE 1 - OUTPUT
Variable   N     Mean   Std Dev   Minimum   Maximum
sseek1     158   3.46   1.37      1.00      5.00
sseek2     158   2.78   1.43      1.00      5.00
sseek3     158   2.62   1.47      1.00      5.00
sseek4     158   3.28   1.39      1.00      5.00
PROC MEANS – OTHER STATISTIC
KEYWORDS
• CLM = Two-sided confidence limits
• MEDIAN = Median
• NMISS = Number of missing values
• P10 = 10% quantile
• Q1 = 25% quantile
• RANGE = The range
• STDERR = Standard error of the mean
• SUM = Sum
• VAR = Variance
• T = Student's t
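A brief sketch of requesting specific keywords, assuming the example data set. Listing keywords on the PROC MEANS statement replaces the default statistics, so N and MEAN must be named if they are still wanted. MAXDEC= is an extra option used here only to shorten the printed decimals.

```sas
/* Request a custom set of statistics; only those listed are printed */
PROC MEANS DATA = sas.example N NMISS MEAN MEDIAN STDERR CLM MAXDEC = 2;
VAR senseek1--senseek4;
TITLE 'Selected statistics for sensation seeking items - example data';
RUN;
```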
PROC MEANS – EXAMPLE 2
*** This program runs a standard PROC MEANS on 2 items and then a paired t-test on their difference ***;
DATA one;
SET sas.example;
RUN;
PROC MEANS DATA = one;
VAR senseek1 senseek2;
TITLE 'Means of the 2 sensation seeking items to be used in t test';
RUN;
DATA two;
SET one;
ATTRIB difseek label = 'Differences between senseek1 and senseek2';
difseek = senseek2 - senseek1;
RUN;
PROC MEANS DATA = two n mean stderr t prt;
VAR difseek;
TITLE 'T test - differences between senseek2 and senseek1';
TITLE2 'Example data set';
RUN;
PROC MEANS –
EXAMPLE 2 OUTPUT
Variable   Label      N     Mean        Std Dev     Minimum     Maximum
senseek1   senseek1   158   3.4620253   1.3760281   1.0000000   5.0000000
senseek2   senseek2   158   2.7848101   1.4336350   1.0000000   5.0000000

Analysis Variable : difseek Differences between senseek1 and senseek2
N     Mean         Std Error   t Value   Pr > |t|
158   -0.6772152   0.1126051   -6.01     <.0001
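The same paired comparison can also be sketched with PROC TTEST, which is not covered on these slides and is shown only as an alternative. The PAIRED statement analyzes senseek2 minus senseek1, so no difference variable has to be created first; on the same complete pairs it should agree with the PROC MEANS result above.

```sas
/* Paired t test of senseek2 - senseek1 without a DATA step */
PROC TTEST DATA = sas.example;
PAIRED senseek2*senseek1;
TITLE 'Paired t test - senseek2 versus senseek1 - example data set';
RUN;
```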
PROC MEANS – CLASS Statement
• Class statement: calculates statistics for
each group in CLASS variable.
• CLASS variables can be numeric or
character.
• Data does not need to be sorted to use
CLASS statement.
PROC MEANS –CLASS syntax
*** This program includes a CLASS statement. ***
*** The CLASS statement creates separate analyses ***
*** for each category of data specified by the CLASS ***
*** statement. ***;
PROC MEANS DATA = sas.example;
CLASS age;
VAR senseek1--senseek4;
TITLE 'means of sensation seeking items by age';
TITLE2 'example data set';
RUN;
PROC MEANS – CLASS Output
age   N Obs   Variable   Label      N     Mean        Std Dev     Minimum     Maximum
11    3       senseek1   senseek1   3     4.6666667   0.5773503   4.0000000   5.0000000
              senseek2   senseek2   3     4.0000000   1.0000000   3.0000000   5.0000000
              senseek3   senseek3   3     1.6666667   0.5773503   1.0000000   2.0000000
              senseek4   senseek4   3     3.6666667   1.5275252   2.0000000   5.0000000
12    128     senseek1   senseek1   121   3.4380165   1.3837905   1.0000000   5.0000000
              senseek2   senseek2   121   2.7024793   1.4002656   1.0000000   5.0000000
              senseek3   senseek3   121   2.5123967   1.4440435   1.0000000   5.0000000
              senseek4   senseek4   121   3.2479339   1.3740511   1.0000000   5.0000000
13    30      senseek1   senseek1   28    3.5000000   1.2909944   1.0000000   5.0000000
              senseek2   senseek2   28    2.9642857   1.5511559   1.0000000   5.0000000
              senseek3   senseek3   28    3.1785714   1.5166623   1.0000000   5.0000000
              senseek4   senseek4   28    3.4285714   1.3991683   1.0000000   5.0000000
14    5       senseek1   senseek1   5     3.2000000   2.0493902   1.0000000   5.0000000
              senseek2   senseek2   5     3.0000000   1.8708287   1.0000000   5.0000000
              senseek3   senseek3   5     2.8000000   1.7888544   1.0000000   5.0000000
              senseek4   senseek4   5     3.2000000   2.0493902   1.0000000   5.0000000
PROC UNIVARIATE
• Provides descriptive statistics for numeric
variables (mean, standard deviation, range,
min, max, etc.)
• Provides more detailed information on the
distribution of a variable (extreme values,
plots of distribution, etc.)
PROC UNIVARIATE
• Example Syntax:
PROC UNIVARIATE <options>;
WHERE condition;
VAR variable list;
BY variable list;
FREQ variable list;
RUN;
PROC UNIVARIATE Syntax
*** This is an example of a standard PROC ***
*** UNIVARIATE program. It uses the variable ***
*** momed (education of the mother) in the example dataset ***;
PROC UNIVARIATE data = sas.example;
VAR momed;
TITLE "Univariate - mom's education - example dataset";
RUN;
PROC UNIVARIATE – Output
Moments
N 155 Sum Weights 155
Mean 3.66451613 Sum Observations 568
Std Deviation 1.15813004 Variance 1.34126519
Skewness -0.7379811 Kurtosis -0.3846022
Uncorrected SS 2288 Corrected SS 206.554839
Coeff Variation 31.6039007 Std Error Mean 0.09302324
Basic Statistical Measures
Location Variability
Mean 3.664516 Std Deviation 1.15813
Median 4.000000 Variance 1.34127
Mode 4.000000 Range 4.00000
Interquartile Range 1.00000
Tests for Location: Mu0=0
Test Statistic p Value
Student's t t 39.39355 Pr > |t| <.0001
Sign M 77.5 Pr >= |M| <.0001
Signed Rank S 6045 Pr >= |S| <.0001
Quantiles (Definition 5)
Quantile Estimate
100% Max 5
99% 5
95% 5
90% 5
75% Q3 4
50% Median 4
25% Q1 3
10% 2
5% 1
1% 1
0% Min 1
PROC UNIVARIATE – Output (cont.)
Extreme Observations
Lowest Highest
Value Obs Value Obs
1 199 5 190
1 185 5 191
1 169 5 193
1 137 5 194
1 95 5 196
Missing Values
Missing Value   Count   Percent Of All Obs   Percent Of Missing Obs
.               48      23.65                100.00
PROC UNIVARIATE - Plots
• Many different visualization options are
available using PROC UNIVARIATE and
coordinating statements
– HISTOGRAM
– PPPLOT
– PROBPLOT
– QQPLOT
– CDFPLOT
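A sketch of two of these statements, assuming the example data set. The NORMAL options, which overlay a fitted normal curve and draw a normal reference line, are illustrative choices rather than something the slides specify.

```sas
ODS GRAPHICS ON;

PROC UNIVARIATE DATA = sas.example;
VAR momed;
/* Histogram with a fitted normal density curve */
HISTOGRAM momed / NORMAL;
/* Q-Q plot with a reference line from the estimated mean and SD */
QQPLOT momed / NORMAL(MU = EST SIGMA = EST);
TITLE 'Histogram and Q-Q plot for momed - example data set';
RUN;
```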
PROC UNIVARIATE – Plot Syntax
LIBNAME sas "C:\Users\schmid\Desktop\Random SAS stuff";
DATA one;
SET sas.example;
RUN;
PROC UNIVARIATE PLOT;
VAR momed;
TITLE 'General plots given by univariate procedure for momed - SAS example data';
RUN;
PROC UNIVARIATE PLOT;
VAR momed;
HISTOGRAM;
TITLE 'Histogram given by univariate procedure for momed - SAS example data';
RUN;
PROC UNIVARIATE – Plot Output
[General plots and histogram for momed; figures not reproduced here.]
PROC FREQ
• Provides descriptive statistics in the form of
frequencies and crosstabulation tables.
• Provides statistics to analyze the relationships
between variables.
PROC FREQ
• Example Syntax:
PROC FREQ <options>;
BY variable list;
TABLES variable list </options>;
TEST <options>;
OUTPUT <OUT=DATA>;
RUN;
*If TABLES statement is omitted, one-way tables will be
generated for all variables.
PROC FREQ - Options
• COMPRESS
• DATA =
• FORMCHAR =
• NLEVELS
• NOPRINT
• ORDER=
• PAGE
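A short sketch of two of these options, assuming the example data set. NLEVELS adds a table with the number of distinct levels per variable, and ORDER=FREQ lists the most frequent values first.

```sas
/* One-way table for race, sorted by descending frequency, */
/* plus a count of distinct levels                         */
PROC FREQ DATA = sas.example NLEVELS ORDER = FREQ;
TABLES race;
TITLE 'Frequency: race ordered by frequency - example data set';
RUN;
```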
PROC FREQ – Basic Table Syntax
*** This is a standard PROC FREQ program. ***
*** The variables used are race and gender ***
*** Refresher information on formats ***
*** Example dataset continues to be used ***;
PROC FORMAT;
VALUE gendfmt 1 = 'Male'
2 = 'Female';
VALUE racefmt 1 = 'AA'
2 = 'White'
3 = 'Hispanic'
4 = 'Multi'
5 = 'Other';
PROC FREQ;
TABLES gender race;
FORMAT gender gendfmt. race racefmt.;
TITLE 'Frequency: gender and race';
TITLE2 'DATA SET: example';
RUN;
PROC FREQ – Basic Table Output
gender
gender   Frequency   Percent   Cumulative Frequency   Cumulative Percent
Male     84          50.00     84                     50.00
Female   84          50.00     168                    100.00
Frequency Missing = 35
race
race       Frequency   Percent   Cumulative Frequency   Cumulative Percent
AA         79          47.02     79                     47.02
White      68          40.48     147                    87.50
Hispanic   12          7.14      159                    94.64
Multi      5           2.98      164                    97.62
Other      4           2.38      168                    100.00
Frequency Missing = 35
PROC FREQ
• Provides various forms of crosstabulation
tables.
• One-way frequencies -> generates a table with the frequency of
the different values of a variable.
• Two-way crosstabulation table -> generates a frequency table with
the values of the two variables.
• N-way crosstabulation table -> generates a n-way frequency table
with the values of the n variables.
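A sketch of a three-way request, assuming the example data set. In an n-way request the last two variables form the rows and columns of each table, and the earlier variables define strata, so this should produce one race-by-gender table for each value of age.

```sas
/* Three-way crosstabulation: one race*gender table per age level */
PROC FREQ DATA = sas.example;
TABLES age*race*gender;
TITLE 'Crosstab - race by gender within age - example data set';
RUN;
```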
PROC FREQ – Crosstab Syntax
*** This program is an example of a crosstab ***
*** available as part of the PROC FREQ. ***
*** Variables used are gender and race ***
*** in the example dataset ***;
PROC FREQ;
TABLES race*gender;
FORMAT gender gendfmt. race racefmt.;
TITLE 'Crosstab - gender and race';
RUN;
PROC FREQ – Crosstab Output
Table of race by gender
Rows: race(race); columns: gender(gender)
Each cell: Frequency / Percent / Row Pct / Col Pct (Total cells: Frequency / Percent)
race       Male                         Female                       Total
AA         36 / 21.43 / 45.57 / 42.86   43 / 25.60 / 54.43 / 51.19   79 / 47.02
White      37 / 22.02 / 54.41 / 44.05   31 / 18.45 / 45.59 / 36.90   68 / 40.48
Hispanic   5 / 2.98 / 41.67 / 5.95      7 / 4.17 / 58.33 / 8.33      12 / 7.14
Multi      3 / 1.79 / 60.00 / 3.57      2 / 1.19 / 40.00 / 2.38      5 / 2.98
Other      3 / 1.79 / 75.00 / 3.57      1 / 0.60 / 25.00 / 1.19      4 / 2.38
Total      84 / 50.00                   84 / 50.00                   168 / 100.00
Frequency Missing = 35
PROC FREQ – TABLES Statement
Options
• LIST -> A list rather than a table.
• MISSING -> Missing values are included in
calculations of percentages.
• NOCOL -> Suppresses column percentages.
• NOROW -> Suppresses row percentages.
PROC FREQ – TABLES statement
options
• AGREE -> Tests and measures of classification
agreement.
• CHISQ -> Chi Square test of association
• CL -> Confidence limits
• CMH -> Mantel-Haenszel statistics
• MEASURES -> Association between variables
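A sketch of a statistical option in use, assuming the example data set. CHISQ requests chi-square tests of association for the race-by-gender table, and NOROW and NOCOL trim the cell contents. With sparse categories such as those in this table, SAS may warn that the chi-square test is not reliable because of small expected counts.

```sas
/* Chi-square test of association between race and gender */
PROC FREQ DATA = sas.example;
TABLES race*gender / CHISQ NOROW NOCOL;
TITLE 'Crosstab with chi-square test - race by gender';
RUN;
```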
PROC FREQ – TABLES syntax
*** This program is an example of a crosstab using variables race and gender ***
*** Specifically, this shows an example of the options: LIST and MISSING ***;
LIBNAME sas " ";
PROC FORMAT;
VALUE gendfmt 1 = 'Male'
2 = 'Female';
VALUE racefmt 1 = 'AA'
2 = 'White'
3 = 'Hispanic'
4 = 'Multi'
5 = 'Other';
PROC FREQ data = sas.example;
TABLES race*gender/LIST MISSING;
FORMAT gender gendfmt. race racefmt.;
TITLE 'FREQ: Gender and race crosstab - SAS dataset - Example';
RUN;
PROC FREQ – TABLES Output
race       gender   Frequency   Percent   Cumulative Frequency   Cumulative Percent
.          .        35          17.24     35                     17.24
AA         Female   36          17.73     71                     34.98
AA         Male     43          21.18     114                    56.16
White      Female   37          18.23     151                    74.38
White      Male     31          15.27     182                    89.66
Hispanic   Female   5           2.46      187                    92.12
Hispanic   Male     7           3.45      194                    95.57
Multi      Female   3           1.48      197                    97.04
Multi      Male     2           0.99      199                    98.03
Other      Female   1           0.49      200                    98.52
6          Female   2           0.99      202                    99.51
6          Male     1           0.49      203                    100.00
PROC CORR
• Computes correlation coefficients (Pearson by default) that
measure the linear relationship between pairs of variables.
• Example Syntax:
PROC CORR <options>;
BY <variable list>;
VAR <variable list>;
WITH <variable list>;
RUN;
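A sketch of the WITH statement, assuming the example data set. When WITH is present, the WITH variables form the rows and the VAR variables form the columns, so only those pairs are correlated instead of the full square matrix.

```sas
/* Correlate age (row) with each sensation seeking item (columns) */
PROC CORR DATA = sas.example;
VAR senseek1--senseek4;
WITH age;
TITLE 'Correlation of age with sensation seeking items';
RUN;
```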
PROC CORR – Basic Syntax
*** This uses the example dataset to conduct a ***
*** PROC CORR. The correlation matrix includes: ***
*** race, gender, age and the four sensation ***
*** seeking items ***;
PROC CORR data = sas.example;
VAR race gender age senseek1--senseek4;
TITLE 'Correlation matrix of variables in example data set';
RUN;
PROC CORR –Output
Simple Statistics
Variable N Mean Std Dev Sum Minimum Maximum Label
race 168 1.75000 0.97114 294.00000 1.00000 6.00000 race
gender 168 0.50000 0.50149 84.00000 0 1.00000 gender
age 166 12.22289 0.52080 2029 11.00000 14.00000 age
senseek1 158 3.46203 1.37603 547.00000 1.00000 5.00000 senseek1
senseek2 158 2.78481 1.43363 440.00000 1.00000 5.00000 senseek2
senseek3 158 2.62658 1.46937 415.00000 1.00000 5.00000 senseek3
senseek4 158 3.28481 1.38735 519.00000 1.00000 5.00000 senseek4
PROC CORR – Correlation Table
Pearson Correlation Coefficients
Prob > |r| under H0: Rho=0
Number of Observations
Each cell: correlation, then Prob > |r|, then number of observations
           race       gender     age        senseek1   senseek2   senseek3   senseek4
race       1.00000    -0.08607   -0.03687   0.23718    -0.08809   -0.05463   0.07986
                      0.2673     0.6372     0.0027     0.2710     0.4954     0.3186
           168        168        166        158        158        158        158
gender     -0.08607   1.00000    0.24364    0.08767    0.23030    0.09074    0.06865
           0.2673                0.0016     0.2734     0.0036     0.2568     0.3914
           168        168        166        158        158        158        158
age        -0.03687   0.24364    1.00000    -0.04656   0.03032    0.16658    0.01723
           0.6372     0.0016                0.5626     0.7062     0.0371     0.8304
           166        166        166        157        157        157        157
senseek1   0.23718    0.08767    -0.04656   1.00000    0.49306    0.29064    0.50117
           0.0027     0.2734     0.5626                <.0001     0.0002     <.0001
           158        158        157        158        158        158        158
senseek2   -0.08809   0.23030    0.03032    0.49306    1.00000    0.35771    0.46334
           0.2710     0.0036     0.7062     <.0001                <.0001     <.0001
           158        158        157        158        158        158        158
senseek3   -0.05463   0.09074    0.16658    0.29064    0.35771    1.00000    0.43057
           0.4954     0.2568     0.0371     0.0002     <.0001                <.0001
           158        158        157        158        158        158        158
senseek4   0.07986    0.06865    0.01723    0.50117    0.46334    0.43057    1.00000
           0.3186     0.3914     0.8304     <.0001     <.0001     <.0001
           158        158        157        158        158        158        158
PROC CORR - Options
• Data Options
– Input: DATA=
– Output: OUTP=, OUTS=, OUTK=, OUTH= (one output data set per coefficient type)
• Statistical Options
– Types of correlation coefficients
– NOMISS
• Graphics
– PLOTS= scatter
• Printed Output
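A sketch combining several of these options, assuming the example data set. PEARSON and SPEARMAN request two types of coefficients, NOMISS drops any observation with a missing value on a listed variable, and PLOTS=SCATTER needs ODS Graphics to be on.

```sas
ODS GRAPHICS ON;

/* Pearson and Spearman correlations with a scatter plot, */
/* using only observations complete on both variables     */
PROC CORR DATA = sas.example PEARSON SPEARMAN NOMISS PLOTS = SCATTER;
VAR senseek1 senseek2;
TITLE 'Pearson and Spearman correlations - senseek1 and senseek2';
RUN;
```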
PROC CORR - ALPHA
• Only available as part of the Pearson
Correlation statistics
• Internal consistency test for scales of items
that appear to be latent constructs.
• Higher positive values are better.
• How high is good enough? Depends on the
research.
• Missing data can cause error – use NOMISS
option.
PROC CORR – Alpha
*** This program will include a correlation matrix ***
*** and the Cronbach coefficient alpha to assess internal ***
*** reliability using the example data set ***;
PROC CORR alpha nomiss data = sas.example;
VAR senseek1--senseek4;
TITLE 'Alpha - sensation seeking variables in example data set';
RUN;
PROC CORR – Alpha Output
Simple Statistics
Variable N Mean Std Dev Sum Minimum Maximum Label
senseek1 158 3.46203 1.37603 547.00000 1.00000 5.00000 senseek1
senseek2 158 2.78481 1.43363 440.00000 1.00000 5.00000 senseek2
senseek3 158 2.62658 1.46937 415.00000 1.00000 5.00000 senseek3
senseek4 158 3.28481 1.38735 519.00000 1.00000 5.00000 senseek4
Cronbach Coefficient Alpha
Variables Alpha
Raw 0.743971
Standardized 0.745506
PROC CORR – Alpha Output
Cronbach Coefficient Alpha with Deleted Variable
                   Raw Variables                      Standardized Variables
Deleted Variable   Correlation with Total   Alpha      Correlation with Total   Alpha      Label
senseek1           0.545495                 0.681070   0.547708                 0.682299   senseek1
senseek2           0.561429                 0.671476   0.563171                 0.673517   senseek2
senseek3           0.443853                 0.738883   0.443617                 0.739238   senseek3
senseek4           0.606300                 0.646596   0.606967                 0.648182   senseek4
Pearson Correlation Coefficients, N = 158
Prob > |r| under H0: Rho=0
Each cell: correlation, then Prob > |r|
           senseek1   senseek2   senseek3   senseek4
senseek1   1.00000    0.49306    0.29064    0.50117
                      <.0001     0.0002     <.0001
senseek2   0.49306    1.00000    0.35771    0.46334
           <.0001                <.0001     <.0001
senseek3   0.29064    0.35771    1.00000    0.43057
           0.0002     <.0001                <.0001
senseek4   0.50117    0.46334    0.43057    1.00000
           <.0001     <.0001     <.0001
A Note About Missingness
• Whole courses can and have been taught
about missing data
• What about missing data and analysis?
• Know your data –> and that includes missing
data
• Talk to your team -> standards for handling
missing data
• Applications to correct for missingness
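A first step toward knowing the missing data can be sketched as follows, assuming the example data set. _NUMERIC_ is a SAS name list covering all numeric variables; the character variables momcode and nontrad are taken from the PROC CONTENTS listing earlier in this session.

```sas
/* Count non-missing (N) and missing (NMISS) values */
/* for every numeric variable                       */
PROC MEANS DATA = sas.example N NMISS;
VAR _NUMERIC_;
TITLE 'Missing value counts - numeric variables - example data set';
RUN;

/* For character variables, show missing as its own category */
PROC FREQ DATA = sas.example;
TABLES momcode nontrad / MISSING;
TITLE 'Missing value counts - character variables - example data set';
RUN;
```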
Basics of Output Delivery System
(ODS)
• Procedures only produce data.
• Output Delivery System (ODS) determines
where output should go and what it should
look like.
• Many different ways to display output.
• The example that I use the most, and will
describe here, creates RTF-formatted
documents.
ODS RTF (Rich Text Format)
• ODS RTF is an easy way to create output that can
be directly used in Word reports and PowerPoint
presentations
• Example Syntax:
ODS RTF <options>;
procedures to be run;
ODS RTF CLOSE;
• Key Options
– FILE = Where output is placed: "pathname of file.rtf";
– STYLE = Style definitions; see documentation
ODS RTF Syntax
*** This example uses the ODS RTF statements to create ***
*** an RTF output of means. Notice the use of the STYLE = ***
*** option to set it up in an APA-like format ***;
ODS RTF FILE = "C:\Users\schmid\Desktop\Random SAS stuff\means.rtf" STYLE=JOURNAL;
PROC MEANS DATA = sas.example;
VAR senseek1--senseek4;
TITLE 'Standard output for means using example data';
RUN;
ODS RTF CLOSE;
ODS RTF Output
Variable   Label      N     Mean        Std Dev     Minimum     Maximum
senseek1   senseek1   158   3.4620253   1.3760281   1.0000000   5.0000000
senseek2   senseek2   158   2.7848101   1.4336350   1.0000000   5.0000000
senseek3   senseek3   158   2.6265823   1.4693652   1.0000000   5.0000000
senseek4   senseek4   158   3.2848101   1.3873485   1.0000000   5.0000000

Workshop SAS FOR DATA MANAGEMENT, Session 3

  • 2. Introduction to SAS Procedures • SAS data set information – PROC CONTENTS – PROC PRINT • Descriptive statistics – PROC MEANS – PROC UNIVARIATE – PROC FREQ – PROC CORR
  • 3. What does my SAS Data Set Contain? • How many observations? • How many variables? • What kinds of variables?
  • 4. PROC CONTENTS • Provides information about the contents of a SAS data set. • Example Syntax: PROC CONTENTS <options>; TITLE ‘Contents Listing’; RUN;
  • 5. PROC CONTENTS - Options • DATA = <SAS data file> • OUT = <SAS data file> • DETAILS / NODETAILS • ORDER= • VARNUM
  • 6. PROC CONTENTS • Key items to look for: • Data set name • # of observations • # of variables • Date data set was created and last modified • List of variables with type, format and label
  • 7. PROC CONTENTS – Example Program *** This shows an example of PROC CONTENTS using the *** *** example data set ***; PROC CONTENTS DATA = sas.example; TITLE 'Contents listing - Example data set'; RUN; *** A TITLE statement includes the keyword TITLE and quotes *** *** either single or double - that give the output some *** *** meaningful title - best practice is to include *** *** the name of the procedure and the data set being used ***;
  • 8. PROC CONTENTS – Example Output Data Set Name SAS.EXAMPLE Observations 203 Member Type DATA Variables 17 Engine V9 Indexes 0 Created Wed, Feb 25, 2015 11:32:49 AM Observation Length 104 Last Modified Wed, Feb 25, 2015 11:32:49 AM Deleted Observations 0 Protection Compressed NO Data Set Type Sorted NO Label Data Representation WINDOWS_64 Encoding wlatin1 Western (Windows) Engine/Host Dependent Information Data Set Page Size 12288 Number of Data Set Pages 2 First Data Page 1 Max Obs per Page 117 Obs in First Data Page 90 Number of Data Set Repairs 0 Filename C:\Users\schmid\Desktop\Random SAS stuff\example.sas7bdat Release Created 9.0301M2 Host Created X64_7PRO
  • 9. PROC CONTENTS – Example Output Alphabetic List of Variables and Attributes # Variable Type Len Format Informat Label 1 ID Num 8 ID 5 age Num 8 age 2 date Num 8 DATE9. DATE9. date 4 gender Num 8 gender 14 livewdad Num 8 livewdad 13 livewmom Num 8 livewmom 12 momcode Char 1 $1. $1. momcode 11 momed Num 8 momed 17 nontrad Char 1 $1. $1. nontrad 3 race Num 8 race 6 sensation_seeking Char 1 $1. $1. sensation seeking 7 senseek1 Num 8 senseek1 8 senseek2 Num 8 senseek2 9 senseek3 Num 8 senseek3 10 senseek4 Num 8 senseek4 15 totfam Char 1 $1. $1. totfam 16 tradfam Char 1 $1. $1. tradfam
  • 10. PROC PRINT • PROC PRINT -> prints a list of observations in a SAS data set. • Example syntax: PROC PRINT <options>; WHERE condition; VAR variable list; BY variable list; SUM variable list; TITLE 'Print'; RUN;
  • 11. PROC PRINT - Options • DATA = <SAS Data Set> • BLANKLINE = (n) • DOUBLE • HEADING • LABEL • NOOBS • ROUND • ROWS = PAGE
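A minimal sketch of several of these options used together, assuming the example data set. The OBS= data set option (not listed on the slide) limits the listing to the first 10 observations, and the label text for momed is a made-up wording for illustration only.

```sas
* NOOBS drops the Obs column, LABEL prints labels as column headings;
* DOUBLE double-spaces the listing;
PROC PRINT DATA = sas.example (OBS = 10) NOOBS LABEL DOUBLE;
VAR id age momed;
* Hypothetical label text, assumed for this sketch;
LABEL momed = 'Education level of mother';
TITLE 'Print - first 10 observations with labels - example data set';
RUN;
```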
  • 12. PROC PRINT – VAR statement • Lists the variables to be printed. • The VAR statement is optional. • If omitted all the variables in the data set will be printed. • Variables are printed in the order listed in the VAR statement.
  • 13. Example: PROC PRINT syntax *** This shows an example of PROC PRINT *** *** using the VAR statement and printing only *** *** student ID, gender and race *** *** using the example data set ***; PROC PRINT DATA = sas.example NOOBS; VAR id gender race; TITLE 'Print list - studentid gender and race - example data set'; RUN;
  • 14. Example: PROC PRINT output ID gender race 1188 1 1 1214 0 1 1146 0 2 1203 0 1 1218 1 2 1101 1 1 1102 1 1 1103 0 2 1104 0 2 1105 1 2 1106 0 1 1108 0 1 1109 1 2 1110 0 1 1112 0 2 1114 1 1 1115 1 1 1116 0 2 1118 1 3
  • 15. PROC PRINT – BY Statement • Prints data separately for each group in the BY variable. • The BY statement is optional • When using the BY statement, the data must first be sorted by the variable(s) listed in the BY statement.
  • 16. PROC PRINT – BY syntax *** This is an example of a PROC PRINT using the BY statement ***; PROC SORT DATA = sas.example; BY age; RUN; PROC PRINT DATA= sas.example; VAR senseek1--senseek4; BY age; TITLE 'PRINT LIST - senseek by age'; RUN;
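PROC SORT without an OUT= option replaces the input data set with its sorted version. Where the permanent data set should keep its original order, one possible variation writes the sorted copy to the WORK library instead. The name work.example_sorted is an assumption for this sketch.

```sas
* Write the sorted copy to WORK so sas.example keeps its original order;
PROC SORT DATA = sas.example OUT = work.example_sorted;
BY age;
RUN;

* BY-group printing now reads the sorted copy;
PROC PRINT DATA = work.example_sorted;
VAR senseek1--senseek4;
BY age;
TITLE 'PRINT LIST - senseek by age (sorted copy)';
RUN;
```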
  • 17. PROC PRINT – BY OUTPUT • Age = 11 • Age = 12 Obs senseek1 senseek2 senseek3 senseek4 38 5 3 2 2 39 5 5 1 5 40 4 4 2 4 Obs senseek1 senseek2 senseek3 senseek4 41 4 5 3 3 42 5 3 1 4 43 4 1 1 3 44 4 3 2 3 45 5 2 3 4 46 2 2 3 3 47 4 4 2 4 48 3 4 4 3 49 5 5 5 5
  • 18. PROC PRINT – WHERE statement • WHERE statement can be used to display a subset of the data set. • The WHERE statement works in the PROC step as well as the DATA step
  • 19. PROC PRINT – WHERE syntax *** This is an example of a PROC PRINT using a WHERE *** *** statement using the example data set ***; DATA one; SET sas.example; PROC PRINT; WHERE age = 14; VAR age senseek1--senseek4; TITLE 'PRINT - Age 14 - sensation seeking using example data'; RUN;
  • 20. PROC PRINT – WHERE OUTPUT Obs age senseek1 senseek2 senseek3 senseek4 199 14 4 4 5 5 200 14 5 4 3 4 201 14 1 1 4 1 202 14 5 5 1 5 203 14 1 1 1 1
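Because WHERE works directly in a PROC step, the intermediate DATA step is optional. The sketch below is one possible variation that reads sas.example directly and combines two conditions. The particular cutoffs are assumptions chosen only to illustrate the syntax.

```sas
* Subset directly in the PROC step, with no intermediate DATA step;
PROC PRINT DATA = sas.example;
* Keep students aged 13 or older who answered the first item;
WHERE age >= 13 AND senseek1 IS NOT MISSING;
VAR age senseek1--senseek4;
TITLE 'PRINT - Age 13 and older with non-missing senseek1';
RUN;
```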
  • 21. How To Obtain Descriptive Statistics • PROC MEANS • PROC UNIVARIATE • PROC FREQ • PROC CORR
  • 22. PROC MEANS • Example Syntax: PROC MEANS <options> <statistic keyword list>; WHERE condition; VAR variable list; CLASS variable list; BY variable list; OUTPUT <OUT = SAS dataset>; RUN;
  • 23. PROC MEANS - Options • DATA = • Classification levels control • Output control • Output dataset control • Statistical analysis control
  • 24. PROC MEANS – STATISTIC KEYWORDS: DEFAULT • Statistics printed by default • N – Number of observations • MEAN – mean • STD – standard deviation • MIN – minimum value • MAX – maximum value
  • 25. PROC MEANS EXAMPLE 1 *** This program is a standard PROC *** *** MEANS looking at four sensation *** *** seeking items in the example dataset ***; PROC MEANS DATA = sas.example; VAR senseek1--senseek4; TITLE 'Standard output for means using example data'; RUN;
  • 26. PROC MEANS EXAMPLE 1 - OUTPUT Variable N Mean Std Dev Minimum Maximum sseek1 158 3.46 1.37 1.00 5.00 sseek2 158 2.78 1.43 1.00 5.00 sseek3 158 2.62 1.47 1.00 5.00 sseek4 158 3.28 1.39 1.00 5.00
  • 27. PROC MEANS – OTHER STATISTIC KEYWORDS • CLM = two sided confidence limits • Median = Median • NMISS = Number of missing values • P10 = 10% quantile • Q1 = 25% quantile • Range = the range • STDERR = Standard error of the mean • SUM = Sum • VAR = Variance • T = Student’s t
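Listing statistic keywords on the PROC MEANS statement replaces the default set, so any default statistic that is still wanted (such as N or MEAN) has to be named again. A minimal sketch, assuming the example data set; Q3 and the MAXDEC= option are not on the slide and are added here for illustration.

```sas
* Request specific statistics instead of the default N, MEAN, STD, MIN, MAX;
* MAXDEC = 2 limits the printed results to two decimal places;
PROC MEANS DATA = sas.example N NMISS MEAN MEDIAN Q1 Q3 STDERR CLM MAXDEC = 2;
VAR senseek1--senseek4;
TITLE 'Selected statistics for sensation seeking items - example data';
RUN;
```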
  • 28. PROC MEANS – EXAMPLE 2 *** This program is a standard PROC MEANS on 2 items and doing a paired t-test ***; DATA one; SET sas.example; PROC MEANS; VAR senseek1 senseek2; TITLE 'Means of the 2 sensation seeking items to be used in t test'; RUN; DATA two; SET one; ATTRIB difseek label = 'Differences between senseek1 and senseek2'; difseek = senseek2 - senseek1; PROC MEANS n mean stderr t prt; VAR difseek; TITLE 'T test - differences between senseek2 and senseek1'; TITLE2 'Example data set'; RUN;
  • 29. PROC MEANS – EXAMPLE 2 OUTPUT Variable Label N Mean Std Dev Minimum Maximum senseek1 senseek1 158 3.4620253 1.3760281 1.0000000 5.0000000 senseek2 senseek2 158 2.7848101 1.4336350 1.0000000 5.0000000 Analysis Variable : difseek Differences between senseek1 and senseek2 N Mean Std Error t Value Pr > |t| 158 -0.6772152 0.1126051 -6.01 <.0001
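The t value in this output is the mean difference divided by its standard error: -0.6772152 / 0.1126051 is approximately -6.01. As a cross-check, the same paired comparison can be requested from PROC TTEST, which is not covered in these slides; the sketch below assumes the example data set and should be read as an alternative route, not as the method used in the deck.

```sas
* PAIRED senseek2*senseek1 analyzes the difference senseek2 - senseek1;
* which matches the definition of difseek in the DATA step approach;
PROC TTEST DATA = sas.example;
PAIRED senseek2*senseek1;
TITLE 'Paired t test - senseek2 minus senseek1 - example data set';
RUN;
```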
  • 30. PROC MEANS – CLASS Statement • Class statement: calculates statistics for each group in CLASS variable. • CLASS variables can be numeric or character. • Data does not need to be sorted to use CLASS statement.
  • 31. PROC MEANS –CLASS syntax *** This program includes a CLASS statement. *** *** The CLASS statement creates separate analyses *** *** for each category of data specified by the CLASS *** *** statement. ***; PROC MEANS DATA = sas.example; CLASS age; VAR senseek1--senseek4; TITLE 'means of sensation seeking items by age'; TITLE2 'example data set'; RUN;
  • 32. PROC MEANS – CLASS Output age N Obs Variable Label N Mean Std Dev Minimum Maximum 11 3 senseek1 senseek1 3 4.6666667 0.5773503 4.0000000 5.0000000 senseek2 senseek2 3 4.0000000 1.0000000 3.0000000 5.0000000 senseek3 senseek3 3 1.6666667 0.5773503 1.0000000 2.0000000 senseek4 senseek4 3 3.6666667 1.5275252 2.0000000 5.0000000 12 128 senseek1 senseek1 121 3.4380165 1.3837905 1.0000000 5.0000000 senseek2 senseek2 121 2.7024793 1.4002656 1.0000000 5.0000000 senseek3 senseek3 121 2.5123967 1.4440435 1.0000000 5.0000000 senseek4 senseek4 121 3.2479339 1.3740511 1.0000000 5.0000000 13 30 senseek1 senseek1 28 3.5000000 1.2909944 1.0000000 5.0000000 senseek2 senseek2 28 2.9642857 1.5511559 1.0000000 5.0000000 senseek3 senseek3 28 3.1785714 1.5166623 1.0000000 5.0000000 senseek4 senseek4 28 3.4285714 1.3991683 1.0000000 5.0000000 14 5 senseek1 senseek1 5 3.2000000 2.0493902 1.0000000 5.0000000 senseek2 senseek2 5 3.0000000 1.8708287 1.0000000 5.0000000 senseek3 senseek3 5 2.8000000 1.7888544 1.0000000 5.0000000 senseek4 senseek4 5 3.2000000 2.0493902 1.0000000 5.0000000
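The OUTPUT statement from the general PROC MEANS syntax can store these per-group statistics in a data set for later use. A minimal sketch, assuming the example data set; the data set name work.age_means and the variable names mean1-mean4 and n1-n4 are hypothetical names chosen for this illustration.

```sas
* NWAY keeps only the rows for each age group (no overall row);
* NOPRINT suppresses the printed table because results go to a data set;
PROC MEANS DATA = sas.example NWAY NOPRINT;
CLASS age;
VAR senseek1--senseek4;
OUTPUT OUT = work.age_means MEAN = mean1-mean4 N = n1-n4;
RUN;

* One row per age group, with the means and Ns as columns;
PROC PRINT DATA = work.age_means NOOBS;
TITLE 'Means of sensation seeking items by age - saved data set';
RUN;
```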
  • 33. PROC UNIVARIATE • Provides descriptive statistics for numeric variables (mean, standard deviation, range, min, max, etc.) • Provides more detailed information on the distribution of a variable (extreme values, plots of distribution, etc.)
  • 34. PROC UNIVARIATE • Example Syntax: PROC UNIVARIATE <options>; WHERE condition; VAR variable list; BY variable list; FREQ variable list; RUN;
  • 35. PROC UNIVARIATE Syntax *** This is an example of a standard PROC *** *** UNIVARIATE program. It uses the variable momed *** *** (mother education) in the example dataset ***; PROC UNIVARIATE data = sas.example; VAR momed; TITLE "Univariate - mom's education - example dataset"; RUN;
  • 36. PROC UNIVARIATE – Output Moments N 155 Sum Weights 155 Mean 3.66451613 Sum Observations 568 Std Deviation 1.15813004 Variance 1.34126519 Skewness -0.7379811 Kurtosis -0.3846022 Uncorrected SS 2288 Corrected SS 206.554839 Coeff Variation 31.6039007 Std Error Mean 0.09302324 Basic Statistical Measures Location Variability Mean 3.664516 Std Deviation 1.15813 Median 4.000000 Variance 1.34127 Mode 4.000000 Range 4.00000 Interquartile Range 1.00000 Tests for Location: Mu0=0 Test Statistic p Value Student's t t 39.39355 Pr > |t| <.0001 Sign M 77.5 Pr >= |M| <.0001 Signed Rank S 6045 Pr >= |S| <.0001 Quantiles (Definition 5) Quantile Estimate 100% Max 5 99% 5 95% 5 90% 5 75% Q3 4 50% Median 4 25% Q1 3 10% 2 5% 1 1% 1 0% Min 1
  • 37. PROC UNIVARIATE – Output (cont.) Extreme Observations Lowest Highest Value Obs Value Obs 1 199 5 190 1 185 5 191 1 169 5 193 1 137 5 194 1 95 5 196 Missing Values Missing Value Count Percent Of All Obs Missing Obs . 48 23.65 100.00
  • 38. PROC UNIVARIATE - Plots • Many different visualization options are available using PROC UNIVARIATE and coordinating statements – HISTOGRAM – PPPLOT – PROBPLOT – QQPLOT – CDFPLOT
  • 39. PROC UNIVARIATE – Plot Syntax LIBNAME sas "C:\Users\schmid\Desktop\Random SAS stuff"; DATA one; SET sas.example; PROC UNIVARIATE PLOT; VAR momed; TITLE 'General plots given by univariate procedure for momed - SAS example data'; RUN; PROC UNIVARIATE PLOT; VAR momed; HISTOGRAM; TITLE 'Histogram given by univariate procedure for momed - SAS example data'; RUN;
  • 40. PROC UNIVARIATE – Plot Output
  • 41. PROC UNIVARIATE – Plot output
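The plot statements accept their own options after a slash. As a hedged sketch, the step below overlays a fitted normal curve on the histogram and adds a normal Q-Q plot for momed. The choice of the normal distribution is an assumption made for illustration, not a claim about how momed is distributed.

```sas
ODS GRAPHICS ON;

PROC UNIVARIATE DATA = sas.example;
VAR momed;
* Histogram with a normal curve fitted to the data;
HISTOGRAM momed / NORMAL;
* Q-Q plot against a normal distribution with estimated mean and std dev;
QQPLOT momed / NORMAL(MU = EST SIGMA = EST);
TITLE 'Histogram and Q-Q plot for momed - SAS example data';
RUN;
```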
  • 42. PROC FREQ • Provides descriptive statistics in the form of frequencies and crosstabulation tables. • Provides statistics to analyze the relationships between variables.
  • 43. PROC FREQ • Example Syntax: PROC FREQ <options>; BY variable list; TABLES variable list </options>; TEST <options>; OUTPUT <OUT=DATA>; RUN; *If TABLES statement is omitted, one-way tables will be generated for all variables.
  • 44. PROC FREQ - Options • COMPRESS • DATA = • FORMCHAR = • NLEVELS • NOPRINT • ORDER= • PAGE
  • 45. PROC FREQ – Basic Table Syntax *** This is a standard PROC FREQ program. *** *** The variables used are race and gender *** *** Refresher information on formats *** *** Example dataset continues to be used ***; PROC FORMAT; VALUE gendfmt 1 = 'Male' 2 = 'Female'; VALUE racefmt 1 = 'AA' 2 = 'White' 3 = 'Hispanic' 4 = 'Multi' 5 = 'Other'; PROC FREQ; TABLES gender race; FORMAT gender gendfmt. race racefmt.; TITLE 'Frequency: gender and race'; TITLE2 'DATA SET: example'; RUN;
  • 46. PROC FREQ – Basic Table Output gender gender Frequency Percent Cumulative Frequency Cumulative Percent Male 84 50.00 84 50.00 Female 84 50.00 168 100.00 Frequency Missing = 35 race race Frequency Percent Cumulative Frequency Cumulative Percent AA 79 47.02 79 47.02 White 68 40.48 147 87.50 Hispanic 12 7.14 159 94.64 Multi 5 2.98 164 97.62 Other 4 2.38 168 100.00 Frequency Missing = 35
  • 47. PROC FREQ • Provides various forms of crosstabulation tables. • One-way frequencies -> generates a table with the frequency of the different values of a variable. • Two-way crosstabulation table -> generates a frequency table with the values of the two variables. • N-way crosstabulation table -> generates an n-way frequency table with the values of the n variables.
  • 48. PROC FREQ – Crosstab Syntax *** This program is an example of a crosstab *** *** available as part of the PROC FREQ. *** *** Variables used are gender and race *** *** in the example dataset ***; PROC FREQ; TABLES race*gender; FORMAT gender gendfmt. race racefmt.; TITLE 'Crosstab - gender and race'; RUN;
  • 49. PROC FREQ – Crosstab Output Table of race by gender race(race) gender(gender) Frequency Percent Row Pct Col Pct Male Female Total AA 36 21.43 45.57 42.86 43 25.60 54.43 51.19 79 47.02 White 37 22.02 54.41 44.05 31 18.45 45.59 36.90 68 40.48 Hispanic 5 2.98 41.67 5.95 7 4.17 58.33 8.33 12 7.14 Multi 3 1.79 60.00 3.57 2 1.19 40.00 2.38 5 2.98 Other 3 1.79 75.00 3.57 1 0.60 25.00 1.19 4 2.38 Total 84 50.00 84 50.00 168 100.00 Frequency Missing = 35
  • 50. PROC FREQ – TABLES Statement Options • LIST -> A list rather than a table. • MISSING -> Missing values are included in calculations of percentages. • NOCOL -> Suppresses column percentages. • NOROW -> Suppresses row percentages.
  • 51. PROC FREQ – TABLES statement options • Agree -> Test and measures of classification agreement. • CHISQ -> Chi Square test of association • CL -> Confidence limits • CMH -> Mantel-Haenszel statistics • MEASURES -> Association between variables
  • 52. PROC FREQ – TABLES syntax *** This program is an example of a crosstab using variables race and gender *** *** Specifically, this shows an example of the options: LIST and MISSING ***; LIBNAME sas " "; PROC FORMAT; VALUE gendfmt 1 = 'Male' 2 = 'Female'; VALUE racefmt 1 = 'AA' 2 = 'White' 3 = 'Hispanic' 4 = 'Multi' 5 = 'Other'; PROC FREQ data = sas.example; TABLES race*gender/LIST MISSING; FORMAT gender gendfmt. race racefmt.; TITLE 'FREQ: Gender and race crosstab - SAS dataset - Example'; RUN;
  • 53. PROC FREQ – TABLES Output race gender Frequency Percent Cumulative Frequency Cumulative Percent . . 35 17.24 35 17.24 AA Female 36 17.73 71 34.98 AA Male 43 21.18 114 56.16 White Female 37 18.23 151 74.38 White Male 31 15.27 182 89.66 Hispanic Female 5 2.46 187 92.12 Hispanic Male 7 3.45 194 95.57 Multi Female 3 1.48 197 97.04 Multi Male 2 0.99 199 98.03 Other Female 1 0.49 200 98.52 6 Female 2 0.99 202 99.51 6 Male 1 0.49 203 100.00
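The statistical TABLES options are requested the same way as LIST and MISSING. A minimal sketch of a chi-square test of association between race and gender, assuming the example data set and that the gendfmt and racefmt formats have already been created with PROC FORMAT as on the earlier slides. NOPERCENT is an additional option not listed above.

```sas
* CHISQ requests chi-square tests; NOCOL and NOPERCENT trim the cell contents;
PROC FREQ DATA = sas.example;
TABLES race*gender / CHISQ NOCOL NOPERCENT;
FORMAT gender gendfmt. race racefmt.;
TITLE 'FREQ: Chi-square test - race by gender - Example';
RUN;
```

Several race categories in this data set have very small counts, so the chi-square results from a table like this should be interpreted with caution.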
  • 54. PROC CORR • Creates a correlation coefficient that measures the relationship between two variables. • Example Syntax: PROC CORR <options>; BY <variable list>; VAR <variable list>; WITH <variable list>; RUN;
  • 55. PROC CORR – Basic Syntax *** This uses the example dataset to conduct a *** *** PROC CORR. The correlation matrix includes: *** *** race, gender, age and the four sensation *** *** seeking items ***; PROC CORR data = sas.example; VAR race gender age senseek1--senseek4; TITLE 'Correlation matrix of variables in example data set'; RUN;
  • 56. PROC CORR –Output Simple Statistics Variable N Mean Std Dev Sum Minimum Maximum Label race 168 1.75000 0.97114 294.00000 1.00000 6.00000 race gender 168 0.50000 0.50149 84.00000 0 1.00000 gender age 166 12.22289 0.52080 2029 11.00000 14.00000 age senseek1 158 3.46203 1.37603 547.00000 1.00000 5.00000 senseek1 senseek2 158 2.78481 1.43363 440.00000 1.00000 5.00000 senseek2 senseek3 158 2.62658 1.46937 415.00000 1.00000 5.00000 senseek3 senseek4 158 3.28481 1.38735 519.00000 1.00000 5.00000 senseek4
  • 57. PROC CORR – Correlation Table Pearson Correlation Coefficients Prob > |r| under H0: Rho=0 Number of Observations race gender age senseek1 senseek2 senseek3 senseek4 race race 1.00000 168 -0.08607 0.2673 168 -0.03687 0.6372 166 0.23718 0.0027 158 -0.08809 0.2710 158 -0.05463 0.4954 158 0.07986 0.3186 158 gender gender -0.08607 0.2673 168 1.00000 168 0.24364 0.0016 166 0.08767 0.2734 158 0.23030 0.0036 158 0.09074 0.2568 158 0.06865 0.3914 158 age age -0.03687 0.6372 166 0.24364 0.0016 166 1.00000 166 -0.04656 0.5626 157 0.03032 0.7062 157 0.16658 0.0371 157 0.01723 0.8304 157 senseek1 senseek1 0.23718 0.0027 158 0.08767 0.2734 158 -0.04656 0.5626 157 1.00000 158 0.49306 <.0001 158 0.29064 0.0002 158 0.50117 <.0001 158 senseek2 senseek2 -0.08809 0.2710 158 0.23030 0.0036 158 0.03032 0.7062 157 0.49306 <.0001 158 1.00000 158 0.35771 <.0001 158 0.46334 <.0001 158 senseek3 senseek3 -0.05463 0.4954 158 0.09074 0.2568 158 0.16658 0.0371 157 0.29064 0.0002 158 0.35771 <.0001 158 1.00000 158 0.43057 <.0001 158 senseek4 senseek4 0.07986 0.3186 158 0.06865 0.3914 158 0.01723 0.8304 157 0.50117 <.0001 158 0.46334 <.0001 158 0.43057 <.0001 158 1.00000 158
  • 58. PROC CORR - Options • Data Options – Input: DATA= – OUTPUT: OUT(letter)= • Statistical Options – Types of correlation coefficients – NOMISS • Graphics – PLOTS= scatter • Printed Output
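A hedged sketch combining several of these options: two types of correlation coefficients, NOMISS, a scatter plot, and a Pearson output data set. The output data set name work.corr_out is an assumption for this illustration.

```sas
ODS GRAPHICS ON;

* PEARSON and SPEARMAN request both coefficient types;
* NOMISS drops observations with a missing value on any VAR variable;
* OUTP= saves the Pearson correlations to a data set;
PROC CORR DATA = sas.example PEARSON SPEARMAN NOMISS
          PLOTS = SCATTER OUTP = work.corr_out;
VAR senseek1 senseek2;
TITLE 'Pearson and Spearman correlations - senseek1 and senseek2';
RUN;
```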
  • 59. PROC CORR - ALPHA • Only available as part of the Pearson Correlation statistics • Internal consistency test for scales of items that appear to be latent constructs. • Higher positive values are better. • How high is good enough? Depends on the research. • Missing data can cause errors – use NOMISS option.
  • 60. PROC CORR – Alpha *** This program will include a correlation matrix *** *** and the Cronbach coefficient alpha to assess internal reliability *** *** using the example data set ***; PROC CORR alpha nomiss data = sas.example; VAR senseek1--senseek4; TITLE 'Alpha - sensation seeking variables in example data set'; RUN;
  • 61. PROC CORR – Alpha Output Simple Statistics Variable N Mean Std Dev Sum Minimum Maximum Label senseek1 158 3.46203 1.37603 547.00000 1.00000 5.00000 senseek1 senseek2 158 2.78481 1.43363 440.00000 1.00000 5.00000 senseek2 senseek3 158 2.62658 1.46937 415.00000 1.00000 5.00000 senseek3 senseek4 158 3.28481 1.38735 519.00000 1.00000 5.00000 senseek4 Cronbach Coefficient Alpha Variables Alpha Raw 0.743971 Standardized 0.745506
  • 62. PROC CORR – Alpha Output Cronbach Coefficient Alpha with Deleted Variable Deleted Variable Raw Variables Standardized Variables Label Correlation with Total Alpha Correlation with Total Alpha senseek1 0.545495 0.681070 0.547708 0.682299 senseek1 senseek2 0.561429 0.671476 0.563171 0.673517 senseek2 senseek3 0.443853 0.738883 0.443617 0.739238 senseek3 senseek4 0.606300 0.646596 0.606967 0.648182 senseek4 Pearson Correlation Coefficients, N = 158 Prob > |r| under H0: Rho=0 senseek1 senseek2 senseek3 senseek4 senseek1 senseek1 1.00000 0.49306 <.0001 0.29064 0.0002 0.50117 <.0001 senseek2 senseek2 0.49306 <.0001 1.00000 0.35771 <.0001 0.46334 <.0001 senseek3 senseek3 0.29064 0.0002 0.35771 <.0001 1.00000 0.43057 <.0001 senseek4 senseek4 0.50117 <.0001 0.46334 <.0001 0.43057 <.0001 1.00000
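As a worked check on this output, the standardized alpha can be reproduced by hand from the six Pearson correlations, using the usual formula for standardized alpha in terms of the number of items k and the mean inter-item correlation. The symbols k and r-bar are notation introduced here, not taken from the slides.

```latex
\bar{r} = \frac{0.49306 + 0.29064 + 0.50117 + 0.35771 + 0.46334 + 0.43057}{6}
        = \frac{2.53649}{6} \approx 0.42275

\alpha_{\text{std}} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
                    = \frac{4 \times 0.42275}{1 + 3 \times 0.42275}
                    = \frac{1.69099}{2.26825} \approx 0.7455
```

This agrees with the Standardized value of 0.745506 reported by PROC CORR, up to rounding of the printed correlations.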
  • 63. A Note About Missingness • Whole courses can be and have been taught about missing data • What about missing data and analysis? • Know your data –> and that includes missing data • Talk to your team -> standards for handling missing data • Applications to correct for missingness
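One simple way to start getting to know the missing data is to count missing values for every numeric variable at once. A minimal sketch, assuming the example data set; the special variable list _NUMERIC_ selects all numeric variables.

```sas
* N = non-missing count, NMISS = missing count, for each numeric variable;
PROC MEANS DATA = sas.example N NMISS;
VAR _NUMERIC_;
TITLE 'Missing value counts - numeric variables - example data set';
RUN;
```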
  • 64. Basics of Output Delivery System (ODS) • Procedures only produce data. • Output Delivery System (ODS) determines where output should go and what it should look like. • Many different ways to display output. • The example that I use the most, and will describe here, creates RTF-formatted documents
  • 65. ODS RTF (Rich Text Format) • ODS RTF is an easy way to create output that can be directly used in Word reports and PowerPoint presentations • Example Syntax: ODS RTF <options>; procedures to be run; ODS RTF CLOSE; • Key Options – FILE = Where output is placed: “pathname of file.rtf”; – STYLE = Style definitions; see documentation
  • 66. ODS RTF Syntax *** This example uses the ODS RTF commands to create *** *** an RTF output of means. Notice the use of the STYLE = *** *** option to set it up in an APA-like format ***; ODS RTF FILE = "C:\Users\schmid\Desktop\Random SAS stuff\means.rtf" STYLE=JOURNAL; PROC MEANS DATA = sas.example; VAR senseek1--senseek4; TITLE 'Standard output for means using example data'; RUN; ODS RTF CLOSE;
  • 67. ODS RTF Output Variable Label N Mean Std Dev Minimum Maximum senseek1 senseek1 158 3.4620253 1.3760281 1.0000000 5.0000000 senseek2 senseek2 158 2.7848101 1.4336350 1.0000000 5.0000000 senseek3 senseek3 158 2.6265823 1.4693652 1.0000000 5.0000000 senseek4 senseek4 158 3.2848101 1.3873485 1.0000000 5.0000000