1. Reading Data
1. SAS
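I. There are many ways to read data in SAS, but the most common is to enter the data directly in a DATA step:
Data data_name; /* name of the data set to create */
Input x y z; /* variables to read */
/* copy and paste the data rows between Datalines and the closing semicolon, in place of the two example rows */
Datalines;
1 2 4
4 5 7
;
Run;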
II. Data can be read from a file stored on the computer:
data data_name;
infile 'C:Avi soy.txt';
input GE ENV YIELD;
run;
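Alternatively, assign a file reference (fileref) with a FILENAME statement and use it in the INFILE statement:
filename avi 'C:Avi soy.txt';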
data data_name;
infile avi;
input GE ENV YIELD;
run;
III. Data can also be imported with the import option under the File menu.
2. R
First set the working directory to the folder that contains the data:
getwd() # get current working directory
setwd("C:/MyFolder") # set working directory
I. If the data are in text format, read them with:
mydata <- read.table("data_name.txt") # add header = TRUE if the first row holds column names
II. If the data are in CSV format, read them with:
mydata <- read.csv("data_name.csv")
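As a self-contained check of the two calls above, the following sketch writes a small file to a temporary folder and reads it back. The column names GE, ENV, and YIELD are borrowed from the SAS example; the values and the temporary file names are made up for illustration only.

```r
# Illustrative values only; column names follow the SAS example above
toy <- data.frame(GE    = c(1, 1, 2, 2),
                  ENV   = c(1, 2, 1, 2),
                  YIELD = c(2.1, 2.4, 3.0, 2.8))

# Temporary files stand in for data_name.txt and data_name.csv
txt_file <- tempfile(fileext = ".txt")
csv_file <- tempfile(fileext = ".csv")
write.table(toy, txt_file, row.names = FALSE, quote = FALSE)
write.csv(toy, csv_file, row.names = FALSE)

# read.table needs header = TRUE here because the first row holds names;
# read.csv assumes a header row by default
from_txt <- read.table(txt_file, header = TRUE)
from_csv <- read.csv(csv_file)

str(from_txt)   # check the column types
head(from_csv)  # look at the first rows
```

Running str() or head() right after reading is a quick way to confirm that the columns were split and typed as expected.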
2. Descriptive Statistics
1. SAS
Descriptive statistics in SAS can be computed with PROC UNIVARIATE:
PROC UNIVARIATE DATA=data_name;
VAR yield; /* variable to describe */
BY ENV; /* separate analysis per group; the data must first be sorted by ENV (PROC SORT) */
HISTOGRAM;
RUN;
Other options are PROC FREQ and PROC MEANS:
PROC FREQ DATA=data_name;
TABLES yield ;
RUN;
PROC MEANS DATA=data_name;
CLASS GE;
VAR yield;
RUN;
2. R
Descriptive statistics in R can be computed in the following ways:
summary(mydata)
Several R packages can also be loaded and used for descriptive statistics, such as Hmisc, pastecs, and psych.
library(Hmisc)
describe(mydata)
library(pastecs)
stat.desc(mydata)
library(psych)
describe(mydata)
Note: Install these packages first with install.packages().
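Group-wise statistics, comparable to PROC MEANS with a CLASS statement, can also be obtained with base R alone. The sketch below uses the built-in mtcars data set as a stand-in, which is an assumption of this example: mpg plays the role of yield and cyl the role of GE.

```r
# mtcars ships with R; mpg stands in for yield and cyl for the grouping factor
group_means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)  # mean per group
group_sd    <- tapply(mtcars$mpg, mtcars$cyl, sd)               # SD per group
group_n     <- table(mtcars$cyl)                                # observations per group

print(group_means)
print(group_sd)
print(group_n)
```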
3. Correlation and Covariance
Correlation can be computed with different methods, such as Pearson, Spearman, or Kendall.
1. SAS
Correlation and covariance in SAS for different methods:
proc corr cov data=data_name pearson spearman kendall hoeffding plots=all;
var x y z;
run;
2. R
Correlation and covariance in R for different methods:
cor(mydata, use = "complete.obs", method = "pearson")
cov(mydata, use = "complete.obs", method = "pearson")
Note: Here mydata is a numeric data frame. The method can be changed to "spearman" or "kendall".
Other packages, such as Hmisc, can also be used. rcorr() expects a numeric matrix, so convert the data frame first:
library(Hmisc)
rcorr(as.matrix(mydata), type = "pearson") # type can be pearson or spearman
Correlation between two variables x and y:
cor(x, y)
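To see how these calls relate to each other, the sketch below computes correlation and covariance matrices and a significance test with base R only. The built-in mtcars data set and the three chosen columns are assumptions of this example, not part of the original material.

```r
# Three numeric columns of the built-in mtcars data set
vars <- mtcars[, c("mpg", "wt", "hp")]

r_pearson  <- cor(vars, use = "complete.obs", method = "pearson")
r_spearman <- cor(vars, use = "complete.obs", method = "spearman")
v          <- cov(vars, use = "complete.obs")

# Pearson correlation is the covariance scaled by both standard deviations
r_manual <- v["mpg", "wt"] / (sd(vars$mpg) * sd(vars$wt))

# Correlation between two variables, with a test of H0: correlation = 0
test <- cor.test(vars$mpg, vars$wt, method = "pearson")

print(r_pearson)
print(r_spearman)
print(test)
```

cor.test() from the stats package reports the estimate together with a p-value and confidence interval, which cor() alone does not give.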
4. Analysis of Variance (ANOVA)
1. SAS
PROC ANOVA, PROC GLM, PROC MIXED, PROC GLIMMIX, and PROC HPMIXED can be used for ANOVA:
PROC ANOVA for balanced data with no missing values.
PROC GLM for fixed-effect factors.
PROC MIXED, PROC GLIMMIX, and PROC HPMIXED for models with both random and fixed-effect factors.
proc glm data=data_name; /* or proc anova */
class factorvar;
model responsevar = factorvar;
means factorvar / bon t lsd tukey; /* multiple-comparison methods */
run;
proc mixed data=data_name; /* or proc glimmix / proc hpmixed */
class fixedvar randomvar;
model responsevar = fixedvar; /* fixed effects */
random randomvar; /* random effects */
lsmeans fixedvar / adjust=tukey; /* one adjustment method per statement, e.g. bon or tukey; check that the chosen procedure supports adjust= */
run;
Note: Select one procedure depending on your data set.
2. R
Analysis of variance (ANOVA) can be computed in R using:
x <- aov(responsevar ~ factorvar, data=mydata) # completely randomized design (CRD)
x <- aov(y ~ A + B, data=mydata) # randomized complete block design (RCBD), with B as the blocking factor
x <- aov(y ~ A + B + A:B, data=mydata) # factorial design
summary(x)
Multiple comparisons:
TukeyHSD(x)
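The placeholder formulas above can be tried on data sets that ship with R. The sketch below is one possible illustration: PlantGrowth and warpbreaks are substitutes chosen for this example, not the data used in the original material.

```r
# One-way ANOVA (completely randomized design): plant weight by treatment group
fit_one <- aov(weight ~ group, data = PlantGrowth)
print(summary(fit_one))
tukey_one <- TukeyHSD(fit_one)   # all pairwise group comparisons
print(tukey_one)

# Two-factor factorial design with interaction: yarn breaks by wool and tension
fit_fac <- aov(breaks ~ wool + tension + wool:tension, data = warpbreaks)
print(summary(fit_fac))
print(TukeyHSD(fit_fac, which = "tension"))   # comparisons for one factor only
```

aov() treats a variable as a grouping factor only if it is stored as a factor, so numeric codes should be converted with factor() before fitting.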
5. Regression and Multiple Regression
1. SAS
Simple Regression
Proc reg data=data_name;
model response_var = factor_var1;
run;
Multiple Regression
Proc reg data=data_name;
model response_var = factor_var1 factor_var2 factor_var3 factor_var4 ;
run;
2. R
Multiple Linear Regression
fit <- lm(y ~ x1 + x2 + x3, data=mydata)
summary(fit) # show results
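A worked version of the same call, using the built-in mtcars data set as an assumed example, shows how to pull out the usual results from the fitted model. The predictor values passed to predict() are hypothetical.

```r
# Multiple linear regression: fuel economy explained by weight and horsepower
fit <- lm(mpg ~ wt + hp, data = mtcars)

print(summary(fit))   # coefficients, standard errors, R-squared
print(coef(fit))      # estimated coefficients only
print(confint(fit))   # 95% confidence intervals for the coefficients

# Prediction for a new observation (hypothetical values)
new_cars  <- data.frame(wt = 3.0, hp = 120)
predicted <- predict(fit, newdata = new_cars)
print(predicted)
```

For a simple regression, the same code applies with a single predictor on the right-hand side of the formula.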