R Code for Descriptive Statistics of Phenotypic Data
Avjinder Singh Kaler
Steps
1. Reading data into R. The data file includes a header, and
"NA" marks a missing value.
data_in <- read.table(file="example.dat", header=TRUE, na.strings="NA")
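As a minimal sketch of step 1, the snippet below writes a small made-up table to a temporary file and reads it back. The column names (`id`, `sex`, `FWEC`) and the values are illustrative assumptions, not the contents of `example.dat`.

```r
# Hypothetical stand-in for example.dat: whitespace-separated,
# header row, and the text NA for a missing value
tmp <- tempfile(fileext = ".dat")
writeLines(c("id sex FWEC",
             "1 female 120",
             "2 male NA",
             "3 female 300"), tmp)

# header = TRUE uses the first line as column names;
# na.strings = "NA" turns the text NA into a missing value
demo_in <- read.table(file = tmp, header = TRUE, na.strings = "NA")

str(demo_in)                           # column types as R inferred them
n_missing <- colSums(is.na(demo_in))   # number of missing values per column
print(n_missing)
```

Checking `str()` and the missing-value counts right after import is a cheap way to catch a wrong separator or a different missing-value code before any statistics are computed.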
2. Attaching the data frame so that its columns can be referenced
by name (e.g. FWEC instead of data_in$FWEC).
attach(data_in)
3. Overview of data (mean, median, maximum, minimum, first
and third quartiles, number of missing values).
summary(data_in)
4. Calculating means and standard deviations (2 indicates that
the function is applied to columns, na.rm=TRUE means that
missing values are removed). All columns must be numeric.
apply(data_in,2,mean,na.rm=TRUE)
apply(data_in,2,sd,na.rm=TRUE)
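`apply()` converts the data frame to a matrix first, so a single non-numeric column (such as sex) turns every value into text and the means come back as NA. A minimal sketch of one way around this, assuming a made-up data frame `demo`, is to restrict the calculation to the numeric columns:

```r
# Made-up data: one categorical column and two numeric traits with missing values
demo <- data.frame(sex  = c("F", "M", "F", "M"),
                   FWEC = c(100, NA, 300, 200),
                   wt   = c(30.5, 28.0, NA, 31.5))

num_cols <- sapply(demo, is.numeric)   # TRUE for numeric columns only

# Column-wise mean and standard deviation, ignoring missing values
col_means <- sapply(demo[num_cols], mean, na.rm = TRUE)
col_sds   <- sapply(demo[num_cols], sd,   na.rm = TRUE)

print(col_means)
print(col_sds)
```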
5. Means or standard deviations can be calculated for data
points grouped by a factor (e.g. year).
aggregate(data_in,list(data_in$year),FUN=mean,na.rm=TRUE)
aggregate(data_in,list(data_in$year),FUN=sd,na.rm=TRUE)
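The same grouping can also be written with the formula interface of `aggregate()`, which names the trait and the grouping factor explicitly. This is a sketch on made-up data; `demo`, `year` and `FWEC` are illustrative names only.

```r
# Made-up data: two years, one missing trait value
demo <- data.frame(year = c(2001, 2001, 2002, 2002, 2002),
                   FWEC = c(100, 300, 200, NA, 400))

# Formula interface: response ~ grouping factor.
# Rows with a missing FWEC are dropped by the default na.action (na.omit).
mean_by_year <- aggregate(FWEC ~ year, data = demo, FUN = mean)
sd_by_year   <- aggregate(FWEC ~ year, data = demo, FUN = sd)

print(mean_by_year)
print(sd_by_year)
```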
6. Frequencies for a single variable and across two variables.
table(sex)
table(wormy)
table(wormy,sex)
7. Histogram.
hist(FWEC)
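8. xy scatter plots. With a single variable, the values are plotted
against their observation index; plot(x, y) plots one variable
against another.
plot(FWEC)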
9. Box plot.
boxplot(FWEC ~ sex, data=data_in, range=0)
10. Shapiro–Wilk test to check normality of the data
distribution.
shapiro.test(FWEC)
11. Checking data distribution with QQ plot—if data are
normally distributed, the plotted data and the line are well
aligned.
qqnorm(FWEC)
qqline(FWEC)
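The two normality checks in steps 10 and 11 can be tried out on simulated data. The sketch below assumes a right-skewed trait drawn from a log-normal distribution (an illustrative choice, not the author's data) and compares the raw and the log scale.

```r
set.seed(1)
# Simulated right-skewed trait (log-normal); purely illustrative
x <- rlnorm(200, meanlog = 5, sdlog = 1)

sw_raw <- shapiro.test(x)        # test on the raw scale
sw_log <- shapiro.test(log(x))   # test on the log scale

print(sw_raw$p.value)
print(sw_log$p.value)

# Normal QQ plots side by side; points close to the line suggest normality
op <- par(mfrow = c(1, 2))
qqnorm(x, main = "Raw");                  qqline(x)
qqnorm(log(x), main = "Log-transformed"); qqline(log(x))
par(op)
```

A small p-value from `shapiro.test()` speaks against normality; the W statistic moves towards 1 as the data look more normal.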
12. Data transformation—log, square root, and cube root
transformation.
log_FWEC <- log(FWEC)
sqrt_FWEC <- sqrt(FWEC+1)
cbrt_FWEC <- (FWEC)^(1/3)
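If the trait can take the value zero, as counts often can, the plain logarithm returns -Inf for those records. A minimal sketch, assuming made-up values in `FWEC_demo` and the common (but not universal) convention of adding 1 before taking the log:

```r
# Made-up values including zeros
FWEC_demo <- c(0, 0, 50, 200, 1500)

print(log(FWEC_demo))                # zeros become -Inf

log_demo  <- log(FWEC_demo + 1)      # shifted log, defined at zero
sqrt_demo <- sqrt(FWEC_demo + 1)     # same shift as used for the square root in step 12
cbrt_demo <- FWEC_demo^(1/3)         # cube root is defined at zero without a shift

print(cbind(FWEC_demo, log_demo, sqrt_demo, cbrt_demo))
```

Whether a shift of 1 is appropriate depends on the scale of the trait, so it is worth checking the transformed values with the histogram and QQ plot from the earlier steps.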
13. Box–Cox transformation.
#Code to find a suitable lambda for Y to the power lambda
#load the MASS library, which provides boxcox()
library(MASS)
#seq(min value, max value, step) defines the range
#from which lambda is drawn; the response must be strictly positive
boxcox(FWEC ~ factor(sex) + factor(birth_rearing_type),
lambda = seq(0,1.0,0.01))
#savePlot() works only on some graphics devices (e.g. the Windows device)
savePlot("boxcox","jpeg")
#replace NA with the lambda value at the maximum of the curve in the graph
lambda <- NA
trans_FWEC <- ((FWEC^lambda)-1)/lambda
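Instead of reading lambda off the graph, the value at the maximum of the profile log-likelihood can be extracted from the object that `boxcox()` returns. The sketch below uses simulated data (the names `demo`, `y`, `sex` and the helper `box_cox()` are assumptions for illustration); the response is normal on the log scale, so the estimate should land near 0.

```r
library(MASS)   # provides boxcox()

set.seed(42)
n <- 500
# Simulated data: a positive response that is normal on the log scale
demo <- data.frame(sex = factor(sample(c("F", "M"), n, replace = TRUE)))
demo$y <- exp(3 + 0.5 * (demo$sex == "M") + rnorm(n, sd = 0.8))

# Profile log-likelihood over a grid of lambda values, without plotting
bc <- boxcox(y ~ sex, data = demo, lambda = seq(-1, 1, 0.01), plotit = FALSE)

lambda_hat <- bc$x[which.max(bc$y)]   # lambda at the maximum log-likelihood
print(lambda_hat)

# Box-Cox transform; the limiting case lambda = 0 is log(y)
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}
demo$y_bc <- box_cox(demo$y, lambda_hat)
```

In practice a convenient value close to the estimate (0 for log, 0.5 for square root) is often chosen over the exact maximum, since it is easier to interpret.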
14. Checking homogeneity of variances.
#load the car package, which provides leveneTest()
#(Rcmdr loads car as well, but also starts its graphical interface)
library(car)
#run Levene's test, specifying the vector of data y and group,
#the factor across which the variances are tested (e.g., year)
leveneTest(y,group)
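Levene's test amounts to a one-way ANOVA on the absolute deviations of each observation from its group centre. The sketch below reproduces that idea in base R on simulated data, using group medians as the centre (the Brown–Forsythe variant, which is also the default centre in `car::leveneTest`). The data frame `demo` and its columns are made up for illustration.

```r
set.seed(7)
# Simulated trait measured in two years with clearly different spread
demo <- data.frame(year = factor(rep(c("2001", "2002"), each = 100)),
                   y    = c(rnorm(100, mean = 10, sd = 1),
                            rnorm(100, mean = 10, sd = 4)))

# Absolute deviation of each observation from its group median
group_median <- ave(demo$y, demo$year, FUN = median)
demo$abs_dev <- abs(demo$y - group_median)

# One-way ANOVA on the absolute deviations: a small p-value
# points to unequal variances across the groups
levene_tab <- anova(lm(abs_dev ~ year, data = demo))
print(levene_tab)
levene_p <- levene_tab[["Pr(>F)"]][1]
```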
15. Fitting a linear model and ANOVA.
#need to load the "car" package for Type III ANOVA
library(car)
lmod <- lm(cbrt_FWEC ~ factor(sex))
#Type I ANOVA
anova(lmod)
#Type III ANOVA. Note that the first letter in the
#command below has to be a capital "A" (ensure that
#you loaded the "car" package as shown above)
Anova(lmod, type="III")
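Why the type of ANOVA matters shows up as soon as the data are unbalanced. The sketch below stays in base R and uses simulated factors `a` and `b` (hypothetical names): the Type I (sequential) sum of squares for `a` changes with the order of the terms, while the marginal test from `drop1()` does not. For a model with main effects only, these marginal tests coincide with Type II and Type III tests.

```r
set.seed(3)
# Simulated unbalanced data: levels of b are unevenly spread across levels of a
a <- factor(rep(c("a1", "a2"), times = c(40, 20)))
b <- factor(c(rep(c("b1", "b2"), times = c(30, 10)),
              rep(c("b1", "b2"), times = c(5, 15))))
y <- 10 + 2 * (a == "a2") + 3 * (b == "b2") + rnorm(60)
demo <- data.frame(a, b, y)

fit_ab <- lm(y ~ a + b, data = demo)
fit_ba <- lm(y ~ b + a, data = demo)

# Type I (sequential) sums of squares depend on the order of the terms
ss_a_first <- anova(fit_ab)["a", "Sum Sq"]
ss_a_last  <- anova(fit_ba)["a", "Sum Sq"]

# Marginal test of each term given the other one (order does not matter)
marg <- drop1(fit_ab, test = "F")
ss_a_marginal <- marg["a", "Sum of Sq"]

print(c(first = ss_a_first, last = ss_a_last, marginal = ss_a_marginal))
```

When a model contains interactions, Type III tests from `car::Anova` are only meaningful with sum-to-zero contrasts such as `contr.sum`; that detail is outside this sketch.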
16. Addressing confounding of explanatory variables in a linear
model.
lmod1 <- lm(cbrt_FWEC ~ factor(sex) + factor(birth_type) * factor(rearing_type))
lmod2 <- lm(cbrt_FWEC ~ factor(sex) + factor(birth_rearing_type))
17. Check the difference with an ANOVA.
Anova(lmod1,type="III")
Anova(lmod2,type="III")
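The source does not spell out why the two parameterisations are compared. One plausible reading, assumed here, is that birth type and rearing type are confounded because some combinations never occur (a single-born animal cannot be reared as a twin). The sketch below simulates that situation; all names and values are illustrative.

```r
set.seed(11)
# Simulated animals: the combination birth = single / rearing = twin never occurs
combo <- rep(c("single.single", "twin.single", "twin.twin"), times = c(30, 20, 40))
demo <- data.frame(
  birth_type   = factor(sub("\\..*", "", combo)),   # part before the dot
  rearing_type = factor(sub(".*\\.", "", combo))    # part after the dot
)
demo$y <- 5 + 1.0 * (demo$birth_type == "twin") +
              0.5 * (demo$rearing_type == "twin") + rnorm(nrow(demo))

# The cross-tabulation shows the empty cell
print(table(demo$birth_type, demo$rearing_type))

# Separate factors with interaction: one coefficient cannot be estimated (NA)
lmod1_demo <- lm(y ~ birth_type * rearing_type, data = demo)
print(coef(lmod1_demo))

# Combined factor with one level per observed combination: full rank
demo$birth_rearing_type <- interaction(demo$birth_type, demo$rearing_type,
                                       drop = TRUE)
lmod2_demo <- lm(y ~ birth_rearing_type, data = demo)
print(coef(lmod2_demo))
```

Both models give the same fitted values here; the combined factor simply avoids the non-estimable term, which keeps the ANOVA table easier to read.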
18. Model comparison using logistic regression for binary data.
logres <- glm(formula = wormy_status ~ factor(sex) +
factor(birth_rearing_type), family = binomial(link="logit"))
#producing an analysis-of-deviance table to test
#fixed effects
anova(logres,test="Chisq")
#produces the deviance of the model (the lower, the
#better the fit)
summary(logres)$deviance
#the difference in deviance can be formally tested
#with a likelihood ratio test
#comparing two nested models ("nested" means that one model
#contains all terms of the other plus additional ones)
#lme4 is only needed when the model has random effects (glmer());
#the models here have fixed effects only, so glm() is sufficient
logres1 <- glm(wormy_status ~ factor(sex), family = binomial(link="logit"))
logres2 <- glm(wormy_status ~ factor(sex) + factor(birth_rearing_type),
family = binomial(link="logit"))
anova(logres1,logres2,test="Chisq")
#to assess the model, plot predicted probability
#against observed proportion
#install and load library(languageR); plot.logistic.fit.fnc() is
#documented for models fitted with lme4 or with lrm() from the rms
#package, so the model has to be fitted with one of these first
library(languageR)
plot.logistic.fit.fnc(logres1,data_in)
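The likelihood ratio test for two nested logistic regressions can also be computed by hand, which makes clear what the analysis-of-deviance table reports. This is a sketch on simulated data; the variable names mirror step 18, but the values and effect sizes are made up.

```r
set.seed(5)
n <- 400
# Simulated binary trait with two explanatory factors
demo <- data.frame(
  sex                = factor(sample(c("F", "M"), n, replace = TRUE)),
  birth_rearing_type = factor(sample(c("ss", "ts", "tt"), n, replace = TRUE))
)
eta <- -0.5 + 0.8 * (demo$sex == "M") + 1.0 * (demo$birth_rearing_type == "tt")
demo$wormy_status <- rbinom(n, size = 1, prob = plogis(eta))

# Two nested logistic regressions
m1 <- glm(wormy_status ~ sex, family = binomial(link = "logit"), data = demo)
m2 <- glm(wormy_status ~ sex + birth_rearing_type,
          family = binomial(link = "logit"), data = demo)

# Likelihood ratio test by hand: the drop in deviance is compared with a
# chi-squared distribution on the difference in residual degrees of freedom
lr_stat <- deviance(m1) - deviance(m2)
lr_df   <- df.residual(m1) - df.residual(m2)
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# The same test via anova()
lr_tab <- anova(m1, m2, test = "Chisq")
print(lr_tab)
print(c(statistic = lr_stat, df = lr_df, p = lr_p))
```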
19. Model diagnostics.
#the following produces plots of residuals vs. fitted values,
#a QQ plot, a scale-location plot, and a residuals vs. leverage
#plot of the previously tested model 1 (lmod1)
plot(lmod1)
#assessing a logit model for binary data by plotting
#the predicted probability against observed
#proportions
#load library(languageR)
library(languageR)
plot.logistic.fit.fnc(logres1,data_in)
20. Extracting residuals and writing them to a file—assuming
lmod2 is the model of choice.
res_lmod2 <- residuals(lmod2)
write.table(res_lmod2,file="res_FWEC")
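When the data contain missing values, `lm()` drops those records by default, so the residual vector is shorter than the data frame and no longer lines up with the animal records. A minimal sketch of one way to keep them aligned, assuming made-up data with an `id` column, is to fit with `na.action = na.exclude` and write the residuals together with the identifier:

```r
# Made-up data with one missing response
demo <- data.frame(id  = 1:6,
                   sex = factor(c("F", "M", "F", "M", "F", "M")),
                   y   = c(2.1, 3.4, NA, 3.9, 2.5, 3.1))

# Default na.action (na.omit): the residual vector is shorter than the data
fit_omit <- lm(y ~ sex, data = demo)
# na.exclude: residuals() pads the dropped rows with NA, keeping alignment
fit_excl <- lm(y ~ sex, data = demo, na.action = na.exclude)

demo$res <- residuals(fit_excl)

# Write identifier and residual to a file and read it back as a check
out_file <- tempfile(fileext = ".txt")
write.table(demo[, c("id", "res")], file = out_file,
            row.names = FALSE, quote = FALSE)
res_back <- read.table(out_file, header = TRUE)
print(res_back)
```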

Leveraging Generative AI to Drive Nonprofit Innovation
 
The History of Stoke Newington Street Names
The History of Stoke Newington Street NamesThe History of Stoke Newington Street Names
The History of Stoke Newington Street Names
 
How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
 
Hindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdfHindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdf
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
BÀI TẬP BỔ TRỢ TIẾNG ANH 8 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2023-2024 (CÓ FI...
 
Pengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptxPengantar Penggunaan Flutter - Dart programming language1.pptx
Pengantar Penggunaan Flutter - Dart programming language1.pptx
 
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
 
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
Life upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for studentLife upper-Intermediate B2 Workbook for student
Life upper-Intermediate B2 Workbook for student
 

R code descriptive statistics of phenotypic data by Avjinder Kaler

  • 2. Steps
    1. Reading data into R: the data file includes a header, and "NA" marks a missing value.
       data_in <- read.table(file="example.dat", header=TRUE, na.strings="NA")
    2. Getting access to the data frame (all variables will relate to this data frame).
       attach(data_in)
    3. Overview of data (mean, median, maximum, minimum, first and third quartiles, number of missing values).
       summary(data_in)
    4. Calculating means and standard deviations (2 indicates that the function is applied to all columns; na.rm=TRUE means that missing values are removed).
       apply(data_in, 2, mean, na.rm=TRUE)
       apply(data_in, 2, sd, na.rm=TRUE)
    5. Means or standard deviations can be calculated for data points grouped by a factor (e.g. year).
       aggregate(data_in, list(data_in$year), FUN=mean)
       aggregate(data_in, list(data_in$year), FUN=sd)
    6. Frequencies for a single variable and across two variables.
       table(sex)
       table(wormy)
       table(wormy, sex)
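As a small illustration of steps 4 and 5, the sketch below builds a made-up data frame and computes column means and per-year means while skipping missing values. The column names `year`, `FWEC`, and `weight` and all values are invented for the example and are not taken from the slides' data file. It assumes the formula interface of `aggregate()` with `na.action = na.pass`, so that `na.rm = TRUE` is applied per column instead of dropping whole rows that contain an `NA`.

```r
# Hypothetical toy data; not the slides' example.dat
toy <- data.frame(
  year   = c(2001, 2001, 2002, 2002, 2002),
  FWEC   = c(100, NA, 300, 500, 400),
  weight = c(20, 22, NA, 30, 28)
)

# Column means over the numeric trait columns, ignoring NA
col_means <- sapply(toy[, c("FWEC", "weight")], mean, na.rm = TRUE)

# Per-year means; na.pass keeps rows with NA so na.rm handles them per column
year_means <- aggregate(cbind(FWEC, weight) ~ year, data = toy,
                        FUN = mean, na.rm = TRUE, na.action = na.pass)

print(col_means)
print(year_means)
```

Restricting the calculation to the trait columns avoids averaging identifier or factor columns, which `apply(data_in, 2, mean)` would otherwise attempt.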
  • 3.
    7. Histogram.
       hist(FWEC)
    8. xy scatter plots.
       plot(FWEC)
    9. Box plot.
       boxplot(FWEC ~ sex, data=data_in, range=0)
    10. Shapiro–Wilk test to check normality of the data distribution.
       shapiro.test(FWEC)
    11. Checking the data distribution with a QQ plot: if the data are normally distributed, the plotted data and the line are well aligned.
       qqnorm(FWEC)
       qqline(FWEC)
    12. Data transformation: log, square root, and cube root transformation.
       log_FWEC <- log(FWEC)
       sqrt_FWEC <- sqrt(FWEC+1)
       cbrt_FWEC <- (FWEC)^(1/3)
  • 4.
    13. Box–Cox transformation.
       # Code to find a suitable lambda for Y to the power lambda
       # load the MASS library, which provides boxcox()
       library(MASS)
       # seq(min value, max value, step) defines the range from which lambda is drawn
       boxcox(FWEC ~ factor(sex) + factor(birth_rearing_type), lambda = seq(0, 1.0, 0.01))
       savePlot("boxcox", "jpeg")
       # replace the placeholder with the numeric lambda at the maximum of the plotted curve
       lambda <- "insert maximum lambda value in graph here"
       trans_FWEC <- ((FWEC^lambda) - 1)/lambda
    14. Checking homogeneity of variances.
       # load the Rcmdr library
       library(Rcmdr)
       # run Levene's test, specifying the vector of data y and group, the factor across which the variances are tested (e.g., year)
       leveneTest(y, group)
    15. Fitting a linear model and ANOVA.
       # need to load the "car" package for Type III ANOVA
       library(car)
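The lambda in step 13 can also be read off programmatically instead of from the graph. The following is a minimal sketch under stated assumptions: the data are simulated, the names `toy`, `y`, and `trans_y` are hypothetical, and the search range of -1 to 1 is chosen for the example (the slides search 0 to 1). It relies on `boxcox()` from MASS returning the lambda grid and the profile log-likelihood when `plotit = FALSE`.

```r
library(MASS)  # provides boxcox()

set.seed(1)
# Hypothetical right-skewed positive response and a grouping factor
toy <- data.frame(
  sex = factor(rep(c("F", "M"), each = 50)),
  y   = exp(rnorm(100, mean = 3, sd = 0.5))
)

# plotit = FALSE returns the lambda grid (x) and profile log-likelihood (y)
bc <- boxcox(y ~ sex, data = toy, lambda = seq(-1, 1, 0.01), plotit = FALSE)
lambda <- bc$x[which.max(bc$y)]

# Box-Cox transform; the limit as lambda approaches 0 is log(y)
trans_y <- if (abs(lambda) < 1e-8) log(toy$y) else (toy$y^lambda - 1) / lambda

print(lambda)
```

The special case for lambda near zero avoids dividing by zero; the response must be strictly positive for the transformation to be defined.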
  • 5.
       lmod <- lm(cbrt_FWEC ~ factor(sex))
       # Type I ANOVA
       anova(lmod)
       # Type III ANOVA: note that the first letter in the command below has to be a capital "A" (ensure that you loaded the "car" package as shown above)
       Anova(lmod, type="III")
    16. Addressing confounding of explanatory variables in a linear model.
       lmod1 <- lm(cbrt_FWEC ~ factor(sex) + factor(birth_type)*factor(rearing_type))
       lmod2 <- lm(cbrt_FWEC ~ factor(sex) + factor(birth_rearing_type))
    17. Check the difference with an ANOVA.
       Anova(lmod1, type="III")
       Anova(lmod2, type="III")
    18. Model comparison using logistic regression for binary data.
       logres <- glm(formula = wormy_status ~ factor(sex) + factor(birth_rearing_type), family = binomial(link="logit"))
  • 6.
       # producing an analysis-of-deviance table to test fixed effects
       anova(logres, test="Chisq")
       # produces the deviance of the model (the lower the deviance, the better the fit)
       summary(glm(formula = wormy_status ~ factor(sex) + factor(birth_rearing_type), family = binomial(link="logit")))$deviance
       # the difference in deviance can be formally tested with a log-likelihood ratio test
       # load the lme4 library
       library(lme4)
       # comparing two nested models ("nested" means that one has one more factor than the other)
       # note: these calls follow an older lme4 interface; newer lme4 releases fit binomial models with glmer() and expect a random-effects term in the formula
       logres1 <- lmer(wormy_status ~ factor(sex), family = "binomial", method="Laplace")
       logres2 <- lmer(wormy_status ~ factor(sex) + factor(birth_rearing_type), family = "binomial", method="Laplace")
       anova(logres1, logres2)
       # to assess the model, plot predicted probability against observed proportion
       # load the languageR library
       library(languageR)
       plot.logistic.fit.fnc(logres1, data_in)
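The likelihood ratio test described above can be sketched with base R alone. This example uses `glm()` in place of the slides' `lmer()` calls, because the two models contain only fixed effects; that substitution is an assumption of the sketch, not the author's method. The data are simulated, and the column names mirror the slides only for readability.

```r
set.seed(42)
n <- 200
# Hypothetical binary outcome and two factors (values are simulated)
toy <- data.frame(
  sex = factor(sample(c("F", "M"), n, replace = TRUE)),
  birth_rearing_type = factor(sample(c("single", "twin"), n, replace = TRUE))
)
toy$wormy_status <- rbinom(n, size = 1,
                           prob = ifelse(toy$birth_rearing_type == "twin", 0.6, 0.3))

# Nested fixed-effects logistic models: m2 adds one factor to m1
m1 <- glm(wormy_status ~ sex, family = binomial(link = "logit"), data = toy)
m2 <- glm(wormy_status ~ sex + birth_rearing_type,
          family = binomial(link = "logit"), data = toy)

# Likelihood ratio statistic = drop in deviance, referred to a chi-square
lrt_stat <- deviance(m1) - deviance(m2)
lrt_df   <- df.residual(m1) - df.residual(m2)
p_value  <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)

# anova() reports the same test in an analysis-of-deviance table
lrt_table <- anova(m1, m2, test = "Chisq")
print(lrt_table)
```

A small p-value indicates that the added factor reduces the deviance by more than would be expected by chance.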
  • 7.
    19. Model diagnostics.
       # the following produces plots of residuals vs. fitted values, a QQ plot, and a scale-location plot for the previously tested model 1 (lmod1)
       plot(lmod1)
       # assessing a logit model for binary data by plotting the predicted probability against observed proportions
       # load the languageR library
       library(languageR)
       plot.logistic.fit.fnc(logres1, data_in)
    20. Extracting residuals and writing them to a file, assuming lmod2 is the model of choice.
       res_lmod2 <- residuals(lmod2)
       write.table(res_lmod2, file="res_FWEC")
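The idea behind plotting predicted probability against observed proportion can be shown by hand in base R. This is a rough sketch with simulated data, not a reimplementation of the languageR function: the variable names (`x`, `y`, `fit`, `calib`) and the choice of ten equal-sized groups are assumptions made for the example.

```r
set.seed(7)
n <- 500
x <- rnorm(n)                                      # hypothetical covariate
y <- rbinom(n, size = 1, prob = plogis(-0.5 + x))  # simulated binary outcome
fit <- glm(y ~ x, family = binomial(link = "logit"))

# Predicted probabilities, cut into 10 equal-sized groups
p_hat <- fitted(fit)
bins  <- cut(p_hat, breaks = quantile(p_hat, probs = seq(0, 1, 0.1)),
             include.lowest = TRUE)

# Mean predicted probability and observed proportion of 1s per group
calib <- data.frame(
  predicted = as.vector(tapply(p_hat, bins, mean)),
  observed  = as.vector(tapply(y, bins, mean))
)

print(calib)
```

Plotting `calib$predicted` against `calib$observed` and adding a 1:1 reference line with `abline(0, 1)` gives the visual check: points close to the line suggest that the model's probabilities agree with the observed proportions.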