This document provides an overview of common statistical tests used to analyze data, including the t-test, ANOVA, and ANCOVA. It describes the assumptions, test statistics, and SAS code for each test. The t-test is used to compare two population means or determine if two sets of data are significantly different. ANOVA examines differences among group means and can be one-way or two-way. ANCOVA combines aspects of ANOVA and regression by including categorical and continuous predictors to examine the influence of independent variables on a dependent variable while controlling for a covariate.
Today's overwhelming number of techniques applicable to data analysis makes it difficult to choose the most appropriate approach while accounting for all the relevant variables.
The analysis of variance has been studied from several approaches, the most common of which uses a linear model relating the response to the treatments and blocks. Note that the model is linear in its parameters but may be nonlinear across factor levels. Interpretation is easy when the data are balanced across factors, but a much deeper understanding is needed for unbalanced data.
Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures (such as the "variation" among and between groups) used to analyze the differences among means. ANOVA was developed by the statistician Ronald Fisher. It is based on the law of total variance, in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, and therefore generalizes the t-test beyond two means. Put differently, ANOVA splits the observed aggregate variability in a data set into two parts: systematic factors, which have a statistical influence on the data, and random factors, which do not. Analysts use the ANOVA test to determine the influence that independent variables have on the dependent variable in a regression study.
Sir Ronald Fisher pioneered the development of ANOVA for analyzing the results of agricultural experiments. Today, ANOVA is included in almost every statistical package, which makes it accessible to investigators in all experimental sciences. It is easy to input a data set and run a simple ANOVA, but it is challenging to choose the appropriate ANOVA for different experimental designs, to examine whether the data adhere to the modeling assumptions, and to interpret the results correctly. The purpose of this report, together with the next two articles in the Statistical Primer for Cardiovascular Research series, is to enhance understanding of ANOVA and to promote its successful use in experimental cardiovascular research. My colleagues and I attempt to accomplish those goals through examples and explanation, while keeping within reason the burden of notation, technical jargon, and mathematical equations.
ANOVA
• Analysis of Variance
• A statistical method that analyzes variances to determine whether the means of more than two populations are the same
• Compares the between-sample variation to the within-sample variation
• If the between-sample variation is sufficiently large relative to the within-sample variation, it is likely that the population means are statistically different
• Compares means (group differences) among levels of factors; no assumptions are made regarding how the factors are related
• Residual-related assumptions are the same as with simple regression
• Explanatory variables can be qualitative or quantitative but are categorized for group investigations; these variables are often referred to as factors, with levels (category levels)
ANOVA Assumptions
• Assumes the populations, from which the response values for the groups are drawn, are normally distributed
• Assumes the populations have equal variances
• Can compare the ratio of the largest to the smallest sample standard deviation; a ratio between 0.5 and 2 is typically not considered evidence of a violation of this assumption
• Assumes the response data are independent
• For large sample sizes, or for factor-level sample sizes that are equal, the ANOVA test is robust to violations of the normality and equal-variance assumptions
ANOVA and Variance
Fixed or Random Factors
• A factor is fixed if its levels are chosen before the ANOVA investigation begins
• Differences among groups are investigated only for the specific pre-selected factors and levels
• A factor is random if its levels are chosen randomly from the population before the ANOVA investigation begins
Randomization
• Assigning subjects to treatment groups (or treatments to subjects) at random reduces the chance of selection bias affecting the results
One-Way ANOVA hypotheses statements
Test statistic:
F = Between-Group Variance / Within-Group Variance
Under the null hypothesis, both the between- and within-group variances estimate the variance of the random error, so the ratio is expected to be close to 1.
Null Hypothesis: the means of all the groups are equal.
Alternate Hypothesis: not all of the group means are the same.
One-Way ANOVA Excel Output
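The between-versus-within comparison above can be illustrated with a small pure-Python sketch (the group values are hypothetical toy data, not from the deck): when all group means are equal the between-group variance collapses to zero, and when the means clearly differ it dominates the within-group variance.

```python
# Contrast the F ratio (between-group / within-group variance) for two
# hypothetical scenarios: identical group means vs. clearly shifted means.

def f_ratio(groups):
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total observations
    grand = sum(sum(g) for g in groups) / n
    # Between-group sum of squares (k - 1 degrees of freedom)
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares (n - k degrees of freedom)
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ssb / (k - 1)) / (ssw / (n - k))

base = [1, 2, 3, 4, 5]
same = [base, base, base]                          # all means equal
shifted = [base, [x + 5 for x in base], [x + 10 for x in base]]

print(f_ratio(same))     # 0.0  : no between-group variation at all
print(f_ratio(shifted))  # 50.0 : between-group variation dominates
```

A ratio near 1 is what the null hypothesis predicts; a ratio far above 1, as in the shifted case, is evidence that the population means differ.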
These slides present different types of parametric tests, such as:
T-test,
Parametric Test,
Assumption of Parametric Test,
Paired T Test,
One Sample T Test,
ANOVA,
ANCOVA,
Regression,
Two Way ANOVA,
Repeated Measure ANOVA,
Multiple Regression
3. T - Test
• A t-test is any statistical hypothesis test in which the test statistic follows a Student's t-
distribution under the null hypothesis. It can be used to determine if two sets of data are
significantly different from each other.
• In probability and statistics, Student's t-distribution (or simply the t-distribution) is any
member of a family of continuous probability distributions that arises when estimating
the mean of a normally distributed population in situations where the sample size is
small and population standard deviation is unknown.
• Let X1, X2, …, Xn be independent and identically distributed as N(μ, σ²), i.e. this is a sample of size n from a normally distributed population with expected value μ and unknown variance σ².
4. Assumptions
• Interval or ratio scale of measurement
• X follows a normal distribution with mean μ and variance σ2
• Z and s are independent, where
Z = a standard normal variable
s = the ratio of the sample standard deviation to the population standard deviation
• Generally used when the sample size is < 30.
One sample T test statistic
• The random variable
T = (X̄ − μ) / (S / √n)
where S² = (1 / (n − 1)) Σᵢ₌₁ⁿ (Xᵢ − X̄)²
has a Student's t-distribution with n − 1 degrees of freedom.
5. SAS Code for One sample T test
Example:
title 'One-Sample t Test';
data time;
input time @@;
datalines;
43 90 84 87 116 95 86 99 93 92
121 71 66 98 79 102 60 112 105 98
;
run;
proc ttest data=time h0=80 alpha=0.05;
var time;
run;
6. Output
Here, we see that the p-value (0.0329) < α (0.05), hence we reject H₀.
Hence we conclude that the mean time of the given sample is significantly different from 80.
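As a cross-check on the SAS output above, the same t statistic can be computed by hand in plain Python (stdlib only), using the time data and H0: μ = 80 from the example:

```python
import math

# Data and null value from the one-sample t-test SAS example above.
time = [43, 90, 84, 87, 116, 95, 86, 99, 93, 92,
        121, 71, 66, 98, 79, 102, 60, 112, 105, 98]
mu0 = 80.0

n = len(time)
xbar = sum(time) / n
# Sample variance with n - 1 in the denominator
s2 = sum((x - xbar) ** 2 for x in time) / (n - 1)
t = (xbar - mu0) / math.sqrt(s2 / n)

print(round(t, 2))  # ~2.30 with n - 1 = 19 degrees of freedom
```

A t statistic of about 2.30 on 19 degrees of freedom corresponds to the two-sided p-value of 0.0329 reported by SAS.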
7. Types of Two sample T – Test
• Independent Two Sample T – Test:
The independent samples t-test is used when two separate sets of independent and identically distributed samples are obtained, one from each of the two populations being compared.
• Dependent (Paired) Two Sample T – Test:
A typical example of the repeated measures t-test would be
where subjects are tested prior to a treatment, say for high blood
pressure, and the same subjects are tested again after treatment
with a blood-pressure lowering medication.
8. Assumptions of Independent two sample t-test
Given two groups 1 and 2 this test is only applicable when:
• The two sample sizes (that is, the number n of participants of each group) are equal
• The two samples have the same variance
• Samples have been randomly drawn independent of each other
Testing Assumptions
• Normality: use the Shapiro–Wilk or Kolmogorov–Smirnov test, or assess it graphically with a normal quantile plot. If normality is rejected, use the Wilcoxon (Mann–Whitney) test instead.
• Equality of variances: if the two groups are normally distributed, check equality of variances with an F-test. If the variances of the two groups are not equal, use the Welch test.
• If both assumptions are satisfied, proceed with the two-sample t-test.
10. Independent Two sample T – test in SAS
Example:
data wt;
input group $ wt @@;
datalines;
Ind 85 Ind 70 Ind 64 Ind 87 Ind 76 Ind 95 Ind 67 Ind 87 Ind 93 Ind 82
Aus 90 Aus 95 Aus 103 Aus 107 Aus 95 Aus 112 Aus 98 Aus 92 Aus 98 Aus 115
;
run;
proc ttest data=wt h0=0 alpha=0.05;
class group;
var wt;
run;
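A pooled (equal-variance) two-sample t statistic for the same Ind/Aus weights can be sketched in plain Python (stdlib only) as a cross-check on the SAS run:

```python
import math

# Group weights from the independent two-sample SAS example above.
ind = [85, 70, 64, 87, 76, 95, 67, 87, 93, 82]
aus = [90, 95, 103, 107, 95, 112, 98, 92, 98, 115]

def mean(xs):
    return sum(xs) / len(xs)

def ss(xs):  # sum of squared deviations from the group mean
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(ind), len(aus)
# Pooled variance: assumes the two populations share a common variance.
sp2 = (ss(ind) + ss(aus)) / (n1 + n2 - 2)
t = (mean(ind) - mean(aus)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

print(round(t, 2))  # ~ -4.57 with n1 + n2 - 2 = 18 degrees of freedom
```

If the equal-variance assumption fails, the Welch statistic instead uses s1²/n1 + s2²/n2 in the denominator and adjusts the degrees of freedom.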
12. Paired Two sample T – test in SAS
Example:
data wt;
input Before After @@;
datalines;
120 128 140 132 126 118 124 131 128 125 130 132 130 131 140 141 126 129 118 127
135 137 127 135
;
run;
proc ttest data=wt sides=2 alpha=0.05 h0=0;
title "Paired sample t-test";
paired Before * After;
run;
Output
The TTEST Procedure
Difference: Before - After
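The paired test reduces to a one-sample t test on the within-pair differences; a stdlib-only Python sketch using the Before/After data from the SAS example above:

```python
import math

# Before/After pairs from the paired t-test SAS example above.
before = [120, 140, 126, 124, 128, 130, 130, 140, 126, 118, 135, 127]
after  = [128, 132, 118, 131, 125, 132, 131, 141, 129, 127, 137, 135]

# Paired t-test = one-sample t test on differences vs. H0: mean difference = 0.
d = [b - a for b, a in zip(before, after)]
n = len(d)
dbar = sum(d) / n
s2 = sum((x - dbar) ** 2 for x in d) / (n - 1)
t = dbar / math.sqrt(s2 / n)

print(round(t, 2))  # ~ -1.09 with n - 1 = 11 degrees of freedom
```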
13. ANOVA
ANOVA – Analysis Of Variance
Analysis of variance (ANOVA) is a collection of statistical models used to
analyze the differences among group means.
14. • The mathematical model that describes the relationship between the response and treatment for the one-way ANOVA is given by
Y_ij = μ + α_i + ε_ij,  i = 1, 2, …, k;  j = 1, 2, …, n_i
where Y_ij is the jth observation on the ith treatment,
μ is the common effect for the whole experiment,
α_i is the ith treatment effect, and
ε_ij is the random error.
One – Way ANOVA
Hypothesis of one – way ANOVA
H0: The means of all the groups are equal.
Vs
H1: Not all of the group means are the same.
15. • The normality assumption:
dependent variable should be approximately normally distributed
• The homogeneity of variance assumption:
variance of each group should be approximately equal
• The independence assumption:
observations should be independent of each other
Assumptions of one way-ANOVA
Example
• Suppose we are testing a new drug to see if it helps reduce the time to recover from a fever. We decide to test the drug on three different races (Caucasian, African American, and Hispanic). We randomly select 10 test subjects from each of those races, so all together we have 30 test subjects.
• The response variable is the time in minutes after taking the medicine before the fever is
reduced.
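For a design like the one above (k groups with n subjects each), the one-way ANOVA decomposition can be sketched in plain Python. The group values below are hypothetical toy data, not results from the fever study:

```python
# One-way ANOVA decomposition on hypothetical toy groups.
groups = [
    [1.0, 2.0, 3.0],   # e.g. responses under treatment 1
    [2.0, 3.0, 4.0],   # treatment 2
    [3.0, 4.0, 5.0],   # treatment 3
]

k = len(groups)
n = sum(len(g) for g in groups)
grand = sum(sum(g) for g in groups) / n

means = [sum(g) / len(g) for g in groups]
# Between-group (treatment) and within-group (error) sums of squares
ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
ssw = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))

msb = ssb / (k - 1)   # mean square between, df = k - 1
msw = ssw / (n - k)   # mean square within,  df = n - k
f = msb / msw

print(ssb, ssw, f)  # 6.0 6.0 3.0
```

The F value would then be compared against the F(k − 1, n − k) distribution to decide whether to reject H0.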
18. • Two-way ANOVA is a type of study design with one numerical outcome variable
and two categorical explanatory variables.
• The mathematical model of two-way ANOVA is as follows:
Y_ijk = μ + α_i + β_j + γ_ij + ε_ijk
where μ is the overall mean effect,
α_i is the effect due to the ith level of the first factor,
β_j is the effect due to the jth level of the second factor,
γ_ij is the effect due to the interaction between the ith level of the first factor and the jth level of the second factor, and
ε_ijk is the random error.
Two – Way ANOVA
Assumptions:
• The populations from which the samples were obtained must be normally or approximately
normally distributed.
• The samples must be independent.
• The variances of the populations must be equal.
• The groups must have the same sample size.
19. proc anova data=time;
class gender race;
model time = gender race gender*race;
run;
SAS code for Two-way ANOVA
Output of two-way ANOVA
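Under the balanced-design assumption listed above, the two-way sums of squares can be computed directly from the row, column, and cell means. A stdlib-only Python sketch with hypothetical 2×2 data (two observations per cell):

```python
# Balanced two-way ANOVA decomposition: factor A (2 levels) x factor B (2 levels),
# r = 2 replicate observations per cell (hypothetical toy data).
cells = {
    (0, 0): [1.0, 3.0], (0, 1): [3.0, 5.0],
    (1, 0): [5.0, 7.0], (1, 1): [7.0, 9.0],
}
a_levels, b_levels, r = 2, 2, 2
n = a_levels * b_levels * r
grand = sum(sum(v) for v in cells.values()) / n

cell_mean = {key: sum(v) / r for key, v in cells.items()}
a_mean = [sum(cell_mean[(i, j)] for j in range(b_levels)) / b_levels
          for i in range(a_levels)]
b_mean = [sum(cell_mean[(i, j)] for i in range(a_levels)) / a_levels
          for j in range(b_levels)]

ssa = b_levels * r * sum((m - grand) ** 2 for m in a_mean)   # main effect A
ssb = a_levels * r * sum((m - grand) ** 2 for m in b_mean)   # main effect B
ssab = r * sum((cell_mean[(i, j)] - a_mean[i] - b_mean[j] + grand) ** 2
               for i in range(a_levels) for j in range(b_levels))  # interaction
sse = sum((x - cell_mean[key]) ** 2
          for key, v in cells.items() for x in v)            # within-cell error

f_a = (ssa / (a_levels - 1)) / (sse / (n - a_levels * b_levels))
print(ssa, ssb, ssab, sse, f_a)  # 32.0 8.0 0.0 8.0 16.0
```

This is the same partitioning SAS performs; with real data the three F ratios (A, B, interaction) are each compared against the appropriate F distribution.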
21. ANCOVA
• ANCOVA by definition is a general linear model that includes both ANOVA (categorical)
predictors and Regression (continuous) predictors.
• ANCOVA examines the influence of an independent variable on a dependent variable
while removing the effect of the covariate factor.
• ANCOVA first conducts a regression of the dependent variable on the covariate.
• The residuals (the unexplained variance in the regression model) are then subject to an
ANOVA.
22. ANCOVA in other words…
• Analysis of Covariance (ANCOVA) is a statistical test related to ANOVA
• It tests whether there is a significant difference between groups after controlling for
variance explained by a covariate
• A covariate (CV) is a continuous variable that correlates with the dependent variable (DV)
• This is one way that you can run a statistical test with both categorical and continuous
independent variables
Purposes of ANCOVA
Increase sensitivity of F test
• Removes predictable variance from the error term
• Improves power of the analysis
23. Adjustment of Covariate Effect
Partitioning variance in ANOVA:
Variance = Variance due to Treatment + Within-cell variance (Error; includes any variance due to the covariate)
Partitioning variance in ANCOVA:
Variance = Variance due to Treatment + Variance due to Covariate + Within-cell variance (Error)
24. Relationship between CV and DV
Hypotheses for ANCOVA
H0: the group means are equal after controlling for the covariate
Vs
H1: the group means are not equal after controlling for the covariate
25. • linearity of regression: The regression relationship between the dependent variable and
concomitant variables must be linear.
• homogeneity of error variances: The error is a random variable with zero mean and equal variance across treatment classes and observations
• independence of error terms: The errors are uncorrelated; that is, the error covariance matrix is diagonal.
• normality of error terms: The residual (error terms) should be normally distributed
𝜀𝑖𝑗~𝑁(0, 𝜎2)
• homogeneity of regression slopes: The slopes of the different regression lines should be
equivalent, i.e., regression lines should be parallel among groups.
Assumptions
26. Choosing Covariates
Variables that affect or have the potential to affect the dependent variable
• Demographic information
• Inherent characteristics
• Differences in group characteristics due to sampling
Number of covariates depends on:
• Known relationship or previous research
• No. of independent variables or groups
• Total no. of subjects
Example
Suppose we want to compare the effect of drugs on the weights of a particular group of patients (homogeneous among themselves). We can analyze the data by performing the ANCOVA regarding:
y: the final weight of the patients taking drugs, after a specified period, as the response variable.
x: the initial weight of the patients at the time of starting the experiment as the covariate.
27. Model
So, our model becomes:
y_ij = μ + α_i + β(x_ij − x̄₀₀) + ε_ij
where
• μ is the general mean effect
• α_i is the (fixed) additional effect due to the ith treatment (i = 1, 2, …, p)
• ε_ij is the random error effect (j = 1, 2, …, n_i)
• β is the coefficient of regression of y on x
• x_ij is the value of the covariate corresponding to the response y_ij, and x̄₀₀ is the overall mean of the covariate
• PROC MIXED data =<name of the dataset>;
CLASS variables ;
MODEL dependent = < fixed-effects > <covariates> < / options > ;
LSMEANS fixed-effects < / options > ;
run;
SAS Codes in ANCOVA
Group means are adjusted according to how much effect the covariate actually has. The procedure performs, in effect, a mini regression: it estimates how much variance in the outcome is explained by the covariate and assigns that effect a quantitative value, for example that the covariate increases or decreases the outcome by 5 or 10 points. Each group mean is then shifted by that amount. In other words, the formula models the relationship between the covariate and the outcome, predicts the covariate's numeric contribution, and removes that contribution from the group means, so the reported means are adjusted for the effect the covariate is having. In SPSS output these appear labeled as adjusted group means: the group means have been changed by exactly the amount of effect the covariate has on the outcome.
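That adjustment can be made concrete with a stdlib-only Python sketch (the data are hypothetical toy values): estimate the pooled within-group slope β, then shift each group's outcome mean by β times the distance of its covariate mean from the overall covariate mean.

```python
# ANCOVA-style adjusted group means on hypothetical toy data.
# Each group is a pair of lists: (covariate x values, outcome y values).
groups = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),   # group 1
    ([2.0, 3.0, 4.0], [5.0, 7.0, 9.0]),   # group 2
]

def mean(v):
    return sum(v) / len(v)

# Pooled within-group regression slope of y on x
sxy = sum(sum((xi - mean(x)) * (yi - mean(y)) for xi, yi in zip(x, y))
          for x, y in groups)
sxx = sum(sum((xi - mean(x)) ** 2 for xi in x) for x, _ in groups)
beta = sxy / sxx

grand_x = mean([xi for x, _ in groups for xi in x])

# Adjusted mean: group outcome mean minus the covariate's predicted contribution
adjusted = [mean(y) - beta * (mean(x) - grand_x) for x, y in groups]
print(beta, adjusted)  # 2.0 [5.0, 6.0]
```

Here group 2's raw mean (7.0) is pulled down and group 1's raw mean (4.0) is pulled up, because part of their difference is explained by their different covariate means.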
The effect of the covariate should be the same across treatment groups (homogeneity of regression slopes).
CLASS declares qualitative variables that create indicator variables in the design matrix.
MODEL specifies the dependent variable and the fixed effects (including covariates), setting up X.
LSMEANS computes least-squares means for the classification fixed effects (DIFF computes differences of the least-squares means, ADJUST= applies multiple-comparison adjustments, CL produces confidence limits).