Anova,
Chi-square
One Way & Two Way,
Test Of Association,
Goodness Of Fit
Anova
• Analysis of Variance (ANOVA) is a statistical method used to compare
the means of two or more samples. In real life we will often encounter
more than 2 samples.
• ANOVA can also be referred to simply as a Multiple Sample Test.
Rupak Roy
2 kinds of Anova
• One way
• Two way
One way or two way refers to the number of independent variables
(factors) in the data. One-way ANOVA has one independent variable with
two or more levels; two-way ANOVA has two independent variables, each
with two or more levels.
• One way Anova
Here are the scores over 10 days of athletes trained by different
trainers. We need to determine whether the trainers have any effect on
the scores, i.e. are the differences observed in the sample means
statistically significant?
ANOVA splits the variation into two parts:
1) Within-group variation: the sum of squared differences between each
observation and the mean of the group it belongs to, i.e. Sum of
Squares Within (SSW).
2) Between-group variation: the sum of squared differences between each
group mean and the overall mean, weighted by the group size, i.e. Sum
of Squares Between (SSB).
• ANOVA takes the ratio of the between-group variance to the
within-group variance, each being its sum of squares divided by its
degrees of freedom: F = (SSB / (k - 1)) / (SSW / (N - k)), where k is
the number of groups and N the total number of observations. If the
ratio is close to 1, there is no evidence that the means differ. If the
ratio is much larger than 1, it concludes that at least one mean is
different.
• Alternative way to understand ANOVA's SSB & SSW:
+ve or -ve: how far or near the group average is from the overall mean
(between-group variation, SSB)
+ve or -ve: how far or near the observation is from the group average
(within-group variation, SSW)
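The SSB and SSW calculation can be sketched in a few lines of Python. This is a minimal illustration, not the deck's Excel workflow: the scores and the names `groups` and `one_way_anova` are hypothetical, and only the F ratio is computed (the P-value, which Excel reports, needs the F distribution).

```python
from statistics import mean

# Hypothetical scores for athletes grouped by trainer (not the deck's data).
groups = {
    "trainer_a": [12, 15, 11, 14, 13],
    "trainer_b": [16, 18, 17, 15, 19],
    "trainer_c": [11, 10, 13, 12, 14],
}

def one_way_anova(groups):
    """Return (SSB, SSW, F) for a dict of group name -> list of observations."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Between-group sum of squares: group size times the squared distance
    # of the group mean from the overall mean.
    ssb = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Within-group sum of squares: squared distance of each observation
    # from its own group mean.
    ssw = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

    msb = ssb / (k - 1)    # mean square between
    msw = ssw / (n - k)    # mean square within
    return ssb, ssw, msb / msw

ssb, ssw, f_stat = one_way_anova(groups)
print(f"SSB={ssb:.2f}  SSW={ssw:.2f}  F={f_stat:.2f}")
```

For these made-up scores the F ratio is far above 1, which is the situation in which ANOVA would point to at least one differing group mean.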
• In Excel
go to the Data tab and select Data Analysis, then Anova: Single Factor.
Fill in the input range and alpha, i.e. the level of significance
(0.05 corresponds to a 95% confidence level).
• In the output, check the Between Groups P-value.
The Between Groups P-value is 0.74, which is > 0.05, so we fail to
reject the null hypothesis. Therefore there is no statistically
significant evidence that the trainers have an impact on the scores
made by the athletes.
Assumptions for anova
• Independence: the samples must be independent of each other, i.e. the
data should be drawn at random.
• Normality: the populations must be normally distributed, at least
approximately.
• Equal variance: all the populations should have a common variance. If
not, the resulting P-value will not be reliable.
Two way anova
• We use two-way ANOVA to compare the effects of two factors, each with
multiple levels, on an outcome.
• Or simply, we can say two factors are influencing the outcome.
Two types of two-way ANOVA in Excel:
1. with replication
2. without replication
• Two-way ANOVA with replication is used when each combination of the
two factors has more than one observation (like groups of students from
two colleges, each taking two tests). If each combination has only a
single observation (like one group whose members each take two tests
once), we use two-way ANOVA without replication.
• Example: 2 way anova
In Excel:
go to the Data tab and select Data Analysis, then from the list select
Anova: Two-Factor With Replication.
• Provide the input range and Rows per sample: 3, as there are 3 rows
per sample.
Alpha: 0.05
Two-way ANOVA gives 3 P-values because it tests 3 null hypotheses.
1st null hypothesis: Sample (rows): Cold, Hot, Humid
2nd null hypothesis: Columns: place1, place2, place3
3rd null hypothesis: Interaction: the combination of the 1st factor &
2nd factor.
We need the P-value of the Interaction to conclude whether the
combination of the factors has any effect on the outcome, and here we
have 0.12, which is not statistically significant.
However, the P-value of one of the individual factors (Sample) is
significant: 3.06538E-07
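The three tests (rows, columns, interaction) each come from their own F ratio. The sketch below shows how those ratios are formed for a two-way layout with replication. The dataset is a hypothetical 2 x 2 design with 2 replicates per cell, not the deck's climate table, and `two_way_anova` is an illustrative name. It stops at the F statistics, since the P-values Excel reports require the F distribution.

```python
from statistics import mean

# Hypothetical replicated data: data[row_level][column_level] = observations.
data = {
    "cold": {"place1": [4, 6], "place2": [5, 7]},
    "hot": {"place1": [10, 12], "place2": [9, 11]},
}

def two_way_anova(data):
    """Return the F statistics for rows, columns and their interaction."""
    rows = list(data)
    cols = list(data[rows[0]])
    a, b = len(rows), len(cols)
    r = len(data[rows[0]][cols[0]])  # replicates per cell (assumed equal)

    grand = mean(x for i in rows for j in cols for x in data[i][j])
    row_mean = {i: mean(x for j in cols for x in data[i][j]) for i in rows}
    col_mean = {j: mean(x for i in rows for x in data[i][j]) for j in cols}
    cell_mean = {(i, j): mean(data[i][j]) for i in rows for j in cols}

    ss_rows = b * r * sum((row_mean[i] - grand) ** 2 for i in rows)
    ss_cols = a * r * sum((col_mean[j] - grand) ** 2 for j in cols)
    ss_inter = r * sum(
        (cell_mean[i, j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in rows for j in cols
    )
    ss_within = sum(
        (x - cell_mean[i, j]) ** 2
        for i in rows for j in cols for x in data[i][j]
    )

    ms_within = ss_within / (a * b * (r - 1))
    return {
        "rows": (ss_rows / (a - 1)) / ms_within,
        "columns": (ss_cols / (b - 1)) / ms_within,
        "interaction": (ss_inter / ((a - 1) * (b - 1))) / ms_within,
    }

f_stats = two_way_anova(data)
print(f_stats)
```

In this made-up example only the row factor has a large F ratio, mirroring the pattern above where one individual factor is significant while the interaction is not.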
• Hence, from the Excel output, we conclude that the combinations of
hot, cold or humid climate with place1, place2, place3 have no
statistically significant effect on the population size.
• But with the statistically significant P-value of the 1st null
hypothesis we can also conclude that the climate type (hot, cold &
humid) has a significant effect on the population rise. ANOVA alone
does not tell us the direction of that effect. And for the 2nd null
hypothesis, places have no significant effect on the population rise.
If an ANOVA results in rejecting the null hypothesis, what we can
understand from the rejection is that at least one sample mean is
different.
To determine which group means are different, we use a Post Hoc Test.
Types of Post Hoc Test
LSD Tests
Tukey Tests
Scheffe Tests
Very often we want to know which means are different, and that is what
Post Hoc Tests tell us.
Chi-square
• A statistical method for multiple sample tests.
• 2 common applications of chi-square:
* Test of association
* Goodness of fit
• Remember it is used only for count or categorical data.
Chi-square test of association or Independence
• Let's understand with the help of an example: does age have any
impact on the type of car, or in other words, is there any association
between age and type of car?
Here
Null Hypothesis (Ho): there is no association between age and car
types.
Alternative Hypothesis (Ha): there is an association between age and
car types.
In order to run the chi-square test we need the Expected values, and if
we don't have them, we have to calculate them manually.
Expected Value = (Row Total * Column Total) / N
where N is the total number of observations in the sample.
Expected Table
We can also use cell references during the calculation, like
=(D3*F3)/F7
Refer to the lab video for better understanding.
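The expected-value formula, (Row Total * Column Total) / N, can be applied to a whole table at once. The sketch below does this in Python. The observed counts (two age groups by three car types) and the name `expected_counts` are hypothetical and stand in for the deck's Excel table.

```python
# Hypothetical observed counts: rows = age groups, columns = car types.
observed = [
    [20, 30, 50],
    [30, 30, 40],
]

def expected_counts(observed):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for row in expected:
    print(row)
```

Each printed row is the counterpart of one row of the Expected table built with cell references in Excel.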
• Expected values
Now, in Excel
P-value = CHISQ.TEST(actual_range, expected_range) = 1.01888E-42
CHISQ.TEST returns the P-value of the chi-square test, not the
chi-square statistic. As it is far below 0.05, we reject the null
hypothesis and conclude there is an association between age and types
of car.
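To make the step from actual and expected counts to a P-value concrete, here is a small Python sketch. It assumes a hypothetical 2 x 3 table (not the deck's data) and uses a closed form for the chi-square upper tail that is valid only for an even number of degrees of freedom. The helper names are illustrative. Excel's CHISQ.TEST uses (rows - 1) * (columns - 1) degrees of freedom for a table with more than one row and column.

```python
import math

# Hypothetical observed counts: rows = age groups, columns = car types.
observed = [
    [20, 30, 50],
    [30, 30, 40],
]

def chi_square_upper_tail_even_df(x, df):
    """P(chi-square > x); closed form valid only for positive even df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_square_upper_tail_even_df(chi2, df)
print(f"chi2={chi2:.4f}  df={df}  p={p_value:.4f}")
```

With these made-up counts the P-value is well above 0.05, so the null hypothesis of no association would not be rejected for this table.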
Example showing how to compute chi-square test
Chi-square Test for Goodness of Fit
• It describes whether or not the data follows a particular
distribution.
Example:
Whether or not the sample follows a binomial distribution.
• Take an example where we toss 2 coins at a time, 30 times in total,
and we got 0 heads in 5 tosses, 1 head in 6 tosses and 2 heads in 19
tosses.
• Now we will calculate the expected binomial probability of heads in
the expected table with BINOM.DIST:
Number_s: E4, i.e. 0 for 0 heads.
Trials: 2, i.e. two coins.
Probability_s (success): 2 (trials) / 6 (total trials) = 1/3, used for
0 heads, 1 head and 2 heads. For fair coins the probability of a head
would be 0.5.
Cumulative: FALSE, i.e. point probability.
• Expected Value = binomial probability * Column Total of the actual
counts
Therefore, for 0 heads:
Expected Value = 0.44 * 30 (the total number of times the coins were
tossed)
Then we repeat the same steps for 1 and 2 heads in the Expected table.
Hence, P-value = CHISQ.TEST(actual, expected) = 1.0090E-18
• The chi-square test P-value of 1.0090E-18 is much smaller than 0.05,
so we reject the null hypothesis: this provides strong evidence that
the data does not follow the assumed binomial distribution.
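The coin example can be retraced in Python as a check on the arithmetic. The sketch uses the counts and the success probability of 1/3 from the slides; `binom_pmf` is an illustrative helper playing the role of BINOM.DIST with cumulative = FALSE. It assumes 2 degrees of freedom (3 categories minus 1), for which the chi-square upper tail is exp(-x/2).

```python
import math

observed = [5, 6, 19]   # tosses showing 0, 1 and 2 heads (30 tosses in total)
trials = 2              # two coins per toss
p_success = 1 / 3       # value used on the slide; fair coins would use 0.5

def binom_pmf(k, n, p):
    """Point probability of exactly k successes in n trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

total = sum(observed)
# Expected count = binomial probability * total number of tosses.
expected = [binom_pmf(k, trials, p_success) * total for k in range(trials + 1)]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = math.exp(-chi2 / 2)   # chi-square upper tail for df = 2

print([round(e, 2) for e in expected])
print(f"chi2={chi2:.3f}  p={p_value:.4e}")
```

The resulting P-value is on the order of 1E-18, in line with the CHISQ.TEST figure above; a value this small means rejecting the hypothesis that the counts follow the assumed binomial distribution.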
Practice
• The time taken to assemble a laptop in a repair shop is normally
distributed with a mean of 10 hours and a standard deviation of 1 hour.
What is the probability that the shop assembles a laptop in a given
period of time?
a) in more than 6 hours?
b) in less than 10 hours?
Answers: a) 0.99997 =1-NORM.DIST(6,10,1,TRUE)
b) 0.5 =NORM.DIST(10,10,1,TRUE)
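The same two answers can be checked with Python's standard library, where `NormalDist.cdf` plays the role of NORM.DIST with cumulative = TRUE. The variable names are illustrative.

```python
from statistics import NormalDist

# Assembly time: normal with mean 10 hours and standard deviation 1 hour.
assembly_time = NormalDist(mu=10, sigma=1)

p_more_than_6 = 1 - assembly_time.cdf(6)    # a) P(time > 6 hours)
p_less_than_10 = assembly_time.cdf(10)      # b) P(time < 10 hours)

print(f"a) {p_more_than_6:.5f}")
print(f"b) {p_less_than_10:.1f}")
```

Six hours lies 4 standard deviations below the mean, which is why the probability in a) is so close to 1.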
Poisson distribution
• Trains pass through a busy crossing at an average rate of 150 per
hour.
What is the probability that no train passes in 10 minutes?
In Excel:
=POISSON.DIST(X, mean, cumulative)
where X = 0 (no train passes)
Average rate per minute = 150/60 = 2.5, rounded up to 3 trains
Therefore average rate over 10 min = 3 x 10 = 30 trains (without the
rounding it would be 2.5 x 10 = 25)
Hence,
=POISSON.DIST(0,30,FALSE) = 9.35762E-14
• Find the probability that 5 trains pass in a given 10-minute period.
=POISSON.DIST(X, mean, cumulative)
where X = 5,
average rate per minute = 150/60 = 2.5, rounded up to 3 trains
therefore average rate over 10 min = 3 x 10 = 30 trains
Hence
Probability = POISSON.DIST(5,30,FALSE) = 1.89492E-08
With cumulative set to TRUE, i.e. the probability of at most 5 trains,
the result is 2.25735E-08.
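The Poisson figures can be reproduced directly from the formula P(X = k) = e^(-mean) * mean^k / k!. The sketch below uses the 10-minute mean of 30 from the slides (which rests on rounding 2.5 up to 3). The helper names are illustrative stand-ins for POISSON.DIST with cumulative = FALSE and TRUE.

```python
import math

def poisson_pmf(k, lam):
    """Point probability of exactly k events (cumulative = FALSE)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_cdf(k, lam):
    """Probability of at most k events (cumulative = TRUE)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

mean_per_10_min = 30  # value used on the slides

p_none = poisson_pmf(0, mean_per_10_min)
p_exactly_5 = poisson_pmf(5, mean_per_10_min)
p_at_most_5 = poisson_cdf(5, mean_per_10_min)

print(f"P(X = 0)  = {p_none:.5e}")
print(f"P(X = 5)  = {p_exactly_5:.5e}")
print(f"P(X <= 5) = {p_at_most_5:.5e}")
```

Comparing the last two lines shows the difference between the point probability and the cumulative probability for the same X and mean.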
• The average registration rate for an event is 15%. A mail campaign
promoting an event was sent to 10000 customers and you received 1250
registrations. Did the mail campaign really perform differently than
expected, or is it just randomness?
Variable outcome: registration Yes or No.
Therefore, it is a binomial probability distribution.
Probability of success (based on previous data): 15%
Observed success rate: 1250/10000 = 12.5% = 0.125
=BINOM.DIST(1250,10000,0.125,FALSE) = 0.01 is only the probability of
exactly 1250 registrations when the true rate is 12.5%.
Probability of seeing 1250 or fewer registrations due to randomness
when the true rate is still 15%:
=BINOM.DIST(1250,10000,0.15,TRUE)
which is far below 0.05, because 1250 is about 7 standard deviations
below the expected 1500. Hence the difference is not just randomness:
the campaign's registration rate was lower than the historical 15%, so
it was not better than expected.
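A hedged Python sketch of the campaign check follows. It evaluates two quantities: the point probability of exactly 1250 registrations at a rate of 12.5% (the BINOM.DIST call with cumulative = FALSE), and the probability of 1250 or fewer registrations if the historical 15% rate still held. The second quantity is an assumption about which test fits the question, not something stated on the slide. Log-probabilities are used to avoid overflow with n = 10000, and the helper names are illustrative.

```python
import math

def binom_log_pmf(k, n, p):
    """Natural log of the binomial point probability of k successes."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1 - p)

def binom_cdf(k, n, p):
    """Probability of at most k successes in n trials."""
    return sum(math.exp(binom_log_pmf(i, n, p)) for i in range(k + 1))

n, registrations = 10_000, 1250

# Exactly 1250 registrations when the true rate is 12.5%.
point_prob = math.exp(binom_log_pmf(registrations, n, 0.125))
# 1250 or fewer registrations when the true rate is the historical 15%.
tail_prob = binom_cdf(registrations, n, 0.15)

print(f"point probability at 12.5%: {point_prob:.4f}")
print(f"P(X <= 1250) at 15%: {tail_prob:.3e}")
```

The tail probability is tiny, so a result as low as 1250 registrations is very unlikely to be random variation around a 15% rate.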
• Hope you have enjoyed.
Thank you

Multiple sample test - Anova, Chi-square, Test of association, Goodness of Fit

  • 1. Anova, Chi-square One Way & Two Way, Test Of Association, Goodness Of Fit
  • 2. Anova: Analysis of Variance (ANOVA) is a statistical method used to compare the means of two or more samples. In real life we often encounter more than two samples. ANOVA can also simply be referred to as a multiple sample test. Rupak Roy
  • 3. Two kinds of ANOVA: one-way and two-way. One-way or two-way refers to the number of independent variables (factors) in the data. One-way ANOVA has one independent variable with two or more levels; two-way ANOVA has two independent variables, each of which can have multiple levels.
  • 4. One-way ANOVA: here are 10 days of scores from athletes trained by different trainers. We need to determine whether the trainers have any effect on the scores, i.e. are the differences observed in the sample means statistically significant? ANOVA works with two sources of variation: 1) Within-group variation: the sum of squared differences between each observation and the mean of the group it belongs to, i.e. the Sum of Squares Within (SSW). 2) Between-group variation: the sum of squared differences between each group mean and the overall mean, weighted by group size, i.e. the Sum of Squares Between (SSB).
  • 5. What ANOVA does is take the ratio of the between-group variation to the within-group variation, with each sum of squares first divided by its degrees of freedom (the F ratio). If the ratio is close to 1, it concludes that the means are not different; if the ratio is much larger than 1, it concludes that the means are different. An alternative way to understand SSB and SSW: SSB (between-group variation) measures how far, in the positive or negative direction, each group average is from the overall mean; SSW (within-group variation) measures how far each observation is from its group average.
  • 6. In Excel: go to the Data tab, select Data Analysis and then Anova: Single Factor. Fill in the input range and alpha, i.e. the significance level.
  • 7. In the output, check the Between Groups row. The Between Groups P-value is 0.74, which is > 0.05, so we fail to reject the null hypothesis. Therefore there is no statistically significant evidence that the trainers affect the athletes' scores.
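The SSB/SSW calculation behind one-way ANOVA can be sketched in plain Python. The scores and trainer names below are hypothetical, not the data from the slides, and the sketch stops at the F ratio: turning F into a P-value needs the F distribution, which Excel's Anova: Single Factor tool supplies.

```python
from statistics import mean

# Hypothetical scores for athletes under three trainers (not the slide data).
scores = {
    "trainer_a": [1.0, 2.0, 3.0],
    "trainer_b": [2.0, 3.0, 4.0],
    "trainer_c": [3.0, 4.0, 5.0],
}

def one_way_anova(groups):
    """Return (SSB, SSW, F) for a dict mapping group name -> observations."""
    all_obs = [x for obs in groups.values() for x in obs]
    grand_mean = mean(all_obs)
    # Between-group sum of squares: group means vs. overall mean, weighted by size.
    ssb = sum(len(obs) * (mean(obs) - grand_mean) ** 2 for obs in groups.values())
    # Within-group sum of squares: each observation vs. its own group mean.
    ssw = sum((x - mean(obs)) ** 2 for obs in groups.values() for x in obs)
    df_between = len(groups) - 1
    df_within = len(all_obs) - len(groups)
    f_ratio = (ssb / df_between) / (ssw / df_within)
    return ssb, ssw, f_ratio

ssb, ssw, f_ratio = one_way_anova(scores)
print(f"SSB={ssb:.2f}  SSW={ssw:.2f}  F={f_ratio:.2f}")
```

An F ratio near 1 suggests the group means do not differ; how large F must be to count as significant depends on the two degrees of freedom.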
  • 8. Assumptions for ANOVA: The samples must be independent of each other, i.e. the data should be randomly sampled. Normality: the populations must be normally, or at least approximately normally, distributed. All the populations should have a common variance; if not, the resulting P-value will not be reliable.
  • 9. Two-way ANOVA: We use two-way ANOVA to compare the effects of two factors, each with multiple levels, on the outcome. Or simply, we can say two factors influence the outcome. Excel offers two kinds of two-way ANOVA: 1. with replication 2. without replication.
  • 10. Two-way ANOVA with replication applies when there is more than one observation for each combination of the two factors (like two groups of students from two colleges each taking two tests). If there is only one observation per combination (like a single group taking two tests), we use two-way ANOVA without replication.
  • 11. Example: two-way ANOVA in Excel: go to the Data tab and select Data Analysis, then from the list select Anova: Two-Factor With Replication.
  • 12. Provide the input range, Rows per sample: 3 (as we can see, there are 3 rows per sample) and Alpha: 0.05.
  • 13. Two-way ANOVA gives 3 P-values because it tests 3 null hypotheses. 1st null hypothesis: Sample (cold, hot, humid) has no effect. 2nd null hypothesis: Columns (place1, place2, place3) have no effect. 3rd null hypothesis: there is no Interaction, i.e. no effect of the combination of the 1st and 2nd factors. We check the Interaction P-value to conclude whether the combination of factors has any effect on the outcome; here it is 0.12, which is not statistically significant. However, the P-value of one of the individual factors (Sample) is significant: 3.06538E-07.
  • 14. Hence we conclude that the combinations of hot, cold or humid climate with place1, place2, place3 have no significant effect on the population size. But from the statistically significant P-value for the 1st null hypothesis we can also conclude that the climate type (hot, cold or humid) has a significant effect on the population size, and for the 2nd null hypothesis that place has no significant effect. If an ANOVA rejects the null hypothesis, all we learn is that at least one sample mean is different. To determine which group means differ, we use a post hoc test.
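The three tests of a two-way ANOVA with replication can be sketched as three F ratios sharing one within-cell error term. The counts below are hypothetical (a 2 x 2 layout with 2 replicates rather than the 3 x 3 layout on the slides), the design is assumed to be balanced, and the function name `two_way_anova` is illustrative only. P-values are omitted because they need the F distribution.

```python
from statistics import mean

# Hypothetical population counts: 2 climates x 2 places, 2 replicates per cell.
data = {
    ("hot", "place1"): [1.0, 3.0],
    ("hot", "place2"): [2.0, 4.0],
    ("cold", "place1"): [5.0, 7.0],
    ("cold", "place2"): [6.0, 8.0],
}

def two_way_anova(cells):
    """F ratios for a balanced two-factor design with replication."""
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    n = len(next(iter(cells.values())))  # replicates per cell (assumed equal)
    grand = mean(x for obs in cells.values() for x in obs)
    row_mean = {r: mean(x for c in cols for x in cells[(r, c)]) for r in rows}
    col_mean = {c: mean(x for r in rows for x in cells[(r, c)]) for c in cols}
    cell_mean = {key: mean(obs) for key, obs in cells.items()}

    ss_rows = n * len(cols) * sum((row_mean[r] - grand) ** 2 for r in rows)
    ss_cols = n * len(rows) * sum((col_mean[c] - grand) ** 2 for c in cols)
    ss_inter = n * sum(
        (cell_mean[(r, c)] - row_mean[r] - col_mean[c] + grand) ** 2
        for r in rows for c in cols
    )
    ss_within = sum(
        (x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs
    )

    # Mean square within cells is the common denominator of all three F ratios.
    ms_within = ss_within / (len(rows) * len(cols) * (n - 1))
    return {
        "sample": (ss_rows / (len(rows) - 1)) / ms_within,
        "columns": (ss_cols / (len(cols) - 1)) / ms_within,
        "interaction": (ss_inter / ((len(rows) - 1) * (len(cols) - 1))) / ms_within,
    }

f_ratios = two_way_anova(data)
print(f_ratios)
```

In this made-up data the climate rows differ strongly, the places barely differ, and the interaction term is zero because the place effect is the same in both climates.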
  • 15. Types of post hoc test: LSD test, Tukey test, Scheffé test. We use them because very often we want to know which group means are different.
  • 16. Chi-square: a statistical method for multiple sample tests. Two common applications of chi-square: test of association and goodness of fit. Remember, it is used only for count or categorical data.
  • 17. Chi-square test of association or independence: Let's understand it with the help of an example that asks whether age has any impact on the type of car chosen, or in other words whether there is any association between age and type of car.
  • 18. Here the null hypothesis (H0): there is no association between age and car type. Alternative hypothesis (Ha): there is an association between age and car type. To run the chi-square test we need the expected values; if we don't have them, we have to calculate them manually. Expected value = (Row Total * Column Total)/N, where N is the total number of observations in the sample.
  • 19. Expected table: we can also use cell references during the calculation, like =(D3*F3)/F7. Refer to the lab video for better understanding.
  • 20. With the expected values in place, in Excel =CHISQ.TEST(actual_range, expected_range) returns the P-value of the chi-square test, here 1.01888E-42. So we reject the null hypothesis and conclude that there is an association between age and type of car.
  • 21. Example showing how to compute chi-square test
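The expected-value formula and the chi-square statistic can be sketched as follows. The age-by-car-type counts are hypothetical (the slide's table is not in the transcript). The P-value line uses the closed form exp(-chi2/2), which holds only for 2 degrees of freedom, as in this 2 x 3 table; other table sizes need a general chi-square distribution function.

```python
import math

# Hypothetical counts: rows = age groups, columns = three car types.
observed = [
    [30, 20, 10],  # younger buyers
    [10, 20, 30],  # older buyers
]

def expected_counts(table):
    """Expected value per cell = (row total * column total) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_statistic(table, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

expected = expected_counts(observed)
chi2 = chi_square_statistic(observed, expected)
# df = (2 - 1) * (3 - 1) = 2; for df = 2 only, P(X > chi2) = exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)
print(expected, chi2, p_value)
```

A P-value below 0.05 here would lead to rejecting the null hypothesis of no association, in the same way as the CHISQ.TEST result on the slides.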
  • 22. Chi-square test for goodness of fit: It tests whether or not the data follow a particular distribution. Example: whether or not the sample follows a binomial distribution.
  • 23. Take an example where we toss 2 coins at a time, 30 times in total, and get 0 heads in 5 tosses, 1 head in 6 tosses and 2 heads in 19 tosses. Now we calculate the expected binomial probability of each number of heads in the expected table with BINOM.DIST. Number_s: E4, i.e. 0 for 0 heads. Trials: 2, i.e. two coins. Probability_s (probability of success): 2/6 = 1/3, used for 0 heads, 1 head and 2 heads. Cumulative: FALSE, i.e. point probability.
  • 24. Expected value = binomial probability * column total of the actual counts. Therefore, for 0 heads, expected value = 0.44 * 30 (the total number of times the coins were actually flipped); we then repeat the same step for 1 and 2 heads in the expected table. Hence, =CHISQ.TEST(actual_range, expected_range) = 1.0090E-18.
  • 25. The chi-square test P-value of 1.0090E-18 is far smaller than 0.05, so we reject the null hypothesis: this is strong evidence that the observed counts do not follow the assumed binomial distribution.
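The goodness-of-fit calculation can be retraced with the counts from the slides (5, 6 and 19 out of 30) and the success probability of 1/3 used there. This is a sketch that assumes 2 degrees of freedom (three categories minus one), for which the chi-square P-value has the closed form exp(-chi2/2); the helper name `binom_pmf` is illustrative.

```python
import math

observed = [5, 6, 19]   # times 0, 1 and 2 heads were seen (from the slides)
n_coins = 2             # coins per toss
p_success = 1 / 3       # success probability used on the slides

def binom_pmf(k, n, p):
    """Point probability of k successes in n trials (BINOM.DIST with FALSE)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

total = sum(observed)
# Expected count = binomial probability * total number of tosses.
expected = [binom_pmf(k, n_coins, p_success) * total for k in range(n_coins + 1)]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = math.exp(-chi2 / 2)  # valid for df = 2 only
print(expected, chi2, p_value)
```

Under these assumptions the statistic comes out at 82.875 and the P-value at about 1.009E-18. Most of the statistic comes from the 2-heads category, where 19 tosses were observed against roughly 3.3 expected.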
  • 26. Practice: The time taken to assemble a laptop in a repair shop is normally distributed with a mean of 10 hours and a standard deviation of 1 hour. What is the probability that the shop assembles a laptop a) in more than 6 hours? b) in less than 10 hours? Answers: a) approximately 0.99997, =1-NORM.DIST(6,10,1,TRUE) b) 0.5, =NORM.DIST(10,10,1,TRUE)
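Both practice answers can be checked with Python's standard library, where `NormalDist.cdf` plays the role of NORM.DIST with the cumulative flag set to TRUE. The variable names are illustrative.

```python
from statistics import NormalDist

# Assembly time: mean 10 hours, standard deviation 1 hour.
assembly_time = NormalDist(mu=10, sigma=1)

p_more_than_6 = 1 - assembly_time.cdf(6)   # =1-NORM.DIST(6,10,1,TRUE)
p_less_than_10 = assembly_time.cdf(10)     # =NORM.DIST(10,10,1,TRUE)

print(round(p_more_than_6, 5), p_less_than_10)
```

Six hours is four standard deviations below the mean, so almost every assembly takes longer than that.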
  • 27. Poisson distribution: Trains pass through a busy crossing at an average rate of 150 per hour. What is the probability that no train passes in 10 minutes? In Excel: =POISSON.DIST(x, mean, cumulative), where x = 0 (no train passes). Average rate per minute = 150/60 = 2.5 trains, therefore the average over 10 minutes = 2.5 x 10 = 25 trains. Hence =POISSON.DIST(0,25,FALSE) ≈ 1.39E-11. (Rounding the rate up to 3 per minute gives a mean of 30 and =POISSON.DIST(0,30,FALSE) = 9.35762E-14.)
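  • 28. Find the probability that exactly 5 trains pass in a given 10-minute period. =POISSON.DIST(x, mean, cumulative), where x = 5, the average rate per minute = 150/60 = 2.5 trains and therefore the average over 10 minutes = 2.5 x 10 = 25 trains. Hence P = POISSON.DIST(5,25,FALSE) ≈ 1.13E-06. (With a rounded mean of 30, =POISSON.DIST(5,30,FALSE) ≈ 1.89E-08; the value 2.25735E-08 is the cumulative result =POISSON.DIST(5,30,TRUE), i.e. the probability of at most 5 trains.)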
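The Poisson figures can be reproduced with a few lines of Python. The sketch evaluates the unrounded mean of 25 (2.5 per minute over 10 minutes) and, for comparison, the mean of 30 that results when the per-minute rate is rounded up to 3. The function names are illustrative stand-ins for POISSON.DIST with cumulative set to FALSE and TRUE.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k), like POISSON.DIST(k, mean, FALSE)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def poisson_cdf(k, mean):
    """P(X <= k), like POISSON.DIST(k, mean, TRUE)."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

rate_per_minute = 150 / 60          # 2.5 per minute
mean_10_min = rate_per_minute * 10  # 25 in ten minutes, no rounding

print(poisson_pmf(0, mean_10_min))  # no train in 10 minutes
print(poisson_pmf(5, mean_10_min))  # exactly 5 trains in 10 minutes
print(poisson_pmf(0, 30), poisson_pmf(5, 30), poisson_cdf(5, 30))  # rounded mean
```

Comparing the last line's second and third values shows how much the FALSE and TRUE settings of the cumulative flag differ for the same inputs.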
  • 29. The average registration rate for an event is 15%. A mail campaign promoting an event was sent to 10,000 customers and received 1,250 registrations. Is the campaign result really different from what was expected, or is it just randomness? Variable outcome: registration, yes or no; therefore it follows a binomial distribution. Probability of success (based on previous data): 15%. Observed success rate: 1250/10,000 = 12.5% = 0.125. =BINOM.DIST(1250,10000,0.125,FALSE) = 0.01 is only the probability of exactly 1,250 registrations at the observed rate. To check for randomness, compute the probability of 1,250 or fewer registrations at the historical rate: =BINOM.DIST(1250,10000,0.15,TRUE), which is very close to 0. Hence the result is unlikely to be due to randomness; since 12.5% is below 15%, the campaign performed worse than expected rather than better.
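Two quantities are involved in the campaign example: the probability of exactly 1,250 registrations at the observed 12.5% rate, and the probability of 1,250 or fewer at the historical 15% rate, which is arguably the more direct check against randomness. The sketch below computes both in log space, because the raw powers underflow for 10,000 trials. The helper names are illustrative.

```python
import math

def binom_log_pmf(k, n, p):
    """Log of P(X = k) for a binomial(n, p), computed via log-gamma."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )

def binom_pmf(k, n, p):
    """Like BINOM.DIST(k, n, p, FALSE)."""
    return math.exp(binom_log_pmf(k, n, p))

def binom_cdf(k, n, p):
    """Like BINOM.DIST(k, n, p, TRUE): P(X <= k)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

mailed, registered = 10_000, 1_250

p_exact_at_observed = binom_pmf(registered, mailed, 0.125)
p_tail_at_historical = binom_cdf(registered, mailed, 0.15)

print(p_exact_at_observed)   # probability of exactly 1250 at a 12.5% rate
print(p_tail_at_historical)  # probability of 1250 or fewer at a 15% rate
```

The first value is small simply because any single exact count out of 10,000 is unlikely; the second is the one that measures how surprising the campaign result is under the historical rate.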
  • 30. Hope you have enjoyed. Thank you.