Chapter 14: Inference for Distributions of Categorical Variables: Chi-Square Procedures
14.1 Test for Goodness of Fit
The problem
Suppose we open a bag of M&M’s and count the number of M&M’s of each color. How would we know if our color counts are at normal levels? How would we know if our color counts were abnormal?
Chi-Square Distribution
When we want to test the proportions behind several counts at once (such as a two-way table or a list of category counts), we need to use a new distribution: the Chi-Square Distribution (Chi = χ = “KAI”).
As you might suspect, this is another PHANTOMS procedure (the last of the year).
The χ² distribution is found in Table D and in the [2nd] -> [VARS] (DIST) menu on your calculator.
The 2 distribution Like the t-distribution, the 2 distribution is variable.  i.e. the distribution also has degrees of freedom. It is single peaked, right skewed. As the df increases, the peak decreases in height, moves to the right and becomes more symmetric/Normal. As df increases, the 2 statistic  needed for statistically significant results also increases
The 2 distribution
Chi-Square Goodness of Fit
When we want to check whether an observed distribution fits a hypothesized distribution, we use the “χ² goodness of fit test.”
This procedure is frequently used to see whether a distribution departs from equal proportions.
This will not be much different from what we have already been doing for the last 3 chapters.
χ² GOF Test
Parameter
Unlike previous tests, you will not need to state a μ or a p. You need to state where the distribution comes from.
Example: “We are investigating the proportions of each color in all 15 oz. bags of chocolate M&M’s.”
χ² GOF Test
Hypotheses
There are two styles for stating the hypotheses.
Style 1: refer to a written table, or state that all proportions are “equal.”
H0: the proportions of M&M’s colors are the same as in the table provided
Ha: at least one color proportion is different from the table
H0: the proportions of accidents on each day are equal
Ha: at least one day has a proportion that is not equal
χ² GOF Test
Hypotheses (cont.)
Style 2: write out the expected proportions.
H0: p_red = p_blue = p_yel = p_brn = p_grn = p_org = 1/6
Ha: at least one proportion is different from that stated above.
χ² GOF Test
Hypotheses (cont.)
Notice that the alternative hypothesis in each case is that at least one proportion is different from what was hypothesized.
χ² GOF Test
Assumptions
1. All expected cell counts are greater than 1.
2. No more than 20% of the expected cell counts are less than 5.
(That’s a whole lot easier, yeah?)
Name of the Test
“χ² Goodness of Fit Test”
χ² GOF Test
Test Statistic
Observed Count (O) is the count that we observed in each cell. The sum of the observed counts is n.
Expected Count (E) is the hypothesized proportion for each cell times the sample size n.
χ² GOF Test
Test Statistic (cont.)
If we opened up a bag of M&M’s and found the following counts:
       Red    Blue   Brown  Yellow  Green  Orange
O:     5      3      10     6       4      3        n = 31
E:     5.17   5.17   5.17   5.17    5.17   5.17
Note: the expected counts are all equal to 31/6, because we are testing whether M&M’s come in equal proportions.
χ² GOF Test
Test Statistic (cont.)
The test statistic is χ² (“kai squared”):
χ² = Σ (O − E)² / E, summed over all cells
Degrees of freedom (df) = number of classes − 1
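The test statistic can also be worked outside the calculator. The following is a minimal Python sketch, using only the observed counts from the M&M’s slide and the equal-proportions null hypothesis; the function name gof_statistic is an illustrative choice, not something from the course materials.

```python
# Observed color counts from the slide's bag of M&M's (n = 31).
observed = {"red": 5, "blue": 3, "brown": 10, "yellow": 6, "green": 4, "orange": 3}


def gof_statistic(observed_counts, hypothesized_props):
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    n = sum(observed_counts)
    # Expected count = hypothesized proportion x sample size.
    expected = [n * p for p in hypothesized_props]
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))
    df = len(observed_counts) - 1  # number of classes - 1
    return chi_sq, df, expected


counts = list(observed.values())
equal_props = [1 / len(counts)] * len(counts)  # H0: every color has proportion 1/6
chi_sq, df, expected = gof_statistic(counts, equal_props)

print(f"expected count per color: {expected[0]:.2f}")
print(f"chi-square = {chi_sq:.3f}, df = {df}")
```

With the unrounded expected count 31/6, the statistic comes out a few thousandths away from the 6.739 used on the P Value slide; the gap is presumably a rounding difference in the expected counts.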
χ² GOF Test
P Value
p-value = P(χ²(df) > test statistic)
On the calculator: [2nd] -> [VARS] (DIST) -> χ²cdf
Usage: χ²cdf(lower, upper, df)
p-value = P(χ²(5) > 6.739) = 0.2409
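For readers without the calculator at hand, the upper-tail area can be approximated in plain Python. This is a sketch under stated assumptions: the helper name chi2_sf and the choice of a power series for the regularized incomplete gamma function are illustrative, and the series is only intended for moderate values of the statistic like those in this chapter.

```python
import math


def chi2_sf(x, df, terms=200):
    """Approximate P(chi-square(df) > x).

    Uses the power series for the regularized lower incomplete gamma
    function P(a, z) with a = df / 2 and z = x / 2, then returns 1 - P.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a  # k = 0 term of sum z^k / (a (a+1) ... (a+k))
    total = term
    for k in range(1, terms):
        term *= z / (a + k)
        total += term
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - lower


# Statistic and degrees of freedom quoted on the slide.
p_value = chi2_sf(6.739, 5)
print(f"p-value = {p_value:.4f}")
```

The result is about 0.241, in line with the calculator value quoted on the slide.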
χ² GOF Test
Decision
As with the other tests, reject the null hypothesis when the p-value is below the chosen significance level (alpha).
Summary
Use the same 3-part summary:
1) Interpret the p-value with respect to the sampling distribution
2) Make a decision with reference to an alpha level
3) Summarize the results in the context of the problem
χ² GOF Test
Summary (cont.)
“If the colors really came in equal proportions, approximately 24% of all random samples of 31 would show counts at least as far from the expected counts as ours.”
“Because this p-value is greater than any commonly used alpha level, we fail to reject the null hypothesis.”
“We do not have sufficient evidence to conclude that the colors of M&M’s are not equally distributed.”
Calculator methods
TI-83/84
Begin by storing the observed counts in L1.
Store the expected counts in L2.
From the Home Screen, evaluate: sum((L1 - L2)²/L2)
This is the value of χ².
Use χ²cdf from the DIST menu to find the p-value: χ²cdf(lower, upper, df)
14.2 Inference For Two-Way Tables
Comparing two groups
Note that the information is presented in a two-way table with marginal distributions.
Is there a relationship between these two categorical variables?
Expected cell count for 2-way tables
Column percentage: the percent of all observations that are in the column.
Expected count: the count a cell would have if its row “obeyed” the column percentages.
Even for a small table, these calculations get cumbersome.
Expected Counts
Expected = (Row total × Column total) / Table total
For the cell with observed count 30 (row total 99, column total 84, table total 243):
Expected = (99 × 84) / 243 = 34.22
Let’s start with the PHANTOMS procedure.
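The shortcut formula and the “obey the column percentages” idea give the same number. Here is a small Python sketch using the totals quoted on the slide (99, 84, and 243); the function name expected_count is an illustrative choice.

```python
def expected_count(row_total, col_total, table_total):
    """Expected cell count: row total x column total / table total."""
    return row_total * col_total / table_total


row_total, col_total, table_total = 99, 84, 243  # totals quoted on the slide

# The long way: find the column's share of the whole table,
# then apply that share to the row total.
column_share = col_total / table_total
long_way = row_total * column_share

short_way = expected_count(row_total, col_total, table_total)

print(f"column share   = {column_share:.4f}")
print(f"expected count = {short_way:.2f} (long way: {long_way:.2f})")
```

Both routes give 34.22 for this cell, so the shortcut is simply the column-percentage reasoning collapsed into one step.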
χ² Test for Homogeneity
Parameter
State where each proportion comes from and what each count represents.
“We are investigating the proportions of customers in the store who purchase French, Italian, or other wine while listening to French, Italian, or other music.”
χ² Test for Homogeneity
Hypotheses
The null hypothesis is always “the distributions of (group A) are the same in all populations of (group B).”
The alternative hypothesis is always “the distributions of (group A) are not all the same.”
“H0: the distributions of wine types are the same in all populations of music types
Ha: the distributions of wine types are not all the same”
χ² Test for Homogeneity
Assumptions
(1) No more than 20% of the expected cell counts are less than 5
(2) All expected cell counts are > 1
(3) In a 2 x 2 table, all expected counts are greater than 5
For this table: “All expected cell counts are greater than 5”
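The three conditions are mechanical enough to check in code. Below is a minimal Python sketch; the function name check_expected_counts and the second table of expected counts are hypothetical and only there to show a failing case, while the first table uses the 31/6 expected counts from the M&M’s example.

```python
def check_expected_counts(expected):
    """Check the three conditions listed above on a table of expected counts.

    `expected` is a list of rows; a goodness-of-fit test can pass one row.
    """
    cells = [e for row in expected for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    is_2x2 = len(expected) == 2 and all(len(row) == 2 for row in expected)
    return {
        "at_most_20pct_below_5": share_below_5 <= 0.20,
        "all_above_1": all(e > 1 for e in cells),
        # Only applies to 2 x 2 tables; reported as True otherwise.
        "two_by_two_all_above_5": all(e > 5 for e in cells) if is_2x2 else True,
    }


# Expected counts from the M&M's goodness-of-fit example (31/6 per color).
gof_check = check_expected_counts([[31 / 6] * 6])

# Hypothetical expected counts, made up to show a table that fails.
bad_check = check_expected_counts([[12.0, 8.0, 0.9], [15.0, 10.0, 4.1]])

print(gof_check)
print(bad_check)
```

Note that the check runs on expected counts, not observed counts, so it can only be done after the expected table has been computed.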
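χ² Test for Homogeneity
Test Statistic
χ² = Σ (O − E)² / E, summed over every cell of the two-way table
Degrees of freedom (df) = (number of rows − 1) × (number of columns − 1)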
χ² Test for Homogeneity
P Value
Decision
Reject the null hypothesis.
χ² Test for Homogeneity
Summary
If the distributions of wine types were the same for all music types, a random sample of 243 would produce a table at least this far from the expected counts only about 0.1% of the time.
Because the p-value is less than an α of 0.05, we will reject the null hypothesis.
We have sufficient evidence at the 5% significance level to conclude that the distribution of wine types purchased is not the same for all music types.
Calculator Methods
Methods on the TI-84
Before you begin the test, you must enter the “observed counts” into MATRIX [A]:
[2ND] -> [x⁻¹] (MATRIX) -> “EDIT” -> [1]
Input the correct matrix size and cell counts. (Use [ENTER] or the cursor keys to switch between fields.)
IMPORTANT: after inputting the observed matrix, quit and go to the home screen.
[STAT] -> “TESTS” -> “χ²-Test”
Ensure that “Observed” is set to [A] and “Expected” is set to [B], then select “Calculate.”
The expected cell counts will be calculated and stored in Matrix [B]. (Go back to the Matrix menu to see the expected counts.)
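What the calculator does with matrix [A] can be sketched in a few lines of Python. The observed table below is hypothetical (it is not the wine and music data, which this transcript does not reproduce), and chi_square_two_way is an illustrative name. The p-value line uses the closed form exp(−χ²/2), which holds only when df = 2, as it does for this 2 x 3 table.

```python
import math


def chi_square_two_way(observed):
    """Return (chi-square, df, expected table) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected = row total x column total / table total, cell by cell.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df, expected


# Hypothetical 2 x 3 table of observed counts (plays the role of matrix [A]).
observed = [[20, 30, 50],
            [30, 30, 40]]

chi_sq, df, expected = chi_square_two_way(observed)

# Special case: for df = 2 the upper-tail area is exactly exp(-chi_sq / 2).
p_value = math.exp(-chi_sq / 2) if df == 2 else None

print("expected counts (matrix [B]):", expected)
print(f"chi-square = {chi_sq:.3f}, df = {df}, p-value = {p_value:.4f}")
```

The expected list here corresponds to what the calculator stores in matrix [B] after the test runs.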
χ² Tests
Occasionally, you will be asked to find the cell that “contributed the most to the χ² statistic.”
When this is asked, you must calculate the χ² statistic by hand and find the largest value of (O − E)² / E.
This is usually the cell that differs the most from its expected count. However, because each squared difference is divided by the expected count, the contribution is a relative measure, so the largest contributor is not always the cell with the largest raw difference.
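As an illustration, the cell-by-cell contributions can be listed with a short Python sketch. It reuses the M&M’s counts from the goodness-of-fit example, since those are the only complete counts given in these slides; the variable names are illustrative.

```python
colors = ["red", "blue", "brown", "yellow", "green", "orange"]
observed = [5, 3, 10, 6, 4, 3]

n = sum(observed)
expected = [n / len(observed)] * len(observed)  # equal proportions: 31/6 each

# Contribution of each cell to the chi-square statistic: (O - E)^2 / E.
contributions = {
    color: (o - e) ** 2 / e
    for color, o, e in zip(colors, observed, expected)
}
largest = max(contributions, key=contributions.get)

for color, value in contributions.items():
    print(f"{color:>7}: {value:.3f}")
print("largest contribution:", largest)
```

Here the brown cell contributes the most, and the contributions add up to the χ² statistic itself.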
χ² Test for Independence
A similar test for two-way tables is the “χ² Test for Independence,” sometimes called the “χ² Test for Association.”
This test asks the question, “Are the two variables related to each other?”
When there is no association, the observed two-way table is close to the expected table.
χ² Test for Independence
This test really only differs from the test for homogeneity in the hypotheses and the conclusion.
Hypotheses
The null hypothesis is “there is no association between (group 1) and (group 2).”
The alternative hypothesis is “there is an association between (group 1) and (group 2).”
χ² Test for Independence
Conclusion
Phrase your conclusions like the ones we have been constructing.
When failing to reject H0: after interpreting the p-value and comparing it to alpha, state that there is “not sufficient evidence to conclude that an association exists between (group 1) and (group 2).”
Likewise, when rejecting H0, state that “there is sufficient evidence to conclude that an association exists between (group 1) and (group 2).”
Assignment 14.2 Page 877 #29, 31, 32, 33

More Related Content

What's hot (20)

Inferences about Two Proportions
 Inferences about Two Proportions Inferences about Two Proportions
Inferences about Two Proportions
 
Chapter12
Chapter12Chapter12
Chapter12
 
Two Variances or Standard Deviations
Two Variances or Standard DeviationsTwo Variances or Standard Deviations
Two Variances or Standard Deviations
 
Chapter 15
Chapter 15 Chapter 15
Chapter 15
 
mining
miningmining
mining
 
Chapter9
Chapter9Chapter9
Chapter9
 
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anova
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anovaSolution to the practice test ch 10 correlation reg ch 11 gof ch12 anova
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anova
 
Probability Distribution
Probability DistributionProbability Distribution
Probability Distribution
 
13 fractions, multiplication and divisin of fractions
13 fractions, multiplication and divisin of fractions13 fractions, multiplication and divisin of fractions
13 fractions, multiplication and divisin of fractions
 
Chapter8
Chapter8Chapter8
Chapter8
 
Complements and Conditional Probability, and Bayes' Theorem
 Complements and Conditional Probability, and Bayes' Theorem Complements and Conditional Probability, and Bayes' Theorem
Complements and Conditional Probability, and Bayes' Theorem
 
Counting
CountingCounting
Counting
 
Tryptone task
Tryptone taskTryptone task
Tryptone task
 
Estimating a Population Mean
Estimating a Population MeanEstimating a Population Mean
Estimating a Population Mean
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
 
Chapter04
Chapter04Chapter04
Chapter04
 
Doe Helicopters Project
Doe Helicopters ProjectDoe Helicopters Project
Doe Helicopters Project
 
Assessing Normality
Assessing NormalityAssessing Normality
Assessing Normality
 
Basic Concepts of Probability
Basic Concepts of ProbabilityBasic Concepts of Probability
Basic Concepts of Probability
 
Chapter11
Chapter11Chapter11
Chapter11
 

Viewers also liked

10.1 part2
10.1 part210.1 part2
10.1 part2leblance
 
Nonparametric methods and chi square tests (1)
Nonparametric methods and chi square tests (1)Nonparametric methods and chi square tests (1)
Nonparametric methods and chi square tests (1)Shakeel Nouman
 
Chi-Square test of Homogeneity by Pops P. Macalino (TSU-MAEd)
Chi-Square test of Homogeneity by Pops P. Macalino (TSU-MAEd)Chi-Square test of Homogeneity by Pops P. Macalino (TSU-MAEd)
Chi-Square test of Homogeneity by Pops P. Macalino (TSU-MAEd)pops macalino
 
Aron chpt 11 ed (2)
Aron chpt 11 ed (2)Aron chpt 11 ed (2)
Aron chpt 11 ed (2)Sandra Nicks
 
The Chi-Square Statistic: Tests for Goodness of Fit and Independence
The Chi-Square Statistic: Tests for Goodness of Fit and IndependenceThe Chi-Square Statistic: Tests for Goodness of Fit and Independence
The Chi-Square Statistic: Tests for Goodness of Fit and Independencejasondroesch
 
Chi square test final
Chi square test finalChi square test final
Chi square test finalHar Jindal
 
Chi Square Worked Example
Chi Square Worked ExampleChi Square Worked Example
Chi Square Worked ExampleJohn Barlow
 
State of the Word 2011
State of the Word 2011State of the Word 2011
State of the Word 2011photomatt
 

Viewers also liked (18)

Stats chapter 7
Stats chapter 7Stats chapter 7
Stats chapter 7
 
Stats chapter 15
Stats chapter 15Stats chapter 15
Stats chapter 15
 
10.1 part2
10.1 part210.1 part2
10.1 part2
 
Nonparametric methods and chi square tests (1)
Nonparametric methods and chi square tests (1)Nonparametric methods and chi square tests (1)
Nonparametric methods and chi square tests (1)
 
Chi-Square test of Homogeneity by Pops P. Macalino (TSU-MAEd)
Chi-Square test of Homogeneity by Pops P. Macalino (TSU-MAEd)Chi-Square test of Homogeneity by Pops P. Macalino (TSU-MAEd)
Chi-Square test of Homogeneity by Pops P. Macalino (TSU-MAEd)
 
Chi square using excel
Chi square using excelChi square using excel
Chi square using excel
 
Aron chpt 11 ed (2)
Aron chpt 11 ed (2)Aron chpt 11 ed (2)
Aron chpt 11 ed (2)
 
Goodness of fit (ppt)
Goodness of fit (ppt)Goodness of fit (ppt)
Goodness of fit (ppt)
 
Chi squared test
Chi squared testChi squared test
Chi squared test
 
Chi square analysis
Chi square analysisChi square analysis
Chi square analysis
 
The Chi-Square Statistic: Tests for Goodness of Fit and Independence
The Chi-Square Statistic: Tests for Goodness of Fit and IndependenceThe Chi-Square Statistic: Tests for Goodness of Fit and Independence
The Chi-Square Statistic: Tests for Goodness of Fit and Independence
 
Chi square test final
Chi square test finalChi square test final
Chi square test final
 
Chi Square Worked Example
Chi Square Worked ExampleChi Square Worked Example
Chi Square Worked Example
 
Chi square test
Chi square test Chi square test
Chi square test
 
The Chi-Squared Test
The Chi-Squared TestThe Chi-Squared Test
The Chi-Squared Test
 
Chi – square test
Chi – square testChi – square test
Chi – square test
 
Chi square test
Chi square testChi square test
Chi square test
 
State of the Word 2011
State of the Word 2011State of the Word 2011
State of the Word 2011
 

Similar to Stats chapter 14

Memorization of Various Calculator shortcuts
Memorization of Various Calculator shortcutsMemorization of Various Calculator shortcuts
Memorization of Various Calculator shortcutsPrincessNorberte
 
10.Analysis of Variance.ppt
10.Analysis of Variance.ppt10.Analysis of Variance.ppt
10.Analysis of Variance.pptAbdulhaqAli
 
1. A law firm wants to determine the trend in its annual billings .docx
1. A law firm wants to determine the trend in its annual billings .docx1. A law firm wants to determine the trend in its annual billings .docx
1. A law firm wants to determine the trend in its annual billings .docxmonicafrancis71118
 
Stat 130 chi-square goodnes-of-fit test
Stat 130   chi-square goodnes-of-fit testStat 130   chi-square goodnes-of-fit test
Stat 130 chi-square goodnes-of-fit testAldrin Lozano
 
Lesson06_new
Lesson06_newLesson06_new
Lesson06_newshengvn
 
Chi square and t tests, Neelam zafar & group
Chi square and t tests, Neelam zafar & groupChi square and t tests, Neelam zafar & group
Chi square and t tests, Neelam zafar & groupNeelam Zafar
 
TSTD 6251  Fall 2014SPSS Exercise and Assignment 120 PointsI.docx
TSTD 6251  Fall 2014SPSS Exercise and Assignment 120 PointsI.docxTSTD 6251  Fall 2014SPSS Exercise and Assignment 120 PointsI.docx
TSTD 6251  Fall 2014SPSS Exercise and Assignment 120 PointsI.docxnanamonkton
 
Chi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.pptChi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.pptBAGARAGAZAROMUALD2
 

Similar to Stats chapter 14 (20)

Chapter11
Chapter11Chapter11
Chapter11
 
Memorization of Various Calculator shortcuts
Memorization of Various Calculator shortcutsMemorization of Various Calculator shortcuts
Memorization of Various Calculator shortcuts
 
Statistics
StatisticsStatistics
Statistics
 
10.Analysis of Variance.ppt
10.Analysis of Variance.ppt10.Analysis of Variance.ppt
10.Analysis of Variance.ppt
 
Chapter12
Chapter12Chapter12
Chapter12
 
T test
T test T test
T test
 
1. A law firm wants to determine the trend in its annual billings .docx
1. A law firm wants to determine the trend in its annual billings .docx1. A law firm wants to determine the trend in its annual billings .docx
1. A law firm wants to determine the trend in its annual billings .docx
 
Two Proportions
Two Proportions  Two Proportions
Two Proportions
 
Hypothesis and Test
Hypothesis and TestHypothesis and Test
Hypothesis and Test
 
Stat 130 chi-square goodnes-of-fit test
Stat 130   chi-square goodnes-of-fit testStat 130   chi-square goodnes-of-fit test
Stat 130 chi-square goodnes-of-fit test
 
Lesson06_new
Lesson06_newLesson06_new
Lesson06_new
 
Chi square and t tests, Neelam zafar & group
Chi square and t tests, Neelam zafar & groupChi square and t tests, Neelam zafar & group
Chi square and t tests, Neelam zafar & group
 
Chapter 9
Chapter 9Chapter 9
Chapter 9
 
Chisquare Test
Chisquare Test Chisquare Test
Chisquare Test
 
Inferential statistics nominal data
Inferential statistics   nominal dataInferential statistics   nominal data
Inferential statistics nominal data
 
Two variances or standard deviations
Two variances or standard deviations  Two variances or standard deviations
Two variances or standard deviations
 
STATISTIC ESTIMATION
STATISTIC ESTIMATIONSTATISTIC ESTIMATION
STATISTIC ESTIMATION
 
Goodness of Fit Notation
Goodness of Fit NotationGoodness of Fit Notation
Goodness of Fit Notation
 
TSTD 6251  Fall 2014SPSS Exercise and Assignment 120 PointsI.docx
TSTD 6251  Fall 2014SPSS Exercise and Assignment 120 PointsI.docxTSTD 6251  Fall 2014SPSS Exercise and Assignment 120 PointsI.docx
TSTD 6251  Fall 2014SPSS Exercise and Assignment 120 PointsI.docx
 
Chi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.pptChi-Square Presentation - Nikki.ppt
Chi-Square Presentation - Nikki.ppt
 

More from Richard Ferreria (20)

Chapter6
Chapter6Chapter6
Chapter6
 
Chapter2
Chapter2Chapter2
Chapter2
 
Chapter3
Chapter3Chapter3
Chapter3
 
Chapter1
Chapter1Chapter1
Chapter1
 
Chapter4
Chapter4Chapter4
Chapter4
 
Chapter7
Chapter7Chapter7
Chapter7
 
Chapter5
Chapter5Chapter5
Chapter5
 
Chapter14
Chapter14Chapter14
Chapter14
 
Chapter15
Chapter15Chapter15
Chapter15
 
Chapter10
Chapter10Chapter10
Chapter10
 
Chapter13
Chapter13Chapter13
Chapter13
 
Adding grades to your google site v2 (dropbox)
Adding grades to your google site v2 (dropbox)Adding grades to your google site v2 (dropbox)
Adding grades to your google site v2 (dropbox)
 
Stats chapter 13
Stats chapter 13Stats chapter 13
Stats chapter 13
 
Stats chapter 12
Stats chapter 12Stats chapter 12
Stats chapter 12
 
Stats chapter 11
Stats chapter 11Stats chapter 11
Stats chapter 11
 
Stats chapter 11
Stats chapter 11Stats chapter 11
Stats chapter 11
 
Stats chapter 10
Stats chapter 10Stats chapter 10
Stats chapter 10
 
Stats chapter 9
Stats chapter 9Stats chapter 9
Stats chapter 9
 
Stats chapter 8
Stats chapter 8Stats chapter 8
Stats chapter 8
 
Stats chapter 8
Stats chapter 8Stats chapter 8
Stats chapter 8
 

Recently uploaded

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Recently uploaded (20)

Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 

Stats chapter 14

  • 1. Chapter 14 Inference for Distributions of Categorical Variables: Chi-Square Procedures
  • 2. 14.1 Test for Goodness of Fit
  • 3. The problem Suppose we open a bag of M&M’s and count the number of M&M’s of each color. How would we know if our color counts are at normal levels? How would we know if our color counts were abnormal?
  • 4. Chi-Square Distribution When we want to test the proportions behind many counts (e.g. a two-way table or an array), we need a new distribution: the Chi-Square Distribution (Chi = χ = “KAI”). As you might suspect, this is another (the last of the year) PHANTOMS procedure. The χ² distribution is found in Table D and in the [2nd] -> [VARS] (DIST) menu on your calculator.
  • 6. The χ² distribution Like the t-distribution, the χ² distribution is a family of curves: its shape depends on the degrees of freedom (df). It is single-peaked and right-skewed. As df increases, the peak gets lower, moves to the right, and the curve becomes more symmetric (closer to Normal). As df increases, the χ² statistic needed for statistically significant results also increases.
  • 8. Chi-Square Goodness of Fit When we want to check whether an observed distribution fits a hypothesized distribution, we use the “χ² goodness of fit test”. This procedure is frequently used to see whether a distribution is not in equal proportions. No, this will not be much different from what we have already been doing for the last 3 chapters.
  • 9. χ² GOF Test Parameter Unlike previous tests, you will not need to state a μ or a p. You need to state where the distribution comes from. EX: We are investigating the proportions of colors in all 15 oz. bags of chocolate M&M’s.
  • 10. χ² GOF Test Hypotheses There are two styles for stating hypotheses. Style 1: In this style, you refer to a written table, or state that all proportions are “equal”. H0: the proportions of M&M’s colors are the same as in the table provided. Ha: at least one color proportion is different from the table. H0: the proportions of accidents on each day are equal. Ha: at least one day has a proportion that is not equal to the others.
  • 11. χ² GOF Test Hypotheses (cont.) Style 2: In this style, you write out the expected proportions. H0: p_red = p_blue = p_yel = p_brn = p_grn = p_org = 1/6. Ha: at least one proportion is different from the value stated above.
  • 12. χ² GOF Test Hypotheses (cont.) Notice that the alternative hypothesis in each case is that at least one proportion is different from the hypothesized value.
  • 13. χ² GOF Test Assumptions 1. All expected cell counts are greater than 1. 2. No more than 20% of the expected cell counts are less than 5 (that’s a whole lot easier, yeah?). Name of the Test: “χ² Goodness Of Fit Test”
  • 14. χ² GOF Test Test Statistic Observed Count (O) is the count that we observed in each cell. The sum of the observed counts is ‘n’. Expected Count (E) is the hypothesized proportion for each cell times the sample size ‘n’.
  • 15. χ² GOF Test Test Statistic (cont.) We are testing to see if M&M’s come in equal proportions. Suppose we opened a bag of M&M’s and found the following counts: O: Red 5, Blue 3, Brown 10, Yellow 6, Green 4, Orange 3 (n = 31). E: 5.17 for every color. Note: the expected counts are all equal to 31/6.
  • 16. χ² GOF Test Test Statistic (cont.) The test statistic is χ² (“kai squared”): χ² = Σ (O – E)² / E, summed over all cells. Degrees of freedom (df) = # of classes – 1
  • 17. χ² GOF Test Test Statistic (cont.)
  • 18. χ² GOF Test P Value p val = P(χ²(df) > test statistic). On the calculator: [2nd] -> [VARS] (DIST) -> χ²cdf. Usage: “χ²cdf(lower, upper, df)”. pval = P(χ²(5) > 6.739) = 0.2409
  • 22. χ² GOF Test Decision As in the other tests, reject the null hypothesis when the p-value is below the chosen significance level (alpha). Summary Use the same 3 part summary: 1) Interpret the p value w.r.t. the sampling distribution 2) Make a decision with reference to an alpha level 3) Summarize the results in the context of the problem
  • 23. χ² GOF Test Summary (cont.) “If the colors really came in equal proportions, about 24% of all random samples of 31 would show counts at least as far from the expected counts as ours.” “Because this p value is greater than any commonly used alpha level, we fail to reject the null hypothesis.” “We do not have sufficient evidence to conclude that the colors of M&M’s do not come in equal proportions.”
  • 25. Calculator methods TI83/84 Begin by storing the observed counts in “L1”. Store the expected counts in “L2”. From the Home Screen evaluate: “sum((L1 – L2)²/L2)”. This is the value of χ². Use χ²cdf from the “Dist Menu” to find the p-value: “χ²cdf(lower, upper, df)”
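
The list arithmetic above can also be checked away from the calculator. The following is a minimal Python sketch using only the standard library and the M&M’s counts from the slides. The helper name `chi2_sf` is an assumption for illustration (it is not a TI or textbook function); it computes the upper-tail χ² probability from the power series for the incomplete gamma function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square(df) > x).

    Hypothetical helper: evaluates the power series for the regularized
    lower incomplete gamma function P(df/2, x/2) and returns 1 - P.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-15 * total:
        k += 1
        term *= z / (a + k)
        total += term
    lower_tail = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - lower_tail

# Observed counts from the slides: red, blue, brown, yellow, green, orange.
observed = [5, 3, 10, 6, 4, 3]
n = sum(observed)

# Under H0 every color has proportion 1/6, so each expected count is n/6.
expected = [n / len(observed)] * len(observed)

# Same computation as sum((L1 - L2)^2 / L2) on the calculator.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi2_sf(chi2, df)

print(f"chi2 = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
```

With unrounded expected counts (31/6) the statistic differs from the slide’s 6.739 in the third decimal place, presumably because of rounding; the p-value is still about 0.24, so the decision does not change.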
  • 33. 14.2 Inference For Two-Way Tables
  • 35. Note that the information is presented in a two-way table with marginal distributions
  • 37. Expected cell count for 2-way tables Start with the % of the population that is in each column. The expected count for a cell is the count it would have if its row “obeyed” the column percentages. Even for a small table, these calculations get cumbersome.
  • 41. Expected Counts Expected = (Row Total x Column Total) / Table Total. For the cell with observed count 30, row total 99, column total 84 and table total 243: Expected = (99 x 84) / 243 = 34.22. Let’s start with the PHANTOMS procedure
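
The expected-count rule above is short enough to check in a few lines. This is a small Python sketch, assuming only the numbers quoted on the slides (row total 99, column total 84, table total 243); the function name `expected_count` is made up for illustration.

```python
def expected_count(row_total, column_total, table_total):
    """Expected cell count: (row total x column total) / table total."""
    return row_total * column_total / table_total

# Numbers quoted on the slides.
row_total, column_total, table_total = 99, 84, 243

e = expected_count(row_total, column_total, table_total)

# Equivalent view: the row total times the overall column percentage,
# i.e. the count the cell would have if the row "obeyed" that percentage.
column_share = column_total / table_total
e_from_share = row_total * column_share

print(round(e, 2))             # 34.22
print(round(e_from_share, 2))  # 34.22
```

Both forms give the same value, which is why the shortcut formula can replace the percentage reasoning once a table has more than a few cells.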
  • 47. χ² Test for Homogeneity Parameter State where each proportion comes from and what each count represents. “We are investigating the proportions of customers in the store who purchase French, Italian or other wine while listening to French, Italian or other music.”
  • 48. χ² Test for Homogeneity Hypotheses The null hypothesis is always “the distributions of (group A) are the same in all populations of (group B)”. The alternative hypothesis is always “the distributions of (group A) are not all the same”. “H0: the distributions of wine types are the same in all populations of music types. Ha: the distributions of wine types are not all the same.”
  • 49. χ² Test for Homogeneity Assumptions (1) No more than 20% of the expected cell counts are less than 5 (2) All expected cell counts are > 1 (3) In a 2 x 2 table, all expected counts are greater than 5
  • 50. χ² Test for Homogeneity “All expected cell counts are greater than 5”
  • 51. χ² Test for Homogeneity Test Statistic
  • 52. χ² Test for Homogeneity P Value Decision Reject the null hypothesis
  • 55. χ² Test for Homogeneity Summary If the distributions of wine types really were the same for every music type, a random sample of 243 would produce a table at least this far from the expected counts only about 0.1% of the time. Because the p value is less than an α of 0.05, we reject the null hypothesis. We have sufficient evidence at the 5% significance level to conclude that the distribution of wine types purchased is not the same for all music types.
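
The whole two-way test can be sketched in Python as a cross-check. The transcript does not include the full observed table, so the counts below are an assumption: they were chosen to agree with the figures quoted on the slides (observed cell 30, row total 99, column total 84, table total 243) and should be replaced with the real data. The sketch uses df = (rows – 1) x (columns – 1) and the closed-form χ² upper tail that holds when df is even; the helper name `chi2_sf_even` is made up for illustration.

```python
import math

def chi2_sf_even(x, df):
    """P(chi-square(df) > x) for even df, using the closed form
    exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df % 2 != 0:
        raise ValueError("closed form only applies to even df")
    z = x / 2.0
    return math.exp(-z) * sum(z ** k / math.factorial(k) for k in range(df // 2))

# ASSUMED observed counts (not in the transcript).
# Rows: French, Italian, other wine. Columns: no, French, Italian music.
observed = [
    [30, 39, 30],
    [11, 1, 19],
    [43, 35, 35],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count for each cell = row total x column total / table total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)
p_value = chi2_sf_even(chi2, df)

print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}")
```

Under these assumed counts the p-value comes out near 0.001, in line with the “approximately 0.1%” in the summary, and the smallest expected count is well above 5, so the assumptions slide is satisfied.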
  • 57. Calculator Methods Methods on the TI84 Before you begin the test, you must enter the “observed counts” into MATRIX [A]: [2ND] -> [x⁻¹] (MATRIX) -> “EDIT” -> [1]. Input the correct matrix size and cell counts. (Use [ENTER] or the Cursor Keys to switch between fields.)
  • 61. Calculator Methods Methods on the TI84 (cont.) IMPORTANT: after inputting the observed matrix, quit and go to the home screen. [STAT] -> “TESTS” -> “χ²-Test”. Ensure that “Observed” is set to [A] and “Expected” is set to [B], then choose “Calculate”. The expected cell counts will be calculated and stored in Matrix [B] (go back to the Matrix menu to see the expected counts).
  • 67. χ² Tests Occasionally, you will be asked to find the cell that “contributed the most to the χ² statistic.” When this is asked, you must calculate the χ² statistic by hand and find the largest value of (O – E)² / E. This is usually the cell that differs the most from its expected count. Because each term is measured relative to the expected count, the largest contributor is not always the cell with the largest raw difference.
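
Finding the largest contributor is easy to illustrate in Python. The observed counts below are an assumption (the transcript does not show the full wine and music table); they are chosen only to match the margins quoted on the slides. The sketch lists each cell’s (O – E)² / E and picks the largest.

```python
wines = ["French", "Italian", "Other"]
music = ["None", "French", "Italian"]

# ASSUMED observed counts (rows: wine type, columns: music type).
observed = [
    [30, 39, 30],
    [11, 1, 19],
    [43, 35, 35],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

contributions = {}
raw_differences = {}
for i, wine in enumerate(wines):
    for j, tune in enumerate(music):
        e = row_totals[i] * col_totals[j] / total
        raw_differences[(wine, tune)] = abs(observed[i][j] - e)
        contributions[(wine, tune)] = (observed[i][j] - e) ** 2 / e

largest = max(contributions, key=contributions.get)

for cell, value in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{cell[0]:>7} wine, {cell[1]:>7} music: "
          f"|O-E| = {raw_differences[cell]:5.2f}, (O-E)^2/E = {value:5.2f}")
print("largest contributor:", largest)
```

With these assumed counts, two cells have almost the same raw difference (about 8.6 and 8.4), yet their contributions are roughly 7.7 and 2.3, because the first cell has a much smaller expected count. That is the sense in which the largest contributor is not always predictable from raw differences alone.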
  • 68. χ² Test for Independence A similar test for two-way tables is the “χ² Test for Independence”, sometimes called the “χ² Test for Association”. This test asks the question, “are the two variables related to each other?” When there is no association, the observed two-way table is close to the expected table.
  • 69. χ² Test for Independence This test really only differs from the test for homogeneity in the hypotheses and the conclusion. Hypotheses The null hypothesis is “there is no association between (group 1) and (group 2)”. The alternative hypothesis is “there is an association between (group 1) and (group 2)”.
  • 70. χ² Test for Independence Conclusion Phrase your conclusions similarly to the ones we have been constructing. When failing to reject H0: after interpreting the p value and comparing the p value to alpha, state that there is “not sufficient evidence to conclude that an association exists between (group 1) and (group 2)”. Likewise, when rejecting H0, state that “there is sufficient evidence to conclude that an association exists between (group 1) and (group 2)”.
  • 71. Assignment 14.2 Page 877 #29, 31, 32, 33