Categorical
Data Analysis
KRISHNAKUMAR D
AVMC&H
Categorical Data Analysis
Textbook: "An Introduction to Categorical Data Analysis" by Alan Agresti
Scales Of Measurement
• Four scales of measurement:
• Nominal: no order (e.g., gender)
• Ordinal: ordered categories (e.g., income status)
• Interval: equal intervals with no true zero (e.g., temperature in °C)
• Ratio: equal intervals with a true zero (e.g., height)
Categorical Data: Analysis Strategies
• Hypothesis testing: Is there any association?
  • Chi-square test, Fisher's exact test, etc.
  • Chapters 1, 2, 3
• Modeling: What is the nature of the association?
  • Logistic regression, log-linear models
  • Chapters 4, 5, 6, 7
What is categorical data?
The measurement scale for the response consists of a number of categories.

Variable               Measurement scale
Farm system            Organic, non-organic
Education              Good, average, poor
Food texture           Very soft, soft, hard, very hard
Nutrition status       Grade 1, 2, 3
KAP (public health)    "Yes" or "No"
Data Analysis considered:
• Response variable(s) (dependent or Y variable): categorical
• Explanatory variable(s) (independent or X variables): may be categorical, continuous, or both
Example: Does diabetes (categorical response) depend on the explanatory variables?
Sex (categorical)
Age (continuous)
Example:
Y = Diabetes (present/absent, or normal/mild/moderate/severe): dependent variable
X's = Income, education, gender, age, sedentary lifestyle, heredity, etc.: independent variables
Important Note
• Methods designed for nominal variables give the same results no matter how the categories are listed.
• Methods for ordinal variables utilize the category ordering. Whether we list the categories from low to high or from high to low is irrelevant in terms of substantive conclusions, but results would change if the categories were reordered in any other way.
• Methods designed for ordinal variables cannot be used with nominal variables, since nominal categories have no ordering.
• However, methods designed for nominal variables can be used with nominal or ordinal variables, because they require only a categorical scale.
• When used with ordinal variables, they ignore the ordering, which can result in a serious loss of power.
• Hierarchy of measurement levels: nominal < ordinal < interval
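The first point of this note can be checked numerically. The sketch below assumes the Pearson chi-squared statistic as the nominal-scale method and uses a made-up 2 x 3 table of counts (both are illustrative choices, not taken from the slides). It recomputes the statistic for every ordering of the columns.

```python
from itertools import permutations

def pearson_chi2(table):
    """Pearson chi-squared statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def reorder_columns(table, order):
    """Return a copy of the table with its columns listed in the given order."""
    return [[row[j] for j in order] for row in table]

# Hypothetical 2 x 3 table of counts, made up for illustration only.
table = [[20, 30, 50],
         [40, 30, 30]]

# One statistic per possible ordering of the three column categories.
stats = [pearson_chi2(reorder_columns(table, order))
         for order in permutations(range(3))]
print(stats)
```

Every ordering gives the same value (up to floating-point rounding), which is what "same results no matter how the categories are listed" means in practice. An ordinal method, by contrast, would assign scores to the columns and would generally change under such a reordering.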
Probability Distributions
• For a continuous response variable: normal distribution
• For a categorical response variable: binomial or multinomial distribution
Binomial Distribution
• n Bernoulli trials, with two possible outcomes for each (success, failure)
• π = P(success), 1 − π = P(failure) for each trial
• Y = number of successes out of n trials
• Trials are independent
Y has the binomial distribution
$P(y) = \frac{n!}{y!\,(n-y)!}\,\pi^{y}(1-\pi)^{n-y}$, y = 0, 1, 2, …, n
Example: Binomial Distribution
• Vote (Democrat, Republican)
• Suppose π = P(Democrat) = 0.50.
For n = 3 persons, let y = number of Democratic votes. Then
p(0) = 0.125
p(1) = 0.375
p(2) = 0.375
p(3) = 0.125
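These four probabilities can be reproduced directly from the binomial formula. A minimal sketch, where `binomial_pmf` is an illustrative helper name rather than anything from the textbook:

```python
from math import comb

def binomial_pmf(y, n, pi):
    """P(Y = y) for a binomial with n trials and success probability pi."""
    return comb(n, y) * pi**y * (1 - pi)**(n - y)

n, pi = 3, 0.5  # values from the voting example
probs = [binomial_pmf(y, n, pi) for y in range(n + 1)]
for y, p in enumerate(probs):
    print(f"p({y}) = {p:.3f}")
```

The probabilities sum to 1, as they must for a probability distribution over y = 0, 1, 2, 3.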
Multinomial distribution
• When each trial has more than two possible outcomes, the counts of outcomes in the various categories have a multinomial distribution.
• Let c denote the number of outcome categories.
• The binomial distribution is the special case with c = 2 categories.
Properties of the
Multinomial Experiment
1. The experiment consists of n identical trials.
2. There are k possible outcomes to each trial. These
outcomes are called classes, categories, or cells.
3. The probabilities of the k outcomes, denoted by p1,
p2, …, pk, remain the same from trial to trial, where
p1 + p2 + … + pk = 1.
4. The trials are independent.
5. The random variables of interest are the cell
counts, n1, n2, …, nk, of the number of
observations that fall in each of the k classes.
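Under these five properties, the probability of a particular set of cell counts is n!/(n1! n2! … nk!) × p1^n1 × … × pk^nk. The sketch below evaluates that formula. The three-category probabilities and counts are hypothetical values chosen for illustration, and `multinomial_pmf` is an assumed helper name.

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """Probability of observing the given cell counts in sum(counts) trials."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)      # exact integer multinomial coefficient
    prob = float(coef)
    for c, p in zip(counts, probs):
        prob *= p ** c
    return prob

# Hypothetical example: k = 3 categories, n = 5 trials.
three_cell = multinomial_pmf([2, 2, 1], [0.5, 0.3, 0.2])

# Special case k = 2: one success and two failures with p = 0.5,
# which is the binomial probability p(1) for n = 3.
two_cell = multinomial_pmf([1, 2], [0.5, 0.5])
print(three_cell, two_cell)
```

With two cells the function reduces to the binomial probability, consistent with the binomial being the special case of two categories.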
Statistical Inference for a proportion
• The parameters of a binomial or multinomial distribution are estimated from the sample data.
• The method of estimation is maximum likelihood (ML) estimation.
• The likelihood function (denoted by l) is the probability of the observed data, expressed as a function of the parameter value.
Contd…
Example:
Consider a binomial case with n = 2 trials and observed y = 1.
• The likelihood function l(π) = 2π(1 − π) is defined for π between 0 and 1.
• If π = 0, the probability of getting y = 1 is l(0) = 0.
• If π = 0.5, the probability of getting y = 1 is l(0.5) = 0.5.
Maximum Likelihood
• The maximum likelihood (ML) estimate is the parameter value at which the likelihood function takes its maximum.
• Example: l(π) = 2π(1 − π) is maximized at π̂ = 0.5,
• i.e., y = 1 in n = 2 trials is most likely if π = 0.5. The ML estimate of π is π̂ = 0.50.
• In general, the ML estimate of π is the sample proportion p = y/n.
Figure: Binomial likelihood functions for y = 0 successes and y = 6 successes in n = 10 trials.
The result y = 6 in n = 10 trials is more likely to occur when π = 0.60 than when π equals any other value.
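The location of the maximum can be checked numerically. The sketch below evaluates the binomial likelihood on a grid of π values spaced 0.01 apart and picks the largest. The grid search is an illustrative shortcut (an assumption of this sketch), not the method used in the textbook, where the estimate y/n is obtained analytically.

```python
from math import comb

def binomial_likelihood(pi, y, n):
    """Likelihood of pi given y successes in n trials."""
    return comb(n, y) * pi**y * (1 - pi)**(n - y)

# Candidate values 0.00, 0.01, ..., 1.00 for pi.
grid = [i / 100 for i in range(101)]

def grid_argmax(y, n):
    """Grid value of pi at which the likelihood is largest."""
    return max(grid, key=lambda pi: binomial_likelihood(pi, y, n))

pi_hat_6 = grid_argmax(6, 10)  # y = 6 successes in n = 10 trials
pi_hat_0 = grid_argmax(0, 10)  # y = 0 successes in n = 10 trials
pi_hat_1 = grid_argmax(1, 2)   # y = 1 success in n = 2 trials
print(pi_hat_6, pi_hat_0, pi_hat_1)
```

In each case the maximizing value agrees with the sample proportion y/n: 0.6, 0.0, and 0.5.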
Significance Test for binomial parameter
• A significance test merely indicates whether a particular value for a parameter is plausible.
• The ML estimator of the binomial parameter π is the sample proportion, p = y/n.
Confidence interval and significance tests
• Three methods are used to construct confidence intervals and test statistics:
  • Wald method
  • Likelihood-ratio method
  • Score method
Wald Test
• Let β̂ be the ML estimator of a parameter β. Then the Wald test statistic for testing H0: β = β0 is given by
$z = \frac{\hat{\beta} - \beta_0}{SE}$
where SE is the standard error of the ML estimate. Under H0, z has approximately a standard normal distribution, and z² has approximately a chi-squared distribution with d.f. = 1.
• The z or chi-squared test using this test statistic is called a Wald test.
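For a binomial proportion, the Wald statistic takes the sample proportion p as the estimate and sqrt(p(1 − p)/n) as its standard error. A minimal sketch, reusing y = 6 and n = 10 from the likelihood figure and assuming a null value π0 = 0.5 purely for illustration (`wald_test_proportion` is a hypothetical helper name):

```python
from math import sqrt, erfc

def wald_test_proportion(y, n, pi0):
    """Wald z statistic, z squared, and two-sided p-value for H0: pi = pi0.

    Assumes 0 < y < n, since the estimated SE is zero when p is 0 or 1.
    """
    p = y / n
    se = sqrt(p * (1 - p) / n)        # SE evaluated at the ML estimate
    z = (p - pi0) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided standard normal tail area
    return z, z**2, p_value

z, chi2, p_value = wald_test_proportion(y=6, n=10, pi0=0.5)
print(f"z = {z:.3f}, z^2 = {chi2:.3f}, p-value = {p_value:.3f}")
```

Because the standard error is computed at p rather than at π0, the statistic is undefined when y = 0 or y = n, which is one reason Wald inference can behave poorly in small samples.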
Likelihood Ratio Test
This alternative test uses the likelihood function
through the ratio of two maximizations of it:
1. the maximum over the possible parameter values
that assume the null hypothesis,
2. the maximum over the larger set of possible
parameter values, permitting the null or the
alternative hypothesis to be true.
Contd..
Let l0 denote the maximized value of the likelihood function under the null hypothesis, and let l1 denote the maximized value more generally.
For instance, when there is a single parameter β, l0 is the likelihood function calculated at β0, and l1 is the likelihood function calculated at the ML estimate β̂.
Then l1 is always at least as large as l0, because l1 refers to maximizing over a larger set of possible parameter values.
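The standard likelihood-ratio test statistic built from these two quantities is −2 log(l0/l1), which is nonnegative because l1 ≥ l0 and is compared with a chi-squared distribution with d.f. = 1 when there is a single parameter. The sketch below applies it to a binomial proportion, again with y = 6, n = 10 and an assumed null value π0 = 0.5 (`lr_test_proportion` is a hypothetical helper name).

```python
from math import log, sqrt, erfc

def lr_test_proportion(y, n, pi0):
    """Likelihood-ratio statistic and p-value for H0: pi = pi0 (0 < pi0 < 1)."""
    p = y / n

    def loglik(pi):
        # Binomial log-likelihood without the constant log C(n, y),
        # which cancels in the ratio l0 / l1.
        total = 0.0
        if y > 0:
            total += y * log(pi)
        if n - y > 0:
            total += (n - y) * log(1 - pi)
        return total

    log_l0 = loglik(pi0)  # likelihood maximized under H0
    log_l1 = loglik(p)    # likelihood maximized over all pi
    stat = max(-2 * (log_l0 - log_l1), 0.0)  # guard against rounding below 0
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-squared with d.f. = 1
    return stat, p_value

lr_stat, lr_p_value = lr_test_proportion(y=6, n=10, pi0=0.5)
print(f"LR statistic = {lr_stat:.3f}, p-value = {lr_p_value:.3f}")
```

Working on the log scale avoids multiplying many small probabilities and turns the ratio into a difference of log-likelihoods.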
Remarks
• For ordinary regression models assuming a normal distribution for Y, the three tests provide identical results.
• In other cases, for large samples they have similar behaviour when H0 is true.
• The Wald CI often performs poorly in categorical data analysis unless n is quite large.
• For inference about proportions, the score method tends to perform better than the Wald method, in terms of having actual error rates closer to the advertised levels.
• In practice, Wald inference is popular because of its simplicity and the ease of forming it from software output.
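The contrast between the Wald and score methods is easiest to see in an extreme small-sample case. The sketch below computes 95% intervals for y = 0 successes in n = 10 trials. It assumes the usual closed form of the score interval for a proportion (often called the Wilson interval) and a hard-coded normal quantile of about 1.96. The function names are illustrative.

```python
from math import sqrt

Z_975 = 1.959964  # approximate 97.5th percentile of the standard normal

def wald_interval(y, n, z=Z_975):
    """Wald interval: p +/- z * sqrt(p(1 - p)/n), with SE evaluated at p."""
    p = y / n
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half

def score_interval(y, n, z=Z_975):
    """Score (Wilson) interval: all pi0 with |p - pi0| / sqrt(pi0(1 - pi0)/n) <= z."""
    p = y / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

wald = wald_interval(0, 10)
score = score_interval(0, 10)
print("Wald: ", wald)
print("Score:", score)
```

With no successes the Wald interval collapses to the single point 0, while the score interval runs from 0 to roughly 0.28, a far more plausible summary of the uncertainty after only ten trials.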
Thank you
