© 2013 ExcelR Solutions. All Rights Reserved
Advanced Regression
AGENDA
•  Multinomial Regression
•  Zero Inflated Poisson Regression
•  Negative Binomial
Multinomial Regression
•  Logistic regression (binomial distribution) is used when the output has 2 categories
•  Multinomial regression (a classification model) is used when the output has more than 2 categories
•  It is an extension of logistic regression
•  It assumes no natural ordering of the categories
•  The response variable has more than 2 categories, hence we apply a multinomial logit (multilogit) model
•  Example: understand the impact of cost & time on the choice among the various modes of transport
Mode of transport    Car     Carpool    Bus     Rail    All modes
Count                218     32         81      122     453
Probability          0.48    0.07       0.18    0.27    1
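The probabilities in the table are simply each mode's share of the 453 observations. A minimal Python sketch of that arithmetic, using only the counts shown above (the variable names are illustrative):

```python
# Observed counts per mode of transport, taken from the table above.
counts = {"Car": 218, "Carpool": 32, "Bus": 81, "Rail": 122}

# Total number of observations across all modes.
total = sum(counts.values())

# Empirical probability of each mode = count / total.
probabilities = {mode: n / total for mode, n in counts.items()}

for mode, p in probabilities.items():
    print(f"{mode:8s} {p:.2f}")
```

These shares are also the fitted probabilities of a multinomial model with intercepts only, so they are a useful baseline to compare against once cost and time are added.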
Multinomial Regression
•  For a categorical 'Y' (response) or 'X' (predictor) with 's' categories:
✓  The level that is lowest in numerical / lexicographical order is chosen as the baseline / reference
✓  The level that is missing from the output is the baseline level
✓  We can set a baseline level of our choice with the 'relevel' function in R
✓  The model formulates the relationship between the transformed (logit) Y & numerical X linearly
✓  Modeling quantitative variables linearly might not always be correct
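To make the baseline idea concrete, here is a small Python sketch of dummy coding for a categorical variable. It is only an illustration: `dummy_code` is a hypothetical helper written for this example (it is not R's 'relevel'), and the sample data are made up. By default it picks the lexicographically lowest level as baseline, and the baseline gets no indicator column, which is why it is "missing" from model output.

```python
def dummy_code(values, baseline=None):
    """Return (baseline, columns) for a categorical variable.

    `columns` maps every non-baseline level to a list of 0/1 indicators.
    Hypothetical helper for illustration only.
    """
    levels = sorted(set(values))
    if baseline is None:
        # Default rule: lowest level in lexicographical order.
        baseline = levels[0]
    if baseline not in levels:
        raise ValueError(f"unknown baseline level: {baseline!r}")
    columns = {
        level: [1 if v == level else 0 for v in values]
        for level in levels
        if level != baseline
    }
    return baseline, columns


# Made-up observations of the response variable.
modes = ["car", "bus", "rail", "carpool", "car"]

# Default baseline: "bus" (lowest lexicographically).
default_base, default_cols = dummy_code(modes)

# Explicitly chosen baseline, analogous in spirit to releveling.
car_base, car_cols = dummy_code(modes, baseline="car")
```

With 's' levels there are always 's - 1' indicator columns, whichever level is chosen as the baseline.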
Multinomial Regression - Output
Iteration History:
•  An iterative procedure is used to compute the maximum likelihood estimates
•  The number of iterations & the convergence status are provided
•  -2logL = 2 * negative log likelihood
•  The difference in -2logL between nested models approximately follows a χ2 distribution, which is used for hypothesis testing of goodness of fit
Number of parameters = 27
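As a rough illustration of -2logL, the sketch below evaluates it for the simplest possible model, one with intercepts only, whose fitted probabilities equal the observed mode shares (218, 32, 81, 122 out of 453). This is an assumption made for the example; it is not the 27-parameter model from the slide. The constant multinomial coefficient is omitted, as is common when comparing models on the same data.

```python
import math

# Observed counts per mode of transport (from the earlier table).
counts = {"Car": 218, "Carpool": 32, "Bus": 81, "Rail": 122}
total = sum(counts.values())

# Intercept-only model: fitted probability of each mode is its observed share.
log_l = sum(n * math.log(n / total) for n in counts.values())
neg2_log_l = -2.0 * log_l

# A cruder nested model: all four modes equally likely (no free parameters).
log_l_equal = total * math.log(1.0 / len(counts))
neg2_log_l_equal = -2.0 * log_l_equal

# Likelihood ratio statistic: difference in -2logL between the nested models.
# Under the cruder model it is approximately chi-square distributed, with
# degrees of freedom equal to the number of extra free parameters (here 3).
lr_statistic = neg2_log_l_equal - neg2_log_l
extra_params = len(counts) - 1

print(f"-2logL (intercept only): {neg2_log_l:.2f}")
print(f"LR statistic: {lr_statistic:.2f} on {extra_params} df")
```

The same comparison applies to richer models: fit both, subtract the two -2logL values, and refer the difference to a χ2 distribution.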
Multinomial Regression - Output
log( P(choice = carpool | x) / P(choice = car | x) ) = β20 + β21 * cost.car + β22 * cost.carpool + …

This equation models the log of the ratio of the probability of carpool to the probability of car, i.e. the log odds of 'carpool' relative to 'car'
•  'car' has been chosen as the baseline
•  x = vector representing the values of all inputs
•  The regression coefficient 0.636 indicates that for a 1 unit increase in 'cost.car', the log odds of 'carpool' relative to 'car' increase by 0.636, holding the other inputs fixed
•  The intercept value has no meaningful interpretation in this context

•  If we also have a categorical X, say Gender (female = 0, male = 1), then its regression coefficient (say 0.22) indicates that, relative to females, males have log odds of 'carpool' relative to 'car' that are higher by 0.22
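The log odds equations can be turned back into probabilities: with 'car' as baseline, P(car) = 1 / (1 + Σ exp(log odds)) and each other mode gets exp(log odds) times that. The Python sketch below shows this under loud assumptions: only the 0.636 coefficient comes from the slide, while the intercept, the constant log odds for 'bus', and the cost values are hypothetical numbers chosen for illustration.

```python
import math


def multinomial_probs(log_odds, baseline="car"):
    """Convert log odds versus the baseline into class probabilities.

    `log_odds` maps each non-baseline category to log(P(category) / P(baseline)).
    """
    denom = 1.0 + sum(math.exp(v) for v in log_odds.values())
    probs = {baseline: 1.0 / denom}
    for category, value in log_odds.items():
        probs[category] = math.exp(value) / denom
    return probs


B_COST_CAR = 0.636        # coefficient quoted on the slide
INTERCEPT_CARPOOL = -2.0  # hypothetical intercept, for illustration only
LOG_ODDS_BUS = -1.0       # hypothetical constant log odds of bus vs car


def carpool_log_odds(cost_car):
    """Log odds of carpool vs car as a linear function of cost.car."""
    return INTERCEPT_CARPOOL + B_COST_CAR * cost_car


# Probabilities at two hypothetical cost levels, one unit apart.
p_low = multinomial_probs({"carpool": carpool_log_odds(1.0), "bus": LOG_ODDS_BUS})
p_high = multinomial_probs({"carpool": carpool_log_odds(2.0), "bus": LOG_ODDS_BUS})

# The change in log odds of carpool vs car recovers the coefficient.
shift = (math.log(p_high["carpool"] / p_high["car"])
         - math.log(p_low["carpool"] / p_low["car"]))
```

Whatever the other inputs are, a one unit increase in cost.car shifts the carpool-versus-car log odds by exactly the coefficient, while the probabilities themselves change by amounts that depend on the starting point.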
Probability
•  Let p = p(x | A) be the probability of an event (say attrition) under condition A (say gender = female)

Odds
•  Then p(x | A) ÷ (1 - p(x | A)) is called the odds associated with the event

Odds Ratio
•  If there are two conditions A (gender = female) & B (gender = male), then the ratio
      [p(x | A) ÷ (1 - p(x | A))] / [p(x | B) ÷ (1 - p(x | B))] is called the odds ratio of A with respect to B

Relative Risk
•  p(x | A) ÷ p(x | B) is called the relative risk
https://en.wikipedia.org/wiki/Relative_risk
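The three definitions above translate directly into code. A short Python sketch, where the attrition probabilities for females and males are hypothetical values picked only to show the arithmetic:

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_a, p_b):
    """Odds ratio of condition A with respect to condition B."""
    return odds(p_a) / odds(p_b)


def relative_risk(p_a, p_b):
    """Relative risk of condition A with respect to condition B."""
    return p_a / p_b


# Hypothetical attrition probabilities, for illustration only.
p_female = 0.20  # p(x | A)
p_male = 0.10    # p(x | B)

print(f"odds (female): {odds(p_female):.3f}")
print(f"odds ratio:    {odds_ratio(p_female, p_male):.3f}")
print(f"relative risk: {relative_risk(p_female, p_male):.3f}")
```

The odds ratio and the relative risk are close only when the event is rare under both conditions; here they already differ (2.25 versus 2.0).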
Odds Ratio
•  The odds ratio is computed from a coefficient of the linear model equation by simply exponentiating it
•  Exponentiated regression coefficients are odds ratios for a unit change in a predictor variable
•  The odds ratio for a unit increase in cost.car is exp(0.636) ≈ 1.89 (reported as 1.88) for choosing carpool vs car
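The exponentiation step can be checked in a couple of lines of Python. The coefficient 0.636 is the cost.car value quoted earlier in the deck; the variable names are illustrative.

```python
import math

# Coefficient of cost.car in the carpool-vs-car equation (from the deck).
coef_cost_car = 0.636

# Exponentiating a log odds coefficient gives the odds ratio per unit change.
odds_ratio_cost_car = math.exp(coef_cost_car)

# Equivalent percentage change in the odds of carpool vs car.
percent_change = (odds_ratio_cost_car - 1.0) * 100.0

print(f"odds ratio: {odds_ratio_cost_car:.2f}")
print(f"change in odds: {percent_change:.0f}%")
```

In words: each unit increase in cost.car multiplies the odds of choosing carpool over car by this factor, with the other inputs held fixed.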
Goodness of fit
Linear                     GLM
Analysis of Variance       Analysis of Deviance
Residual Sum of Squares    Residual Deviance
OLS                        Maximum Likelihood
•  Residual Deviance is -2 log L
•  Adding more parameters to the model will reduce the Residual Deviance even if they are not useful for prediction
•  To control this, a penalty of "2 * number of parameters" is added to the Residual Deviance
•  This penalized value of -2 log L is called the Akaike Information Criterion (AIC)
•  AIC = -2 log L + 2 * number of parameters
Note: "Multilogit Model with Interaction"
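A minimal Python sketch of the AIC formula and of how the penalty can reverse a comparison. The two -2 log L values and the 9-parameter model are hypothetical numbers invented for this example; only the formula and the parameter count of 27 come from the slides.

```python
def aic(neg2_log_l, n_params):
    """AIC = -2 log L + 2 * number of parameters."""
    return neg2_log_l + 2 * n_params


# Hypothetical fits, for illustration only: the larger model has a slightly
# lower residual deviance but uses many more parameters.
small_model = {"neg2_log_l": 700.0, "n_params": 9}
large_model = {"neg2_log_l": 690.0, "n_params": 27}

aic_small = aic(**small_model)
aic_large = aic(**large_model)

# Lower AIC is preferred.
preferred = "small" if aic_small < aic_large else "large"
print(f"AIC small: {aic_small:.1f}, AIC large: {aic_large:.1f}, prefer {preferred}")
```

Here the larger model reduces the deviance by 10 but pays an extra penalty of 36, so the smaller model has the lower AIC.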
