Data Analytics for Beginners
Predictive Analytics
September 5-9, 2022
Dr. Roma Mitra Debnath
Associate Professor (Applied Statistics)
Indian Institute of Public Administration
Email: roma.mitra@gmail.com
9/9/2022 1
Roma Mitra IIPA
Regression and Correlation
• Regression analysis is the process of
constructing a mathematical model that can
be used to predict or determine one variable
by another variable.
• Correlation is a measure of the degree of
relatedness of two variables.
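The slide defines correlation only in words. As a minimal sketch, assuming the Pearson product-moment coefficient is the measure intended (the deck does not name one), the degree of relatedness can be computed as follows. The function name `pearson_r` and the toy numbers are illustrative and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of squares and cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data, not taken from the slides
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))  # 0.7746
```

The result always lies between -1 and +1. Values near either end indicate a strong linear relationship, and values near 0 a weak one.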
Regression Analysis
• Only one independent variable, X
• Relationship between X and Y is described by a linear function
• Changes in Y are assumed to be caused by changes in X
Regression Analysis
• Forecasting tool
• Assumes a causal relation between the dependent variable (DV) and the independent variable (IV)
• The relation may be linear or non-linear
• Select a model based on the relation
Causal examples:
• Education (IV) reduces poverty (DV)
• Government expenditure on health and infant mortality rate
• Efficiency of government depends on governance
• Corporate social responsibility and financial performance of the organization
• …
Simple Regression Analysis
• Bivariate (two-variable) linear regression: the most elementary regression model
– Dependent variable: the variable to be predicted, usually called Y
– Independent variable: the predictor or explanatory variable, usually called X
Simple Linear Regression Model

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

• $Y_i$: dependent variable
• $\beta_0$: population Y intercept
• $\beta_1$: population slope coefficient
• $X_i$: independent variable
• $\varepsilon_i$: random error term
• $\beta_0 + \beta_1 X_i$ is the linear component; $\varepsilon_i$ is the random error component
Simple Linear Regression Equation (Prediction Line)

The simple linear regression equation provides an estimate of the population regression line:

$$\hat{Y}_i = b_0 + b_1 X_i$$

• $\hat{Y}_i$: estimated (or predicted) Y value for observation i
• $b_0$: estimate of the regression intercept
• $b_1$: estimate of the regression slope
• $X_i$: value of X for observation i

The individual random error terms $e_i$ have a mean of zero.
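The slide does not say how the estimates $b_0$ and $b_1$ are obtained. Assuming the ordinary least-squares criterion, which chooses the line that minimises the sum of squared differences between observed and predicted Y, the estimates have a closed form. The symbols $\bar{X}$ and $\bar{Y}$ (sample means) and $n$ (number of observations) are introduced here and do not appear on the slide.

```latex
\min_{b_0,\,b_1} \sum_{i=1}^{n} \left( Y_i - b_0 - b_1 X_i \right)^2
\quad\Longrightarrow\quad
b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
b_0 = \bar{Y} - b_1 \bar{X}
```

The second formula shows that the fitted line always passes through the point of means $(\bar{X}, \bar{Y})$.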
Types of Relationships
[Figure: scatter plots of Y against X showing linear and curvilinear relationships, strong and weak relationships, and no relationship]
Assumptions of Regression
• Linearity
– The underlying relationship between X and Y is linear
• Independence of Errors
– Error values are statistically independent
• Normality of Error
– Error values (ε) are normally distributed for any given value of X
• Equal Variance (Homoscedasticity)
– The probability distribution of the errors has constant variance
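These assumptions are usually examined through the residuals (observed Y minus fitted Y). The sketch below is a rough first look only, assuming a least-squares fit and using the house-price sample data from the quiz slide of this deck. The helper names `fit_line`, `residuals` and `durbin_watson` are illustrative. The Durbin-Watson statistic is a standard check for independence of errors, but it is informative only when the rows are in collection (for example, time) order, which the slides do not state. Normality is not tested here; a histogram or normal probability plot of the residuals is the usual tool.

```python
# House-price sample data from the quiz slide of this deck
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price_1000s = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 (assumed estimation method)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1

def residuals(xs, ys):
    """Observed Y minus fitted Y for every observation."""
    b0, b1 = fit_line(xs, ys)
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

def durbin_watson(res):
    """Durbin-Watson statistic; values near 2 suggest uncorrelated errors."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    return num / sum(e ** 2 for e in res)

res = residuals(square_feet, price_1000s)

# Residuals listed by increasing X: curvature hints at non-linearity,
# a fanning-out spread hints at unequal variance.
for x, e in sorted(zip(square_feet, res)):
    print(f"{x:5d}  {e:8.2f}")

print("mean residual:", round(sum(res) / len(res), 6))
print("Durbin-Watson:", round(durbin_watson(res), 2))
```

With only ten observations these checks are indicative at best. A residual plot against X remains the most direct way to judge linearity and equal variance.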
Scatter Diagram
[Figure: scatter plot titled "Sales"; vertical axis from 0 to 1200, horizontal axis from 0 to 70]
Excel Output

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.947662558
R Square             0.898064324
Adjusted R Square    0.881075045
Standard Error       108.7575267
Observations         8

ANOVA
              df    SS             MS          F           Significance F
Regression    1     625246.3024    625246.3    52.86065    0.000344486
Residual      6     70969.19765    11828.2
Total         7     696215.5

              Coefficients    Standard Error    t Stat      P-value     Lower 95%       Upper 95%
Intercept     -46.29180548    64.89096049       -0.71338    0.502402    -205.0742654    112.4906545
Advertising   15.23977165     2.096101053       7.270533    0.000344    10.11079715     20.36874615
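The summary figures in the Excel output are linked by simple identities, so most of them can be rebuilt from the ANOVA sums of squares. The sketch below recomputes them from the numbers reported above as a way of reading the table. The variable names are illustrative, and the formulas are the standard ones for a regression with one predictor.

```python
import math

# Figures copied from the Excel SUMMARY OUTPUT
n = 8                              # Observations
ss_regression = 625246.3024        # SS, Regression row
ss_residual = 70969.19765          # SS, Residual row
ss_total = 696215.5                # SS, Total row
df_regression, df_residual = 1, 6
slope, slope_se = 15.23977165, 2.096101053   # Advertising row

r_square = ss_regression / ss_total                         # share of variation explained
multiple_r = math.sqrt(r_square)                            # |correlation| with one predictor
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_residual   # penalised for model size
ms_residual = ss_residual / df_residual                     # MS, Residual row
standard_error = math.sqrt(ms_residual)                     # standard error of the estimate
f_stat = (ss_regression / df_regression) / ms_residual      # F = MS Regression / MS Residual
t_slope = slope / slope_se                                  # t Stat for Advertising

print(f"R Square          {r_square:.6f}")
print(f"Multiple R        {multiple_r:.6f}")
print(f"Adjusted R Square {adj_r_square:.6f}")
print(f"Standard Error    {standard_error:.4f}")
print(f"F                 {f_stat:.4f}")
print(f"t Stat (slope)    {t_slope:.4f}")
```

Read this way, about 89.8% of the variation in the dependent variable is explained by Advertising. With a single predictor the F statistic equals the square of the slope's t statistic, which is why Significance F and the slope's P-value agree.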
Sample Data for House Price Model: Quiz

House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700
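As a worked illustration of the prediction line on this sample, the sketch below fits the house-price data by least squares (an assumption; the slide does not state the method) and reports the intercept, slope and R Square. The function name `fit_simple_regression` and the 2,000 square feet query are hypothetical choices made for illustration.

```python
# Sample data from the slide: price in $1000s (Y) and square feet (X)
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price_1000s = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

def fit_simple_regression(xs, ys):
    """Return (b0, b1, r_square) for the least-squares line Y-hat = b0 + b1 * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx                       # slope estimate
    b0 = mean_y - b1 * mean_x            # intercept estimate
    r_square = sxy ** 2 / (sxx * syy)    # proportion of variation explained
    return b0, b1, r_square

b0, b1, r_square = fit_simple_regression(square_feet, price_1000s)
print(f"Predicted price = {b0:.3f} + {b1:.5f} * square feet")
print(f"R Square = {r_square:.4f}")

# Hypothetical query inside the observed range of X (1100 to 2450 sq ft)
predicted = b0 + b1 * 2000
print(f"Predicted price for 2000 sq ft: {predicted:.2f} (in $1000s)")
```

The slope of roughly 0.11 means each additional square foot is associated with about $110 more in price on average. Predictions are best kept within the observed range of X, since the fit says nothing about houses far smaller or larger than those sampled.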
• The duration of QUIZ is 20 minutes. There are 20
questions.
• Filling up the Feedback form is mandatory.
• Please use the link ( shared on Email, Whatsapp
and Chat box) and fill it only once.
• Certificates will be uploaded on IIPA Portal on
Tuesday (September 13, 2022) . Please check in
the evening (Indian Standard Time).
