Data Analytics: The Use of Statistical Methods
Univariate and Multivariate Regressions
Hasan Dwi Cahyono1
1Informatika
Universitas Sebelas Maret (UNS) Surakarta
Business Intelligence Course, February 2023
Cahyono, Hasan D. (UNS) Informatika UNS BI 2023 1 / 23
Table of Contents
1 One Estimator (Univariate)
2 Multiple Linear Regression (Multivariate)
3 Reference
The coefficient estimate: How to measure the accuracy?
After obtaining β̂0 and β̂1 (previous meeting), we need to know how good
our estimates are. To make this assessment, let's plot the data:
Figure 1: The linear function is Y = 10 + 5X + ε
Each of those lines provides a reasonable estimate.
So, which line provides the best estimate?
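To see this concretely, here is a minimal NumPy sketch (our own illustration, not from the slides; all names are ours) that draws ten samples from the population Y = 10 + 5X + ε and fits a least-squares line to each. The ten fitted lines scatter around the true one, which is exactly the situation in Figure 1:

```python
import numpy as np

rng = np.random.default_rng(0)
beta0_true, beta1_true, sigma = 10.0, 5.0, 2.0

def fit_ols(x, y):
    """Least-squares estimates of intercept and slope."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

# Ten estimated lines, each fitted to a fresh sample from the same population
x = np.linspace(0, 10, 50)
estimates = []
for _ in range(10):
    y = beta0_true + beta1_true * x + rng.normal(0, sigma, size=x.size)
    estimates.append(fit_ols(x, y))

# Each estimate scatters around the true (population) coefficients
b0s, b1s = zip(*estimates)
print(np.mean(b0s), np.mean(b1s))
```

Averaging the ten estimates already lands close to (10, 5), which previews the unbiasedness argument on the next slide.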
We can see from the previous plot that the true (population)
regression line (red) is surrounded by several (ten) estimates.
The ideal comparison would be against the real-world (population) data,
which is usually difficult to obtain.
We can start by estimating the true mean µ using the sample mean µ̂.
This works because µ̂ does not systematically over- or underestimate µ
(it is unbiased).
So, averaging µ̂ over a sufficiently large number of samples, the
average converges to µ:
Var(µ̂) = SE(µ̂)² = σ²/n   (1)
where σ² = Var(ε). By the Law of Large Numbers (LLN), not covered in
this course, the standard error decreases as n grows, so µ̂ gets closer
to µ.
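Equation (1) can be checked empirically. The sketch below (an assumption-laden illustration: µ, σ, and n are arbitrary choices of ours) draws many samples of size n, records each sample mean, and compares the empirical variance of those means with σ²/n:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n = 3.0, 2.0, 100

# Draw many samples of size n and record the sample mean of each
means = np.array([rng.normal(mu, sigma, n).mean() for _ in range(20000)])

empirical_var = means.var()
theoretical_var = sigma**2 / n   # Var(mu_hat) = sigma^2 / n, Eq. (1)
print(empirical_var, theoretical_var)
```

The two printed numbers agree to a few decimal places, as Eq. (1) predicts.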
Measuring how close β̂0 and β̂1 are to β0 and β1
To make estimating the intercept β0 and slope β1 more convenient, we
will assume that X is fixed (not random). Remember that:
β̂1 = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)²   (2)
So, we can have:
SE(β̂1)² = Var( Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)² )   (3)
As
Σᵢ (xᵢ − x̄) ȳ = ȳ Σᵢ (xᵢ − x̄) = ȳ ( Σᵢ xᵢ − nx̄ ) = ȳ(nx̄ − nx̄) = 0   (4)
Therefore, we have:
Σᵢ (xᵢ − x̄)(yᵢ − ȳ) = Σᵢ (xᵢ − x̄) yᵢ   (5)
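The identity in Eqs. (4) and (5) is easy to verify numerically. This short sketch (our illustration; the data are arbitrary random draws) checks that the ȳ term really vanishes:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=30)
y = rng.normal(size=30)

lhs = np.sum((x - x.mean()) * (y - y.mean()))
rhs = np.sum((x - x.mean()) * y)   # the y_bar term vanishes, Eqs. (4)-(5)
print(lhs, rhs)
```

Both sums print the same value up to floating-point rounding.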
Measuring how close β̂0 and β̂1 are to β0 and β1
Var(β̂1) = Var( Σᵢ (xᵢ − x̄) yᵢ / Σᵢ (xᵢ − x̄)² )
         = [ 1 / ( Σᵢ (xᵢ − x̄)² )² ] Var( Σᵢ (xᵢ − x̄) yᵢ )
         = [ 1 / ( Σᵢ (xᵢ − x̄)² )² ] Σᵢ (xᵢ − x̄)² Var(yᵢ)
         = [ 1 / ( Σᵢ (xᵢ − x̄)² )² ] Σᵢ (xᵢ − x̄)² σ²
         = [ 1 / ( Σᵢ (xᵢ − x̄)² )² ] σ² Σᵢ (xᵢ − x̄)²
         = σ² / Σᵢ (xᵢ − x̄)²   (6)
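Equation (6) can also be checked by simulation. In this sketch (our own; the true coefficients and the fixed X grid are arbitrary assumptions), X is held fixed across replications, exactly as the slide assumes, and only the errors are redrawn:

```python
import numpy as np

rng = np.random.default_rng(3)
beta0, beta1, sigma = 10.0, 5.0, 2.0
x = np.linspace(0, 10, 40)   # X held fixed (not random) across replications

slopes = []
for _ in range(20000):
    y = beta0 + beta1 * x + rng.normal(0, sigma, x.size)
    slopes.append(np.sum((x - x.mean()) * (y - y.mean()))
                  / np.sum((x - x.mean()) ** 2))

empirical = np.var(slopes)
theoretical = sigma**2 / np.sum((x - x.mean()) ** 2)   # Eq. (6)
print(empirical, theoretical)
```

The empirical variance of the 20,000 slope estimates matches σ²/Σ(xᵢ − x̄)² to within simulation noise.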
Measuring how close β̂0 and β̂1 are to β0 and β1
Let's start with the definition of β̂0:
β̂0 = ȳ − β̂1 x̄   (7)
So,
Var(β̂0) = Var(ȳ − β̂1 x̄)
         = Var(ȳ) + Var(−β̂1 x̄)
         = Var(ȳ) + (−x̄)² Var(β̂1)
         = Var(ȳ) + x̄² σ² / Σᵢ₌₁ⁿ (xᵢ − x̄)²
         = Var( (1/n) Σᵢ₌₁ⁿ yᵢ ) + x̄² σ² / Σᵢ₌₁ⁿ (xᵢ − x̄)²
         = Var( (1/n) Σᵢ₌₁ⁿ (β0 + β1xᵢ + εᵢ) ) + x̄² σ² / Σᵢ₌₁ⁿ (xᵢ − x̄)²   (8)
As the εᵢ are independent (uncorrelated),
Σᵢ₌₁ⁿ Var(β0 + β1xᵢ + εᵢ) = Σᵢ₌₁ⁿ Var(εᵢ)
Measuring how close β̂0 and β̂1 are to β0 and β1
Continuing from the definition of β̂0:
β̂0 = ȳ − β̂1 x̄   (9)
So,
Var(β̂0) = (1/n²) Σᵢ₌₁ⁿ Var(εᵢ) + x̄² σ² / Σᵢ₌₁ⁿ (xᵢ − x̄)²
         = (1/n²) Σᵢ₌₁ⁿ σ² + x̄² σ² / Σᵢ₌₁ⁿ (xᵢ − x̄)²
         = (1/n) σ² + x̄² σ² / Σᵢ₌₁ⁿ (xᵢ − x̄)²
         = σ² [ 1/n + x̄² / Σᵢ₌₁ⁿ (xᵢ − x̄)² ]   (10)
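Equation (10) can be verified the same way as Eq. (6). In this sketch (our illustration, with the same arbitrary simulation settings as before), the intercept is re-estimated over many replications with X held fixed:

```python
import numpy as np

rng = np.random.default_rng(4)
beta0, beta1, sigma = 10.0, 5.0, 2.0
x = np.linspace(0, 10, 40)
n = x.size

intercepts = []
for _ in range(20000):
    y = beta0 + beta1 * x + rng.normal(0, sigma, n)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    intercepts.append(y.mean() - b1 * x.mean())   # Eq. (9)

empirical = np.var(intercepts)
theoretical = sigma**2 * (1 / n + x.mean()**2
                          / np.sum((x - x.mean()) ** 2))   # Eq. (10)
print(empirical, theoretical)
```

Again the empirical variance of β̂0 agrees with the closed form to within simulation noise.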
Figure 2: Confidence interval (Source: Quora).
The range of β̂0 and β̂1: Confidence Intervals
The common choice is a 95% confidence interval for both β̂0 and β̂1,
denoted as:
β̂1 ± 2 · SE(β̂1)
β̂0 ± 2 · SE(β̂0)   (11)
We can also use the Standard Error (SE) to perform hypothesis tests:
H0 : There is no relationship between X and Y, i.e. β1 = 0
(null hypothesis)
H1 : There is a relationship between X and Y, i.e. β1 ≠ 0
(alternate hypothesis)
Figure 3: The difference between the normal distribution and the
t-distribution
Typically, we test the null hypothesis rather than the alternate
hypothesis. In this regard, we measure how far β̂1 is from 0; we need
this to decide whether β1 ≠ 0.
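The intervals in Eq. (11) can be computed directly. This is a minimal NumPy sketch on synthetic data (our own; note that σ² is estimated from the residuals with n − 2 degrees of freedom, a detail the slide leaves implicit):

```python
import numpy as np

rng = np.random.default_rng(5)
beta0, beta1, sigma = 10.0, 5.0, 2.0
x = np.linspace(0, 10, 40)
y = beta0 + beta1 * x + rng.normal(0, sigma, x.size)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()

# Estimate sigma^2 from the residuals (n - 2 degrees of freedom)
resid = y - (b0 + b1 * x)
sigma2_hat = np.sum(resid**2) / (x.size - 2)

se_b1 = np.sqrt(sigma2_hat / sxx)                               # from Eq. (6)
se_b0 = np.sqrt(sigma2_hat * (1 / x.size + x.mean()**2 / sxx))  # from Eq. (10)

ci_b1 = (b1 - 2 * se_b1, b1 + 2 * se_b1)   # Eq. (11)
ci_b0 = (b0 - 2 * se_b0, b0 + 2 * se_b0)
print(ci_b1, ci_b0)
```

For this simulated data set the interval for β̂1 covers the true slope 5, as a 95% interval should roughly 95% of the time.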
t-statistic: measuring how far β̂1 is from zero
We can use the t-distribution as:
t = (β̂1 − 0) / SE(β̂1)   (12)
When there is no relationship between X and Y, t will follow a
t-distribution.
We can use the t-distribution to compute the probability of observing
any value at least as extreme as |t|.
This probability is known as the p-value; a common threshold is α = 0.05.
Example: suppose a two-tailed test gives t = -1.608761 with a
corresponding p-value of 0.121926. Since 0.121926 > 0.05, we fail to
reject the null hypothesis: we do not have enough evidence to conclude
that X has a relationship with Y. Note that failing to reject H0 does
not mean we accept the alternate hypothesis; we simply lack evidence
for it.
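Equation (12) can be computed by hand. This sketch (our illustration; the data deliberately have no true relationship, and instead of an exact p-value lookup it uses the rough |t| > 2 cutoff implied by the 95% interval of Eq. (11)) shows the full pipeline from data to test statistic:

```python
import numpy as np

rng = np.random.default_rng(6)
x = np.linspace(0, 10, 25)
y = 3.0 + 0.0 * x + rng.normal(0, 2.0, x.size)   # no true relationship

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)
se_b1 = np.sqrt(np.sum(resid**2) / (x.size - 2) / sxx)

t = (b1 - 0) / se_b1          # Eq. (12)
# Compare |t| against the ~2 cutoff from Eq. (11); a full analysis would
# look up the exact two-tailed p-value in a t-distribution with n-2 df
reject_h0 = abs(t) > 2.0
print(t, reject_h0)
```

Since the data were generated with β1 = 0, |t| usually falls below the cutoff and we fail to reject H0, matching the slide's example.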
From simple to multiple linear regression
Univariate regression works well when the data needs only a single
predictor. In real-world applications, however, more variables are
commonly involved. Thus, we need to extend the simple univariate
regression to involve more predictors. We can write the multivariate
regression as:
Y = β0 + β1X1 + β2X2 + … + βpXp + ε   (13)
Just as with simple univariate regression, we first need to estimate
the regression coefficients:
ŷ = β̂0 + β̂1X1 + β̂2X2 + … + β̂pXp   (14)
with the goal of minimizing the residual sum of squares (RSS):
RSS = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² = Σᵢ₌₁ⁿ (yᵢ − β̂0 − β̂1xᵢ1 − β̂2xᵢ2 − … − β̂pxᵢp)²   (15)
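Minimizing the RSS of Eq. (15) is exactly what a least-squares solver does. This sketch (our own; the coefficient values and sample size are arbitrary assumptions) builds a design matrix with an intercept column and recovers the coefficients:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 200, 3
X = rng.normal(size=(n, p))
beta = np.array([1.0, 2.0, -1.0, 0.5])   # true intercept + 3 slopes
y = beta[0] + X @ beta[1:] + rng.normal(0, 1.0, n)

# Design matrix with a leading column of ones for the intercept
Xd = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(Xd, y, rcond=None)  # minimizes RSS, Eq. (15)

rss = np.sum((y - Xd @ beta_hat) ** 2)
print(beta_hat, rss)
```

The recovered β̂ lands close to the true β, and the printed RSS is the minimized value of Eq. (15).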
Finding a relationship between Response and Predictors
We need to test the null hypothesis
H0 : β1 = β2 = … = βp = 0   (16)
against
H1 : at least one βj is non-zero   (17)
The hypothesis test is performed by calculating the F-statistic
F = [ (TSS − RSS)/p ] / [ RSS/(n − p − 1) ]   (18)
where TSS = Σᵢ₌₁ⁿ (yᵢ − ȳ)² is the total sum of squares.
We can consider TSS as the amount of total variability in the
response variable before any model is fitted to it.
On the other hand, RSS is computed after fitting a model, and
measures the amount of unexplained variance remaining in the data.
Assuming that the linear model is correct, we can show that:
E{RSS/(n − p − 1)} = σ²   (19)
and, when H0 is true,
E{(TSS − RSS)/p} = σ²   (20)
In brief, if H0 were true, we would expect the regression coefficients
to be 0.
If H0 were true, the unexplained variance of the model would be
approximately equal to the total variance, so the numerator and
denominator of the F-statistic would be roughly equal.
Hence, when there is no relationship between the response and the
predictors, the F-statistic takes a value close to 1.
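Equation (18) is a one-liner once TSS and RSS are in hand. In this sketch (our illustration; only the first predictor actually matters, by construction), the strong signal pushes F far above 1, so H0 would clearly be rejected:

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 200, 3
X = rng.normal(size=(n, p))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(0, 1.0, n)   # only X1 matters

Xd = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(Xd, y, rcond=None)

tss = np.sum((y - y.mean()) ** 2)
rss = np.sum((y - Xd @ beta_hat) ** 2)
F = ((tss - rss) / p) / (rss / (n - p - 1))   # Eq. (18)
print(F)
```

If the signal were removed (β1 = 0 as well), the same computation would hover around 1, matching the slide's final point.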
Selecting the important variables
In a multiple regression, computing the F-statistic is the first step;
it tells us whether at least one of the predictors is related to the
response Y.
The next step is variable selection. In principle, this means fitting
and comparing regression models over all combinations of predictors,
of which there are 2ᵖ.
A common approach involves three different strategies:
Forward selection: first, we start with a model without any predictor
(the null model). Then, we fit p simple linear regressions, adding one
variable to the null model at a time, and keep the model with the
lowest RSS. Next, we add another variable, again choosing the one that
gives the lowest RSS. We continue this step until a predetermined
stopping rule is fulfilled.
Backward selection: we start with all the predictors in the model and
remove the variable with the largest p-value. Then we refit with the
remaining (p − 1) predictors and repeat this elimination step until a
stopping criterion is met (typically a p-value threshold).
Mixed selection: we start with forward selection, adding predictors
one at a time. But when an added predictor's p-value rises above a
threshold, we remove it. This process continues forward and backward
until we find a combination of predictors that all have relatively low
p-values.
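The forward-selection strategy above can be sketched in a few lines. This is a toy version (our own, not the slides' code; the stopping rule is simply "stop at 2 predictors", where a real implementation would use a p-value or RSS-based rule), with only predictors X1 and X3 truly related to y:

```python
import numpy as np

rng = np.random.default_rng(9)
n, p = 300, 5
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 1] - 1.5 * X[:, 3] + rng.normal(0, 1.0, n)

def rss_of(cols):
    """RSS of an OLS fit on the chosen predictor columns (plus intercept)."""
    Xd = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return np.sum((y - Xd @ beta) ** 2)

# Forward selection: grow from the null model, each round adding the
# candidate predictor that yields the lowest RSS (toy stop rule: 2 vars)
selected = []
while len(selected) < 2:
    candidates = [j for j in range(p) if j not in selected]
    best = min(candidates, key=lambda j: rss_of(selected + [j]))
    selected.append(best)

print(selected)
```

The procedure picks out columns 1 and 3, the two predictors that were actually used to generate y.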
Model Fitting
To decide the best-fitting model, we need a measure. The residual
standard error (RSE) and R² are the common metrics.
RSE = √( RSS / (n − 2) ) = √( (1/(n − 2)) Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² )   (21)
R² = 1 − (Unexplained Variation / Total Variation)   (22)
As an exercise, try deriving the RSE and R² formulas yourself.
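Both metrics follow directly from RSS and TSS. This sketch (our illustration on synthetic simple-regression data, where the n − 2 divisor of Eq. (21) applies) computes them side by side:

```python
import numpy as np

rng = np.random.default_rng(10)
x = np.linspace(0, 10, 60)
y = 10.0 + 5.0 * x + rng.normal(0, 2.0, x.size)   # true noise sd = 2

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

rss = np.sum((y - y_hat) ** 2)          # unexplained variation
tss = np.sum((y - y.mean()) ** 2)       # total variation

rse = np.sqrt(rss / (x.size - 2))       # Eq. (21)
r2 = 1 - rss / tss                      # Eq. (22)
print(rse, r2)
```

The RSE comes out near the true noise standard deviation of 2, and R² is close to 1 because the linear signal dominates the noise.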
RSE
We can use the RSE to measure how poorly our model fits the data.
When the RSE is small relative to the actual response values, we can
say the model fits the data quite well, and vice versa.
However, you need a general understanding of the data to judge what
counts as small; on its own, the RSE does not carry much information.
So, use the RSE wisely.
R²
Closer to 1 means a better fit.
You need to think carefully when using R², since adding more
variables never decreases it. Adding variables that increase R² only
marginally can lead to over-fitting.
Predictions
For any given model, we should consider three different things:
There will be inaccuracy when we estimate β̂0, β̂1, …, β̂p for the
true coefficients β0, β1, …, βp. This inaccuracy is known as the
reducible error. We can use a confidence interval to see how close Ŷ
is to the true regression function f(X).
Model bias can occur when we use a linear approximation to the true
surface f(X).
Even if we knew the true function f(X), we would still have the random
error ε; this part is unavoidable. We can use prediction intervals to
estimate how far Y is from Ŷ. Prediction intervals are always wider
than confidence intervals.
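The last point can be made concrete for simple regression. In this sketch (our own; it uses the standard textbook standard-error formulas at a hypothetical prediction point x0, which are assumptions of the illustration rather than anything stated on the slide), the prediction-interval standard error carries an extra "+1" term for the irreducible error ε, so it is always the wider of the two:

```python
import numpy as np

rng = np.random.default_rng(11)
x = np.linspace(0, 10, 50)
y = 10.0 + 5.0 * x + rng.normal(0, 2.0, x.size)
n = x.size

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
sigma2_hat = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)

x0 = 5.0                                   # hypothetical point to predict at
h = 1 / n + (x0 - x.mean()) ** 2 / sxx     # leverage of x0

se_mean = np.sqrt(sigma2_hat * h)          # confidence interval: Y_hat vs f(x0)
se_pred = np.sqrt(sigma2_hat * (1 + h))    # prediction interval: Y_hat vs Y
print(se_mean, se_pred)
```

The printed se_pred always exceeds se_mean, which is exactly why prediction intervals are wider than confidence intervals.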
Reference
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2016). An
introduction to statistical learning (Vol. 6). New York: Springer.
https://dionysus.psych.wisc.edu/iaml/unit-03.html