Analisa Regresi:
Uji Asumsi Klasik
ARIF RAHMAN
1
Statistika
Statistics is the branch of mathematics that studies scientific methods for collecting, organizing, summarizing, simplifying, presenting, interpreting, analyzing, and synthesizing data (numerical or non-numerical) to produce information and/or conclusions that support problem solving and/or decision making.
2
Statistika
3
Collect data → organize, summarize, simplify, present, interpret, analyze, synthesize → produce information and/or conclusions → generalize, estimate, test hypotheses, assess relationships, predict → solve problems and make decisions.
Statistika Inferensia
Inferential statistics is the branch of statistics that analyzes or synthesizes data to generalize from a sample to a population, estimate parameters, test hypotheses, assess relationships, and make predictions, in order to produce information and/or conclusions.
There are many statistical tools that can be used to draw inferences about the population or system from which the sample data originate.
4
Statistika Inferensia
5
The objective of the study concerns the population; observation or experimentation is carried out on a sample.
SAMPLING / INFERENCE
Parameters: N (population size), μ (population mean), σ (population standard deviation), π (population proportion)
Statistics: n (sample size), x̄ (sample mean), s (sample standard deviation), p (sample proportion)
Tipe Data
Nominal data: values that are merely labels (even when numeric) used to distinguish categories, without indicating any ordering.
Ordinal data: values that indicate an ordering, but without a fixed, well-defined scale between levels.
Interval data: values that indicate an ordering on a defined scale; the zero point only marks a reference (baseline).
Ratio data: values that indicate an ordering on a scale that supports ratio comparisons; the zero point marks a true origin (a null quantity).
6
Tipe Data
Parametric data: quantitative data whose random variation follows a probability distribution with specific parameters (independent and identically distributed random variables).
Nonparametric data: data with no assumed probability distribution (distribution-free).
7
Tipe Data
Discrete data: data obtained by counting, so they are usually whole numbers.
Continuous data: data obtained by measurement, which may take any real value (although they can be rounded).
8
Statistika Alat Bantu Problem Solving
9
It is important to pay attention to how the data to be processed are obtained.
Likewise, how the data are processed also deserves careful attention.
Statistika Alat Bantu Problem Solving
10
Statistical methods are not magic potions.
Statistical tools are not magic wands.
Ketelitian &
Tipe Kesalahan
11
Akurasi dan Presisi
Accuracy: the closeness of a measurement to the true value of the object (small bias).
Precision: the fineness of the measuring instrument's scale, or the degree to which repeated measurements cluster together (small variance or deviation).
12
Akurat dan Presisi
Not precise: the sample spread is wider than the population spread, causing a large deviation.
Not accurate: the sample's center is shifted away from the population's center, causing a large bias.
Accurate and precise: small bias and small deviation, requiring only a small sample.
13
Kesalahan Pengambilan Kesimpulan
Type I error (α): the error of rejecting a hypothesis that should have been accepted.
Type II error (β): the error of accepting a hypothesis that should have been rejected.
14
Kesalahan Pengambilan Kesimpulan
15
The true state of nature
Decision | H0 is true | H0 is false
Reject H0 | Type I error (α) | Correct decision (1 – β)
Fail to reject H0 | Correct decision (1 – α) | Type II error (β)
Ukuran Ketelitian Pendugaan
Significance level (α): the probability of rejecting an observation because it deviates significantly from the target.
Confidence level (confidence coefficient, 1 – α): the percentage of observations believed not to differ significantly from the target.
Statistical power (1 – β): the percentage of observations believed to differ significantly from the target.
Degrees of freedom (df = n – k): the number of values among the n observations that remain free to vary (unconstrained by the estimates).
16
Kekeliruan pada Analisa Regresi
• There is no line of reasoning or logical ground behind the hypothesis that the explanatory variable influences the response variable.
• The description of the explanatory variable has no causal relationship with the description of the response variable.
• Measurement or data collection is carried out by, or obtained from, parties with a conflict of interest or without authority over the data.
• The explanatory and/or response variables are measured on only one object, whose value is single or static.
• The range of the sample data is very narrow, yet it is used to induce a very wide population range by extrapolation.
17
Kekeliruan pada Analisa Regresi
• Explanatory variable (x) = height of children in village A (growth per year); response variable (y) = gold price (increase per year). Even if, when computed, the growth in children's height in village A shows a strong correlation with the increase in the gold price.
• Explanatory variable (x) = work motivation; response variable (y) = performance. The description of "work motivation" is gaining recognition from the extended family for being a factory employee. The description of "performance" is completing work faster than the targeted deadline.
• Data measured: the number of violations committed, asked of the violators themselves, or of parties who have never seen the violators.
• Data measured: the wage rules of a single company at a single point in time.
• The sample data lie in the range 10 < x < 50, yet they are used to predict the value of Y at x = 200.
18
Kekeliruan pada Analisa Regresi
19
Analisa Regresi
20
Perbedaan Korelasi dan Regresi
21
Correlation Regression
Perbedaan Korelasi dan Regresi
22
Correlation Regression
Analisa Regresi
23
Analisa Regresi
24
Analisa Regresi
25
Analisa Regresi
26
Analisa Regresi
27
Analisa Regresi
28
Analisa Regresi Linier Sederhana
29
Analisa Regresi Linier Sederhana
30
Analisa Regresi Berganda
31
Analisa Regresi Berganda
32
Analisa Regresi Berganda
33
Asumsi Klasik
(classical linear regression
model assumptions)
34
The Gauss-Markov Theorem
Given Classical Assumptions, the ordinary least
squares (OLS) estimator βk is the minimum variance
estimator from among the set of all linear unbiased
estimators of βk.
In other words, OLS is BLUE
Best Linear Unbiased Estimator
Where Best = Minimum Variance
35
Asumsi Klasik dalam Regresi
1. The regression model is linear, y = Xβ + ε, correctly
specified and has an additive error term or residual, defined
by ei = yi – ŷi. → linearity
2. The number of observations must be greater than the
number of parameters to be estimated, n > k.
3. There must be variability in data value for the response
variable and explanatory variable(s).
4. The X matrix is non-stochastic. X values are fixed in
repeated sampling.
5. All explanatory variables are uncorrelated with the error
term, Cov(xi, ei) = 0.
36
Asumsi Klasik dalam Regresi
6. Observations of the error term are uncorrelated with each other. → no serial correlation or autocorrelation
7. No explanatory variable is a perfect linear function of any other explanatory variables. → no perfect multicollinearity
8. The error term is normally distributed, ~N(0, σ²). → normality
9. The error term has a zero population mean, E(ei) = 0. → unbiased estimator
10. The error term has a constant variance, Var(ei) = σ² < ∞. → homoskedasticity or no heteroskedasticity
37
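The checks in the following sections all start from a fitted OLS model and its residuals. Below is a minimal sketch using statsmodels; the file name and the column names y, x1, x2, x3 are assumptions, not part of the original slides.

```python
import pandas as pd
import statsmodels.api as sm

# Placeholder data set: a CSV with a response column y and regressors x1, x2, x3
df = pd.read_csv("data.csv")
X = sm.add_constant(df[["x1", "x2", "x3"]])   # design matrix with an intercept
y = df["y"]

model = sm.OLS(y, X).fit()   # ordinary least squares fit
e = model.resid              # residuals ei = yi - y_hat_i, reused by the tests below
print(model.summary())
```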
Uji Asumsi Klasik :
Linearity
38
Uji Asumsi Klasik: linearity
39
Uji Asumsi Klasik: linearity
1. Select baseline set of levels for each explanatory
variables (x10, x20, ..., xk0).
2. Successively estimate revised response variable (y'i•)
for each explanatory variable (xi•) with the other
explanatory variables held constant at the baseline
level (xj• = xj0 for j ≠ i).
3. Plot the data (xi•, y'i•) and the regression line with the
other explanatory variables held constant.
40
y'i• = y• – β1(x1• – x10) – β2(x2• – x20) – ... – βk(xk• – xk0)
(the term for the i-th explanatory variable itself is omitted, i.e. y'i• = y• – Σj≠i βj(xj• – xj0))
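A minimal sketch of this adjusted-response procedure, assuming the fitted statsmodels result model and the DataFrame df from the earlier sketch; the baseline levels in baseline are the analyst's own choice.

```python
import numpy as np
import matplotlib.pyplot as plt

cols = ["x1", "x2", "x3"]                     # assumed names of the explanatory variables
baseline = {"x1": 6.0, "x2": 6.0, "x3": 9.0}  # baseline levels x_j0 chosen by the analyst
beta = model.params                           # fitted coefficients (const, x1, x2, x3)

for xi in cols:
    # y'_i = y - sum_{j != i} beta_j * (x_j - x_j0): strip the estimated effect of the
    # other regressors, holding them at their baseline levels
    others = [c for c in cols if c != xi]
    y_adj = df["y"] - sum(beta[c] * (df[c] - baseline[c]) for c in others)

    plt.scatter(df[xi], y_adj, label="adjusted data")
    grid = np.linspace(df[xi].min(), df[xi].max(), 100)
    line = beta["const"] + beta[xi] * grid + sum(beta[c] * baseline[c] for c in others)
    plt.plot(grid, line, label="regression line (others at baseline)")
    plt.xlabel(xi); plt.ylabel("adjusted y"); plt.legend(); plt.show()
```

A roughly linear scatter around each line supports the linearity assumption for that regressor.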
Uji Asumsi Klasik: linearity
41
Uji Asumsi Klasik: linearity
42
Uji Asumsi Klasik: linearity
43
Example 1
44
Example 1
45
Response Surface Methodology
ŷ = 39.1574 + 1.0161·x1 – 1.8616·x2 – 0.3433·x3
Example 1
1. Select baseline set of levels for each explanatory
variables (x10, x20, x30).
x10 = 6
x20 = 6
x30 = 9
46
Example 1
2. Estimate revised response variable (y'i•)
47
x1 x2 x3 y x10 x20 x30 y'1 y‘2 y‘3
1,74 5,30 10,8 25,5 6 6 9 24,815 30,446 29,588
6,32 5,42 9,4 31,2 6 6 9 30,258 31,012 30,676
6,22 8,41 7,2 25,9 6 6 9 29,769 25,059 26,504
10,52 4,63 8,5 38,4 6 6 9 35,678 33,636 33,337
1,19 11,60 9,4 18,4 6 6 9 28,963 23,425 25,210
1,22 5,85 9,9 26,7 6 6 9 26,730 31,866 31,505
4,10 6,62 8,0 26,4 6 6 9 27,211 27,987 28,543
6,32 8,72 9,1 25,9 6 6 9 30,998 25,609 26,509
4,08 4,42 8,7 32,0 6 6 9 28,956 33,848 33,409
4,15 7,60 9,2 25,2 6 6 9 28,247 27,148 27,629
10,15 4,83 9,4 39,7 6 6 9 37,659 35,620 35,082
1,72 3,12 7,6 35,7 6 6 9 29,858 39,568 39,060
1,70 5,30 8,2 26,5 6 6 9 24,922 30,595 30,629
59,43 81,82 115,4 377,5
Example (row 4, with x1 as the variable of interest):
y'1,4 = y4 – β2(x2,4 – x20) – β3(x3,4 – x30) = 38.4 – (–1.8616)(4.63 – 6) – (–0.3433)(8.5 – 9) = 35.678
Example 1
3. Plot the data (xi•, y'i•) and the regression line.
48
Estimated Regression Equation
ŷ = 39.1574 + 1.0161·x1 – 1.8616·x2 – 0.3433·x3
Example 2
49
Example 2
50
Example 2
51
ŷ = 2.26379 + 2.74427·x1 + 0.01253·x2
Example 2
1. Select baseline set of levels for each explanatory
variables (x10, x20).
x10 = 10
x20 = 300
52
Example 2
2. Estimate revised response
variable (y'i•)
53
x1 x2 y x1 x2 y'1 y‘2
2 50 9,95 10 300 13,082 31,904
8 110 24,45 10 300 26,830 29,939
11 120 31,75 10 300 34,005 29,006
10 550 35,00 10 300 31,868 35,000
8 295 25,02 10 300 25,083 30,509
4 200 16,86 10 300 18,113 33,326
2 375 14,38 10 300 13,440 36,334
2 52 9,60 10 300 12,707 31,554
9 100 24,35 10 300 26,856 27,094
8 300 27,50 10 300 27,500 32,989
4 412 17,08 10 300 15,677 33,546
11 400 37,00 10 300 35,747 34,256
12 500 41,95 10 300 39,444 36,461
2 360 11,66 10 300 10,908 33,614
4 205 21,65 10 300 22,840 38,116
4 400 17,89 10 300 16,637 34,356
20 600 69,00 10 300 65,242 41,557
1 585 10,30 10 300 6,730 34,998
10 540 34,93 10 300 31,923 34,930
15 250 46,59 10 300 47,216 32,869
15 290 44,88 10 300 45,005 31,159
16 510 54,12 10 300 51,489 37,654
17 590 56,63 10 300 52,997 37,420
6 100 22,13 10 300 24,636 33,107
5 400 21,15 10 300 19,897 34,871
206 8294 725,82
Example (row 4, with x1 as the variable of interest):
y'1,4 = y4 – β2(x2,4 – x20) = 35.00 – 0.01253(550 – 300) = 31.868
Example 2
3. Plot the data (xi•, y'i•) and the regression line.
54
Estimated Regression Equation
ŷ = 2.26379 + 2.74427·x1 + 0.01253·x2

Uji Asumsi Klasik :
Normality
55
Uji Asumsi Klasik: normality
The error term is normally distributed, ~N(0,σ2), with
a zero population mean, E(ei)= 0, and a constant
variance, Var(ei) = σ2.
Normality must be assumed for parametric hypothesis tests that are based on the Gaussian distribution.
Normality assumption can be tested using goodness
of fit test or the Bera Jarque normality test.
56
Uji Asumsi Klasik: normality
Goodness of Fit Test
• Chi-square goodness of fit test
• Kolmogorov–Smirnov test
• Lilliefors test
• Geary test
• Anderson–Darling test
• Shapiro–Wilk test
• Bayesian information criterion
• Cramér–von Mises criterion
• Akaike information criterion
• Kuiper's test
• Moran test
• Hosmer–Lemeshow test
(a short sketch of a few of these tests follows below)
57
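A minimal sketch of a few of these goodness-of-fit tests applied to the residuals, assuming e = model.resid from the earlier sketch; the scipy functions are standard, the residual variable is ours.

```python
from scipy import stats

sw_stat, sw_p = stats.shapiro(e)   # Shapiro-Wilk
# Kolmogorov-Smirnov against a normal with estimated mean/sd
# (with estimated parameters this is effectively the Lilliefors variant)
ks_stat, ks_p = stats.kstest(e, "norm", args=(e.mean(), e.std(ddof=1)))
ad = stats.anderson(e, dist="norm")  # Anderson-Darling

print(f"Shapiro-Wilk      : W  = {sw_stat:.4f}, p = {sw_p:.4f}")
print(f"Kolmogorov-Smirnov: D  = {ks_stat:.4f}, p = {ks_p:.4f}")
print(f"Anderson-Darling  : A2 = {ad.statistic:.4f}, 5% critical = {ad.critical_values[2]:.4f}")
```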
Uji Asumsi Klasik: normality
The Bera Jarque normality test
A normal distribution is not skewed and is defined to
have a coefficient of kurtosis of 3.
The kurtosis of the normal distribution is 3 so its
excess kurtosis (b2-3) is zero.
Skewness and kurtosis are the (standardised) third
and fourth moments of a distribution.
58
Uji Asumsi Klasik: normality
59
skewness (α3) = m3 / s³
Skewness < 0 : negative or left skew (x̄ < Me < Mo)
Skewness = 0 : symmetric (x̄ = Me = Mo)
Skewness > 0 : positive or right skew (Mo < Me < x̄)
For the normal distribution, skewness = 0.
Uji Asumsi Klasik: normality
60
kurtosis (α4) = m4 / s⁴
Kurtosis > 3 : leptokurtic
Kurtosis = 3 : mesokurtic (normal)
Kurtosis < 3 : platykurtic
For the normal distribution, kurtosis = 3.
Uji Asumsi Klasik: normality
The Bera Jarque normality test
Bera and Jarque test the residuals for normality by testing
whether the coefficient of skewness and the coefficient of
excess kurtosis are jointly zero.
It can be proved that the coefficients of skewness and
kurtosis can be expressed respectively as:
and
The Bera Jarque test statistic is given by
61
b1 = E[u³] / (σ²)^(3/2)    and    b2 = E[u⁴] / (σ²)²
W = T · [ b1²/6 + (b2 – 3)²/24 ] ~ χ²(2)
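A minimal sketch of the Bera-Jarque statistic computed directly from the residuals e (statsmodels also ships a jarque_bera helper; the manual version below just mirrors the formula above).

```python
import numpy as np
from scipy.stats import chi2

e_arr = np.asarray(e)
T = e_arr.size
m2 = np.mean(e_arr**2)                   # second central moment (residual mean is ~0)
b1 = np.mean(e_arr**3) / m2**1.5         # coefficient of skewness
b2 = np.mean(e_arr**4) / m2**2           # coefficient of kurtosis

W = T * (b1**2 / 6 + (b2 - 3)**2 / 24)   # Bera-Jarque statistic ~ chi-square(2) under H0
p_value = chi2.sf(W, df=2)
print(f"W = {W:.3f}, p-value = {p_value:.4f}")  # small p-value => reject normality
```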
Uji Asumsi Klasik :
No Serial Correlation or
Autocorrelation
62
Uji Asumsi Klasik: no autocorrelation
Observations of the error term are uncorrelated with
each other.
Autocorrelation or serial correlation is a violation of
the classical assumption that assumes uncorrelated
observations of the error term.
Issue: Is et related to et-1? Such would be the case in
a time-series when a random shock has an impact
over a number of time periods.
63
Uji Asumsi Klasik: no autocorrelation
It is assumed that the error terms are not interrelated, i.e.
Cov (ei , ej) = 0 for i≠j.
If the covariance of two error terms is not equal to
zero, then autocorrelation exists.
This is essentially the same as saying there is no
pattern in the error term.
If there are patterns in the error term from a model,
we say that they are autocorrelated.
64
Uji Asumsi Klasik: no autocorrelation
65
Uji Asumsi Klasik: no autocorrelation
66
Uji Asumsi Klasik: no autocorrelation
67
Uji Asumsi Klasik: no autocorrelation
Autocorrelation can be categorized into 2 kinds:
Pure autocorrelation (autocorrelation that occurs when the classical assumption, which assumes uncorrelated observations of the error term, is violated).
Impure autocorrelation (autocorrelation that is caused by
specification errors: omitted variables or incorrect
functional form).
Autocorrelation mostly happens in a data set where
order of observations has some meaning (e.g. time-
series data).
68
Uji Asumsi Klasik: no autocorrelation
The most commonly assumed kind of autocorrelation is first-
order autocorrelation, in which the posterior value of the
error term is a function of the prior value of the error term:
et = ρ.et–1 + ut
where:
et = the error term in the model at time period t
ρ = the first-order autocorrelation coefficient depicting the
functional relationship between observations of the error
term.
ut = a classical (not serially correlated) error term
69
Uji Asumsi Klasik: no autocorrelation
The magnitude of ρ indicates the strength of the
autocorrelation or serial correlation:
If ρ is zero, ρ ≈ 0, there is no autocorrelation
As ρ approaches one in absolute value, |ρ| ≈ 1, there is significant
autocorrelation
For ρ to exceed one is unreasonable, since the error term
effectively would “explode”
As a result of this, we can state that:
–1 < ρ < +1
70
Uji Asumsi Klasik: no autocorrelation
ρ < 0 indicates negative autocorrelation (the signs of
the error term switch back and forth).
ρ > 0 indicates positive autocorrelation (a positive
error term tends to be followed by a positive error
term and a negative error term tends to be followed
by a negative error term).
Positive autocorrelation is more common than
negative autocorrelation. Situations where negative
autocorrelation occurs are not often encountered.
71
Uji Asumsi Klasik: no autocorrelation
Examples of higher-order autocorrelation:
1. Seasonal autocorrelation: et = ρ·et–4 + ut
2. Second-order autocorrelation: et = ρ1·et–1 + ρ2·et–2 + ut
3. r-th-order autocorrelation: et = ρ1·et–1 + ρ2·et–2 + ... + ρr·et–r + ut
72
Uji Asumsi Klasik: no autocorrelation
Some tests to detect autocorrelation or serial correlation
The Graphical Run Test
The Durbin Watson Test
The Breusch-Godfrey Test
The Box-Pierce Q Test
The Cumby-Huizinga test
The Ljung-Box Q Test
The Portmanteau Test
The Lagrange Multiplier Test
73
Uji Asumsi Klasik: no autocorrelation
The Durbin-Watson (DW) is a test for first order
autocorrelation - i.e. it assumes that the relationship is
between an error and the previous one
et = ρ.et–1 + ut
where ut ~ N(0, σu²).
The DW test statistic actually tests
H0 : ρ = 0 and H1 : ρ ≠ 0
The test statistic is calculated by
74
DW = Σt=2..T (et – et–1)² / Σt=2..T (et)²
   = [ Σ(et)² + Σ(et–1)² – 2·Σ(et·et–1) ] / Σ(et)²
   ≈ 2(1 – ρ̂)
Uji Asumsi Klasik: no autocorrelation
We can also write
DW ≈ 2(1 – ρ) and
where ρ is the estimated correlation coefficient.
Since ρ is a correlation, it implies that –1 < ρ < 1. Substituting these bounds into DW ≈ 2(1 – ρ) gives 0 < DW < 4.
If ρ ≈ 0, DW ≈ 2. There is little evidence to reject the null hypothesis if DW is
near 2  the error terms are not autocorrelated
If ρ ≈ 1, DW ≈ 0. There is significant evidence to reject the null hypothesis if
DW is near 0  the error terms are positive autocorrelated
If ρ ≈ -1, DW ≈ 4. There is significant evidence to reject the null hypothesis if
DW is near 4  the error terms are negative autocorrelated
75
ρ̂ = Σt=2..T (et·et–1) / Σt=2..T (et)² ≈ 1 – DW/2
Uji Asumsi Klasik: no autocorrelation
• Unfortunately, DW has 2 critical values, an upper critical value (dU) and a lower critical value (dL), and there is also an intermediate region (inconclusive) where we can neither reject nor not reject H0.
• The decision procedure is as follows:
If DW < dL reject H0 : ρ = 0, strong positive autocorrelation
If dL < DW < dU inconclusive, weak positive autocorrelation
If dU < DW < 4-dU do not reject H0 : ρ = 0, no autocorrelation
If 4-dU < DW < 4-dL inconclusive, weak negative autocorrelation
If DW > 4-dL reject H0 : ρ = 0, strong negative autocorrelation
76
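A minimal sketch of the DW statistic and this decision rule, assuming residuals e ordered in time and critical values dL, dU looked up (or interpolated) by the analyst for the given n and k. The denominator below sums et² over t = 2..T as on the slides; statsmodels' durbin_watson sums over all t, which differs only slightly.

```python
import numpy as np

def durbin_watson_decision(e, dL, dU):
    """Compute DW = sum(diff(e)^2) / sum(e_t^2, t>=2) and apply the dL/dU decision bands."""
    e = np.asarray(e)
    dw = np.sum(np.diff(e) ** 2) / np.sum(e[1:] ** 2)
    if dw < dL:
        verdict = "reject H0: strong positive autocorrelation"
    elif dw < dU:
        verdict = "inconclusive (weak positive autocorrelation)"
    elif dw < 4 - dU:
        verdict = "do not reject H0: no autocorrelation"
    elif dw < 4 - dL:
        verdict = "inconclusive (weak negative autocorrelation)"
    else:
        verdict = "reject H0: strong negative autocorrelation"
    return dw, verdict

# Example 3 values from the slides (n = 13, k = 3): dL ~ 0.748, dU ~ 1.778
dw, verdict = durbin_watson_decision(e, dL=0.748, dU=1.778)
print(f"DW = {dw:.3f} -> {verdict}")
```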
Uji Asumsi Klasik: no autocorrelation
77
Conditions which Must be Fulfilled for DW to be a Valid Test
1. Constant term in regression
2. Regressors are non-stochastic
3. No lags of dependent variable
78
Example 3
79
Example 3
80
x1 x2 x3 y y pred e
1,74 5,30 10,8 25,5 27,35 -1,851
6,32 5,42 9,4 31,2 32,26 -1,062
6,22 8,41 7,2 25,9 27,35 -1,450
10,52 4,63 8,5 38,4 38,31 0,090
1,19 11,60 9,4 18,4 15,54 2,855
1,22 5,85 9,9 26,7 26,11 0,592
4,10 6,62 8,0 26,4 28,25 -1,853
6,32 8,72 9,1 25,9 26,22 -0,322
4,08 4,42 8,7 32,0 32,09 -0,088
4,15 7,60 9,2 25,2 26,07 -0,868
10,15 4,83 9,4 39,7 37,25 2,448
1,72 3,12 7,6 35,7 32,49 3,212
1,70 5,30 8,2 26,5 28,20 -1,703
59,43 81,82 115,4 377,5 377,50 0,00
Example 3
81
e(t-1) e(t) (e(t))2 (e(t) – e(t-1)) (e(t) – e(t-1))2
-1,851 -1,062 1,129 0,789 0,623
-1,062 -1,450 2,101 -0,387 0,150
-1,450 0,090 0,008 1,540 2,372
0,090 2,855 8,153 2,765 7,644
2,855 0,592 0,350 -2,263 5,123
0,592 -1,853 3,434 -2,445 5,978
-1,853 -0,322 0,104 1,531 2,345
-0,322 -0,088 0,008 0,234 0,055
-0,088 -0,868 0,753 -0,779 0,608
-0,868 2,448 5,991 3,315 10,991
2,448 3,212 10,317 0,764 0,584
3,212 -1,703 2,901 -4,915 24,160
35,249 0,148 60,633
DW = Σ(et – et–1)² / Σ(et)² = 60.633 / 35.249 = 1.720
Critical values interpolated for n = 13, k = 3 from the n = 15 and n = 20 table entries:
dL ≈ 0.82 + ((13 – 15)/(20 – 15))·(1.00 – 0.82) = 0.748
dU ≈ 1.75 + ((13 – 15)/(20 – 15))·(1.68 – 1.75) = 1.778
Since dL (0.748) < DW (1.720) < dU (1.778), the test is inconclusive (weak positive autocorrelation).
Example 3
82
Example 4
83
Example 4
84
Example 4
85
Example 4
86
x1 x2 y ypred e
2 50 9,95 8,379 1,571
8 110 24,45 25,596 -1,146
11 120 31,75 33,954 -2,204
10 550 35,00 36,597 -1,597
8 295 25,02 27,914 -2,894
4 200 16,86 15,746 1,114
2 375 14,38 12,450 1,930
2 52 9,60 8,404 1,196
9 100 24,35 28,215 -3,865
8 300 27,50 27,976 -0,476
4 412 17,08 18,402 -1,322
11 400 37,00 37,462 -0,462
12 500 41,95 41,459 0,491
2 360 11,66 12,262 -0,602
4 205 21,65 15,809 5,841
4 400 17,89 18,252 -0,362
20 600 69,00 64,666 4,334
1 585 10,30 12,337 -2,037
10 540 34,93 36,472 -1,542
15 250 46,59 46,560 0,030
15 290 44,88 47,061 -2,181
16 510 54,12 52,561 1,559
17 590 56,63 56,308 0,322
6 100 22,13 19,982 2,148
5 400 21,15 20,996 0,154
206 8294 725,82 725,820 0,000
Example 4
87
e(t-1) e(t) (e(t))2 (e(t) – e(t-1)) (e(t) – e(t-1))2
1,571 -1,146 1,313 -2,717 7,384
-1,146 -2,204 4,858 -1,058 1,120
-2,204 -1,597 2,550 0,607 0,369
-1,597 -2,894 8,373 -1,297 1,682
-2,894 1,114 1,240 4,007 16,058
1,114 1,930 3,724 0,816 0,666
1,930 1,196 1,431 -0,734 0,538
1,196 -3,865 14,938 -5,061 25,616
-3,865 -0,476 0,227 3,389 11,483
-0,476 -1,322 1,749 -0,846 0,716
-1,322 -0,462 0,213 0,860 0,740
-0,462 0,491 0,241 0,953 0,908
0,491 -0,602 0,363 -1,093 1,196
-0,602 5,841 34,116 6,443 41,516
5,841 -0,362 0,131 -6,203 38,476
-0,362 4,334 18,785 4,696 22,054
4,334 -2,037 4,149 -6,371 40,589
-2,037 -1,542 2,376 0,495 0,245
-1,542 0,030 0,001 1,572 2,470
0,030 -2,181 4,756 -2,211 4,889
-2,181 1,559 2,430 3,740 13,985
1,559 0,322 0,104 -1,236 1,529
0,322 2,148 4,613 1,826 3,333
2,148 0,154 0,024 -1,994 3,976
112,705 -1,418 241,537
DW = Σ(et – et–1)² / Σ(et)² = 241.537 / 112.705 = 2.143
Example 4
88
Uji Asumsi Klasik: no autocorrelation
The Breusch-Godfrey test is a test for r-th order
autocorrelation - i.e. it assumes that the relationship is
between an error and the previous ones
et = ρ1.et–1 + ρ2.et–2 + . . . + ρr.et–r + ut
where ut ~ N(0, σu²).
The Breusch-Godfrey test statistic actually tests
H0 : ρ1 = 0 and ρ2 = 0 and ... and ρr = 0
H1 : ρ1 ≠ 0 or ρ2 ≠ 0 or ... or ρr ≠ 0
89
Uji Asumsi Klasik: no autocorrelation
The Breusch-Godfrey test has three basic steps:
1. Obtain the residuals of the estimated regression
equation:
ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki)
2. Use the first regression and the residuals to form the
residual model in a second regression:
et = b0 + b1x1t + b2x2t + ... + bkxkt +
ρ1.et–1 + ρ2.et–2 + . . . + ρr.et–r + ut
90
Uji Asumsi Klasik: no autocorrelation
The Breusch-Godfrey test has three basic steps:
3. Obtain R2 from this regression.
(degree of freedom, ν = n-r-2) :
H0: (n-r)R2 < χ2 (No autocorrelated errors)
H1: (n-r)R2 > χ2 (Autocorrelated errors)
If the test statistic exceeds the critical value from the
statistical tables, reject the null hypothesis of no
autocorrelation.
91
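A minimal sketch of these steps using statsmodels' built-in implementation, assuming model is the fitted OLS result from before; the lag order r is an assumed analyst choice.

```python
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

r = 2  # test up to r-th order autocorrelation (assumed choice)
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(model, nlags=r)

print(f"LM statistic = {lm_stat:.3f}, p-value = {lm_pvalue:.4f}")
if lm_pvalue < 0.05:
    print("Reject H0: the residuals are autocorrelated")
else:
    print("No evidence against H0: no autocorrelation up to order", r)
```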
92
chi-square χ2 distribution
Uji Asumsi Klasik: no autocorrelation
Autocorrelation consequences:
• Autocorrelation does not cause bias in the β coefficient estimates.
• Autocorrelation increases the variance of the β coefficient estimates.
  - Autocorrelation causes the dependent variable to fluctuate in a fashion that the estimation procedure (OLS) attributes to the independent variables. Hence the variance of the estimates of β increases. These estimates are still unbiased, since over-estimation and under-estimation are still as likely.
• Autocorrelation causes OLS to underestimate the variances (and standard errors) of the β coefficients.
  - Intuitively, autocorrelation increases the fit of the model, so the estimated variance and standard errors are lower. This can lead the researcher to conclude a relationship exists when in fact the variables are unrelated.
• Hence the t-stats and F-stats cannot be relied upon for statistical inference.
• Spurious regressions.
93
Uji Asumsi Klasik: no autocorrelation
Methods to correct or remedy autocorrelation
Use the Generalized Least Squares to restore the
minimum variance property of the OLS estimation.
Use the Newey-West standard errors
Use the Cochrane-Orcutt method
Use the Hildreth-Lu Procedure
Use the first order Autoregressive, AR(1), model
Use the Maximum Likelihood approach
94
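A minimal sketch of one of these remedies, Newey-West (HAC) standard errors, which keep the OLS coefficients but correct the variance estimates for autocorrelation; the maxlags value is an assumed choice.

```python
import statsmodels.api as sm

# Same OLS fit as before, but with heteroskedasticity-and-autocorrelation-consistent
# (HAC / Newey-West) standard errors; coefficients are unchanged, t-stats become usable again.
model_hac = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 2})
print(model_hac.summary())
```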
Uji Asumsi Klasik :
Homoskedasticity or
No Heteroskedasticity
95
Uji Asumsi Klasik: no heteroskedasticity
There should be homoskedasticity that the error term
has a constant variance, Var(ei) = σ2.
Heteroskedasticity is a violation of the classical
assumption that assumes the observations of the
error term are drawn from distributions that have a
constant variance.
Issue: Does ei differ across levels of the explanatory
variables? Such would be the case in a capacity constraint
when any particular state has an impact over the capacity.
96
Uji Asumsi Klasik: no heteroskedasticity
 In homoskedasticity the distribution of
the error term has a constant
variance, so the observations are
continually drawn from the same
distribution.
 In the simplest heteroskedastic case,
discrete heteroskedasticity, there
would be two different error term
variances, and therefore, two different
distributions. One distribution is wider
than the other.
97
Uji Asumsi Klasik: no heteroskedasticity
It is assumed that all explanatory variables are uncorrelated
with the error term, i.e. Cov(xi, ei) = 0 for each
explanatory variable(s).
If the covariance is not equal to zero, then
heteroskedasticity exists.
This is essentially the same as saying there is no
pattern in the error term by explanatory variable(s).
If there are patterns from a model, we say that they
are correlated.
98
Uji Asumsi Klasik: no heteroskedasticity
99
Uji Asumsi Klasik: no heteroskedasticity
Heteroskedasticity takes on many more complex forms,
however, than the discrete heteroskedasticity case
Perhaps the most frequently specified model of pure
heteroskedasticity relates the variance of the error term to an
exogenous variable Zi as follows:
yi = β0 + β1x1i + β2x2i + ... + βkxki + ei
VAR(ei) = σ2.f(Zi)
where Z, the “proportionality factor”, may or may not be in
the equation
100
Uji Asumsi Klasik: no heteroskedasticity
 If the error term is homoskedastic
with respect to Zi, the variance of the
distribution of the error term is the
same (constant) no matter what the
value of Zi, as in VAR(ei) = σ2.
 If the error term is heteroskedastic
with respect to Zi, the variance of the
distribution of the error term changes
systematically as function of Zi. In this
example, the variance is an
increasing function of Zi, as in
VAR(ei) = σ²·Zi²
101
Uji Asumsi Klasik: no heteroskedasticity
102
Uji Asumsi Klasik: no heteroskedasticity
103
Uji Asumsi Klasik: no heteroskedasticity
Heteroskedasticity can be categorized into 2 kinds:
Pure heteroskedasticity (heteroskedasticity that occurs
when classical assumption, which assumes constant
variance of the error term, is violated).
Impure heteroskedasticity (heteroskedasticity that is
caused by specification errors: omitted variables or
incorrect functional form).
Heteroskedasticity mostly happens in a data set where any
particular state may loosen/tighten the dispersion of response
variable (e.g. cross-sectional data).
104
Uji Asumsi Klasik: no heteroskedasticity
Heteroskedasticity can occur in many situations, such as:
When there is a significant change in the variable(s) of a time-series model.
When there is a capacity constraint that limits the options.
When there are differences in control or consistency.
When different strata of the population behave differently.
When there are different amounts of measurement error in samples from different periods or different sub-samples.
105
Uji Asumsi Klasik: no heteroskedasticity
Before using any test for heteroskedasticity, however, ask the following:
1. Are there any obvious specification errors? Is the regression model
already correctly specified?
→ Fix those before testing!
2. Is the subject of the research likely to be afflicted with heteroskedasticity?
→ Cross-sectional studies with large variations in the size of the dependent variable are particularly susceptible to heteroskedasticity.
3. Does a graph of the residuals show any evidence of heteroskedasticity?
→ Specifically, plot the residuals against a potential Z proportionality factor.
→ In such cases, the graph can often show whether heteroskedasticity is or is not likely.
→ A graph showing an expanding (or contracting) range of the residuals is evidence of heteroskedasticity.
106
Uji Asumsi Klasik: no heteroskedasticity
Because heteroskedasticity can take on many forms,
therefore, there is no specific test to test for
heteroskedasticity.
Scientists and researchers do not all use the same
test for heteroskedasticity since heteroskedasticity
takes a number of different forms, and its precise
manifestation in a given equation is almost never
known.
107
Uji Asumsi Klasik: no heteroskedasticity
Some tests to detect heteroskedasticity
• The Graphical Zpredictor–Sresidual Test
• The Park Test
• The White Test
• The Glejser Test
• The Levene's Test
• The Goldfeld–Quandt Test
• The Brown–Forsythe Test
• The Harrison–McCabe Test
• The Breusch–Pagan Test
• The Cook–Weisberg Test
108
Uji Asumsi Klasik: no heteroskedasticity
The Park test is a test for heteroskedasticity - i.e. it assumes
that the relationship is between variable(s) and the error
yi = β0 + β1x1i + β2x2i + ... + βkxki + ei
VAR(ei) = σ²·Zi^b1  →  ln(ei²) = ln(σ²) + b1·ln(Zi) + ui
where Z: the “proportionality factor”
b1 : the slope of logarithmic model of squared residual
ui: a classical (homoskedastic) error term
One difficulty with the Park test is the specification of the Z
factor. The Z factor may one of the explanatory variables, but
not always.
109
Uji Asumsi Klasik: no heteroskedasticity
The Park test has three basic steps:
1. Obtain the residuals of the estimated regression
equation:
ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki)
2. Use these residuals to form the logarithmic model of
squared residual in a second regression:
ln(ei²) = ln(σ²) + b1·ln(Zi) + ui
where yi′ = ln(ei²) and xi′ = ln(Zi)
b0 = ln(σ²) and b1 : the slope of the linear model
ui : a classical (homoskedastic) error term
110
Uji Asumsi Klasik: no heteroskedasticity
The Park test has three basic steps:
3. Check to see whether b1 is significant or not. Test the
significance of the b1 coefficient of Z with t-test
(degree of freedom, ν = n-2) :
H0: b1 = 0 (Homoskedastic errors)
H1: b1 ≠ 0 (Heteroskedastic errors)
111
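A minimal sketch of the three Park-test steps, assuming the fitted model, its residuals, and a chosen proportionality factor Z; taking Z to be the column x1 is an assumption for illustration.

```python
import numpy as np
import statsmodels.api as sm

# Step 1: residuals of the original regression
e = model.resid

# Step 2: regress ln(e^2) on ln(Z), the Park auxiliary regression
Z = df["x1"]                          # assumed proportionality factor
aux_y = np.log(e ** 2)
aux_X = sm.add_constant(np.log(Z))
park = sm.OLS(aux_y, aux_X).fit()

# Step 3: t-test on the slope b1 (H0: b1 = 0, homoskedastic errors)
b1 = park.params.iloc[1]
p_value = park.pvalues.iloc[1]
print(f"b1 = {b1:.3f}, p-value = {p_value:.4f}")
print("Heteroskedastic w.r.t. Z" if p_value < 0.05 else "No evidence of heteroskedasticity w.r.t. Z")
```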
Uji Asumsi Klasik: no heteroskedasticity
112
Advantages of the Park test:
a. The test is simple.
b. It provides information about the variance structure.
Limitations of the Park test:
a. The distribution of the dependent variable is problematic.
b. It assumes a specific functional form.
c. It does not work when the variance depends on two or more variables.
d. The correct variable with which to order the observations must be
identified first.
e. It cannot handle partitioned data.
113
Student’s t distribution
Example 5
114
Example 5
115
x1 x2 x3 y y pred e e2 ln(x1) ln(x2) ln(x3) ln(e2)
1,74 5,30 10,8 25,5 27,35 -1,851 3,428 0,554 1,668 2,380 1,232
6,32 5,42 9,4 31,2 32,26 -1,062 1,129 1,844 1,690 2,241 0,121
6,22 8,41 7,2 25,9 27,35 -1,450 2,101 1,828 2,129 1,974 0,743
10,52 4,63 8,5 38,4 38,31 0,090 0,008 2,353 1,533 2,140 -4,807
1,19 11,60 9,4 18,4 15,54 2,855 8,153 0,174 2,451 2,241 2,098
1,22 5,85 9,9 26,7 26,11 0,592 0,350 0,199 1,766 2,293 -1,049
4,10 6,62 8,0 26,4 28,25 -1,853 3,434 1,411 1,890 2,079 1,234
6,32 8,72 9,1 25,9 26,22 -0,322 0,104 1,844 2,166 2,208 -2,267
4,08 4,42 8,7 32,0 32,09 -0,088 0,008 1,406 1,486 2,163 -4,857
4,15 7,60 9,2 25,2 26,07 -0,868 0,753 1,423 2,028 2,219 -0,284
10,15 4,83 9,4 39,7 37,25 2,448 5,991 2,317 1,575 2,241 1,790
1,72 3,12 7,6 35,7 32,49 3,212 10,317 0,542 1,138 2,028 2,334
1,70 5,30 8,2 26,5 28,20 -1,703 2,901 0,531 1,668 2,104 1,065
59,43 81,82 115,4 377,5 377,50 0,00 38,676 16,426 23,188 28,311 -2,647
Regression equation for Park test
VAR(ei) = σ²·Zi^b1
ln(ei²) = ln(σ²) + b1·ln(Zi) + ui
where Zi = x1i, x2i, or x3i
Example 5
116
X
ln(x1)
Y
ln(e2)
X2 XY Y2
0,554 1,232 0,307 0,682 1,518
1,844 0,121 3,399 0,223 0,015
1,828 0,743 3,341 1,357 0,551
2,353 -4,807 5,538 -11,311 23,102
0,174 2,098 0,030 0,365 4,403
0,199 -1,049 0,040 -0,209 1,100
1,411 1,234 1,991 1,741 1,522
1,844 -2,267 3,399 -4,180 5,141
1,406 -4,857 1,977 -6,829 23,587
1,423 -0,284 2,025 -0,404 0,081
2,317 1,790 5,371 4,149 3,205
0,542 2,334 0,294 1,266 5,447
0,531 1,065 0,282 0,565 1,134
16,426 -2,647 27,993 -12,585 70,806
SXX = Σx² – (Σx)²/n = 27.993 – (16.426)²/13 = 7.239
SXY = Σxy – (Σx)(Σy)/n = –12.585 – (16.426)(–2.647)/13 = –9.241
SYY = Σy² – (Σy)²/n = 70.806 – (–2.647)²/13 = 70.267
b1 = SXY / SXX = –9.241 / 7.239 = –1.277
b0 = (Σy – b1·Σx)/n = ((–2.647) – (–1.277)(16.426))/13 = 1.409
Estimated Park regression: ln(ei²) = 1.409 – 1.277·ln(x1i), i.e. VAR(ei) = σ²·x1i^b1
Example 5
117
H0: b1 = 0 (Homoskedastic errors)
H1: b1 ≠ 0 (Heteroskedastic errors)
ln(ei²) = 1.409 – 1.277·ln(x1i)
SXX = 7.239 ; SXY = –9.241 ; SYY = 70.267
s² = (SYY – b1·SXY)/(n – 2) = (70.267 – (–1.277)(–9.241))/(13 – 2) = 5.316
t = (b1 – 0)/√(s²/SXX) = (–1.277 – 0)/√(5.316/7.239) = –1.48971
Degree of freedom: ν = n – 2 = 13 – 2 = 11
P-value = 2 X P(t < -1.48971) = 2 X 0.0822 = 0.1644
Conclusion: P-value (0.1644) > Sign.level (0.05).
There is no evidence to reject H0. The error term
is Homoskedastic with respect to x1.
Example 5
118
X
ln(x2)
Y
ln(e2)
X2 XY Y2
1,668 1,232 2,781 2,054 1,518
1,690 0,121 2,856 0,204 0,015
2,129 0,743 4,534 1,581 0,551
1,533 -4,807 2,349 -7,366 23,102
2,451 2,098 6,007 5,143 4,403
1,766 -1,049 3,120 -1,853 1,100
1,890 1,234 3,572 2,332 1,522
2,166 -2,267 4,690 -4,910 5,141
1,486 -4,857 2,209 -7,218 23,587
2,028 -0,284 4,113 -0,576 0,081
1,575 1,790 2,480 2,819 3,205
1,138 2,334 1,295 2,656 5,447
1,668 1,065 2,781 1,776 1,134
23,188 -2,647 42,789 -3,356 70,806
SXX = Σx² – (Σx)²/n = 42.789 – (23.188)²/13 = 1.430
SXY = Σxy – (Σx)(Σy)/n = –3.356 – (23.188)(–2.647)/13 = 1.364
SYY = Σy² – (Σy)²/n = 70.806 – (–2.647)²/13 = 70.267
b1 = SXY / SXX = 1.364 / 1.430 = 0.954
b0 = (Σy – b1·Σx)/n = ((–2.647) – (0.954)(23.188))/13 = –1.905
Estimated Park regression: ln(ei²) = –1.905 + 0.954·ln(x2i), i.e. VAR(ei) = σ²·x2i^b1
Example 5
119
H0: b1 = 0 (Homoskedastic errors)
H1: b1 ≠ 0 (Heteroskedastic errors)
ln(ei²) = –1.905 + 0.954·ln(x2i)
SXX = 1.430 ; SXY = 1.364 ; SYY = 70.267
s² = (SYY – b1·SXY)/(n – 2) = (70.267 – (0.954)(1.364))/(13 – 2) = 6.270
t = (b1 – 0)/√(s²/SXX) = (0.954 – 0)/√(6.270/1.430) = 0.45556
Degree of freedom: ν = n – 2 = 13 – 2 = 11
P-value = 2 X P(t > 0.45556) = 2 X 0.3288 = 0.6576
Conclusion: P-value (0.6576) > Sign.level (0.05).
There is no evidence to reject H0. The error term
is Homoskedastic with respect to x2.
Example 5
120
X
ln(x3)
Y
ln(e2)
X2 XY Y2
2,380 1,232 5,662 2,931 1,518
2,241 0,121 5,021 0,271 0,015
1,974 0,743 3,897 1,466 0,551
2,140 -4,807 4,580 -10,286 23,102
2,241 2,098 5,021 4,702 4,403
2,293 -1,049 5,256 -2,404 1,100
2,079 1,234 4,324 2,566 1,522
2,208 -2,267 4,876 -5,007 5,141
2,163 -4,857 4,680 -10,507 23,587
2,219 -0,284 4,925 -0,630 0,081
2,241 1,790 5,021 4,011 3,205
2,028 2,334 4,113 4,733 5,447
2,104 1,065 4,427 2,241 1,134
28,311 -2,647 61,803 -5,913 70,806
SXX = Σx² – (Σx)²/n = 61.803 – (28.311)²/13 = 0.149
SXY = Σxy – (Σx)(Σy)/n = –5.913 – (28.311)(–2.647)/13 = –0.149
SYY = Σy² – (Σy)²/n = 70.806 – (–2.647)²/13 = 70.267
b1 = SXY / SXX = –0.149 / 0.149 = –1.001
b0 = (Σy – b1·Σx)/n = ((–2.647) – (–1.001)(28.311))/13 = 1.977
Estimated Park regression: ln(ei²) = 1.977 – 1.001·ln(x3i), i.e. VAR(ei) = σ²·x3i^b1
Example 5
121
H0: b1 = 0 (Homoskedastic errors)
H1: b1 ≠ 0 (Heteroskedastic errors)
ln(ei²) = 1.977 – 1.001·ln(x3i)
SXX = 0.149 ; SXY = –0.149 ; SYY = 70.267
s² = (SYY – b1·SXY)/(n – 2) = (70.267 – (–1.001)(–0.149))/(13 – 2) = 6.374
t = (b1 – 0)/√(s²/SXX) = (–1.001 – 0)/√(6.374/0.149) = –0.15306
Degree of freedom: ν = n – 2 = 13 – 2 = 11
P-value = 2 X P(t < -0.15306) = 2 X 0.4406 = 0.8812
Conclusion: P-value (0.8812) > Sign.level (0.05).
There is no evidence to reject H0. The error term
is Homoskedastic with respect to x3.
Example 6
122
Example 6
123
Example 6
124
Example 6
125
x1 x2 y ypred e e2 ln(x1) ln(x2) ln(e2)
2 50 9,95 8,379 1,571 2,469 0,693 3,912 0,904
8 110 24,45 25,596 -1,146 1,313 2,079 4,700 0,273
11 120 31,75 33,954 -2,204 4,858 2,398 4,787 1,581
10 550 35,00 36,597 -1,597 2,550 2,303 6,310 0,936
8 295 25,02 27,914 -2,894 8,373 2,079 5,687 2,125
4 200 16,86 15,746 1,114 1,240 1,386 5,298 0,215
2 375 14,38 12,450 1,930 3,724 0,693 5,927 1,315
2 52 9,60 8,404 1,196 1,431 0,693 3,951 0,358
9 100 24,35 28,215 -3,865 14,938 2,197 4,605 2,704
8 300 27,50 27,976 -0,476 0,227 2,079 5,704 -1,483
4 412 17,08 18,402 -1,322 1,749 1,386 6,021 0,559
11 400 37,00 37,462 -0,462 0,213 2,398 5,991 -1,545
12 500 41,95 41,459 0,491 0,241 2,485 6,215 -1,422
2 360 11,66 12,262 -0,602 0,363 0,693 5,886 -1,014
4 205 21,65 15,809 5,841 34,116 1,386 5,323 3,530
4 400 17,89 18,252 -0,362 0,131 1,386 5,991 -2,032
20 600 69,00 64,666 4,334 18,785 2,996 6,397 2,933
1 585 10,30 12,337 -2,037 4,149 0,000 6,372 1,423
10 540 34,93 36,472 -1,542 2,376 2,303 6,292 0,866
15 250 46,59 46,560 0,030 0,001 2,708 5,521 -6,999
15 290 44,88 47,061 -2,181 4,756 2,708 5,670 1,559
16 510 54,12 52,561 1,559 2,430 2,773 6,234 0,888
17 590 56,63 56,308 0,322 0,104 2,833 6,380 -2,265
6 100 22,13 19,982 2,148 4,613 1,792 4,605 1,529
5 400 21,15 20,996 0,154 0,024 1,609 5,991 -3,745
206 8294 725,82 725,820 0,000 115,173 46,058 139,773 3,190
Regression equation for Park test
VAR(ei) = σ²·Zi^b1
ln(ei²) = ln(σ²) + b1·ln(Zi) + ui
where Zi = x1i or x2i
Example 6
126
X
ln(x1)
Y
ln(e2)
X2 XY Y2
0,693 0,904 0,480 0,626 0,817
2,079 0,273 4,324 0,567 0,074
2,398 1,581 5,750 3,790 2,498
2,303 0,936 5,302 2,155 0,876
2,079 2,125 4,324 4,419 4,516
1,386 0,215 1,922 0,298 0,046
0,693 1,315 0,480 0,911 1,729
0,693 0,358 0,480 0,248 0,128
2,197 2,704 4,828 5,941 7,311
2,079 -1,483 4,324 -3,085 2,201
1,386 0,559 1,922 0,775 0,312
2,398 -1,545 5,750 -3,704 2,387
2,485 -1,422 6,175 -3,534 2,023
0,693 -1,014 0,480 -0,703 1,028
1,386 3,530 1,922 4,893 12,459
1,386 -2,032 1,922 -2,817 4,130
2,996 2,933 8,974 8,787 8,603
0,000 1,423 0,000 0,000 2,024
2,303 0,866 5,302 1,993 0,749
2,708 -6,999 7,334 -18,954 48,987
2,708 1,559 7,334 4,223 2,432
2,773 0,888 7,687 2,461 0,788
2,833 -2,265 8,027 -6,417 5,131
1,792 1,529 3,210 2,739 2,338
1,609 -3,745 2,590 -6,027 14,025
46,058 3,190 100,844 -0,414 127,613
SXX = Σx² – (Σx)²/n = 100.844 – (46.058)²/25 = 15.990
SXY = Σxy – (Σx)(Σy)/n = –0.414 – (46.058)(3.190)/25 = –6.292
SYY = Σy² – (Σy)²/n = 127.613 – (3.190)²/25 = 127.206
b1 = SXY / SXX = –6.292 / 15.990 = –0.393
b0 = (Σy – b1·Σx)/n = ((3.190) – (–0.393)(46.058))/25 = 0.853
Estimated Park regression: ln(ei²) = 0.853 – 0.393·ln(x1i), i.e. VAR(ei) = σ²·x1i^b1
Example 6
127
H0: b1 = 0 (Homoskedastic errors)
H1: b1 ≠ 0 (Heteroskedastic errors)
ln(ei²) = 0.853 – 0.393·ln(x1i)
SXX = 15.990 ; SXY = –6.292 ; SYY = 127.206
s² = (SYY – b1·SXY)/(n – 2) = (127.206 – (–0.393)(–6.292))/(25 – 2) = 5.423
t = (b1 – 0)/√(s²/SXX) = (–0.393 – 0)/√(5.423/15.990) = –0.67566
Degree of freedom: ν = n – 2 = 25 – 2 = 23
P-value = 2 X P(t < -0.67566) = 2 X 0.2530 = 0.5060
Conclusion: P-value (0.5060) > Sign.level (0.05).
There is no evidence to reject H0. The error term
is Homoskedastic with respect to x1.
Example 6
128
X
ln(x2)
Y
ln(e2)
X2 XY Y2
3,912 0,904 15,304 3,536 0,817
4,700 0,273 22,095 1,281 0,074
4,787 1,581 22,920 7,567 2,498
6,310 0,936 39,815 5,906 0,876
5,687 2,125 32,342 12,085 4,516
5,298 0,215 28,072 1,140 0,046
5,927 1,315 35,128 7,793 1,729
3,951 0,358 15,612 1,416 0,128
4,605 2,704 21,208 12,452 7,311
5,704 -1,483 32,533 -8,461 2,201
6,021 0,559 36,253 3,364 0,312
5,991 -1,545 35,898 -9,256 2,387
6,215 -1,422 38,621 -8,839 2,023
5,886 -1,014 34,646 -5,968 1,028
5,323 3,530 28,334 18,789 12,459
5,991 -2,032 35,898 -12,176 4,130
6,397 2,933 40,921 18,762 8,603
6,372 1,423 40,597 9,065 2,024
6,292 0,866 39,584 5,445 0,749
5,521 -6,999 30,487 -38,645 48,987
5,670 1,559 32,148 8,842 2,432
6,234 0,888 38,868 5,534 0,788
6,380 -2,265 40,706 -14,451 5,131
4,605 1,529 21,208 7,041 2,338
5,991 -3,745 35,898 -22,438 14,025
139,773 3,190 795,094 9,784 127,613
SXX = Σx² – (Σx)²/n = 795.094 – (139.773)²/25 = 13.639
SXY = Σxy – (Σx)(Σy)/n = 9.784 – (139.773)(3.190)/25 = –8.052
SYY = Σy² – (Σy)²/n = 127.613 – (3.190)²/25 = 127.206
b1 = SXY / SXX = –8.052 / 13.639 = –0.590
b0 = (Σy – b1·Σx)/n = ((3.190) – (–0.590)(139.773))/25 = 3.428
Estimated Park regression: ln(ei²) = 3.428 – 0.590·ln(x2i), i.e. VAR(ei) = σ²·x2i^b1
Example 6
129
H0: b1 = 0 (Homoskedastic errors)
H1: b1 ≠ 0 (Heteroskedastic errors)
ln(ei²) = 3.428 – 0.590·ln(x2i)
SXX = 13.639 ; SXY = –8.052 ; SYY = 127.206
s² = (SYY – b1·SXY)/(n – 2) = (127.206 – (–0.590)(–8.052))/(25 – 2) = 5.324
t = (b1 – 0)/√(s²/SXX) = (–0.590 – 0)/√(5.324/13.639) = –0.94492
Degree of freedom: ν = n – 2 = 25 – 2 = 23
P-value = 2 X P(t < -0.94492) = 2 X 0.1773 = 0.3546
Conclusion: P-value (0.3546) > Sign.level (0.05).
There is no evidence to reject H0. The error term
is Homoskedastic with respect to x2.
Uji Asumsi Klasik: no heteroskedasticity
The White test is a test for heteroskedasticity - i.e. it assumes
that the relationship is between variable(s) and the error
yi = β0 + β1x1i + β2x2i + ... + βkxki + ei
ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki)
quadratic regression of the squared residual that consists of
the square of each x, and the product of each x times every
other x from the original equation
ei² = b0 + b1(x1i)² + b2(x2i)² + ... + bk(xki)² + bk+1(x1i·x2i) + bk+2(x1i·x3i) + ... + b½k(k+1)(x(k–1)i·xki)
130
Uji Asumsi Klasik: no heteroskedasticity
The White test has three basic steps:
1. Obtain the residuals of the estimated regression
equation:
ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki)
2. Use these residuals to form the quadratic model of
squared residual in a second regression:
ei² = b0 + b1(x1i)² + b2(x2i)² + ... + bk(xki)² + bk+1(x1i·x2i) + bk+2(x1i·x3i) + ... + b½k(k+1)(x(k–1)i·xki)
131
Uji Asumsi Klasik: no heteroskedasticity
The White test has three basic steps:
3. Check to see whether quadratic model is significant
or not. Test the significance of the n.R2 (the sample
size, n, times the coefficient of determination, R2)
with chi-square χ2-test (degree of freedom, ν =
number of regressors = ½.k(k+1))
H0: n·R² < χ²critical (Homoskedastic errors)
H1: n·R² > χ²critical (Heteroskedastic errors)
132
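A minimal sketch of the White test via statsmodels, assuming model and the design matrix X (with constant) from the original regression. Note that statsmodels' auxiliary regression also keeps the linear terms, so its degrees of freedom differ slightly from the ½k(k+1) used on these slides.

```python
from statsmodels.stats.diagnostic import het_white

lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(model.resid, X)

print(f"LM (n*R^2) = {lm_stat:.3f}, p-value = {lm_pvalue:.4f}")
if lm_pvalue < 0.05:
    print("Reject H0: heteroskedastic errors")
else:
    print("No evidence against H0: homoskedastic errors")
```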
133
chi-square χ2 distribution
Uji Asumsi Klasik: no heteroskedasticity
134
Advantages of the White test:
a. It does not assume a specific functional form.
b. It is applicable when the variance depends on two or more variables.
Limitations of the White test:
a. It is a large-sample test.
b. It provides no information about the variance structure.
c. It loses many degrees of freedom when there are many regressors.
d. It cannot handle partitioned data.
e. It also captures specification errors.
Example 7
135
Example 7
136
x1 x2 x3 y ypred e e² x1² x2² x3² x1·x2 x1·x3 x2·x3 e⁴
1,74 5,30 10,8 25,5 27,35 -1,851 3,428 3,028 28,090 116,640 9,222 18,792 57,240 11,749
6,32 5,42 9,4 31,2 32,26 -1,062 1,129 39,942 29,376 88,360 34,254 59,408 50,948 1,274
6,22 8,41 7,2 25,9 27,35 -1,450 2,101 38,688 70,728 51,840 52,310 44,784 60,552 4,415
10,52 4,63 8,5 38,4 38,31 0,090 0,008 110,670 21,437 72,250 48,708 89,420 39,355 0,000
1,19 11,60 9,4 18,4 15,54 2,855 8,153 1,416 134,560 88,360 13,804 11,186 109,040 66,464
1,22 5,85 9,9 26,7 26,11 0,592 0,350 1,488 34,223 98,010 7,137 12,078 57,915 0,123
4,10 6,62 8,0 26,4 28,25 -1,853 3,434 16,810 43,824 64,000 27,142 32,800 52,960 11,794
6,32 8,72 9,1 25,9 26,22 -0,322 0,104 39,942 76,038 82,810 55,110 57,512 79,352 0,011
4,08 4,42 8,7 32,0 32,09 -0,088 0,008 16,646 19,536 75,690 18,034 35,496 38,454 0,000
4,15 7,60 9,2 25,2 26,07 -0,868 0,753 17,223 57,760 84,640 31,540 38,180 69,920 0,567
10,15 4,83 9,4 39,7 37,25 2,448 5,991 103,023 23,329 88,360 49,025 95,410 45,402 35,892
1,72 3,12 7,6 35,7 32,49 3,212 10,317 2,958 9,734 57,760 5,366 13,072 23,712 106,450
1,70 5,30 8,2 26,5 28,20 -1,703 2,901 2,890 28,090 67,240 9,010 13,940 43,460 8,416
59,43 81,82 115,4 377,5 377,50 0,00 38,676 394,726 576,726 1035,960 360,662 522,078 728,310 247,154
Quadratic model of squared residual
ei² = b0 + b1(x1i)² + b2(x2i)² + b3(x3i)² + b4(x1i·x2i) + b5(x1i·x3i) + b6(x2i·x3i)
Example 7
137
13 394,726 576,726 1035,960 360,662 522,078 728,310
394,726 28435,940 14215,890 30724,415 17434,014 28097,685 19924,252
576,726 14215,890 39239,948 46312,736 17114,038 20729,897 40540,700
A = X'X = 1035,960 30724,415 46312,736 86244,604 27835,366 41298,355 59564,444
360,662 17434,014 17114,038 27835,366 14215,890 19924,252 20729,897
522,078 28097,685 20729,897 41298,355 19924,252 30724,415 27835,366
728,310 19924,252 40540,700 59564,444 20729,897 27835,366 46312,736
2,57712 0,00801 -0,02334 -0,04105 -0,02487 -0,00143 0,04125
0,00801 0,00144 -0,00119 -0,00038 0,00064 -0,00220 0,00183
-0,02334 -0,00119 0,00641 0,00400 0,00091 0,00112 -0,01095
A-1 = (X'X)-1= -0,04105 -0,00038 0,00400 0,00314 0,00151 -0,00031 -0,00721
-0,02487 0,00064 0,00091 0,00151 0,00274 -0,00240 -0,00241
-0,00143 -0,00220 0,00112 -0,00031 -0,00240 0,00431 -0,00112
0,04125 0,00183 -0,01095 -0,00721 -0,00241 -0,00112 0,01920
Example 7
138
38,676
880,789
1910,925
g = X'Y = 2976,781
793,631
1216,240
2176,259
8,21240
-0,03213
0,44284
β =A-1.g = 0,17661
-0,14725
0,10519
-0,68045
The estimated regression equation of squared residual quadratic model
ei² = 8.212 – 0.032(x1i)² + 0.440(x2i)² + 0.177(x3i)² – 0.147(x1i·x2i) + 0.105(x1i·x3i) – 0.680(x2i·x3i)
Example 7
139
Response Surface Methodology
ei² = 8.212 – 0.032(x1i)² + 0.440(x2i)² + 0.177(x3i)² – 0.147(x1i·x2i) + 0.105(x1i·x3i) – 0.680(x2i·x3i)
Example 7
140
β X'Y β·(X'Y)
8,21240 38,676 317,626
-0,03213 880,789 -28,299
0,44284 1910,925 846,242
0,17661 2976,781 525,737
-0,14725 793,631 -116,866
0,10519 1216,240 127,937
-0,68045 2176,259 -1480,825
191,553
SSR = Σ(ŷi – ȳ)² = [β0(Σy) + β1(Σx1y) + ... + βk(Σxky)] – (Σy)²/n = 191.553 – (38.676)²/13 = 191.553 – 115.066 = 76.487
SST = Σ(yi – ȳ)² = Σy² – (Σy)²/n = 247.154 – 115.066 = 132.088
SSE = Σ(yi – ŷi)² = Σy² – [β0(Σy) + β1(Σx1y) + ... + βk(Σxky)] = 247.154 – 191.553 = 55.601
(here y denotes ei², the response of the auxiliary regression, so Σy = Σei² = 38.676 and Σy² = Σei⁴ = 247.154)
Example 7
141
R² = SSR / SST = 76.487 / 132.088 = 0.57906
n·R² = 13 × 0.57906 = 7.52780
P-value = P(χ2 > 7.52780) = 0.2748
Conclusion: P-value (0.2748) > Sign.level (0.05).
There is no evidence to reject H0. The error term
is Homoskedastic.
H0: n·R² < χ²critical (Homoskedastic errors)
H1: n·R² > χ²critical (Heteroskedastic errors)
Sign.level = 0.05
Degree of freedom: ν = ½.k(k+1) = ½.3(3+1) = 6
χ2=12.59
Example 8
142
Example 8
143
Example 8
144
Example 8
145
x1 x2 y ypred e e² x1² x2² x1·x2 e⁴
2 50 9,95 8,379 1,571 2,469 4 2500 100 6,096
8 110 24,45 25,596 -1,146 1,313 64 12100 880 1,725
11 120 31,75 33,954 -2,204 4,858 121 14400 1320 23,600
10 550 35,00 36,597 -1,597 2,550 100 302500 5500 6,501
8 295 25,02 27,914 -2,894 8,373 64 87025 2360 70,111
4 200 16,86 15,746 1,114 1,240 16 40000 800 1,538
2 375 14,38 12,450 1,930 3,724 4 140625 750 13,867
2 52 9,60 8,404 1,196 1,431 4 2704 104 2,048
9 100 24,35 28,215 -3,865 14,938 81 10000 900 223,150
8 300 27,50 27,976 -0,476 0,227 64 90000 2400 0,051
4 412 17,08 18,402 -1,322 1,749 16 169744 1648 3,057
11 400 37,00 37,462 -0,462 0,213 121 160000 4400 0,046
12 500 41,95 41,459 0,491 0,241 144 250000 6000 0,058
2 360 11,66 12,262 -0,602 0,363 4 129600 720 0,132
4 205 21,65 15,809 5,841 34,116 16 42025 820 1163,932
4 400 17,89 18,252 -0,362 0,131 16 160000 1600 0,017
20 600 69,00 64,666 4,334 18,785 400 360000 12000 352,864
1 585 10,30 12,337 -2,037 4,149 1 342225 585 17,212
10 540 34,93 36,472 -1,542 2,376 100 291600 5400 5,647
15 250 46,59 46,560 0,030 0,001 225 62500 3750 0,000
15 290 44,88 47,061 -2,181 4,756 225 84100 4350 22,623
16 510 54,12 52,561 1,559 2,430 256 260100 8160 5,903
17 590 56,63 56,308 0,322 0,104 289 348100 10030 0,011
6 100 22,13 19,982 2,148 4,613 36 10000 600 21,281
5 400 21,15 20,996 0,154 0,024 25 160000 2000 0,001
206 8294 725,82 725,820 0,000 115,173 2396 3531848 77177 1941,469
Quadratic model of squared residual
ei² = b0 + b1(x1i)² + b2(x2i)² + b3(x1i·x2i)
Example 8
146
25 2396 3531848 77177
2396 502184 485990145 14846879
A = X'X = 3531848 485990145 847350799652 17764206203
77177 14846879 17764206203 485990145
1,192E-01 -8,904E-04 -6,830E-07 3,323E-05
-8,904E-04 3,753E-05 1,394E-08 -1,515E-06
A-1 = (X'X)-1= -6,830E-07 1,394E-08 1,149E-11 -7,377E-10
3,323E-05 -1,515E-06 -7,377E-10 7,002E-08
115,173
13020,171
g = X'Y = 14225116,948
378285,711
4,99502
0,01142
β =A-1.g = -0,00001
0,00010
Quadratic model of squared residual
ei² = 4.99502 + 0.01142(x1i)² – 0.00001(x2i)² + 0.00010(x1i·x2i)
Example 8
147
β X'Y β·(X'Y)
4,99502 115,173 575,294
0,01142 13020,171 148,677
-0,00001 14225116,948 -179,972
0,00010 378285,711 37,360
581,359
SSR = Σ(ŷi – ȳ)² = [β0(Σy) + β1(Σx1y) + ... + βk(Σxky)] – (Σy)²/n = 581.359 – (115.173)²/25 = 581.359 – 530.597 = 50.761
SST = Σ(yi – ȳ)² = Σy² – (Σy)²/n = 1941.469 – 530.597 = 1410.872
SSE = Σ(yi – ŷi)² = Σy² – [β0(Σy) + β1(Σx1y) + ... + βk(Σxky)] = 1941.469 – 581.359 = 1360.110
(here y denotes ei², so Σy = Σei² = 115.173 and Σy² = Σei⁴ = 1941.469)
Example 8
148
R² = SSR / SST = 50.761 / 1410.872 = 0.03598
n·R² = 25 × 0.03598 = 0.8995
P-value = P(χ2 > 0.03598) = 0.82556
Conclusion:
P-value (0.82556) > Sign.level (0.05).
There is no evidence to reject H0.
The error term is Homoskedastic.
H0: n·R² < χ²critical (Homoskedastic errors)
H1: n·R² > χ²critical (Heteroskedastic errors)
Sign.level = 0.05
Degree of freedom: ν = ½.k(k+1) = ½.2(2+1) = 3
χ2=7.81
Quadratic model of squared residual
ei² = 4.99502 + 0.01142(x1i)² – 0.00001(x2i)² + 0.00010(x1i·x2i)
Uji Asumsi Klasik: no heteroskedasticity
The Goldfeld-Quandt Test is a test for heteroskedasticity -
i.e. it assumes that the relationship is between variable(s) and
the error
yi = β0 + β1x1i + β2x2i + ... + βkxki + ei
ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki)
Error Sum of Squares, SSE, is calculated by the differences
between each observed response variable and predicted
response variable.
SSE = Σei² = Σ(yi – ŷi)² = Σyi² – (β0Σyi + β1Σx1iyi + β2Σx2iyi + ... + βkΣxkiyi)
149
Uji Asumsi Klasik: no heteroskedasticity
The Goldfeld-Quandt Test has the following steps:
1. Order the data by the magnitude of Explanatory
Variable in ascending order.
2. Divide the data into three parts: less(n1), middle(n2)
and greater(n3). There is no rule that specify how
many observations each parts. n1+n2+n3=n the
number of observation
3. Omit the middle part, and drop these n2 observations.
150
Uji Asumsi Klasik: no heteroskedasticity
The Goldfeld-Quandt Test has the following steps:
4. Obtain the error sum of squares of each part: SSE1 (lower part, with degrees of freedom ν1 = n1 – k) and SSE2 (upper part, with degrees of freedom ν2 = n3 – k)
5. Assuming the error process is normally distributed,
then calculate the ratio of the error sum of squares:
F=(SSE2/ν2)/(SSE1/ν1).
6. Apply the F-test. F>Fcritical at the right tail or F<Fcritical
at the left tail indicate that the variances are different.
151
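A minimal sketch of these steps using statsmodels' het_goldfeldquandt, assuming y and X from before; the ordering column and the fraction of middle observations dropped are analyst choices.

```python
from statsmodels.stats.diagnostic import het_goldfeldquandt

# Order by the chosen explanatory variable, drop the middle ~20% of observations,
# then compare the error sums of squares of the two remaining parts with an F-test.
sort_col = list(X.columns).index("x1")   # column of X used for ordering (assumed choice)
f_stat, p_value, _ = het_goldfeldquandt(y, X, idx=sort_col, drop=0.2,
                                        alternative="two-sided")

print(f"F = {f_stat:.3f}, p-value = {p_value:.4f}")
print("Heteroskedastic" if p_value < 0.05 else "No evidence of heteroskedasticity")
```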
Uji Asumsi Klasik: no heteroskedasticity
Heteroskedasticity consequences:
• Heteroskedasticity does not cause bias in the β coefficient estimates.
• Heteroskedasticity increases the variance of the β coefficient estimates.
  - Heteroskedasticity causes the dependent variable to fluctuate in a fashion that the estimation procedure (OLS) attributes to the independent variables. Hence the variance of the estimates of β increases. These estimates are still unbiased, since over-estimation and under-estimation are still as likely.
• Heteroskedasticity causes OLS to underestimate the variances (and standard errors) of the β coefficients.
  - Intuitively, heteroskedasticity increases the fit of the model, so the estimated variance and standard errors are lower. This can lead the researcher to conclude a relationship exists when in fact the variables are unrelated.
• Hence the t-stats and F-stats cannot be relied upon for statistical inference.
• Spurious regressions.
152
Uji Asumsi Klasik: no heteroskedasticity
Methods to correct or remedy heteroskedasticity
Use log-transformed data
Redefine the variables
Apply a weighted least squares estimation (see the sketch below)
Use Heteroskedasticity-consistent standard errors
Use Heteroskedasticity-corrected standard errors
Use Minimum Norm Quadratic Unbiased Estimation
(MINQUE)
153
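A minimal sketch of the weighted least squares remedy, assuming the error variance is believed to be proportional to Zi² so that weights 1/Zi² are appropriate; the choice of Z is an assumption.

```python
import statsmodels.api as sm

Z = df["x1"]              # assumed proportionality factor for VAR(ei)
weights = 1.0 / Z ** 2    # weight each observation by the inverse of its error variance

wls_model = sm.WLS(y, X, weights=weights).fit()
print(wls_model.summary())  # coefficients re-estimated with the minimum-variance property restored
```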
Uji Asumsi Klasik :
No Multicollinearity
154
Uji Asumsi Klasik: no multicollinearity
No explanatory variable is highly correlated with one
or more other explanatory variables.
Multicollinearity is a violation of the classical
assumption that assumes no explanatory variable is a
perfect linear function of any other explanatory
variables.
Issue: Is xi related to xj? Such would be the case in
clustered samples when any variables are similar
within cluster but different between clusters.
155
Uji Asumsi Klasik: no multicollinearity
It is assumed that the explanatory variables are not
interrelated, i.e. Cov (xi , xj) = 0 for i≠j.
If the covariance of two explanatory variables is not
equal to zero, then multicollinearity exists.
This is essentially the same as saying there is no
pattern between explanatory variables.
If an explanatory variable can be explained by a pattern with respect to one or more other explanatory variables, we say that there is multicollinearity.
156
Uji Asumsi Klasik: no multicollinearity
Multicollinearity can be categorized into 2 kinds:
Perfect multicollinearity (multicollinearity which an
explanatory variable can be written as a linear
combination of any other explanatory variables).
Imperfect multicollinearity (multicollinearity in which an explanatory variable is close to being a linear function of other explanatory variables).
Multicollinearity mostly happens in clustering or
segmenting where samples are grouped by similarity of some
characteristics (e.g. clustered data).
157
Uji Asumsi Klasik: no multicollinearity
 With perfect multicollinearity, an
explanatory variable can be
completely explained by the
movement of one or more other
explanatory variable(s).
 Perfect multicollinearity can usually
be avoided by careful screening of
the explanatory variables before a
regression is run.
 With imperfect multicollinearity, an
explanatory variable is a strong but
not perfect linear function of one or
more other explanatoy variable(s).
 Imperfect multicollinearity varies in
degree from sample to sample.
158
Uji Asumsi Klasik: no multicollinearity
Multicollinearity can be categorized into another 2
kinds:
Structural multicollinearity (multicollinearity occurs when
we create a model term using other terms. In other words,
it’s a byproduct of the model that we specify rather than
being present in the data itself.).
Data-based multicollinearity (multicollinearity that is present in the data itself rather than being an artifact of our model; observational studies are more likely to exhibit this kind of multicollinearity).
159
Uji Asumsi Klasik: no multicollinearity
Regression coefficient, βi, is the impact of explanatory
variable, xi, has on response variable, y, holding all other
explanatory variable(s) constant.
If x1 is related to x2 then β1 will also capture the impact of
changes in x2.
In other words, interpretation of the regression coefficients or
parameters becomes difficult.
The easiest way to test multicollinearity is to examine the
standard errors of the coefficients.
Reasonable method to relieve multicollinearity is to drop
some highly correlated variables.
160
Uji Asumsi Klasik: no multicollinearity
In the Venn diagrams, the overlapping
area between Y and X(X1, X2) is the
variance explained.
• In case 1, X1 and X2 are related; X1 and Y are related, but X2 and Y have no relationship.
• In case 2, both X1 and X2 contribute some unique explained variance to Y, but they also have some common explained variance.
• In case 3, again both X1 and X2 contribute unique explained variance to Y, but X1 and X2 are totally unrelated (orthogonal).
• In case 4, although both X1 and X2 could predict Y, the explained variance contributed by X2 has been covered by X1 because X1 and X2 are too correlated (collinear).
The above cases are not exhaustive. There are many other possible combinations between Y and the Xs.
The above cases are not exhaustive. There
are many other possible combinations
between Y and Xs.
161
Uji Asumsi Klasik: no multicollinearity
162
Severe multicollinearity
produces a distribution of
the βs that is centered
around the true β, but that
has a much wider
variance.
Thus the distribution of βs
with multicollinearity is
much wider than
otherwise.
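To make this concrete, here is a minimal Monte Carlo sketch (an added illustration, not from the original slides; it assumes NumPy and uses made-up data). The estimates of β1 stay centered on the true value while their spread grows as the correlation between x1 and x2 rises:

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_B1, TRUE_B2 = 2.0, -1.0

def spread_of_b1(rho, n=50, reps=2000):
    """Fit OLS on many simulated samples with corr(x1, x2) = rho and
    return the mean and standard deviation of the b1 estimates."""
    b1_hats = []
    cov = [[1.0, rho], [rho, 1.0]]
    for _ in range(reps):
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        y = TRUE_B1 * X[:, 0] + TRUE_B2 * X[:, 1] + rng.normal(size=n)
        A = np.column_stack([np.ones(n), X])          # design matrix with intercept
        beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
        b1_hats.append(beta_hat[1])
    return np.mean(b1_hats), np.std(b1_hats)

for rho in (0.0, 0.9, 0.99):
    mean_b1, sd_b1 = spread_of_b1(rho)
    print(f"corr = {rho:4.2f}: mean(b1_hat) = {mean_b1:5.2f}, sd(b1_hat) = {sd_b1:5.3f}")
```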
Uji Asumsi Klasik: no multicollinearity
A special case of multicollinearity problem is a dominant
variable.
The dominant variable is an explanatory variable, x, which is
definitionally related to the response variable, y.
 The dominant variable is generally a part of or a complement of the
response variable. For example, if the Y variable is the number of
computers and the X variable is the number of processors.
 The dominant variable is highly correlated with response variable and it
will make other explanatory variables unimportant in determining the
response variable.
 Do not confuse a dominant variable with a highly significant explanatory
variable.
163
Uji Asumsi Klasik: no multicollinearity
First, realize that some multicollinearity exists in
every equation: all variables are correlated to some
degree (even if completely at random)
So it’s really a question of how much multicollinearity
exists in an equation, rather than whether any
multicollinearity exists
164
Uji Asumsi Klasik: no multicollinearity
Some tests to detect multicollinearity
 The Determination Coefficient R2 and t-Test
 The Simple Correlation Coefficients Test
 The Variance Inflation Factor (VIF) Test
 The Farrar–Glauber Test
 The Condition Number test
 The Perturbing the data Test
165
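As one illustration of the tests listed above, the condition number test can be sketched in a few lines (an added example on synthetic data, assuming NumPy; a common rule of thumb treats condition numbers above roughly 30 as a sign of severe multicollinearity):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic illustration: x2 is almost a linear function of x1.
x1 = rng.normal(size=100)
x2 = 3 * x1 + rng.normal(scale=0.01, size=100)
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

# Standardize the columns, then take the ratio of the largest to the
# smallest singular value of the standardized matrix.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
s = np.linalg.svd(Xs, compute_uv=False)
print("condition number:", s.max() / s.min())   # large => multicollinearity
```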
Uji Asumsi Klasik: no multicollinearity
High Determination Coefficient, R2, with all low t-
scores (individual coefficient estimator test)
If this is the case, you have multicollinearity.
If this is not the case, you may or may not have
multicollinearity.
If all the t-scores are significant and in the expected
direction, then we can conclude that multicollinearity is not
likely to be a problem.
166
Uji Asumsi Klasik: no multicollinearity
High Simple Correlation Coefficients
If a simple correlation coefficient, rij, between any two
explanatory variables (xi and xj with i≠j) is high in absolute
value, these two particular Xs are highly correlated and this
evidence indicates the potential for multicollinearity.
How high is high?
Some researchers pick an arbitrary number, such as 0.80
A better answer might be that rij is high if it causes unacceptably
large variances in the coefficient estimates in which we’re
interested.
167
Uji Asumsi Klasik: no multicollinearity
High Simple Correlation Coefficients
Caution in case of more than two explanatory variables:
An explanatory variable is correlated with a group of any other
explanatory variables, acting together simultaneously.
An explanatory variable is correlated with an interaction of any other
explanatory variables.
An explanatory variable is a nonlinear function of any other
explanatory variables
It may cause multicollinearity without any single simple
correlation coefficient being high enough to indicate that
multicollinearity is present.
168
Uji Asumsi Klasik: no multicollinearity
High Simple Correlation Coefficients
The matrix plot between two individual variables.
The matrix of correlations between two individual
variables
Note that a high correlation between the response
variable, y, and one of the explanatory variables, x, is
not multicollinearity.
169
Uji Asumsi Klasik: no multicollinearity
High Simple Correlation Coefficients
the matrix plot between two individual variables
170
Uji Asumsi Klasik: no multicollinearity
High Simple Correlation Coefficients
the matrix of correlations between two individual variables
171
Correlation X1 X2 ... Xk
X1 1 r12 ... r1k
X2 r21 1 ... r2k
: : : :
Xk rk1 rk2 ... 1
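As a sketch of how this correlation matrix (and the matrix plot of the previous slide) can be produced in practice, assuming pandas is available; the five rows below are copied from the Example 9 data purely for illustration:

```python
import pandas as pd

# First five observations of the Example 9 data, used only for illustration.
X = pd.DataFrame({
    "x1": [1.74, 6.32, 6.22, 10.52, 1.19],
    "x2": [5.30, 5.42, 8.41, 4.63, 11.60],
    "x3": [10.8, 9.4, 7.2, 8.5, 9.4],
})

corr = X.corr()                              # matrix of simple correlations rij
print(corr.round(3))

# Flag pairs whose |rij| exceeds an arbitrary 0.80 cutoff (excluding the diagonal).
print((corr.abs() > 0.80) & (corr.abs() < 1.0))

# pd.plotting.scatter_matrix(X) would draw the matrix plot mentioned on the
# previous slide (requires matplotlib).
```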
Uji Asumsi Klasik: no multicollinearity
The Variance Inflation Factor, VIF, is calculated using
the following steps:
1. For each explanatory variable (xi), run an OLS
regression that has xi as a function of all the other
explanatory variables in the equation—
For i = 1, this equation would be:
x1 = b0 + b2.x2 + b3.x3 + . . . + bk.xk + u
where u ~ N(0, σu²) is a classical (no multicollinearity) error
term
172
Uji Asumsi Klasik: no multicollinearity
The Variance Inflation Factor, VIF, is calculated using
the following steps:
2. Obtain the value of the determination coefficient, Ri², from
the regression equation of the specific explanatory
variable, xi.
3. Calculate the variance inflation factor for βi :
VIF(βi) = 1 / (1 – Ri²)
4. If VIF > 5, the multicollinearity problem is potentially
severe. Repeat for all x's.
173
Uji Asumsi Klasik: no multicollinearity
The Variance Inflation Factors, VIF
 The higher the VIF, the more severe the effects of
multicollinearity
 While there is no table of formal critical VIF values, a
common rule of thumb is that if a given VIF is greater
than 5, the multicollinearity is severe
 As the number of independent variables increases, it
makes sense to increase this number slightly
174
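A minimal NumPy sketch of the auxiliary-regression procedure above (not the author's original code; X is assumed to be an n × k array whose columns are the explanatory variables):

```python
import numpy as np

def vif(X):
    """Compute VIFs exactly as in the steps above: regress each column of X
    on all the other columns and use VIF_i = 1 / (1 - R_i^2)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for i in range(k):
        y = X[:, i]                                   # x_i plays the role of the response
        others = np.delete(X, i, axis=1)
        A = np.column_stack([np.ones(n), others])     # intercept + remaining x's
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
        vifs.append(1.0 / (1.0 - r2))
    return vifs
```

If statsmodels is available, its variance_inflation_factor helper implements the same idea; it expects the design matrix including the constant column.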
Example 9
175
Example 9
176
x1 x2 x3 y y pred
1,74 5,30 10,8 25,5 27,35
6,32 5,42 9,4 31,2 32,26
6,22 8,41 7,2 25,9 27,35
10,52 4,63 8,5 38,4 38,31
1,19 11,60 9,4 18,4 15,54
1,22 5,85 9,9 26,7 26,11
4,10 6,62 8,0 26,4 28,25
6,32 8,72 9,1 25,9 26,22
4,08 4,42 8,7 32,0 32,09
4,15 7,60 9,2 25,2 26,07
10,15 4,83 9,4 39,7 37,25
1,72 3,12 7,6 35,7 32,49
1,70 5,30 8,2 26,5 28,20
Σ 59,43 81,82 115,4 377,5 377,50
Cor y x1 x2 x3
y 1 0,65385 -0,78581 -0,18627
x1 0,65385 1 -0,15350 -0,14522
x2 -0,78581 -0,15350 1 0,07484
x3 -0,18627 -0,14522 0,07484 1
Example 9
177
For x1, the auxiliary regression equation:
x1 = b0 + b2.x2 + b3.x3 + u

i ≠ 1   β          X'Y          β(X'Y)
0       9,73992    59,43000     578,84326
2       -0,20244   360,66210    -73,01385
3       -0,43869   522,07800    -229,03100
                   Σ β(X'Y) =   276,79842

SSR = Σ β(X'Y) – (Σx1)²/n = 276,79842 – (59,43)²/13 = 276,79842 – 271,68653 = 5,11189
SST = Σx1² – (Σx1)²/n = 394,72550 – (59,43)²/13 = 394,72550 – 271,68653 = 123,03897
SSE = Σx1² – Σ β(X'Y) = 394,72550 – 276,79842 = 117,92708

R1² = SSR / SST = 5,11189 / 123,03897 = 0,04155
VIF(β1) = 1 / (1 – R1²) = 1 / (1 – 0,04155) = 1,04335

For x1, VIF < 5 → no multicollinearity
Example 9
178
For x2, the auxiliary regression equation:
x2 = b0 + b1.x1 + b3.x3 + u

i ≠ 2   β          X'Y          β(X'Y)
0       5,66436    81,82000     463,45828
1       -0,10323   360,66210    -37,23184
3       0,12408    728,31000    90,36559
                   Σ β(X'Y) =   516,59203

SSR = Σ β(X'Y) – (Σx2)²/n = 516,59203 – (81,82)²/13 = 516,59203 – 514,96249 = 1,62953
SST = Σx2² – (Σx2)²/n = 576,72640 – (81,82)²/13 = 576,72640 – 514,96249 = 61,76391
SSE = Σx2² – Σ β(X'Y) = 576,72640 – 516,59203 = 60,13437

R2² = SSR / SST = 1,62953 / 61,76391 = 0,02638
VIF(β2) = 1 / (1 – R2²) = 1 / (1 – 0,02638) = 1,02710

For x2, VIF < 5 → no multicollinearity
Example 9
179
For x3, the auxiliary regression equation:
x3 = b0 + b1.x1 + b2.x2 + u

i ≠ 3   β          X'Y           β(X'Y)
0       8,92230    115,40000     1029,63293
1       -0,04199   522,07800     -21,92001
2       0,02329    728,31000     16,96056
                   Σ β(X'Y) =    1024,67348

SSR = Σ β(X'Y) – (Σx3)²/n = 1024,67348 – (115,4)²/13 = 1024,67348 – 1024,39692 = 0,27656
SST = Σx3² – (Σx3)²/n = 1035,96000 – (115,4)²/13 = 1035,96000 – 1024,39692 = 11,56308
SSE = Σx3² – Σ β(X'Y) = 1035,96000 – 1024,67348 = 11,28652

R3² = SSR / SST = 0,27656 / 11,56308 = 0,02392
VIF(β3) = 1 / (1 – R3²) = 1 / (1 – 0,02392) = 1,02450

For x3, VIF < 5 → no multicollinearity
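As a cross-check of Example 9, a short sketch (assuming NumPy; the data are copied from the table on slide 176) that should reproduce the three R² and VIF values above up to rounding:

```python
import numpy as np

# Example 9 data, copied from the table on slide 176.
x1 = [1.74, 6.32, 6.22, 10.52, 1.19, 1.22, 4.10, 6.32, 4.08, 4.15, 10.15, 1.72, 1.70]
x2 = [5.30, 5.42, 8.41, 4.63, 11.60, 5.85, 6.62, 8.72, 4.42, 7.60, 4.83, 3.12, 5.30]
x3 = [10.8, 9.4, 7.2, 8.5, 9.4, 9.9, 8.0, 9.1, 8.7, 9.2, 9.4, 7.6, 8.2]
X = np.column_stack([x1, x2, x3])
n, k = X.shape

for i in range(k):
    y = X[:, i]                                    # auxiliary regression: x_i on the others
    A = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    print(f"x{i + 1}: R^2 = {r2:.5f}, VIF = {1 / (1 - r2):.5f}")
```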
Example 10
180
Example 10
181
Example 10
182
Cor y x1 x2
y 1 0,98181 0,49287
x1 0,98181 1 0,37841
x2 0,49287 0,37841 1
Example 10
183
For x1, the auxiliary regression equation:
x1 = b0 + b2.x2 + u

i ≠ 1   β          X'Y        β(X'Y)
0       4,48353    206        923,60688
2       0,01132    77177      873,86423
                   Σ β(X'Y) = 1797,47111

SSR = Σ β(X'Y) – (Σx1)²/n = 1797,47111 – (206)²/25 = 1797,47111 – 1697,44000 = 100,03111
SST = Σx1² – (Σx1)²/n = 2396 – (206)²/25 = 2396 – 1697,44000 = 698,56000
SSE = Σx1² – Σ β(X'Y) = 2396 – 1797,47111 = 598,52889

R1² = SSR / SST = 100,03111 / 698,56000 = 0,14320
VIF(β1) = 1 / (1 – R1²) = 1 / (1 – 0,14320) = 1,16713

For x1, VIF < 5 → no multicollinearity
Example 10
184
For x2, the auxiliary regression equation:
x2 = b0 + b1.x1 + u

i ≠ 2   β           X'Y        β(X'Y)
0       227,55165   8294       1887313,3777
1       12,64664    77177      976030,0846
                    Σ β(X'Y) = 2863343,4623

SSR = Σ β(X'Y) – (Σx2)²/n = 2863343,4623 – (8294)²/25 = 2863343,4623 – 2751617,4400 = 111726,0223
SST = Σx2² – (Σx2)²/n = 3531848 – (8294)²/25 = 3531848 – 2751617,4400 = 780230,5600
SSE = Σx2² – Σ β(X'Y) = 3531848 – 2863343,4623 = 668504,5377

R2² = SSR / SST = 111726,0223 / 780230,5600 = 0,14320
VIF(β2) = 1 / (1 – R2²) = 1 / (1 – 0,14320) = 1,16713

For x2, VIF < 5 → no multicollinearity
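A quick cross-check (an added note, not from the original slides): with only two explanatory variables, each auxiliary regression is a simple regression, so Ri² equals the squared simple correlation between x1 and x2. Using r12 = 0,37841 from the correlation table of Example 10:

R1² = R2² = (0,37841)² ≈ 0,14320,  VIF = 1 / (1 – 0,14320) ≈ 1,16713

which matches the values computed above for both x1 and x2.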
Uji Asumsi Klasik: no multicollinearity
Multicollinearity consequences:
 The β coefficient estimates will still remain unbiased
 Estimates will still be centered around the true values.
 R2 will be high but the individual coefficients will have high standard errors
 The variances and standard errors of the estimates will increase
 Harder to distinguish the effect of one variable from the effect of another, so much
more likely to make large errors in estimating the βs than without multicollinearity.
 As a result, the estimated coefficients, although still unbiased, now come from
distributions with much larger variances and, therefore, larger standard errors.
 The computed t-scores will fall.
 Variance and standard error are increased.
 A relatively high R2 in an equation with few significant t statistics.
 Thus confidence intervals for the estimates will be very wide, and significance
tests might therefore give inappropriate conclusions.
185
Uji Asumsi Klasik: no multicollinearity
Multicollinearity consequences:
 Estimates will become very sensitive to changes in specification.
 The addition or deletion of an explanatory variable or of a few observations will
often cause major changes in the values of the βs when significant
multicollinearity exists
 For example, if you drop a variable, even one that appears to be statistically
insignificant, the coefficients of the remaining variables in the equation sometimes
will change dramatically
 This is again because with multicollinearity, it is much harder to distinguish the
effect of one variable from the effect of another (holding all else constant)
 The overall fit of the equation and the estimation of the coefficients of
nonmulticollinear variables will be largely unaffected.
 If the multicollinearity occurs in the population as well as the sample, then the
predictive power of the model is unaffected.
186
Uji Asumsi Klasik: no multicollinearity
Methods to correct or remedy multicollinearity
Do nothing:
a. Multicollinearity will not necessarily reduce the t-scores enough to
make them statistically insignificant and/or change the estimated
coefficients to make them differ from expectations
b. The deletion of a multicollinear variable that belongs in an equation
will cause specification bias
Drop a redundant variable:
a. Viable strategy when two variables measure essentially the same
thing
b. Always use theory as the basis for this decision!
187
Uji Asumsi Klasik: no multicollinearity
Methods to correct or remedy multicollinearity
Increase the sample size:
a. This is frequently impossible but a useful alternative to be
considered if feasible
b. The idea is that the larger sample normally will reduce the variance
of the estimated coefficients, diminishing the impact of the
multicollinearity
188
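A minimal sketch of the "drop a redundant variable" remedy (an added illustration on synthetic data, assuming NumPy): the overall fit barely changes when the near-duplicate regressor is removed, while the remaining coefficients become easier to interpret.

```python
import numpy as np

def fit_ols(X, y):
    """OLS with an intercept; returns the coefficient vector and R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid @ resid / np.sum((y - np.mean(y)) ** 2)
    return beta, r2

rng = np.random.default_rng(2)
x1 = rng.normal(size=40)
x2 = x1 + rng.normal(scale=0.05, size=40)      # x2 nearly duplicates x1
x3 = rng.normal(size=40)
y = 2 * x1 + 0.5 * x3 + rng.normal(size=40)

beta_full, r2_full = fit_ols(np.column_stack([x1, x2, x3]), y)
beta_drop, r2_drop = fit_ols(np.column_stack([x1, x3]), y)   # remedy: drop x2
print("full model :", np.round(beta_full, 2), " R2 =", round(r2_full, 3))
print("x2 dropped :", np.round(beta_drop, 2), " R2 =", round(r2_drop, 3))
```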
189
Terima kasih ...
... Ada pertanyaan ???

Modul Ajar Statistika Inferensia ke-12: Uji Asumsi Klasik pada Regresi Linier Berganda

  • 1.
    Analisa Regresi: Uji AsumsiKlasik ARIF RAHMAN 1
  • 2.
    Statistika Statistika adalah cabangilmu matematika yang mempelajari metode ilmiah untuk mengumpulkan, mengorganisasi, merangkum, menyederhanakan, menyajikan, menginterpretasikan, menganalisa dan mensintesa data (numerik atau nonnumerik) untuk menghasilkan informasi dan/atau kesimpulan, yang membantu dalam penyelesaian masalah dan/atau pengambilan keputusan. 2
  • 3.
    Statistika 3 Mengorganisasi, Merangkum, Menyederhanakan, Menyajikan, Menginterpretasikan Menganalisa Mensintesa Mengumpulkan data Menghasilkan informasidan/atau kesimpulan Menggeneralisasi Mengestimasi, Menguji hipotesa, Menilai relasi, Memprediksi Menyelesaikan masalah Mengambil keputusan
  • 4.
    Statistika Inferensia Statistika inferensiaadalah cabang statistika yang menganalisa atau mensintesa data untuk menggeneralisasi sampel terhadap populasi, mengestimasi parameter, menguji hipotesa, menilai relasi, dan membuat prediksi untuk menghasilkan informasi dan/atau kesimpulan. Terdapat banyak alat bantu statistika (statistical tools) yang dapat dipergunakan untuk menginferensi populasi atau sistem yang menjadi sumber asal data sampel 4
  • 5.
    Statistika Inferensia 5 Tujuan studiterhadap populasi Observasi atau eksperimen pada sampel SAMPLING INFERENSI Parameter : N (banyaknya anggota populasi), μ (rata-rata populasi), σ (simpangan baku populasi), π (proporsi populasi) Statistik : n (banyaknya anggota sampel), ẋ (rata-rata sampel), s (simpangan baku sampel), p (proporsi sampel)
  • 6.
    Tipe Data Data Nominal,data yang hanya berupa simbol (meski berupa angka) untuk membedakan nilainya tanpa menunjukkan tingkatan Data Ordinal, data yang mempunyai nilai untuk menunjukkan tingkatan, namun tanpa skala yang baku dan jelas antar tingkatan. Data Interval, data yang mempunyai nilai untuk menunjukkan tingkatan dengan skala tertentu sesuai intervalnya. Nilai nol hanya untuk menunjukkan titik acuan (baseline). Data Rasio, data yang mempunyai nilai untuk menunjukkan tingkatan dengan skala indikasi rasio perbandingan. Nilai nol menunjukkan titik asal (origin) yang bernilai kosong (null). 6
  • 7.
    Tipe Data Data Parametrik,data kuantitatif yang mempunyai sebaran variabel acak mengikuti pola distribusi probabilitas dengan parameter tertentu (independent and identically distributed random variables) Data Nonparametrik, data yang tidak mempunyai distribusi probabilitas (distribution-free) 7
  • 8.
    Tipe Data Data Diskrit,data hasil pencacahan atau penghitungan, sehingga biasanya dalam angka bilangan bulat. Data Kontinyu, data hasil pengukuran yang memungkinkan dalam angka bilangan nyata (meskipun dapat pula dibulatkan) 8
  • 9.
    Statistika Alat BantuProblem Solving 9 Penting memperhatikan cara memperoleh data yang akan diolah Demikian pula cara mengolah data juga penting diperhatikan
  • 10.
    Statistika Alat BantuProblem Solving 10 Metode statistika bukan ramuan sihir Alat statistika bukan tongkat sihir
  • 11.
  • 12.
    Akurasi dan Presisi Akurasi(accuracy), kesesuaian hasil pengukuran terhadap nilai obyek sesungguhnya (bias kecil) Presisi (precision), tingkat skala ketelitian pengukuran dari alat pengukur, atau ketersebaran yang relatif mengumpul (variansi atau deviasi kecil) 12
  • 13.
    Akurat dan Presisi Tidakpresisi, akibat pola sebaran sampel lebih melebar daripada pola sebaran populasi menyebabkan deviasi yang besar. Tidak akurat, akibat pergeseran pemusatan sampel menjauh dari pemusatan populasi menyebabkan bias yang besar. Akurat dan presisi, bias dan deviasi kecil, membutuhkan sampel sedikit. 13
  • 14.
    Kesalahan Pengambilan Kesimpulan Galattipe 1 () : kesalahan menyimpulkan karena menolak hipotesa yang semestinya diterima Galat tipe 2 () : kesalahan menyimpulkan karena menerima hipotesa yang semestinya ditolak 14  
  • 15.
    Kesalahan Pengambilan Kesimpulan 15 Thetrue state of nature Decision H0 is true H0 is false Reject H0 Type I error Exact decision Fail to reject H0 Exact decision Type II error The true state of nature Decision H0 is true H0 is false Reject H0  1 –  Fail to reject H0 1 –  
  • 16.
    Ukuran Ketelitian Pendugaan Tingkatkeberartian (significance level, ), probabilitas penolakan data observasi, karena menyimpang signifikan terhadap sasaran. Tingkat kepercayaan (confidence coefficient,1-), persentase data observasi yang diyakini tidak berbeda signifikan dengan target. Kuasa statistik (power,1-), persentase data observasi yang diyakini berbeda signifikan dengan target. Derajat kebebasan (degree of freedom, df=n-k), besaran yang menunjukkan bebas terhadap bias dari n data observasi. 16
  • 17.
    Kekeliruan pada AnalisaRegresi  Tidak ada logika penalaran atau alasan logis yang mendasari hipotesa variabel bebas mempengaruhi variabel terikat.  Deskripsi dari variabel bebas tidak mempunyai hubungan kausal dengan deskripsi dari variabel terikat.  Pengukuran atau pengumpulan data dilakukan oleh/dari pihak yang mempunyai konflik kepentingan atau yang tidak punya kewenangan atas data.  Variabel bebas dan/atau variabel terikat diukur pada hanya satu obyek yang bernilai tunggal atau statis.  Rentang data sampel sangat sempit, namun dipergunakan untuk menginduksi rentang populasi yang sangat lebar dengan ekstrapolasi. 17
  • 18.
    Kekeliruan pada AnalisaRegresi  Variabel bebas (x) = tinggi badan anak-anak di desa A (pertumbuhan tiap tahunnya); Variabel terikat (y) = harga emas (kenaikan tiap tahunnya). Meskipun jika dihitung, menunjukkan pertumbuhan tinggi badan anak-anak di desa A mempunyai korelasi kuat dengan kenaikan harga emas.  Variabel bebas (x) = motivasi kerja; Variabel terikat (y) = kinerja. Deskripsi “motivasi kerja” adalah mendapatkan pengakuan dari keluarga besar karena menjadi karyawan pabrik.Deskripsi “kinerja” adalah waktu penyelesaian pekerjaan lebih cepat daripada batas waktu yang ditargetkan.  Data yang diukur : Banyaknya pelanggaran yang dilakukan. Ditanyakan pada pelaku pelanggaran, atau pada pihak yang tidak pernah melihat pelaku pelanggaran.  Data yang diukur : Aturan pengupahan dari satu perusahaan di satu waktu.  Rentang data sampel pada 10 < x < 50, namun dipergunakan untuk menduga nilai Y jika x=200. 18
  • 19.
  • 20.
  • 21.
    Perbedaan Korelasi danRegresi 21 Correlation Regression
  • 22.
    Perbedaan Korelasi danRegresi 22 Correlation Regression
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
    Asumsi Klasik (classical linearregression model assumptions) 34
  • 35.
    The Gauss-Markov Theorem GivenClassical Assumptions, the ordinary least squares (OLS) estimator βk is the minimum variance estimator from among the set of all linear unbiased estimators of βk. In other words, OLS is BLUE Best Linear Unbiased Estimator Where Best = Minimum Variance 35
  • 36.
    Asumsi Klasik dalamRegresi 1. The regression model is linear, y = Xβ + ε, correctly specified and has an additive error term or residual, defined by ei = yi – ŷi.  linearity 2. The number of observations must be greater than the number of parameters to be estimated, n > k. 3. There must be variability in data value for the response variable and explanatory variable(s). 4. The X matrix is non-stochastic. X values are fixed in repeated sampling. 5. All explanatory variables are uncorrelated with the error term, Cov(xi, ei) = 0. 36
  • 37.
    Asumsi Klasik dalamRegresi 6. Observations of the error term are uncorrelated with each other  no serial correlation or autocorrelation. 7. No explanatory variable is a perfect linear function of any other explanatory variables.  no perfect multicollinearity 8. The error term is normally distributed, ~N(0,σ2)  normality 9. The error term has a zero population mean, E(ei)= 0.  unbiased estimator 10.The error term has a constant variance, Var(ei) = σ2 < .  homoskedasticity or no heteroskedasticity 37
  • 38.
    Uji Asumsi Klasik: Linearity 38
  • 39.
    Uji Asumsi Klasik:linearity 39
  • 40.
    Uji Asumsi Klasik:linearity 1. Select baseline set of levels for each explanatory variables (x10, x20, ..., xk0). 2. Successively estimate revised response variable (y'i•) for each explanatory variable (xi•) with the other explanatory variables held constant at the baseline level (xj• = xj0 for j ≠ i). 3. Plot the data (xi•, y'i•) and the regression line with the other explanatory variables held constant. 40   ) ( ) ( ) ( 0 20 2 2 10 1 1 k k k i i x x x x x x y y                     
  • 41.
    Uji Asumsi Klasik:linearity 41
  • 42.
    Uji Asumsi Klasik:linearity 42
  • 43.
    Uji Asumsi Klasik:linearity 43
  • 44.
  • 45.
    Example 1 45 Response SurfaceMethodology 3 2 1 . 3433 , 0 . 8616 , 1 . 0161 , 1 1574 , 39 x x x y    
  • 46.
    Example 1 1. Selectbaseline set of levels for each explanatory variables (x10, x20, x30). x10 = 6 x20 = 6 x30 = 9 46
  • 47.
    Example 1 2. Estimaterevised response variable (y'i•) 47 x1 x2 x3 y x10 x20 x30 y'1 y‘2 y‘3 1,74 5,30 10,8 25,5 6 6 9 24,815 30,446 29,588 6,32 5,42 9,4 31,2 6 6 9 30,258 31,012 30,676 6,22 8,41 7,2 25,9 6 6 9 29,769 25,059 26,504 10,52 4,63 8,5 38,4 6 6 9 35,678 33,636 33,337 1,19 11,60 9,4 18,4 6 6 9 28,963 23,425 25,210 1,22 5,85 9,9 26,7 6 6 9 26,730 31,866 31,505 4,10 6,62 8,0 26,4 6 6 9 27,211 27,987 28,543 6,32 8,72 9,1 25,9 6 6 9 30,998 25,609 26,509 4,08 4,42 8,7 32,0 6 6 9 28,956 33,848 33,409 4,15 7,60 9,2 25,2 6 6 9 28,247 27,148 27,629 10,15 4,83 9,4 39,7 6 6 9 37,659 35,620 35,082 1,72 3,12 7,6 35,7 6 6 9 29,858 39,568 39,060 1,70 5,30 8,2 26,5 6 6 9 24,922 30,595 30,629 59,43 81,82 115,4 377,5     678 , 35 ) 9 5 , 8 ( 3433 , 0 ) 6 63 , 4 ( 0161 , 1 4 , 38 ) ( ) ( 30 34 3 20 24 2 14 14                  x x x x y y  
  • 48.
    Example 1 3. Plotthe data (xi•, y'i•) and the regression line. 48 Estimated Regression Equation 3 2 1 . 3433 , 0 . 8616 , 1 . 0161 , 1 1574 , 39 x x x y    
  • 49.
  • 50.
  • 51.
  • 52.
    Example 2 1. Selectbaseline set of levels for each explanatory variables (x10, x20). x10 = 10 x20 = 300 52
  • 53.
    Example 2 2. Estimaterevised response variable (y'i•) 53 x1 x2 y x1 x2 y'1 y‘2 2 50 9,95 10 300 13,082 31,904 8 110 24,45 10 300 26,830 29,939 11 120 31,75 10 300 34,005 29,006 10 550 35,00 10 300 31,868 35,000 8 295 25,02 10 300 25,083 30,509 4 200 16,86 10 300 18,113 33,326 2 375 14,38 10 300 13,440 36,334 2 52 9,60 10 300 12,707 31,554 9 100 24,35 10 300 26,856 27,094 8 300 27,50 10 300 27,500 32,989 4 412 17,08 10 300 15,677 33,546 11 400 37,00 10 300 35,747 34,256 12 500 41,95 10 300 39,444 36,461 2 360 11,66 10 300 10,908 33,614 4 205 21,65 10 300 22,840 38,116 4 400 17,89 10 300 16,637 34,356 20 600 69,00 10 300 65,242 41,557 1 585 10,30 10 300 6,730 34,998 10 540 34,93 10 300 31,923 34,930 15 250 46,59 10 300 47,216 32,869 15 290 44,88 10 300 45,005 31,159 16 510 54,12 10 300 51,489 37,654 17 590 56,63 10 300 52,997 37,420 6 100 22,13 10 300 24,636 33,107 5 400 21,15 10 300 19,897 34,871 206 8294 725,82     868 , 31 ) 300 550 ( 01253 , 0 00 , 35 ) ( 20 24 2 14 14           x x y y 
  • 54.
    Example 2 3. Plotthe data (xi•, y'i•) and the regression line. 54 Estimated Regression Equation 2 1 . 01253 , 0 . 74427 , 2 26379 , 2 x x y   
  • 55.
    Uji Asumsi Klasik: Normality 55
  • 56.
    Uji Asumsi Klasik:normality The error term is normally distributed, ~N(0,σ2), with a zero population mean, E(ei)= 0, and a constant variance, Var(ei) = σ2. It needs to assume normality for parametric hypothesis testing based on gaussian distribution. Normality assumption can be tested using goodness of fit test or the Bera Jarque normality test. 56
  • 57.
    Uji Asumsi Klasik:normality Goodness of Fit Test  Chi Square goodness of fit test  Kolmogorov–Smirnov test  Liliefors test  Geary test  Anderson–Darling test  Shapiro–Wilk test  Bayesian information criterion  Cramér–von Mises criterion  Akaike information criterion  Kuiper's test  Moran test  Hosmer–Lemeshow test 57
  • 58.
    Uji Asumsi Klasik:normality The Bera Jarque normality test A normal distribution is not skewed and is defined to have a coefficient of kurtosis of 3. The kurtosis of the normal distribution is 3 so its excess kurtosis (b2-3) is zero. Skewness and kurtosis are the (standardised) third and fourth moments of a distribution. 58
  • 59.
    Uji Asumsi Klasik:normality 59  3 2 3 s m skewness  Mo Me ͞x Mo Me ͞x ͞x Skewness = 0 Skewness > 0 Skewness < 0 Positive or right skew Mo Me Symmetric Negative or left skew  Normal = 0
  • 60.
    Uji Asumsi Klasik:normality 60  2 2 4 s m kurtosis Leptokurtik Mesokurtik Platikurtik Kurtosis > 3 Kurtosis = 3 Kurtosis < 3 Normal  Normal = 3
  • 61.
    Uji Asumsi Klasik:normality The Bera Jarque normality test Bera and Jarque testing the residuals for normality by testing whether the coefficient of skewness and the coefficient of excess kurtosis are jointly zero. It can be proved that the coefficients of skewness and kurtosis can be expressed respectively as: and The Bera Jarque test statistic is given by 61   b E u 1 3 2 3 2  [ ] /    b E u 2 4 2 2  [ ]      2 ~ 24 3 6 2 2 2 2 1           b b T W
  • 62.
    Uji Asumsi Klasik: No Serial Correlation or Autocorrelation 62
  • 63.
    Uji Asumsi Klasik:no autocorrelation Observations of the error term are uncorrelated with each other. Autocorrelation or serial correlation is a violation of the classical assumption that assumes uncorrelated observations of the error term. Issue: Is et related to et-1? Such would be the case in a time-series when a random shock has an impact over a number of time periods. 63
  • 64.
    Uji Asumsi Klasik:no autocorrelation It assumed the error terms are not interrelated, i.e. Cov (ei , ej) = 0 for i≠j. If the covariance of two error terms is not equal to zero, then autocorrelation exists. This is essentially the same as saying there is no pattern in the error term. If there are patterns in the error term from a model, we say that they are autocorrelated. 64
  • 65.
    Uji Asumsi Klasik:no autocorrelation 65
  • 66.
    Uji Asumsi Klasik:no autocorrelation 66
  • 67.
    Uji Asumsi Klasik:no autocorrelation 67
  • 68.
    Uji Asumsi Klasik:no autocorrelation Autocorrelation can be categorized into 2 kinds: Pure autocorrelation (autocorrelation that that occurs when classical assumption, which assumes uncorrelated observations of the error term, is violated). Impure autocorrelation (autocorrelation that is caused by specification errors: omitted variables or incorrect functional form). Autocorrelation mostly happens in a data set where order of observations has some meaning (e.g. time- series data). 68
  • 69.
    Uji Asumsi Klasik:no autocorrelation The most commonly assumed kind of autocorrelation is first- order autocorrelation, in which the posterior value of the error term is a function of the prior value of the error term: et = ρ.et–1 + ut where: et = the error term in the model at time period t ρ = the first-order autocorrelation coefficient depicting the functional relationship between observations of the error term. ut = a classical (not serially correlated) error term 69
  • 70.
    Uji Asumsi Klasik:no autocorrelation The magnitude of ρ indicates the strength of the autocorrelation or serial correlation: If ρ is zero, ρ ≈ 0, there is no autocorrelation As ρ approaches one in absolute value, |ρ| ≈ 1, there is significant autocorrelation For ρ to exceed one is unreasonable, since the error term effectively would “explode” As a result of this, we can state that: –1 < ρ < +1 70
  • 71.
    Uji Asumsi Klasik:no autocorrelation ρ < 0 indicates negative autocorrelation (the signs of the error term switch back and forth). ρ > 0 indicates positive autocorrelation (a positive error term tends to be followed by a positive error term and a negative error term tends to be followed by a negative error term). Positive autocorrelation is more common than negative autocorrelation. Situations where negative autocorrelation occurs are not often encountered. 71
  • 72.
    Uji Asumsi Klasik:no autocorrelation  Examples of higher order autocorrelation: 1. Seasonal autocorrelation: et = .et-4 + ut 2. Second-order autocorrelation: et = 1.et-1 + 2.et-2 + ut 2. r-th-order autocorrelation: et = 1.et-1 + 2.et-2 + . . . + r.et-r + ut 72
  • 73.
    Uji Asumsi Klasik:no autocorrelation Some tests to detect autocorrelation or serial correlation The Graphical Run Test The Durbin Watson Test The Breusch-Godfrey Test The Box-Pierce Q Test The Cumby-Huizinga test The Ljung-Box Q Test The Portmanteau Test The Lagrange Multiplier Test 73
  • 74.
    Uji Asumsi Klasik:no autocorrelation The Durbin-Watson (DW) is a test for first order autocorrelation - i.e. it assumes that the relationship is between an error and the previous one et = ρ.et–1 + ut where ut  N(0, u 2). The DW test statistic actually tests H0 : =0 and H1 : 0 The test statistic is calculated by 74 ) 1 ( 2 ) ( ) . ( 2 ) ( ) ( ) ( ) ( 2 2 2 1 2 2 1 2 2 2 2 2 2 1                        T t t T t t t T t t T t t T t t T t t t e e e e e e e e DW
  • 75.
    Uji Asumsi Klasik:no autocorrelation We can also write DW ≈ 2(1 – ρ) and where ρ is the estimated correlation coefficient. Since ρ is a correlation, it implies that –1 < ρ < 1. Subtituting ρ by DW would give 0 < DW < 4. If ρ ≈ 0, DW ≈ 2. There is little evidence to reject the null hypothesis if DW is near 2  the error terms are not autocorrelated If ρ ≈ 1, DW ≈ 0. There is significant evidence to reject the null hypothesis if DW is near 0  the error terms are positive autocorrelated If ρ ≈ -1, DW ≈ 4. There is significant evidence to reject the null hypothesis if DW is near 4  the error terms are negative autocorrelated 75 2 1 ) ( ) . ( 2 2 2 1 DW e e e T t t T t t t         
  • 76.
    Uji Asumsi Klasik:no autocorrelation  Unfortunately, DW has 2 critical values, an upper critical value (dU) and a lower critical value (dL), and there is also an intermediate region (inconclusive) where we can neither reject nor not reject H0.  The decision procedure is as follows: If DW < dL reject H0 : ρ = 0, strong positive autocorrelation If dL < DW < dU inconclusive, weak positive autocorrelation If dU < DW < 4-dU do not reject H0 : ρ = 0, no autocorrelation If 4-dU < DW < 4-dL inconclusive, weak negative autocorrelation If DW > 4-dL reject H0 : ρ = 0, strong negative autocorrelation 76
  • 77.
    Uji Asumsi Klasik:no autocorrelation 77 Conditions which Must be Fulfilled for DW to be a Valid Test 1. Constant term in regression 2. Regressors are non-stochastic 3. No lags of dependent variable
  • 78.
  • 79.
  • 80.
    Example 3 80 x1 x2x3 y y pred e 1,74 5,30 10,8 25,5 27,35 -1,851 6,32 5,42 9,4 31,2 32,26 -1,062 6,22 8,41 7,2 25,9 27,35 -1,450 10,52 4,63 8,5 38,4 38,31 0,090 1,19 11,60 9,4 18,4 15,54 2,855 1,22 5,85 9,9 26,7 26,11 0,592 4,10 6,62 8,0 26,4 28,25 -1,853 6,32 8,72 9,1 25,9 26,22 -0,322 4,08 4,42 8,7 32,0 32,09 -0,088 4,15 7,60 9,2 25,2 26,07 -0,868 10,15 4,83 9,4 39,7 37,25 2,448 1,72 3,12 7,6 35,7 32,49 3,212 1,70 5,30 8,2 26,5 28,20 -1,703 59,43 81,82 115,4 377,5 377,50 0,00
  • 81.
    Example 3 81 e(t-1) e(t)(e(t))2 (e(t) – e(t-1)) (e(t) – e(t-1))2 -1,851 -1,062 1,129 0,789 0,623 -1,062 -1,450 2,101 -0,387 0,150 -1,450 0,090 0,008 1,540 2,372 0,090 2,855 8,153 2,765 7,644 2,855 0,592 0,350 -2,263 5,123 0,592 -1,853 3,434 -2,445 5,978 -1,853 -0,322 0,104 1,531 2,345 -0,322 -0,088 0,008 0,234 0,055 -0,088 -0,868 0,753 -0,779 0,608 -0,868 2,448 5,991 3,315 10,991 2,448 3,212 10,317 0,764 0,584 3,212 -1,703 2,901 -4,915 24,160 1,70 5,30 1,129 0,789 0,623 35,249 0,148 60,633 720 , 1 249 , 35 633 , 60 ) ( ) ( 2 2 2 2 1          T t t T t t t e e e DW 748 , 0 ) 82 , 0 00 , 1 ( 15 20 15 13 82 , 0       L d 778 , 1 ) 75 , 1 68 , 1 ( 15 20 15 13 75 , 1       U d
  • 82.
  • 83.
  • 84.
  • 85.
  • 86.
    Example 4 86 x1 x2y ypred e 2 50 9,95 8,379 1,571 8 110 24,45 25,596 -1,146 11 120 31,75 33,954 -2,204 10 550 35,00 36,597 -1,597 8 295 25,02 27,914 -2,894 4 200 16,86 15,746 1,114 2 375 14,38 12,450 1,930 2 52 9,60 8,404 1,196 9 100 24,35 28,215 -3,865 8 300 27,50 27,976 -0,476 4 412 17,08 18,402 -1,322 11 400 37,00 37,462 -0,462 12 500 41,95 41,459 0,491 2 360 11,66 12,262 -0,602 4 205 21,65 15,809 5,841 4 400 17,89 18,252 -0,362 20 600 69,00 64,666 4,334 1 585 10,30 12,337 -2,037 10 540 34,93 36,472 -1,542 15 250 46,59 46,560 0,030 15 290 44,88 47,061 -2,181 16 510 54,12 52,561 1,559 17 590 56,63 56,308 0,322 6 100 22,13 19,982 2,148 5 400 21,15 20,996 0,154 206 8294 725,82 725,820 0,000
  • 87.
    Example 4 87 e(t-1) e(t)(e(t))2 (e(t) – e(t-1)) (e(t) – e(t-1))2 1,571 -1,146 1,313 -2,717 7,384 -1,146 -2,204 4,858 -1,058 1,120 -2,204 -1,597 2,550 0,607 0,369 -1,597 -2,894 8,373 -1,297 1,682 -2,894 1,114 1,240 4,007 16,058 1,114 1,930 3,724 0,816 0,666 1,930 1,196 1,431 -0,734 0,538 1,196 -3,865 14,938 -5,061 25,616 -3,865 -0,476 0,227 3,389 11,483 -0,476 -1,322 1,749 -0,846 0,716 -1,322 -0,462 0,213 0,860 0,740 -0,462 0,491 0,241 0,953 0,908 0,491 -0,602 0,363 -1,093 1,196 -0,602 5,841 34,116 6,443 41,516 5,841 -0,362 0,131 -6,203 38,476 -0,362 4,334 18,785 4,696 22,054 4,334 -2,037 4,149 -6,371 40,589 -2,037 -1,542 2,376 0,495 0,245 -1,542 0,030 0,001 1,572 2,470 0,030 -2,181 4,756 -2,211 4,889 -2,181 1,559 2,430 3,740 13,985 1,559 0,322 0,104 -1,236 1,529 0,322 2,148 4,613 1,826 3,333 2,148 0,154 0,024 -1,994 3,976 112,705 -1,418 241,537 143 , 2 705 , 112 537 , 241 ) ( ) ( 2 2 2 2 1          T t t T t t t e e e DW
  • 88.
  • 89.
    Uji Asumsi Klasik:no autocorrelation The Breusch-Godfrey test is a test for r-th order autocorrelation - i.e. it assumes that the relationship is between an error and the previous ones et = ρ1.et–1 + ρ2.et–2 + . . . + ρr.et–r + ut where ut  N(0, u 2). The Breusch-Godfrey test statistic actually tests H0 : ρ1=0 and ρ2=0 and . . . and ρr=0 H1 : 10 or 20 or . . . or r0 89
  • 90.
    Uji Asumsi Klasik:no autocorrelation The Breusch-Godfrey test has three basic steps: 1. Obtain the residuals of the estimated regression equation: ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki) 2. Use the first regression and the residuals to form the residual model in a second regression: et = b0 + b1x1t + b2x2t + ... + bkxkt + ρ1.et–1 + ρ2.et–2 + . . . + ρr.et–r + ut 90
  • 91.
    Uji Asumsi Klasik:no autocorrelation The Breusch-Godfrey test has three basic steps: 3. Obtain R2 from this regression. (degree of freedom, ν = n-r-2) : H0: (n-r)R2 < χ2 (No autocorrelated errors) H1: (n-r)R2 > χ2 (Autocorrelated errors) If the test statistic exceeds the critical value from the statistical tables, reject the null hypothesis of no autocorrelation. 91
  • 92.
  • 93.
    Uji Asumsi Klasik:no autocorrelation Autocorrelation consequences:  Autocorrelation does not cause bias in the β coefficient estimates  Autocorrelation increases the variance of the β coefficient estimates  Autocorrelation causes the dependent variable to fluctuate in a fashion that the estimation procedure (OLS) attributes to the independent variables. Hence the variance of the estimates of β increases. These estimates are still unbiased, since over-estimation and under-estimation are still as likely  Autocorrelation causes OLS to underestimate the variances (and standard errors) of the β coefficients.  Intuitively - Autocorrelation increases the fit of the model. Hence the estimation of the variance and standard errors is lower. This can lead the researcher to conclude a relationship exists when in fact they are unrelated.  Hence the t-stats and F-stats can not be relied upon for statistical inference.  Spurious Regressions. 93
  • 94.
    Uji Asumsi Klasik:no autocorrelation Methods to correct or remedy autocorrelation Use the Generalized Least Squares to restore the minimum variance property of the OLS estimation. Use the Newey-West standard errors Use the Cochrane-Orcutt method Use the Hildreth-Lu Procedure Use the first order Autoregressive, AR(1), model Use the Maximum Likelihood approach 94
  • 95.
    Uji Asumsi Klasik: Homoskedasticity or No Heteroskedasticity 95
  • 96.
    Uji Asumsi Klasik:no heteroskedasticity There should be homoskedasticity that the error term has a constant variance, Var(ei) = σ2. Heteroskedasticity is a violation of the classical assumption that assumes the observations of the error term are drawn from distributions that have a constant variance. Issue: Does ei differ across levels of the explanatory variables? Such would be the case in a capacity constraint when any particular state has an impact over the capacity. 96
  • 97.
    Uji Asumsi Klasik:no heteroskedasticity  In homoskedasticity the distribution of the error term has a constant variance, so the observations are continually drawn from the same distribution.  In the simplest heteroskedastic case, discrete heteroskedasticity, there would be two different error term variances, and therefore, two different distributions. One distribution is wider than the other. 97
  • 98.
    Uji Asumsi Klasik:no heteroskedasticity It assumed all explanatory variables are uncorrelated with the error term, i.e. Cov(xi, ei) = 0 for each explanatory variable(s). If the covariance is not equal to zero, then heteroskedasticity exists. This is essentially the same as saying there is no pattern in the error term by explanatory variable(s). If there are patterns from a model, we say that they are correlated. 98
  • 99.
    Uji Asumsi Klasik:no heteroskedasticity 99
  • 100.
    Uji Asumsi Klasik:no heteroskedasticity Heteroskedasticity takes on many more complex forms, however, than the discrete heteroskedasticity case Perhaps the most frequently specified model of pure heteroskedasticity relates the variance of the error term to an exogenous variable Zi as follows: yi = β0 + β1x1i + β2x2i + ... + βkxki + ei VAR(ei) = σ2.f(Zi) where Z, the “proportionality factor”, may or may not be in the equation 100
  • 101.
    Uji Asumsi Klasik:no heteroskedasticity  If the error term is homoskedastic with respect to Zi, the variance of the distribution of the error term is the same (constant) no matter what the value of Zi, as in VAR(ei) = σ2.  If the error term is heteroskedastic with respect to Zi, the variance of the distribution of the error term changes systematically as function of Zi. In this example, the variance is an increasing function of Zi, as in VAR(ei) = σ2Zi 2 101
  • 102.
    Uji Asumsi Klasik:no heteroskedasticity 102
  • 103.
    Uji Asumsi Klasik:no heteroskedasticity 103
  • 104.
    Uji Asumsi Klasik:no heteroskedasticity Heteroskedasticity can be categorized into 2 kinds: Pure heteroskedasticity (heteroskedasticity that occurs when classical assumption, which assumes constant variance of the error term, is violated). Impure heteroskedasticity (heteroskedasticity that is caused by specification errors: omitted variables or incorrect functional form). Heteroskedasticity mostly happens in a data set where any particular state may loosen/tighten the dispersion of response variable (e.g. cross-sectional data). 104
  • 105.
    Uji Asumsi Klasik:no heteroskedasticity Heteroskedasticity can occur in any situations, such as: When there is significant change in the variable(s) of a time- series model. When there is capacity constraint that limit the option When there is different control or consistency When there is different behavior in the different strata of population. When there are different amounts of measurement errors in the sample of different periods or different sub-samples. 105
  • 106.
    Uji Asumsi Klasik:no heteroskedasticity Before using any test for heteroskedasticity, however, ask the following: 1. Are there any obvious specification errors? Is the regression model already correctly specified?  Fix those before testing! 2. Is the subject of the research likely to be afflicted with heteroskedasticity?  The cross-sectional studies with large variations in the size of the dependent variable are particularly susceptible to heteroskedasticity 3. Does a graph of the residuals show any evidence of heteroskedasticity?  Specifically, plot the residuals against a potential Z proportionality factor  In such cases, the graph can often show that heteroskedasticity is or is not likely  Any graph shows an expanding (or contracting) range of the residuals 106
  • 107.
    Uji Asumsi Klasik:no heteroskedasticity Because heteroskedasticity can take on many forms, therefore, there is no specific test to test for heteroskedasticity. Scientists and researchers do not all use the same test for heteroskedasticity since heteroskedasticity takes a number of different forms, and its precise manifestation in a given equation is almost never known. 107
  • 108.
    Uji Asumsi Klasik:no heteroskedasticity Some tests to detect heteroskedasticity  The Graphical Zpredictor-Sresidual Test  The Park Test  The White Test  The Glejser Test  The Levene’s test  The Goldfeld-Quandt Test  The Brown-Forsythe Test  The Harrison-McCabe Test  The Breusch-Pagan Test  The Cook-Weisberg Test 108
  • 109.
    Uji Asumsi Klasik:no heteroskedasticity The Park test is a test for heteroskedasticity - i.e. it assumes that the relationship is between variable(s) and the error yi = β0 + β1x1i + β2x2i + ... + βkxki + ei VAR(ei) = σ2Zi b1  ln(ei 2) = ln(σ2) + b1.ln(Zi) + ui where Z: the “proportionality factor” b1 : the slope of logarithmic model of squared residual ui: a classical (homoskedastic) error term One difficulty with the Park test is the specification of the Z factor. The Z factor may one of the explanatory variables, but not always. 109
  • 110.
    Uji Asumsi Klasik:no heteroskedasticity The Park test has three basic steps: 1. Obtain the residuals of the estimated regression equation: ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki) 2. Use these residuals to form the logarithmic model of squared residual in a second regression: ln(ei 2) = ln(σ2) + b1.ln(Zi) + ui where yi' = ln(ei 2) and xi' = ln(Zi) b0 = ln(σ2) and b1 : the slope of linear model ui : a classical (homoskedastic) error term 110
  • 111.
    Uji Asumsi Klasik:no heteroskedasticity The Park test has three basic steps: 3. Check to see whether b1 is significant or not. Test the significance of the b1 coefficient of Z with t-test (degree of freedom, ν = n-2) : H0: b1 = 0 (Homoskedastic errors) H1: b1 ≠ 0 (Heteroskedastic errors) 111
  • 112.
    Uji Asumsi Klasik:no heteroskedasticity 112 Advantages of the Park test: a. The test is simple. b. It provides information about the variance structure. Limitations of the Park test: a. The distribution of the dependent variable is problematic. b. It assumes a specific functional form. c. It does not work when the variance depends on two or more variables. d. The correct variable with which to order the observations must be identified first. e. It cannot handle partitioned data.
  • 113.
  • 114.
  • 115.
    Example 5 115 x1 x2x3 y y pred e e2 ln(x1) ln(x2) ln(x3) ln(e2) 1,74 5,30 10,8 25,5 27,35 -1,851 3,428 0,554 1,668 2,380 1,232 6,32 5,42 9,4 31,2 32,26 -1,062 1,129 1,844 1,690 2,241 0,121 6,22 8,41 7,2 25,9 27,35 -1,450 2,101 1,828 2,129 1,974 0,743 10,52 4,63 8,5 38,4 38,31 0,090 0,008 2,353 1,533 2,140 -4,807 1,19 11,60 9,4 18,4 15,54 2,855 8,153 0,174 2,451 2,241 2,098 1,22 5,85 9,9 26,7 26,11 0,592 0,350 0,199 1,766 2,293 -1,049 4,10 6,62 8,0 26,4 28,25 -1,853 3,434 1,411 1,890 2,079 1,234 6,32 8,72 9,1 25,9 26,22 -0,322 0,104 1,844 2,166 2,208 -2,267 4,08 4,42 8,7 32,0 32,09 -0,088 0,008 1,406 1,486 2,163 -4,857 4,15 7,60 9,2 25,2 26,07 -0,868 0,753 1,423 2,028 2,219 -0,284 10,15 4,83 9,4 39,7 37,25 2,448 5,991 2,317 1,575 2,241 1,790 1,72 3,12 7,6 35,7 32,49 3,212 10,317 0,542 1,138 2,028 2,334 1,70 5,30 8,2 26,5 28,20 -1,703 2,901 0,531 1,668 2,104 1,065 59,43 81,82 115,4 377,5 377,50 0,00 38,676 16,426 23,188 28,311 -2,647 Regression equation for Park test VAR(ei) = σ2Zi b1 ln(ei 2) = ln(σ2) + b1.ln(Zi) + ui where Zi = x1i , x2i , x3i
  • 116.
    Example 5 116 X ln(x1) Y ln(e2) X2 XYY2 0,554 1,232 0,307 0,682 1,518 1,844 0,121 3,399 0,223 0,015 1,828 0,743 3,341 1,357 0,551 2,353 -4,807 5,538 -11,311 23,102 0,174 2,098 0,030 0,365 4,403 0,199 -1,049 0,040 -0,209 1,100 1,411 1,234 1,991 1,741 1,522 1,844 -2,267 3,399 -4,180 5,141 1,406 -4,857 1,977 -6,829 23,587 1,423 -0,284 2,025 -0,404 0,081 2,317 1,790 5,371 4,149 3,205 0,542 2,334 0,294 1,266 5,447 0,531 1,065 0,282 0,565 1,134 16,426 -2,647 27,993 -12,585 70,806   239 . 7 13 426 . 16 993 . 27 2 2 2        n x x SXX   241 . 9 13 ) 647 . 2 ( 426 . 16 585 . 12             n y x xy SXY   267 . 70 13 ) 647 . 2 ( 806 . 70 2 2 2         n y y SYY i i i i i i b i i u x e u x b e x e          ) ln( 277 . 1 409 . 1 ) ln( ) ln( ) ln( ) ln( ) VAR( 1 2 1 1 2 2 1 1 2   277 . 1 239 . 7 241 . 9 1      XX XY S S b 409 . 1 13 ) 426 . 16 ) 277 . 1 (( ) 647 . 2 ( . ) ln( 1 2           n x b y 
  • 117.
    Example 5 117 H0: b1= 0 (Homoskedastic errors) H1: b1 ≠ 0 (Heteroskedastic errors) 316 . 5 2 13 )) 241 , 9 ( ) 277 . 1 (( 267 . 70 2 . 1 2           n S b S s XY YY i i i u x e    ) ln( 277 . 1 409 . 1 ) ln( 1 2 48971 . 1 239 . 7 316 . 5 0 277 . 1 0 2 1        XX S s b t 267 . 70 241 . 9 239 . 7     YY XY XX S S S Degree of freedom: ν = n – 2 = 13 – 2 = 11 P-value = 2 X P(t < -1.48971) = 2 X 0.0822 = 0.1644 Conclusion: P-value (0.1644) > Sign.level (0.05). There is no evidence to reject H0. The error term is Homoskedastic with respect to x1.
  • 118.
    Example 5 118 X ln(x2) Y ln(e2) X2 XYY2 1,668 1,232 2,781 2,054 1,518 1,690 0,121 2,856 0,204 0,015 2,129 0,743 4,534 1,581 0,551 1,533 -4,807 2,349 -7,366 23,102 2,451 2,098 6,007 5,143 4,403 1,766 -1,049 3,120 -1,853 1,100 1,890 1,234 3,572 2,332 1,522 2,166 -2,267 4,690 -4,910 5,141 1,486 -4,857 2,209 -7,218 23,587 2,028 -0,284 4,113 -0,576 0,081 1,575 1,790 2,480 2,819 3,205 1,138 2,334 1,295 2,656 5,447 1,668 1,065 2,781 1,776 1,134 23,188 -2,647 42,789 -3,356 70,806   430 . 1 13 188 . 23 789 . 42 2 2 2        n x x SXX   364 . 1 13 ) 647 . 2 ( 188 . 23 356 . 3            n y x xy SXY   267 . 70 13 ) 647 . 2 ( 806 . 70 2 2 2         n y y SYY i i i i i i b i i u x e u x b e x e           ) ln( 954 . 0 905 . 1 ) ln( ) ln( ) ln( ) ln( ) VAR( 2 2 2 1 2 2 1 2 2   954 . 0 430 . 1 364 . 1 1    XX XY S S b 905 . 1 13 ) 188 . 23 954 . 0 ( ) 647 . 2 ( . ) ln( 1 2           n x b y 
  • 119.
    Example 5 119 H0: b1= 0 (Homoskedastic errors) H1: b1 ≠ 0 (Heteroskedastic errors) 270 . 6 2 13 ) 364 . 1 954 . 0 ( 267 . 70 2 . 1 2         n S b S s XY YY i i i u x e     ) ln( 954 . 0 905 . 1 ) ln( 2 2 45556 . 0 430 . 1 270 . 6 0 954 . 0 0 2 1      XX S s b t 267 . 70 364 . 1 430 . 1    YY XY XX S S S Degree of freedom: ν = n – 2 = 13 – 2 = 11 P-value = 2 X P(t > 0.45556) = 2 X 0.3288 = 0.6576 Conclusion: P-value (0.6576) > Sign.level (0.05). There is no evidence to reject H0. The error term is Homoskedastic with respect to x2.
  • 120.
    Example 5 120 X ln(x3) Y ln(e2) X2 XYY2 2,380 1,232 5,662 2,931 1,518 2,241 0,121 5,021 0,271 0,015 1,974 0,743 3,897 1,466 0,551 2,140 -4,807 4,580 -10,286 23,102 2,241 2,098 5,021 4,702 4,403 2,293 -1,049 5,256 -2,404 1,100 2,079 1,234 4,324 2,566 1,522 2,208 -2,267 4,876 -5,007 5,141 2,163 -4,857 4,680 -10,507 23,587 2,219 -0,284 4,925 -0,630 0,081 2,241 1,790 5,021 4,011 3,205 2,028 2,334 4,113 4,733 5,447 2,104 1,065 4,427 2,241 1,134 28,311 -2,647 61,803 -5,913 70,806   149 . 0 13 311 . 28 803 . 61 2 2 2        n x x SXX   149 . 0 13 ) 647 . 2 ( 311 . 28 913 . 5             n y x xy SXY   267 . 70 13 ) 647 . 2 ( 806 . 70 2 2 2         n y y SYY i i i i i i b i i u x e u x b e x e          ) ln( 001 . 1 977 . 1 ) ln( ) ln( ) ln( ) ln( ) VAR( 3 2 3 1 2 2 1 3 2   001 . 1 149 . 0 149 . 0 1      XX XY S S b 977 . 1 13 ) 311 . 28 ) 001 . 1 (( ) 647 . 2 ( . ) ln( 1 2           n x b y 
  • 121.
    Example 5 121 H0: b1= 0 (Homoskedastic errors) H1: b1 ≠ 0 (Heteroskedastic errors) 374 . 6 2 13 )) 149 . 0 ( ) 001 . 1 (( 267 . 70 2 . 1 2           n S b S s XY YY i i i u x e    ) ln( 001 . 1 977 . 1 ) ln( 3 2 15306 . 0 149 . 0 374 . 6 0 001 . 1 0 2 1        XX S s b t 267 . 70 149 . 0 149 . 0     YY XY XX S S S Degree of freedom: ν = n – 2 = 13 – 2 = 11 P-value = 2 X P(t < -0.15306) = 2 X 0.4406 = 0.8812 Conclusion: P-value (0.8812) > Sign.level (0.05). There is no evidence to reject H0. The error term is Homoskedastic with respect to x3.
  • 122.
  • 123.
  • 124.
  • 125.
    Example 6 125 x1 x2y ypred e e2 ln(x1) ln(x2) ln(e2) 2 50 9,95 8,379 1,571 2,469 0,693 3,912 0,904 8 110 24,45 25,596 -1,146 1,313 2,079 4,700 0,273 11 120 31,75 33,954 -2,204 4,858 2,398 4,787 1,581 10 550 35,00 36,597 -1,597 2,550 2,303 6,310 0,936 8 295 25,02 27,914 -2,894 8,373 2,079 5,687 2,125 4 200 16,86 15,746 1,114 1,240 1,386 5,298 0,215 2 375 14,38 12,450 1,930 3,724 0,693 5,927 1,315 2 52 9,60 8,404 1,196 1,431 0,693 3,951 0,358 9 100 24,35 28,215 -3,865 14,938 2,197 4,605 2,704 8 300 27,50 27,976 -0,476 0,227 2,079 5,704 -1,483 4 412 17,08 18,402 -1,322 1,749 1,386 6,021 0,559 11 400 37,00 37,462 -0,462 0,213 2,398 5,991 -1,545 12 500 41,95 41,459 0,491 0,241 2,485 6,215 -1,422 2 360 11,66 12,262 -0,602 0,363 0,693 5,886 -1,014 4 205 21,65 15,809 5,841 34,116 1,386 5,323 3,530 4 400 17,89 18,252 -0,362 0,131 1,386 5,991 -2,032 20 600 69,00 64,666 4,334 18,785 2,996 6,397 2,933 1 585 10,30 12,337 -2,037 4,149 0,000 6,372 1,423 10 540 34,93 36,472 -1,542 2,376 2,303 6,292 0,866 15 250 46,59 46,560 0,030 0,001 2,708 5,521 -6,999 15 290 44,88 47,061 -2,181 4,756 2,708 5,670 1,559 16 510 54,12 52,561 1,559 2,430 2,773 6,234 0,888 17 590 56,63 56,308 0,322 0,104 2,833 6,380 -2,265 6 100 22,13 19,982 2,148 4,613 1,792 4,605 1,529 5 400 21,15 20,996 0,154 0,024 1,609 5,991 -3,745 206 8294 725,82 725,820 0,000 115,173 46,058 139,773 3,190 Regression equation for Park test VAR(ei) = σ2Zi b1 ln(ei 2) = ln(σ2) + b1.ln(Zi) + ui where Zi = x1i , x2i , x3i
  • 126.
    Example 6 126 X ln(x1) Y ln(e2) X2 XYY2 0,693 0,904 0,480 0,626 0,817 2,079 0,273 4,324 0,567 0,074 2,398 1,581 5,750 3,790 2,498 2,303 0,936 5,302 2,155 0,876 2,079 2,125 4,324 4,419 4,516 1,386 0,215 1,922 0,298 0,046 0,693 1,315 0,480 0,911 1,729 0,693 0,358 0,480 0,248 0,128 2,197 2,704 4,828 5,941 7,311 2,079 -1,483 4,324 -3,085 2,201 1,386 0,559 1,922 0,775 0,312 2,398 -1,545 5,750 -3,704 2,387 2,485 -1,422 6,175 -3,534 2,023 0,693 -1,014 0,480 -0,703 1,028 1,386 3,530 1,922 4,893 12,459 1,386 -2,032 1,922 -2,817 4,130 2,996 2,933 8,974 8,787 8,603 0,000 1,423 0,000 0,000 2,024 2,303 0,866 5,302 1,993 0,749 2,708 -6,999 7,334 -18,954 48,987 2,708 1,559 7,334 4,223 2,432 2,773 0,888 7,687 2,461 0,788 2,833 -2,265 8,027 -6,417 5,131 1,792 1,529 3,210 2,739 2,338 1,609 -3,745 2,590 -6,027 14,025 46,058 3,190 100,844 -0,414 127,613   990 . 15 25 058 . 46 844 . 100 2 2 2        n x x SXX   292 . 6 25 190 . 3 058 . 46 414 . 0            n y x xy SXY   206 . 127 25 190 . 3 613 . 127 2 2 2        n y y SYY i i i i i i b i i u x e u x b e x e          ) ln( 393 . 0 853 . 0 ) ln( ) ln( ) ln( ) ln( ) VAR( 1 2 1 1 2 2 1 1 2   393 . 0 990 . 15 292 . 6 1      XX XY S S b 853 . 0 25 ) 058 . 46 ) 393 . 0 (( 190 . 3 . ) ln( 1 2          n x b y 
  • 127.
    Example 6 127 H0: b1= 0 (Homoskedastic errors) H1: b1 ≠ 0 (Heteroskedastic errors) 423 . 5 2 25 )) 292 , 6 ( ) 393 . 0 (( 206 . 127 2 . 1 2           n S b S s XY YY i i i u x e    ) ln( 393 . 0 853 . 0 ) ln( 1 2 67566 . 0 990 . 15 423 . 5 0 393 . 0 0 2 1        XX S s b t 206 . 127 292 . 6 990 . 15     YY XY XX S S S Degree of freedom: ν = n – 2 = 25 – 2 = 23 P-value = 2 X P(t < -0.67566) = 2 X 0.2530 = 0.5060 Conclusion: P-value (0.5060) > Sign.level (0.05). There is no evidence to reject H0. The error term is Homoskedastic with respect to x1.
  • 128.
    Example 6 128 X ln(x2) Y ln(e2) X2 XYY2 3,912 0,904 15,304 3,536 0,817 4,700 0,273 22,095 1,281 0,074 4,787 1,581 22,920 7,567 2,498 6,310 0,936 39,815 5,906 0,876 5,687 2,125 32,342 12,085 4,516 5,298 0,215 28,072 1,140 0,046 5,927 1,315 35,128 7,793 1,729 3,951 0,358 15,612 1,416 0,128 4,605 2,704 21,208 12,452 7,311 5,704 -1,483 32,533 -8,461 2,201 6,021 0,559 36,253 3,364 0,312 5,991 -1,545 35,898 -9,256 2,387 6,215 -1,422 38,621 -8,839 2,023 5,886 -1,014 34,646 -5,968 1,028 5,323 3,530 28,334 18,789 12,459 5,991 -2,032 35,898 -12,176 4,130 6,397 2,933 40,921 18,762 8,603 6,372 1,423 40,597 9,065 2,024 6,292 0,866 39,584 5,445 0,749 5,521 -6,999 30,487 -38,645 48,987 5,670 1,559 32,148 8,842 2,432 6,234 0,888 38,868 5,534 0,788 6,380 -2,265 40,706 -14,451 5,131 4,605 1,529 21,208 7,041 2,338 5,991 -3,745 35,898 -22,438 14,025 139,773 3,190 795,094 9,784 127,613   639 . 13 25 773 . 139 094 . 795 2 2 2        n x x SXX   052 . 8 25 190 . 3 773 . 139 784 . 9           n y x xy SXY   206 . 127 25 190 . 3 613 . 127 2 2 2        n y y SYY i i i i i i b i i u x e u x b e x e          ) ln( 590 . 0 428 . 3 ) ln( ) ln( ) ln( ) ln( ) VAR( 2 2 2 1 2 2 1 2 2   590 . 0 639 . 13 052 . 8 1      XX XY S S b 428 . 3 25 ) 773 . 139 ) 590 . 0 (( 190 . 3 . ) ln( 1 2          n x b y 
  • 129.
    Example 6 129 H0: b1= 0 (Homoskedastic errors) H1: b1 ≠ 0 (Heteroskedastic errors) 324 . 5 2 25 )) 052 , 8 ( ) 590 . 0 (( 206 . 127 2 . 1 2           n S b S s XY YY i i i u x e    ) ln( 590 . 0 428 . 3 ) ln( 2 2 94492 . 0 639 . 13 324 . 5 0 590 . 0 0 2 1        XX S s b t 206 . 127 052 . 8 639 . 13     YY XY XX S S S Degree of freedom: ν = n – 2 = 25 – 2 = 23 P-value = 2 X P(t < -0.94492) = 2 X 0.1773 = 0.3546 Conclusion: P-value (0.3546) > Sign.level (0.05). There is no evidence to reject H0. The error term is Homoskedastic with respect to x2.
  • 130.
    Uji Asumsi Klasik:no heteroskedasticity The White test is a test for heteroskedasticity - i.e. it assumes that the relationship is between variable(s) and the error yi = β0 + β1x1i + β2x2i + ... + βkxki + ei ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki) quadratic regression of the squared residual that consists of the square of each x, and the product of each x times every other x from the original equation ei 2 = b0 + b1(x1i)2 + b2(x2i)2 + ... + bk(xki)2 + bk+1(x1i.x2i) + bk+2(x1i.x3i) + ... + b2k-1(x1i.xki) + ... + b½.k(k+1)(x(k-1)i.xki) 130
  • 131.
    Uji Asumsi Klasik:no heteroskedasticity The White test has three basic steps: 1. Obtain the residuals of the estimated regression equation: ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki) 2. Use these residuals to form the quadratic model of squared residual in a second regression: ei 2 = b0 + b1(x1i)2 + b2(x2i)2 + ... + bk(xki)2 + bk+1(x1i.x2i) + bk+2(x1i.x3i) + ... + b2k-1(x1i.xki) + ... + b½.k(k+1)(x(k-1)i.xki) 131
  • 132.
    Uji Asumsi Klasik:no heteroskedasticity The White test has three basic steps: 3. Check to see whether quadratic model is significant or not. Test the significance of the n.R2 (the sample size, n, times the coefficient of determination, R2) with chi-square χ2-test (degree of freedom, ν = number of regressors = ½.k(k+1)) H0: n.R2 < χ2 critical (Homoskedastic errors) H1: n.R2 > χ2 critical (Heteroskedastic errors) 132
  • 133.
  • 134.
    Uji Asumsi Klasik:no heteroskedasticity 134 Advantages of the White test: a. It does not assume a specific functional form. b. It is applicable when the variance depends on two or more variables. Limitations of the White test: a.It is an large-sample test. b.It provides no information about the variance structure. c.It loses many degrees of freedom when there are many regressors. d.It cannot handle partitioned data. e.It also captures specification errors.
  • 135.
  • 136.
    Example 7 136 x1 x2x3 y y pred e e2 x12 x22 x32 x1x2 x1x3 x2x3 e4 1,74 5,30 10,8 25,5 27,35 -1,851 3,428 3,028 28,090 116,640 9,222 18,792 57,240 11,749 6,32 5,42 9,4 31,2 32,26 -1,062 1,129 39,942 29,376 88,360 34,254 59,408 50,948 1,274 6,22 8,41 7,2 25,9 27,35 -1,450 2,101 38,688 70,728 51,840 52,310 44,784 60,552 4,415 10,52 4,63 8,5 38,4 38,31 0,090 0,008 110,670 21,437 72,250 48,708 89,420 39,355 0,000 1,19 11,60 9,4 18,4 15,54 2,855 8,153 1,416 134,560 88,360 13,804 11,186 109,040 66,464 1,22 5,85 9,9 26,7 26,11 0,592 0,350 1,488 34,223 98,010 7,137 12,078 57,915 0,123 4,10 6,62 8,0 26,4 28,25 -1,853 3,434 16,810 43,824 64,000 27,142 32,800 52,960 11,794 6,32 8,72 9,1 25,9 26,22 -0,322 0,104 39,942 76,038 82,810 55,110 57,512 79,352 0,011 4,08 4,42 8,7 32,0 32,09 -0,088 0,008 16,646 19,536 75,690 18,034 35,496 38,454 0,000 4,15 7,60 9,2 25,2 26,07 -0,868 0,753 17,223 57,760 84,640 31,540 38,180 69,920 0,567 10,15 4,83 9,4 39,7 37,25 2,448 5,991 103,023 23,329 88,360 49,025 95,410 45,402 35,892 1,72 3,12 7,6 35,7 32,49 3,212 10,317 2,958 9,734 57,760 5,366 13,072 23,712 106,450 1,70 5,30 8,2 26,5 28,20 -1,703 2,901 2,890 28,090 67,240 9,010 13,940 43,460 8,416 59,43 81,82 115,4 377,5 377,50 0,00 38,676 394,726 576,726 1035,960 360,662 522,078 728,310 247,154 Quadratic model of squared residual ei 2 = b0 + b1(x1i)2 + b2(x2i)2 + b3(x3i)2 + b4(x1i.x2i) + b5(x1i.x3i) + b6(x2i.x3i)
  • 137.
Example 7 137
A = X'X =
[ 13        394,726    576,726    1035,960   360,662    522,078    728,310
  394,726   28435,940  14215,890  30724,415  17434,014  28097,685  19924,252
  576,726   14215,890  39239,948  46312,736  17114,038  20729,897  40540,700
  1035,960  30724,415  46312,736  86244,604  27835,366  41298,355  59564,444
  360,662   17434,014  17114,038  27835,366  14215,890  19924,252  20729,897
  522,078   28097,685  20729,897  41298,355  19924,252  30724,415  27835,366
  728,310   19924,252  40540,700  59564,444  20729,897  27835,366  46312,736 ]
A⁻¹ = (X'X)⁻¹ =
[  2,57712   0,00801  -0,02334  -0,04105  -0,02487  -0,00143   0,04125
   0,00801   0,00144  -0,00119  -0,00038   0,00064  -0,00220   0,00183
  -0,02334  -0,00119   0,00641   0,00400   0,00091   0,00112  -0,01095
  -0,04105  -0,00038   0,00400   0,00314   0,00151  -0,00031  -0,00721
  -0,02487   0,00064   0,00091   0,00151   0,00274  -0,00240  -0,00241
  -0,00143  -0,00220   0,00112  -0,00031  -0,00240   0,00431  -0,00112
   0,04125   0,00183  -0,01095  -0,00721  -0,00241  -0,00112   0,01920 ]
  • 138.
Example 7 138
g = X'Y = [ 38,676  880,789  1910,925  2976,781  793,631  1216,240  2176,259 ]'
β = A⁻¹·g = [ 8,21240  -0,03213  0,44284  0,17661  -0,14725  0,10519  -0,68045 ]'
The estimated regression equation of the squared-residual quadratic model:
ei² = 8.212 – 0.032(x1i)² + 0.440(x2i)² + 0.177(x3i)² – 0.147(x1i·x2i) + 0.105(x1i·x3i) – 0.680(x2i·x3i)
  • 139.
Example 7 139
[Response Surface Methodology plot of the squared-residual quadratic model]
ei² = 8.212 – 0.032(x1i)² + 0.440(x2i)² + 0.177(x3i)² – 0.147(x1i·x2i) + 0.105(x1i·x3i) – 0.680(x2i·x3i)
  • 140.
Example 7 140
β        | X'Y      | β·(X'Y)
8,21240  | 38,676   | 317,626
-0,03213 | 880,789  | -28,299
0,44284  | 1910,925 | 846,242
0,17661  | 2976,781 | 525,737
-0,14725 | 793,631  | -116,866
0,10519  | 1216,240 | 127,937
-0,68045 | 2176,259 | -1480,825
Σ β·(X'Y) = 191,553
In the auxiliary regression the response is the squared residual, so here Σyi = Σei² = 38.676 and Σyi² = Σei⁴ = 247.154.
SSR = (β0Σyi + β1Σx1iyi + ... + βkΣxkiyi) – (Σyi)²/n = 191.553 – (38.676)²/13 = 191.553 – 115.066 = 76.487
SST = Σyi² – (Σyi)²/n = 247.154 – (38.676)²/13 = 247.154 – 115.066 = 132.088
SSE = Σyi² – (β0Σyi + β1Σx1iyi + ... + βkΣxkiyi) = 247.154 – 191.553 = 55.601
  • 141.
Example 7 141
H0: errors are homoskedastic; H1: errors are heteroskedastic. Reject H0 if n·R² > χ²critical.
Sign. level = 0.05
Degrees of freedom: ν = ½·k(k+1) = ½·3(3+1) = 6; χ²critical = 12.59
R² = SSR/SST = 76.487/132.088 = 0.57906
n·R² = 13 × 0.57906 = 7.52780
P-value = P(χ² > 7.52780) = 0.2748
Conclusion: P-value (0.2748) > sign. level (0.05). There is no evidence to reject H0. The error term is homoskedastic.
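The final statistic and p-value of Example 7 can be reproduced directly from the sums computed on the previous slides; a small sketch using only those slide values:

```python
# Reproduce the Example 7 test statistic and p-value from the slide's SSR and SST.
from scipy.stats import chi2

SSR, SST, n, df = 76.487, 132.088, 13, 6   # values taken from the Example 7 slides
R2 = SSR / SST                              # ≈ 0.57906
stat = n * R2                               # ≈ 7.5278
print(chi2.sf(stat, df))                    # ≈ 0.275 -> fail to reject H0
```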
  • 142.
  • 143.
  • 144.
  • 145.
Example 8 145
x1 | x2 | y | ypred | e | e² | x1² | x2² | x1·x2 | e⁴
2 | 50 | 9,95 | 8,379 | 1,571 | 2,469 | 4 | 2500 | 100 | 6,096
8 | 110 | 24,45 | 25,596 | -1,146 | 1,313 | 64 | 12100 | 880 | 1,725
11 | 120 | 31,75 | 33,954 | -2,204 | 4,858 | 121 | 14400 | 1320 | 23,600
10 | 550 | 35,00 | 36,597 | -1,597 | 2,550 | 100 | 302500 | 5500 | 6,501
8 | 295 | 25,02 | 27,914 | -2,894 | 8,373 | 64 | 87025 | 2360 | 70,111
4 | 200 | 16,86 | 15,746 | 1,114 | 1,240 | 16 | 40000 | 800 | 1,538
2 | 375 | 14,38 | 12,450 | 1,930 | 3,724 | 4 | 140625 | 750 | 13,867
2 | 52 | 9,60 | 8,404 | 1,196 | 1,431 | 4 | 2704 | 104 | 2,048
9 | 100 | 24,35 | 28,215 | -3,865 | 14,938 | 81 | 10000 | 900 | 223,150
8 | 300 | 27,50 | 27,976 | -0,476 | 0,227 | 64 | 90000 | 2400 | 0,051
4 | 412 | 17,08 | 18,402 | -1,322 | 1,749 | 16 | 169744 | 1648 | 3,057
11 | 400 | 37,00 | 37,462 | -0,462 | 0,213 | 121 | 160000 | 4400 | 0,046
12 | 500 | 41,95 | 41,459 | 0,491 | 0,241 | 144 | 250000 | 6000 | 0,058
2 | 360 | 11,66 | 12,262 | -0,602 | 0,363 | 4 | 129600 | 720 | 0,132
4 | 205 | 21,65 | 15,809 | 5,841 | 34,116 | 16 | 42025 | 820 | 1163,932
4 | 400 | 17,89 | 18,252 | -0,362 | 0,131 | 16 | 160000 | 1600 | 0,017
20 | 600 | 69,00 | 64,666 | 4,334 | 18,785 | 400 | 360000 | 12000 | 352,864
1 | 585 | 10,30 | 12,337 | -2,037 | 4,149 | 1 | 342225 | 585 | 17,212
10 | 540 | 34,93 | 36,472 | -1,542 | 2,376 | 100 | 291600 | 5400 | 5,647
15 | 250 | 46,59 | 46,560 | 0,030 | 0,001 | 225 | 62500 | 3750 | 0,000
15 | 290 | 44,88 | 47,061 | -2,181 | 4,756 | 225 | 84100 | 4350 | 22,623
16 | 510 | 54,12 | 52,561 | 1,559 | 2,430 | 256 | 260100 | 8160 | 5,903
17 | 590 | 56,63 | 56,308 | 0,322 | 0,104 | 289 | 348100 | 10030 | 0,011
6 | 100 | 22,13 | 19,982 | 2,148 | 4,613 | 36 | 10000 | 600 | 21,281
5 | 400 | 21,15 | 20,996 | 0,154 | 0,024 | 25 | 160000 | 2000 | 0,001
Σ 206 | 8294 | 725,82 | 725,820 | 0,000 | 115,173 | 2396 | 3531848 | 77177 | 1941,469
Quadratic model of the squared residuals: ei² = b0 + b1(x1i)² + b2(x2i)² + b3(x1i·x2i)
  • 146.
Example 8 146
A = X'X =
[ 25        2396       3531848       77177
  2396      502184     485990145     14846879
  3531848   485990145  847350799652  17764206203
  77177     14846879   17764206203   485990145 ]
A⁻¹ = (X'X)⁻¹ =
[  1,192E-01  -8,904E-04  -6,830E-07   3,323E-05
  -8,904E-04   3,753E-05   1,394E-08  -1,515E-06
  -6,830E-07   1,394E-08   1,149E-11  -7,377E-10
   3,323E-05  -1,515E-06  -7,377E-10   7,002E-08 ]
g = X'Y = [ 115,173  13020,171  14225116,948  378285,711 ]'
β = A⁻¹·g = [ 4,99502  0,01142  -0,00001  0,00010 ]'
Quadratic model of the squared residuals: ei² = 4.99502 + 0.01142(x1i)² – 0.00001(x2i)² + 0.00010(x1i·x2i)
  • 147.
Example 8 147
β        | X'Y          | β·(X'Y)
4,99502  | 115,173      | 575,294
0,01142  | 13020,171    | 148,677
-0,00001 | 14225116,948 | -179,972
0,00010  | 378285,711   | 37,360
Σ β·(X'Y) = 581,359
SSR = (β0Σyi + β1Σx1iyi + ... + βkΣxkiyi) – (Σyi)²/n = 581.359 – (115.173)²/25 = 581.359 – 530.597 = 50.761
SST = Σyi² – (Σyi)²/n = 1941.469 – (115.173)²/25 = 1941.469 – 530.597 = 1410.872
SSE = Σyi² – (β0Σyi + β1Σx1iyi + ... + βkΣxkiyi) = 1941.469 – 581.359 = 1360.110
  • 148.
Example 8 148
Quadratic model of the squared residuals: ei² = 4.99502 + 0.01142(x1i)² – 0.00001(x2i)² + 0.00010(x1i·x2i)
H0: errors are homoskedastic; H1: errors are heteroskedastic. Reject H0 if n·R² > χ²critical.
Sign. level = 0.05
Degrees of freedom: ν = ½·k(k+1) = ½·2(2+1) = 3; χ²critical = 7.81
R² = SSR/SST = 50.761/1410.872 = 0.03598
n·R² = 25 × 0.03598 = 0.8995
P-value = P(χ² > 0.8995) = 0.82556
Conclusion: P-value (0.82556) > sign. level (0.05). There is no evidence to reject H0. The error term is homoskedastic.
  • 149.
Uji Asumsi Klasik: no heteroskedasticity
The Goldfeld-Quandt test is a test for heteroskedasticity, i.e. it checks whether the error variance depends on the explanatory variable(s).
yi = β0 + β1x1i + β2x2i + ... + βkxki + ei
ei = yi – (β0 + β1x1i + β2x2i + ... + βkxki)
The error sum of squares, SSE, is calculated from the differences between each observed and predicted response value:
SSE = Σei² = Σ(yi – ŷi)² = Σyi² – (β0Σyi + β1Σx1iyi + β2Σx2iyi + ... + βkΣxkiyi)
149
  • 150.
Uji Asumsi Klasik: no heteroskedasticity
The Goldfeld-Quandt test has the following steps:
1. Order the data by the magnitude of the explanatory variable, in ascending order.
2. Divide the data into three parts: lesser (n1), middle (n2) and greater (n3). There is no fixed rule specifying how many observations go into each part; n1 + n2 + n3 = n, the number of observations.
3. Omit the middle part, i.e. drop these n2 observations.
150
  • 151.
Uji Asumsi Klasik: no heteroskedasticity
The Goldfeld-Quandt test has the following steps:
4. Obtain the error sum of squares of each remaining part: SSE1 (lesser part, with degrees of freedom ν1 = n1 – k) and SSE2 (greater part, with degrees of freedom ν2 = n3 – k).
5. Assuming the error process is normally distributed, calculate the ratio of the error sums of squares: F = (SSE2/ν2) / (SSE1/ν1).
6. Apply the F-test. F > Fcritical at the right tail or F < Fcritical at the left tail indicates that the variances are different.
151
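The steps above translate into a short Python sketch (numpy + scipy). The split sizes, the ordering variable, and the function name `goldfeld_quandt` are the analyst's choices and are only illustrative here; the part degrees of freedom are taken as the part size minus the number of fitted parameters:

```python
# Sketch of the Goldfeld-Quandt test following the steps above.
import numpy as np
from scipy.stats import f as f_dist

def sse(y, X):
    """Error sum of squares and residual degrees of freedom of an OLS fit with intercept."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(Z, y, rcond=None)[0]
    resid = y - Z @ beta
    return float(resid @ resid), len(y) - Z.shape[1]

def goldfeld_quandt(y, X, order_by=0, drop_frac=0.2):
    """order_by: column used to sort the data; drop_frac: share of middle observations omitted."""
    idx = np.argsort(X[:, order_by])              # step 1: order by the explanatory variable
    y, X = y[idx], X[idx]
    n = len(y)
    n_drop = int(drop_frac * n)                   # steps 2-3: drop the middle part
    n1 = (n - n_drop) // 2
    sse1, df1 = sse(y[:n1], X[:n1])               # step 4: SSE of the lesser part
    sse2, df2 = sse(y[-n1:], X[-n1:])             #          SSE of the greater part
    F = (sse2 / df2) / (sse1 / df1)               # step 5: variance ratio
    return F, f_dist.sf(F, df2, df1)              # step 6: right-tail p-value
```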
  • 152.
Uji Asumsi Klasik: no heteroskedasticity
Heteroskedasticity consequences:
 Heteroskedasticity does not cause bias in the β coefficient estimates.
 Heteroskedasticity increases the true sampling variance of the β coefficient estimates.
 Heteroskedasticity causes the dependent variable to fluctuate in a way that the estimation procedure (OLS) attributes to the independent variables. Hence the variance of the estimates of β increases. These estimates are still unbiased, since over-estimation and under-estimation remain equally likely.
 At the same time, heteroskedasticity causes OLS to underestimate the reported variances (and standard errors) of the β coefficients.
 Intuitively, the underestimated error variance makes the model appear to fit better than it actually does, so the estimated variances and standard errors are too low. This can lead the researcher to conclude that a relationship exists when in fact the variables are unrelated.
 Hence the t-statistics and F-statistics cannot be relied upon for statistical inference.
 Spurious regressions.
152
  • 153.
Uji Asumsi Klasik: no heteroskedasticity
Methods to correct or remedy heteroskedasticity:
Use log-transformed data
Redefine the variables
Apply a weighted least squares estimation (see the sketch below)
Use heteroskedasticity-consistent standard errors
Use heteroskedasticity-corrected standard errors
Use Minimum Norm Quadratic Unbiased Estimation (MINQUE)
153
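One of the remedies listed above, weighted least squares, can be sketched in a few lines. The weighting scheme is an assumption: here the error standard deviation is assumed proportional to x, so the weights are 1/x²; other schemes are equally possible, and the function name `wls` is only illustrative:

```python
# Sketch of weighted least squares when Var(e_i) is assumed proportional to x_i^2.
import numpy as np

def wls(y, X, weights):
    """Solve (Z'WZ) beta = Z'Wy with a diagonal weight matrix (Z includes an intercept)."""
    Z = np.column_stack([np.ones(len(y)), X])
    W = np.diag(weights)
    return np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ y)

# usage sketch: down-weight observations with large x (large assumed error variance)
# weights = 1.0 / (x ** 2)
# beta_wls = wls(y, x.reshape(-1, 1), weights)
```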
  • 154.
    Uji Asumsi Klasik: No Multicollinearity 154
  • 155.
Uji Asumsi Klasik: no multicollinearity
The assumption: no explanatory variable is highly correlated with one or more of the other explanatory variables. Multicollinearity is a violation of the classical assumption that no explanatory variable is a perfect linear function of any other explanatory variable.
Issue: is xi related to xj? Such would be the case in clustered samples, where variables are similar within a cluster but differ between clusters.
155
  • 156.
Uji Asumsi Klasik: no multicollinearity
It is assumed that the explanatory variables are not interrelated, i.e. Cov(xi, xj) = 0 for i ≠ j. If the covariance of two explanatory variables is not equal to zero, then multicollinearity exists.
This is essentially the same as saying there is no pattern between explanatory variables. If an explanatory variable can be explained by a pattern with respect to one or more other explanatory variables, we say that there is multicollinearity.
156
  • 157.
Uji Asumsi Klasik: no multicollinearity
Multicollinearity can be categorized into 2 kinds:
Perfect multicollinearity (multicollinearity in which an explanatory variable can be written exactly as a linear combination of other explanatory variables).
Imperfect (impure) multicollinearity (multicollinearity in which an explanatory variable is close to being represented by a linear function of other explanatory variables).
Multicollinearity mostly happens in clustering or segmenting, where samples are grouped by similarity of some characteristics (e.g. clustered data).
157
  • 158.
Uji Asumsi Klasik: no multicollinearity
 With perfect multicollinearity, an explanatory variable can be completely explained by the movement of one or more other explanatory variable(s).
 Perfect multicollinearity can usually be avoided by careful screening of the explanatory variables before a regression is run.
 With imperfect multicollinearity, an explanatory variable is a strong but not perfect linear function of one or more other explanatory variable(s).
 Imperfect multicollinearity varies in degree from sample to sample.
158
  • 159.
Uji Asumsi Klasik: no multicollinearity
Multicollinearity can also be categorized into another 2 kinds:
Structural multicollinearity (multicollinearity that occurs when we create a model term from other terms; in other words, it is a byproduct of the model we specify rather than being present in the data itself).
Impure multicollinearity (multicollinearity that is present in the data itself rather than being an artifact of our model; observational studies are more likely to exhibit this kind of multicollinearity).
159
  • 160.
Uji Asumsi Klasik: no multicollinearity
The regression coefficient, βi, is the impact that the explanatory variable xi has on the response variable y, holding all other explanatory variable(s) constant. If x1 is related to x2, then β1 will also capture the impact of changes in x2. In other words, interpretation of the regression coefficients or parameters becomes difficult.
The easiest way to test for multicollinearity is to examine the standard errors of the coefficients. A reasonable method to relieve multicollinearity is to drop some of the highly correlated variables.
160
  • 161.
Uji Asumsi Klasik: no multicollinearity
In the Venn diagrams, the overlapping area between Y and X (X1, X2) is the variance explained.
 In case 1, X1 and X2 are related; X1 and Y are related, but X2 and Y have no relationship.
 In case 2, both X1 and X2 contribute some unique variance explained to Y, but they also have some common variance explained.
 In case 3, again both X1 and X2 contribute unique variance explained to Y, but X1 and X2 are totally unrelated (orthogonal).
 In case 4, although both X1 and X2 could predict Y, the variance explained contributed by X2 has been covered by X1 because X1 and X2 are too correlated (collinear).
The above cases are not exhaustive; there are many other possible combinations between Y and the Xs.
161
  • 162.
Uji Asumsi Klasik: no multicollinearity 162
Severe multicollinearity produces a distribution of the βs that is centered around the true β but that has a much wider variance. Thus the distribution of the βs with multicollinearity is much wider than otherwise.
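This widening of the β distribution can be illustrated with a small Monte Carlo sketch. The data-generating values below are arbitrary, chosen only to contrast a weakly correlated and a strongly correlated pair of regressors:

```python
# Monte Carlo sketch: the spread of the OLS slope estimate grows as x1 and x2
# become more collinear, while its mean stays at the true value.
import numpy as np

rng = np.random.default_rng(0)
true_beta = np.array([1.0, 2.0, -1.0])          # intercept, b1, b2 (arbitrary values)

def simulate(rho, n=50, reps=2000):
    estimates = np.empty(reps)
    for r in range(reps):
        x1 = rng.normal(size=n)
        x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)  # corr(x1, x2) ≈ rho
        X = np.column_stack([np.ones(n), x1, x2])
        y = X @ true_beta + rng.normal(size=n)
        estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]    # estimate of b1
    return estimates.mean(), estimates.std()

print(simulate(rho=0.1))    # mean ≈ 2, small standard deviation
print(simulate(rho=0.95))   # mean ≈ 2, noticeably larger standard deviation
```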
  • 163.
Uji Asumsi Klasik: no multicollinearity
A special case of the multicollinearity problem is a dominant variable. A dominant variable is an explanatory variable, x, that is definitionally related to the response variable, y.
 The dominant variable is generally a part of, or a complement of, the response variable. For example, the Y variable might be the number of computers and the X variable the number of processors.
 The dominant variable is highly correlated with the response variable, and it will make other explanatory variables unimportant in determining the response variable.
 Do not confuse a dominant variable with a highly significant explanatory variable.
163
  • 164.
Uji Asumsi Klasik: no multicollinearity
First, realize that some multicollinearity exists in every equation: all variables are correlated to some degree (even if completely at random).
So it is really a question of how much multicollinearity exists in an equation, rather than whether any multicollinearity exists.
164
  • 165.
Uji Asumsi Klasik: no multicollinearity
Some tests to detect multicollinearity:
 The Determination Coefficient R² and t-Test
 The Simple Correlation Coefficients Test
 The Variance Inflation Factor (VIF) Test
 The Farrar–Glauber Test
 The Condition Number Test
 The Perturbing-the-Data Test
165
  • 166.
Uji Asumsi Klasik: no multicollinearity
High determination coefficient, R², with all t-scores low (individual coefficient estimator tests):
If this is the case, you have multicollinearity. If this is not the case, you may or may not have multicollinearity.
If all the t-scores are significant and in the expected direction, then we can conclude that multicollinearity is not likely to be a problem.
166
  • 167.
Uji Asumsi Klasik: no multicollinearity
High simple correlation coefficients
If a simple correlation coefficient, rij, between any two explanatory variables (xi and xj with i ≠ j) is high in absolute value, these two particular x's are highly correlated, and this evidence indicates the potential for multicollinearity.
How high is high? Some researchers pick an arbitrary number, such as 0.80. A better answer might be that rij is high if it causes unacceptably large variances in the coefficient estimates in which we're interested.
167
  • 168.
Uji Asumsi Klasik: no multicollinearity
High simple correlation coefficients
Caution in case of more than two explanatory variables:
An explanatory variable is correlated with a group of other explanatory variables acting together simultaneously.
An explanatory variable is correlated with an interaction of other explanatory variables.
An explanatory variable is a nonlinear function of other explanatory variables.
Any of these may cause multicollinearity without any single simple correlation coefficient being high enough to indicate that multicollinearity is present.
168
  • 169.
Uji Asumsi Klasik: no multicollinearity
High simple correlation coefficients
The matrix plot between pairs of individual variables.
The matrix of correlations between pairs of individual variables.
Note that high correlation between the response variable, y, and one of the explanatory variables, x, is not multicollinearity.
169
  • 170.
Uji Asumsi Klasik: no multicollinearity
High simple correlation coefficients
[Matrix plot between pairs of individual variables]
170
  • 171.
Uji Asumsi Klasik: no multicollinearity
High simple correlation coefficients
The matrix of correlations between pairs of individual variables
171
Correlation | X1  | X2  | ... | Xk
X1          | 1   | r12 | ... | r1k
X2          | r21 | 1   | ... | r2k
:           | :   | :   |     | :
Xk          | rk1 | rk2 | ... | 1
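The correlation matrix of the explanatory variables can be produced directly with numpy (or with pandas' `DataFrame.corr()`); a minimal sketch, where `X` is assumed to be an array with one column per explanatory variable:

```python
# Pairwise correlation matrix of the explanatory variables x1..xk.
import numpy as np

def correlation_matrix(X):
    """X: (n, k) array whose columns are the explanatory variables."""
    return np.corrcoef(X, rowvar=False)   # element (i, j) is r_ij

# any |r_ij| close to 1 (e.g. above 0.80) flags a potentially collinear pair
```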
  • 172.
Uji Asumsi Klasik: no multicollinearity
The variance inflation factor, VIF, is calculated from the following steps:
1. For each explanatory variable (xi), run an OLS regression that has xi as a function of all the other explanatory variables in the equation. For i = 1, this equation would be:
x1 = b0 + b2·x2 + b3·x3 + ... + bk·xk + u, where u ~ N(0, σu²) is a classical error term.
172
  • 173.
Uji Asumsi Klasik: no multicollinearity
The variance inflation factor, VIF, is calculated from the following steps:
2. Obtain the value of the determination coefficient, Ri², from the regression equation for the specific explanatory variable xi.
3. Calculate the variance inflation factor for βi: VIF(βi) = 1 / (1 – Ri²).
4. If VIF > 5, the multicollinearity problem is potentially severe. Repeat for all x's.
173
  • 174.
Uji Asumsi Klasik: no multicollinearity
The variance inflation factor, VIF:
 The higher the VIF, the more severe the effects of multicollinearity.
 While there is no table of formal critical VIF values, a common rule of thumb is that if a given VIF is greater than 5, the multicollinearity is severe.
 As the number of independent variables increases, it makes sense to raise this threshold slightly.
174
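The VIF procedure above is straightforward to compute by hand: regress each explanatory variable on all the others and apply VIF = 1/(1 – Ri²). The sketch below is a minimal illustration of those steps (the function name `vif` is illustrative); statsmodels also provides a helper with the same idea, `variance_inflation_factor`.

```python
# Sketch of the VIF computation: regress each x_i on the remaining x's,
# then VIF_i = 1 / (1 - R_i^2).
import numpy as np

def vif(X):
    """X: (n, k) array of explanatory variables (no intercept column)."""
    n, k = X.shape
    out = np.empty(k)
    for i in range(k):
        xi = X[:, i]
        others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        beta = np.linalg.lstsq(others, xi, rcond=None)[0]
        resid = xi - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((xi - xi.mean()) ** 2)
        out[i] = 1.0 / (1.0 - r2)
    return out   # values above ~5 suggest severe multicollinearity
```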
  • 175.
  • 176.
Example 9 176
x1    | x2    | x3   | y    | ypred
1,74  | 5,30  | 10,8 | 25,5 | 27,35
6,32  | 5,42  | 9,4  | 31,2 | 32,26
6,22  | 8,41  | 7,2  | 25,9 | 27,35
10,52 | 4,63  | 8,5  | 38,4 | 38,31
1,19  | 11,60 | 9,4  | 18,4 | 15,54
1,22  | 5,85  | 9,9  | 26,7 | 26,11
4,10  | 6,62  | 8,0  | 26,4 | 28,25
6,32  | 8,72  | 9,1  | 25,9 | 26,22
4,08  | 4,42  | 8,7  | 32,0 | 32,09
4,15  | 7,60  | 9,2  | 25,2 | 26,07
10,15 | 4,83  | 9,4  | 39,7 | 37,25
1,72  | 3,12  | 7,6  | 35,7 | 32,49
1,70  | 5,30  | 8,2  | 26,5 | 28,20
Σ 59,43 | 81,82 | 115,4 | 377,5 | 377,50
Correlation matrix:
Cor | y        | x1       | x2       | x3
y   | 1        | 0,65385  | -0,78581 | -0,18627
x1  | 0,65385  | 1        | -0,15350 | -0,14522
x2  | -0,78581 | -0,15350 | 1        | 0,07484
x3  | -0,18627 | -0,14522 | 0,07484  | 1
  • 177.
Example 9 177
For x1, the regression equation: x1 = b0 + b2·x2 + b3·x3 + u
i (≠ 1) | β        | X'Y       | β·(X'Y)
0       | 9,73992  | 59,43000  | 578,84326
2       | -0,20244 | 360,66210 | -73,01385
3       | -0,43869 | 522,07800 | -229,03100
Σ β·(X'Y) = 276,79842
SSR = Σβ(X'Y) – (Σx1i)²/n = 276.79842 – (59.43)²/13 = 276.79842 – 271.68653 = 5.11189
SST = Σx1i² – (Σx1i)²/n = 394.72550 – 271.68653 = 123.03897
SSE = Σx1i² – Σβ(X'Y) = 394.72550 – 276.79842 = 117.92708
R² = SSR/SST = 5.11189/123.03897 = 0.04155
VIF = 1/(1 – R²) = 1/(1 – 0.04155) = 1.04335
For x1, no multicollinearity.
  • 178.
Example 9 178
For x2, the regression equation: x2 = b0 + b1·x1 + b3·x3 + u
i (≠ 2) | β        | X'Y       | β·(X'Y)
0       | 5,66436  | 81,82000  | 463,45828
1       | -0,10323 | 360,66210 | -37,23184
3       | 0,12408  | 728,31000 | 90,36559
Σ β·(X'Y) = 516,59203
SSR = Σβ(X'Y) – (Σx2i)²/n = 516.59203 – (81.82)²/13 = 516.59203 – 514.96249 = 1.62953
SST = Σx2i² – (Σx2i)²/n = 576.72640 – 514.96249 = 61.76391
SSE = Σx2i² – Σβ(X'Y) = 576.72640 – 516.59203 = 60.13437
R² = SSR/SST = 1.62953/61.76391 = 0.02638
VIF = 1/(1 – R²) = 1/(1 – 0.02638) = 1.02710
For x2, no multicollinearity.
  • 179.
Example 9 179
For x3, the regression equation: x3 = b0 + b1·x1 + b2·x2 + u
i (≠ 3) | β        | X'Y       | β·(X'Y)
0       | 8,92230  | 115,40000 | 1029,63293
1       | -0,04199 | 522,07800 | -21,92001
2       | 0,02329  | 728,31000 | 16,96056
Σ β·(X'Y) = 1024,67348
SSR = Σβ(X'Y) – (Σx3i)²/n = 1024.67348 – (115.4)²/13 = 1024.67348 – 1024.39692 = 0.27656
SST = Σx3i² – (Σx3i)²/n = 1035.96000 – 1024.39692 = 11.56308
SSE = Σx3i² – Σβ(X'Y) = 1035.96000 – 1024.67348 = 11.28652
R² = SSR/SST = 0.27656/11.56308 = 0.02392
VIF = 1/(1 – R²) = 1/(1 – 0.02392) = 1.02450
For x3, no multicollinearity.
  • 180.
  • 181.
  • 182.
Example 10 182
Correlation matrix:
Cor | y       | x1      | x2
y   | 1       | 0,98181 | 0,49287
x1  | 0,98181 | 1       | 0,37841
x2  | 0,49287 | 0,37841 | 1
  • 183.
Example 10 183
For x1, the regression equation: x1 = b0 + b2·x2 + u
i (≠ 1) | β       | X'Y   | β·(X'Y)
0       | 4,48353 | 206   | 923.60688
2       | 0,01132 | 77177 | 873.86423
Σ β·(X'Y) = 1797.47111
SSR = Σβ(X'Y) – (Σx1i)²/n = 1797.47111 – (206)²/25 = 1797.47111 – 1697.44 = 100.03111
SST = Σx1i² – (Σx1i)²/n = 2396 – 1697.44 = 698.56
SSE = Σx1i² – Σβ(X'Y) = 2396 – 1797.47111 = 598.52889
R² = SSR/SST = 100.03111/698.56 = 0.14320
VIF = 1/(1 – R²) = 1/(1 – 0.14320) = 1.16713
For x1, no multicollinearity.
  • 184.
Example 10 184
For x2, the regression equation: x2 = b0 + b1·x1 + u
i (≠ 2) | β         | X'Y   | β·(X'Y)
0       | 227,55165 | 8294  | 1887313,3777
1       | 12,64664  | 77177 | 976030,0846
Σ β·(X'Y) = 2863343,4623
SSR = Σβ(X'Y) – (Σx2i)²/n = 2863343.4623 – (8294)²/25 = 2863343.4623 – 2751617.44 = 111726.0223
SST = Σx2i² – (Σx2i)²/n = 3531848 – 2751617.44 = 780230.56
SSE = Σx2i² – Σβ(X'Y) = 3531848 – 2863343.4623 = 668504.5377
R² = SSR/SST = 111726.0223/780230.56 = 0.14320
VIF = 1/(1 – R²) = 1/(1 – 0.14320) = 1.16713
For x2, no multicollinearity.
  • 185.
Uji Asumsi Klasik: no multicollinearity
Multicollinearity consequences:
 The β coefficient estimates will still remain unbiased; estimates will still be centered around the true values.
 R² will be high, but the individual coefficients will have high standard errors.
 The variances and standard errors of the estimates will increase.
 It becomes harder to distinguish the effect of one variable from the effect of another, so large errors in estimating the βs are much more likely than without multicollinearity.
 As a result, the estimated coefficients, although still unbiased, now come from distributions with much larger variances and, therefore, larger standard errors.
 The computed t-scores will fall: a relatively high R² appears in an equation with few significant t statistics.
 Thus confidence intervals for the estimates will be very wide, and significance tests might therefore give inappropriate conclusions.
185
  • 186.
    Uji Asumsi Klasik:no multicollinearity Multicollinearity consequences:  Estimates will become very sensitive to changes in specification.  The addition or deletion of an explanatory variable or of a few observations will often cause major changes in the values of the βs when significant multicollinearity exists  For example, if you drop a variable, even one that appears to be statistically insignificant, the coefficients of the remaining variables in the equation sometimes will change dramatically  This is again because with multicollinearity, it is much harder to distinguish the effect of one variable from the effect of another (holding all else constant)  The overall fit of the equation and the estimation of the coefficients of nonmulticollinear variables will be largely unaffected.  If the multicollinearity occurs in the population as well as the sample, then the predictive power of the model is unaffected. 186
  • 187.
    Uji Asumsi Klasik:no multicollinearity Methods to correct or remedy multicollinearity Do nothing: a. Multicollinearity will not necessarily reduce the t-scores enough to make them statistically insignificant and/or change the estimated coefficients to make them differ from expectations b. the deletion of a multicollinear variable that belongs in an equation will cause specification bias Drop a redundant variable: a. Viable strategy when two variables measure essentially the same thing b. Always use theory as the basis for this decision! 187
  • 188.
    Uji Asumsi Klasik:no multicollinearity Methods to correct or remedy multicollinearity Increase the sample size: a. This is frequently impossible but a useful alternative to be considered if feasible b. The idea is that the larger sample normally will reduce the variance of the estimated coefficients, diminishing the impact of the multicollinearity 188
  • 189.
189 Thank you ... ... Any questions?

Editor's Notes

  • #10 Statistics can be an aid in solving problems, from the moment the data are collected, through processing, interpreting, analyzing and synthesizing them. However, when the population is defined incorrectly or the sample is chosen incorrectly, when the variables whose data will be collected are described incorrectly, or when the research objective is neglected, the processing results will be garbage. Likewise, even if the data collected are correct and representative, if the processing method and tool are chosen incorrectly or used incorrectly, the results will also be garbage. So we need to understand that statistics is merely a tool based on mathematical models: numbers go in and results come out, whether correct or garbage.
  • #11 A statistical tool is not a magic wand that, whatever the condition of the data, can magically transform it into results that fit the research objective. Nor is a statistical method a magic potion in which, whatever the condition of the data, it can be brewed into results that fit the research objective.