Exploring the best method of forecasting for short term electrical energy demand
T.C. (Republic of Turkey)
MARMARA UNIVERSITY
INSTITUTE OF SOCIAL SCIENCES
DEPARTMENT OF BUSINESS ADMINISTRATION
QUANTITATIVE METHODS (ENGLISH) PROGRAM
EXPLORING THE BEST METHOD OF FORECASTING FOR
SHORT TERM ELECTRICAL ENERGY DEMAND
(A RESEARCH ON ENERGY DEMAND OF TRAKYA REGION IN
TURKEY)
Master's Thesis
MESUT GÜNES
SUPERVISOR: PROF. DR. RAUF NURETTIN NISEL
Istanbul, 2009
GENERAL INFORMATION
Name and Surname : Mesut Günes
Department : Business Administration
Programme : Quantitative Methods
Thesis Supervisor : Prof. Dr. Rauf Nurettin Nisel
Thesis Type and Date : Master's Thesis – July 2009
Keywords : Forecasting methods, time series, electrical energy
consumption, SPSS, Minitab, Matlab
ÖZET
DETERMINING THE BEST METHOD FOR SHORT TERM ELECTRICAL ENERGY
DEMAND (A STUDY ON THE ENERGY DEMAND OF THE TRAKYA REGION)
This study covers applications of forecasting methods built on hourly recorded
electrical energy consumption data for a specific region. Within this scope, information
on electrical systems and forecasting methods is given first and the current situation is
set out. The part of Turkey that lies on the European continent, namely the Trakya
region, was taken as the target region. Because the available consumption data are
recorded hourly and the work covers a large data set of 23 months in total (all of 2005
and some months of 2006 and 2007), quantitative forecasting methods were used, as
they give more consistent results. Forecasting techniques were developed for the 12
hours of electrical energy consumption following the last day of each month, and the
most suitable models were determined in light of the results obtained. The resulting
forecasting models were applied to the electrical energy data and the results were
discussed.
GENERAL INFORMATION
Name and Surname : Mesut Günes
Field : Management
Programme : Quantitative Science
Supervisor : Prof. Dr. Rauf Nurettin Nisel
Degree Awarded and Date : Master - May 2009
Keywords : Forecasting methods, time series, electrical power
consumption, SPSS, Minitab, Matlab
ABSTRACT
EXPLORING THE BEST METHOD OF FORECASTING FOR SHORT TERM
ELECTRICAL ENERGY DEMAND (A RESEARCH ON ENERGY DEMAND OF
TRAKYA REGION IN TURKEY)
This study covers applications of forecasting models built on hourly observed
electrical power consumption data for a specific region. At the beginning of the research,
basic information about the electrical power system and forecasting methods is given to
clarify the context. The Trakya region, the part of Turkey on the European side, is
selected as the target region. The data consist of hourly observed electrical energy values
for the whole of 2005 and some months of 2006 and 2007, 23 months in total. Because
the data set is large enough and the aim of the research is to establish accurate models for
short term forecasting, quantitative methods are used. For this region, forecasting methods
are developed for short term electrical energy consumption, namely the 12 hours following
the last day of each month, and the best fitted model is determined for each month. The
best fitted models are applied to the data and the related results are discussed.
ACKNOWLEDGEMENTS
I would like to express my special thanks to my supervisor and teacher Prof. Dr.
Rauf Nurettin Nisel, my teacher Asst. Prof. Dr. Özcan Baytekin and my friend Betül
Özdemir.
ABBREVIATIONS
AC : Alternating Current
ACF : Autocorrelation Function
ADF : Augmented Dickey-Fuller Test
AIC : Akaike Information Criterion
AICF : Akaike Information Criterion Function
ANSI : American National Standards Institute
AR : Autoregressive
ARIMA : Autoregressive Integrated Moving Average
BEDAS : Turkish Electricity Distribution Co.
BIC : Bayesian Information Criterion
DC : Direct Current
df : Degrees of freedom
LBQ : Indicator for Ljung-Box Q test
MA : Moving Average
MAD : Mean Absolute Deviation
MAPE : Mean Absolute Percentage Error
MSD : Mean Squared Deviation
MW : Megawatt, unit of electrical power (equal to 10^6 watts)
PACF : Partial Autocorrelation Function
TEIAS : Turkish Electricity Transmission Co.
TABLE OF CONTENTS
ÖZET
ABSTRACT
ACKNOWLEDGEMENTS
ABBREVIATIONS
INTRODUCTION
SECTION 1
1 ELECTRICAL POWER SYSTEMS
1.1 Basics Of Electrical Power
1.2 Electrical Power System
1.2.1 Generators
1.2.2 Transmission And Subtransmission
1.2.3 Distribution
1.2.4 Loads
SECTION 2
2 FORECASTING METHODOLOGY
2.1 Basics of Forecasting Methods
2.1.1 Qualitative Methods
2.1.1.1 Delphi Methods
2.1.1.2 Scenario Writing
2.1.1.3 Market Search
2.1.1.4 Focus Groups
2.1.2 Quantitative Methods
2.1.2.1 Naïve Models
2.1.2.2 Autoregressive Process (AR)
2.1.2.3 Moving Average (MA)
2.1.2.4 Autoregressive And Moving Average Process (ARMA)
2.1.2.5 Smoothing Methods
2.1.2.6 Simple Exponential Smoothing Methods
2.1.2.7 Exponential Smoothing Adjusted For Trend: Holt’s Method
2.1.2.8 Exponential Smoothing Adjusted For Trend And Seasonality Variation: Winter’s Method
2.2 Test Of Stationarity
2.3 Model Checking
2.4 Model Selection Criteria
2.5 Testing Of Forecasting Accuracy
2.6 Analysis Of Outlier
2.6.1 Univariate Detection Of Outlier
2.6.2 Bivariate Detection Of Outlier
2.6.3 Multivariate Detection Of Outlier
SECTION 3
3 APPLICATIONS OF FORECASTING METHODS TO THE ELECTRICAL ENERGY DATA OF TRAKYA REGION FOR SHORT TERM ENERGY DEMAND
3.1 Exploring Data Pattern
3.2 Test Of Stationarity
3.3 Applications Of Autoregressive Moving Average Models For January 2005
3.3.1 Model 1: ARIMA(1, 1, 0)(0, 1, 2)₂₄
3.3.2 Model 2: ARIMA(1, 1, 0)(1, 1, 1)₂₄
3.3.3 Model 3: ARIMA(1, 1, 0)(0, 1, 1)₂₄
3.3.4 Model 4: ARIMA(0, 1, 1)(0, 1, 1)₂₄
3.3.5 Model 5: ARIMA(0, 1, 2)(1, 1, 0)₂₄
3.3.6 Model 6: ARIMA(0, 1, 0)(2, 1, 0)₂₄
3.3.7 Model Selection For ARIMA Models
3.4 Applications Of Smoothing Methods For January 2005
3.4.1 Application Of Simple Exponential Smoothing For January 2005
3.4.2 Application Of Exponential Smoothing Adjusted For Trend: Holt’s Methods For January 2005
3.4.3 Application Of Exponential Smoothing Adjusted For Trend And Seasonal Variation: Winter’s Methods For January 2005
3.4.3.1 Application Of Winter’s Additive Method For January 2005
3.4.3.2 Application Of Winter’s Multiplicative Method For January 2005
3.5 Exploring The Best Fitted Forecasting Model For January 2005
3.6 Re-Modeling Of January 2005 By SPSS 17 “Time Series Modeler”
SECTION 4
4 EXPLORATION AND APPLICATION OF THE BEST FITTED FORECASTING MODEL FOR EACH MONTH BY SPSS TIME SERIES MODELER
4.1 Application Of The Best Fitted Forecasting Model For February 2005
4.2 Application Of The Best Fitted Forecasting Model For March 2005
4.3 Application Of The Best Fitted Forecasting Model For April 2005
4.4 Application Of The Best Fitted Forecasting Model For May 2005
4.5 Application Of The Best Fitted Forecasting Model For June 2005
4.6 Application Of The Best Fitted Forecasting Model For July 2005
4.7 Application Of The Best Fitted Forecasting Model For August 2005
4.8 Application Of The Best Fitted Forecasting Model For September 2005
4.9 Application Of The Best Fitted Forecasting Model For October 2005
4.10 Application Of The Best Fitted Forecasting Model For November 2005
4.11 Application Of The Best Fitted Forecasting Model For December 2005
4.12 Application Of The Best Fitted Forecasting Model For August 2006
4.13 Application Of The Best Fitted Forecasting Model For September 2006
4.14 Application Of The Best Fitted Forecasting Model For October 2006
4.15 Application Of The Best Fitted Forecasting Model For November 2006
4.16 Application Of The Best Fitted Forecasting Model For January 2007
4.17 Application Of The Best Fitted Forecasting Model For February 2007
4.18 Application Of The Best Fitted Forecasting Model For March 2007
4.19 Application Of The Best Fitted Forecasting Model For April 2007
4.20 Application Of The Best Fitted Forecasting Model For May 2007
4.21 Application Of The Best Fitted Forecasting Model For June 2007
4.22 Application Of The Best Fitted Forecasting Model For July 2007
5 CONCLUSION
REFERENCE
BOOKS
ARTICLES AND WEB PAGES
LIST OF TABLES
Table 1.1: Components of A Modern Electrical Distribution System
Table 1.2: Heating Values of the Sources of Power Generation Used in Turkey
Table 1.3: Sources of Power Generation in Turkey From 1970 to 2007
Table 1.4: Capacitive (a) and Inductive (b) Loads
Table 2.1: Organization Chart of Forecasting
Table 2.2: Elements of Focus Groups
Table 2.3: Summary of ACF and PACF in AR(p), MA(q) and ARMA(p, q) Processes
Table 2.4: The Route of AR(p), MA(q) and ARMA(p, q) Processes
Table 2.5: Two Filter for Time Series
Table 2.6: The Process of Smoothing A Data Set
Table 2.7: Smoothing Methods – ARIMA
Table 2.8: Comparison of Smoothing Constants
Table 2.9: Critical Values for ADF Test
Table 3.1: Autocorrelation of January 2005 with Lag 1 Difference
Table 3.2: Autocorrelation of January 2005 with Seasonal Difference
Table 3.3: Seasonally Differentiated Time Series, Seasonal Index: 24
Table 3.4: Autocorrelation of power0105_Bus_Dif1
Table 3.5: Partial Autocorrelation of power0105_Bus_Dif1
Table 3.6: Comparison of ARIMA Models
Table 3.7: Comparing ARIMA(0, 1, 1)(0, 1, 1)₂₄ and Smoothing Methods
Table 3.8: Forecasting Boundary of ARIMA(0, 1, 1)(0, 1, 1)₂₄
Table 3.9: Definition of Time Series Modeler Function
Table 3.10: Definition of Time Series Modeler Function
Table 4.1: Model Description of Raw Data, Outlier Detection is off
Table 4.2: Model Statistics of Raw Data, Outlier Detection is off
Table 4.3: Model Description of Raw Data, Outlier Detection is on
Table 4.4: Model Statistics of Raw Data, Outlier Detection is on
Table 4.5: Model Description of Data of Business Day, Outlier Detection is off
Table 4.6: Model Statistics of Data of Business Day, Outlier Detection is off
Table 4.7: Model Description of Data of Business Day, Outlier Detection is on
Table 4.8: Model Statistics of Data of Business Day, Outlier Detection is on
Table 4.9: Summary of Forecasting Models for All Months
LIST OF FIGURES
Figure 2.1: Electrical Energy Consumption of Trakya Region January 2005
Figure 2.2: Electrical Energy Consumption of Trakya Region 2005
Figure 2.3: Time Series Analysis Process
Figure 2.4: Scatterplot for Bivariate Outlier Detection
Figure 2.5: Multivariate Detection of Outlier
Figure 3.1: Scatter plot of January 2005 with Lag 1 Difference
Figure 3.2: Scatter plot of January 2005 with Seasonal Difference
Figure 3.3: Trend Line Plot for January 2005
Figure 3.4: Growth Curve Trend Model Plot for January 2005
Figure 3.5: Quadratic Trend Model for January 2005
Figure 3.6: Component Analysis of January 2005
Figure 3.7: Consumption of Electrical Power Over Jan. 2005
Figure 3.8: Autocorrelation Function for powerJan2005
Figure 3.9: Partial Autocorrelation Function for powerJan2005
Figure 3.10: Autocorrelation Function for powerJan2005_sDiff
Figure 3.11: Partial Autocorrelation Function for powerJan2005_sDiff
Figure 3.12: Power Consumption Business Days versus Holidays
Figure 3.13: Power Consumption of Business Day
Figure 3.14: Autocorrelation Function for power0105_Bus
Figure 3.15: Seasonally Differentiated Power Consumption of Business Day
Figure 3.16: Autocorrelation Function for power0105_Bus_Dif
Figure 3.17: Seasonal + Lag 1 Differentiated Power Consumption
Figure 3.18: Autocorrelation Function for power0105_Bus_Dif1
Figure 3.19: Partial Autocorrelation Function for power0105_Bus_Dif1
Figure 3.20: Forecasting Boundary of ARIMA(0,1,1)(1,1,0)₂₄
INTRODUCTION
Energy use has followed an increasing trend, so consumption values keep rising if
we disregard economic crises. Electrical energy is irreplaceable in every part of life
because of how practical it is to use, including in our homes. Governments and firms
responsible for supplying electrical power, from generation to distribution, therefore
have an important task in meeting people's needs. Energy should always be available and
secure. An interruption can stop a surgical operation or shut down the main server of a
web provider if no preventive action has been taken. Using the sources of electrical power
efficiently is therefore a must. Automation of the power flow and estimation of
fluctuations in usage should be supported by short term power forecasting.
Given the importance of electrical energy for everyday life, this research aims to
develop forecasting models for short term forecasting, such as energy demand over the
next twelve hours. To that end, the first section gives the basics of electrical power and
the components of the electrical power distribution system, because we use electrical
power consumption data from the Trakya region in Turkey. Correspondingly, section two
gives the basics of forecasting methods and explains the structure of the forecasting
iteration. In section three, the forecasting methods given in section two are applied
separately to the data of the first month, and the related results are obtained with the help
of SPSS, Minitab, Matlab, Excel, and some other sources. Each model is also discussed in
that section. In section four, forecasting results for the remaining 22 months are found by
applying the SPSS Time Series Modeler, and the results of each month are again discussed.
In the conclusion, the best fitted forecasting methods are presented in a table
together with outlier information. The data used in the analysis are given at the end of the
research.
SECTION 1
1 ELECTRICAL POWER SYSTEMS
1.1 Basics Of Electrical Power
The history of electrical power systems goes back to the 18th century and starts
with Benjamin Franklin: through his kite experiment, the electrical spark was recognized
as the basis of electrical power, and the principles of electricity were then gradually
understood¹. The first electrical distribution system was established by Thomas Edison in
1882 at Pearl Street Station in New York City, supplying direct current (DC power). Then,
in 1885, William Stanley invented the transformer, which adjusts the magnitudes of
current and voltage, and in 1888 Nikola Tesla invented the induction motor, which uses
alternating current (AC power).
1 Robert H. Miller, James H. Malinowski, Power System Operation, Edition: 3, McGraw-Hill Professional, 1970, ISBN
0070419779, 9780070419773
The basic difference between the two systems is that a DC power system is
supplied by DC generators, whereas an AC power system is supplied by AC generators.
A DC system has a constant current level over time, while an AC system produces a
current that changes sinusoidally over time. The unit of power is the watt; for a DC power
system, power is defined by the formulas below:
$$P = V I \tag{1.1}$$
$$I = \frac{V}{R} \tag{1.2}$$
$$P = I^{2} R \tag{1.3}$$
where P is the power in watts, V is the potential in volts, I is the current in amperes and
R is the resistance of the system in ohms, so the result is in watts. To extend the
formulation to an AC power system, every component must be expressed in the time
domain t. The following equations are defined for AC systems²:
$$P(t) = V(t)\, I(t) \tag{1.4}$$
$$I(t) = \frac{V(t)}{Z(t)} \tag{1.5}$$
$$Z(t) = R + jX \tag{1.6}$$
$$P(t) = I^{2}(t)\, Z(t) \tag{1.7}$$
where Z(t) is the impedance of the AC power system, given in ohms as a complex
number. In an AC system the opposition to current is not only the resistance R: the
reactive components, inductance and capacitance, contribute a reactance X, and the
combined quantity is called the impedance.
2 Mahmood Nahvi, Joseph Edminister, Schaum's Outline of Theory and Problems of Electric Circuits,
Edition: 4, McGraw-Hill Professional, 2002, ISBN 0071393072, 9780071393072, p.219
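As a minimal sketch of equations (1.1) to (1.7), the Python snippet below evaluates DC power from a voltage and a resistance, then repeats the calculation with a complex impedance for the AC case. The numeric values and the function names `dc_power` and `ac_complex_power` are assumptions chosen for illustration and do not come from the thesis data; the AC part also assumes phasor (RMS) quantities rather than instantaneous time-domain signals.

```python
# Illustrative sketch of equations (1.1)-(1.7); all numeric values are assumed.

def dc_power(voltage, resistance):
    """Return (power in W, current in A) for a DC circuit."""
    current = voltage / resistance          # I = V / R   (1.2)
    power = voltage * current               # P = V I     (1.1)
    return power, current


def ac_complex_power(voltage_rms, resistance, reactance):
    """Return (complex power in VA, current phasor in A) for an AC circuit."""
    impedance = complex(resistance, reactance)   # Z = R + jX   (1.6)
    current = voltage_rms / impedance            # I = V / Z    (1.5)
    # |I|^2 Z mirrors (1.7): the real part is the active power,
    # the imaginary part is the reactive power.
    power = abs(current) ** 2 * impedance
    return power, current


if __name__ == "__main__":
    p_dc, i_dc = dc_power(voltage=230.0, resistance=46.0)
    print(f"DC: I = {i_dc:.2f} A, P = {p_dc:.1f} W")

    s_ac, i_ac = ac_complex_power(voltage_rms=230.0, resistance=30.0, reactance=40.0)
    print(f"AC: |I| = {abs(i_ac):.2f} A, S = {s_ac.real:.1f} + j{s_ac.imag:.1f} VA")
```

In the DC case the same power also follows from (1.3), since I²R equals VI once I = V/R is substituted.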
AC power has two components: active power and reactive power. In everyday
usage, "power" generally means active power. Active power is what runs electrical
machines, whereas reactive power generates the electromagnetic field in motor windings.
Wherever inductive or capacitive loads are present in a system, the system consumes
reactive power. Active and reactive power are defined by the formulas below³:
$$P(t) = I^{2}(t)\, Z(t) \cos\varphi \quad \text{(W)} \tag{1.8}$$
$$Q(t) = I^{2}(t)\, Z(t) \sin\varphi \quad \text{(Var)} \tag{1.9}$$
$$S = P + jQ, \qquad |S| = \sqrt{P^{2} + Q^{2}} \quad \text{(VA)} \tag{1.10}$$
where S is the complex power and φ is the phase angle between voltage and current.
Since I is in amperes, Z in ohms and V in volts, these powers are expressed in W, Var and
VA (volt-amperes). Power is usually quoted with the prefix kilo, in kilowatts (kW), and
energy in kilowatt-hours (kWh): a system that draws 1,000 W for one hour consumes
1 kWh. If the system runs for 5 hours, it consumes 5,000 Wh, in other words 5 kWh.
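The relations in (1.8) to (1.10) and the energy arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `power_components` and `energy_kwh` are hypothetical, the input values are invented, and the angle is taken to be the phase angle between voltage and current with the impedance given as a magnitude in ohms.

```python
import math

# Illustrative sketch of equations (1.8)-(1.10); names and values are assumed.

def power_components(current_rms, impedance_ohm, phase_rad):
    """Return (active W, reactive Var, apparent VA) for an AC load."""
    apparent = current_rms ** 2 * impedance_ohm    # |S| = I^2 |Z|
    active = apparent * math.cos(phase_rad)        # P, equation (1.8)
    reactive = apparent * math.sin(phase_rad)      # Q, equation (1.9)
    return active, reactive, apparent


def energy_kwh(power_watt, hours):
    """Energy in kWh consumed by a constant load over a number of hours."""
    return power_watt * hours / 1000.0


if __name__ == "__main__":
    # Assumed load: 10 A through 5 ohm at a power factor of 0.8.
    p, q, s = power_components(10.0, 5.0, math.acos(0.8))
    print(f"P = {p:.1f} W, Q = {q:.1f} Var, |S| = {s:.1f} VA")
    # (1.10): the magnitude of S is recovered from P and Q.
    print(f"sqrt(P^2 + Q^2) = {math.hypot(p, q):.1f} VA")
    # A 1,000 W load running for 5 hours.
    print(f"Energy = {energy_kwh(1000.0, 5.0):.1f} kWh")
```

Keeping power (W) and energy (kWh) in separate functions mirrors the distinction the paragraph draws between the two units.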
In this research, the active power consumption of the Trakya region in Turkey is
the quantity recorded by TEIAS⁴, so the analysis is based on active power consumption.
Because we discuss the consumption of a very large area of Turkey, the values are given
in megawatt-hours (MWh); 1 MWh equals 1,000 kWh or 1,000,000 Wh.
3 Nahvi, Edminister, p.224
4 TEIAS stands for the Turkish Electricity Transmission Corporation
1.2 Electrical Power System
With Tesla's invention, the DC electrical distribution system was replaced by the
AC electrical distribution system because of the many advantages of AC⁵. These
advantages can be summarized as follows⁶:
1. Voltage levels can be easily transformed in AC systems, which provides the
flexibility to use different voltages for generation, transmission and
consumption.
2. AC generators are much simpler than DC generators.
3. AC motors are much simpler and cheaper than DC motors.
Basically, the electrical power in the distribution system is supplied by
generators. A modern distribution system is designed to meet the demand for electrical
power without interruption. For this reason, modern distribution systems use an
interconnected network, that is, a system in which the generators are connected to each
other⁷.
5 Hadi Saadat, Power Transmission System, ISBN10: 0070122350 ISBN13: 9780070122352, 1/1/1998,
Mcgraw Hill Book Company, p.1
6 Prabha Kundur, Neal J. Balu, Mark G. Lauby, Power system stability and control, McGraw-Hill
Professional, 1994, ISBN 007035958X, 9780070359581, p.4
7 Saadat, p.4
Table 1.1: Components of A Modern Electrical Distribution System
Reference: Alan Elliott Guile, William Paterson, D. Das, Electrical Power Systems, New Age International,
2006, ISBN 8122418856, 9788122418859, p.2
Through interconnection, large generators (MW scale), which produce electrical
power at lower cost than small generators, feed the whole system rather than a particular
area, so if a fault occurs in one area, that area is supplied by borrowing from adjoining
interconnected areas. The interconnected distribution system is therefore not only more
economical but also more reliable⁸. The basic components of the modern electrical system
are listed below:
· Generators
· Transmission and subtransmission
· Distribution
· Loads
8 Alan Elliott Guile, William Paterson, D. Das, Electrical Power Systems, New Age International, 2006,
ISBN 8122418856, 9788122418859, p.3
1.2.1 Generators
A generator is a machine that produces electrical power when its rotor is turned by
externally applied mechanical power while a direct current, called the excitation current,
is supplied to the exciting winding. Generators are therefore one of the basic components
of an electrical system. They are built as single-phase or three-phase machines.
Three-phase generators generally have higher capacity than single-phase generators, and
single-phase generators are used for local electricity needs rather than for a distribution
system. Generator capacities range from 50 MW to 1500 MW⁹.
The mechanical power that turns the generators is obtained in a variety of ways:
hydro, geothermal, wind, tidal, biomass, fossil fuels and nuclear power¹⁰. Traditionally,
dams have been used to produce electrical power, but since the growth in demand has
exceeded the capacity of dams in many countries, many countries have turned to other
sources to meet their need for electrical energy. Table 1.2 summarizes the energy sources
used for power generation in Turkey and their heating values.
9 Saadat, p.4
10 Anthony J. Pansini, Kenneth D. Smalling, Guide to electric power generation Edition: 2, Press: Marcel
Dekker, 2002, ISBN 0824709276, 9780824709273, p.13
Table 1.2: Heating Values of the Sources of Power Generation Used in Turkey
Heating Values of Sources
Source                        2006       2007
Hard Coal+Imported Coal     29.504     32.115
Lignite                     83.932    100.320
Total                      113.436    132.435
Fuel Oil                    16.769     21.434
Diesel Oil                     627        517
LPG                              0          0
Naphtha                        141        118
Total                       17.537     22.069
Natural Gas                150.588    179.149
Total                      281.561    333.653
Main Fuel                    2.480      5.292
Auxiliary Fuel               1.505      1.601
Total                        3.985      6.893
Main Fuel                       80         37
Auxiliary Fuel                 468        477
Total                          548        514
Reference: The table formed by the data obtained from the source: http://www.teias.gov.tr/ist2007/45.xls
Table 1.3 shows the percentage shares of the sources of power generation from
1970 to 2007. In 1970 the share of thermal (heating) sources was double that of hydro,
while some years later, in 1982 and 1988, the share of hydro power was greater than that
of thermal sources. Since then, however, the use of thermal sources has followed an
increasing trend: in 2007 the share of thermal sources was 5 times the share of hydro.
Another important point is that geothermal and wind power came into use after 1984 and
their share has doubled in recent years, but their combined share is still not satisfactory.
Table 1.3: Sources of Power Generation in Turkey From 1970 to 2007
Reference: Formed by the data obtained from the source: http://www.teias.gov.tr/ist2007/7.xls
1.2.2 Transmission And Subtransmission
Power is transmitted with the help of transformers. Depending on its turns ratio, a transformer converts the voltage level, the current level, or both, to other values. Raising the voltage level makes the transmission of electrical power over long distances more efficient11. High-voltage transmission is more efficient in terms of losses, but insulation and design problems limit the generation voltage, usually to 30 kV. Therefore, to transmit electricity over long distances at high voltage, step-up transformers are used to raise the voltage level before transmission12.
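The loss argument above can be made explicit with a short worked relation. This is only a sketch under simplifying assumptions that are not in the source: a single line of resistance R, unity power factor, and a fixed delivered power P at line voltage V.

```latex
P = V I \;\Rightarrow\; I = \frac{P}{V},
\qquad
P_{\text{loss}} = I^{2} R = \frac{P^{2} R}{V^{2}}
```

Under these assumptions, raising the transmission voltage by a factor of 10 reduces the resistive loss by a factor of 100 for the same delivered power.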
The term transmission refers to transferring power over long distances. The term subtransmission refers to the stage after long-distance transfer, in which the voltage is reduced towards the level that people can use in their homes. On transmission lines the voltage, which is called high voltage or very high voltage, is above 60 kV and is standardized by ANSI14 at 69 kV, 115 kV, 138 kV, 161 kV, 230 kV, 345 kV, 500 kV and 765 kV13. At the end of the chain the voltage must finally be decreased to about 230 V in Europe, the Middle East and Africa, and to about 100 to 127 V in the USA, Japan and some other countries.
1.2.3 Distribution
Distribution is the last component of power delivery. Once electricity has been carried by the transmission and subtransmission lines to the location where the power is needed, the distribution system serves it to consumers. The distribution system can be overhead or underground, depending on weather conditions. The convenience of underground systems makes them popular around the world; 70 percent of newly built areas are equipped with underground systems15. Distribution is generally run by local government because controlling the system can be difficult. In Turkey, the distribution of electricity is run by BEDAS.
11 Robert H. Miller, James H. Malinowski, Power System Operation, Edition: 3, McGraw-Hill Professional,
1970, ISBN 0070419779, 9780070419773, p.2
12 Saadat, p.5-6
13 Saadat, p.6
14 ANSI: American National Standards Institute
15 Saadat, p.8
1.2.4 Loads
As defined in the basics of electrical power, the load of the system is the total impedance of the system. If the system is supplied by an AC source, the load has three components: resistive, inductive and capacitive. Inductive and capacitive loads create a phase angle between the sinusoidal current and voltage waveforms. The cosine of this angle is called the power factor, and the angle itself can be negative or positive. For an inductive load the angle is negative and the current lags the voltage; for a capacitive load it is positive, which means that the current leads the voltage. For a purely resistive load there is no phase angle in an AC system16.
Table 1.4: Capacitive (a) and Inductive (b) Loads
Reference: Dale R. Patrick, Stephen W. Fardo, Rotating Electrical Machines and Power Systems, Edition:
2, The Fairmont Press, Inc., 1997, ISBN 0881732397, 9780881732399, p.35-38
In an AC system, the power factor should be as high as possible in order to conserve power. In the last revision of the “Electrical Installation on Residential Constructions for Low Voltage” regulation, the required power factor was set to 0.90 – 1. This change means that consumers must correct their installations, and they then benefit financially from the change17.
16 Dale R. Patrick, Stephen W. Fardo, Rotating Electrical Machines and Power Systems, Edition: 2, The Fairmont Press, Inc., 1997, ISBN 0881732397, 9780881732399, p.40-41
17 Ahmet Becerik, Ülkemizdeki Reaktif Güç Kompanzasyonuna Bir Bakis-I, Elektrik Mühendisleri Oda,
Izmir, 12 March 2008, http://www.emo.org.tr/ekler/6556dfe948f58c5_ek.pdf?dergi=4, p.1-2
SECTION 2
2 FORECASTING METHODOLOGY
Forecasting is the art of saying what will happen, and then explaining
why it didn’t. - Anonymous.
Forecasting is a systematic effort to anticipate future events, conditions or quantities, and to establish future expectations, by analyzing past data or informed opinions18. Selecting a proper forecasting method is critical to a successful forecasting model for every type of data and subject. The importance of selecting the correct method follows from the internal logic of forecasting: in the forecasting process, every step is a check on the success of the step performed before it.
18 Chatfield, p.73
Forecasting methods can be applied to any data by taking into account its trend, cycle, seasonality and irregular components. However, every method has both advantages and disadvantages, so selecting the appropriate method is one of the most important issues. For a manufacturer, for example, any significant over- or under-forecast of sales may burden the firm with excess inventory carrying costs or else cause lost sales revenue through unanticipated item shortages. When demand is fairly stable, e.g., unchanging or else growing or declining at a known constant rate, making an accurate forecast is less difficult than when the situation includes an unknown trend and unexpected events. If, on the other hand, the data has historically shown an up-and-down pattern, the forecasting task becomes more complex. In this research we ignore unexpected events because it is not known how the situation would change or how it would affect the forecast. Their effect can be estimated by some methods, but that is not a subject of this research.
Time series methods are especially good for short-term forecasting where, within reason, the past behavior of a particular variable is a good indicator of its future behavior, at least in the short term. The typical example here is short-term demand forecasting. Note the difference between demand and production: ideally, the gap between the two should be zero.
2.1 Basics of Forecasting Methods
The modern economic system relies on estimating the amount of future needs by analyzing up-to-date data. Forecasting methods are divided into two categories. The first, called the extrapolation method, is based on explaining the behavior of the data collected up to the time the forecast is made. The second, called the explanatory method, is based on the factors that can affect the amount of the product or service. For example, the belief that the sale of doll clothing will increase from current levels because of a recent advertising blitz rather than proximity to Christmas illustrates the difference between the two philosophies19. Both methods can produce successful results, but the latter, the explanatory method, is more difficult to apply.
In this study the extrapolation method is used because, for short-term electrical energy consumption, it is important to capture the fluctuation of demand. In addition, past consumption data alone does not reveal the purposes for which people use electrical energy. Since the power consumption data is observed over time, time series methods are assumed to be the most suitable for describing the series.
Forecasting techniques are based on systematic effort, so that expectations can be revised by correcting the errors made during the forecasting process. The basic forecasting techniques are listed in the table below.
19 Peter Kennedy, A Guide to Econometrics, Edition: 5, MIT Press, 2003, ISBN 026261183X, 9780262611831, p.201-202
Table 2.1: Organization Chart of Forecasting
Forecasting Techniques
  Techniques
    Qualitative
      1 - Delphi Methods
      2 - Nominal Group Techniques
      3 - Jury of Executive Opinion
      4 - Scenario Projection
    Quantitative
      1 - Naïve Model
      2 - Autoregressive
      3 - Moving Average
      4 - Autoregressive Moving Average
      5 - Simple Exponential Smoothing
      6 - Holt’s Method
      7 - Holt-Winters Method
  Routes
    1 - Top-down route
    2 - Bottom-up route
2.1.1 Qualitative Methods
Qualitative methods are primarily based on judgment and past experience; they are used when there is no past data from which to derive an appropriate estimation formula, and they are used for long-term forecasting. Although those who study qualitative methods often do not have a health or medical educational background, qualitative methods are widely used in health and medical studies20. As defined by Pope and Mays, qualitative research asks qualitative questions as follows:
“Measurement in qualitative research is usually concerned with taxonomy or
classification. Qualitative research answers questions such as, ‘what is X, and how
does X vary in different circumstances, and why?’ rather than ‘how big is X or how
many X’s are there?’”
The difference between quantitative and qualitative methods is not only that quantitative methods use observed data or numbers. Sometimes a qualitative method gives more accurate results because asking questions face-to-face eliminates misunderstandings of the language or terms of a specific discipline21. Well-known qualitative methods are listed below.
1. Delphi Method
2. Growth Curves
3. Scenario Writing
4. Market Research
5. Focus Groups
20 Catherine Pope, Nicholas Mays, Qualitative Research in Health Research, Blackwell Publishing Ltd.
2006, ISBN-13: 978-1-4051-3512-2, ISBN-10: 1-4051-3512-3, p.1
21 Pope and Mays, p.5-6
2.1.1.1 Delphi Methods
The Delphi method is an iterative process that gathers experts' opinions22. Experts or forecasters are brought together to forecast the future of specific products or services, but the resulting consensus may not be acceptable to all of them. In the following rounds, everyone defends their point of view and submits their opinions to the investigating team. The team then mails a summary of the comments to all participants. At this point every participant can see the others' opinions, evaluate their own, and modify their thinking in light of them.
The procedure ends when the majority of the experts reach the same point of view. After that, all participants are invited to debate their opinions again, and the resulting consensus is announced as the expectation for the future.
Nowadays the Delphi technique has a different meaning. It involves asking a body of experts to arrive at a consensus opinion as to what the future holds. Underlying the idea of using experts is the belief that their view of the future will be better than that of non-experts (such as people chosen at random in the street). One of the most important problems of qualitative methods, which causes the models to be biased, is that they depend on people's opinions; in other words, the models are subjective23.
2.1.1.2 Scenario Writing
Scenario writing is a special kind of estimation for a specific, uncertain future, and it involves organizing a long-term forecast. It is based on trends, people's needs, new technology and also the political views of the government. These factors matter many years before the issue arises.
22 Kenneth Lawrence, Ronald K. Klimberg, Fundamentals of Forecasting Using Excel, Industrial Press, Inc., 1st edition, November 15, 2008, p.4
23 Lawrence and Klimberg, p.4-5
Scenario writing is, in general, used to forecast many years into the future. For example, if a company wants to write a scenario for long-term profitability, the planning department should not focus on short-term profitability and needs to ignore short-term indicators. After discussion among the employees of the planning department, the top management team reacts to important environmental changes.
2.1.1.3 Market Research
Market research is an activity that collects customer information about new or existing products. After the research is completed, the results are used to profile the product in the market. Market research therefore aims to collect general information about the product. This differs from the focus group, which aims to collect this kind of information from a group of people who have already been selected by a group of experts. With a focus group, detailed information that is not suitable for collection by survey can be collected with the help of a moderator; however, the information collected cannot be generalized24.
2.1.1.4 Focus Groups
The focus group method is an interview conducted with a group of people. In the social sciences, focus groups allow interviewers to study people in a more natural setting than a one-to-one interview, so the results of the method are generally more natural and deterministic. Because the participants are not restricted in their answers, they can say anything; in this way, the researchers obtain any type of reaction to the product, and the feelings behind the facts can also be revealed25. If the questions are easy to understand, the results are believable, and the method is also cost- and time-effective for reaching the sample size. The elements of focus groups are given in Table 2.2.
24 Lawrence and Klimberg, p.4
25 Nancy Grudens-Schuck, Beverlyn Lundy Allen, Kathlene Larson, Focus Group Fundamentals, Iowa State University, May. 2004, p.2
Table 2.2: Elements of Focus Groups
Reference: Grudens, Allen and Larson, p.7
2.1.2 Quantitative Methods
Quantitative methods are research techniques that are used to gather quantitative data - information dealing with numbers and anything that is measurable. Statistics, tables and graphs are often used to present the results of these methods. They are therefore to be distinguished from qualitative methods. Quantitative forecasting methods need past data in order to anticipate the future. Furthermore, quantitative methods are divided into two groups: time series methods, which use only past data, and causal methods26. In this research, time series forecasting techniques are used in order to produce better results.
Data that is collected or observed at successive time intervals is called time series data27. For time series methods, the frequency, which represents the number of observations over time, may be defined by minute, half-hour, hour, day, week, month, and so on28. Depending on the frequency, different time series components or patterns can be seen in the data. As with all quantitative methods, the numerical indicators must be observed reliably. However, we cannot assume that the data is random, because data collected over time tends to have trend, seasonal pattern and other time series characteristics29. The basic issues in applying quantitative methods are trend, cycle, seasonality and irregularity. These characteristic features of time series can be described as follows:
1. Trend: A component that may be seen locally or globally but persists in the time series for a long time. The trend can be upward or downward. It is important to estimate the trend because the mean change in the series is given by the slope of the trend. The steeper the slope of the trend line, the larger the difference between successive observations, and vice versa30.
2. Seasonality: In a time series, seasonality is a pattern that recurs over consecutive periods of time. Generally, economic series and time series observed hourly, daily, weekly, yearly, and so on have this component. In engineering, the demand for power, gas, water, and similar needs shows seasonality, which must always be identified and well estimated31.
3. Cyclical: Long-term data patterns that repeat themselves. In electrical energy demand, cyclical components occur as annual, weekly and daily cycles32.
4. Irregular: In a time series, the irregular component is what is observed after the trend, seasonality and cycles are removed. It is the pattern that is not described by any rule.
A series may have all of these components, or only one or more of them. These components can be seen in the electrical energy consumption of the Trakya region in Turkey.
26 Lawrence and Klimberg, p.5
27 Bovas Abraham, Johannes Ledolter, Statistical Methods for Forecasting, Wiley Series in Probability and Statistics, John Wiley & Sons, p.58-59
28 Lawrence and Klimberg, p.33
29 John E. Hanke, Dean W. Wichern, Business Forecasting, Pearson, Prentice Hall, New Jersey, 2005, ISBN 0-13-122856-0, p.327
30 Lawrence and Klimberg, p.34
31 Ajoy K. Palit, Dobrivoje Popovic, Computational intelligence in time series forecasting: theory and engineering applications, Springer, London, 2005, ISBN:1852339489, p.21
32 Michael P. Clements, David F. Hendry, A Companion to Economic Forecasting, Blackwell Publishing, 2002, ISBN 0631215697, 9780631215691, p.81
[Figure: Consumption of Electrical Power During Jan. 2005. Horizontal axis: Time Interval Jan. 2005 (Hour), 0 to 800. Vertical axis: Electrical Power (MWh), 1000 to 4500.]
Figure 2.1: Electrical Energy Consumption of Trakya Region January 2005
According to Figure 2.1, power demand changes over time, and the data shows seasonality: demand reaches its maximum and minimum values every 24 hours. The chart also shows that electrical energy demand is at its maximum in the evening, from 6pm to midnight. We can also see that two days per week have lower consumption; these should be the weekends.
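The 24-hour pattern described above can be made visible by averaging the load at each hour of the day. The following is a minimal sketch, not the procedure used in this research: the function name `hourly_profile` and the synthetic series are assumptions introduced for illustration, and the synthetic data is not the Trakya series.

```python
import numpy as np

def hourly_profile(load, period=24):
    """Average load at each position within the period (e.g. each hour of the day)."""
    load = np.asarray(load, dtype=float)
    n_full = (len(load) // period) * period      # drop an incomplete final day
    return load[:n_full].reshape(-1, period).mean(axis=0)

# Synthetic stand-in for one month of hourly data (hypothetical values):
# a daily sine wave constructed so that its peak falls at hour 18.
hours = np.arange(31 * 24)
synthetic = 3000 + 800 * np.sin(2 * np.pi * (hours % 24 - 12) / 24)

profile = hourly_profile(synthetic)
peak_hour = int(profile.argmax())                # hour of the day with the highest mean load
```

Applied to real hourly data, the same profile could be computed separately for weekdays and weekends to check the weekly effect mentioned above.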
[Figure: Consumption of Electrical Power During 2005. Horizontal axis: Time Interval Jan. 2005 - Dec. 2005, 0 to 9000. Vertical axis: Electrical Power (MWh), 1000 to 5000.]
Figure 2.2: Electrical Energy Consumption of Trakya Region 2005
Furthermore, if we consider a longer time series (Figure 2.2), we also see that electrical energy demand has an annual cycle. Demand reaches its maximum level in winter and its lowest level in spring and autumn; in summer, consumption is higher than in spring and autumn but lower than in winter. In addition, there are two lowest points, in January and November. These coincide with the Islamic holidays33, which are celebrated annually.
Quantitative methods can be applied to the data after the necessary preprocessing has been done. At the start of the analysis, we need to estimate the seasonality and then eliminate the trend and cycle; at the end of this procedure the data has to be stationary. Then we can apply the forecasting techniques to find the electrical consumption for any required interval.
Figure 2.3: Time Series Analysis Process
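One common way to remove a seasonal pattern and a trend before modelling is differencing. The sketch below is an illustration only: the source does not state which transformation was applied, the helper `difference` is a hypothetical name, and the synthetic series (linear trend plus a 24-hour sine wave) is an assumption chosen so that the effect is easy to verify.

```python
import numpy as np

def difference(series, lag=1):
    """Return series[t] - series[t - lag]; the result is `lag` points shorter."""
    series = np.asarray(series, dtype=float)
    return series[lag:] - series[:-lag]

# Hypothetical two weeks of hourly data: trend of 2 units per hour plus a daily cycle.
hours = np.arange(14 * 24)
synthetic = 2500 + 2.0 * hours + 600 * np.sin(2 * np.pi * hours / 24)

deseasonalized = difference(synthetic, lag=24)   # removes the 24-hour pattern
stationary = difference(deseasonalized, lag=1)   # removes what is left of the trend
```

For this synthetic series the seasonal difference is constant (24 hours times the slope), and the second difference is zero up to rounding, so no trend or daily cycle remains.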
2.1.2.1 Naïve Models
The Naïve forecasting model is the easiest model for understanding the basis of forecasting techniques. It relies on the last observed value to calculate the forecast34. The Naïve forecasting model is described as follows:
$\hat{Y}_{t+1} = Y_t$
where $Y_t$ is the observed value at time period t and $\hat{Y}_{t+1}$ is the forecasted value for time period t+1. With this method, one hundred percent of the forecast is determined by the current value of the series; for this reason the method is sometimes called the “no change” forecast35. Since the Naïve model is accepted as the baseline of forecasting techniques, it is used to test the accuracy of other forecasting models by determining the accuracy ratio36.
33 www.yildizliblok.com.tr/2005Takvimi.asp
34 Edwin J. Elton, Martin Jay Gruber, Investments: Portfolio theory and asset pricing, MIT Press, 1999, ISBN
0262050595, 9780262050593, p.378
35 John E. Hanke, Dean W. Wichern, Business Forecasting, Pearson, Prentice Hall, New Jersey, 2005, ISBN
0-13-122856-0, p.102
36 Charles W. Ostrom, Time series analysis: regression techniques, Second edition SAGE, 1990, ISBN
0803931352, 9780803931350, p.85
$\text{Accuracy Ratio} = \dfrac{RMSE_{\text{forecasting model}}}{RMSE_{\text{naive model}}}$ (2.1)
where RMSE stands for root-mean-squared error, which is explained later in the research.
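The naive forecast and the accuracy ratio can be sketched in a few lines. This is an illustrative implementation under stated assumptions: the ratio is taken as the RMSE of the candidate model divided by the RMSE of the naive model, and the function names and the small data set are hypothetical, not taken from the research.

```python
import numpy as np

def rmse(actual, forecast):
    """Root-mean-squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def accuracy_ratio(y, model_forecast):
    """RMSE of a candidate model divided by RMSE of the naive model.

    model_forecast[i] is the candidate's forecast of y[i + 1].
    """
    y = np.asarray(y, dtype=float)
    actual = y[1:]
    naive = y[:-1]                       # naive forecast of y[t+1] is y[t]
    return rmse(actual, model_forecast) / rmse(actual, naive)

y = [10.0, 12.0, 11.0, 13.0]
candidate = [11.0, 12.0, 12.0]           # hypothetical forecasts of y[1:], for illustration
ratio = accuracy_ratio(y, candidate)
```

With the ratio defined this way, a value below 1 means that the candidate model has a smaller RMSE than the naive "no change" forecast.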
2.1.2.2 Autoregressive Process (AR)
Autocorrelation describes the situation in which values of the dependent variable in one time period are linearly related to values of the same variable in another time period37. An AR model represents the series as a function of its own past values38. In general, a time series forecasting model can be defined as a function of time that contains a constant, a predictor and an error term, as follows:
$Y_t = f(x_t + b) + e_t$ (2.2)
where $Y_t$ is the data point to be forecasted, $x_t$ is the predictor variable or a function of time, $b$ is a constant over time and $e_t$ is the error term. The first-order autoregressive model is
$(Y_t - \mu) - \phi (Y_{t-1} - \mu) = a_t$ (2.3)
where $\phi$ is the coefficient, $\mu$ is the mean level of the series and $a_t$ is an uncorrelated random variable. Next, we need an operator B, called the backward-shift operator, which shifts the time series one step back. For one shift the operator is defined as $B Y_t = Y_{t-1}$, and in general form:
$B^m Y_t = Y_{t-m}$ (2.4)
37 Hanke and Wichern, p.345
38 Bovas Abraham, Johannes Ledolter, Statistical Methods for Forecasting, Wiley Series in Probability and Statistics, John Wiley & Sons, p.192
Combining formulations (2.3) and (2.4), the autoregressive model takes a more compact form for the time series:
$(1 - \phi B)(Y_t - \mu) = a_t$ (2.5)
Estimating a sufficient order p for an AR model is called order determination. There are two ways to determine the order: the first uses the partial autocorrelation function (PACF) and the second uses an information criterion function (AICF). This step can be carried out empirically39. In this research, because it is easy to apply to the series, the PACF is used to determine the order of the AR models. Before deciding to use an AR model, these two questions should be asked of the data40:
1. What is the order of the process?
2. How can the parameters of the process be estimated?
To describe the partial autocorrelation function, the following AR models are used to find the order of the partial autocorrelation:
$p_t = \phi_{0,1} + \phi_{1,1} p_{t-1} + e_{1t}$
$p_t = \phi_{0,2} + \phi_{1,2} p_{t-1} + \phi_{2,2} p_{t-2} + e_{2t}$
$p_t = \phi_{0,3} + \phi_{1,3} p_{t-1} + \phi_{2,3} p_{t-2} + \phi_{3,3} p_{t-3} + e_{3t}$ (2.6)
…
39 Ruey S. Tsay, Analysis of Financial Time Series, John Wiley and Sons, 2001, ISBN 0471415448,
9780471415442, p.36
40 Christopher Chatfield,The Analysis of Time Series: An Introduction, Edition: 6, CRC Press, 2004, ISBN
1584883170, 9781584883173, p.59
where $\phi_{0,j}$ is the constant term, $\phi_{i,j}$ is the coefficient of $p_{t-i}$ and $e_{jt}$ is the error term of the AR(j) model. For an AR(p) process, the partial autocorrelations at lags higher than the order p are zero41.
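The successive AR fits described above translate directly into code: fit AR(1), AR(2), ... by least squares and keep the last coefficient of each fit as the partial autocorrelation at that lag. This is a minimal sketch with NumPy, not the SPSS, Minitab or Matlab routines used in the research; the function name `pacf_by_regression` and the simulated AR(1) series with coefficient 0.7 are assumptions made for illustration.

```python
import numpy as np

def pacf_by_regression(series, max_lag):
    """Estimate phi_{j,j} for j = 1..max_lag, each from an AR(j) least-squares fit."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    pacf = []
    for j in range(1, max_lag + 1):
        # Rows correspond to t = j..n-1; columns are constant, x[t-1], ..., x[t-j].
        lags = [x[j - i:n - i] for i in range(1, j + 1)]
        design = np.column_stack([np.ones(n - j)] + lags)
        coef, *_ = np.linalg.lstsq(design, x[j:], rcond=None)
        pacf.append(coef[-1])              # coefficient of x[t-j]
    return np.array(pacf)

# Simulated AR(1) series (hypothetical data, fixed seed for reproducibility).
rng = np.random.default_rng(0)
noise = rng.standard_normal(5000)
ar1 = np.zeros(5000)
for t in range(1, 5000):
    ar1[t] = 0.7 * ar1[t - 1] + noise[t]

estimates = pacf_by_regression(ar1, max_lag=4)
```

For an AR(1) series, the first estimate should be close to the true coefficient and the estimates at lags 2 to 4 should be close to zero, which is the cut-off behavior used to choose the order.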
2.1.2.3 Moving Average (MA)
A moving average is an average that shifts along the body of the data. For instance, a 12-hour moving average is produced by dividing the sum of the 12 nearest observations in the series by 12. Repeating this procedure shifts the average forward along the series. For MA(1), the moving average model is defined as follows:
$Y_t - \mu = a_t - \theta a_{t-1}$ or $Y_t - \mu = (1 - \theta B) a_t$ (2.7)
where the finite number of non-zero $\psi$ weights reduces to one, $\psi_1 = -\theta$, and $B a_t = a_{t-1}$. This is the first-order moving average; for a moving average of order q, the model is rewritten as:
$Y_t - \mu = (1 - \theta_1 B - \dots - \theta_q B^q) a_t = \theta(B) a_t$ (2.8)
The autocorrelation function of the MA(1) model is then
$\rho_1 = \dfrac{-\theta}{1 + \theta^2}$ (2.9)
where $\rho_k = 0$ for k > 1. This shows that observations more than one step apart are not correlated, whereas observations one step apart are correlated42. Furthermore, if we extend the autocorrelation function to order q, we obtain the following equation:
41 Tsay, p.36
42 Abraham and Ledolter, p.215
$\rho_k = \dfrac{-\theta_k + \theta_1 \theta_{k+1} + \dots + \theta_{q-k} \theta_q}{1 + \theta_1^2 + \dots + \theta_q^2}$, k = 1, 2, . . . , q (2.10)
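The theoretical autocorrelations of an MA(q) model can be evaluated numerically from its coefficients. The sketch below assumes the sign convention $Y_t - \mu = (1 - \theta_1 B - \dots - \theta_q B^q) a_t$ and returns zero for lags beyond q; the function name `ma_acf` and the example coefficients are hypothetical.

```python
import numpy as np

def ma_acf(thetas, max_lag):
    """Theoretical rho_1..rho_max_lag of an MA(q) model with coefficients theta_1..theta_q."""
    # Work with psi weights: psi_0 = 1 and psi_j = -theta_j.
    psi = np.concatenate(([1.0], -np.asarray(thetas, dtype=float)))
    q = len(psi) - 1
    gamma0 = np.sum(psi ** 2)              # 1 + theta_1^2 + ... + theta_q^2
    rho = []
    for k in range(1, max_lag + 1):
        if k > q:
            rho.append(0.0)                # the ACF cuts off after lag q
        else:
            # sum over j = 0..q-k of psi_j * psi_{j+k}
            rho.append(np.sum(psi[:q + 1 - k] * psi[k:]) / gamma0)
    return np.array(rho)

rho_ma1 = ma_acf([0.5], max_lag=3)         # MA(1) with theta = 0.5
rho_ma2 = ma_acf([0.5, -0.3], max_lag=3)   # MA(2), hypothetical coefficients
```

For the MA(1) case with $\theta = 0.5$ the first autocorrelation is $-0.5 / 1.25 = -0.4$ and all later ones are zero, in line with the MA(1) formula.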
As a result, because MA models are time invariant and are produced by a finite linear combination of white noise, MA models are always weakly stationary43.
To determine a sufficient order for MA models, the partial autocorrelation function is also used, as for AR models, with some differences. While the PACF of an MA process of order q decays like a damped sinusoid or exponential, the ACF of the model cuts off immediately after lag q. However, it is difficult to determine the partial autocorrelation for higher-order MA models because it is dominated by damped exponentials and sinusoidal waves.
For the MA(1) model, the PACF is defined as follows:
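$\phi_{1,1} = \rho_1 = \dfrac{-\theta}{1 + \theta^2} = \dfrac{-\theta (1 - \theta^2)}{1 - \theta^4}$ (2.11)
$\phi_{2,2} = \dfrac{\rho_2 - \rho_1^2}{1 - \rho_1^2} = \dfrac{-\rho_1^2}{1 - \rho_1^2} = \dfrac{-\theta^2}{1 + \theta^2 + \theta^4} = \dfrac{-\theta^2 (1 - \theta^2)}{1 - \theta^6}$ (2.12)
$\phi_{3,3} = \dfrac{\rho_1^3}{1 - 2\rho_1^2} = \dfrac{-\theta^3 (1 - \theta^2)}{1 - \theta^8}$ (2.13)
For the k-th order, the PACF should be
$\phi_{k,k} = \dfrac{-\theta^k (1 - \theta^2)}{1 - \theta^{2(k+1)}}$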
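The closed-form PACF of an MA(1) model can be checked numerically against the general recursion that converts autocorrelations into partial autocorrelations. The sketch below uses the Durbin-Levinson recursion, which is a standard technique but is not named in the source; the function name `pacf_from_acf` and the value $\theta = 0.5$ are assumptions for illustration.

```python
import numpy as np

def pacf_from_acf(rho):
    """Durbin-Levinson recursion: rho[k-1] holds rho_k; returns phi_{k,k} for k = 1..len(rho)."""
    rho = np.asarray(rho, dtype=float)
    pacf = [rho[0]]
    phi_prev = np.array([rho[0]])                  # phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(2, len(rho) + 1):
        num = rho[k - 1] - np.sum(phi_prev * rho[k - 2::-1])
        den = 1.0 - np.sum(phi_prev * rho[:k - 1])
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new partial autocorrelation.
        phi_prev = np.concatenate((phi_prev - phi_kk * phi_prev[::-1], [phi_kk]))
        pacf.append(phi_kk)
    return np.array(pacf)

theta = 0.5
rho = np.array([-theta / (1 + theta ** 2), 0.0, 0.0, 0.0])   # ACF of MA(1)
recursive = pacf_from_acf(rho)
closed_form = np.array([-theta ** k * (1 - theta ** 2) / (1 - theta ** (2 * (k + 1)))
                        for k in range(1, 5)])
```

Both arrays agree, and their magnitudes shrink with the lag without ever reaching zero, which illustrates the decaying PACF of an MA process.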
43 Tsay, p.43
The difference between AR(p) and MA(q) in terms of the PACF and the ACF is as follows: for AR(p) models the ACF extends infinitely while the PACF cuts off after lag p; for MA(q) models the PACF extends infinitely, dominated by damped exponentials and sinusoidal waves, while the ACF cuts off after lag q44.
2.1.2.4 Autoregressive And Moving Average Process (ARMA)
A useful model combines the advantages of both the autoregressive and the moving average processes; this process is called the mixed autoregressive and moving average process (ARMA). The ARMA(p, q) model combines an AR model of order p and an MA model of order q. The ARMA process is defined as follows:
$(1 - \phi_1 B - \dots - \phi_p B^p)(Y_t - \mu) = (1 - \theta_1 B - \dots - \theta_q B^q) a_t$ (2.14)
If we then define the AR and MA operators as follows:
AR(p): $\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p$ (2.15)
MA(q): $\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^q$ (2.16)
a pure MA process is described as
$Y_t - \mu = \psi(B) a_t$, with $\psi(B) = \dfrac{\theta(B)}{\phi(B)}$ (2.17)
and a pure AR process is described as
$\pi(B)(Y_t - \mu) = a_t$, with $\pi(B) = \dfrac{\phi(B)}{\theta(B)}$ (2.18)
44 Abraham and Ledolter, p.218
In an ARMA process, the autoregressive parameters ($\phi_1, \phi_2, \phi_3, \dots, \phi_p$) govern the autocorrelation of the model, whereas the moving average parameters ($\theta_1, \theta_2, \theta_3, \dots, \theta_q$) do not have such an effect on the process45. We should also make sure that the roots of $\phi(B) = 0$ lie outside the unit circle for stationarity and that the roots of $\theta(B) = 0$ lie outside the unit circle for invertibility46.
For an ARMA(p, q) model, the ACF and the PACF show the behaviors of both AR(p) and MA(q) processes. In addition, the differencing order of I(d) can be judged from the PACF: as indicated by Wei, a PACF that declines very slowly suggests that the time series needs to be differenced47. For non-stationary data, the ARIMA(p, d, q) model is able to represent the series efficiently. There is a close relationship between AR(p), I(d) and MA(q); however, there is no algorithm that finds the correct model for forecasting48. The determination of the orders of the AR(p), MA(q) and ARMA(p, q) processes is summarized in the table below.
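The stationarity and invertibility conditions on the roots can be checked numerically. The sketch below is an illustration with NumPy; the function name `roots_outside_unit_circle` and the coefficient values are hypothetical. The same function serves for $\phi(B)$ and $\theta(B)$ because both polynomials have the form $1 - c_1 B - \dots - c_k B^k$.

```python
import numpy as np

def roots_outside_unit_circle(coeffs):
    """True if every root of 1 - c_1 B - ... - c_k B^k = 0 satisfies |B| > 1."""
    coeffs = np.asarray(coeffs, dtype=float)
    # numpy.roots expects the highest power first: -c_k, ..., -c_1, 1.
    poly = np.concatenate((-coeffs[::-1], [1.0]))
    return bool(np.all(np.abs(np.roots(poly)) > 1.0))

stationary_ar1 = roots_outside_unit_circle([0.5])        # root at B = 2
explosive_ar1 = roots_outside_unit_circle([1.2])         # root at B = 1/1.2
stationary_ar2 = roots_outside_unit_circle([0.5, 0.3])   # hypothetical AR(2)
nonstationary_ar2 = roots_outside_unit_circle([0.5, 0.6])
```

Passing the AR coefficients tests stationarity, and passing the MA coefficients tests invertibility.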
Table 2.3: Summary of ACF and PACF in AR(p), MA(q) and ARMA(p, q) Processes
45 James Douglas Hamilton, Time Series Analysis, Princeton University Press, 1994, ISBN 0691042896, 9780691042893, p.60
46 Abraham and Ledolter, p.223
47 Kadri Yürekli, Osman Çevik, Detection of Whether The Autocorrelated Meteorological Time Series Have Stationarity by Using Unit Root Approach: The Case of Tokat, Gaziosmanpasa University, Magazine of Faculty of Agriculture, 2005, 22 (1), 45-53, p.46
48 SPSS User Manual, “SPSS® Trends 13.0”, http://webs.edinboro.edu/EDocs/SPSS/SPSS%20Trends%2013.0.pdf
Table 2.4: The Route of AR(p), MA(q) and ARMA(p, q) Processes
Reference: http://www.shef.ac.uk/pas/TimeSeries/Fitnew.pdf, p.51
2.1.2.5 Smoothing Methods
Smoothing means averaging the data into a more representative value. Sometimes this is an equally weighted average of the past data, and sometimes weighting parameters distinguish between old and newly observed data. Generally, smoothing methods are useful for short-term forecasting. Smoothing methods are based on identifying historical trends in the time series to be forecasted; the smoothing method then produces forecasts by extrapolating these patterns.
Table 2.5: Two Filters for Time Series
Reference: Chatfield, p.18
Another purpose of smoothing is to deal with noise, i.e. unpredicted fluctuations that are not desirable in a time series; errors of this kind should be eliminated by the smoothing parameters for every smoothing period49. For example, if we want to remove local fluctuations we may use a smoothing method called a low-pass filter, and if we want to remove long-term fluctuations we may use a smoothing method called a high-pass filter50. Table 2.6 presents some filtering models for different situations; it also shows the different smoothing models.
49 Douglas C. Montgomery, Cheryl L. Jennings, Murat Kulahci, Introduction to Time Series Analysis and Forecasting, John Wiley & Sons Inc., 2008, p.171
50 Chatfield, p.18
Table 2.6: The Process of Smoothing A Data Set
There are three main smoothing models which are the subjects of this research:
1. Simple exponential smoothing method
2. Holt’s method, or double exponential smoothing method
3. Holt-Winters method, or triple exponential smoothing method
As shown in Table 2.7, the single and double exponential smoothing methods are equivalent to optimal one-step-ahead ARIMA models51.
51 Minitab Inc. Single And Double Exponential Smoothing, May. 15, 2001, p.4
Table 2.7: Smoothing Methods – ARIMA
2.1.2.6 Simple Exponential Smoothing Methods
Exponential smoothing is a forecasting method that can also be applied to a time series to produce smoothed data. The exponential smoothing model is based on a weighted average of past and current values, so the weight of the smoothing can be adjusted. In terms of seasonality, it adjusts the weight on current values to account for the effects of swings in the data. The weight of the model is represented by a new term, alpha ($\alpha$), which takes values between 0 and 1 so that the sensitivity of the model can be adjusted. Therefore, going beyond the moving average model, exponential smoothing provides an exponentially weighted moving average of all previously observed data52. When the sequence of observations begins at time t = 0, the simplest form of exponential smoothing is given by the formulas:
New Forecast = [$\alpha$ × (new observation)] + [(1 − $\alpha$) × (old forecast)]
Formal exponential smoothing equation:
$\hat{Y}_{t+1} = \alpha Y_t + (1 - \alpha) \hat{Y}_t$ (2.19)
where the variables are defined as:
$\hat{Y}_{t+1}$ = new smoothed value or the forecasted value for the next period
$\alpha$ = smoothing constant ($0 < \alpha < 1$)
$Y_t$ = new observation or actual value of the series in period t
$\hat{Y}_t$ = old smoothed value or forecast for period t
Equation (2.19) can be rewritten as follows:
$\hat{Y}_{t+1} = \alpha Y_t + (1 - \alpha) \hat{Y}_t = \alpha Y_t + \hat{Y}_t - \alpha \hat{Y}_t$
$\hat{Y}_{t+1} = \hat{Y}_t + \alpha (Y_t - \hat{Y}_t)$ (2.20)
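The error-correction form of simple exponential smoothing lends itself to a short loop. The following is a minimal sketch, not the routine of any particular package: the function name is hypothetical, and the choice of the first observation as the initial forecast is an assumption, since the source does not state how the recursion is started.

```python
def simple_exponential_smoothing(y, alpha, initial_forecast=None):
    """Return forecasts f, where f[t] is the forecast for period t and
    f[len(y)] is the forecast for the next, not yet observed, period."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Assumption: start the recursion from the first observation.
    forecast = y[0] if initial_forecast is None else initial_forecast
    forecasts = [forecast]
    for observation in y:
        # New forecast = old forecast + alpha * (observation - old forecast)
        forecast = forecast + alpha * (observation - forecast)
        forecasts.append(forecast)
    return forecasts

demo = simple_exponential_smoothing([100.0, 110.0, 105.0], alpha=0.5)
```

In this small example the forecast moves halfway towards each new observation, so after observing 110 the forecast rises from 100 to 105.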
When a time series has a trend and the forecasting application cannot tolerate a time delay, the exponential smoothing model has an important advantage over simple forecasting models: it does not introduce a time delay or phase effect53.
Selecting the optimal $\alpha$ is one of the main issues in the exponential smoothing method. Brown suggests that the discount coefficient $\omega = 1 - \alpha$ should lie between $(0.70)^{1/g}$ and $(0.95)^{1/g}$, where g is the number of parameters. Alternatively, a range of values of $\omega = 1 - \alpha$ can be tried and the smoothing constant that minimizes the sum of squared one-step-ahead forecast errors (SSE) selected54:
$$SSE(\alpha) = \sum_{t=1}^{n} \left[ Y_t - \hat{Y}_{t-1}(1) \right]^2 \tag{2.21}$$
Here $\hat{Y}_{t-1}(1)$ is the one-step-ahead forecast of $Y_t$ made at time t − 1, that is, $\hat{Y}_t$ in the notation of equation (2.19).
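The selection of the smoothing constant by minimum SSE can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the thesis: the function names, the grid of candidate values and the choice of the first observation as the starting forecast are all assumptions made for the example.

```python
def ses_forecasts(y, alpha):
    """Return one-step-ahead forecasts; forecasts[t] is the forecast of y[t]."""
    # Assumption: the first forecast is set equal to the first observation.
    forecasts = [y[0]]
    for obs in y[:-1]:
        # New forecast = alpha * new observation + (1 - alpha) * old forecast
        forecasts.append(alpha * obs + (1 - alpha) * forecasts[-1])
    return forecasts


def sse(y, alpha):
    """Sum of squared one-step-ahead forecast errors for a given alpha."""
    return sum((obs - f) ** 2 for obs, f in zip(y, ses_forecasts(y, alpha)))


def best_alpha(y, grid=None):
    """Pick the alpha with the smallest SSE from a grid of candidates."""
    if grid is None:
        grid = [i / 100 for i in range(1, 100)]  # 0.01, 0.02, ..., 0.99
    return min(grid, key=lambda a: sse(y, a))
```

For a steadily rising series such as `[1, 2, 3, 4, 5, 6]`, the search ends at the upper edge of the grid, which is a sign that simple exponential smoothing lags behind a trend.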
Once the optimal $\alpha$ has been selected, the sample autocorrelation function of the one-step-ahead forecast errors should be calculated to check the adequacy of the model. If the autocorrelations are found to be significant, the model is not appropriate for forecasting55. Written out in terms of the past observations, the exponential smoothing model is:
$$\hat{Y}_{t+1} = \alpha Y_t + \alpha(1-\alpha)Y_{t-1} + \alpha(1-\alpha)^2 Y_{t-2} + \alpha(1-\alpha)^3 Y_{t-3} + \cdots \tag{2.22}$$
52 Hanke and Wichern, p.114
53 D. G. Infield, D. C. Hill, Optimal Smoothing for Trend Removal in Short Term Electricity Demand Forecasting, IEEE Transactions on Power Systems, Vol. 13, No. 3, August 1998, p.1116
54 Abraham and Ledolter, p.158
Table 2.8: Comparison of Smoothing Constants
            α = 0.1                        α = 0.6
Period      Calculation          Weight    Calculation          Weight
t           0.1                  0.100     0.6                  0.600
t-1         0.9x0.1              0.090     0.4x0.6              0.240
t-2         0.9x0.9x0.1          0.081     0.4x0.4x0.6          0.096
t-3         0.9x0.9x0.9x0.1      0.073     0.4x0.4x0.4x0.6      0.038
t-4         0.9x0.9x0.9x0.9x0.1  0.066     0.4x0.4x0.4x0.4x0.6  0.015
All others                       0.590                          0.011
Reference: Hanke and Wichern, p.114
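The weights in the table follow directly from the smoothing constant: the observation j periods back receives the weight α(1 − α)^j. The short Python sketch below reproduces the calculation; the function name and the choice of five listed periods are assumptions made for the illustration.

```python
def smoothing_weights(alpha, n_periods=5):
    """Weights of the n_periods most recent observations and of all older ones."""
    weights = [alpha * (1 - alpha) ** j for j in range(n_periods)]
    remainder = 1 - sum(weights)  # total weight left for all older observations
    return weights, remainder


weights, remainder = smoothing_weights(0.1)
print([round(w, 3) for w in weights], round(remainder, 3))
```

With a small constant such as 0.1 more than half of the total weight still rests on observations older than five periods, whereas with 0.6 almost all of it is concentrated on the five most recent ones.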
2.1.2.7 Exponential Smoothing Adjusted For Trend: Holt’s Method
Simple exponential smoothing assumes that the level (mean) of the series is constant over time. If the mean changes locally and has to be re-estimated, simple exponential smoothing cannot handle the resulting trend. Holt's technique is regarded as capable of handling trend but not seasonality56. Holt's method (sometimes called double exponential smoothing) uses two parameters: $\alpha$, which is already used in the simple exponential smoothing model, and a second parameter $\gamma$. In Holt's method newer observations receive more weight than older ones: unlike an equally weighted model, the weights decay exponentially with the age of the observation, so the most recent observations are the most important. The parameter $\alpha$ defines this weighting of the observations57.
55 Abraham and Ledolter, p.158
56 Chatfield, p.78
The three equations used in Holt's method are:
1. The exponentially smoothed series, or current level estimate:
$$L_t = \alpha Y_t + (1-\alpha)(L_{t-1} + T_{t-1}) \tag{2.23}$$
2. The trend estimate:
$$T_t = \gamma (L_t - L_{t-1}) + (1-\gamma) T_{t-1} \tag{2.24}$$
3. The forecast p periods into the future:
$$\hat{Y}_{t+p} = L_t + p T_t \tag{2.25}$$
Where the parameters are defined as:
$L_t$ = new smoothed value (estimate of current level)
$\alpha$ = smoothing constant for the level ($0 < \alpha < 1$)
$Y_t$ = new observation or actual value of series in period t
$\gamma$ = smoothing constant for trend estimate ($0 < \gamma < 1$)
$T_t$ = trend estimate
p = periods to be forecast into the future
$\hat{Y}_{t+p}$ = forecast for p periods into the future
57 Joseph J. La Viola Jr., Brown University Technology Center for Advanced Scientific Computing and Visualization, Double Exponential Smoothing: An Alternative to Kalman Filter-Based Predictive Tracking, The Eurographics Association 2003. www.cs.brown.edu/~jjl/pubs/kfvsexp_final_laviola.pdf, p.2
The smoothing parameters $\alpha$ and $\gamma$ are optimized using the minimum one-step-ahead mean squared error (MSE) or mean absolute percentage error (MAPE) criterion. How quickly each component changes depends on the weights: large weights cause rapid changes in the component, whereas small weights cause less rapid changes. Therefore, the larger the weights, the more closely the smoothed values follow the data, and the smaller the weights, the smoother the resulting values58.
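The level and trend recursions of Holt's method can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and the starting values (first observation as the level, first difference as the trend) are one simple choice among several.

```python
def holt_forecast(y, alpha, gamma, horizon=1):
    """Holt's linear trend method: forecasts for 1..horizon periods ahead."""
    # Assumption: simple starting values taken from the first two observations.
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        # Level update: weighted average of the observation and the old projection
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # Trend update: weighted average of the latest level change and the old trend
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
    return [level + p * trend for p in range(1, horizon + 1)]
```

On a perfectly linear series the recursions reproduce the line for any choice of the two constants, so `holt_forecast([2, 4, 6, 8], 0.3, 0.1, horizon=2)` continues the series with values of about 10 and 12.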
2.1.2.8 Exponential Smoothing Adjusted For Trend And Seasonal Variation:
Winters' Method
As noted above, Holt's method can deal only with trend, but it can be extended to handle trend plus seasonality. In 1957, C. C. Holt suggested a model for non-seasonal time series with no trend, and he later presented a procedure that can handle trend. In 1965, Winters generalized Holt's formulas to handle seasonality59. The extended method is called Winters' method or the Holt-Winters method. It uses three parameters: $\alpha$ for updating the level, $\gamma$ for the slope and $\delta$ for the seasonal component60. The optimal smoothing parameters are determined by minimizing the one-step-ahead mean squared error. It should be kept in mind that if the parameters are set to 1, the method reduces to the naïve model, in which only the last observation determines the forecast61. The Holt-Winters method has two versions: the first is additive and the second is multiplicative. Which version to use depends on the characteristics of the particular time series.
58 Hanke and Wichern, p.122
59 http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc437.htm, Access Date: 19.05.2009
60 Abraham and Ledolter, p.167
61 Reinaldo C. S., Mônica B., Cristina Vidigal C. de Miranda, Short Term Load Forecasting Using Double Seasonal Exponential Smoothing and Interventions to Account for Holidays and Temperature Effects http://www.ecomod.org/files/papers/294.pdf, p.4
Winters' method for a model with linear trend and multiplicative seasonality uses the formulas below:
Forecast = (Level + Linear Trend) × Seasonal
1. The exponentially smoothed series or level estimate:
$$L_t = \alpha \frac{Y_t}{S_{t-s}} + (1-\alpha)(L_{t-1} + T_{t-1}) \tag{2.26}$$
2. The trend estimate, which has the same form as equation (2.24):
$$T_t = \gamma (L_t - L_{t-1}) + (1-\gamma) T_{t-1}$$
3. The seasonality estimate:
$$S_t = \delta \frac{Y_t}{L_t} + (1-\delta) S_{t-s} \tag{2.27}$$
4. Forecast for p periods into the future:
$$\hat{Y}_{t+p} = (L_t + p T_t)\, S_{t-s+p} \tag{2.28}$$
Where the parameters are defined as:
$L_t$ = new smoothed value for current level estimate
$\alpha$ = smoothing constant for the level
$Y_t$ = new observation or the actual value in period t
$\gamma$ = smoothing constant for trend estimate
$T_t$ = trend estimate
$\delta$ = smoothing constant for seasonality estimate
$S_t$ = seasonal estimate
p = periods to be forecast into the future
s = length of seasonality
$\hat{Y}_{t+p}$ = forecast for p periods into the future
Winters' method for a model with linear trend and additive seasonality uses the formula below:
Forecast = Level + Linear Trend + Seasonal
5. Forecast for p periods into the future:
$$\hat{Y}_{t+p} = L_t + p T_t + S_{t-s+p}$$
When applying the Holt-Winters method to seasonal data, several steps need to be carried out with great care. They are given in "The Analysis of Time Series" by Christopher Chatfield and are listed below62:
1. Examine a graph of the data to see whether an additive or a multiplicative seasonal effect is the more appropriate.
2. Provide starting values for $L_1$ and $T_1$ as well as seasonal values for the first year (here, the first cycle of hourly values), say $I_1, I_2, \ldots, I_s$, using the first few observations in the series in a fairly simple way; for example, the analyst could choose $L_1 = \sum_{i=1}^{s} x_i / s$.
3. Estimate values for $\alpha$, $\gamma$, $\delta$ by minimizing $\sum e_t^2$ over a suitable fitting period for which historical data are available.
4. Decide whether to normalize the seasonal indices at regular intervals by making them sum to zero in the additive case or have an average of one in the multiplicative case.
5. Choose between a fully automatic approach (for a large number of series) and a non-automatic approach. The latter allows subjective adjustments for particular series, for example, by allowing the removal of outliers and a careful selection of the appropriate form of seasonality.
62 Reinaldo Castro Souza, Mônica Barros, Cristina Vidigal C. de Miranda, Short Term Load Forecasting Using Double Seasonal Exponential Smoothing and Interventions to Account for Holidays and Temperature Effects http://www.ecomod.org/files/papers/294.pdf, p.79-80
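The multiplicative Holt-Winters recursions, together with the simple starting values mentioned in step 2, can be sketched as below. The function name is hypothetical, and the initialization (level equal to the mean of the first cycle, zero trend, seasonal indices equal to the first-cycle observations divided by that mean) is an assumption chosen for brevity; the sketch is intended for forecast horizons of at most one seasonal cycle.

```python
def holt_winters_multiplicative(y, s, alpha, gamma, delta, horizon):
    """Multiplicative Holt-Winters forecasts for 1..horizon periods ahead."""
    # Assumed starting values, built from the first seasonal cycle only.
    level = sum(y[:s]) / s
    trend = 0.0
    seasonal = [obs / level for obs in y[:s]]  # seasonal[i] belongs to phase i

    for t in range(s, len(y)):
        prev_level = level
        old_index = seasonal[t % s]  # seasonal estimate from one cycle earlier
        level = alpha * y[t] / old_index + (1 - alpha) * (prev_level + trend)
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
        seasonal[t % s] = delta * y[t] / level + (1 - delta) * old_index

    last = len(y) - 1
    # Forecast = (level + p * trend) * latest seasonal index of the target phase
    return [(level + p * trend) * seasonal[(last + p) % s]
            for p in range(1, horizon + 1)]
```

For hourly load data with a daily pattern, `s` would be 24. In practice the three constants would be estimated by minimizing the sum of squared one-step-ahead errors, as described in step 3, rather than fixed by hand.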
2.2 Test Of Stationarity
Since we are carrying out a time series analysis, we first determine whether the series is stationary; otherwise, spurious regression may be observed because of non-stationarity63. A series becomes non-stationary through one or more of the following conditions: outliers, random walk, drift, trend or changing variance64. As seen in Figure 2.1, the hourly electrical energy consumption series has seasonality, trend and also a cycle, so if the series is found to be non-stationary, we should make it stationary before the forecasting techniques can be applied65. A series is called stationary if its mean and variance are constant over time and the covariance between two observations $Y_t$ and $Y_{t-d}$ depends only on the distance d between them and does not change over time66. To test the series for stationarity, the Augmented Dickey-Fuller test (ADF test), which was developed by Dickey and Fuller in 1981, or the Phillips-Perron test (PP test) can be used. Although the two tests generally give the same result, the ADF test is preferred because it is more widely applicable.
63 Ferhat T., Serdar K., Issiz ve Bosanma Iliskisi 1970-2005 VAR Analizi, p.6
64 Yaffee and McGee, p.78
The ADF test is based on the following formula:
$$\Delta Y_t = \beta_1 + \beta_2 t + \delta Y_{t-1} + \sum_{i=1}^{m} \alpha_i \Delta Y_{t-i} + \varepsilon_t, \qquad t = 1, 2, 3, \ldots, T \tag{2.29}$$
Where $\Delta Y_t$ is the first difference of the series, t is the trend variable, $\Delta Y_{t-i}$ are the lagged first differences, $\varepsilon_t$ is the error term of the process and m is the lag length of the sum. Selecting an appropriate lag length is very important for the reliability of the test. If m is chosen too large, the power of the test is reduced; on the other hand, if m is chosen too small, the result of the ADF test may be biased by the serial correlation remaining in the errors67. For the optimum lag length, Ng and Perron suggest starting from $p = p_{max}$ and checking whether the absolute value of the t-statistic of the last lagged difference is greater than 1.6; if it is, that lag length is used, otherwise the lag length is reduced by one and the process is repeated68. The upper bound $p_{max}$ is calculated as follows, where the brackets denote the integer part:
$$p_{max} = \left[ 12 \left( \frac{T}{100} \right)^{1/4} \right] \tag{2.30}$$
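As a small worked example, the upper bound on the lag length can be computed directly from the number of observations. The rule is commonly attributed to Schwert; the function name below is illustrative only.

```python
import math


def schwert_max_lag(n_obs):
    """Upper bound for the ADF lag length: integer part of 12 * (T / 100) ** (1/4)."""
    return math.floor(12 * (n_obs / 100) ** 0.25)


# One month of hourly data (31 days * 24 hours = 744 observations)
print(schwert_max_lag(744))
```

For 744 hourly observations the bound is 19 lags, and the step-down procedure described above would then start from this value.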
65 Peter Kenedy, A Guide to Econometrics, Edition: 5, MIT Press, 2003, ISBN 026261183X,
9780262611831, p.350
66 Ajoy K. Palit, Dobrivoje Popovic, Computational intelligence in time series forecasting: theory and
engineering applications, Springer, London, 2005, ISBN:1852339489, p.18
67 Eric Zivot, Lecturer Notes: Choosing the Lag Length for the ADF Test,
http://faculty.washington.edu/ezivot/econ584/notes/unitrootLecture2.pdf, p.1
68 Zivot, p.1
In equation (2.29), both a constant (intercept) $\beta_1$ and a time trend variable t are included. The term $\beta_2 t$ is omitted from equation (2.29) if the series has a constant term $\beta_1$ but no time trend69. Through its lagged difference terms, the Augmented Dickey-Fuller test also eliminates the possibility of autocorrelated errors70.
Table 2.9: Critical Values for ADF Test
Number of          Significance Level
Observations       1%       2.5%     5%       10%
25                 -3.75    -3.33    -3.00    -2.63
50                 -3.58    -3.22    -2.93    -2.60
100                -3.51    -3.17    -2.89    -2.58
250                -3.46    -3.14    -2.88    -2.57
500                -3.44    -3.13    -2.87    -2.57
inf                -3.43    -3.12    -2.86    -2.57
Reference: MacKinnon, James (1991), Critical Values for Cointegration Tests, Chapter 13 in Robert Engle and Clive Granger, eds., Long-run Economic Relationships: Readings in Cointegration, Oxford University Press, Oxford, pp. 267-276, p.272
The ADF test defined by equation (2.29) tests whether the coefficient $\delta$ is statistically equal to zero. Under the null hypothesis the undifferenced series has a unit root, so it is not stationary. If $\delta$ is statistically significant, the null hypothesis is rejected and the series is considered stationary; if $\delta$ is not statistically significant, the null hypothesis cannot be rejected. To reach a decision, the ADF test statistic is compared with the critical values in Table 2.9, which are obtained from MacKinnon (1990). If the absolute value of the ADF statistic is less than the absolute value of the critical value in Table 2.9, we do not reject the null hypothesis and conclude that the series is not stationary.
$H_0$: The series is not stationary.
$H_1$: The series is stationary.
69 Wang Baotai, Tomson Ogwang, Is the Size Distribution of Income in Canada a Random Walk?,
If the series is found to be non-stationary, one way to make it stationary is to difference the series until it is accepted as stationary. However, with every differencing the series loses one observation. After this process, the series is called a differenced time series, which is represented by the 'I' (integrated) in the ARIMA process. The ARIMA (Auto Regressive Integrated Moving Average) process is an extension of the ARMA process.
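Differencing is simple to illustrate. The sketch below uses a hypothetical helper name; a lag of 1 gives the ordinary first difference, while a lag equal to the seasonal period (for example 24 for hourly data with a daily cycle) gives a seasonal difference.

```python
def difference(y, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag]."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


series = [1, 4, 9, 16]
first = difference(series)   # one observation is lost
second = difference(first)   # another observation is lost
print(first, second)
```

Each pass shortens the series by `lag` observations, which is the loss of data mentioned above.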
2.3 Model Checking
Before forecasting with the candidate models, the most important step is to test the adequacy of each model. Two plots are needed for this. The first is the time plot, which helps to determine whether the series contains outliers; the second is the correlogram of the residuals, which helps to assess the remaining autocorrelation. For an adequate model, the residual autocorrelations should be approximately normally distributed with mean zero and variance 1/N, where N is the number of observations. In addition, if all the autocorrelations (ACFs) are statistically equal to zero, the series is called Gaussian white noise71. For an adequate model, the residual autocorrelations should therefore lie in the interval given by the formula below72.
$$\pm \frac{2}{\sqrt{N}} \tag{2.31}$$
The portmanteau lack-of-fit test can be used to test the residual autocorrelations. Rather than examining each lag separately, it considers the first K values of the residual correlogram all at once. The test statistic is defined by the formula below:
Economics Bulletin, Vol. 3, No. 29, 2004, p.3
70 Kenedy, p.350
71 Tsay, p.31
72 Chatfield, p.68
$$Q = N \sum_{k=1}^{K} r_{z,k}^2 \tag{2.32}$$
Where N is the number of terms in the differenced series, K is chosen as a number between 15 and 30, and $r_{z,k}$ is the autocorrelation coefficient of the residuals at lag k. If the model fits the series adequately, Q is approximately distributed as $\chi^2$ with (K − p − q) degrees of freedom, where p and q are the orders of the AR and MA processes respectively73. The checks for the model estimation are listed by John E. Hanke and Dean W. Wichern as:
1. Many of the same residual plots that are useful in regression analysis can be
developed for the residual from an ARIMA model. A histogram and a normal
probability plot (to check for normality) and a time sequence plot (to check for
outliers) are particularly helpful.
2. The individual residual autocorrelations should be small and generally be within $\pm 2/\sqrt{N}$ of zero. Significant residual autocorrelations at low lags or seasonal lags suggest the model is inadequate and a new or modified model should be selected.
3. The residual autocorrelations as a group should be consistent with those
produced by random errors.
An enhanced version of the portmanteau test, called the Ljung-Box Q test, is used to examine the adequacy of the model. The Ljung-Box Q statistic is given by the formula below:
$$Q = N(N+2) \sum_{k=1}^{K} \frac{r_k^2(e)}{N-k} \tag{2.33}$$
Where the parameters are:
$r_k(e)$ = the residual autocorrelation at lag k
N = the number of residuals
k = the time lag
K = the number of time lags to be tested
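The Ljung-Box statistic can be computed from the residuals with a few lines of Python. This is a minimal sketch with hypothetical function names; the p-value is not computed here, so the result would still have to be compared with a chi-square table for the appropriate degrees of freedom.

```python
def residual_autocorrelations(residuals, max_lag):
    """Sample autocorrelations of the residuals for lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    return [
        sum((residuals[t] - mean) * (residuals[t - k] - mean) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over the first max_lag residual autocorrelations."""
    n = len(residuals)
    r = residual_autocorrelations(residuals, max_lag)
    return n * (n + 2) * sum(rk ** 2 / (n - k) for k, rk in enumerate(r, start=1))
```

A large value of Q relative to the chi-square critical value indicates that the residuals are still autocorrelated and that the model is not adequate.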
As indicated by Ruey S. Tsay, the residuals of an adequate model should behave like white noise. The ACF and the Ljung-Box Q (LBQ) statistic of the residuals can be used to check how close the residuals are to white noise. For example, a series whose residual autocorrelation function shows additional serial correlation should be examined more closely. For an AR(p) model, the Ljung-Box statistic Q(m) asymptotically follows a chi-square distribution with m − g degrees of freedom, where g is the number of AR coefficients used in the model. If a fitted model is found to be inadequate, it must be refined, for example by removing the coefficients that are not significant so that the model is simplified74.
Based on the result of the test, we can test the hypothesis that the model is adequate for the time series data and can be used for forecasting. If the p-value is greater than the significance level (p-value > 0.05 for the 5 percent significance level), then the null hypothesis is not rejected75.
· H0: The model adequately describes your data
· H1: The model does not adequately describe your data
Once the null hypothesis is not rejected, the next step is to select a model from among the adequate models. The next section summarizes the model selection criteria.
73 Chatfield, p.68
74 Tsay, p.44
75 Hanke and Wichern, p.392
Another important check of the model is the goodness-of-fit test, which is used to test whether the model fits the time series. The test statistic is R-square ($R^2$), which is defined by the following formula:
$$R^2 = 1 - \frac{\text{Residual sum of squares}}{\text{Total sum of squares}} \tag{2.34}$$
$$R^2 = 1 - \frac{\sum_{t=p+1}^{T} e_t^2}{\sum_{t=p+1}^{T} (r_t - \bar{r})^2} \tag{2.35}$$
$$\bar{r} = \frac{\sum_{t=p+1}^{T} r_t}{T-p} \tag{2.36}$$
Where T is the number of observations, $r_t$ is the observed series, $e_t$ is the residual at time t and p is the order of the fitted AR model. $R^2$ takes a value in the interval from 0 to 1, that is, $0 \le R^2 \le 1$. A model with a larger R-square value fits the time series better. However, the goodness-of-fit test is valid only for stationary time series76.
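The R-square statistic can be computed as in the sketch below. The function name and argument layout are assumptions: `series` holds all T observations, `residuals` holds the T − p residuals of an AR(p) fit, and the first p observations are skipped because no residual exists for them.

```python
def r_squared(series, residuals, p):
    """R-square of an AR(p) fit from the series and its T - p residuals."""
    used = series[p:]  # observations for which a residual exists
    if len(residuals) != len(used):
        raise ValueError("expected one residual per observation after the first p")
    mean = sum(used) / len(used)
    rss = sum(e * e for e in residuals)          # residual sum of squares
    tss = sum((r - mean) ** 2 for r in used)     # total sum of squares
    return 1 - rss / tss
```

For example, `r_squared([1, 2, 3, 4, 5], [0.5, -0.5, 0.5, -0.5], p=1)` gives 0.8, since the residual sum of squares is 1 and the total sum of squares is 5.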
2.4 Model Selection Criteria
The Akaike information criterion (AIC)77 and the Schwarz Bayesian information criterion (BIC)78 enable us to determine the most accurate forecasting model. These criteria are defined below, where $\hat{\sigma}^2$ is the residual sum of squares divided by the number of observations, T is the number of observations (residuals) and r is the total number of parameters (including the constant term) in the ARIMA model:
76 Tsay, p.46-47
77 Hirotsugu Akaike, A New Look at the Statistical Model Identification, IEEE Transactions on Automatic Control, AC-19, 1974, p.716-723
78 Gideon Schwarz, Estimating the Dimension of a Model, Annals of Statistics, Vol. 6, No. 2, March 1978, p.461-464
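Mean Square Error (MSE)
$$MSE = \hat{\sigma}^2 = \frac{\sum_{t=1}^{T} e_t^2}{T} \tag{2.37}$$
Akaike Information Criterion (AIC)
$$AIC = \ln \hat{\sigma}^2 + \frac{2r}{T} \tag{2.38}$$
Schwarz Bayesian Information Criterion (BIC)
$$BIC = \ln \hat{\sigma}^2 + \frac{\ln T}{T}\, r \tag{2.39}$$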
AIC and BIC tend to give the same result, so either criterion can be used for model selection. However, BIC imposes a larger "penalty factor" for each additional parameter, so the model selected by BIC generally has no more parameters than the model selected by AIC. If the two criteria disagree, the model chosen by BIC, which has fewer parameters, is suggested. AIC and BIC should be regarded as additional procedures that help in selecting an accurate model; they are not a substitute for a careful examination of the sample autocorrelations and partial autocorrelations79. Although AIC or BIC suggests the best forecasting model for the time series, other descriptive indicators of the performance of the forecasting model should also be kept in mind. In the next section, other indicators for testing model accuracy are presented.
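Both criteria can be computed from the residuals of a fitted model. The sketch below follows the definitions of this section; the function name is hypothetical, and the residuals are assumed to come from a model that has already been estimated.

```python
import math


def aic_bic(residuals, n_params):
    """Return (AIC, BIC) from model residuals and the number of parameters."""
    T = len(residuals)
    sigma2 = sum(e * e for e in residuals) / T   # residual sum of squares / T
    aic = math.log(sigma2) + 2 * n_params / T
    bic = math.log(sigma2) + n_params * math.log(T) / T
    return aic, bic


# Hypothetical residuals of a model with 2 parameters
print(aic_bic([1, -1] * 5, n_params=2))
```

Because ln T exceeds 2 once the series has 8 or more observations, the BIC penalty per parameter is larger than the AIC penalty for any series of realistic length.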
2.5 Testing Of Forecasting Accuracy
The accuracy of a model can be tested by comparing its input variables with its output variables80. For a forecasting model, the input variables are the data observed up to the time of forecasting, and the output variables are the forecasts for the desired period. The forecasting error is simply the difference between the actual values and the forecast values. The following formulas should always be kept in mind during the forecasting procedure.
79 Hanke and Wichern, p.413
1. Mean percentage error (MPE):
$$MPE = \frac{1}{T} \sum_{t=1}^{T} \frac{Y_t - \hat{Y}_t}{Y_t}$$
2. Mean absolute percentage error (MAPE):
$$MAPE = \frac{1}{T} \sum_{t=1}^{T} \frac{|Y_t - \hat{Y}_t|}{Y_t}$$
3. Mean squared error (MSE):
$$MSE = \frac{1}{T} \sum_{t=1}^{T} (Y_t - \hat{Y}_t)^2$$
4. Root mean squared error (RMSE):
$$RMSE = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (Y_t - \hat{Y}_t)^2}$$
5. Mean absolute deviation (MAD):
$$MAD = \frac{1}{T} \sum_{t=1}^{T} |Y_t - \hat{Y}_t|$$
6. Forecast error, or residual (e):
$$e_t = Y_t - \hat{Y}_t$$
7. t statistic for testing the significance of lag 1 autocorrelation (t):
$$t = \frac{r_1}{SE(r_1)}$$
8. Random model (Y):
$$Y_t = c + \varepsilon_t$$
9. Ljung-Box (Modified Box-Pierce) Q statistic (Q):
$$Q = T(T+2) \sum_{k=1}^{m} \frac{r_k^2}{T-k}$$
10. Standard error of autocorrelation coefficient (SE):
$$SE(r_k) = \sqrt{\frac{1 + 2\sum_{i=1}^{k-1} r_i^2}{T}}$$
11. kth order autocorrelation coefficient (r):
$$r_k = \frac{\sum_{t=k+1}^{T} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{T} (Y_t - \bar{Y})^2}$$
In these formulas T is the number of observations.
80 Minitab Inc. Single And Double Exponential Smoothing, May. 15, 2001, p.7
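The error measures listed above are straightforward to compute once actual and forecast values are available. The following sketch uses a hypothetical function name and returns MPE and MAPE as fractions rather than percentages, in line with the formulas; it assumes that no actual value is zero.

```python
import math


def accuracy_measures(actual, forecast):
    """Return MPE, MAPE, MSE, RMSE and MAD for paired actual/forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]   # e_t = Y_t - forecast
    T = len(errors)
    mse = sum(e * e for e in errors) / T
    return {
        "MPE": sum(e / a for e, a in zip(errors, actual)) / T,
        "MAPE": sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / T,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAD": sum(abs(e) for e in errors) / T,
    }
```

With actual values `[100, 200]` and forecasts `[110, 180]`, the positive and negative percentage errors cancel in the MPE (0), while the MAPE is 0.1, the MSE is 250 and the MAD is 15. This shows why MPE measures bias and MAPE measures accuracy.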
2.6 Analysis Of Outlier
The success of an analysis starts with careful data collection: an error or lapse of attention at this stage may seriously affect the analysis. Hawkins (1980) describes an outlier as an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism81. Any outlying data points in a time series may therefore mislead the analysis during the modeling process. Since unpredictable events such as strikes, outbreaks of war and sudden changes in marketing strategy can occur at any time, time series data are directly affected by such interventions. Because the effects of such events can distort the parameter estimates, the forecasts and the seasonal adjustment, outliers should be identified before a forecasting model is applied82. The reasons for outliers can be classified into four classes83:
· Procedural error: this kind of error generally arises from a lack of attention during data entry. Procedural errors can be eliminated in data cleaning.
· Extraordinary event: an event that explains the uniqueness of the observation. The researcher must decide whether the observation made during the extraordinary event is kept in the analysis or not.
· Extraordinary observation: an observation whose origin cannot be explained. Generally this kind of observation should be omitted.
· Outlier within the range of the population: sometimes outliers lie within the ordinary range of values. Such outliers must be eliminated if there is specific evidence that the observation is not a valid member of the population.
In time series analysis, if we consider an AR(p) model, two kinds of outliers may be present in the series. The first is the additive outlier (AO), which affects the series at a single point; the second is the innovative outlier (IO), which enters through the innovation and therefore affects the observation at which it occurs as well as the subsequent observations. The effects of the two types, AO and IO, are evaluated and measured separately84. Mathematically, an additive outlier at time h is defined as:
$$y_t = \begin{cases} x_t + \omega, & \text{if } t = h \\ x_t, & \text{otherwise} \end{cases}$$
Where $\omega$ is the magnitude of the outlier and $x_t$ is an outlier-free time series.
81 Irad Ben-Gal, Outlier Detection, Department of Industrial Engineering, Tel-Aviv University, p.1
82 Abraham and Ledolter, p.356
83 Hanke and Wichern, p.64-65
According to Tsay, the types of outliers can be listed as85:
· Additive outliers (AO)
· Innovative outliers (IO)
· Level Shift (LS)
· Permanent level change (LC)
· Transient level change (TC)
· Variance change (VC)
The identification of outliers can be performed from a univariate, bivariate or multivariate perspective.
2.6.1 Univariate Detection Of Outlier
Detection of univariate outliers depends on a known distribution of the data. The analysis assumes a generic model in which a small number of observations are drawn from distributions $G_1, \ldots, G_k$ that differ from the target distribution F, which is usually taken to be a normal distribution $N(\mu, \sigma^2)$86. For a confidence coefficient $\alpha$, $0 < \alpha < 1$, the $\alpha$-outlier region of $N(\mu, \sigma^2)$ is defined by
$$out(\alpha, \mu, \sigma^2) = \{ x : |x - \mu| > z_{1-\alpha/2}\, \sigma \}$$
where $z_{1-\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution. A value x that lies in this region is an $\alpha$-outlier with respect to F.
84 Watson S. M., Tight M., Clark S., Redfern E., Detection of Outlier in Time Series, Institute of Transport Studies, University of Leeds, Working Paper 362, 1991, p.1.3
85 Watson S. M., Tight M., Clark S., Redfern E., p.5
Univariate detection is based on standard scores: each observation is converted to a standard score and compared with a threshold to decide whether it is an outlier. Typically, for small samples (about 80 observations or fewer), observations with a standard score of 2.5 or greater are treated as outliers. For larger samples, the threshold can be raised to between 3 and 4 standard scores87.
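Standard-score screening can be sketched as follows. The function name and the sample data are hypothetical; the default threshold of 2.5 corresponds to the small-sample guideline above, and the sample standard deviation is used to compute the scores.

```python
import statistics


def zscore_outliers(values, threshold=2.5):
    """Return (index, value) pairs whose absolute standard score exceeds the threshold."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [(i, v) for i, v in enumerate(values) if abs(v - mean) / sd > threshold]


# Hypothetical hourly readings with one suspicious value at the end
readings = [10] * 19 + [100]
print(zscore_outliers(readings))
```

Note that a single extreme value inflates both the mean and the standard deviation, so in very small samples even a gross outlier may not reach the threshold.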
2.6.2 Bivariate Detection Of Outlier
In univariate detection, the outlier boundary is estimated by the standard score Z. In bivariate detection, two variables are used to draw a scatterplot together with a boundary for the valid values of the data88. Data points that fall outside the confidence boundary are accepted as outliers.
86 Ben-Gal, p.2
87 Hanke and Wichern, p.65
88 Hanke and Wichern, p.65
Figure 2.4: Scatterplot for Bivariate Outlier Detection
2.6.3 Multivariate Detection Of Outlier
This type of outlier detection is used for multivariate data sets. The method relies on the Mahalanobis distance (Mahalanobis $D^2$), which was proposed in 1936 by P. C. Mahalanobis89. The Mahalanobis distance is applied in the linear regression model. As shown in Figure 2.11, a straight line is fitted in the model and the Mahalanobis distance of each observation is calculated. An observation with a larger distance has more influence on the slope, that is, on the coefficients of the regression model. The Mahalanobis distance is defined by the formula below90, where S is the covariance matrix:
$$D^2 = (Y_1 - Y_2)' \, S^{-1} (Y_1 - Y_2) \tag{2.40}$$
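For two variables the Mahalanobis distance can be written out without matrix libraries, because the inverse of a 2 × 2 covariance matrix has a closed form. The sketch below is an illustration only: the function name is hypothetical, and the covariance matrix S is estimated from a sample of (x, y) pairs with the usual n − 1 divisor.

```python
def mahalanobis_sq(y1, y2, data):
    """Squared Mahalanobis distance between two 2-D points, with S estimated from data."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    # Sample covariance matrix S = [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = y1[0] - y2[0], y1[1] - y2[1]
    # d' S^{-1} d, using the closed-form inverse of a 2 x 2 matrix
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
```

To screen observations, each point would typically be compared with the sample mean vector, and points with an unusually large distance would be examined as potential multivariate outliers.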
89 Alvin C. Rencher, Methods of multivariate analysis, Edition: 2, John Wiley and Sons, 2002, ISBN
0471418897, 9780471418894, p.76
90 Rencher, p.76
SECTION 3
3 APPLICATIONS OF FORECASTING METHODS TO THE
ELECTRICAL ENERGY DATA OF TRAKYA REGION FOR
SHORT TERM ENERGY DEMAND
In this section, the forecasting techniques introduced in the previous section are applied to the data. As described, forecasting methods are classified as quantitative and qualitative. Qualitative methods are mainly used when there are not enough observations, and generally for long term forecasting. Among the qualitative methods, the Delphi Method generates forecasts that depend on expert opinion; once a consensus is reached and the result is accepted, the forecasting model can be used only for the case being discussed. The second qualitative method, Scenario Writing, aims to produce long term forecasts for subjects such as a new marketing strategy or a technological improvement to a product, so it is not practical for number-based problems. Market Research and Focus Groups are kinds of survey that reveal what people think about present products or services in order to find out the effect of a new product or service. Despite their disadvantages for short term forecasting, qualitative methods are systematic ways of generating long term forecasts even when no suitable data are available. Since quantitative methods are better suited to number-based problems, they are used here to generate forecasts together with performance measures that enable us to compare them. At the end of each method's application, the advantages and disadvantages of the method are presented together with its error measures.
In this research, we have the electrical consumption data of the Trakya region in Turkey for the whole year of 2005 and for parts of 2006 and 2007, 23 months in total. The data include both the sum of active energy and the sum of reactive energy, recorded hourly at the transformers that supply the Trakya region, as well as the hourly load of each transformer. However, the reactive energy totals contain some empty fields, which prevents building a forecasting model for them. Therefore, the research focuses on forecasting active power. The data are converted into a single column that contains only the active energy values for the whole year 2005, from August to December 2006 and from January to June 2007. Although the data contain a whole year of hourly active energy values, the data of the first month are used to establish the best fitted forecasting model, such as an ARMA(p, q) model or a smoothing method, for the short term electrical energy forecast. One month of hourly data provides enough observations to build an accurate forecasting model. Furthermore, for the first month, January 2005, all the models are established and the related results are given separately in the analysis of each forecasting model. In this way, at the end of the forecasting process, we can compare the results of each forecasting model against the actual consumption values.
3.1 Exploring Data Pattern
A time series is a sequence of observations of a variable over time, so each observation carries information about the observations that precede it. This kind of relation is called autocorrelation. The autocorrelation coefficient gives the correlation function of the series and also provides information about the pattern of the estimated model92. Therefore, before starting the time series analysis, the autocorrelation and the data pattern of the series need to be analyzed.
92 Ajoy K. Palit, Dobrivoje Popovic, Computational intelligence in time series forecasting: theory and engineering applications, Springer, London, 2005, ISBN:1852339489, p.60
$$r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}, \qquad k = 0, 1, 2, \ldots \tag{3.1}$$
Where,
$r_k$ = autocorrelation coefficient for lag k
$Y_{t-k}$ = observation at time period t-k
$\bar{Y}$ = mean of the series
$Y_t$ = observation at time period t
n = number of observations
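The autocorrelation coefficient can be computed directly from its definition, and scanning it over several lags is a simple way to spot a seasonal period. The sketch below uses a hypothetical function name and an artificial series with a period of 4 in place of the hourly load data.

```python
def autocorrelation(y, k):
    """Sample autocorrelation coefficient of the series y at lag k."""
    n = len(y)
    mean = sum(y) / n
    numerator = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    denominator = sum((value - mean) ** 2 for value in y)
    return numerator / denominator


# Artificial series that repeats every 4 observations
series = [0, 1, 0, -1] * 6
coefficients = {k: autocorrelation(series, k) for k in range(1, 7)}
seasonal_lag = max(coefficients, key=coefficients.get)
print(seasonal_lag)
```

In the artificial series the largest coefficient occurs at lag 4, the length of its cycle; for hourly consumption data the same scan would be expected to peak at the length of the daily cycle.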
[Scatterplot of powerJan2005 vs powerJan2005_Diff1]
Figure 3.1: Scatter plot of January 2005 with Lag 1 Difference
Table 3.1: Autocorrelation of January 2005 with Lag 1 Difference
Lag ACF T LBQ Lag ACF T LBQ
1 0,959742 26,18 688,07 16 0,109403 1,09 2425,82
2 0,876469 14,18 1262,69 17 0,195377 1,95 2454,96
3 0,765886 9,98 1702,05 18 0,294581 2,92 2521,3
4 0,642767 7,44 2011,93 19 0,399198 3,92 2643,3
5 0,517686 5,59 2213,21 20 0,502832 4,83 2837,13
6 0,392552 4,07 2329,1 21 0,601329 5,61 3114,71
7 0,274712 2,79 2385,93 22 0,686268 6,15 3476,76
8 0,170672 1,71 2407,9 23 0,74461 6,35 3903,57
9 0,08651 0,87 2413,55 24 0,762993 6,18 4352,33
10 0,023943 0,24 2413,98 25 0,721686 5,57 4754,38
11 -0,01311 -0,13 2414,11 26 0,640737 4,75 5071,74
12 -0,0283 -0,28 2414,72 27 0,534974 3,85 5293,28
13 -0,02622 -0,26 2415,24 28 0,41895 2,96 5429,34
14 -0,00294 -0,03 2415,25 29 0,300678 2,1 5499,52
15 0,043561 0,44 2416,69 30 0,183286 1,27 5525,63
According to the autocorrelation results, the correlation between $Y_t$ and $Y_{t-1}$ is positive and the lag 1 autocorrelation coefficient is $r_1$ = 0,959742, which means that two consecutive data points are highly correlated. As the lag increases, however, the correlation becomes lower. As seen from Figure 3.1, the scatter plot is not a straight line; the points are spread over a very wide range, because the autocorrelations are very small at higher lags. What is more, Table 3.1 shows that after decreasing, the autocorrelation rises again and at lag 24 reaches 0,762993, its highest value for the rest of the series. This means that there is a seasonality that recurs every 24 observations.
The lag 1 autocorrelation between the seasonally differenced data and the raw data is $r_1$ = 0,996995, and the correlation values decrease very slowly compared with the autocorrelation table for the raw data and the lag 1 differenced data. This means that the two series are very highly correlated, so it can be said that the series has a seasonality of 24 hours. As seen from Figure 3.2, the scatter plot is not a straight line, but compared with Figure 3.1 the autocorrelations are handled more efficiently.
[Trend Analysis Plot for power0105_Bus: actual values, fits and forecasts plotted against the observation index. Linear Trend Model: Yt = 3344,9 + 0,374*t. Accuracy measures: MAPE 21, MAD 627, MSD 497039.]
Figure 3.3: Trend Line Plot for January 2005
[Trend Analysis Plot for power0105_Bus: actual values, fits and forecasts plotted against the observation index. Growth Curve Model: Yt = 3264,61 * (1,00011**t). Accuracy measures: MAPE 21, MAD 645, MSD 503538.]
Figure 3.4: Growth Curve Trend Model Plot for January 2005
[Trend Analysis Plot for power0105_Bus: actual values, fits and forecasts plotted against the observation index. Quadratic Trend Model: Yt = 3323 + 0,67*t - 0,00069*t**2. Accuracy measures: MAPE 21, MAD 627, MSD 496947.]
Figure 3.5: Quadratic Trend Model for January 2005