1. 445: Posttreatment stability in Class II nonextraction and two-maxillary p...Rafi Romano
This study compared the long-term stability of Class II malocclusion treatments with and without extraction of two maxillary premolars. 60 patients were divided into two groups - one treated with non-extraction and the other with extraction of two premolars. Cephalograms were taken before, after treatment, and 8 years post-treatment on average. The results showed similar stability between the two protocols for overjet, overbite, and canine and molar relationships over the long-term. However, the extraction group showed greater counterclockwise rotation compared to the non-extraction group. The conclusion was that both protocols provide similar long-term stability for treating Class II malocclusions.
Eletropaulo's total market grew 3.3% in 2005. The company received additional revenue from tariff adjustments and bond/debenture issuances. However, the company reported a loss of R$184.4 million due to extraordinary provisions and allowances, including R$346.4 million for MGSP and R$43.7 million for increased PIS/COFINS taxes. The loss and debt costs were expected to improve in 2006 as the extraordinary impacts were non-recurring and debt was restructured at lower costs.
Este capítulo discute a análise de regressão múltipla e a inferência estatística. Apresenta os testes t e intervalos de confiança para os coeficientes de regressão e discute a importância da hipótese de normalidade dos resíduos para a aplicação correta destes testes. Também introduz o teste de Jarque-Bera para avaliar a normalidade dos resíduos.
This document summarizes a presentation on short term demand forecasting for the Brazilian power system. It provides an overview of the Brazilian power system, including its history of government control and ongoing privatization process. It then discusses short term load forecasting and the importance of accounting for temperature effects and holidays. The presented work uses a version of Holt-Winters exponential smoothing to generate short term forecasts and plans to incorporate temperature data to improve results.
O documento discute a modelagem da demanda por revistas no Brasil com o objetivo de fornecer previsões precisas. Ele descreve as características do mercado de revistas no Brasil, as séries temporais de vendas disponíveis, e os modelos estatísticos usados para prever a demanda, incluindo regressão dinâmica e modelos bayesianos. O documento também discute os desafios em prever a demanda para revistas com periodicidade quinzenal devido às frequentes revisões nos dados.
1) O documento discute dados em painel, que combinam séries temporais e seção cruzada para várias unidades ao longo do tempo. 2) Apresenta dois modelos comuns para análise de dados em painel: modelo de efeitos fixos e modelo de efeitos aleatórios. 3) Usa como exemplo dados de investimento de 4 empresas entre 1935-1954 para ilustrar a análise em painel no software Stata.
O documento apresenta um resumo sobre estatística aplicada à logística. Aborda conceitos como população, amostra, variáveis, estatística descritiva e seus métodos de síntese de dados qualitativos e quantitativos. Tem como objetivo fornecer uma revisão sobre esses tópicos para aplicação em modelos de otimização logística.
1. 445: Posttreatment stability in Class II nonextraction and two-maxillary p...Rafi Romano
This study compared the long-term stability of Class II malocclusion treatments with and without extraction of two maxillary premolars. 60 patients were divided into two groups - one treated with non-extraction and the other with extraction of two premolars. Cephalograms were taken before, after treatment, and 8 years post-treatment on average. The results showed similar stability between the two protocols for overjet, overbite, and canine and molar relationships over the long-term. However, the extraction group showed greater counterclockwise rotation compared to the non-extraction group. The conclusion was that both protocols provide similar long-term stability for treating Class II malocclusions.
Eletropaulo's total market grew 3.3% in 2005. The company received additional revenue from tariff adjustments and bond/debenture issuances. However, the company reported a loss of R$184.4 million due to extraordinary provisions and allowances, including R$346.4 million for MGSP and R$43.7 million for increased PIS/COFINS taxes. The loss and debt costs were expected to improve in 2006 as the extraordinary impacts were non-recurring and debt was restructured at lower costs.
Este capítulo discute a análise de regressão múltipla e a inferência estatística. Apresenta os testes t e intervalos de confiança para os coeficientes de regressão e discute a importância da hipótese de normalidade dos resíduos para a aplicação correta destes testes. Também introduz o teste de Jarque-Bera para avaliar a normalidade dos resíduos.
This document summarizes a presentation on short term demand forecasting for the Brazilian power system. It provides an overview of the Brazilian power system, including its history of government control and ongoing privatization process. It then discusses short term load forecasting and the importance of accounting for temperature effects and holidays. The presented work uses a version of Holt-Winters exponential smoothing to generate short term forecasts and plans to incorporate temperature data to improve results.
O documento discute a modelagem da demanda por revistas no Brasil com o objetivo de fornecer previsões precisas. Ele descreve as características do mercado de revistas no Brasil, as séries temporais de vendas disponíveis, e os modelos estatísticos usados para prever a demanda, incluindo regressão dinâmica e modelos bayesianos. O documento também discute os desafios em prever a demanda para revistas com periodicidade quinzenal devido às frequentes revisões nos dados.
1) O documento discute dados em painel, que combinam séries temporais e seção cruzada para várias unidades ao longo do tempo. 2) Apresenta dois modelos comuns para análise de dados em painel: modelo de efeitos fixos e modelo de efeitos aleatórios. 3) Usa como exemplo dados de investimento de 4 empresas entre 1935-1954 para ilustrar a análise em painel no software Stata.
O documento apresenta um resumo sobre estatística aplicada à logística. Aborda conceitos como população, amostra, variáveis, estatística descritiva e seus métodos de síntese de dados qualitativos e quantitativos. Tem como objetivo fornecer uma revisão sobre esses tópicos para aplicação em modelos de otimização logística.
Este documento discute o modelo de análise envoltória de dados (DEA) utilizado pela Agência Nacional de Energia Elétrica (ANEEL) para calcular os custos operacionais eficientes das empresas de transmissão de energia elétrica no Brasil. O autor propõe uma adaptação ao modelo DEA da ANEEL, considerando os níveis de tensão das linhas de transmissão e restrições aos multiplicadores. Os resultados do modelo proposto são comparados com o modelo original da ANEEL.
The document discusses decision support methods for energy auctions and risk in the new Brazilian electricity sector regime. It proposes a simulation model to optimize energy purchasing costs for distributors. The model integrates an energy demand forecasting model with simulations of the settlement price (PLD) using hydro dispatch optimization software. It also simulates different percentages of estimated load purchased each year to minimize total costs over multiple years amid demand and PLD uncertainties.
1. O documento descreve modelos de regressão com variáveis binárias, explicando como especificar corretamente variáveis qualitativas com m categorias para evitar colinearidade.
2. É apresentado um exemplo usando dados eleitorais do Rio de Janeiro para ilustrar como interpretar os coeficientes em modelos com e sem termo constante.
3. Os resultados mostram que a região da cidade influencia os percentuais de votos brancos e nulos, sendo menores nas zonas Sul e Norte.
Este documento analisa os riscos enfrentados pelas distribuidoras de energia elétrica no Brasil sob o novo modelo regulatório, no qual o governo centraliza as aquisições de energia. As distribuidoras precisam estimar com precisão a demanda de suas áreas para os próximos cinco anos e são punidas financeiramente por erros de estimativa. A pesquisa simula os custos de contratação para uma distribuidora hipotética considerando variáveis aleatórias e conclui que o sistema é frágil, pois pequenas mudanças nos dados
Este capítulo discute o modelo de regressão múltipla com duas variáveis explicativas e apresenta os
estimadores de mínimos quadrados ordinários. Os coeficientes parciais de regressão medem o efeito
de cada variável explicativa sobre a variável dependente quando o efeito da outra variável é
mantido constante. Os estimadores MQO são obtidos resolvendo um sistema de equações que
envolve a matriz dos dados e os vetores de parâmetros e erros. A variância dos estimadores
depende de um parâmetro que mede a variância dos
Este documento apresenta um tutorial sobre econometria e o uso do software Gretl, com o objetivo de ensinar conceitos básicos de econometria e análise de dados econômicos. O documento introduz o que é econometria, características dos dados econômicos e o software Gretl, e fornece exemplos passo a passo de estatística descritiva, correlação, regressão linear simples e múltipla utilizando o Gretl.
The document discusses residential electricity consumption in Brazil following economic reforms in 1994. It summarizes that inflation fell dramatically, increasing incomes especially for the poor. Combined with increasing electronics imports, this caused unprecedented growth in residential electricity consumption. A new survey is being conducted to create an updated residential consumer profile across Brazil to aid planning and demand side management policies, as consumption growth has strained energy supply. Cluster analysis is being used to group municipalities for a stratified sampling approach in the surveys.
O documento discute os modelos de dependência espacial em econometria. Apresenta os principais conceitos da econometria espacial, como matrizes de pesos espaciais, operador de defasagem espacial, autocorrelação espacial e testes para sua detecção. Também explica os dois modelos mais usados - modelo de defasagem espacial e modelo de erro espacial - e como especificar e testar a presença de dependência espacial nos dados.
A hybrid neural network and neuro-fuzzy system model was developed to forecast weekly electricity spot prices in Brazil. The model combines the forecasts from an artificial neural network with the original inputs in a neuro-fuzzy system to generate the final forecasts. Six models were created to specialize in forecast horizons of 1 to 6 weeks ahead. Various neural network structures were tested to select the best performing models. The hybrid approach aims to incorporate the advantages of both neural networks and neuro-fuzzy systems for improved forecasting performance.
Este documento fornece uma introdução ao ambiente R, abordando tópicos como a instalação e uso básico do R, pacotes disponíveis, objetos fundamentais como vetores e matrizes, programação em R incluindo funções e depuração, e manipulação e visualização de dados.
The document summarizes the results of a 3D scan comparison between a reference model (Sheet Metal_R) and test model (Sheet Metal_T). It found 1,686,643 matching data points with an average deviation of 0.3mm and standard deviation of 0.9mm. The maximum upper deviation was 36.1mm and maximum lower deviation was -32.4mm, with 7928 outliers found. The deviation distribution chart shows that most points (50%) fell within 1 standard deviation of the mean.
The document contains numerical examples and calculations comparing different root finding methods: bisection method, false position method, secant method, and Newton's method. It shows the iterations, calculated values of x and f(x), and percent error for finding the root of sample functions using each method. Graphs are also included to help visualize the functions and convergence of solutions.
The document contains numerical examples and calculations using several root finding methods, including the bisection method, false position method, secant method, and Newton's method. Tables show the iterative calculations for each method applied to various functions, recording the iteration number, input and output values, and estimation of error. Graphs of functions are also shown. The methods are then compared based on the rate of convergence and number of iterations required to achieve a solution.
Operational Value at Risk is estimated using an Advanced Measurement Approach involving statistical modeling of operational risk loss data. Loss severity and frequency are each modeled separately using distributions like lognormal or Poisson. Monte Carlo simulation is then used to aggregate the severity and frequency distributions and estimate unexpected losses, expected losses, and the capital at risk needed to sustain operational losses exceeding expectations.
Susan Borda is a digital curation librarian who provides guidance on data management. She discusses the importance of data management plans and best practices for the various stages of the research data lifecycle including planning, collecting, managing, sharing, and preserving data. She highlights funder requirements for data management plans and tools like DMPTool that can help researchers create plans and meet funder standards. Borda also offers tips for organizing, documenting, and storing data as well as sharing data through repositories to increase citations and ensure long-term preservation.
Several of the largest MMOG companies reported their Q4/08 results over the past week; including Shanda, Netease,
and The9. The MMOG market continued to grow on a sequential basis in Q4.
Este documento discute o modelo de análise envoltória de dados (DEA) utilizado pela Agência Nacional de Energia Elétrica (ANEEL) para calcular os custos operacionais eficientes das empresas de transmissão de energia elétrica no Brasil. O autor propõe uma adaptação ao modelo DEA da ANEEL, considerando os níveis de tensão das linhas de transmissão e restrições aos multiplicadores. Os resultados do modelo proposto são comparados com o modelo original da ANEEL.
The document discusses decision support methods for energy auctions and risk in the new Brazilian electricity sector regime. It proposes a simulation model to optimize energy purchasing costs for distributors. The model integrates an energy demand forecasting model with simulations of the settlement price (PLD) using hydro dispatch optimization software. It also simulates different percentages of estimated load purchased each year to minimize total costs over multiple years amid demand and PLD uncertainties.
1. O documento descreve modelos de regressão com variáveis binárias, explicando como especificar corretamente variáveis qualitativas com m categorias para evitar colinearidade.
2. É apresentado um exemplo usando dados eleitorais do Rio de Janeiro para ilustrar como interpretar os coeficientes em modelos com e sem termo constante.
3. Os resultados mostram que a região da cidade influencia os percentuais de votos brancos e nulos, sendo menores nas zonas Sul e Norte.
Este documento analisa os riscos enfrentados pelas distribuidoras de energia elétrica no Brasil sob o novo modelo regulatório, no qual o governo centraliza as aquisições de energia. As distribuidoras precisam estimar com precisão a demanda de suas áreas para os próximos cinco anos e são punidas financeiramente por erros de estimativa. A pesquisa simula os custos de contratação para uma distribuidora hipotética considerando variáveis aleatórias e conclui que o sistema é frágil, pois pequenas mudanças nos dados
Este capítulo discute o modelo de regressão múltipla com duas variáveis explicativas e apresenta os
estimadores de mínimos quadrados ordinários. Os coeficientes parciais de regressão medem o efeito
de cada variável explicativa sobre a variável dependente quando o efeito da outra variável é
mantido constante. Os estimadores MQO são obtidos resolvendo um sistema de equações que
envolve a matriz dos dados e os vetores de parâmetros e erros. A variância dos estimadores
depende de um parâmetro que mede a variância dos
Este documento apresenta um tutorial sobre econometria e o uso do software Gretl, com o objetivo de ensinar conceitos básicos de econometria e análise de dados econômicos. O documento introduz o que é econometria, características dos dados econômicos e o software Gretl, e fornece exemplos passo a passo de estatística descritiva, correlação, regressão linear simples e múltipla utilizando o Gretl.
The document discusses residential electricity consumption in Brazil following economic reforms in 1994. It summarizes that inflation fell dramatically, increasing incomes especially for the poor. Combined with increasing electronics imports, this caused unprecedented growth in residential electricity consumption. A new survey is being conducted to create an updated residential consumer profile across Brazil to aid planning and demand side management policies, as consumption growth has strained energy supply. Cluster analysis is being used to group municipalities for a stratified sampling approach in the surveys.
O documento discute os modelos de dependência espacial em econometria. Apresenta os principais conceitos da econometria espacial, como matrizes de pesos espaciais, operador de defasagem espacial, autocorrelação espacial e testes para sua detecção. Também explica os dois modelos mais usados - modelo de defasagem espacial e modelo de erro espacial - e como especificar e testar a presença de dependência espacial nos dados.
A hybrid neural network and neuro-fuzzy system model was developed to forecast weekly electricity spot prices in Brazil. The model combines the forecasts from an artificial neural network with the original inputs in a neuro-fuzzy system to generate the final forecasts. Six models were created to specialize in forecast horizons of 1 to 6 weeks ahead. Various neural network structures were tested to select the best performing models. The hybrid approach aims to incorporate the advantages of both neural networks and neuro-fuzzy systems for improved forecasting performance.
Este documento fornece uma introdução ao ambiente R, abordando tópicos como a instalação e uso básico do R, pacotes disponíveis, objetos fundamentais como vetores e matrizes, programação em R incluindo funções e depuração, e manipulação e visualização de dados.
The document summarizes the results of a 3D scan comparison between a reference model (Sheet Metal_R) and test model (Sheet Metal_T). It found 1,686,643 matching data points with an average deviation of 0.3mm and standard deviation of 0.9mm. The maximum upper deviation was 36.1mm and maximum lower deviation was -32.4mm, with 7928 outliers found. The deviation distribution chart shows that most points (50%) fell within 1 standard deviation of the mean.
The document contains numerical examples and calculations comparing different root finding methods: bisection method, false position method, secant method, and Newton's method. It shows the iterations, calculated values of x and f(x), and percent error for finding the root of sample functions using each method. Graphs are also included to help visualize the functions and convergence of solutions.
The document contains numerical examples and calculations using several root finding methods, including the bisection method, false position method, secant method, and Newton's method. Tables show the iterative calculations for each method applied to various functions, recording the iteration number, input and output values, and estimation of error. Graphs of functions are also shown. The methods are then compared based on the rate of convergence and number of iterations required to achieve a solution.
Operational Value at Risk is estimated using an Advanced Measurement Approach involving statistical modeling of operational risk loss data. Loss severity and frequency are each modeled separately using distributions like lognormal or Poisson. Monte Carlo simulation is then used to aggregate the severity and frequency distributions and estimate unexpected losses, expected losses, and the capital at risk needed to sustain operational losses exceeding expectations.
Susan Borda is a digital curation librarian who provides guidance on data management. She discusses the importance of data management plans and best practices for the various stages of the research data lifecycle including planning, collecting, managing, sharing, and preserving data. She highlights funder requirements for data management plans and tools like DMPTool that can help researchers create plans and meet funder standards. Borda also offers tips for organizing, documenting, and storing data as well as sharing data through repositories to increase citations and ensure long-term preservation.
Several of the largest MMOG companies reported their Q4/08 results over the past week; including Shanda, Netease,
and The9. The MMOG market continued to grow on a sequential basis in Q4.
1. 1
Models for Southeast Load
Mônica Barros
March, 22nd, 2002
Objectives
Our primary objective in this report is to present simple models that will allow us to forecast the DAILY load in the Southeast subsystem in Brazil.
The database consists of observations collected since January 1st, 2001, and all models were fitted using as the final point March, 18th, 2002.
We take a step-by-step approach, and we present a tentative model. We believe we will not have our “final” model until we obtain temperature data for major
metropolitan areas in the Southeast region, since it is a widely known fact that electricity consumption is heavily dependent on this variable. Moreover, it is
important to note that the temperature effect will probably be relevant when we consider several important load centers simultaneously.
Why is Daily Load Forecasting Important for Trading Purposes?
Short-term (daily and eventually, hourly) load forecasting is essential for trading, since it will give us a good indication on what type of position we should
assume, and the volume of this position. For example, if we have a fairly accurate prediction that load is about to increase by x% on the next week (due, for
example, to a weather forecast that predicts extremely high temperatures), in a totally free market we would expect prices to go up, and could use this
information to our advantage.
Unfortunately, the current situation in Brazil is still very much away from a free market set-up, but under the newly proposed offer and demand system, short
term load forecasts will be as important as short term inflow energy forecasts, so we need to attack the problems from both sides!
Below we exhibit a time series graph with the load data for the Southeast subsystem. One should note the dramatic impact of rationing (from June 2001 to
February 2002).
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com
2. 2
Time Series Graph – Southeast Load
Legend
28000 LOAD_SE
26000
24000
22000
20000
18000
16000
2001 2002
Tentative Model 1
Model Structure: Linear Scale, Constant included in the model, Lagged Dependent Variables (up to 7 days), Dummy Variables for National Holidays and
Rationing, Day of Week Indicator, Least Squares Estimation
Dependent Variable: LOAD_SOUTHEAST
Method: Least Squares
Date: 03/07/02 Time: 12:30
Sample(adjusted): 8/01/2001 5/03/2002
Included observations: 422 after adjusting endpoints
Variable Coefficient Std. Error t-Statistic Prob.
C 2050.533 1331.534 1.539979 0.1243
LOAD_SOUTHEAST(-1) 0.332666 0.041826 7.953559 0.0000
LOAD_SOUTHEAST(-2) -0.114157 0.040054 -2.850093 0.0046
LOAD_SOUTHEAST(-3) 0.027442 0.040547 0.676788 0.4989
LOAD_SOUTHEAST(-4) 0.017247 0.041990 0.410735 0.6815
LOAD_SOUTHEAST(-5) -0.037280 0.040406 -0.922639 0.3567
LOAD_SOUTHEAST(-6) 0.100908 0.066705 1.512752 0.1311
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com
3. 3
LOAD_SOUTHEAST(-7) 0.595833 0.053326 11.17337 0.0000
NAT_HOLIDAYS -2928.877 330.2739 -8.868024 0.0000
WEEKDAY 24.71201 68.12823 0.362728 0.7170
RATIONING -423.4708 310.8725 -1.362201 0.1739
R-squared 0.887131 Mean dependent var 23052.64
Adjusted R-squared 0.884385 S.D. dependent var 3497.637
S.E. of regression 1189.274 Akaike info criterion 17.02579
Sum squared resid 5.81E+08 Schwarz criterion 17.13123
Log likelihood -3581.442 F-statistic 323.0397
Durbin-Watson stat 1.259962 Prob(F-statistic) 0.000000
Note: Non-significant variables show n above in boldface.
Obviously, there is a lot of room for improvement in this model. We can start by dropping out the insignificant variables and also by creating dummies for each
weekday, instead of putting them all together in a single variable. Also, Normality of the dependent variable might be an issue, as shown in the following
histogram (note the high value of the Jarque-Bera statistic, a clear indication of non-normality). Moreover, it wouldn’t surprise me at all if the load distribution
were bimodal, representing, to an extent, the effect of rationing (June 1st, 2001 to February 28th, 2002).
Histograms – dependent variable on original and log scale
36
Series: LOAD_SOUTHEAST
32 Sample 1/01/2001 5/03/2002
28 Observations 429
24 Mean 23075.80
Median 22292.00
20 Maximum 29704.00
16 Minimum 15708.00
Std. Dev. 3485.951
12 Skewness 0.331409
Kurtosis 2.187413
8
4 Jarque-Bera 19.65580
Probability 0.000054
0
17500 20000 22500 25000 27500 30000
If we log transform the data, there is no clear improvement in terms of Normality, as shown in the next histogram. I guess the problem is really the bi-modality I
previously mentioned.
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com
4. 4
40
Series: LOG(LOAD_SOUTHEAST)
Sample 1/01/2001 5/03/2002
Observations 429
30
Mean 10.03526
Median 10.01198
20 Maximum 10.29904
Minimum 9.661925
Std. Dev. 0.150244
Skewness 0.066686
10 Kurtosis 2.262865
Jarque-Bera 10.03065
Probability 0.006635
0
9.750 9.875 10.000 10.125 10.250
Needless to say, the residuals of the model are still quite bad, as the next graph shows.
8000
6000
4000
2000
0
-2000
-4000
2001:04 2001:07 2001:10 2002:01
LOAD_SOUTHEAST Residuals
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com
5. 5
Tentative Model 2
Model Structure: Linear Scale, Constant included in the model, Lagged Dependent Variables (lags 1, 2, 3, 5, 6 and 7), Dummy Variables for National Holidays,
Rationing, Weekends, Mondays and Wednesdays, Least Squares Estimation.
Dependent Variable: LOAD_SOUTHEAST
Method: Least Squares
Date: 03/07/02 Time: 16:41
Sample(adjusted): 8/01/2001 5/03/2002
Included observations: 422 after adjusting endpoints
Variable Coefficient Std. Error t-Statistic Prob.
C 3072.134 866.2896 3.546313 0.0004
LOAD_SOUTHEAST(-1) 0.665403 0.040540 16.41337 0.0000
LOAD_SOUTHEAST(-2) -0.082546 0.039324 -2.099112 0.0364
LOAD_SOUTHEAST(-3) 0.124398 0.043531 2.857663 0.0045
LOAD_SOUTHEAST(-5) 0.149711 0.036362 4.117247 0.0000
LOAD_SOUTHEAST(-6) -0.137910 0.032591 -4.231564 0.0000
LOAD_SOUTHEAST(-7) 0.185818 0.030101 6.173177 0.0000
NAT_HOLIDAYS -2558.808 220.2461 -11.61795 0.0000
RATIONING -529.0360 200.7525 -2.635265 0.0087
WEEKEND -2635.212 204.4063 -12.89203 0.0000
MONDAY 1634.653 231.9577 7.047202 0.0000
WEDNESDAY 434.2261 228.5706 1.899746 0.0582
R-squared 0.952112 Mean dependent var 23052.64
Adjusted R-squared 0.950827 S.D. dependent var 3497.637
S.E. of regression 775.5981 Akaike info criterion 16.17317
Sum squared resid 2.47E+08 Schwarz criterion 16.28819
Log likelihood -3400.539 F-statistic 741.0601
Durbin-Watson stat 1.792038 Prob(F-statistic) 0.000000
There´s a significant improvement in model fit when compared to tentative model 1. Some points need to be made:
The effect of national holidays (keeping all other variables constant) is a reduction of about 2559 MW in the daily load.
Similarly, rationing represented a decrease of roughly 529 MW on daily load (keeping all other effects fixed).
The effect of weekends is even more dramatic, and represents a load decrease of approximately 2635 MW.
One should keep in mind that current load (as of March 5th, 2002) is 25,542.
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com
6. 6
Observed and Fitted Values, Residuals
30000
25000
20000
4000
2000 15000
0
-2000
-4000
-6000
2001:04 2001:07 2001:10 2002:01
Residual Actual Fitted
Some residuals are still quite large, and we expect to improve model fit with the inclusion of temperature in some of the major load centers in the region, and we
intend to incorporate this information in our models as soon as we obtain the data.
Short-term forecasting capacity for tentative model 2
We next use the structure proposed in tentative model 2 to forecast load up to one week ahead, thus we take as the end of the in-sample period February 26th,
2002. Note that rationing did not officially end until February 28th, 2002, even though its end was announced some 10 days earlier, so our forecasts actually
include a period of structural break.
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com
7. 7
Actual SE Forecasted Error % Error
Load SE Load
27/02/02 24,839 21,903 2,936 11.8%
28/02/02 24,816 21,804 3,012 12.1%
01/03/02 24,792 22,336 2,457 9.9%
02/03/02 22,958 20,575 2,383 10.4%
03/03/02 20,542 18,569 1,973 9.6%
04/03/02 24,625 22,325 2,300 9.3%
05/03/02 25,542 23,211 2,331 9.1%
Forecast errors are still quite large, and definitely improvements will be made by including other explanatory variables in the model. However, the current
structure is simple enough to make it still useful. One point that should be kept in mind is that all forecasts underestimate the actual load, which can be due to
two factors:
• Load started “picking up” as the end of rationing was announced in mid-February or,
• Temperature effect – the last couple of days have been warmer than on the previous weeks, and we had a fairly mild summer so far, which in part
explained the low levels of load previously observed.
Quite probably, the reason for such low forecasts is a combination of both factors already mentioned.
Tentative Model 3
We start with a longer set of lagged covariates (originally up to 14 days as possible explanatory variables) and tentatively include some indicator variables in the
model, namely: dummies for each day of the week (except Friday, to avoid collinearity) and a dummy for weekends, rationing and national holidays dummy
variables.
The final model structure is:
Variable Coefficient Std. Error t-Statistic Prob.
C 987.64 304.72 3.24 0.1%
LOAD_SOUTHEAST(-1) 0.60 0.03 18.13 0.0%
LOAD_SOUTHEAST(-3) 0.07 0.02 2.80 0.5%
LOAD_SOUTHEAST(-5) 0.12 0.03 3.85 0.0%
LOAD_SOUTHEAST(-7) 0.12 0.03 3.36 0.1%
LOAD_SOUTHEAST(-12) 0.10 0.04 2.68 0.8%
LOAD_SOUTHEAST(-13) -0.16 0.03 -5.12 0.0%
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com
8. 8
LOAD_SOUTHEAST(-14) 0.14 0.03 4.54 0.0%
NAT_HOLIDAYS -2778.93 220.68 -12.59 0.0%
MONDAY 1334.63 214.66 6.22 0.0%
WEEKEND -2672.70 196.81 -13.58 0.0%
R-squared 0.9528 Mean dependent var 22,989.46
Adjusted R-squared 0.9516 S.D. dependent var 3,483.37
S.E. of regression 766.36 Akaike info criterion 16.15
Sum squared resid 237000000 Schwarz criterion 16.25
Log likelihood -3339.57 F-statistic 814.93
Durbin-Watson stat 1.67 Prob(F-statistic) -
Observed and Fitted Values, Residuals
30000
25000
20000
4000
2000 15000
0
-2000
-4000
-6000
2001:04 2001:07 2001:10 2002:01
Residual Actual Fitted
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com
9. 9
Short-term forecasting capacity for tentative model 3
We repeat the procedure used for tentative model 2 and produce one week ahead forecasts starting on February 27th, 2002. The remarkable fact is: even
though there is a structural break in this period, the forecasting capacity of this model is quite superior to the previous one, as the next table shows.
It remains to be seen if this excellent forecasting capacity can be maintained at other periods, since I personally feel the model might be slightly “too large”
(however all lags included in the model are significant!).
Tentative Model 3
Actual SE Forecasted
Load SE Load Error % Error
27/02/02 24,839 24,546 292.90 1.2%
28/02/02 24,816 24,868 -52.30 -0.2%
01/03/02 24,792 24,647 144.80 0.6%
02/03/02 22,958 22,735 222.70 1.0%
03/03/02 20,542 20,439 103.30 0.5%
04/03/02 24,625 23,946 679.10 2.8%
05/03/02 25,542 25,119 422.60 1.7%
Tentative Model 4
We next attempt to incorporate the temperature effect into the tentative daily load models for the Southeast subsystem as a means of improving their predictive
ability.
We obtained two new data series (average and maximum daily temperatures at the Botafogo measuring station in the Southern part of the city of Rio de
Janeiro). This series probably does not represent extreme changes in temperatures in the city, as there are areas where the maximum observed temperatures
can be 15 to 20 degrees Fahrenheit higher than in the measuring station for which we have available data at this time.
It seems remarkable to us that the maximum temperature series at the above mentioned station has a very poor explanatory ability, and “drops” out quickly of
almost any model we attempt to fit. Surprisingly, the average temperature series remains in the “final” model, together with several lags of the dependent
variable.
Model Structure: Linear Scale, Constant not included in the model, Lagged Dependent Variables (lags 1, 3, 5, 7, 12, 13, 14), Dummy Variables for National
Holidays, Weekends, Mondays, Average Daily Temperature at Botafogo Measuring Station, Least Squares Estimation.
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com
10. 10
Variable Coefficient Std. Error t-Statistic Prob. Observed and Fitted Values, Residuals
LOAD_SOUTHEAST(-1) 0.53 0.03 16.18 0.0%
LOAD_SOUTHEAST(-3) 0.07 0.02 2.92 0.4%
LOAD_SOUTHEAST(-5) 0.10 0.03 3.47 0.1%
LOAD_SOUTHEAST(-7) 0.13 0.03 4.13 0.0%
LOAD_SOUTHEAST(-12) 0.11 0.03 3.28 0.1%
LOAD_SOUTHEAST(-13) -0.17 0.03 -5.79 0.0%
LOAD_SOUTHEAST(-14) 0.14 0.03 4.61 0.0%
NAT_HOLIDAYS -2835.96 209.54 -13.53 0.0%
MONDAY 1102.99 205.00 5.38 0.0%
WEEKEND -2692.80 181.64 -14.83 0.0%
TEMP_AVER_RJ 103.74 13.42 7.73 0.0%
R-squared 0.96 Mean dependent var 2298089.0%
Adjusted R-squared 0.96 S.D. dependent var 349815.2%
S.E. of regression 727.39 Akaike info criterion 1604.3%
Sum squared resid 211,000,000 Schwarz criterion 1615.1%
Log likelihood -3277.87 Durbin-Watson stat 166.5%
We haven’t tested the short-term forecasting ability for tentative model 4 since we haven’t generated forecasts for the temperature series.
M. Barros Consultoria Ltda.
e-mail: barrosm@alumni.utexas.net
mailto:info@mbarros.com