The document describes automating demand response management through forecasting electricity load. It involves gathering historical load and weather data, developing forecasting models using techniques like generalized additive models and moving averages, and creating an interactive dashboard using R Shiny. The dashboard allows visualizing actual vs forecasted load, adding events to a log, and downloading reports. The aim is to accurately predict load and events through an automated framework with minimal manual effort.
2. BUSINESS CONTEXT
In the US, electricity consumption is highest in the summer due to the increased use of electrical appliances
Regional utilities implement DR programs and shed or clip peak electricity load by offering financial incentives to end customers, as per standard rules and regulations
Events are called when the forecasted electrical load exceeds 96% of the summer peak load
The aim is to build an automated model that accurately predicts the occurrence of events with minimal manual intervention
3. BUSINESS CONTEXT – SGQ
SGQ (State-Gap-Question) frames the problem: the current state, the gap preventing us from reaching the final state, the key question that must be answered to close that gap, and the final state itself
4. DEMAND RESPONSE (DR)
A reduction in electricity usage by end customers, in response to events called by the regional utility, that curtails peak electricity load
It involves planning and scheduling across the generation, transmission and distribution of electric load
DR programs are generally implemented in the summer, when usage of electrical appliances is at its peak
The incentives offered are proportional to the amount of electricity clipped
5. DEMAND RESPONSE (DR)
A maximum of 6 events can be called per summer, i.e., over the 4-month season
Customers are informed of events in advance via Curtailment Service Providers (CSPs), who are third-party vendors
Customers can therefore manage their electricity usage, and consequently their costs, efficiently
6. OBJECTIVES
Predict events for a client in Philadelphia based on the past 10 years of historical data, considering factors such as weather, humidity and precipitation
Develop a reliable framework by automating the process flow: real-time data collection, a repository for storage and a visualization for the client
8. FORECASTING ACCURACY
The forecast error is the difference between the actual and forecasted values for a particular period of time
The error is measured via MAPE (Mean Absolute Percentage Error)
MAPE usually measures accuracy as a percentage:
MAPE = (1/N) * Σ (from t = 1 to N) |(At - Ft) / At| * 100
where At is the actual value at time t, Ft the forecasted value at time t, and N the number of data points
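As a quick illustration, here is the formula above as a small R function (the sample values are invented for the example):

```r
# Mean Absolute Percentage Error, as defined above
mape <- function(actual, forecast) {
  mean(abs((actual - forecast) / actual)) * 100
}

mape(actual = c(7000, 7200, 7100), forecast = c(6900, 7300, 7050))  # ~1.17
```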
9. FACTORS INFLUENCING POWER CONSUMPTION
Atmospheric pressure
Temperature
Wind speed
Precipitation
Humidity
Number of residents
Consumption pattern of electrical appliances
10. FORECASTING TECHNIQUES
Forecasting is the process of making reasonable predictions by taking historical and present data into account
A fair amount of risk and uncertainty is associated with forecasting and prediction
To obtain the most accurate forecasts, the data must be kept up to date
11. FORECASTING MODELS CONSIDERED
Three models were considered: a 3-hour, a 12-hour and a 24-hour look-ahead model
The 3-hour look-ahead model forecasts values for the next 3 hours; the 12-hour and 24-hour models do likewise for their horizons
The 3-hour look-ahead model is the best, with a MAPE of 1.30% compared to 2.05% and 2.40% for the 12-hour and 24-hour models respectively
12. GENERALIZED ADDITIVE MODEL (GAM)
GAM is used to develop the forecasting equations
A simple GAM relationship can be written as:
g(E(Y)) = b0 + f1(X1) + f2(X2) + … + fm(Xm)
13. GENERALIZED ADDITIVE MODEL (GAM)
Here,
X1, X2, …, Xm - the m predictor variables
Y - dependent variable
E(Y) - mean of the dependent variable
b0 - intercept term
f1, f2, …, fm - smooth functions of the m predictors
g - link function relating the mean of the dependent variable to the additive combination of the predictors
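As a minimal sketch, a GAM of this form can be fitted in R with the mgcv package; the data frame and column names below (hourly, load, thi, temperature, wind_speed) are illustrative assumptions, not the project's actual variables:

```r
library(mgcv)

# hourly: hypothetical merged data frame of load and weather observations
fit <- gam(load ~ s(thi) + s(temperature) + s(wind_speed),
           family = gaussian(link = "identity"),  # identity link g
           data = hourly)
summary(fit)

# Forecast against new (e.g., forecasted) weather values:
# predict(fit, newdata = forecast_weather)
```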
14. MOVING AVERAGES
A moving average creates a series of averages over successive subsets of the entire data set; these are used here to create derived variables
The process is as follows:
The first element of the moving average series is calculated by taking the average of an initial fixed subset of the number series
The subset is then 'shifted forward' by excluding the first number and including the next one, and the same process is repeated over the entire data series
15. MOVING AVERAGES
Each shift produces a new subset of numbers, and its average gives the next point; the line connecting all these averages is the moving average
In simpler terms, a moving average is a series of points, each of which is the average of a subset of the larger data set (see the sketch below)
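A minimal base-R sketch of this shifting-window process (the window width k = 3 and the sample values are illustrative):

```r
# Average each window of k consecutive values, shifting forward one at a time
moving_average <- function(x, k) {
  sapply(seq_len(length(x) - k + 1), function(i) mean(x[i:(i + k - 1)]))
}

load_values <- c(1200, 1250, 1310, 1290, 1340, 1400)
moving_average(load_values, k = 3)
# 1253.333 1283.333 1313.333 1343.333
```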
16. R SHINY
R Shiny is an R package used to build interactive web applications
Shiny web applications are programmed entirely in R, with no need for HTML, CSS or JavaScript, although Shiny is flexible enough to incorporate them for customized visualizations if desired
Shiny is the visualization tool that was used to create the dashboard
17. R SHINY - FEATURES
Open source
Generate interactive visualizations
Easy to use
Designed to build applications to analyze and/or visualize
data
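To make this concrete, here is a minimal, self-contained Shiny app in the spirit of the dashboard described later; the inputs, the dummy load curve and all names are illustrative assumptions, not the project's actual code:

```r
library(shiny)

ui <- fluidPage(
  titlePanel("Load Forecast"),
  numericInput("peak", "Summer peak load (MW)", value = 9000),
  numericInput("threshold_pct", "Threshold %", value = 96),
  plotOutput("load_plot")
)

server <- function(input, output) {
  output$load_plot <- renderPlot({
    hours <- 0:23
    load  <- 7000 + 1500 * sin((hours - 6) * pi / 12)  # dummy daily load curve
    plot(hours, load, type = "l", xlab = "Hour of day", ylab = "Load (MW)")
    abline(h = input$peak * input$threshold_pct / 100, lty = 2)  # reactive threshold line
  })
}

shinyApp(ui, server)
```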
23. WEB SCRAPING – LOAD DATA
The instantaneous live-feed load is scraped from the PJM
website every 5 minutes
The HTML table which contains the load data is
converted into a data frame in R
From the list of 18 utilities (or zones) on PJM's website, the data for the required zone is extracted
Appropriate quality checks are implemented sequentially when scraping the real-time load data (a scraping sketch follows below)
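A hedged sketch of this scraping step using the rvest package; the URL, the table's position on the page and the zone name are placeholders, not PJM's actual page structure:

```r
library(rvest)

scrape_load <- function(url, zone) {
  page    <- read_html(url)
  tables  <- html_table(page)        # every HTML table on the page as a data frame
  load_df <- tables[[1]]             # assume the load table is the first one
  load_df[load_df$Zone == zone, ]    # keep only the required zone's row(s)
}

# Repeat every 5 minutes (300 seconds):
# repeat {
#   latest <- scrape_load("https://example.com/pjm-load", "PECO")  # placeholder URL / zone
#   Sys.sleep(300)
# }
```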
24. WEB SCRAPING – LOAD DATA
If the server crashes, NULL / missing values are replaced by the average load of the corresponding hour over the past week, this being the value expected to lie closest to the forecast
Outlier treatment checks whether each scraped value lies within a particular range; if not, the value is replaced by whichever bound is closer to it
Only the maximum load value of each hour is stored in a CSV file, so there are 24 data points per day (see the sketch below)
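A sketch of these quality checks; the bounds and the past-week history vector are illustrative assumptions:

```r
clean_load <- function(value, past_week_same_hour, lower = 0, upper = 20000) {
  # Missing value: fall back to the same hour's average over the past week
  if (is.null(value) || is.na(value)) {
    value <- mean(past_week_same_hour, na.rm = TRUE)
  }
  # Outlier: clamp to the nearer bound
  if (value < lower) value <- lower
  if (value > upper) value <- upper
  value
}
```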
25. WEB SCRAPING – WEATHER DATA
The weather data is scraped from 2 different sources:
PULSE API, for scraping instantaneous weather data every 5 minutes
ENCast API, for scraping forecasted weather data once an hour
Both sets of weather data arrive in JSON format and are converted to CSV (a conversion sketch follows below)
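A hedged sketch of the JSON-to-CSV conversion with the jsonlite package; the endpoint URL and output file name are placeholders:

```r
library(jsonlite)

fetch_weather <- function(url, out_csv = "weather.csv") {
  weather <- fromJSON(url)         # parse the JSON response into R structures
  df <- as.data.frame(weather)     # flatten to a data frame
  write.csv(df, out_csv, row.names = FALSE)
  df
}
```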
26. WEB SCRAPING – WEATHER DATA
The same set of quality checks performed on the load data is also applied to the weather data, so that there are no discrepancies in the data
The hourly average temperature and relative humidity, and the hourly maximum wind speed and temperature, are extracted (see the aggregation sketch below)
The ENCast API gives the forecasted weather data for the next 6 days
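A sketch of the hourly aggregation using base R's aggregate(); the 5-minute data frame and its column names are assumptions:

```r
# Hourly means of temperature and relative humidity
hourly_avg <- aggregate(cbind(temperature, rel_humidity) ~ date + hour,
                        data = weather_5min, FUN = mean)

# Hourly maxima of wind speed and temperature
hourly_max <- aggregate(cbind(wind_speed, temperature) ~ date + hour,
                        data = weather_5min, FUN = max)
```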
30. MERGING OF DATASETS
The 3 datasets, i.e., the instantaneous load data, the instantaneous weather data and the forecasted weather data, are merged into a single CSV file on the primary key (date and time), as sketched below
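A merge sketch using base R's merge(); the data frame names and the two-column key are assumptions consistent with the description above:

```r
merged <- merge(load_hourly, weather_hourly,   by = c("date", "hour"))
merged <- merge(merged,      weather_forecast, by = c("date", "hour"))
write.csv(merged, "merged_hourly_data.csv", row.names = FALSE)
```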
31. CREATION OF DERIVED VARIABLES
THI (Temperature-Humidity Index) is calculated from temperature and humidity
Other derived variables (sketched below) include:
THI squared
THI moving averages over 4 and 6 hours
12-hour prior load
Moving averages over 12 and 6 hours
Maximum temperature
Maximum squared temperature
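A sketch of the derived-variable step; the THI formula below is one common formulation (temperature in °F, relative humidity in %) and may not be the exact one the project used, and the column names are assumptions:

```r
add_derived <- function(df) {
  # Temperature-Humidity Index (one common formulation)
  df$thi    <- df$temperature -
               (0.55 - 0.0055 * df$rel_humidity) * (df$temperature - 58)
  df$thi_sq <- df$thi^2

  # Trailing moving averages (same length as input, leading NAs)
  df$thi_ma4 <- stats::filter(df$thi, rep(1 / 4, 4), sides = 1)
  df$thi_ma6 <- stats::filter(df$thi, rep(1 / 6, 6), sides = 1)

  # 12-hour prior load
  df$load_lag12 <- c(rep(NA, 12), head(df$load, -12))
  df
}
```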
32. FORECASTING MODEL
The 3-hour, 12-hour and 24-hour look-ahead models are each executed every hour
The graph contains 24 data points that correspond to the forecasted load over a day
The first 3 data points in the graph give the best forecast, as the 3-hour model's MAPE is considerably lower than that of the other two models
The next 9 data points are taken from the 12-hour look-ahead model
33. FORECASTING MODEL
The last 12 data points are taken from the 24-hour look-ahead model
As the hourly runs progress, farther-ahead points are successively replaced until all 24 data points come from the 3-hour model, as it is the most accurate
The last 3 rows from the 3-hour model, the last 9 from the 12-hour model and the last 12 from the 24-hour model are used to form the respective sets of equations for the different models
The expected MAPE (Mean Absolute Percentage Error) of the forecasting model is less than 2% (a splicing sketch follows below)
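A sketch of splicing the three models' outputs into the 24-point daily curve (3 + 9 + 12 points); the three forecast vectors are assumed inputs indexed by look-ahead hour:

```r
splice_forecasts <- function(f3, f12, f24) {
  c(f3[1:3],      # hours 1-3:   3-hour look-ahead model
    f12[4:12],    # hours 4-12:  12-hour look-ahead model
    f24[13:24])   # hours 13-24: 24-hour look-ahead model
}
```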
34. FORECASTING MODEL
For example, to predict the next day's load at 8:00 AM, one must have:
the 3-hour prior load and weather data to feed into the 3-hour model,
the 12-hour prior load and weather data to feed into the 12-hour model, and
the 24-hour prior load and weather data to feed into the 24-hour model
35. FORECASTED DATA STORAGE
Three separate CSV files are created, one per model, containing columns such as date, time, recorded load, wind speed and relative humidity
These files are then merged into a single CSV file on the primary key (date and time)
37. VISUALIZATION – LOAD FORECAST TAB
There are three lines on the graph: the actual load line, the forecasted load line and the threshold line
Text inputs for entering the summer peak load and the threshold % are provided, because these 2 parameters change each year
A reactive threshold line on the graph corresponds to 96% of the summer peak load
38. VISUALIZATION – LOAD FORECAST TAB
A graph shows the actual and forecasted load values, with the 24 hours of a day on the x-axis and load on the y-axis
A data table has the columns: actual load, actual load as % of summer peak load, forecasted load and forecasted load as % of summer peak load
An option lets the client enter the date and duration of an event and thereby add it to a log
An option to download the load summary data table is also available (a download sketch follows below)
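A sketch of the table-download option using Shiny's downloadHandler; the reactive load_summary data frame and the output ID are assumed placeholders:

```r
# Paired with downloadButton("download_summary", "Download") in the UI
output$download_summary <- downloadHandler(
  filename = function() paste0("load_summary_", Sys.Date(), ".csv"),
  content  = function(file) write.csv(load_summary(), file, row.names = FALSE)
)
```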
46. VISUALIZATION – HISTORICAL LOAD TAB
An option for the client to select any date and see the corresponding plot
and data table is available
Both the data table and graph contain 24 points each
An option for the client to download the historical load data table is also
available
52. VISUALIZATION – EVENT LOG TAB
An event log data table that displays the dates and the duration of the
events that have occurred
The event log is sorted so that the most recent events appear at the top of
the table
54. CONCLUSION
OUTCOME
The client achieves faster and more efficient event prediction through this automated model
BEHAVIOUR
The client predicts the occurrence of events with
greater reliability and less manual intervention
INSIGHTS
A reliable model for event prediction is developed and
automated