Tropical and winter storms can cause widespread damage to electric distribution networks. These networks are mostly above ground and are directly exposed to the severe weather associated with such storms. For example, during winter storms, the combined stress of ice weight, increased wind resistance of the conductors, and broken tree limbs can damage lines, poles, and support structures. The goal is to develop a model that predicts electric power outages in near-real time when severe storm conditions are forecast. Predicting power outages during hurricanes is a problem with important practical ramifications. As part of this work, we address the problem of forecasting power outages knowing only information about the incoming hurricane and basic environmental, social, and economic indicators in the affected areas. These data are available and uniformly measured across the US, which makes the model scalable. Moreover, we explore data-driven approaches, using standard prediction metrics to evaluate the performance of flexible machine learning techniques.
1. Managed by Triad National Security, LLC for the U.S. Department of Energy’s NNSA
Team Members:
Daisy Arokiasamy
Luis Damiano
Mai Dao
Samuel Gailliot
Akira Horiguchi
Ramesh Kesawan
Yiming Xu
July 24, 2019
Problem Presenters:
Mary Frances Dorn
Kimberly Kaufeld
Faculty Mentors:
Brian Reich
Yawen Guan
3. Statement of Problem
8/1/2019 | 3Los Alamos National Laboratory
• Our goal is to forecast, one day ahead and in near-real time, the number of tropical cyclone-induced county-level electrical outages, using only publicly available data for the Atlantic coast of the United States.
4. Why do we Care?
• Real-time forecasting at the county level will allow decision makers and emergency responders to focus their efforts in the optimal locations.
• Forecasting one day ahead gives responders enough time while minimizing error.
Source: https://www.history.com (Hurricane Katrina)
5. What Makes this Problem Interesting?
• Two Spatial Problems
– The eastern seaboard is a large area to estimate over, with different geographies and weather patterns.
– Counties as spatial elements vary greatly.
• Data Sparsity
– Only 13 storms made landfall in 2015-2017.
– Weather data is recorded only in certain counties, requiring interpolation.
• Low Signal-to-Noise Ratio
Source: https://www.bbc.com/news/world-us-canada-45532679 (Tropical Storm Florence)
6. Our Approach
• Classification:
– Risk Assessment: We want to help decision makers allocate resources when a tropical cyclone hits the Atlantic region by identifying regions of highest impact.
• Regression:
– Inference: Data-driven understanding of underlying causes.
– Sliding: After risk is classified, we want to report an estimate of the number of outages.
7. Different Data
Storm Name   Year   Classification   Regression
Ana          2015   Train            Train
Bill         2015   Valid            Valid
Bonnie       2016   Train            Train
Colin        2016   Train            Test
Eight        2016   Train            Train
Hermine      2016   Test             Train
Julia        2016   Train            Train
Matthew      2016   Valid            Train
Cindy        2017   Test             Test
Emily        2017   Test             Test
Harvey       2017   Train            Valid
Irma         2017   Train            Valid
Nate         2017   Valid            Train
• FIPS
• Weather: Wind Speed, Precip, Temp
• Population Density
• Tree Species
• Land Usage
• Outage Data
• Classification and Regression teams used different data subsets due to different focuses
• Focus on results vs. inputs
• Outages vs Wind Speed
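The table on this slide assigns whole storms, rather than individual county-days, to the training, validation, and test sets. A minimal sketch of such a storm-level split for the classification task is shown below. The storm assignments are copied from the table; the record layout (`storm`, `fips`, `outages`), the sample values, and the function name `split_by_storm` are hypothetical and only serve the illustration.

```python
# Storm-to-split assignment for the classification task, copied from the slide's table.
CLASSIFICATION_SPLIT = {
    "Ana": "train", "Bill": "valid", "Bonnie": "train", "Colin": "train",
    "Eight": "train", "Hermine": "test", "Julia": "train", "Matthew": "valid",
    "Cindy": "test", "Emily": "test", "Harvey": "train", "Irma": "train",
    "Nate": "valid",
}


def split_by_storm(rows, assignment):
    """Group records into train/valid/test according to each record's storm."""
    splits = {"train": [], "valid": [], "test": []}
    for row in rows:
        splits[assignment[row["storm"]]].append(row)
    return splits


# Hypothetical county-day records; field names and values are made up.
rows = [
    {"storm": "Ana", "fips": "37019", "outages": 12},
    {"storm": "Ana", "fips": "45051", "outages": 40},
    {"storm": "Matthew", "fips": "12031", "outages": 900},
    {"storm": "Cindy", "fips": "22071", "outages": 30},
]
splits = split_by_storm(rows, CLASSIFICATION_SPLIT)
print({name: len(part) for name, part in splits.items()})
```

Splitting by storm keeps every record from a given storm in a single set, so evaluation is always on storms the model has not seen.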
9. Process
• Goal: Help policy makers decide how to distribute resources when a tropical cyclone makes landfall.
• Process: Quantify the severity of impact based on observed average daily outages, with special interest in identifying potential Very High impact regions.
10. Percentiles of Average Daily Outages
Category    Percentile Range                    Outage Range
Low         Below 85th percentile               Below 75
Medium      Between 85th and 95th percentile    Between 75 and 251
High        Between 95th and 99th percentile    Between 251 and 998
Very High   Above 99th percentile               Above 998
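As a minimal sketch, the category thresholds in the table above can be applied to a county's average daily outages as follows. The thresholds 75, 251, and 998 come from the table; the function name `impact_category` is hypothetical, and the treatment of values that fall exactly on a boundary is an assumption, since the slide does not say which side a boundary belongs to.

```python
import bisect

# Boundaries between categories, taken from the slide's table.
THRESHOLDS = [75, 251, 998]
CATEGORIES = ["Low", "Medium", "High", "Very High"]


def impact_category(avg_daily_outages):
    """Map average daily outages to an impact category.

    Assumption: a value exactly equal to a boundary is assigned to the
    higher category.
    """
    return CATEGORIES[bisect.bisect_right(THRESHOLDS, avg_daily_outages)]


for value in (10, 100, 500, 2000):
    print(value, impact_category(value))
```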
11. Statistical Methods
• Multinomial Logistic Regression (MLR)
• Linear Discriminant Analysis (LDA)
• Random Forest (RF)
• K-Nearest Neighbors (k-NN)
• Blind – Baseline classifier (classifies every observation into the Low category)
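The Blind baseline is a useful reference because the categories are defined by percentiles: about 85% of observations are Low by construction. The sketch below, on hypothetical labels that follow the 85/10/4/1 split, illustrates that such a classifier reaches high accuracy while never identifying a Very High county. The helper names `accuracy` and `recall` are illustrative, not taken from the presentation.

```python
def accuracy(y_true, y_pred):
    """Fraction of observations whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, label):
    """Fraction of observations truly in `label` that were predicted as `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in predictions_for_label) / len(predictions_for_label)


# Hypothetical labels following the percentile-based category definition.
y_true = ["Low"] * 85 + ["Medium"] * 10 + ["High"] * 4 + ["Very High"] * 1

# Blind baseline: every observation is classified as Low.
y_blind = ["Low"] * len(y_true)

blind_accuracy = accuracy(y_true, y_blind)
blind_recall_very_high = recall(y_true, y_blind, "Very High")
print(blind_accuracy, blind_recall_very_high)
```

Any candidate method therefore needs to be judged on more than overall accuracy, for example on how many Very High observations it recovers.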
17. Regression approach
• Motivation
– Risk map model for resource assignment.
– Expected outages for resource quantification (budgeting & logistics).
Source: Texas National Guard/Lt. Zachary West, 100th MPAD (Hurricane Harvey).
18. Regression approach
Regression model goals:
– To predict the impact on outages.
What definition would be more predictable?
19. Regression approach
Regression model goals:
– To predict the impact on outages in a given county hit by a hurricane with predefined characteristics.
What county-specific characteristics help explain and predict outages?
20. Regression approach
Regression model goals:
– To predict the impact on outages in a given county hit by a hurricane.
What storm-specific characteristics are most useful for prediction?
21. Characterizing counties
• Typical number of outages:
– Historical median of daily mean outages on non-hurricane days.
• Forestry characteristics: groups based on tree inventory data.
• Land usage characteristics: groups based on land usage and cover.
Challenge: it is not evident how to use tree and land inventory data to predict outages.
Strategy: let the data talk!
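The "typical number of outages" defined on this slide can be sketched as a two-step summary for one county: average the outage readings within each day, then take the median of those daily means over non-hurricane days. The data layout (a dictionary from day to readings), the sample values, and the function name `typical_outages` are assumptions made for illustration.

```python
from statistics import mean, median


def typical_outages(readings_by_day, storm_days):
    """Median, over non-hurricane days, of the daily mean number of outages.

    readings_by_day: dict mapping a day label to the outage counts recorded
    that day for a single county (hypothetical layout).
    storm_days: set of day labels to exclude.
    """
    daily_means = [
        mean(counts)
        for day, counts in readings_by_day.items()
        if day not in storm_days
    ]
    return median(daily_means)


# Made-up readings for one county.
readings_by_day = {
    "2016-10-01": [4, 6],        # daily mean 5
    "2016-10-02": [10, 14],      # daily mean 12
    "2016-10-03": [7, 9],        # daily mean 8
    "2016-10-07": [800, 1200],   # storm day, excluded
}
baseline = typical_outages(readings_by_day, storm_days={"2016-10-07"})
print(baseline)
```

Using the median makes the baseline insensitive to the occasional non-storm day with unusually many outages.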
22. Land and tree data inventory
[Figures: tree cluster map; land usage maps. Regions labeled on the maps: Southeast coastline, Appalachian Mountains.]
• Capturing spatial smoothness without knowing about county adjacency.
• Capturing scattered patterns such as high-density urban areas.
23. Characterizing storms
More details in the report.
• Genuine observations: measured by weather stations.
– Precipitation (log), wind speed measurements (PCA¹), temperature measurements (PCA¹).
– Spatial interpolation.
• Storm wind model: physics-based simulation model.
¹ Principal component analysis for decorrelation (whitening) and dimension reduction.
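The slide does not name the spatial interpolation method used to fill in counties without a weather station. As one common option, the sketch below uses inverse distance weighting (IDW); this is a swapped-in technique for illustration, not necessarily the method in the report. Planar coordinates, the exponent `power=2.0`, and the function name `idw` are all assumptions.

```python
import math


def idw(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at `target` from station readings.

    target: (x, y) location to estimate, e.g. a county centroid.
    stations: iterable of (x, y, value) readings.
    Assumption: planar coordinates; real use would need geographic distances.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            return value  # target coincides with a station
        weight = distance ** -power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical stations reporting wind speed.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
midpoint = idw((1.0, 0.0), stations)
near_first = idw((0.5, 0.0), stations)
print(midpoint, near_first)
```

Closer stations receive larger weights, so the estimate at a location near the first station is pulled toward that station's reading.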
24. Measuring impact…
Candidate definitions of impact on outages:
• Observed value¹
• Difference¹
• Ratio¹
• Log ratio:
$$r = \log\left(\frac{\#\,\text{outages on the day the storm hits}}{\#\,\text{outages on a typical day}}\right)$$
– Storms have a multiplicative effect.
– Many small numbers of outages with a few peaks.
¹ Defined in the report.
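To make the comparison concrete, the sketch below computes the four candidate impact measures for two hypothetical counties. The exact definitions of the observed value, difference, and ratio are in the report; the forms used here (storm-day count, storm-day count minus typical count, and their quotient) are assumptions, as are the natural logarithm and the requirement that both counts be strictly positive.

```python
import math


def impact_measures(storm_outages, typical_outages):
    """Candidate impact definitions for one county.

    Assumptions: both counts are strictly positive, and the log ratio uses
    the natural logarithm.
    """
    ratio = storm_outages / typical_outages
    return {
        "observed": storm_outages,
        "difference": storm_outages - typical_outages,
        "ratio": ratio,
        "log_ratio": math.log(ratio),
    }


# Two made-up counties where the storm multiplies outages by the same factor.
small_county = impact_measures(storm_outages=20, typical_outages=2)
large_county = impact_measures(storm_outages=2000, typical_outages=200)
print(small_county)
print(large_county)
```

Both counties see a tenfold increase and therefore share the same log ratio, while their observed values and differences differ by two orders of magnitude. This is the sense in which a multiplicative storm effect favors the log ratio as a target.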
25. Results
• Best model (out-of-sample R²):
– Inputs: weather conditions; trees and land usage; geolocation; physics-based simulation storm wind model.
– Model: random forest.
– Output: log of ratio of outages.
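For reference, an out-of-sample R² compares the model's squared error on held-out data with that of a constant prediction. The sketch below uses the mean of the held-out targets as that constant; some definitions use the training mean instead, so this choice is an assumption, as is the function name `r2_out_of_sample`.

```python
def r2_out_of_sample(y_test, y_pred):
    """R² on held-out data: 1 - SS_res / SS_tot.

    Assumption: SS_tot is computed around the mean of the held-out targets.
    """
    mean_test = sum(y_test) / len(y_test)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    ss_tot = sum((y - mean_test) ** 2 for y in y_test)
    return 1.0 - ss_res / ss_tot


# Made-up held-out log ratios.
y_test = [1.0, 2.0, 3.0, 4.0]
perfect = r2_out_of_sample(y_test, y_test)
mean_only = r2_out_of_sample(y_test, [2.5, 2.5, 2.5, 2.5])
worse_than_mean = r2_out_of_sample(y_test, [4.0, 3.0, 2.0, 1.0])
print(perfect, mean_only, worse_than_mean)
```

Unlike the in-sample version, this quantity can be negative when predictions are worse than the constant reference.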
26. Results
Take-aways:
• Log ratio seems most predictable.
• Most relevant inputs in decreasing order: Temperature, Wind speed, Precipitation (see report).
[Figure: Partial dependence, Log ratio ~ Wind PCA (x-axis: windPCA; y-axis: log ratio). Annotation: possible thresholds?]
[Figure: Partial dependence, Log ratio ~ Precipitation (mm) (x-axis: precipitation in mm; y-axis: log ratio). Annotation: ceiling at 300 mm (on average).]
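A partial dependence curve like the ones on this slide averages the model's predictions over the data while one input is forced to each value on a grid. The sketch below shows the computation with a hypothetical stand-in model: `toy_model`, its coefficients, and its built-in ceiling at 300 mm are invented for illustration and are not the fitted random forest from the presentation.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction over `rows` with `feature` set to each grid value."""
    curve = []
    for value in grid:
        modified_rows = [dict(row, **{feature: value}) for row in rows]
        curve.append(sum(predict(row) for row in modified_rows) / len(modified_rows))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a fitted model; coefficients are made up."""
    return 1.4 + 0.002 * min(row["precip_mm"], 300) + 0.001 * row["wind_pca"]


# Made-up observations.
rows = [
    {"precip_mm": 50, "wind_pca": 0},
    {"precip_mm": 400, "wind_pca": 100},
]
grid = [0, 150, 300, 600]
curve = partial_dependence(toy_model, rows, "precip_mm", grid)
print(list(zip(grid, curve)))
```

With this toy model the curve rises up to 300 mm and is flat afterwards, which is the kind of shape the "ceiling" annotation describes.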
27. Summary
What is the predicted impact on outages of a tropical cyclone hitting a county?
Two complementary strategies to answer one question.
– Classification models.
• Useful for risk maps.
• Higher accuracy.
– Regression models.
• Useful for quantifying resources needed.
• Better understanding of the relationship between predictors and storm impact.
• Lower prediction accuracy.