Extreme Hurricane Winds in the United States
Thomas H. Jagger & James B. Elsner
Department of Geography, Florida State University
http://garnet.fsu.edu/~jelsner/www
University of Florida’s Winter Workshop on Environmental Statistics
January 12, 2007, Gainesville, FL
Maximum Likelihood Inference
Extreme hurricane winds
Calibrate geological records
US hurricanes and global temperature
US hurricanes and other covariates
What are the return levels of hurricane winds in the U.S. over 5, 10, 50, and 100 years?
Are return levels different for different regions?
What is the maximum possible hurricane wind speed level?
How do these levels change under different climate conditions?
Although fewer hurricanes affect the United States when El Niño conditions are present, are they stronger?
Can we apply a similar analysis to insured hurricane losses?
What are the primary predictors of extreme insured losses?
How does the analysis of extreme insured losses differ from those of extreme winds?
We answer these questions by modeling the maximum wind speeds near the coast using a peaks-over-threshold (POT) model.
We estimate parameters by maximum likelihood (ML), using only data from the reliable period 1899-2004.
We then demonstrate the use of a Bayesian approach that allows us to incorporate an additional set of Atlantic hurricane data extending back to 1851.
We show that the same Bayesian extreme value model can be used to estimate extreme insured losses.
We use log(loss).
Alternate truncated normal model used for yearly losses.
Coastal hurricanes are a serious social and economic concern for the United States.
Strong winds, heavy rainfall, and storm surge kill people and destroy property.
Hurricane destruction rivals that from earthquakes.
Historically, 80% of all U.S. hurricane damage is caused by 20% of the most intense hurricanes.
The rarity of severe hurricanes implies that empirical estimates of return periods likely will be unreliable.
Extreme value theory provides models for rare wind events and a justification for extrapolating to levels that are much greater than have already been observed.
Stuart Coles “An Introduction to the Statistical Modeling of Extreme Values”
Definitive answers to questions about whether hurricanes will be more intense or more frequent in a future of global warming require long records.
The longest records available are near the coast.
The maximum possible hurricane wind speed is estimated to be 208 kt (183 kt) using the Bayesian (ML) model.
On average we can expect 132 (157) kt hurricane winds near the U.S. coast once every 10 (100) years.
Along the Florida coastline we can expect 108 (137) kt winds once every 5 (50) years on average.
Along the East coast we can expect 103 (120) kt hurricane winds once every 10 (300) years.
Return periods change with climate factors.
The return period for Hurricane Katrina is 21 years.
Global temperature may have some effect.
The expected worst case indicates a 94% probability of at least some loss during a year.
The expected worst case amount is $23.7 billion.
The expected best case indicates a 53% probability of at least some loss.
The expected best case amount is $1.1 billion.
The maximum 50-year single event loss under the worst case is $630 billion.
The maximum 50-year single event loss under the best case is $10 billion.
Figure: models by year.
Poisson Regression (1993)
Regression Tree (1996)
Discriminant Model (1997)
Time Series Model (1998)
Single Change Point Model (2000)
Weibull Model (2001)
Space Time Model (2002)
Extreme Value Model (2006)
Cluster Model (2003)
Time Series Regression (2008)
Multiple Change Point Model (2009)
Space Time Regression (2010)
Figure labels: Hurricane Type (Paths, Origin); Hurricane Rate (Counts); Hurricane Strength (Intensity); Regional Hurricane Activity; Large Scale Predictors (AMO, NAO, ENSO); Regional Scale Predictors (SST, SLP, etc.).
Darling (1991): empirical model to estimate local probabilities of hurricane wind speeds exceeding specified thresholds.
Rupp and Lander (1996): method of moments on annual peak winds over Guam to determine the parameters of an extreme value model leading to estimates of recurrence intervals for extreme typhoon winds.
Heckert et al. (1998): peaks-over-threshold method and a reverse Weibull distribution to obtain mean recurrence intervals for extreme wind speeds at consecutive mileposts along the U.S. coastline.
Chu and Wang (1998): various parametric distributions to model return periods for tropical cyclone wind speeds in the vicinity of Hawaii.
Jagger et al. (2001): maximum likelihood (ML) estimation to determine a linear regression for the scale and shape parameters of the Weibull distribution for hurricane wind speeds in coastal counties.
Pang et al. (2001): Bayesian approach to estimating parameters from a Weibull distribution using wind speed data.
We enhance previous research efforts by
Interpolating 6-hourly positions and intensities hourly.
For each tropical storm in each region, find maximum wind speed using interpolated values.
Interpolation removes some boundary bias.
Examining the effect of climate variables on the distribution of extreme winds.
The model employed by Jagger et al. (2001) captures the variation of hurricane frequency as a function of climate variables using the Weibull distribution, which is appropriate for wind speeds above some threshold but not necessarily for the most extreme winds.
Here we attempt to put extreme hurricane winds in the context of climate variability and climate change.
Demonstrating the feasibility of a Bayesian approach for adding older, less reliable, data into the analysis.
Generalized Pareto Distribution
Limit family for extremes
Figure: exceedance probability P(x > v | x > u) versus wind speed v [kt].
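The conditional exceedance probability P(x > v | x > u) under a Generalized Pareto Distribution can be sketched as below. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the scale and shape values in the example are made up for demonstration (only the 83 kt threshold comes from the slides). With a negative shape parameter the distribution has a finite upper limit, u - sigma/xi.

```python
import math

def gpd_survival(v, u, sigma, xi):
    """P(X > v | X > u) for a GPD with threshold u, scale sigma, shape xi."""
    if v <= u:
        return 1.0
    y = v - u
    if abs(xi) < 1e-12:                 # exponential limit as xi -> 0
        return math.exp(-y / sigma)
    t = 1.0 + xi * y / sigma
    if t <= 0.0:                        # beyond the upper end point (xi < 0 only)
        return 0.0
    return t ** (-1.0 / xi)

def gpd_upper_limit(u, sigma, xi):
    """Finite upper end point when xi < 0, otherwise infinity."""
    return u - sigma / xi if xi < 0 else math.inf

# Illustrative (assumed) parameter values; only u = 83 kt appears in the slides.
u, sigma, xi = 83.0, 20.0, -0.25
for v in (96, 114, 135):
    print(f"P(X > {v} kt | X > {u:.0f} kt) = {gpd_survival(v, u, sigma, xi):.3f}")
print("upper limit (kt):", gpd_upper_limit(u, sigma, xi))
```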
GPD threshold estimation
Which observations do we keep?
Mean residual life plot Smallest value.
Parameter estimates versus threshold plot (not shown).
83 kt for Florida and Gulf.
64 kt for East.
96 kt for Entire Coast.
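A mean residual life plot shows, for each candidate threshold u, the average amount by which observations exceed u. For GPD data the mean excess is approximately linear in u above a valid threshold, so one looks for the smallest u beyond which the plot is roughly straight. The sketch below computes the plotted quantities; the function name and the wind values are hypothetical and are not HURDAT data.

```python
def mean_residual_life(winds, thresholds):
    """Return (u, mean excess over u, number of exceedances) for each u."""
    rows = []
    for u in thresholds:
        excess = [w - u for w in winds if w > u]
        if excess:                      # skip thresholds with no exceedances
            rows.append((u, sum(excess) / len(excess), len(excess)))
    return rows

# Hypothetical per-storm maximum winds in kt (illustration only).
winds = [65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 120, 130, 140]
for u, mean_excess, n in mean_residual_life(winds, range(60, 121, 10)):
    print(f"u = {u:3d} kt   mean excess = {mean_excess:5.1f} kt   n = {n}")
```

In practice the number of exceedances shrinks as u increases, so the right-hand end of the plot is noisy and the threshold choice remains a judgment call.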
Best Track HURDAT center fixes
Time period: 1851 through 2004.
10 kt rounding error prior to 1900
5 kt 1900 and later
1900-2004 data used in initial analysis, 1851-1899 used in Bayesian analysis.
Interpolated to hourly fixes using natural splines.
Initial smoothing: use a local polynomial fit (Savitzky-Golay).
Try smoothing splines with uniform Wmax ± 2.5 kt distribution on true Wmax.
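The interpolation step (6-hourly fixes to hourly values with natural splines, then taking the maximum) can be sketched as follows. This is a self-contained illustration under stated assumptions: the function names are hypothetical, the wind values are invented, and the preliminary smoothing step mentioned above is not included.

```python
import bisect

def natural_spline_second_derivs(x, y):
    """Second derivatives of the natural cubic spline through (x, y)."""
    n = len(x)
    m = [0.0] * n                       # natural end conditions: m[0] = m[-1] = 0
    cp = [0.0] * n                      # scratch arrays for the tridiagonal solve
    dp = [0.0] * n
    for i in range(1, n - 1):
        h0 = x[i] - x[i - 1]
        h1 = x[i + 1] - x[i]
        rhs = 6.0 * ((y[i + 1] - y[i]) / h1 - (y[i] - y[i - 1]) / h0)
        denom = 2.0 * (h0 + h1) - h0 * cp[i - 1]
        cp[i] = h1 / denom
        dp[i] = (rhs - h0 * dp[i - 1]) / denom
    for i in range(n - 2, 0, -1):       # back substitution
        m[i] = dp[i] - cp[i] * m[i + 1]
    return m

def spline_eval(x, y, m, t):
    """Evaluate the spline at time t (clamped to the first/last interval)."""
    i = min(max(bisect.bisect_right(x, t) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    a = (x[i + 1] - t) / h
    b = (t - x[i]) / h
    return a * y[i] + b * y[i + 1] + ((a**3 - a) * m[i] + (b**3 - b) * m[i + 1]) * h * h / 6.0

def hourly_values(times, winds):
    """Hourly (time, wind) pairs between the first and last fix."""
    m = natural_spline_second_derivs(times, winds)
    return [(t, spline_eval(times, winds, m, float(t)))
            for t in range(int(times[0]), int(times[-1]) + 1)]

# Hypothetical 6-hourly fixes: hours since first fix, wind speed in kt.
times = [0, 6, 12, 18, 24]
winds = [70, 85, 110, 95, 80]
peak_hour, peak_wind = max(hourly_values(times, winds), key=lambda p: p[1])
print(f"hourly maximum: {peak_wind:.1f} kt at hour {peak_hour}")
```

Because the hourly grid includes the original fixes, the hourly maximum is never below the 6-hourly maximum; it can be higher when the spline peaks between fixes.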
Data Sets: Regions Anatomy of a Tropical Storm
Atlantic Sea Surface Temperature (SST) ºC
Hurricane Season Global Warming
Results: Models for Extremes
Return level plots by region. Curves are based on an extreme value model and asymptote to finite levels as a consequence of the shape parameter having a negative value. Parameter estimates are made using the ML approach. The thin lines are the 95% confidence limits. The return level is the expected maximum hurricane intensity over p years. Points are empirical estimates and fall close to the curves.
Return level plots for the entire U.S. coast (Region 4) by climate factors. Curves are based on an extreme value model using an ML estimation procedure. Data are partitioned separately by predictor. Red (blue) lines and points indicate above (below) normal climate conditions for each predictor.
Theoretical predictions of the influence of global warming on hurricane activity suggest:
Increased maximum intensity.
Increased level of storminess is uncertain.
Impact of Global Warming?
Figure: return level (kt) for warm years and cold years. For a given return period (> 5 yr), warm years result in higher return levels.
Figure (entire coast): return levels by Saffir-Simpson category for warm years and cold years, with Hurricane Katrina marked. Figure annotations: 137 kt, 14 yr, 11.
Magnitude of the difference in return level is consistent with climate models.
No change in frequency of weaker hurricanes.
Bayesian Inference of Extremes
Bayesian Extreme Value Models
Bayesian model for coastal hurricane wind speeds. Components: Hurricane Intensity Component; Hurricane Frequency Component; Covariate Component.
The model distinguishes the observed maximum wind speed from the true maximum wind speed.
Intercept: j=1, X[,1]=1 for all
How Gibbs Sampling Works
Figure: Gibbs sampling algorithm in two parameter dimensions, starting from an initial point and completing three iterations. The contours in the plot represent the joint distribution of the two parameters, and the labels (0), (1), (2), (3) denote the simulated values. One iteration of the algorithm is complete after both parameters are revised. Each parameter is revised along the direction of the coordinate axes, which is problematic if the two parameters are correlated (contours compressed), as movement along the axes tends to produce small changes in parameter values.
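The behavior described above can be reproduced with a toy two-parameter Gibbs sampler. The sketch below assumes a standard bivariate normal target with correlation rho, chosen only because its full conditionals are known in closed form; it is not the hurricane model, and the function names are hypothetical. Comparing the average step size for rho = 0 and rho = 0.99 shows how strong correlation shrinks the axis-aligned moves.

```python
import random

def gibbs_bivariate_normal(rho, n_iter, start=(0.0, 0.0), seed=1):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    sd = (1.0 - rho * rho) ** 0.5       # conditional standard deviation
    t1, t2 = start
    chain = [(t1, t2)]
    for _ in range(n_iter):
        t1 = rng.gauss(rho * t2, sd)    # revise parameter 1 given parameter 2
        t2 = rng.gauss(rho * t1, sd)    # revise parameter 2 given parameter 1
        chain.append((t1, t2))          # one iteration = both parameters revised
    return chain

def mean_abs_step(chain):
    """Average absolute change in the first parameter per iteration."""
    steps = [abs(b[0] - a[0]) for a, b in zip(chain, chain[1:])]
    return sum(steps) / len(steps)

for rho in (0.0, 0.99):
    chain = gibbs_bivariate_normal(rho, 2000)
    print(f"rho = {rho:4.2f}   mean |step| = {mean_abs_step(chain):.3f}")
```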
POT WinBUGS code Part I
for(j in 1:M) {
  # For each year calculate log sigma and xi:
  lsigma2[j] <- inprod(ls.x[], X[j,])
  xi2[j] <- inprod(xi.x[], X[j,])
  # Threshold crossing rate for each year:
  H[j] ~ dpois(lambda[j])
  log(lambda[j]) <- inprod(tc[], X[j,])
}
# Extreme value for each exceedance, note censoring:
for(i in 1:N) {
  # Index of the year in which exceedance i occurred
  offset[i] <- Yr[i] - Yr0
  lsigma[i] <- lsigma2[offset[i]]
  sigma[i] <- exp(lsigma[i])
  xi[i] <- xi2[offset[i]]
  # Upper and lower bounds on the true wind speed, given measurement error e[i]
  yl[i] <- y[i] + e[i]
  ys[i] <- y[i] - e[i]
}
POT WinBUGS code Part II
#Initializations and Missing Data model
#In our case Np=2, intercept and global temperature
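The WinBUGS fragment builds upper and lower bounds around each observed wind speed, but the sampling statement that uses them is not shown on the slides. One standard way to treat such bounds is as interval censoring, where each observation contributes F(upper) - F(lower) to the likelihood. The sketch below illustrates that idea only; it is an assumption about how the bounds might be used, and the function names and numerical values are hypothetical.

```python
import math

def gpd_cdf(y, u, sigma, xi):
    """GPD distribution function for excesses over threshold u."""
    if y <= u:
        return 0.0
    z = (y - u) / sigma
    if abs(xi) < 1e-12:                 # exponential limit as xi -> 0
        return 1.0 - math.exp(-z)
    t = 1.0 + xi * z
    if t <= 0.0:                        # above the upper end point (xi < 0 only)
        return 1.0
    return 1.0 - t ** (-1.0 / xi)

def censored_loglik(y, e, u, sigma, xi):
    """Log-likelihood when each true value lies in [y[i] - e[i], y[i] + e[i]]."""
    total = 0.0
    for yi, ei in zip(y, e):
        upper = gpd_cdf(yi + ei, u, sigma, xi)
        lower = gpd_cdf(max(yi - ei, u), u, sigma, xi)
        p = upper - lower               # probability mass of the interval
        if p <= 0.0:
            return -math.inf
        total += math.log(p)
    return total

# Hypothetical observations (kt) with 5 kt and 2.5 kt half-widths.
y = [90.0, 100.0, 125.0]
e = [5.0, 2.5, 2.5]
print(censored_loglik(y, e, u=83.0, sigma=20.0, xi=-0.2))
```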
Figure: hourly interpolated hurricane positions (1851-2004) by region. Region 1: Gulf Coast; Region 2: Florida; Region 3: Southeast Coast; Region 4: Northeast Coast.
Bayesian Model for Coastal Hurricane Winds 2
Components: Hurricane Intensity Component; Hurricane Frequency Component; Covariate Component. Observed maximum wind speed; true maximum wind speed.
Results: Raw Climatology
Frequency distributions of the shape parameter ξ by region, with the probability that hurricane intensity is unbounded:
Region 1: Gulf Coast   P(ξ > 0) = 0.22
Region 2: Florida      P(ξ > 0) < 0.01
Region 3: Southeast    P(ξ > 0) < 0.01
Region 4: Northeast    P(ξ > 0) < 0.01
Assuming the model is correct, the data support a super-intense hurricane threat only in the Gulf of Mexico.
Important Results: Conditional Climatology
Frequency distributions of selected regression coefficients (figure panels):
Region 1: AMO: Scale: Stronger Hurricanes
Region 3: AMO: Threshold: More Hurricanes
Region 1: NAO: Threshold: More Hurricanes
Region 2: NAO: Threshold: More Hurricanes
Region 1: SOI: Threshold: More Hurricanes
Region 3: SOI: Threshold: More Hurricanes
Simulated Hurricane Seasons
A Monte Carlo procedure is applied to the posterior samples to generate a large number (50K) of simulated hurricane seasons based on sampled covariate and parameter values.
Results from the Gulf Coast show that the simulated data match the empirical data through Category 4 wind speeds, but for winds in excess of 135 kt (69 m/s) the simulated data indicate a higher frequency (by a factor of 2 to 3).
Region 1: Gulf Coast
Category (SS)   Threshold (kt)   Empirical data*   Simulation*
> I                   80             0.463            0.469
II                    83             0.418            0.423
III                   96             0.313            0.279
IV                   114             0.142            0.143
V                    135             0.037            0.054
V+                   150             0.007            0.028
V++                  170             0.007            0.015
*mean annual exceedance rate
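The simulation of hurricane seasons can be sketched as follows: for each season, pick one parameter draw, generate a Poisson number of threshold exceedances, draw each exceedance from the GPD, and tally how often each wind speed cutoff is exceeded per season. The parameter draws below are invented stand-ins for posterior samples, the function names are hypothetical, and covariates are omitted, so the output is not comparable to the table above.

```python
import math
import random

def rpois(rng, lam):
    """Poisson draw by Knuth's multiplication method (fine for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def rgpd(rng, u, sigma, xi):
    """GPD draw above threshold u by inverting the survival function."""
    q = 1.0 - rng.random()              # survival probability in (0, 1]
    if abs(xi) < 1e-12:
        return u - sigma * math.log(q)
    return u + sigma / xi * (q ** (-xi) - 1.0)

def simulate_rates(draws, u, cutoffs, n_seasons, seed=0):
    """Mean annual exceedance rate at each cutoff over simulated seasons."""
    rng = random.Random(seed)
    counts = {c: 0 for c in cutoffs}
    for _ in range(n_seasons):
        lam, sigma, xi = rng.choice(draws)      # one parameter draw per season
        for _ in range(rpois(rng, lam)):
            w = rgpd(rng, u, sigma, xi)
            for c in cutoffs:
                if w > c:
                    counts[c] += 1
    return {c: counts[c] / n_seasons for c in cutoffs}

# Hypothetical (rate, scale, shape) draws standing in for posterior samples.
draws = [(0.45, 20.0, -0.20), (0.42, 22.0, -0.25), (0.50, 18.0, -0.15)]
rates = simulate_rates(draws, u=83.0, cutoffs=[83, 96, 114, 135], n_seasons=20000)
for c, r in rates.items():
    print(f"> {c:3d} kt: {r:.3f} per year")
```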
Modeled 100-year return levels do not appear to match empirical evidence.
Might be a problem with the regression structure of the shape parameter. Using simpler models where we replace the covariates with discrete factors (above/below normal) we produce a better match.
Convergence is not guaranteed when modeling the log( ) and scale as a linear combination of covariates.
Problem in region 4 (NE) where there are fewer hurricanes.
Threshold is not estimated in the current model.
Behrens, Lopes and Gamerman (2004), “Bayesian analysis of extreme events with threshold estimation”, Statistical Modelling, 4, 227-244.
The nature of the shape parameter is such that its value is a function of the support of the underlying distribution.
Attempts to model this using a Reversible Jump Markov chain Monte Carlo (RJMCMC) approach reduce this problem but introduce significant autocorrelation into the sampler.
Model assigns positive probability for discrete values of xi, uses RJMCMC
To be useful to risk models, the relationship between climate and hurricane activity needs to be forecast in advance of the season.
Fortunately, three important climate variables related to hurricanes can be used in a prediction model; but each variable enters the prediction model in a unique way.
NAO : Natural precursor signal to hurricane activity.
SST : Slowly varying (persistent).
ENSO : Can be predicted with some skill by dynamical models.
Predicting Insured Losses
Peaks Over Threshold
Splitting Insured Losses
We split losses at $100 million (red line in the figure), which allows us to examine significant events.
Small loss events: 36.5% of events, 0.6% of losses.
Large loss events: 63.5% of events, 99.4% of losses.
Reference line indicates 80/17 split.
Large Loss Potential Evenly Distributed
Preseason Predictors (figure panels: SST)
Insured Loss Model
Model distributions:
log(loss): truncated normal, dnorm(μ, 1/σ²)
Rate: Poisson, dpois(λ)
Predictors are May-June averaged values.
Extreme Loss Model and Results (figure panel: SST)
Predictors set at maximum values of covariates with least favorable climate. Panel labels: +NAO, +SOI; -NAO, -SOI.
Model distributions:
log(loss): GPD, dGPD(u, σ, ξ), with u = 9 (log10 of $1 billion)
Rate: Poisson, dpois(λ)
The yearly model and extreme loss model use the same Peaks over Threshold approach.
Yearly model used SST and NAO predictors with truncated Normal distributions:
Smallest DIC for truncated Normal, with SST and NAO predictors
Better estimates of single year insured losses.
Extreme Loss Model uses SST, NAO and SOI with GPD distribution:
SST and NAO used in regression for logarithm of threshold crossing rate
Log(loss) assumed to have GPD distribution for loss > $1 Billion.
Mean Residual Life Plot used to estimate threshold.
NAO used in log(σ) regression
SOI used in ξ regression
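Because the number of loss events per year is modeled as Poisson, the probability of at least one loss event in a year with rate λ is 1 - exp(-λ). The sketch below shows that relationship and its inverse. It is a simplification for illustration: the reported probabilities are expectations over the model's posterior, so the rates printed here are only what a single fixed Poisson rate would imply, and the function names are hypothetical.

```python
import math

def prob_any_loss(rate):
    """P(at least one loss event in a year) for a Poisson rate."""
    return 1.0 - math.exp(-rate)

def rate_for_prob(p):
    """Poisson rate that gives probability p of at least one event."""
    return -math.log(1.0 - p)

# Annual probabilities quoted in the results (worst and best case).
for label, p in (("worst case", 0.94), ("best case", 0.53)):
    print(f"{label}: P = {p:.2f} corresponds to a rate of {rate_for_prob(p):.2f} events/yr")
```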
More information: “Forecasting U.S. Insured Hurricane Losses”, available at our website; in press in “Forecasting Insured Losses”.
Comments on POT BUGS Models:
Model mixes well.
Model must be initialized carefully.
Once model compiles it runs smoothly.
Posterior mean and MLE sometimes very different.
MLE may be approximated by posterior sample values at minimum deviance, but not well in this model.
Return level sampling not recommended.
Data and R code using BRugs available.
POT model using U.S. Hurricane loss data
Demo and R/OpenBUGS code available.
Examples of truncated normal distributions in POT type model as well as truncated normal/GPD.
A sediment core from a back barrier marsh in New England.
These data are currently not used in risk models.
Evidence of Prehistoric Hurricanes peat layer sand layer 1954 H peat layer peat layer peat layer sand layer 1938 H sand layer 1635 H Courtesy: Jeff Donnelly
Increasing Radius beginning at 45 km
Combining extreme value theory with a Bayesian specification provides a practical way to assess return periods of extreme hurricane winds.
Results are expressed in terms of posterior distributions of the parameters of interest.
Less reliable, but still useful information is incorporated into the model in a natural way.
A large number of simulated hurricane seasons can be generated by repeated sampling from the posterior distributions.
Analysis of hurricane winds suggests the possibility that the highest winds estimated for early storms may be underreported.
The approach can be used to better understand the projected impact of global warming on extreme hurricane winds.
The POT approach can be extended to insured losses and the logarithm of extreme losses can be modeled using a GPD distribution.
Taken as a random event along the Gulf coast, the return period of Hurricane Katrina is 21 years.
Katrina might be a sign of things to come as the observed and modelled effect of global warming appears to start at Katrina’s intensity.
Preseason climate signals provide information about the nature of the upcoming hurricane season.
The hurricane risk index quantifies this information geographically.
Insured losses (both expected and maximum) show a statistical link to preseason climate signals.
Google hurricane climate
http://garnet.fsu.edu/~jelsner/www
Future Work I
Use more data and new covariates
Incorporate historical climate data with geological evidence and historical data sets. (last slide)
Use proxy data for covariates
Use Quasi-Biennial Oscillation (QBO) Zonal Wind Index.
Only from 1953, very predictable, especially west phase
50 mb, East phase, more vertical wind shear.
El Niño may interact with West descent phase.
New wind index:
Saunders, M. A. and A. S. Lea, 2005: Seasonal prediction of US landfalling hurricane activity from 1 August, Nature, 434, 1005-1008.
Future Work II
Use alternative prior formulation:
Choose three return periods, rp[1:3]= [2,10, 100] years
Assign a reasonable multivariate prior to the return levels for rp[1:3], with rl[1] < rl[2] < rl[3] (using rl[1], rl[2]-rl[1], rl[3]-rl[2]).
Use rl[i] = u + (σ/ξ)·((z[i]·λ_u)^ξ - 1), with z[i] = 1/log(rp[i]/(rp[i]-1)) ≈ rp[i] - 1/2.
Solve for λ_u, σ, ξ as a function of rl[1:3], rp[1:3].
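The return level relationship used in this prior formulation can be sketched numerically. With a Poisson threshold crossing rate lam and GPD exceedances, the level exceeded on average once every rp years is u + (sigma/xi) * ((z * lam)^xi - 1), where z = 1/log(rp/(rp-1)). The sketch below evaluates that expression and checks the approximation z ≈ rp - 1/2. The parameter values are assumptions for illustration, and the function names are hypothetical.

```python
import math

def z_exact(rp):
    """z = 1 / log(rp / (rp - 1)); requires rp > 1."""
    return 1.0 / math.log(rp / (rp - 1.0))

def return_level(rp, u, sigma, xi, lam):
    """rp-year return level for Poisson rate lam and GPD(u, sigma, xi)."""
    z = z_exact(rp)
    if abs(xi) < 1e-12:                 # limit as xi -> 0
        return u + sigma * math.log(z * lam)
    return u + sigma / xi * ((z * lam) ** xi - 1.0)

# Assumed parameter values for illustration only.
u, sigma, xi, lam = 83.0, 20.0, -0.2, 0.45
for rp in (2, 10, 100):
    print(f"rp = {rp:3d} yr   z = {z_exact(rp):7.3f}   "
          f"rp - 1/2 = {rp - 0.5:5.1f}   rl = {return_level(rp, u, sigma, xi, lam):6.1f} kt")
```

Going the other way, from three chosen return levels to the three parameters, requires solving these equations numerically; that step is not shown here.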
Coles, S. G. and J. A. Tawn, 1996: A Bayesian analysis of extreme rainfall data. Appl. Statist., 45, 463-478.
Future Work III
Additional analyses using extreme value theory and Bayesian methods:
Use Bayesian model averaging and median models selection
Use RJMCMC to examine spatial and temporal parameter changes, e.g. smoothing splines (WinBUGS 1.4 RJMCMC)
Apply to pressure, rainfall and sea surge.
Apply to hurricane rapid intensification (RI) process.
Bivariate GPD, maximum intensity, intensification per storm
Western Pacific RI density, conditional probability of intensification.
How does global temperature affect RI?
Use spatial model for distribution of GPD parameters.
Spatial and Temporal Models:
Clustering models on a lattice. (Jagger Dissertation)
Point process models (Cox Processes). (R/Splus)
Apply to hurricane activity and intensity full models.