Atmospheric Power Law Behavior
      A Look at Southeastern US Temperatures

                              James Duncan
Motivation & Introduction
   Extreme climatic events are weather phenomena
    that occupy the tails of a dataset's probability density
    function (PDF).
   Advanced stochastic theory asserts that power law
    distributions should exist in the tail ends of our data.
   Questions to Answer:
       Show That Power Law Distributions Are Evident within
        Temperature Data.
       Analyze how power law distributions change with varying
        weather and climatic patterns (seasons, ENSO, etc.).
What is a Power Law
Distribution?
Mathematically, a power law probability distribution of
quantity x may be written as:

    p(x) = C x^{-\alpha}

or, in normalized form:

    p(x) = \frac{\alpha - 1}{x_{\min}} \left( \frac{x}{x_{\min}} \right)^{-\alpha}


Where α is the exponent or scaling parameter and C is
the normalization constant.




                       [Neelin et al. 2011]
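
As a short worked step connecting the two forms, the normalization constant can be obtained by requiring the density to integrate to one above the lower bound. This sketch assumes \alpha > 1 (otherwise the integral diverges) and that the power law holds for all x \ge x_{\min}:

```latex
\int_{x_{\min}}^{\infty} C x^{-\alpha} \, dx
  = C \, \frac{x_{\min}^{1-\alpha}}{\alpha - 1} = 1
\quad\Longrightarrow\quad
C = (\alpha - 1) \, x_{\min}^{\alpha - 1}

p(x) = (\alpha - 1) \, x_{\min}^{\alpha - 1} \, x^{-\alpha}
     = \frac{\alpha - 1}{x_{\min}} \left( \frac{x}{x_{\min}} \right)^{-\alpha}
```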
Data & Methods
Data
 Daily observed maximum and minimum temperatures across the
  southeastern United States (AL, FL, GA, NC, SC) spanning 1960-
  2009.
 Measures of quality control have been put in place resulting in the
  omission of 20 stations.
Methods
 Trends have been removed from the data.
 If data is to follow a power law distribution, it does so above some
  lower bound xmin.
       To find our lower bound, we employ the Kolmogorov-Smirnov or KS
        Statistic which calculates the maximum difference between the CDF of the
        observed data and estimated power law distribution.
   To calculate our scaling parameter α, we employ the “method of
    maximum likelihood”.
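    KS statistic:

        D = \max_{x \ge x_{\min}} | F(x) - P(x) |

    Maximum likelihood estimate of the scaling parameter:

        \hat{\alpha} = 1 + n \left[ \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}} \right]^{-1}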
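
The two formulas can be combined into a scan over candidate lower bounds. The following is a minimal sketch, not the code used in this study. It assumes the input holds positive values (for example, the magnitudes in one tail of the detrended data) and a continuous power law. The function names and the `min_tail` cutoff are hypothetical choices made for illustration.

```python
import numpy as np

def alpha_mle(tail, xmin):
    """Maximum likelihood estimate: alpha = 1 + n / sum(ln(x_i / xmin))."""
    return 1.0 + tail.size / np.sum(np.log(tail / xmin))

def ks_distance(tail, xmin, alpha):
    """Maximum distance between the empirical CDF and the power law CDF."""
    tail = np.sort(tail)
    n = tail.size
    # CDF of the continuous power law above xmin: P(x) = 1 - (x / xmin)^(1 - alpha)
    model_cdf = 1.0 - (tail / xmin) ** (1.0 - alpha)
    # The empirical CDF is a step function, so check both sides of each step
    ecdf_after = np.arange(1, n + 1) / n
    ecdf_before = np.arange(0, n) / n
    return max(np.max(np.abs(ecdf_after - model_cdf)),
               np.max(np.abs(model_cdf - ecdf_before)))

def fit_power_law(x, min_tail=50):
    """Try each observed value as xmin and keep the one with the smallest D.

    min_tail is an illustrative cutoff (an assumption, not from the slides)
    that stops the scan once too few points remain for a stable estimate.
    Returns (xmin, alpha, D), or None if no candidate qualifies.
    """
    x = np.asarray(x, dtype=float)
    x = x[x > 0]  # the power law form requires positive values
    best = None
    for xmin in np.unique(x):  # candidates in ascending order
        tail = x[x >= xmin]
        if tail.size < min_tail or tail.max() == xmin:
            break
        alpha = alpha_mle(tail, xmin)
        d = ks_distance(tail, xmin, alpha)
        if best is None or d < best[2]:
            best = (xmin, alpha, d)
    return best
```

Calling `fit_power_law(values)` on one station's tail data would return the selected lower bound, the corresponding scaling parameter, and the K-S distance of that fit.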
Significance Testing
   Employ the use of a goodness of fit test which will
    measure and analyze the KS distance of our power
    law distribution with that of other synthetically
    derived power law distributions.
   From this goodness of fit test, we are able to derive
    a 'p-value' that measures how plausible the estimated
    power law distribution is as a fit to the observed
    data (larger values indicate a more plausible fit).
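
A goodness-of-fit test of this kind can be sketched as follows. This is a simplified illustration, not the study's implementation. It follows the convention of Clauset et al. (2009), where the p-value is the fraction of synthetic data sets whose K-S distance is at least as large as the observed one. As a simplifying assumption, it holds xmin fixed and resamples only the tail, whereas the full procedure also re-estimates xmin for each synthetic set. All function names are hypothetical.

```python
import numpy as np

def alpha_mle(tail, xmin):
    """Maximum likelihood estimate of the scaling parameter."""
    return 1.0 + tail.size / np.sum(np.log(tail / xmin))

def ks_distance(tail, xmin, alpha):
    """Maximum distance between the empirical CDF and the power law CDF."""
    tail = np.sort(tail)
    n = tail.size
    model_cdf = 1.0 - (tail / xmin) ** (1.0 - alpha)
    ecdf_after = np.arange(1, n + 1) / n
    ecdf_before = np.arange(0, n) / n
    return max(np.max(np.abs(ecdf_after - model_cdf)),
               np.max(np.abs(model_cdf - ecdf_before)))

def sample_power_law(n, xmin, alpha, rng):
    """Draw n values from a continuous power law by inverse transform."""
    u = rng.random(n)  # uniform on [0, 1)
    return xmin * (1.0 - u) ** (-1.0 / (alpha - 1.0))

def power_law_p_value(tail, xmin, n_synthetic=500, seed=0):
    """Fraction of synthetic power law samples that fit no better than the data."""
    rng = np.random.default_rng(seed)
    tail = np.asarray(tail, dtype=float)
    alpha = alpha_mle(tail, xmin)
    d_obs = ks_distance(tail, xmin, alpha)
    count = 0
    for _ in range(n_synthetic):
        synth = sample_power_law(tail.size, xmin, alpha, rng)
        # Refit alpha to each synthetic sample, as was done for the data
        d_syn = ks_distance(synth, xmin, alpha_mle(synth, xmin))
        if d_syn >= d_obs:
            count += 1
    return count / n_synthetic
```

Under this convention a small p-value means the observed data deviate from the fitted power law more than genuine power law samples typically do.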
Presence of Power Law
Distributions
Power Law Fit & Significance
[Figure panels: P-Value Tests, Skewness, Kurtosis]

Criteria                                        Is Power Law Fit Significant?

Ppower>0.10 and Pgauss<0.10                     YES

Ppower<0.10 and Pgauss>0.10                     NO

Ppower>0.10 and Pgauss>0.10 but Ppower>Pgauss   Both Fits Are Significant, But Can Say Power Law is Better Fit (YES)

Ppower<0.10 and Pgauss<0.10                     NO
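
The criteria table can be written as a small decision function. This is a sketch only. The function name is hypothetical, and two cases the table does not spell out are treated here as "NO" by assumption: both p-values above the threshold with Ppower not larger than Pgauss, and any p-value exactly equal to the threshold.

```python
def power_law_fit_is_significant(p_power, p_gauss, threshold=0.10):
    """Apply the slide's criteria to one station's pair of p-values."""
    if p_power > threshold and p_gauss < threshold:
        return True  # only the power law fit is significant
    if p_power > threshold and p_gauss > threshold:
        # Both fits are significant; count it as YES only if the power law
        # has the higher p-value (the reverse case is an assumption: NO)
        return p_power > p_gauss
    return False  # the power law fit itself is not significant


# Example: one call per row of the table
print(power_law_fit_is_significant(0.50, 0.01))
print(power_law_fit_is_significant(0.05, 0.50))
print(power_law_fit_is_significant(0.50, 0.30))
print(power_law_fit_is_significant(0.05, 0.05))
```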
Xmin & Alpha




More analysis is needed to adequately note whether patterns exist in the spatial
distributions of Xmin and Alpha.
Distinguished Power Laws
Maximum Temperatures: Tamiami, FL; Asheville, NC
Minimum Temperatures: Hialeah, FL; Henderson, NC
[Figure: one panel per station]

Distinguished Criteria (ppower>0.90 & pgauss~0)
Seasonal Shifts in
Power Law Distributions



Now that we have established that power law
  distributions exist, how are they
modulated by changes in the seasonal cycle?
Fall Power Law Fit & Significance
[Figure panels: P-Value Tests, Skewness, Kurtosis]

Criteria                                        Is Power Law Fit Significant?

Ppower>0.10 and Pgauss<0.10                     YES

Ppower<0.10 and Pgauss>0.10                     NO

Ppower>0.10 and Pgauss>0.10 but Ppower>Pgauss   Both Fits Are Significant, But Can Say Power Law is Better Fit (YES)

Ppower<0.10 and Pgauss<0.10                     NO
Future Work & Conclusions
   There does appear to be a dynamic link between
    areas of significant power law fit and areas of distinct
    skewness and kurtosis.
   Further examine how power law distributions change
    with respect to season, ENSO, and other climatic
    cycles.
       Look to see if these modulations in the power law
        distribution may be explained by any specific physical
        processes.
   Look into more ways to objectively characterize
    changes in the power law parameters (Xmin and
    Alpha) and distribution.
References
Clauset, A., C. R. Shalizi, and M. E. J. Newman, 2009: Power-law distributions in empirical data. SIAM Rev., 51,
    661-703.
Neelin, D., and T. W. Ruff, 2011: Long tails in regional surface temperature probability distributions with implications for
    extremes under global warming. Geophys. Res. Lett., 39, L04704, doi: 10.1029/2011GL05061.
Newman, M. E. J., 2005: Power laws, Pareto distributions and Zipf's law. Contemp. Phys., 46, 323-351.
Sura, P., 2011: A general perspective of extreme events in weather and climate. Atmos. Res., 101, 1-21.
Stefanova, L., P. Sura, and M. Griffin, 2012: Quantifying the non-Gaussianity of wintertime daily maximum and
    minimum temperatures in the Southeast United States. J. Climate, in press.
Winter Power Law Fit & Significance
[Figure panels: P-Value Tests, Skewness, Kurtosis]

Criteria                                        Is Power Law Fit Significant?

Ppower>0.10 and Pgauss<0.10                     YES

Ppower<0.10 and Pgauss>0.10                     NO

Ppower>0.10 and Pgauss>0.10 but Ppower>Pgauss   Both Fits Are Significant, But Can Say Power Law is Better Fit (YES)

Ppower<0.10 and Pgauss<0.10                     NO
Values of Xmin




There appear to be several more distinct regions of behavior than in the annual analysis;
however, more analysis and comparison are needed to adequately depict the
potential patterns developing spatially.

COAPS Short Seminar Series


Editor's Notes

  1. This research is motivated by the study of extreme events, that is, events whose magnitude is large but whose probability of occurrence is relatively small. These extreme events are high-impact, hard-to-predict phenomena that lie beyond our normal (Gaussian) expectations. For this research we are therefore interested in the tails (maxima/minima) of the data. Here, an extreme event is defined in terms of the non-Gaussian tail of the data's probability density function, as opposed to the definition used in extreme value theory.
     While it is understood that the PDFs of atmospheric phenomena are non-Gaussian, the exact shape of their tails is not fully understood. From a purely stochastic perspective (intrinsically non-deterministic, sporadic, and categorically not intermittent, i.e. random), a power law distribution should exist in the tails, so we want to investigate this question and look further into the behavior of power laws in nature (in this case, temperature).
     An example of a stochastic process in the natural world is pressure in a gas: even though each molecule moves along a deterministic path, the motion of a collection of them is computationally and practically unpredictable.
     Purpose of research:
     -Understanding the statistical distribution of daily temperature extremes is of practical interest in ecology, agriculture, and utilities planning.
     -Weather and climate risk assessment depends on knowing the tails of the PDFs.
     -State the purpose of modeling our extreme events: to get a better idea of the distribution of extreme events and, in turn, to do a better job forecasting the occurrence and magnitude of these events.
     -This is also important with regard to climate change: if we are witnessing a shift in the mean of our data, we can also expect a shift (sometimes by multiple magnitudes) in the tails of our datasets. That is, with even a small shift in the mean of a dataset, the extreme values become more important.
     -A way to model: temperature for a department building (modeling energy use), how many times will the maximum occur?
     -Industries affected by this: insurance and the modeling industry.
     Present and discuss observational examples and applications of our non-Gaussian stochastic framework.
  2. It is not arbitrary to look for a power law distribution (as stated by stochastic theory and shown by the existence of power laws throughout the physical world). [The equation on the right is the normalized expression.]
     Properties of power laws
     Mathematically:
     -A quantity x obeys a power law when the frequency of an event varies as a power of some attribute of that event.
     -More often the power law applies only for values greater than some minimum xmin; in such cases we say that the tail of the distribution follows a power law.
     -The distribution must deviate from the power-law form below some minimum value xmin.
     Physically:
     -Observations have shown that many atmospheric variables follow a power law distribution in the tails.
     -Power-law distributions occur in an extraordinarily diverse range of phenomena.
     Note: power laws with alpha less than one rarely occur in nature, as they would diverge.
  3. Quality control measures have been put into place (resulting in the omission of 20+ stations):
     -Quality-controlled digital data come from the Summary of the Day data set supplied by the National Climatic Data Center (NCDC).
     -Daily measurements of maximum and minimum temperature are provided by the National Weather Service's Cooperative Observer Program (COOP).
     -For this study, only stations reporting since at least 1960 were selected; stations with more than 5 consecutive years of missing data were discarded.
     -In case of missing data for a given station, correlations between the existing time series at this reference station and surrounding stations within a 50-mile radius are computed, and stations with correlations greater than 0.6 are retained for use in reconstructing the reference station's missing data.
     -We are also not too worried about missing data beyond the QC put in place, because we are interested in the tails of our distribution and expect data in the tails to always be recorded: anomalously large events are always recorded, and it is more likely that values close to the mean would be omitted.

     In order to utilize the K-S statistic, the CDFs of both the observed data and the estimated power law distribution must be calculated.
     -One typically cannot say with absolute certainty that an empirical data set is described by a specific probability distribution. Rather, it can only be stated that the observed data are in agreement with the proposed PDF. (Test various values of xmin; choose the one with the smallest K-S statistic.)
     -Our method attempts to minimize the difference between the distribution of the observed data and the best estimate of the power law distribution assigned to the data by using the Kolmogorov-Smirnov statistic (K-S statistic).
     -D is the maximum distance between the cumulative distribution function of the observed data, F(x), and the cumulative distribution function of the estimated power law distribution, P(x), in the domain x ≥ xmin.
     -By testing different values of xmin and calculating the respective K-S distance, one obtains many different values of D that serve as a comparison between the CDF of the estimated power law distribution and the CDF of the observed data.
     -The value of xmin where the smallest value of D was obtained becomes the permanent lower bound of the estimated power law fit.
     -Note: there must be some lower bound to the power-law behavior, the point at which the power law distribution appears. This allows one to pin down the domain of x where the power law is located.
     -If we choose too low a value for xmin, we will get a biased estimate of the scaling parameter, since we will be attempting to fit a power-law model to non-power-law data.
     -If we choose too high a value for xmin, we are effectively throwing away legitimate data points x < xmin.
     -It is better to err a little on the high side; estimates that are too low could have severe consequences.
     -Estimating a value of xmin is crucial for determining the power law exponent, as the slope of the power law distribution is determined by which data points are within the domain of the power law distribution.

     Once we have an estimate of the lower bound of the power law distribution, the value of xmin may be used in estimating the scaling parameter of the power law distribution.
     -Talk about the straight line on a log-log plot; note that alpha sets the slope (the slope is -alpha).
     -To obtain this parameter, we utilize the "method of maximum likelihood" (MLE), which obtains a value of alpha by summing over each empirical data point xi (the xi are observed values) that is greater than or equal to the previously estimated value of xmin.
     -MLEs will give us no warning that our fits are wrong: they tell us only the best fit to the power-law form, not whether the power law is in fact a good model for the data.
  4. To quantitatively measure the significance of our estimated power law distribution, we employ a test that compares its K-S distance, D, with the K-S distances, Dsyn, of many idealized, synthetically produced power law data sets.
     -One synthetic dataset is not enough: sampling error alone could make a single synthetic dataset fit the power law more or less closely than the empirical data do.
     -In instances where Dsyn < D, the synthetic dataset fits the power law more closely than the observed data do.
     -Compare the K-S distances of a large number of synthetic datasets. As the number of datasets increases, the fraction with Dsyn > D converges to an expected value.
     -To obtain an estimate of the expected value, we take the number of datasets where Dsyn > D and divide it by the total number of synthetic datasets (the convention of Clauset et al. 2009). The result is a "p-value" that expresses how plausible the estimated power law distribution is as a fit to the observed data.
     -Use a threshold of 0.10: when p < 0.10, fewer than 10% of the synthetic datasets fit worse than the observed data, and the power law fit is not considered significant.
     -The calculation of p-values for multiple distributions is a way to test or compare different probability distribution fits to empirical data. Pgauss is a quantitative measure of how appropriate the Gaussian fit is to the data.
  5. I started by running this program on the distributions one by one, for maximum and minimum temperatures at the 272 stations. I slowly realized that this would be neither an effective nor an efficient way to identify power law behavior in the atmosphere, nor would it help us determine any patterns or obvious fluctuations with changing weather patterns.
  6. I wanted to quantify the strength of the power law distributions: in which places can we say with significance whether or not a power law distribution is present?
     -When attempting to fit a probability distribution to empirical data, it is nearly impossible to find only one distribution that describes the behavior of the data. One typically cannot say with absolute certainty that an empirical data set is described by a specific probability distribution. Rather, it can only be stated that the observed data are in agreement with the proposed PDF.
     -When p is greater than 0.10 for a distribution, we can say that the fit is significant (cite this threshold).
     -When the p-values for both p(gauss) and p(power) are above 0.10, a problem arises: both distributions are then significant in that either could be a possible fit for the data. However, one may be a better fit than the other, and we can say that the power law fit is 'better' when it has the higher p-value. It may be helpful to discern which distribution returns the larger p-value, even though both distributions are 'significant'.
     -Our current criteria may miss cases where the power law is a 'better' fit to the data than the Gaussian distribution.
     Where should we expect power law distributions? In regions where we have heavy tails, i.e. a certain combination of skewness and kurtosis.
     -Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable (right vs. left skew). With positive skewness we seem to see the power law on the negative side of the PDF; with negative skewness we seem to see it on the positive side.
     -Kurtosis is any measure of the 'peakedness' of the probability distribution of a real-valued random variable; it is a descriptor of the shape of a probability distribution.
     -A higher-kurtosis distribution has a sharper peak and longer, fatter tails, while a low-kurtosis distribution has a more rounded peak and shorter, thinner tails.
     -That is, a positive kurtosis corresponds to a high peak with more data contained in the tails of the distribution. With stronger kurtosis we potentially have 'heavier' tails, so we are interested in areas of larger kurtosis.
     Results:
     -Because of negative skewness, we expect the positive side of the PDF to have a 'heavier tail', and this does seem to mirror where we find our 'most significant' power law distributions.
     -The negative side seems to be dominated by the kurtosis pattern, whereas skewness dominates the positive side of the PDF.
     -In non-Gaussian areas, we see greater power law fit.
     -Analyze impacts of
  7. Remember: Nick thinks that it isn't bad to have one or two statements underneath the plots.
     -Negative skewness would mean the tail extends out to the left, and we may see a relation between higher x-min values and the region where the tail extends out to, i.e. trivially, because there are fewer standard deviations from the mean (meaning we see higher x-min values in the respective region of skewness, on the tail side).
     -Want to look at x-min and alpha behavior in locations where our p-values were strongest/most significant.
     -Note the few patterns that do show up in this. It is also not known whether or not these have any physical meaning or purpose.
     Note:
     Maximum Temperatures
     -Had significant power law fits on the positive side of the distribution, given the large span of negative skewness.
     -Significant power law fit up through North Carolina.
     -There was almost a cold tongue in North Carolina in kurtosis, which appeared to show up on the negative p-value side.
     -Think of a way to better analyze this (possibly normalize x-min values and find the departure, or instead visualize anomalous x-min values), and potentially note patterns with changing season and such.
     -Further analysis is needed.
     Minimum Temperatures
     -Skewness is mostly zero throughout much of the southeastern United States.
     -A small pocket of negative skewness existed in the middle portion of Florida. This was matched by the presence of significant power law fits in the main portions of Florida. Most of Georgia, Alabama, South Carolina, and North Carolina demonstrated non-significant power law fits.
     -Kurtosis was mostly positive through the southeastern US; southeastern Florida demonstrates, on average, above-normal positive values of kurtosis when compared to other areas. This means more peakedness and heavier tails, and is thus of more interest to us with regard to an annual analysis.
  8. Selected by magnitude plot; possibly show it here, and look into individual plots to note characteristics. Decided based upon comparative values of p. There seemed to be interesting things going on in south Florida and North Carolina.
  9. Now that we have established that power laws exist, how do they change with season or ENSO? We must note change in order to understand the true significance of the original. How do seasons alter a) the strongest distributions from before and b) areas of weak confidence?
  10. Note the impact of kurtosis on the northern portions of the SE United States. With negative skewness, we appear to have higher confidence in the significance of the p-value on the positive tail of the PDF. The skewness pattern in the minimum temperature is similar to the annual one, but the power law pattern may not be identical because of negative kurtosis (flatness); this may make for a more symmetrical appearance of power laws. Zero kurtosis seems to mirror zero p-significance on that side of the tail. In the annual analysis we saw many more regions of positive kurtosis, or peakedness, resulting in heavier tails.
  11. We have discovered the patterns; what is the next step? Power law distributions seem to be determined by the non-Gaussianity of the dataset.
  12. Previously, kurtosis seemed to mirror the negative p-values, but skewness seemed to dictate much of the positive p-values. We must ask why the pattern does not seem consistent; however, it is interesting to note that the percentage of power law significance has increased since the last plot, with very few white dots. Negative kurtosis may weaken the relation of skewness to p-value significance. Either way, from these we are still able to see that power laws are significant throughout nature. Patterns of power law significance do change with varying seasons.
  13. Values of x-min appear to be smaller in general, with a more diverse range of x-min values, which is interesting.