Marshall University
College of Science
Department of Mathematics
- STA 564 -
Time Series Analysis and Forecasting
focused on Air Pollution in an Urban
Area
By Kenneth Guzman
December 7, 2018
Contents
1 INTRODUCTION
2 Yeo-Johnson transformation, Kolmogorov-Smirnov Test for normality
3 Factor Analysis and PCA
4 Box Jenkins Methodology
5 Personal Study Carried out in R
6 R Code Explanation and Software Packages Used
7 Conclusion
References
TIME SERIES ANALYSIS FORECASTING
NOTES
At its core, air pollution in the atmosphere is strongly governed by meteorology. However, the "univariate" models considered here assume that the concentration of an air pollutant in the atmosphere is the final result of all the complex interactions of meteorology, chemistry, transport, diffusion, etc. The combined information about their effect on the pollutant concentration is therefore contained, in a stochastic way, in the corresponding time series. With this approach, calculations are simplified and performed using only the time series of the pollutant, without explicit inclusion of meteorological or other measurements.
Four professors from Plovdiv University in Bulgaria produced a research paper on time series analysis of air pollution. The methods used were explicitly stated in their article as:
(i) Identify correlation-type dependencies and groupings of the observed air pollutants using the method of factor analysis, to explain mutual effects of pollution.
(ii) Conduct time series analysis by determining relevant parametric seasonal ARIMA models of the pollutants (based on hourly data).
(iii) Analysis and diagnostics of the constructed models.
(iv) Application of the models for short-term forecasting.
(v) Interpretation of the results and definition of the conditions contributing to the exceeding of national and European concentration norms for the considered air pollutants.
Their study was carried out using IBM SPSS 19 and EViews 7.[3]
1 INTRODUCTION
Even though there are established regulations for monitoring and controlling effects on air quality in certain territories, air quality may remain unsatisfactory. Let us consider the particular case of the town of Blagoevgrad, Bulgaria. Blagoevgrad is a typical representative of a small urban region, with a population of approximately 70,000.
Time span of the study: over a one-year period from September 1st, 2011 to August 31st, 2012, six air pollutants were observed, based on hourly measurements. Factor analysis and Box-Jenkins methodology were applied to inspect concentrations of the primary air pollutants of interest. The pollutants were grouped into three factors, and the degree of contribution of the factors to the overall pollution was determined; this contribution was interpreted as the presence of common sources of pollution. The classical techniques of principal component analysis (PCA) and factor analysis are important statistical instruments frequently used in the environmental sciences.
The focus of the study was the performance of time series analysis and the development of univariate stochastic seasonal autoregressive integrated moving average (SARIMA) models, with seasonality based on the hourly recordings. The study incorporates the Yeo-Johnson power transformation to stabilize the variance of the data, and model selection using the Bayesian Information Criterion. The SARIMA models obtained in the Bulgaria study demonstrated a good fit to the observed air pollutants and good short-term predictions for 72 hours ahead, specifically in the case of ozone and particulate matter PM10. The methods presented allowed the building of less complex models that are effective for short-term air pollution forecasting and useful for advance warning purposes in urban areas.[3]
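As an illustration of the model selection step, the Bayesian Information Criterion can be written as BIC = k ln(n) − 2 ln(L), where k is the number of estimated parameters, n the number of observations and L the maximized likelihood; the candidate with the lowest BIC is preferred. The following is a minimal sketch in Python, not the SPSS/EViews workflow of the Bulgaria study. The function names, the candidate model labels and the log-likelihood values are hypothetical and chosen only for illustration.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_by_bic(candidates, n_obs):
    """Return the label of the candidate with the lowest BIC.

    candidates maps a model label to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_obs))


# Hypothetical fitted models; the log-likelihoods are made up for illustration.
candidates = {
    "SARIMA(1,0,1)(1,1,1)24": (-1510.0, 4),
    "SARIMA(2,0,2)(1,1,1)24": (-1508.5, 6),
}
best = select_by_bic(candidates, n_obs=8744)
```

In this made-up example the larger model has a slightly higher likelihood, but the penalty term k ln(n) outweighs the gain, so the smaller model is selected.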
Continuous and careful monitoring and forecasting of atmospheric air pollutants is important when evaluating regulatory control measures related to air quality. In Bulgaria, 12 types of pollutants are systematically monitored by more than 36 automated stations run by the Executive Environment Agency (EEA), which manages and coordinates activities related to the control and environmental protection of the country. Atmospheric air quality reports for the various regions of the country are regularly published, and from this much data is accumulated. This accumulated data is what allows us to carry out statistical analysis, which leads to the discovery of general patterns and dependencies for different time periods and of relationships between observed air pollutants. The air pollutants observed in the study carried out in Blagoevgrad, Bulgaria are particulate matter PM10, nitrogen oxide NO, nitrogen dioxide NO2, nitrogen oxides NOx, sulfur dioxide SO2, and ground level ozone O3. The measurements are expressed in units of mass concentration of pollutants, µg/m³; only NOx is in units of ppb (parts per billion), as it measures pollution from all kinds of nitrogen oxides. The data consisted of 8,744 observations (hourly data). The goal of their study was to demonstrate the capabilities of the mentioned methods, which can be applied to other recorded data sets, including for shorter and longer periods of time.
2 Yeo-Johnson transformation, Kolmogorov-Smirnov Test for normality
Time series data often requires preparation before forecasting methods are applied. A normal or near-normal distribution of the univariate data is important because it reduces issues when we forecast future values. The obtained K-S statistic indicated non-normality of the data collected in Bulgaria, which led to the transformation of the data prior to constructing the forecasting models. In that particular case the Yeo-Johnson transformation was carried out, after which the data satisfied the Kolmogorov-Smirnov test for normality at the 0.05 level of significance and may be assumed to be normally distributed. The Yeo-Johnson transformation finds the optimal value of lambda that minimizes the Kullback-Leibler¹ distance between the normal distribution and the transformed distribution.[1][2]
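The Yeo-Johnson transformation is defined as:
$$
g(x; \lambda) =
\begin{cases}
\dfrac{(x + 1)^{\lambda} - 1}{\lambda}, & \lambda \neq 0,\ x \geq 0 \\
\log(x + 1), & \lambda = 0,\ x \geq 0 \\
\dfrac{(1 - x)^{2 - \lambda} - 1}{\lambda - 2}, & \lambda \neq 2,\ x < 0 \\
-\log(1 - x), & \lambda = 2,\ x < 0
\end{cases}
$$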
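To make the transformation and the normality check concrete, the following is a minimal sketch in Python; it is not the SPSS/EViews or R code used in either study. It implements the four cases of g(x; λ) and a one-sample Kolmogorov-Smirnov distance to a normal distribution fitted with the sample mean and standard deviation. The function names and the sample data are hypothetical, and only the K-S distance is computed: turning it into a formal test at the 0.05 level requires additional care when the mean and standard deviation are estimated from the same sample.

```python
import math
from statistics import NormalDist, mean, stdev


def yeo_johnson(x, lam):
    """Yeo-Johnson transform g(x; lam) of a single value."""
    if x >= 0:
        if lam == 0:
            return math.log(x + 1.0)
        return ((x + 1.0) ** lam - 1.0) / lam
    if lam == 2:
        return -math.log(1.0 - x)
    return ((1.0 - x) ** (2.0 - lam) - 1.0) / (lam - 2.0)


def ks_statistic_normal(sample):
    """K-S distance between the empirical CDF of the sample and a normal
    CDF fitted with the sample mean and standard deviation."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # Compare the fitted CDF with the empirical CDF just after and just before x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# Hypothetical right-skewed "concentrations" (not data from either study).
raw = [math.exp(i / 10.0) for i in range(50)]
transformed = [yeo_johnson(x, 0.0) for x in raw]

d_raw = ks_statistic_normal(raw)
d_transformed = ks_statistic_normal(transformed)
```

For this artificial sample the distance after the transformation (`d_transformed`) is smaller than before it (`d_raw`), which is the behavior the transformation is meant to produce; λ = 0 is fixed here for simplicity, whereas in practice λ is estimated from the data.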
3 Factor Analysis and PCA
The statistical techniques of factor analysis and principal component analysis help identify patterns in the correlation between variables. The patterns identified are used to create factors, which was the case in Bulgaria and allowed the grouping of correlated pollutants. The steps followed for the particular case in Bulgaria were: (a) calculation of the correlation matrix, (b) testing the adequacy of factor analysis, (c) factor extraction, (d) factor rotation, and (e) score calculation of factor variables. The particular advantages of these methods are that they reveal strong correlation relationships between observed variables and allow their grouping into new variables (factors) in order to reduce the dimensions of the complex data structure. The factors can thereafter be used to build regression or other types of models.[5]
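Step (a), the calculation of the correlation matrix, can be sketched as follows. This is a minimal illustration in Python using Pearson correlation, not the software used in the Bulgaria study; the function names and the measurement values are hypothetical, and steps (b) through (e) are not covered.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> measurements."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical measurements for three pollutants (not data from either study).
data = {
    "NO": [1.0, 2.0, 3.0, 4.0],
    "NOx": [2.0, 4.0, 6.0, 8.0],
    "O3": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(data)
```

Entries close to +1 or −1, such as the NO/NOx and NO/O3 pairs in this artificial example, indicate variables that are candidates for grouping into a common factor.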
4 Box Jenkins Methodology
Other methods frequently used in time series analysis and forecasting are the autoregressive integrated moving average (ARIMA) and seasonal ARIMA (SARIMA) models, also known as Box-Jenkins stochastic models. Box-Jenkins methodology is widely applied in air quality research, among other disciplines, and is a systematic strategy for identifying, fitting, and forecasting univariate time series data. ARIMA models generally take the form ARIMA(p,d,q), where p is the number of parameters describing the autoregressive process, d is the number of nonseasonal differences needed to reach stationarity, and q is the number of lagged forecast errors in the prediction equation. Similarly, the SARIMA models take the general form ARIMA(p,d,q)(P,D,Q)s, where P is the number of seasonal autoregressive terms, D is the order of seasonal differencing, and Q is the number of seasonal moving average terms. In the seasonal part of the model, the three parameters P, D, Q operate across multiples of lag s, where s is the number of time periods until a pattern repeats itself.
¹ In mathematical statistics, the Kullback-Leibler divergence (also called relative entropy) is a measure of how one probability distribution is different from a second, reference probability distribution.
Main advantages of the Box-Jenkins approach:
(i) Applicability for modeling and forecasting practically any time series that is stationary or can be reduced to stationary by a differencing procedure.
(ii) Ability to extract all the trends and serial correlations in the data, leaving a minimal sequence of white noise (shocks), by including them in one general model equation that captures the basis of the historical development of the data.
(iii) The method has been incorporated into many standard software packages available within R, SPSS, etc., which speeds up and assists the modeling process considerably.
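The differencing procedure mentioned in advantage (i), with the orders d and D and the seasonal lag s from the ARIMA(p,d,q)(P,D,Q)s notation, can be sketched as follows. This is a minimal illustration in Python, not the routine used by any particular package; the function names and the example series are hypothetical.

```python
def difference(series, lag=1, order=1):
    """Apply the lag-`lag` difference y[t] - y[t - lag], `order` times."""
    out = list(series)
    for _ in range(order):
        out = [out[t] - out[t - lag] for t in range(lag, len(out))]
    return out


def sarima_difference(series, d, D, s):
    """Seasonal differencing of order D at lag s, then d regular differences."""
    return difference(difference(series, lag=s, order=D), lag=1, order=d)


# Hypothetical series: a linear trend plus a pattern that repeats every 4 steps.
pattern = [0.0, 2.0, 5.0, 1.0]
series = [0.5 * t + pattern[t % 4] for t in range(16)]
stationary = sarima_difference(series, d=1, D=1, s=4)
```

For this artificial series the seasonal difference at lag 4 removes the repeating pattern and leaves a constant, and the additional regular difference reduces that constant to zero. Each differencing pass shortens the series by its lag.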
5 Personal Study Carried out in R
Using the presented methods, I was able to carry out my own study using the statistical software R. Using data provided by the Environmental Protection Agency (EPA) here in the United States (https://www.epa.gov/outdoor-air-quality-data), I accessed pollutant concentration data for the city of Richmond, Virginia, which has a population of approximately 220,000.
Time span of observed data: a total of 4 years of data was accessed, from January 2010 to December 2013, based on weekly measurements of the following air pollutants. Concentrations of particulate matter PM2.5, particulate matter PM10, and lead Pb are expressed in units of mass concentration (µg/m³); carbon monoxide CO and ground level ozone O3 are in units of ppm (parts per million); sulfur dioxide SO2 and nitrogen dioxide NO2 are in units of ppb (parts per billion). The goal of my personal research is to apply the time series analysis and forecasting methods from the research paper produced in Bulgaria to a local city here in the US. As was the case in Bulgaria, once these methods are applied to the Richmond pollutant data, I hope to visually show an appropriate forecast for each pollutant for the year 2013.
Before proceeding, I would like to point out that while the research paper concerning Bulgaria highlighted a factor analysis and principal component analysis approach, the correlation matrix calculated in R for the Richmond pollutant data sets displayed no signs of positive or negative correlation between the pollutants. Therefore I did not carry out any factor analysis or PCA. Also, the 2013 pollutant data sets were used strictly to compare our forecast models to the actual data recorded by the EPA in 2013.
Directly below is the correlation matrix for all 7 pollutants concerning data over the time
span of the years 2010, 2011, and 2012.
Analyzing PM2.5 using 3 year data
The first pollutant we will analyze is particulate matter PM2.5.
The lambda value used to transform the original PM2.5 observations was λ = 0.227158.
Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our forecast and ARIMA models, the next step was to access our 2013 pollutant concentration data for PM2.5 and compare each model to see how accurately it predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original PM2.5 2013 observations was λ = 0.05030683.
Finally, once we plot the forecast and ARIMA models against the 2013 time series plot, I believe the auto.arima function is most appropriate for predicting the 2013 values of the pollutant PM2.5.
Using only 2012 data to predict 2013 values
The lambda value used to transform the original PM2.5 observations for the year 2012 was λ = 0.7078218.
Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our ARIMA model, the next step was to access our 2013 pollutant concentration data for PM2.5 to see how accurately auto.arima() predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original PM2.5 2013 observations was λ = 0.05030683.
Finally, once we plot the ARIMA model against the 2013 time series plot, I believe the auto.arima function is somewhat appropriate for predicting the trend of the pollutant PM2.5 for the year 2013.
Analyzing PM10 using 3 year data
The second pollutant we will analyze is particulate matter PM10.
The lambda value used to transform the original PM10 observations was λ = 0.2409915.
Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our forecast and ARIMA models, the next step was to access our 2013 pollutant concentration data for PM10 and compare each model to see how accurately it predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original PM10 2013 observations was λ = 0.7845362.
Finally, once we plot the forecast and ARIMA models against the 2013 time series plot, I believe neither the forecast nor the auto.arima function is appropriate for predicting the 2013 values of the pollutant PM10.
Using only 2012 data to predict 2013 values
The lambda value used to transform the original PM10 observations for the year 2012 was λ = −0.04297711.
Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R.
The auto.arima function did not yield an appropriate time series plot in R.
Analyzing Pb (Lead) using 3 year data
The third pollutant we will analyze is lead Pb.
The lambda value used to transform the original Pb observations was λ = −4.99994.
Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our forecast and ARIMA models, the next step was to access our 2013 pollutant concentration data for Pb and compare each model to see how accurately it predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original Pb 2013 observations was λ = −4.99994.
Finally, once we plot the forecast and ARIMA models against the 2013 time series plot, I believe the auto.arima function is most appropriate for predicting the 2013 values of the pollutant Pb.
Using only 2012 data to predict 2013 values
The lambda value used to transform the original Pb (Lead) observations for the year 2012 was λ = −4.99994.
Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R.
The auto.arima function did not yield an appropriate time series plot in R.
Analyzing CO using 3 year data
The fourth pollutant we will analyze is carbon monoxide CO.
The lambda value used to transform the original CO observations was λ = −3.577325.
Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our forecast and ARIMA models, the next step was to access our 2013 pollutant concentration data for CO and compare each model to see how accurately it predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original CO 2013 observations was λ = −2.432302.
Finally, once we plot the forecast and ARIMA models against the 2013 time series plot, I believe the auto.arima function is most appropriate for predicting the 2013 values of the pollutant CO.
Using only 2012 data to predict 2013 values
The lambda value used to transform the original CO observations for the year 2012 was λ = −3.641187.
Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our ARIMA model, the next step was to access our 2013 pollutant concentration data for CO and see how accurately the ARIMA model predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original CO 2013 observations was λ = −2.432302.
Finally, once we plot the ARIMA model against the 2013 time series plot, I believe the auto.arima function is most appropriate for predicting the trend of the pollutant CO for 2013.
Analyzing O3 using 3 year data
The fifth pollutant we will analyze is ground level ozone O3.
The lambda value used to transform the original O3 observations was λ = 3.615548.
Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our forecast and ARIMA models, the next step was to access our 2013 pollutant concentration data for O3 and compare each model to see how accurately it predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original O3 2013 observations was λ = 4.99994.
Finally, once we plot the forecast and ARIMA models against the 2013 time series plot, I believe the forecast function is most appropriate for predicting the 2013 values of the pollutant O3.
Using only 2012 data to predict 2013 values
The lambda value used to transform the original O3 observations for the year 2012 was λ = 4.99994.
Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R.
The auto.arima function did not yield an appropriate time series plot in R.
Analyzing SO2 using 3 year data
The sixth pollutant we will analyze is sulfur dioxide SO2.
The lambda value used to transform the original SO2 observations was λ = −0.227093.
Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our forecast and ARIMA models, the next step was to access our 2013 pollutant concentration data for SO2 and compare each model to see how accurately it predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original SO2 2013 observations was λ = 0.2616144.
Finally, once we plot the forecast and ARIMA models against the 2013 time series plot, I believe the forecast function is most appropriate for predicting the 2013 values of the pollutant SO2.
Using only 2012 data to predict 2013 values
The lambda value used to transform the original SO2 observations for the year 2012 was λ = −0.1123281.
Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R.
The auto.arima function did not yield an appropriate time series plot in R.
Analyzing NO2 using 3 year data
The seventh and final pollutant we will analyze is nitrogen dioxide NO2.
The lambda value used to transform the original NO2 observations was λ = 0.9783584.
Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our forecast and ARIMA models, the next step was to access our 2013 pollutant concentration data for NO2 and compare each model to see how accurately it predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original NO2 2013 observations was λ = 1.003092.
Finally, once we plot the forecast and ARIMA models against the 2013 time series plot, I believe the auto.arima function is most appropriate for predicting the 2013 values of the pollutant NO2.
Using only 2012 data to predict 2013 values
The lambda value used to transform the original NO2 observations for the year 2012 was λ = 1.229131.
Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R.
Directly below is the time series plot using the auto.arima function in R.
Now that we have our ARIMA model, the next step was to access our 2013 pollutant concentration data for NO2 and see how accurately the model predicted the 2013 values.
Before I created the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original NO2 2013 observations was λ = 1.003092.
Finally, once we plot the ARIMA model against the 2013 time series plot, I believe the auto.arima function is most appropriate for predicting the trend of the pollutant NO2 for 2013.
6 R Code Explanation and Software Packages Used
The following R packages were used: MASS, bestNormalize, forecast.
• From MASS, the function truehist was used to plot histograms of the pollutant data before and after the Yeo-Johnson transformation was applied, to visually show the change from a non-normal to an approximately normal distribution of the data.
• From bestNormalize, the function yeojohnson was used to transform the pollutant data from non-normal to approximately normally distributed, in order to better carry out our statistical analysis.
• From forecast, the functions forecast and auto.arima were used. These played the most important role in analyzing prior pollutant observations and forecasting future values for each pollutant.
The main functions that I will highlight in this section are the forecast() and auto.arima() functions in R, but I will also briefly explain my usage of the ts() and yeojohnson() functions. It was important to my study that level=F was set within the forecast function: while confidence intervals in our graphs could be useful, they were not needed for my study, since I was mostly interested in the specific values that the forecast function gave in its output. It was also very important that we forecast exactly 59 future values, because there are exactly 59 values in our EPA 2013 data for each pollutant. In the auto.arima() function no restrictions needed to be specified, but it was most important that we accessed our forecast values through auto.arima()$f; for reference, the original values that were put into the function can be accessed through auto.arima()$x.
One last note: when plotting the time series for the 3 year data, each ts() call uses frequency=(58), which I interpret as an average of 58 observations per year. I obtained 58 by dividing the total number of observations in our 3 year data by 3, so 174/3 = 58. Within the yeojohnson() function you will notice that standardize=FALSE. If this is not declared, R by default further standardizes the transformed values. I did not find the further standardization useful when dealing with the Richmond data, mainly because the Yeo-Johnson transformation itself was of interest in the Bulgaria study, so I wanted to follow that transformation as it is, without further standardization.
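The comparison of forecast values against held-out observations was made visually in this study. As a complement, the same comparison can be summarized numerically; the following is a minimal sketch in Python (not the R code used in the study) of two common error measures, assuming the forecast and the observed values have the same length and are expressed on the same scale. The function names and all numbers are hypothetical; in the study each list would hold 59 values.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical held-out values and two competing forecasts (made-up numbers).
actual = [2.0, 2.5, 3.0, 2.0]
forecast_a = [2.0, 2.0, 3.0, 3.0]
forecast_b = [1.0, 3.5, 4.0, 1.0]

scores = {"A": rmse(actual, forecast_a), "B": rmse(actual, forecast_b)}
closest = min(scores, key=scores.get)
```

A lower value means the forecast tracks the held-out observations more closely; in this made-up example forecast A is the closer of the two.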
7 Conclusion
In the Bulgaria study, the researchers' main goal was to use the ARIMA models to forecast 72 hours ahead, because they used hourly data. Similarly, I feel it necessary to highlight the importance of the auto.arima() function in helping forecast the year 2013. While it was not helpful for forecasting all pollutants, it was definitely more helpful than the forecast() function in identifying the trend or behavior of each pollutant throughout the year(s). The most important finding I came across was that the 2012 data alone was not enough in most cases when attempting to forecast a future year, but the combined 3 year (2010, 2011, 2012) data allowed both the forecast() and auto.arima() functions to display their usefulness when forecasting. I certainly enjoyed preparing this study and learning about time series, and I hope that I am given the opportunity to further explore this discipline in the future.
References
[1] Kullback, S. (1959). Information Theory and Statistics. John Wiley and Sons. Republished by Dover Publications in 1968; reprinted in 1978: ISBN 0-8446-5625-9.
[2] Yeo, I. K., and Johnson, R. A. (2000). A new family of power transformations to improve normality or symmetry. Biometrika.
[3] Gocheva-Ilieva, Snezhana; Ivanov, A; Voynikova, Desislava; Boyadzhiev, Doychin. (2013). Time series analysis and forecasting for air pollution in small urban area: An SARIMA and factor analysis approach. Stochastic Environmental Research and Risk Assessment. 28. 1045-1060. doi:10.1007/s00477-013-0800-4.
[4] Alcosser, Howard. "Diamond Bar High School" Internal Assessment: Mathematical Exploration. Web. 27 May 2015.
[5] Jolliffe, Ian. (1986). Principal Component Analysis and Factor Analysis. doi:10.1007/978-1-4757-1904-8_7.
31

More Related Content

What's hot

International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
IJERD Editor
 
Integration Method of Local-global SVR and Parallel Time Variant PSO in Water...
Integration Method of Local-global SVR and Parallel Time Variant PSO in Water...Integration Method of Local-global SVR and Parallel Time Variant PSO in Water...
Integration Method of Local-global SVR and Parallel Time Variant PSO in Water...
TELKOMNIKA JOURNAL
 
IRJET- Rainfall Forecasting using Regression Techniques
IRJET- Rainfall Forecasting using Regression TechniquesIRJET- Rainfall Forecasting using Regression Techniques
IRJET- Rainfall Forecasting using Regression Techniques
IRJET Journal
 
paper mikrotremor
paper mikrotremorpaper mikrotremor
paper mikrotremor
Ahmad Al Imbron
 
Refining Underwater Target Localization and Tracking Estimates
Refining Underwater Target Localization and Tracking EstimatesRefining Underwater Target Localization and Tracking Estimates
Refining Underwater Target Localization and Tracking Estimates
CSCJournals
 
Am4103223229
Am4103223229Am4103223229
Am4103223229
IJERA Editor
 
Time Series Data Analysis for Forecasting – A Literature Review
Time Series Data Analysis for Forecasting – A Literature ReviewTime Series Data Analysis for Forecasting – A Literature Review
Time Series Data Analysis for Forecasting – A Literature Review
IJMER
 
Mechanistic models
Mechanistic modelsMechanistic models
Mechanistic models
MOHIT MAYOOR
 
IAS_SRF_Project_2015
IAS_SRF_Project_2015IAS_SRF_Project_2015
IAS_SRF_Project_2015
Swatah Snigdha Borkotoky
 
Global Sensitivity Analysis for the Calibration of a Fully-distributed Hydrol...
Time Series Analysis

  • 4. 1 INTRODUCTION
Even though there are established regulations for monitoring and controlling effects on air quality in certain territories, air quality may remain unsatisfactory. Let us consider the particular case of the town of Blagoevgrad, Bulgaria. Blagoevgrad is a typical representative of a small urban region, with a population of approximately 70,000.
Time span of study: a one-year period from September 1, 2011 to August 31, 2012. Six air pollutants were observed, based on hourly measurements. Factor analysis and Box-Jenkins methodology were applied to inspect concentrations of the primary air pollutants of interest. The pollutants were grouped into three factors, and the degree of contribution of each factor to the overall pollution was determined; this contribution was interpreted as the presence of common sources of pollution. The classical techniques of principal component analysis (PCA) and factor analysis are important statistical instruments frequently used in the environmental sciences.
The focus of the study was time series analysis and the development of univariate stochastic seasonal autoregressive integrated moving average (SARIMA) models, with the hourly recording basis taken as the seasonality. The study incorporates the Yeo-Johnson power transformation to stabilize the variance of the data, and model selection using the Bayesian Information Criterion. The SARIMA models obtained in the Bulgarian study fitted the observed air pollutants well and gave good short-term predictions for 72 hours ahead, specifically in the case of ozone and particulate matter PM10.
The methods presented allowed the building of less complex models that are effective for short-term air pollution forecasting and useful for advance warning purposes in urban areas.[3]
Continuous and careful monitoring and forecasting of atmospheric air pollutants is important when evaluating regulatory control measures related to air quality. In Bulgaria, 12 types of pollutants are systematically monitored by more than 36 automated stations run by the Executive Environment Agency (EEA), which manages and coordinates activities related to the control and environmental protection of the country. Atmospheric air quality reports for the various regions of the country are regularly published, and a large amount of data is accumulated as a result. This accumulated data is what allows statistical analysis to be carried out, leading to the discovery of general patterns and dependencies for different time periods and of relationships between observed air pollutants.
The air pollutants observed in the Blagoevgrad study are concentrations of particulate matter PM10, nitrogen oxide NO, nitrogen dioxide NO2, nitrogen oxides NOx, sulfur dioxide SO2, and ground-level ozone O3. The measurements are expressed in units of mass concentration (µg/m³); only NOx is in ppb (parts per billion), as it covers pollution from all kinds of nitrogen oxides. The data consisted of 8,744 hourly observations. The goal of the study was to demonstrate the capabilities of the mentioned methods, which can also be applied to other recorded data sets, including those covering shorter or longer periods of time.
  • 5. 2 Yeo-Johnson transformation, Kolmogorov-Smirnov Test for normality
Time series data often require preparation before forecasting methods are applied. A normal or near-normal distribution of the univariate data is important because it reduces issues when forecasting future values. The Kolmogorov-Smirnov (K-S) statistic obtained for the data collected in Bulgaria indicated non-normality, which led to the transformation of the data prior to constructing the forecasting models. In that particular case the Yeo-Johnson transformation was carried out, after which the data satisfied the Kolmogorov-Smirnov test for normality at the 0.05 level of significance and could be assumed to be normally distributed. The Yeo-Johnson procedure finds the optimal value of lambda that minimizes the Kullback-Leibler¹ distance between the normal distribution and the transformed distribution.[1][2]
The Yeo-Johnson transformation is defined as:
\[
g(x;\lambda) =
\begin{cases}
\dfrac{(x+1)^{\lambda}-1}{\lambda}, & \lambda \neq 0,\ x \geq 0 \\
\log(x+1), & \lambda = 0,\ x \geq 0 \\
\dfrac{(1-x)^{2-\lambda}-1}{\lambda-2}, & \lambda \neq 2,\ x < 0 \\
-\log(1-x), & \lambda = 2,\ x < 0
\end{cases}
\]
3 Factor Analysis and PCA
The statistical techniques of factor analysis and principal component analysis help identify patterns in the correlation between variables. The patterns identified are used to create factors; in the Bulgarian study this allowed the grouping of correlated pollutants. The steps followed in that study were: (a) calculation of the correlation matrix, (b) testing the adequacy of factor analysis, (c) factor extraction, (d) factor rotation, and (e) score calculation of factor variables. The particular advantages of these methods are that they reveal strong correlation relationships between observed variables and allow their grouping into new variables (factors) in order to reduce the dimensions of the complex data structure.
The factors can thereafter be used to build regression or other types of models.[5]
4 Box Jenkins Methodology
Other methods frequently used in time series analysis and forecasting are the autoregressive integrated moving average (ARIMA) and seasonal ARIMA (SARIMA) models, also known as Box-Jenkins stochastic models. Box-Jenkins methodology is widely applied in air quality research, among other disciplines, and is a systematic strategy for identifying, fitting, and forecasting univariate time series data. ARIMA models generally take the form ARIMA(p,d,q)
¹ In mathematical statistics, the Kullback-Leibler divergence (also called relative entropy) is a measure of how one probability distribution differs from a second, reference probability distribution.
  • 6. where p is the number of parameters describing the autoregressive process, d is the number of nonseasonal differences needed to reach stationarity, and q is the number of lagged forecast errors in the prediction equation. Similarly, SARIMA models take the general form ARIMA(p,d,q)(P,D,Q)s, where P is the number of seasonal autoregressive terms, D is the order of seasonal differencing, and Q is the number of seasonal moving average terms. In the seasonal part of the model, the three parameters P, D, Q operate across multiples of lag s, where s is the number of time periods until a pattern repeats itself.
Main advantages of the Box-Jenkins approach:
(i) Applicability for modeling and forecasting practically any time series that is stationary or can be reduced to stationary by a differencing procedure.
(ii) Ability to extract all the trends and serial correlations in the data with a minimized sequence of white noise (shock) through inclusion in one general model equation that gets to the basis of historical data development.
(iii) The method has been incorporated into many standard software packages, such as R and SPSS, which speeds up and assists the modeling process considerably.
5 Personal Study Carried out in R
Using the presented methods, I carried out my own study in the statistical software R. Using data provided by the Environmental Protection Agency (EPA) here in the United States (https://www.epa.gov/outdoor-air-quality-data), I accessed pollutant concentration data for the city of Richmond, Virginia, which has a population of approximately 220,000.
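As a brief aside on the differencing orders d and D described in the Box-Jenkins section, the following is a minimal sketch in Python (the study itself used R). The function name `difference` and the toy series are illustrative assumptions, not part of the original analysis. It shows how a seasonal difference at lag s cancels a pattern that repeats every s steps, and how a further nonseasonal difference removes the constant drift that is left.

```python
def difference(series, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Toy series (invented for illustration): a linear trend plus a
# pattern that repeats every s = 4 time steps.
s = 4
pattern = [0.0, 2.0, -1.0, 3.0]
series = [0.5 * t + pattern[t % s] for t in range(16)]

# One seasonal difference (D = 1) at lag s cancels the repeating pattern
# and leaves only the constant drift contributed by the trend.
seasonal_diff = difference(series, lag=s)

# One nonseasonal difference (d = 1) then removes that constant drift.
both_diff = difference(seasonal_diff, lag=1)

print(seasonal_diff)
print(both_diff)
```

Each differencing pass shortens the series by its lag, which is why a model with larger d, D, or s has fewer effective observations to fit.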
Time Span of Observed Data: A total of 4 years of data was accessed, covering January 2010 to December 2013 and based on weekly measurements of the following air pollutants. Concentrations of particulate matter PM2.5, particulate matter PM10 and lead Pb are expressed in units of mass concentration (µg/m3); carbon monoxide CO and ground level ozone O3 are in ppm (parts per million); sulfur dioxide SO2 and nitrogen dioxide NO2 are in ppb (parts per billion).

The goal of my personal research is to apply the time series analysis and forecasting methods from the research paper produced in Bulgaria to a local city here in the US. As was the case in Bulgaria, once these methods are applied to the Richmond pollutant data I hope to show visually an appropriate forecast for each pollutant for the year 2013.

Before proceeding, I would like to point out that while the research paper concerning Bulgaria highlighted a factor analysis and principal component analysis approach, the correlation matrix calculated in R for the Richmond pollutant data sets showed no notable positive or negative correlation between the pollutants. I therefore did not carry out any factor analysis or PCA. Also, the 2013 pollutant data sets were used strictly to compare our forecast models to the actual data recorded by the EPA in 2013.
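The correlation check described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `pollutants` and its simulated values are hypothetical stand-ins, not the EPA data used in the study, and the study's own code may have computed the matrix differently.

```r
# Hypothetical stand-in for the seven pollutant series (2010-2012).
# The values are simulated; the real study used EPA data for Richmond.
set.seed(1)
n_obs <- 174  # number of observations in the 3-year data, as stated in Section 6
pollutants <- data.frame(
  PM2.5 = rnorm(n_obs), PM10 = rnorm(n_obs), Pb  = rnorm(n_obs),
  CO    = rnorm(n_obs), O3   = rnorm(n_obs), SO2 = rnorm(n_obs),
  NO2   = rnorm(n_obs)
)

# Pairwise Pearson correlations; "complete.obs" drops rows with missing values.
cor_mat <- cor(pollutants, use = "complete.obs")
print(round(cor_mat, 2))

# Largest absolute off-diagonal correlation: a quick summary of whether
# any pair of pollutants moves together strongly enough to justify
# factor analysis or PCA.
max_off_diag <- max(abs(cor_mat[upper.tri(cor_mat)]))
print(max_off_diag)
```

With independent simulated columns the off-diagonal entries are small, which mirrors the situation reported for the Richmond data.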
Directly below is the correlation matrix for all 7 pollutants, computed from the data for the years 2010, 2011, and 2012.

Analyzing PM2.5 using 3 year data

The first pollutant we will analyze is particulate matter PM2.5. The lambda value used to transform the original PM2.5 observations was λ = 0.227158. Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
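For readers unfamiliar with the transformation, the Yeo-Johnson formula [2] for a fixed λ can be sketched in base R as below. The study itself used yeojohnson() from the bestNormalize package, which also estimates λ; the function name `yeo_johnson` and the sample values here are hypothetical and only illustrate the formula.

```r
# Yeo-Johnson power transformation for a given lambda (no standardization).
# Piecewise definition:
#   y >= 0, lambda != 0 : ((y + 1)^lambda - 1) / lambda
#   y >= 0, lambda == 0 : log(y + 1)
#   y <  0, lambda != 2 : -((1 - y)^(2 - lambda) - 1) / (2 - lambda)
#   y <  0, lambda == 2 : -log(1 - y)
yeo_johnson <- function(y, lambda, eps = 1e-8) {
  out <- numeric(length(y))
  pos <- y >= 0
  if (abs(lambda) < eps) {
    out[pos] <- log(y[pos] + 1)
  } else {
    out[pos] <- ((y[pos] + 1)^lambda - 1) / lambda
  }
  if (abs(lambda - 2) < eps) {
    out[!pos] <- -log(1 - y[!pos])
  } else {
    out[!pos] <- -(((1 - y[!pos])^(2 - lambda) - 1) / (2 - lambda))
  }
  out
}

# Hypothetical concentrations, transformed with the lambda reported
# for the 3-year PM2.5 data.
pm25_example <- c(4.2, 7.9, 12.5, 9.1)
pm25_transformed <- yeo_johnson(pm25_example, lambda = 0.227158)
print(pm25_transformed)
```

With λ = 1 the transformation leaves the data unchanged, which gives a quick sanity check of any implementation.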
Directly below is the time series plot using only the forecast function in R. Directly below is the time series plot using the auto.arima function in R.

Now that we have our forecast and ARIMA models, the next step was to access the 2013 pollutant concentration data for PM2.5 and compare each model to see how accurately it predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original PM2.5 2013 observations was λ = 0.05030683. Finally, having plotted the forecast and ARIMA models against the 2013 time series, I believe the auto.arima function is the most appropriate for predicting the 2013 values of the pollutant PM2.5.
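The comparison in this study is visual. One possible way to put a number on how closely a set of predictions tracks the observed 2013 values is an error measure such as RMSE or MAE, sketched below in base R. The helper names `rmse` and `mae` and all numeric values are hypothetical, and such a comparison is only meaningful when predictions and observations are on the same scale (for example, transformed with the same λ).

```r
# Root mean squared error and mean absolute error between two
# equally long numeric vectors.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
mae  <- function(actual, predicted) mean(abs(actual - predicted))

# Hypothetical observed values and two hypothetical sets of predictions.
actual_2013   <- c(2.1, 2.4, 1.9, 2.8, 2.2)
pred_forecast <- c(2.3, 2.3, 2.3, 2.3, 2.3)
pred_arima    <- c(2.0, 2.5, 2.0, 2.6, 2.3)

# Lower values indicate predictions closer to the observations.
scores <- data.frame(
  model = c("forecast", "auto.arima"),
  RMSE  = c(rmse(actual_2013, pred_forecast), rmse(actual_2013, pred_arima)),
  MAE   = c(mae(actual_2013, pred_forecast),  mae(actual_2013, pred_arima))
)
print(scores)
```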
Using only 2012 data to predict 2013 values

The lambda value used to transform the original PM2.5 observations for the year 2012 was λ = 0.7078218.
Directly below is the time series plot for 2012 after a Yeo-Johnson transformation. The forecast function alone did not yield an appropriate time series plot in R. Directly below is the time series plot using the auto.arima function in R.

Now that we have our ARIMA model, the next step was to access the 2013 pollutant concentration data for PM2.5 to see how accurately auto.arima() predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original PM2.5 2013 observations was λ = 0.05030683. Finally, having plotted the ARIMA model against the 2013 time series, I believe the auto.arima function is somewhat appropriate for predicting the trend of the pollutant PM2.5 for the year 2013.
Analyzing PM10 using 3 year data

The second pollutant we will analyze is particulate matter PM10. The lambda value used to transform the original PM10 observations was λ = 0.2409915. Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation. Below is the time series plot using only the forecast function in R.
Directly below is the time series plot using the auto.arima function in R.

Now that we have our forecast and ARIMA models, the next step was to access the 2013 pollutant concentration data for PM10 and compare each model to see how accurately it predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original PM10 2013 observations was λ = 0.7845362. Finally, having plotted the forecast and ARIMA models against the 2013 time series, I believe neither the forecast nor the auto.arima function is appropriate for predicting the 2013 values of the pollutant PM10.
Using only 2012 data to predict 2013 values

The lambda value used to transform the original PM10 observations for the year 2012 was λ = −0.04297711. Below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R, and neither did the auto.arima function.

Analyzing Pb (Lead) using 3 year data

The third pollutant we will analyze is lead Pb. The lambda value used to transform the original Pb observations was λ = −4.99994. Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R. Directly below is the time series plot using the auto.arima function in R.

Now that we have our forecast and ARIMA models, the next step was to access the 2013 pollutant concentration data for Pb and compare each model to see how accurately it predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original Pb 2013 observations was λ = −4.99994. Finally, having plotted the forecast and ARIMA models against the 2013 time series, I believe the auto.arima function is the most appropriate for predicting the 2013 values of the pollutant Pb.

Using only 2012 data to predict 2013 values

The lambda value used to transform the original Pb (Lead) observations for the year 2012 was λ = −4.99994. Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R, and neither did the auto.arima function.

Analyzing CO using 3 year data

The fourth pollutant we will analyze is carbon monoxide CO. The lambda value used to transform the original CO observations was λ = −3.577325. Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R. Directly below is the time series plot using the auto.arima function in R.

Now that we have our forecast and ARIMA models, the next step was to access the 2013 pollutant concentration data for CO and compare each model to see how accurately it predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original CO 2013 observations was λ = −2.432302.
Finally, having plotted the forecast and ARIMA models against the 2013 time series, I believe the auto.arima function is the most appropriate for predicting the 2013 values of the pollutant CO.

Using only 2012 data to predict 2013 values

The lambda value used to transform the original CO observations for the year 2012 was λ = −3.641187. Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R. Directly below is the time series plot using the auto.arima function in R.

Now that we have our ARIMA model, the next step was to access the 2013 pollutant concentration data for CO and see how accurately the ARIMA model predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original CO 2013 observations was λ = −2.432302. Finally, having plotted the ARIMA model against the 2013 time series, I believe the auto.arima function is the most appropriate for predicting the trend of the pollutant CO for 2013.
Analyzing O3 using 3 year data

The fifth pollutant we will analyze is ground level ozone O3. The lambda value used to transform the original O3 observations was λ = 3.615548. Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R. Directly below is the time series plot using the auto.arima function in R.

Now that we have our forecast and ARIMA models, the next step was to access the 2013 pollutant concentration data for O3 and compare each model to see how accurately it predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original O3 2013 observations was λ = 4.99994. Finally, having plotted the forecast and ARIMA models against the 2013 time series, I believe the forecast function is the most appropriate for predicting the 2013 values of the pollutant O3.
Using only 2012 data to predict 2013 values

The lambda value used to transform the original O3 observations for the year 2012 was λ = 4.99994. Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R, and neither did the auto.arima function.

Analyzing SO2 using 3 year data

The sixth pollutant we will analyze is sulfur dioxide SO2. The lambda value used to transform the original SO2 observations was λ = −0.227093. Directly below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R. Directly below is the time series plot using the auto.arima function in R.

Now that we have our forecast and ARIMA models, the next step was to access the 2013 pollutant concentration data for SO2 and compare each model to see how accurately it predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original SO2 2013 observations was λ = 0.2616144. Finally, having plotted the forecast and ARIMA models against the 2013 time series, I believe the forecast function is the most appropriate for predicting the 2013 values of the pollutant SO2.

Using only 2012 data to predict 2013 values

The lambda value used to transform the original SO2 observations for the year 2012 was λ = −0.1123281. Below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R, and neither did the auto.arima function.

Analyzing NO2 using 3 year data

The seventh and final pollutant we will analyze is nitrogen dioxide NO2. The lambda value used to transform the original NO2 observations was λ = 0.9783584. Below is the time series plot for the 3 years after a Yeo-Johnson transformation.
Directly below is the time series plot using only the forecast function in R. Directly below is the time series plot using the auto.arima function in R.

Now that we have our forecast and ARIMA models, the next step was to access the 2013 pollutant concentration data for NO2 and compare each model to see how accurately it predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original NO2 2013 observations was λ = 1.003092.
Finally, having plotted the forecast and ARIMA models against the 2013 time series, I believe the auto.arima function is the most appropriate for predicting the 2013 values of the pollutant NO2.

Using only 2012 data to predict 2013 values

The lambda value used to transform the original NO2 observations for the year 2012 was λ = 1.229131. Directly below is the time series plot for 2012 after a Yeo-Johnson transformation.
The forecast function alone did not yield an appropriate time series plot in R. Directly below is the time series plot using the auto.arima function in R.

Now that we have our ARIMA model, the next step was to access the 2013 pollutant concentration data for NO2 and see how accurately the model predicted the 2013 values. Before creating the time series plot for 2013, I performed a Yeo-Johnson transformation on the 2013 data. The lambda value used to transform the original NO2 2013 observations was λ = 1.003092. Finally, having plotted the ARIMA model against the 2013 time series, I believe the auto.arima function is the most appropriate for predicting the trend of the pollutant NO2 for 2013.
6 R Code Explanation and Software Packages Used

The following packages in the R software were used: MASS, bestNormalize, forecast.
• From MASS, the function truehist was used to plot histograms of the pollutant data before and after the Yeo-Johnson transformation, to show visually the change from a non-normal to an approximately normal distribution.
• From bestNormalize, the function yeojohnson was used to transform the pollutant data from non-normal to approximately normally distributed, in order to better carry out the statistical analysis.
• From forecast, the functions forecast and auto.arima were used. These played the most important role in analyzing prior pollutant observations and forecasting future values as accurately as R allows for each pollutant.

The main functions I will highlight in this section are forecast() and auto.arima(), but I will also briefly explain my usage of the ts() and yeojohnson() functions. Within the forecast function I set level=F: while confidence intervals in the graphs could be useful, they were not needed for my study, since I was mostly interested in the specific values that the forecast function gave in its output. It was also very important that we forecast exactly 59 future values, simply because there are exactly 59 values in our EPA 2013 data for each pollutant. In the auto.arima() function, no restrictions needed to be passed, but it was important that we accessed our forecast values by auto.arima()$f. For reference, the original values that were put into the function can be accessed with auto.arima()$x.
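The overall workflow (build a time series, fit an ARIMA model, forecast 59 values) can be sketched in base R as follows. This is a minimal sketch, not the study's code: it substitutes stats::arima() with a fixed, assumed order (1,0,0) and predict() for the auto.arima() and forecast() functions of the forecast package so that it runs without extra packages, and the series `y` is simulated rather than EPA data. The frequency of 58 comes from dividing the 174 observations of the 3-year data by 3.

```r
set.seed(42)

# 174 observations over 3 years gives an average of 58 per year.
n_total      <- 174
obs_per_year <- n_total / 3

# Simulated stand-in for one transformed pollutant series (hypothetical).
sim <- as.numeric(arima.sim(model = list(ar = 0.6), n = n_total)) + 10
y   <- ts(sim, start = c(2010, 1), frequency = obs_per_year)

# Fit an ARIMA model with an assumed order; auto.arima() would instead
# search over candidate orders automatically.
fit <- arima(y, order = c(1, 0, 0))

# Forecast 59 steps ahead, matching the number of 2013 observations.
fc              <- predict(fit, n.ahead = 59)
point_forecasts <- fc$pred  # out-of-sample point forecasts
forecast_se     <- fc$se    # their standard errors

# Plot the history together with the forecasts.
ts.plot(y, point_forecasts, lty = c(1, 2),
        main = "Simulated series with 59-step ARIMA forecast")
```

With the forecast package, the analogous call would be forecast(auto.arima(y), h = 59); in the object it returns, the out-of-sample point forecasts are stored in `$mean`, while `$fitted` holds in-sample fitted values.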
One last note: when plotting the time series for the 3 year data, you should notice that within each ts() function frequency=(58), which I interpret as there being an average of 58 observations per year. I obtained 58 by dividing the total number of observations in the 3 year data by 3, so 174/3 = 58. Within the yeojohnson() function you will notice that standardize=FALSE. If this is not declared, the function by default also standardizes the transformed values. I did not find the further standardization useful when dealing with the Richmond data, mainly because the Yeo-Johnson transformation itself was of interest in the Bulgaria study, so I wanted to follow that transformation as it is, without further standardization.

7 Conclusion

In the Bulgaria study the researchers' main goal was to use the ARIMA models to forecast 72 hours ahead, because they used hourly data. Similarly, I feel it necessary to highlight the importance of the auto.arima() function in helping forecast the year 2013. While it was not helpful in forecasting every pollutant, it was definitely more helpful than the forecast() function in identifying the trend or behavior of each pollutant throughout the year(s). The most important finding I came across was that the 2012 data alone was not enough in most cases when attempting to forecast a future year, but the combined 3 year (2010, 2011, 2012) data allowed both the forecast() and auto.arima() functions to display their usefulness when forecasting. I certainly enjoyed preparing this study and learning about time series, and I hope that I am given the opportunity to further explore this discipline in the future.
References

[1] Kullback, S. (1959). Information Theory and Statistics. John Wiley and Sons. Republished by Dover Publications in 1968; reprinted in 1978: ISBN 0-8446-5625-9.
[2] Yeo, I. K., and Johnson, R. A. (2000). A new family of power transformations to improve normality or symmetry. Biometrika.
[3] Gocheva-Ilieva, S., Ivanov, A., Voynikova, D., and Boyadzhiev, D. (2013). Time series analysis and forecasting for air pollution in small urban area: an SARIMA and factor analysis approach. Stochastic Environmental Research and Risk Assessment, 28, 1045-1060. doi:10.1007/s00477-013-0800-4.
[4] Alcosser, Howard. "Diamond Bar High School" Internal Assessment: Mathematical Exploration. Web. 27 May 2015.
[5] Jolliffe, I. (1986). Principal Component Analysis and Factor Analysis. doi:10.1007/978-1-4757-1904-8_7.