1. CIA – 3B: Time Series Analysis on Sunspots
By: Srishti Srivastava, Siddharth Menon
Submitted To: Dr. Ashish Sharma, Dr. Stephen Raj
2. Table of contents
01 About the Dataset: Description of the dataset
02 Methods Used: Methods used for making inferences
03 Analysis: Analysis of the output obtained
04 Conclusion & Suggestions: Conclusion derived with suggestions for improvement
5. About: Sunspots, as seen in the above image, are dark areas on the surface of the sun.
Why to study? Studying sunspots is crucial due to their potential impact on Earth’s environment and climate.
Data Source: https://data.world/hyuto/sunspots/workspace/file?filename=Sunspots.csv
Use of Time Series Analysis: Time series analysis plays a vital role in understanding sunspot cycles and their implications for Earth’s climate.
7. Conversion to Time Series
To begin the analysis, we convert the data into a time series and plot it:
● The plot shows the Wolf sunspot number, a measure of solar activity, from 1750 to 1950.
● The graph shows a cyclical pattern, with sunspot numbers rising and falling over time.
● The graph suggests a possible long-term increase in sunspot numbers over the two centuries shown.
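The conversion step can be sketched in Python as follows. This is only an illustration: the slides do not show their code, the column names `Date` and `Sunspots` are assumptions about the CSV layout, and the three rows are placeholders rather than real observations.

```python
import csv
import io
from datetime import date

# Placeholder rows for illustration only; NOT real sunspot observations.
# The column names are assumed, not taken from the actual file.
SAMPLE_CSV = """Date,Sunspots
1750-03-31,3.0
1750-01-31,1.0
1750-02-28,2.0
"""


def load_time_series(handle, date_col="Date", value_col="Sunspots"):
    """Read (date, value) pairs from a CSV handle and return them in time order."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append((date.fromisoformat(record[date_col]), float(record[value_col])))
    # A time series must be ordered by its time index before plotting or modelling.
    rows.sort(key=lambda pair: pair[0])
    return rows


series = load_time_series(io.StringIO(SAMPLE_CSV))
```

With the real file, the same function could be called on `open("Sunspots.csv", newline="")` once the column names have been checked.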
8. Multiplicative Decomposition
Multiplicative decomposition is preferred when
the magnitude of seasonality varies with the level
of the series.
● The trend component shows a clear upward
slope over time.
● The seasonal component seems to fluctuate
around a constant value throughout the year.
● Since the data has a trend and potentially
seasonal variations, it’s likely non-stationary.
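A multiplicative decomposition models each observation as trend × seasonal × remainder. The sketch below shows one classical way to compute it (centered moving average for the trend, averaged ratio-to-trend for the seasonal indices). It is an assumption-laden illustration, not the procedure used for the slides: the period of 4 and the toy series are hypothetical.

```python
def centered_moving_average(y, period):
    """Centered moving average; None where the window does not fit."""
    n = len(y)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half : t + half + 1]
        if period % 2 == 1:
            trend[t] = sum(window) / period
        else:
            # Even period: half weight on the two end points (a 2 x m average).
            trend[t] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return trend


def multiplicative_decompose(y, period):
    """Return (trend, seasonal, remainder) with y = trend * seasonal * remainder."""
    trend = centered_moving_average(y, period)
    # Collect ratio-to-trend values for each position in the seasonal cycle.
    ratios = [[] for _ in range(period)]
    for t, (obs, tr) in enumerate(zip(y, trend)):
        if tr is not None:
            ratios[t % period].append(obs / tr)
    raw = [sum(r) / len(r) for r in ratios]
    scale = sum(raw) / period  # normalise so the indices average to 1
    indices = [v / scale for v in raw]
    seasonal = [indices[t % period] for t in range(len(y))]
    remainder = [
        None if tr is None else obs / (tr * s)
        for obs, tr, s in zip(y, trend, seasonal)
    ]
    return trend, seasonal, remainder


# Toy series (hypothetical): a rising level times a repeating pattern.
pattern = [0.8, 1.1, 1.2, 0.9]
y = [(10 + t) * pattern[t % 4] for t in range(12)]
trend, seasonal, remainder = multiplicative_decompose(y, 4)
```

Because the seasonal swing here scales with the level, dividing by the trend (rather than subtracting it) is what keeps the seasonal indices constant.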
9. Autocorrelation Function (ACF) Plot
If the ACF bars decrease towards zero as the lag increases, the correlations between the current value and its lagged values are fading out.
A reasonably quick decay is consistent with weak stationarity or a series close to being stationary; a very slow decay would instead point to non-stationarity.
While a decaying ACF implies a weakening dependence on past values, it doesn’t guarantee complete independence.
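To make the quantity behind each ACF bar concrete, here is a minimal sketch of the sample autocorrelation. The toy series (a sine wave with an arbitrary period of 11 samples) is an assumption for illustration and is not the sunspot data.

```python
import math


def sample_acf(y, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares (the normaliser)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Toy cyclical series (hypothetical): 10 full cycles of period 11.
toy = [math.sin(2 * math.pi * t / 11) for t in range(110)]
acf = sample_acf(toy, 22)

# Approximate 95% bounds used on ACF plots to judge significance.
bound = 1.96 / math.sqrt(len(toy))
```

For a cyclical series like this one, the ACF itself oscillates: it is strongly positive near multiples of the period and negative near half the period, rather than fading smoothly.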
10. Partial Autocorrelation Function (PACF) Plot
Spikes that extend beyond the confidence interval mark lags where the partial autocorrelation might be statistically significant.
If many spikes exceed the confidence interval, the current value might be influenced by past values at multiple lags.
This indicates the need for a potentially higher-order AR (Autoregressive) term in the ARIMA model.
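The partial autocorrelation at lag k is the correlation left at that lag after the shorter lags have been accounted for. One standard way to obtain it from the autocorrelations is the Durbin-Levinson recursion, sketched below. The input here is the theoretical ACF of an AR(1) process with coefficient 0.6, chosen purely as an illustration.

```python
def pacf_from_acf(rho, max_lag):
    """Durbin-Levinson recursion: rho[0..max_lag] -> pacf[0..max_lag]."""
    pacf = [1.0]
    phi_prev = []  # phi_prev[j] holds the AR coefficient for lag j + 1 at order k - 1
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-lag coefficients for order k, then append phi_kk.
        phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


# Theoretical ACF of an AR(1) process with coefficient 0.6 (illustrative input).
rho = [0.6 ** k for k in range(6)]
pacf = pacf_from_acf(rho, 5)
```

For this AR(1) input the PACF has a single spike at lag 1 and is zero afterwards; this cut-off is why the number of significant PACF spikes is read as a hint for the AR order.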
11. Augmented Dickey-Fuller (ADF) Test
The Augmented Dickey-Fuller test is used to check stationarity. Based on the ADF test results, the p-value of 0.01 is smaller than the commonly used significance level of 0.05, so there is evidence to reject the null hypothesis of non-stationarity and conclude that the series data_sun is stationary.
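The idea behind the test can be sketched with the simplest, non-augmented Dickey-Fuller regression: regress the change in the series on its previous value and examine the t-statistic of the slope. This is a simplification under stated assumptions, not the test run for the slides: it has no lagged-difference terms, includes a constant only, uses the approximate large-sample 5% critical value of -2.86, and is applied to a simulated series rather than data_sun.

```python
import math
import random


def dickey_fuller_t(y):
    """t-statistic of gamma in: diff(y)[t] = alpha + gamma * y[t-1] + error."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    ssr = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err


# Simulated stationary AR(1) series (illustrative, not the sunspot data).
rng = random.Random(0)
stationary = [0.0]
for _ in range(499):
    stationary.append(0.5 * stationary[-1] + rng.gauss(0.0, 1.0))

t_stat = dickey_fuller_t(stationary)
CRITICAL_5PCT = -2.86  # approximate asymptotic value, constant-only case
reject_unit_root = t_stat < CRITICAL_5PCT
```

A t-statistic well below the critical value leads to rejecting the unit-root (non-stationarity) hypothesis, which is the same decision a small p-value expresses.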
12. Fitting of ARIMA model
The model is an ARIMA(2,1,2) model: two autoregressive (AR) terms, one order of differencing, and two moving-average (MA) terms.
The coefficients represent the estimated parameters of the ARIMA model.
Standard errors (s.e.) are provided for each coefficient estimate.
The goodness of fit of the model is summarized by its information criteria: AIC, AICc and BIC.
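The three information criteria follow directly from the model's log-likelihood, the number of estimated parameters, and the number of observations. The sketch below uses the standard definitions. The input values are hypothetical, not the fitted model's output, and how parameters are counted (for example, whether the error variance is included) depends on the software.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return (AIC, AICc, BIC) using the standard definitions."""
    aic = 2 * n_params - 2 * log_likelihood
    # AICc adds a small-sample correction to AIC.
    aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    bic = n_params * math.log(n_obs) - 2 * log_likelihood
    return aic, aicc, bic


# Hypothetical inputs: 5 parameters (2 AR + 2 MA + error variance), 200 observations.
aic, aicc, bic = information_criteria(log_likelihood=-1000.0, n_params=5, n_obs=200)
```

Lower values are better for all three. BIC penalizes extra parameters more heavily than AIC once the sample has more than about 8 observations, so it tends to favour smaller models.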
16. Conclusion
1. The ARIMA(2,1,2) model fitted to the sunspots time
series data provides a reasonably good fit to the
observed patterns.
2. The model captures the autoregressive and moving
average dynamics of the data, as indicated by the
estimated coefficients and their significance.
3. The information criteria suggest that the model
adequately balances goodness of fit with model
complexity.
4. However, there may still be room for improvement in
capturing certain nuances of the data.