TIME SERIES ANALYSIS:
THEORY AND PRACTICE
LMLP MEETUP
SOME HOUSEKEEPING
▸ Call for presenters over the summer period
▸ Please don’t use the CodeNode bar after the meetup since
it’s booked for a private event - go to the pub across the
road
DEFINITION OF TIME SERIES DATA
▸ A sequence of measurements (data points)
▸ that follow a non-random order (i.e. are successive),
▸ taken at regular time intervals,
▸ usually with no more than one data point per interval (if
there is more than one data point per interval, we call it multiple time
series analysis and use slightly different modelling approaches).
HOW ARE TIME SERIES DIFFERENT FROM OTHER TYPES OF DATA?
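▸ Panel data: many subjects, each observed over several time periods
▸ Cross-sectional data: many subjects observed at a single point in time
▸ Time series: a single subject observed repeatedly, so that one
measurement is differentiated from another by its time stamp only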
APPLICATIONS
▸ Financial markets
▸ Weather forecasting
▸ Sales forecasting
▸ Signal processing
▸ Natural language processing
PROPERTIES OF TIME SERIES
▸ Seasonality
▸ Trending
▸ Cycles
TRENDING
▸ A trend exists when there is a long-term increase or decrease in the
data. It does not have to be linear. A trend can “change direction” and,
say, go from increasing to decreasing.
▸ Trends usually become visible when a linear function is fitted to the
data.
Source: http://jcflowers1.iweb.bsu.edu/rlo/trends.htm
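Fitting a linear function to expose a trend can be sketched as follows. This is a minimal illustration on synthetic data, not the speaker's code: the helper name `linear_trend` and the generated series are assumptions made for the example.

```python
import numpy as np


def linear_trend(values):
    """Least-squares fit of y = slope * t + intercept, with t = 0, 1, 2, ..."""
    t = np.arange(len(values))
    slope, intercept = np.polyfit(t, values, deg=1)
    return slope, intercept


# Synthetic series (assumption): upward trend of 0.5 per step plus noise.
rng = np.random.default_rng(0)
t = np.arange(120)
series = 0.5 * t + 10.0 + rng.normal(scale=2.0, size=t.size)

slope, intercept = linear_trend(series)

# Subtracting the fitted line leaves the detrended series.
detrended = series - (slope * t + intercept)
```

A clearly non-zero slope suggests a trend; the detrended series is what you would then inspect for seasonality and cycles.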
SEASONALITY AND CYCLES
▸ A seasonal pattern exists when a series is influenced by
seasonal factors (e.g. the month of the year or day of the
week). Seasonality always has a fixed and known period.
▸ A cyclic pattern exists when data exhibit rises and falls that
are not of fixed period. The duration of these fluctuations is
usually at least 2 years (e.g. economic cycles).
▸ What may seem to be a trend over a short period of time
may be due to seasonality/cycle over a longer period of time.
Always zoom in/zoom out when plotting your data!
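The zoom in/zoom out advice can be checked numerically. The sketch below uses a made-up, purely seasonal monthly series with no trend at all, and compares the slope fitted on a short window with the slope fitted on the full history. The helper `fitted_slope` and all numbers are illustrative assumptions.

```python
import numpy as np


def fitted_slope(values):
    """Slope of the least-squares line through the values."""
    t = np.arange(len(values))
    return np.polyfit(t, values, deg=1)[0]


# 20 years of hypothetical monthly data: seasonal pattern only, no trend.
months = np.arange(240)
series = 100.0 + 10.0 * np.sin(2 * np.pi * months / 12)

# Four months on the rising part of the season look like a strong upward trend...
short_window_slope = fitted_slope(series[:4])

# ...while the full history shows a slope that is practically zero.
full_slope = fitted_slope(series)
```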
WHAT DOES IT ALL LOOK LIKE ON A CHART?
[Slides 9-12: four example charts; images not preserved]
Source: http://jcflowers1.iweb.bsu.edu/rlo/trends.htm
TESTING FOR TRENDS AND SEASONALITY
▸ Checking for seasonality: look for peaks in the autocorrelation
at the seasonal lag and its multiples.
▸ Checking for trends: fit a simple curve or a rolling average
and eyeball the chart; no automatic test is reliable in every case.
Strong autocorrelation with the time period immediately
preceding the measurement also suggests a trend
component.
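Both checks can be sketched with `Series.autocorr` from pandas. The two synthetic series below (one seasonal with a 12-step period, one trending) are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
months = np.arange(120)

# Hypothetical monthly series with a 12-step seasonal pattern plus noise.
seasonal = pd.Series(
    10.0 * np.sin(2 * np.pi * months / 12) + rng.normal(scale=1.0, size=months.size)
)

# Autocorrelation at each lag; a peak at lag 12 points to yearly seasonality.
acf_by_lag = {lag: seasonal.autocorr(lag=lag) for lag in range(1, 19)}
best_lag = max(acf_by_lag, key=acf_by_lag.get)

# Hypothetical trending series: lag-1 autocorrelation close to 1 hints at a trend.
trending = pd.Series(0.5 * months + rng.normal(scale=1.0, size=months.size))
lag1_autocorr = trending.autocorr(lag=1)
```

Note that a strong seasonal pattern also produces a fairly high lag-1 autocorrelation, so the two signals are best read together with a plot.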
ON THE IMPORTANCE OF ASKING THE RIGHT QUESTIONS
▸ What are you trying to predict?
▸ Do you know how the measurements were taken?
▸ Do you have any missing values in the dataset? If yes, what
do they represent?
▸ Do you need to adjust for seasonality or trend?
▸ What “shape” is your dataset?
▸ What are the assumptions being made?
NOW TO THE PRACTICE BIT
▸ You can’t use the same procedures to analyse snapshot
and time series data.
▸ For example, you can’t randomly pick the data points that
will be withheld for cross-validation and testing purposes.
Why?
▸ Make sure to understand as much as possible about the
underlying factors that affect the measurements.
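One common answer to the "Why?" above: neighbouring points are correlated, so a random split lets the model see values from the future of the points it is tested on. A minimal sketch of the alternative, splitting in time order, is shown below. The function names and fold sizes are hypothetical choices for this example.

```python
import numpy as np


def chronological_split(values, test_fraction=0.2):
    """Hold out the most recent part of the series as the test set."""
    values = np.asarray(values)
    n_test = int(round(len(values) * test_fraction))
    split_at = len(values) - n_test
    return values[:split_at], values[split_at:]


def expanding_window_folds(n_samples, n_folds, test_size):
    """Yield (train_idx, test_idx) pairs where training always precedes testing."""
    for k in range(n_folds):
        test_end = n_samples - (n_folds - 1 - k) * test_size
        test_start = test_end - test_size
        yield np.arange(0, test_start), np.arange(test_start, test_end)


series = np.arange(100)  # stand-in for 100 successive measurements
train, test = chronological_split(series)
folds = list(expanding_window_folds(len(series), n_folds=3, test_size=10))
```

Each fold trains on everything before its test window, which mimics how the model would be used for forecasting.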
PLOT, PLOT, THEN PLOT AGAIN
▸ Plotting your data will allow you to uncover the structure
of the dataset, spot irregularities in the data and figure out
which adjustments need to be made before proceeding
with the modelling.
▸ Useful libraries: pandas, numpy, json, matplotlib.pyplot,
pathlib, seaborn, scipy.stats, statsmodels.
TIPS AND TRICKS FOR PLOTTING
▸ Basic function: plot
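The code shown on the original slide is not preserved; the following is a minimal sketch of the basic `plot` call on a pandas Series. The synthetic `sales` data and the axis labels are assumptions for the example.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display

import numpy as np
import pandas as pd

# Hypothetical daily series indexed by date.
index = pd.date_range("2015-01-01", periods=365, freq="D")
rng = np.random.default_rng(2)
sales = pd.Series(100 + np.cumsum(rng.normal(size=index.size)), index=index, name="sales")

# Series.plot() draws the values against the index and returns the matplotlib Axes.
ax = sales.plot(title="Daily sales (synthetic)")
ax.set_xlabel("date")
ax.set_ylabel("units")
```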
TIPS AND TRICKS FOR PLOTTING
▸ Plotting multiple lines
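The slide's original code is not preserved. One simple way to get several lines on one chart, sketched here on made-up data, is to put each series in its own DataFrame column and call `DataFrame.plot()`. The column names are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display

import numpy as np
import pandas as pd

index = pd.date_range("2015-01-01", periods=200, freq="D")
rng = np.random.default_rng(3)

# Two hypothetical series sharing the same date index.
df = pd.DataFrame(
    {
        "store_a": 100 + np.cumsum(rng.normal(size=index.size)),
        "store_b": 80 + np.cumsum(rng.normal(size=index.size)),
    },
    index=index,
)

# DataFrame.plot() draws one line per column on a shared Axes, with a legend.
ax = df.plot(title="Two stores (synthetic)")
```

To overlay a further series on the same chart, pass the returned Axes to the next call, for example `other_series.plot(ax=ax)`.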
TIPS AND TRICKS FOR PLOTTING
▸ Autocorrelation
▸ Use autocorrelation_plot from pandas.plotting
(pandas.tools.plotting in older pandas versions)
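A minimal sketch of the call, on a synthetic seasonal series made up for this example. In recent pandas releases the function is imported from `pandas.plotting`.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display

import numpy as np
import pandas as pd
from pandas.plotting import autocorrelation_plot

# Hypothetical monthly series with a 12-step seasonal pattern plus noise.
rng = np.random.default_rng(4)
months = np.arange(120)
series = pd.Series(
    10.0 * np.sin(2 * np.pi * months / 12) + rng.normal(size=months.size)
)

# Draws the autocorrelation for every lag and returns the matplotlib Axes.
ax = autocorrelation_plot(series)
```

For a series like this one, the curve should peak around lag 12 and its multiples.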
TIPS AND TRICKS FOR PLOTTING
▸ Smoothing - linear and exponential
▸ To see the “bigger picture” you may want to look at a moving average of the
input values.
▸ This is what they call “smoothing”.
▸ Linear smoothing gives equal weight to all the points it’s averaging over,
exponential smoothing gives more weight to more recent points.
▸ Points taken as inputs by moving average can be either centred around the
original value or directly behind it.
▸ Use [ColumnName].rolling(window=[window size], center=True).mean().plot()
to plot a rolling average. You can also replace mean with median.
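The smoothing variants described above can be compared side by side. This is a sketch on synthetic data; the window of 15 and the `span` passed to `ewm` are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
index = pd.date_range("2015-01-01", periods=200, freq="D")
raw = pd.Series(
    np.sin(np.arange(200) / 20.0) + rng.normal(scale=0.5, size=200), index=index
)

# Equal weights, window directly behind each point.
trailing = raw.rolling(window=15).mean()

# Equal weights, window centred on each point.
centred = raw.rolling(window=15, center=True).mean()

# Same centred window, but the median is less sensitive to outliers.
robust = raw.rolling(window=15, center=True).median()

# Exponential smoothing: more weight on more recent points.
exponential = raw.ewm(span=15).mean()
```

The rolling versions leave missing values at the edges where the window is incomplete, while the exponentially weighted version is defined from the first point onwards. Append `.plot()` to any of these to draw it.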
TIPS AND TRICKS FOR PLOTTING
▸ For more plotting tools from pandas, visit
▸ http://pandas.pydata.org/pandas-docs/stable/visualization.html#visualization-autocorrelation
▸ http://pandas.pydata.org/pandas-docs/stable/computation.html#rolling-windows
DATA LOADING AND PREPROCESSING
▸ The data often comes in the form of multiple large csv files that
need to be concatenated together for further processing or slicing.
▸ Here is a useful discussion on Stack Overflow covering this issue:
http://stackoverflow.com/questions/25210819/speeding-up-data-import-function-pandas-and-appending-to-dataframe/25210900#25210900
▸ A useful aside: to speed up processing, specify the columns to import
and their data types when you're reading a csv into a data frame. You
can specify different data types for different columns by using
a dictionary: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
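A minimal sketch of this loading pattern. The in-memory strings stand in for the csv files, and the column names are invented for the example; with real files you would pass file paths to `read_csv` instead.

```python
import io

import pandas as pd

# Two small in-memory "files" (assumption) with the same layout.
csv_parts = [
    "timestamp,store,units,comment\n2016-01-01,1,10,a\n2016-01-02,1,12,b\n",
    "timestamp,store,units,comment\n2016-01-03,1,9,c\n2016-01-04,1,15,d\n",
]

frames = [
    pd.read_csv(
        io.StringIO(text),
        usecols=["timestamp", "store", "units"],       # skip columns we don't need
        dtype={"store": "int32", "units": "float32"},  # per-column types via a dict
        parse_dates=["timestamp"],
    )
    for text in csv_parts
]

# Concatenate once at the end rather than appending inside the loop.
data = pd.concat(frames, ignore_index=True).set_index("timestamp").sort_index()
```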
MODELLING APPROACHES-ARMA
▸ ARMA: autoregressive moving average
▸ Example: http://statsmodels.sourceforge.net/devel/examples/notebooks/generated/tsa_arma.html
▸ ARMA models combine autoregressive and moving-average terms
computed from the series up to time t to predict the (t+1)-th term
MODELLING APPROACHES-ARMA
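▸ Autoregressive model of order p:
    X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t
▸ c is a constant, φ_1, ..., φ_p are parameters, ε_t is the error term (white
noise).
▸ Moving average model of order q:
    X_t = \mu + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}
▸ μ is the expectation of X_t, ε_t is again the error term, θ_1, ..., θ_q are
parameters.
▸ Combined, ARMA(p, q):
    X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}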
MODELLING APPROACHES - ARMA
▸ Why do we combine AR and MA models?
▸ AR model assumes steady change and is poor for
predicting sudden fluctuations.
▸ MA model takes error terms as an input which allows us to
take into account sudden changes in output faster than AR
model would have done on its own.
▸ Data doesn’t come with errors predefined: these are in fact
estimated by first fitting a model like AR. See any issues?
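The last point can be made concrete with a small simulation. The sketch below is not the procedure of any particular library: it simulates an ARMA(1,1) series with known errors, fits a long AR model by ordinary least squares, and treats the residuals as estimates of the unobserved errors. The function names, the AR order of 10 and the coefficients are assumptions chosen for illustration.

```python
import numpy as np


def simulate_arma11(n, phi, theta, rng):
    """Simulate X_t = phi * X_{t-1} + eps_t + theta * eps_{t-1}."""
    eps = rng.normal(size=n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t] + theta * eps[t - 1]
    return x, eps


def fit_ar_least_squares(x, p):
    """Fit an AR(p) model without intercept; return (coefficients, residuals)."""
    # Column i-1 holds the series lagged by i steps, aligned with x[p:].
    lags = np.column_stack([x[p - i : len(x) - i] for i in range(1, p + 1)])
    target = x[p:]
    coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
    return coef, target - lags @ coef


rng = np.random.default_rng(6)
x, true_eps = simulate_arma11(2000, phi=0.6, theta=0.4, rng=rng)

# The residuals of the AR fit serve as stand-ins for the unobserved errors.
coef, estimated_eps = fit_ar_least_squares(x, p=10)

# In a simulation we can check how well the stand-ins track the true errors.
agreement = np.corrcoef(true_eps[10:], estimated_eps)[0, 1]
```

With real data there are no true errors to compare against, so the MA terms are only as good as the first-stage fit that produced them, which is one possible answer to the question above.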
OTHER MODELLING APPROACHES
▸ Spectrum/Fourier analysis
▸ Attempts to decompose the function into a sum of sinusoidal
waves.
▸ Main aim is to determine the length and amplitude of
underlying cycles in cases where they are not immediately
obvious.
▸ More useful for things like sunspot activity than sales
forecasting (in the latter case the seasonal component is easily
guessed by just eyeballing the data).
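A minimal sketch of finding the length of a hidden cycle with the discrete Fourier transform in numpy. The synthetic signal has an 11-step cycle, loosely echoing the sunspot example; the helper `dominant_period` and all numbers are assumptions for illustration.

```python
import numpy as np


def dominant_period(values, sample_spacing=1.0):
    """Return the period of the strongest sinusoidal component."""
    values = np.asarray(values, dtype=float)
    centred = values - values.mean()            # remove the constant offset
    power = np.abs(np.fft.rfft(centred)) ** 2   # power at each frequency
    freqs = np.fft.rfftfreq(len(centred), d=sample_spacing)
    peak = np.argmax(power[1:]) + 1             # skip the zero-frequency bin
    return 1.0 / freqs[peak]


# Synthetic signal (assumption): 24 repetitions of an 11-step cycle plus noise.
rng = np.random.default_rng(7)
t = np.arange(264)
signal = 5.0 * np.sin(2 * np.pi * t / 11.0) + rng.normal(scale=1.0, size=t.size)

period = dominant_period(signal)
```

The amplitude of the cycle can be read off the same spectrum; a trend should be removed first, since it otherwise dominates the low frequencies.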
LIMITATIONS OF STANDARD APPROACHES
▸ Difficulty capturing high level dependencies - additional
rules typically have to be hardcoded.
▸ Can’t handle all of the possible data structures effectively.
PREDICTION HORIZON
▸ Why can’t we see far into the future?
▸ An interlude on chaos theory
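The content of the interlude is not preserved on the slide. As one standard illustration of why long-range prediction breaks down, the sketch below iterates the logistic map, a textbook chaotic system, from two starting points that differ by 1e-10. The choice of system and all numbers are assumptions, not taken from the talk.

```python
def logistic_map_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole orbit as a list."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two starting points that differ by a tiny measurement error.
a = logistic_map_orbit(0.2, 60)
b = logistic_map_orbit(0.2 + 1e-10, 60)

# Distance between the two orbits at each step.
gap = [abs(u - v) for u, v in zip(a, b)]
```

The gap stays negligible for the first few steps and then grows until the two orbits are unrelated, so a forecast is only useful up to a limited horizon, however good the model.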
NEURAL NETWORKS - A POSSIBLE ALTERNATIVE
▸ Neural network architectures can be modified to capture
global dependencies (e.g. LSTM).
▸ Capable of both regression and classification, depending
on the choice of activation function.
▸ Next time we will discuss
USEFUL LINKS
▸ https://documents.software.dell.com/statistics/textbook/time-series-analysis
▸ https://en.wikipedia.org/wiki/Time_series
▸ http://www.fil.ion.ucl.ac.uk/~wpenny/course/array.pdf
▸ https://en.wikipedia.org/wiki/Weather_forecasting
▸ https://www.otexts.org/fpp/6/1
▸ http://pandas.pydata.org/pandas-docs/stable/cookbook.html#cookbook-plotting
▸ http://pandas.pydata.org/pandas-docs/stable/visualization.html
▸ http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
▸ http://en.wikipedia.org/wiki/Autoregressive–moving-average_model
▸ http://jcflowers1.iweb.bsu.edu/rlo/trends.htm
32

More Related Content

What's hot

Time series and forecasting
Time series and forecastingTime series and forecasting
Time series and forecastingmvskrishna
 
Arima model
Arima modelArima model
Arima modelJassika
 
Mba 532 2011_part_3_time_series_analysis
Mba 532 2011_part_3_time_series_analysisMba 532 2011_part_3_time_series_analysis
Mba 532 2011_part_3_time_series_analysisChandra Kodituwakku
 
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...Simplilearn
 
Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...
Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...
Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...Simplilearn
 
Data Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series ForecastingData Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series ForecastingDerek Kane
 
Forecasting techniques, time series analysis
Forecasting techniques, time series analysisForecasting techniques, time series analysis
Forecasting techniques, time series analysisSATISH KUMAR
 
Arima model (time series)
Arima model (time series)Arima model (time series)
Arima model (time series)Kumar P
 
Time Series - Auto Regressive Models
Time Series - Auto Regressive ModelsTime Series - Auto Regressive Models
Time Series - Auto Regressive ModelsBhaskar T
 
Time Series Analysis.pptx
Time Series Analysis.pptxTime Series Analysis.pptx
Time Series Analysis.pptxSunny429247
 
Analyzing and forecasting time series data ppt @ bec doms
Analyzing and forecasting time series data ppt @ bec domsAnalyzing and forecasting time series data ppt @ bec doms
Analyzing and forecasting time series data ppt @ bec domsBabasab Patil
 
Trend analysis and time Series Analysis
Trend analysis and time Series Analysis Trend analysis and time Series Analysis
Trend analysis and time Series Analysis Amna Kouser
 
Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingMaruthi Nataraj K
 

What's hot (20)

Time Series Analysis Ravi
Time Series Analysis RaviTime Series Analysis Ravi
Time Series Analysis Ravi
 
time series analysis
time series analysistime series analysis
time series analysis
 
Time series and forecasting
Time series and forecastingTime series and forecasting
Time series and forecasting
 
Arima model
Arima modelArima model
Arima model
 
Mba 532 2011_part_3_time_series_analysis
Mba 532 2011_part_3_time_series_analysisMba 532 2011_part_3_time_series_analysis
Mba 532 2011_part_3_time_series_analysis
 
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
 
ARIMA
ARIMA ARIMA
ARIMA
 
Time series
Time seriesTime series
Time series
 
Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...
Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...
Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...
 
Data Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series ForecastingData Science - Part X - Time Series Forecasting
Data Science - Part X - Time Series Forecasting
 
Forecasting techniques, time series analysis
Forecasting techniques, time series analysisForecasting techniques, time series analysis
Forecasting techniques, time series analysis
 
Arima model (time series)
Arima model (time series)Arima model (time series)
Arima model (time series)
 
Time Series - 1
Time Series - 1Time Series - 1
Time Series - 1
 
Time Series - Auto Regressive Models
Time Series - Auto Regressive ModelsTime Series - Auto Regressive Models
Time Series - Auto Regressive Models
 
1634 time series and trend analysis
1634 time series and trend analysis1634 time series and trend analysis
1634 time series and trend analysis
 
Time Series Analysis.pptx
Time Series Analysis.pptxTime Series Analysis.pptx
Time Series Analysis.pptx
 
Analyzing and forecasting time series data ppt @ bec doms
Analyzing and forecasting time series data ppt @ bec domsAnalyzing and forecasting time series data ppt @ bec doms
Analyzing and forecasting time series data ppt @ bec doms
 
Trend analysis and time Series Analysis
Trend analysis and time Series Analysis Trend analysis and time Series Analysis
Trend analysis and time Series Analysis
 
Time Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and ForecastingTime Series Analysis - Modeling and Forecasting
Time Series Analysis - Modeling and Forecasting
 
Time series Analysis
Time series AnalysisTime series Analysis
Time series Analysis
 

Viewers also liked

How to become a data scientist in 6 months
How to become a data scientist in 6 monthsHow to become a data scientist in 6 months
How to become a data scientist in 6 monthsTetiana Ivanova
 
Time Series
Time SeriesTime Series
Time Seriesyush313
 
Time series analysis in Stata
Time series analysis in StataTime series analysis in Stata
Time series analysis in Statashahisec1
 
Analysis of time series
Analysis of time seriesAnalysis of time series
Analysis of time seriesPablosperessos
 
Time Series Analysis
Time Series AnalysisTime Series Analysis
Time Series AnalysisQAware GmbH
 
STATA - Time Series Analysis
STATA - Time Series AnalysisSTATA - Time Series Analysis
STATA - Time Series Analysisstata_org_uk
 

Viewers also liked (8)

How to become a data scientist in 6 months
How to become a data scientist in 6 monthsHow to become a data scientist in 6 months
How to become a data scientist in 6 months
 
Time Series
Time SeriesTime Series
Time Series
 
Time series analysis in Stata
Time series analysis in StataTime series analysis in Stata
Time series analysis in Stata
 
Time series
Time seriesTime series
Time series
 
Analysis of time series
Analysis of time seriesAnalysis of time series
Analysis of time series
 
Time series Forecasting
Time series ForecastingTime series Forecasting
Time series Forecasting
 
Time Series Analysis
Time Series AnalysisTime Series Analysis
Time Series Analysis
 
STATA - Time Series Analysis
STATA - Time Series AnalysisSTATA - Time Series Analysis
STATA - Time Series Analysis
 

Similar to Time Series Analysis: Theory and Practice

TIME SERIES ANALYSIS.docx
TIME SERIES ANALYSIS.docxTIME SERIES ANALYSIS.docx
TIME SERIES ANALYSIS.docxMilhhanMohsin
 
TIME SERIES & CROSS ‎SECTIONAL ANALYSIS
TIME SERIES & CROSS ‎SECTIONAL ANALYSISTIME SERIES & CROSS ‎SECTIONAL ANALYSIS
TIME SERIES & CROSS ‎SECTIONAL ANALYSISLibcorpio
 
Time series analysis
Time series analysisTime series analysis
Time series analysisFaltu Focat
 
Weather forecasting model.pptx
Weather forecasting model.pptxWeather forecasting model.pptx
Weather forecasting model.pptxVisheshYadav12
 
Mining Transactional and Time Series Data
Mining Transactional and Time Series DataMining Transactional and Time Series Data
Mining Transactional and Time Series DataBrenda Wolfe
 
Large Scale Automatic Forecasting for Millions of Forecasts
Large Scale Automatic Forecasting for Millions of ForecastsLarge Scale Automatic Forecasting for Millions of Forecasts
Large Scale Automatic Forecasting for Millions of ForecastsAjay Ohri
 
FIRE ADMIN UNIT 1 .orct121320#ffffff#fa951a#FFFFFF#e7b3513VERSON.docx
FIRE ADMIN UNIT 1 .orct121320#ffffff#fa951a#FFFFFF#e7b3513VERSON.docxFIRE ADMIN UNIT 1 .orct121320#ffffff#fa951a#FFFFFF#e7b3513VERSON.docx
FIRE ADMIN UNIT 1 .orct121320#ffffff#fa951a#FFFFFF#e7b3513VERSON.docxAKHIL969626
 
Quality Journey -Introduction to 7QC Tools2.0.pdf
Quality Journey -Introduction to 7QC Tools2.0.pdfQuality Journey -Introduction to 7QC Tools2.0.pdf
Quality Journey -Introduction to 7QC Tools2.0.pdfNileshJajoo2
 
Quality management methodology
Quality management methodologyQuality management methodology
Quality management methodologyselinasimpson2201
 
Lesson 1 introduction_to_time_series
Lesson 1 introduction_to_time_seriesLesson 1 introduction_to_time_series
Lesson 1 introduction_to_time_seriesankit_ppt
 
Quality management methodologies
Quality management methodologiesQuality management methodologies
Quality management methodologiesselinasimpson331
 
Inter Time Series Sales Forecasting
Inter Time Series Sales ForecastingInter Time Series Sales Forecasting
Inter Time Series Sales ForecastingIJASCSE
 
Quality management system procedures
Quality management system proceduresQuality management system procedures
Quality management system proceduresselinasimpson2101
 
Demand Forecasting
Demand ForecastingDemand Forecasting
Demand Forecastingyashpal01
 
OLD SEVEN TOOLS OF QUALTIY MANAGEMENT
OLD SEVEN TOOLS OF QUALTIY MANAGEMENTOLD SEVEN TOOLS OF QUALTIY MANAGEMENT
OLD SEVEN TOOLS OF QUALTIY MANAGEMENTANNA UNIVERSITY
 

Similar to Time Series Analysis: Theory and Practice (20)

TIME SERIES ANALYSIS.docx
TIME SERIES ANALYSIS.docxTIME SERIES ANALYSIS.docx
TIME SERIES ANALYSIS.docx
 
TIME SERIES & CROSS ‎SECTIONAL ANALYSIS
TIME SERIES & CROSS ‎SECTIONAL ANALYSISTIME SERIES & CROSS ‎SECTIONAL ANALYSIS
TIME SERIES & CROSS ‎SECTIONAL ANALYSIS
 
Time series analysis
Time series analysisTime series analysis
Time series analysis
 
Weather forecasting model.pptx
Weather forecasting model.pptxWeather forecasting model.pptx
Weather forecasting model.pptx
 
Run Chart
Run ChartRun Chart
Run Chart
 
Mining Transactional and Time Series Data
Mining Transactional and Time Series DataMining Transactional and Time Series Data
Mining Transactional and Time Series Data
 
Large Scale Automatic Forecasting for Millions of Forecasts
Large Scale Automatic Forecasting for Millions of ForecastsLarge Scale Automatic Forecasting for Millions of Forecasts
Large Scale Automatic Forecasting for Millions of Forecasts
 
Demand forecasting
Demand forecastingDemand forecasting
Demand forecasting
 
FIRE ADMIN UNIT 1 .orct121320#ffffff#fa951a#FFFFFF#e7b3513VERSON.docx
FIRE ADMIN UNIT 1 .orct121320#ffffff#fa951a#FFFFFF#e7b3513VERSON.docxFIRE ADMIN UNIT 1 .orct121320#ffffff#fa951a#FFFFFF#e7b3513VERSON.docx
FIRE ADMIN UNIT 1 .orct121320#ffffff#fa951a#FFFFFF#e7b3513VERSON.docx
 
Chapter 18
Chapter 18Chapter 18
Chapter 18
 
Ac26185187
Ac26185187Ac26185187
Ac26185187
 
Quality Journey -Introduction to 7QC Tools2.0.pdf
Quality Journey -Introduction to 7QC Tools2.0.pdfQuality Journey -Introduction to 7QC Tools2.0.pdf
Quality Journey -Introduction to 7QC Tools2.0.pdf
 
Quality management methodology
Quality management methodologyQuality management methodology
Quality management methodology
 
Lesson 1 introduction_to_time_series
Lesson 1 introduction_to_time_seriesLesson 1 introduction_to_time_series
Lesson 1 introduction_to_time_series
 
Quality management methodologies
Quality management methodologiesQuality management methodologies
Quality management methodologies
 
Inter Time Series Sales Forecasting
Inter Time Series Sales ForecastingInter Time Series Sales Forecasting
Inter Time Series Sales Forecasting
 
Quality management system procedures
Quality management system proceduresQuality management system procedures
Quality management system procedures
 
Demand Forecasting
Demand ForecastingDemand Forecasting
Demand Forecasting
 
Tqm old tools
Tqm old toolsTqm old tools
Tqm old tools
 
OLD SEVEN TOOLS OF QUALTIY MANAGEMENT
OLD SEVEN TOOLS OF QUALTIY MANAGEMENTOLD SEVEN TOOLS OF QUALTIY MANAGEMENT
OLD SEVEN TOOLS OF QUALTIY MANAGEMENT
 

Recently uploaded

The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Šabatka
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)Data & Analytics Magazin
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024Becky Burwell
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.JasonViviers2
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptaigil2
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
 

Recently uploaded (17)

The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 
AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)AI for Sustainable Development Goals (SDGs)
AI for Sustainable Development Goals (SDGs)
 
Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024SFBA Splunk Usergroup meeting March 13, 2024
SFBA Splunk Usergroup meeting March 13, 2024
 
YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.YourView Panel Book.pptx YourView Panel Book.
YourView Panel Book.pptx YourView Panel Book.
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
 
MEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .pptMEASURES OF DISPERSION I BSc Botany .ppt
MEASURES OF DISPERSION I BSc Botany .ppt
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 

Time Series Analysis: Theory and Practice

  • 1. TIME SERIES ANALYSIS: THEORY AND PRACTICE LMLP MEETUP
  • 2. TIME SERIES ANALYSIS:THEORY AND PRACTICE SOME HOUSEKEEPING ▸ Call for presenters over the summer period ▸ Please don’t use the CodeNode bar after the meetup since it’s booked for a private event - go to the pub across the road 2
  • 3. TIME SERIES ANALYSIS:THEORY AND PRACTICE DEFINITION OF TIME SERIES DATA ▸ Sequence of measurements (data points) - ▸ that follow non-random order (i.e. are successive) - ▸ taken over regular time intervals - ▸ usually with no more than one data point per interval (if there’s more than one data point - we call it multiple time series analysis and use slightly different approaches to modelling). 3
  • 4. TIME SERIES ANALYSIS:THEORY AND PRACTICE HOW ARE TIME SERIES DIFFERENT FROM OTHER TYPES OF DATA? ▸ Panel data ▸ Cross-sectional data ▸ Time series is a type of cross-sectional data set where one measurement is differentiated from another by time stamp only 4
  • 5. TIME SERIES ANALYSIS:THEORY AND PRACTICE APPLICATIONS ▸ Financial markets ▸ Weather forecasting ▸ Sales forecasting ▸ Signal processing ▸ Natural language processing 5
  • 6. TIME SERIES ANALYSIS:THEORY AND PRACTICE PROPERTIES OF TIME SERIES ▸ Seasonality ▸ Trending ▸ Cycles 6
  • 7. TIME SERIES ANALYSIS:THEORY AND PRACTICE TRENDING ▸ A trend exists when there is a long-term increase or decrease in the data. It does not have to be linear. A trend can “change direction” and, say, go from increasing to decreasing. ▸ Trends usually become visible when a linear function is fitted to the data. 7 Source: http://jcflowers1.iweb.bsu.edu/rlo/trends.htm
  • 8. TIME SERIES ANALYSIS:THEORY AND PRACTICE SEASONALITY AND CYCLES ▸ A seasonal pattern exists when a series is influenced by seasonal factors (e.g. the month of the year or day of the week). Seasonality is always of a fixed and of a known period. ▸ A cyclic pattern exists when data exhibit rises and falls that are not of fixed period. The duration of these fluctuations is usually of at least 2 years (e.g. economic cycles). ▸ What may seem to be a trend over a short period of time may be due to seasonality/cycle over a longer period of time. Always zoom in/zoom out when plotting your data! 8
  • 9. TIME SERIES ANALYSIS:THEORY AND PRACTICE WHAT DOES IT ALL LOOK LIKE ON A CHART? 9 Source: http://jcflowers1.iweb.bsu.edu/rlo/trends.htm
  • 10. TIME SERIES ANALYSIS:THEORY AND PRACTICE WHAT DOES IT ALL LOOK LIKE ON A CHART? 10 Source: http://jcflowers1.iweb.bsu.edu/rlo/trends.htm
  • 11. TIME SERIES ANALYSIS:THEORY AND PRACTICE WHAT DOES IT ALL LOOK LIKE ON A CHART? 11 Source: http://jcflowers1.iweb.bsu.edu/rlo/trends.htm
  • 12. TIME SERIES ANALYSIS:THEORY AND PRACTICE WHAT DOES IT ALL LOOK LIKE ON A CHART? 12
  • 13. TIME SERIES ANALYSIS:THEORY AND PRACTICE TESTING FOR TRENDS AND SEASONALITY ▸ Checking for seasonality: autocorrelation. ▸ Checking for trends: fit a simple curve or a rolling average and eyeball the chart. No proven automatic tests. Strong autocorrelation with the time period immediately preceding the measurement also suggests a trend component. 13
  • 14. TIME SERIES ANALYSIS:THEORY AND PRACTICE ON THE IMPORTANCE OF ASKING THE RIGHT QUESTIONS ▸ What are you trying to predict? ▸ Do you know how the measurements were taken? ▸ Do you have any missing values in the dataset? If yes, what do they represent? ▸ Do you need to adjust for seasonality or trend? ▸ What “shape” is your dataset? ▸ What are the assumptions being made? 14
  • 16. TIME SERIES ANALYSIS:THEORY AND PRACTICE NOW TO THE PRACTICE BIT ▸ You can’t use the same procedures to analyse snapshot and time series data. ▸ For example, you can’t randomly pick the data points that will be withheld for cross-validation and testing purposes. Why? ▸ Make sure to understand as much as possible about the underlying factors that affect the measurements. 16
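Randomly withholding points would let a model train on observations that come after the ones it is tested on. One common alternative is a chronological split, sketched below. The daily series and the 80/20 ratio are hypothetical and only there to make the example runnable.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series, used only to illustrate the split.
index = pd.date_range("2016-01-01", periods=100, freq="D")
series = pd.Series(np.arange(100, dtype=float), index=index)

# Hold out the most recent 20% of observations, keeping time order intact.
split_point = int(len(series) * 0.8)
train = series.iloc[:split_point]
test = series.iloc[split_point:]

# Every training timestamp precedes every test timestamp, so nothing from
# the "future" is available to the model while it is being fitted.
print(train.index.max(), test.index.min())
```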
  • 17. TIME SERIES ANALYSIS:THEORY AND PRACTICE PLOT, PLOT, THEN PLOT AGAIN ▸ Plotting your data will allow you to uncover the structure of the dataset, spot irregularities in the data and figure out which adjustments need to be made before proceeding with the modelling. ▸ Useful libraries: pandas, numpy, json, matplotlib.pyplot, pathlib, seaborn, scipy.stats, statsmodels. 17
  • 18. TIME SERIES ANALYSIS:THEORY AND PRACTICE TIPS AND TRICKS FOR PLOTTING ▸ Basic function: plot 18
  • 19. TIME SERIES ANALYSIS:THEORY AND PRACTICE TIPS AND TRICKS FOR PLOTTING ▸ Plotting multiple lines 19
  • 20. TIME SERIES ANALYSIS:THEORY AND PRACTICE TIPS AND TRICKS FOR PLOTTING ▸ Autocorrelation ▸ Use autocorrelation_plot from pandas.plotting (pandas.tools.plotting in older pandas versions) 20
  • 22. TIME SERIES ANALYSIS:THEORY AND PRACTICE TIPS AND TRICKS FOR PLOTTING ▸ Smoothing - linear and exponential ▸ To see the “bigger picture” you may want to look at a moving average of the input values. ▸ This is what they call “smoothing”. ▸ Linear smoothing gives equal weight to all the points it’s averaging over; exponential smoothing gives more weight to more recent points. ▸ Points taken as inputs by the moving average can be either centred around the original value or directly behind it. ▸ Use [ColumnName].rolling(window=[window size], center=True).mean().plot() to plot a rolling average. You can also replace mean with median. 22
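A small sketch of both kinds of smoothing with pandas. The noisy series, the window of 15 and the `span` of 15 are illustrative assumptions. `Series.ewm` is used here for the exponential case, which the slide does not name explicitly.

```python
import numpy as np
import pandas as pd

# Illustrative data: a slow upward trend hidden in noise.
rng = np.random.default_rng(1)
values = pd.Series(np.linspace(0, 10, 200) + rng.normal(0, 2, 200))

# Linear smoothing: equal weights over a 15-point window.
trailing = values.rolling(window=15).mean()               # window directly behind each point
centred = values.rolling(window=15, center=True).mean()   # window centred on each point
robust = values.rolling(window=15, center=True).median()  # median instead of mean

# Exponential smoothing: weights decay for older points.
exponential = values.ewm(span=15, adjust=False).mean()
```

Note that a trailing window leaves the first 14 values undefined, while a centred window leaves 7 undefined at each end. Calling `.plot()` on any of these series draws the smoothed line.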
  • 23. TIME SERIES ANALYSIS:THEORY AND PRACTICE TIPS AND TRICKS FOR PLOTTING ▸ For more plotting tools from pandas, visit ▸ http://pandas.pydata.org/pandas-docs/stable/visualization.html#visualization-autocorrelation ▸ http://pandas.pydata.org/pandas-docs/stable/computation.html#rolling-windows 23
  • 24. TIME SERIES ANALYSIS:THEORY AND PRACTICE DATA LOADING AND PREPROCESSING ▸ The data often comes in the form of multiple large csv files that need to be concatenated together for further processing or slicing. ▸ Here is a useful discussion on Stack Overflow covering this issue: http://stackoverflow.com/questions/25210819/speeding-up-data-import-function-pandas-and-appending-to-dataframe/25210900#25210900 ▸ A useful aside: to speed up processing, specify columns to import and their data type when you’re reading csv into a data frame - and you can specify different data types for different columns by using a dictionary: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html 24
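The loading tips above might look like this in practice. The column names and the in-memory strings standing in for files are hypothetical. With real data you would pass file paths to `pd.read_csv` instead of `io.StringIO` objects.

```python
import io
import pandas as pd

# Two small in-memory "files" stand in for large CSVs on disk (hypothetical data).
csv_parts = [
    "timestamp,store_id,sales,comment\n2016-01-01,1,10.5,a\n2016-01-02,1,11.0,b\n",
    "timestamp,store_id,sales,comment\n2016-01-03,1,9.5,c\n2016-01-04,1,12.0,d\n",
]

frames = [
    pd.read_csv(
        io.StringIO(part),
        usecols=["timestamp", "store_id", "sales"],       # skip columns we don't need
        dtype={"store_id": "int32", "sales": "float32"},  # per-column types via a dict
        parse_dates=["timestamp"],
    )
    for part in csv_parts
]

# Concatenate once at the end rather than growing a data frame inside the loop.
data = pd.concat(frames, ignore_index=True).set_index("timestamp").sort_index()
print(data.dtypes)
```

Collecting the frames in a list and concatenating once avoids repeatedly copying an ever larger data frame.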
  • 25. TIME SERIES ANALYSIS:THEORY AND PRACTICE MODELLING APPROACHES-ARMA ▸ ARMA: autoregressive moving average ▸ Example: http://statsmodels.sourceforge.net/devel/examples/notebooks/generated/tsa_arma.html ▸ ARMA models combine autoregressive and moving-average terms up to time t to predict the (t+1)-th term 25
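  • 26. TIME SERIES ANALYSIS:THEORY AND PRACTICE MODELLING APPROACHES-ARMA ▸ Autoregressive model of order p: $X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t$ ▸ c is a constant, φ are parameters, ε is the error term (white noise). ▸ Moving average model of order q: $X_t = \mu + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}$ ▸ μ is expectation of Xt, ε is again the error term, θ are parameters. ▸ Combined: $X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}$ 26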
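To make the combined model concrete, here is a minimal simulation of the standard ARMA(p, q) recursion in plain numpy. The function name `simulate_arma`, the parameter values and the burn-in length are all assumptions made for this sketch. For fitting real data, a library such as statsmodels is the more usual route.

```python
import numpy as np

def simulate_arma(phi, theta, c, n, rng):
    """Generate n values of X_t = c + eps_t + sum(phi_i X_{t-i}) + sum(theta_i eps_{t-i})."""
    p, q = len(phi), len(theta)
    burn_in = 200  # discarded so the arbitrary zero start has little influence
    eps = rng.normal(0.0, 1.0, n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(n + burn_in):
        ar_part = sum(phi[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma_part = sum(theta[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        x[t] = c + eps[t] + ar_part + ma_part
    return x[burn_in:], eps[burn_in:]

rng = np.random.default_rng(42)
x, eps = simulate_arma(phi=[0.6], theta=[0.4], c=1.0, n=2000, rng=rng)

# One-step-ahead forecast: the unknown future shock is replaced by its mean, 0.
forecast = 1.0 + 0.6 * x[-1] + 0.4 * eps[-1]
print(x.mean(), forecast)
```

With these values the long-run mean of the series is c / (1 - φ) = 2.5, which the sample mean should approach as n grows.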
  • 27. TIME SERIES ANALYSIS:THEORY AND PRACTICE MODELLING APPROACHES - ARMA ▸ Why do we combine AR and MA models? ▸ AR model assumes steady change and is poor for predicting sudden fluctuations. ▸ MA model takes error terms as an input which allows us to take into account sudden changes in output faster than AR model would have done on its own. ▸ Data doesn’t come with errors predefined - these are in fact extrapolated by first fitting a model like AR. See any issues? 27
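The point about errors not being observed can be illustrated with a two-stage least-squares sketch, in the spirit of the Hannan-Rissanen procedure. That procedure is not named in the slides, and is used here as one possible way of doing it. The simulated ARMA(1,1) data, the long-AR order `m = 10` and the helper `lagged` are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(7)
n, phi, theta = 5000, 0.6, 0.4

# Simulate an ARMA(1,1) series; in real data the shocks `eps` are never observed.
eps = rng.normal(0.0, 1.0, n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t] + theta * eps[t - 1]

def lagged(series, lags):
    """Stack series[t-1], ..., series[t-lags] as columns for t = lags..len-1."""
    return np.column_stack([series[lags - k: len(series) - k] for k in range(1, lags + 1)])

# Stage 1: fit a long AR model by least squares and keep its residuals
# as stand-ins for the unobserved error terms.
m = 10
ar_coefs, *_ = np.linalg.lstsq(lagged(x, m), x[m:], rcond=None)
resid = x[m:] - lagged(x, m) @ ar_coefs

# Stage 2: regress X_t on X_{t-1} and the estimated previous error.
y = x[m + 1:]
design = np.column_stack([x[m:-1], resid[:-1]])
(phi_hat, theta_hat), *_ = np.linalg.lstsq(design, y, rcond=None)
print(round(phi_hat, 2), round(theta_hat, 2))
```

The issue the slide hints at is visible here: the MA inputs are themselves estimates, so any misfit in the first stage carries over into the second.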
  • 28. TIME SERIES ANALYSIS:THEORY AND PRACTICE OTHER MODELLING APPROACHES ▸ Spectrum/Fourier analysis ▸ Attempts to decompose the function into a sum of sinusoidal waves. ▸ Main aim is to determine the length and amplitude of underlying cycles in cases where they are not immediately obvious. ▸ More useful for things like sun spot activity than sales forecasting (in the latter case seasonal component is easily guessed by just eyeballing the data). 28
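A short sketch of how Fourier analysis recovers the length and amplitude of a hidden cycle, using numpy's FFT. The signal is synthetic, and the 25-sample period, the amplitude of 2 and the noise level are assumptions chosen for illustration.

```python
import numpy as np

# Synthetic signal (an assumption): a cycle of 25 samples buried in noise.
rng = np.random.default_rng(3)
n = 500
t = np.arange(n)
signal = 2.0 * np.sin(2 * np.pi * t / 25) + rng.normal(0, 1, n)

# Magnitude spectrum of the mean-removed signal.
spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(n, d=1.0)  # cycles per sample

# Skip the zero-frequency bin, then convert the strongest frequency to a period.
peak = np.argmax(spectrum[1:]) + 1
dominant_period = 1.0 / freqs[peak]
amplitude = 2.0 * spectrum[peak] / n  # rescale the FFT magnitude to wave amplitude
print(dominant_period, amplitude)
```

Here the cycle length divides the series length exactly, so the peak falls on a single frequency bin. With real data the energy usually leaks across neighbouring bins and the estimate is coarser.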
  • 29. TIME SERIES ANALYSIS:THEORY AND PRACTICE LIMITATIONS OF STANDARD APPROACHES ▸ Difficulty capturing high level dependencies - additional rules typically have to be hardcoded. ▸ Can’t handle all of the possible data structures effectively. 29
  • 30. TIME SERIES ANALYSIS:THEORY AND PRACTICE PREDICTION HORIZON ▸ Why can’t we see far into the future? ▸ An interlude on chaos theory 30
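One way to see why the prediction horizon is limited is sensitivity to initial conditions. The sketch below uses the logistic map with r = 4, a textbook chaotic system chosen for this example and not taken from the slides. Two trajectories start a tiny distance apart and are compared step by step.

```python
# Logistic map x_{n+1} = r * x_n * (1 - x_n) with r = 4, a standard chaotic example.
def logistic_trajectory(x0, steps, r=4.0):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.2, 60)
b = logistic_trajectory(0.2 + 1e-10, 60)  # same start, perturbed by 1e-10

# Distance between the two trajectories at every step.
gaps = [abs(u - v) for u, v in zip(a, b)]
print(gaps[5], max(gaps))
```

The gap stays negligible for the first few steps and then grows until the two trajectories are unrelated, so even a tiny measurement error in the starting value makes long-range forecasts unreliable.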
  • 31. TIME SERIES ANALYSIS:THEORY AND PRACTICE NEURAL NETWORKS - A POSSIBLE ALTERNATIVE ▸ Neural network architectures can be modified to capture global dependencies (e.g. LSTM). ▸ Capable of both regression and classification, depending on the choice of activation function. ▸ Next time we will discuss 31
  • 32. TIME SERIES ANALYSIS:THEORY AND PRACTICE USEFUL LINKS ▸ https://documents.software.dell.com/statistics/textbook/time-series-analysis ▸ https://en.wikipedia.org/wiki/Time_series ▸ http://www.fil.ion.ucl.ac.uk/~wpenny/course/array.pdf ▸ https://en.wikipedia.org/wiki/Weather_forecasting ▸ https://www.otexts.org/fpp/6/1 ▸ http://pandas.pydata.org/pandas-docs/stable/cookbook.html#cookbook-plotting ▸ http://pandas.pydata.org/pandas-docs/stable/visualization.html ▸ http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html ▸ http://en.wikipedia.org/wiki/Autoregressive–moving-average_model ▸ http://jcflowers1.iweb.bsu.edu/rlo/trends.htm 32