Time Series Analysis in Python with statsmodels

                   Wes McKinney (1), Josef Perktold (2), Skipper Seabold (3)

                   (1) Department of Statistical Science, Duke University
                   (2) Department of Economics, University of North Carolina at Chapel Hill
                   (3) Department of Economics, American University


                       10th Python in Science Conference, 13 July 2011



McKinney, Perktold, Seabold (statsmodels)        Python Time Series Analysis          SciPy Conference 2011   1 / 29
What is statsmodels?




          A library for statistical modeling, implementing standard statistical
          models in Python using NumPy and SciPy
          Includes:
                  Linear (regression) models of many forms
                  Descriptive statistics
                  Statistical tests
                  Time series analysis
                  ...and much more




What is Time Series Analysis?




          Statistical modeling of time-ordered data observations
          Inferring structure, forecasting and simulation, and testing
          distributional assumptions about the data
          Modeling dynamic relationships among multiple time series
          Broad applications e.g. in economics, finance, neuroscience, signal
          processing...




Talk Overview



          Brief update on statsmodels development
          Aside: user interface and data structures
          Descriptive statistics and tests
          Auto-regressive moving average models (ARMA)
          Vector autoregression (VAR) models
          Filtering tools (Hodrick-Prescott and others)
          Near future: Bayesian dynamic linear models (DLMs), ARCH /
          GARCH volatility models and beyond




Statsmodels development update



          We’re now on GitHub! Join us:

                         http://github.com/statsmodels/statsmodels

          Check out the slick Sphinx docs:

                                http://statsmodels.sourceforge.net

          Development focus has been largely computational, i.e. writing
          correct, tested implementations of all the common classes of
          statistical models




Statsmodels development update




          Major work to be done on providing a nice integrated user interface
          We must work together to close the gap between R and Python!
          Some important areas:
                  Formula framework, for specifying model design matrices
                  Need integrated rich statistical data structures (pandas)
                  Data visualization of results should always be a few keystrokes away
                  Write a “Statsmodels for R users” guide




Aside: statistical data structures and user interface



          While I have a captive audience...
          Controversial fact: pandas is the only Python library currently
          providing data structures matching (and in many places exceeding)
          the richness of R’s data structures (for statistics)
                  Let’s have a BoF session so I can justify this statement
          Feedback I hear is that end users find the fragmented, incohesive set
          of Python tools for data analysis and statistics to be confusing,
          frustrating, and certainly not compelling them to use Python...
                  (Not to mention the packaging headaches)




Aside: statistical data structures and user interface




          We need to “commit” ASAP (not 12 months from now) to a high
          level data structure(s) as the “primary data structure(s) for statistical
          data analysis” and communicate that clearly to end users
                  Or we might as well all start programming in R...




Example data: EEG trace data


          [Figure: EEG trace plotted as a single time series]




Example data: Macroeconomic data


          [Figure: stacked time series panels for cpi, m1, and realgdp; x-axis in years, 1960 to 2008]




Example data: Stock data


          [Figure: stock data series for AAPL, GOOG, MSFT, and YHOO; x-axis in years, 2001 to 2009]




Descriptive statistics
            Autocorrelation, partial autocorrelation plots
            Commonly used for identification in ARMA(p,q) and ARIMA(p,d,q)
            models
            acf = tsa.acf(eeg, 50)
            pacf = tsa.pacf(eeg, 50)

          [Figure: two panels, "Autocorrelation" and "Partial Autocorrelation", for lags 0 to 50]
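As a rough illustration of what the two quantities on this slide measure, here is a minimal NumPy sketch. It is not the statsmodels implementation: the function names `sample_acf` and `sample_pacf` are hypothetical, the partial autocorrelations are obtained with the Durbin-Levinson recursion (one of several possible methods), and the input series is a synthetic stand-in rather than the EEG data from the talk.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations for lags 0..nlags (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom
                     for k in range(nlags + 1)])

def sample_pacf(x, nlags):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    rho = sample_acf(x, nlags)
    out = np.zeros(nlags + 1)
    out[0] = 1.0
    phi_prev = np.zeros(0)          # AR coefficients of the order k-1 fit
    for k in range(1, nlags + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = np.array([phi_kk])
        else:
            num = rho[k] - np.dot(phi_prev, rho[k - 1:0:-1])
            den = 1.0 - np.dot(phi_prev, rho[1:k])
            phi_kk = num / den
            phi = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        out[k] = phi_kk             # the last coefficient is the PACF at lag k
        phi_prev = phi
    return out

# Synthetic stand-in series (an assumption; not the EEG data from the talk).
rng = np.random.default_rng(0)
series = rng.standard_normal(400).cumsum()
acf_vals = sample_acf(series, 50)
pacf_vals = sample_pacf(series, 50)
print(acf_vals[:3], pacf_vals[:3])
```

A slowly decaying ACF together with a PACF that cuts off after a few lags is the usual hint toward a low-order autoregressive specification.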

Statistical tests




          Ljung-Box test for zero autocorrelation
          Unit root test for cointegration (Augmented Dickey-Fuller test)
          Granger-causality
          Whiteness (iid-ness) and normality
          See our conference paper (when the proceedings get published!)
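To make the first test in this list concrete, the following is a small NumPy sketch of the Ljung-Box Q statistic, Q = n(n+2) Σ ρ_k² / (n-k), summed over lags k = 1..h. The function name `ljung_box_q` is hypothetical and this is not the statsmodels routine. Under the null of zero autocorrelation Q is approximately chi-square with h degrees of freedom, so the sketch compares against the tabulated 5% critical value for 10 degrees of freedom (18.307) instead of computing a p-value.

```python
import numpy as np

def ljung_box_q(x, nlags):
    """Ljung-Box Q statistic using sample autocorrelations at lags 1..nlags."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    denom = np.dot(x, x)
    q = 0.0
    for k in range(1, nlags + 1):
        rho_k = np.dot(x[:-k], x[k:]) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q

# Synthetic inputs (assumptions for illustration only).
rng = np.random.default_rng(0)
noise = rng.standard_normal(500)     # should look uncorrelated
walk = noise.cumsum()                # strongly autocorrelated

CRIT_5PCT_DF10 = 18.307              # chi-square 95% quantile, 10 degrees of freedom
for name, data in [("noise", noise), ("walk", walk)]:
    q = ljung_box_q(data, 10)
    print(name, round(q, 2), "reject H0" if q > CRIT_5PCT_DF10 else "fail to reject H0")
```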




Autoregressive moving average (ARMA) models
          One of the most common univariate time series models:

                   $$y_t = \mu + a_1 y_{t-1} + \dots + a_p y_{t-p} + \epsilon_t + b_1 \epsilon_{t-1} + \dots + b_q \epsilon_{t-q}$$

                   where $E(\epsilon_t \epsilon_s) = 0$ for $t \neq s$ and $\epsilon_t \sim N(0, \sigma^2)$


          Exact log-likelihood can be evaluated via the Kalman filter, but the
          “conditional” likelihood is easier and commonly used
          statsmodels has tools for simulating ARMA processes with known
          coefficients $a_i$, $b_i$ and also estimation given specified lag orders
              import scikits.statsmodels.tsa.arima_process as ap
              ar_coef = [1, .75, -.25]; ma_coef = [1, -.5]
              nobs = 100
              y = ap.arma_generate_sample(ar_coef, ma_coef, nobs)
              y += 4 # add in constant
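For readers who want to see the ARMA recursion itself, here is a minimal sketch that simulates the process term by term from the equation on this slide. The function `simulate_arma` and its parameters are hypothetical and are not part of statsmodels. The sketch takes the coefficients a_i and b_j exactly as they appear in the equation, which may differ from the lag-polynomial convention expected by the library function, and it discards a burn-in period so that the arbitrary zero start-up values have little influence.

```python
import numpy as np

def simulate_arma(a, b, nobs, mu=0.0, sigma=1.0, burn=200, seed=0):
    """Simulate y_t = mu + sum_i a_i y_{t-i} + eps_t + sum_j b_j eps_{t-j}."""
    rng = np.random.default_rng(seed)
    p, q = len(a), len(b)
    total = nobs + burn
    eps = rng.normal(0.0, sigma, size=total)
    y = np.zeros(total)
    for t in range(total):
        # Terms that would reach before the start of the sample are dropped.
        ar_part = sum(a[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma_part = sum(b[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        y[t] = mu + ar_part + eps[t] + ma_part
    return y[burn:]

# ARMA(2, 1) with illustrative coefficients (assumed values).
# With this parameterization the stationary mean is mu / (1 - sum(a)) = 2 / 0.5 = 4.
y = simulate_arma(a=[0.75, -0.25], b=[-0.5], nobs=100, mu=2.0)
print(len(y), round(float(y.mean()), 2))
```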

ARMA Estimation



          Several likelihood-based estimators implemented (see docs)
              model = tsa.ARMA(y)
              result = model.fit(order=(2, 1), trend='c',
                                 method='css-mle', disp=-1)
              result.params
              # array([ 3.97, -0.97, -0.05, -0.13])


          Standard model diagnostics, standard errors, information criteria
          (AIC, BIC, ...), etc. are available in the returned ARMAResults object
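The "conditional" likelihood idea is easiest to see in the pure autoregressive case, where conditioning on the first p observations reduces Gaussian maximum likelihood to ordinary least squares on lagged values. The sketch below illustrates that special case only. It is not the `css-mle` estimator used above, it has no MA terms, and the name `fit_ar_cls` is hypothetical.

```python
import numpy as np

def fit_ar_cls(y, p):
    """Conditional least squares for an AR(p) model with a constant.

    Returns (params, sigma2), where params = [const, a_1, ..., a_p].
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Rows correspond to t = p..n-1; columns are [1, y_{t-1}, ..., y_{t-p}].
    X = np.column_stack([np.ones(n - p)] +
                        [y[p - i: n - i] for i in range(1, p + 1)])
    target = y[p:]
    params, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ params
    sigma2 = resid @ resid / len(target)
    return params, sigma2

# Simulate an AR(2) with assumed coefficients, then recover them.
rng = np.random.default_rng(0)
n = 2000
series = np.zeros(n)
for t in range(2, n):
    series[t] = 2.0 + 0.75 * series[t - 1] - 0.25 * series[t - 2] + rng.standard_normal()

params, sigma2 = fit_ar_cls(series, 2)
print(np.round(params, 2), round(float(sigma2), 2))
```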




Vector Autoregression (VAR) models



          Widely used model for modeling multiple ($K$-variate) time series,
          especially in macroeconomics:

                           $$Y_t = A_1 Y_{t-1} + \dots + A_p Y_{t-p} + \epsilon_t, \quad \epsilon_t \sim N(0, \Sigma)$$

          Matrices $A_i$ are $K \times K$.
          $Y_t$ must be a stationary process (sometimes achieved by
          differencing). Related class of models (VECM) for modeling
          nonstationary (including cointegrated) processes
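A VAR(p) can be estimated equation by equation with ordinary least squares, because every equation shares the same lagged regressors. The following NumPy sketch shows that idea. The function `fit_var_ols` is hypothetical and is not the statsmodels `VAR` class. It adds an intercept (which the equation above omits), and the degrees-of-freedom correction for the residual covariance is one common choice, not necessarily the one the library uses.

```python
import numpy as np

def fit_var_ols(Y, p):
    """Fit Y_t = c + A_1 Y_{t-1} + ... + A_p Y_{t-p} + eps_t by OLS.

    Y has shape (T, K). Returns (const, [A_1, ..., A_p], Sigma).
    """
    Y = np.asarray(Y, dtype=float)
    T, K = Y.shape
    # Regressor rows for t = p..T-1: [1, Y_{t-1}', ..., Y_{t-p}'].
    X = np.column_stack([np.ones(T - p)] +
                        [Y[p - i: T - i] for i in range(1, p + 1)])
    target = Y[p:]
    B, *_ = np.linalg.lstsq(X, target, rcond=None)      # shape (1 + K*p, K)
    resid = target - X @ B
    sigma = resid.T @ resid / (T - p - K * p - 1)        # df-corrected covariance
    const = B[0]
    # Each K x K block of B is A_i transposed.
    coefs = [B[1 + (i - 1) * K: 1 + i * K].T for i in range(1, p + 1)]
    return const, coefs, sigma

# Simulate a bivariate VAR(1) with assumed parameters and refit it.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
c_true = np.array([1.0, 2.0])
Y = np.zeros((1000, 2))
for t in range(1, 1000):
    Y[t] = c_true + A_true @ Y[t - 1] + 0.5 * rng.standard_normal(2)

const, coefs, sigma = fit_var_ols(Y, 1)
print(np.round(const, 2), np.round(coefs[0], 2))
```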




Vector Autoregression (VAR) models

   >>> model = VAR(data); model.select_order(8)
                    VAR Order Selection
   =====================================================
              aic          bic          fpe         hqic
   -----------------------------------------------------
   0       -27.83       -27.78    8.214e-13       -27.81
   1       -28.77       -28.57    3.189e-13       -28.69
   2       -29.00      -28.64*    2.556e-13       -28.85
   3       -29.10       -28.60    2.304e-13      -28.90*
   4       -29.09       -28.43    2.330e-13       -28.82
   5       -29.13       -28.33    2.228e-13       -28.81
   6      -29.14*       -28.18   2.213e-13*       -28.75
   7       -29.07       -27.96    2.387e-13       -28.62
   =====================================================
   * Minimum

Vector Autoregression (VAR) models

   >>> result = model.fit(2)
   >>> result.summary() # print summary for each variable
   <snip>
   Results for equation m1
   ====================================================
               coefficient    std. error t-stat    prob
   ----------------------------------------------------
   const          0.004968      0.001850   2.685 0.008
   L1.m1          0.363636      0.071307   5.100 0.000
   L1.realgdp    -0.077460      0.092975 -0.833 0.406
   L1.cpi        -0.052387      0.128161 -0.409 0.683
   L2.m1          0.250589      0.072050   3.478 0.001
   L2.realgdp    -0.085874      0.092032 -0.933 0.352
   L2.cpi         0.169803      0.128376   1.323 0.188
   ====================================================
   <snip>


Vector Autoregression (VAR) models




   >>> result = model.fit(2)
   >>> result.summary() # print summary for each variable
   <snip>
   Correlation matrix of residuals
                    m1   realgdp       cpi
   m1         1.000000 -0.055690 -0.297494
   realgdp   -0.055690  1.000000  0.115597
   cpi       -0.297494  0.115597  1.000000




VAR: Impulse Response analysis
          Analyze systematic impact of unit “shock” to a single variable

   irf = result.irf(10)
   irf.plot()

          [Figure: Impulse responses; 3 x 3 grid of panels (m1 → m1, realgdp → m1, cpi → m1, m1 → realgdp, realgdp → realgdp, cpi → realgdp, m1 → cpi, realgdp → cpi, cpi → cpi) over 10 periods]
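The impulse responses of a stationary VAR are the coefficient matrices of its moving average representation, which follow from a simple recursion: Φ_0 = I and Φ_i = Σ_{j=1..min(i,p)} Φ_{i-j} A_j. The sketch below implements that recursion in NumPy. The function `impulse_responses` is hypothetical and is not what `result.irf` runs internally. It returns non-orthogonalized responses, where entry [i, j, k] is the response of variable j, i periods after a unit shock to variable k.

```python
import numpy as np

def impulse_responses(coefs, periods):
    """MA(infinity) matrices Phi_0..Phi_periods of a VAR with lag matrices coefs."""
    K = coefs[0].shape[0]
    p = len(coefs)
    phis = [np.eye(K)]
    for i in range(1, periods + 1):
        phi = np.zeros((K, K))
        for j in range(1, min(i, p) + 1):
            phi += phis[i - j] @ coefs[j - 1]
        phis.append(phi)
    return np.array(phis)            # shape (periods + 1, K, K)

# Illustrative VAR(2) coefficient matrices (assumed values).
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A2 = np.array([[0.1, 0.0], [0.05, 0.1]])
irf = impulse_responses([A1, A2], 10)
print(np.round(irf[1], 3))
print(np.round(irf[2], 3))
```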



VAR: Forecast Error Variance Decomposition
          Analyze contribution of each variable to forecasting error

   fevd = result.fevd(20)
   fevd.plot()

          [Figure: Forecast error variance decomposition (FEVD); one panel each for m1, realgdp, and cpi, showing the shares attributed to m1, realgdp, and cpi over 20 periods]
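A forecast error variance decomposition can be sketched from orthogonalized impulse responses. Assuming a Cholesky factorization Σ = P P', the orthogonalized responses are Θ_i = Φ_i P, and the share of variable j's h-step forecast error variance attributed to shock k is the cumulated sum of Θ_i[j, k]² divided by the same sum taken over all shocks. The function `fevd_shares` below is hypothetical and is not the code behind `result.fevd`. The result depends on the ordering of the variables, as with any Cholesky-based decomposition.

```python
import numpy as np

def fevd_shares(coefs, sigma, periods):
    """Return an array d[h, j, k]: share of variable j's (h+1)-step forecast
    error variance attributed to orthogonalized shock k."""
    sigma = np.asarray(sigma, dtype=float)
    K = sigma.shape[0]
    p = len(coefs)
    # Non-orthogonalized MA matrices Phi_0..Phi_{periods-1}.
    phis = [np.eye(K)]
    for i in range(1, periods):
        phis.append(sum(phis[i - j] @ coefs[j - 1]
                        for j in range(1, min(i, p) + 1)))
    P = np.linalg.cholesky(sigma)                     # Sigma = P P'
    thetas = np.array([phi @ P for phi in phis])      # orthogonalized responses
    contrib = np.cumsum(thetas ** 2, axis=0)          # cumulate over horizons
    total = contrib.sum(axis=2, keepdims=True)        # forecast MSE per variable
    return contrib / total

# Illustrative inputs (assumed values).
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
shares = fevd_shares([A1], sigma, 20)
print(np.round(shares[0], 3))
print(np.round(shares[-1], 3))
```

Each row of `shares[h]` sums to one, which is what the stacked panels in the plot above display.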



VAR: Statistical tests



   In [137]: result.test_causality('m1', ['cpi', 'realgdp'])
   Granger causality f-test
   =========================================================
      Test statistic   Critical Value      p-value        df
   ---------------------------------------------------------
            1.248787         2.387325        0.289 (4, 579)
   =========================================================
   H_0: ['cpi', 'realgdp'] do not Granger-cause m1
   Conclusion: fail to reject H_0 at 5.00% significance level




Filtering

          Hodrick-Prescott (HP) filter separates a time series $y_t$ into a trend $\tau_t$
          and a cyclical component $\zeta_t$, so that $y_t = \tau_t + \zeta_t$.

          [Figure: Inflation with its HP-filtered cyclical component and trend component; x-axis in years, 1962 to 2006]
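The slide does not state the objective, so the sketch below assumes the standard HP formulation: choose the trend τ to minimize Σ (y_t - τ_t)² + λ Σ (τ_{t+1} - 2τ_t + τ_{t-1})², which has the closed-form solution τ = (I + λ D'D)⁻¹ y with D the second-difference operator. The function `hp_filter` is hypothetical, uses dense matrices for clarity (not efficiency), and defaults to λ = 1600, the value conventionally used for quarterly data.

```python
import numpy as np

def hp_filter(y, lamb=1600.0):
    """Return (cycle, trend) with y = trend + cycle."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Second-difference operator D of shape (n - 2, n).
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(n) + lamb * (D.T @ D), y)
    cycle = y - trend
    return cycle, trend

# Synthetic series: slow drift plus a faster oscillation (assumed data).
t = np.arange(120.0)
series = 0.05 * t + np.sin(2 * np.pi * t / 20)
cycle, trend = hp_filter(series)
print(np.round(trend[:3], 3), np.round(cycle[:3], 3))
```

A larger λ penalizes curvature more heavily and yields a smoother trend. A perfectly linear series has zero second differences, so it is returned entirely as trend.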

Filtering

          In addition to the HP filter, two other filters popular in finance and
          economics, Baxter-King (BK) and Christiano-Fitzgerald (CF), are available
          We refer you to our paper and the documentation for details on these:

          [Figure: two panels, "Inflation and Unemployment: BK Filtered" and "Inflation and Unemployment: CF Filtered", each showing the INFL and UNEMP series]
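As a rough idea of how the Baxter-King filter works: it approximates an ideal band-pass filter with a symmetric moving average of length 2K+1, then shifts the weights so that they sum to zero and the filter removes constant and linear trends. The sketch below is a NumPy illustration under those assumptions, not the statsmodels routine. The names `bk_weights` and `bk_filter` are hypothetical, and the defaults (periods between 6 and 32 observations, K = 12) are the values commonly quoted for quarterly business-cycle work.

```python
import numpy as np

def bk_weights(low=6, high=32, K=12):
    """Symmetric Baxter-King band-pass weights of length 2K + 1."""
    w1 = 2 * np.pi / high            # lower cutoff frequency (longest period kept)
    w2 = 2 * np.pi / low             # upper cutoff frequency (shortest period kept)
    j = np.arange(1, K + 1)
    side = (np.sin(w2 * j) - np.sin(w1 * j)) / (np.pi * j)
    b = np.empty(2 * K + 1)
    b[K] = (w2 - w1) / np.pi         # ideal band-pass weight at lag 0
    b[K + 1:] = side
    b[:K] = side[::-1]
    b -= b.mean()                    # force the weights to sum to zero
    return b

def bk_filter(y, low=6, high=32, K=12):
    """Apply the filter; K observations are lost at each end of the sample."""
    y = np.asarray(y, dtype=float)
    return np.convolve(y, bk_weights(low, high, K), mode="valid")

# Synthetic series: linear trend plus a 16-period cycle (assumed data).
t = np.arange(200.0)
series = 0.1 * t + np.sin(2 * np.pi * t / 16)
filtered = bk_filter(series)
print(len(filtered), np.round(filtered[:3], 3))
```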




Preview: Bayesian dynamic linear models (DLM)



          A state space model by another name:

                                      $$y_t = F_t \theta_t + \nu_t, \quad \nu_t \sim N(0, V_t)$$
                                      $$\theta_t = G \theta_{t-1} + \omega_t, \quad \omega_t \sim N(0, W_t)$$

          Estimation of basic model by Kalman filter recursions. Provides
          elegant way to do time-varying linear regressions for forecasting
          Extensions: multivariate DLMs, stochastic volatility (SV) models,
          MCMC-based posterior sampling, mixtures of DLMs
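The Kalman filter recursions for this model can be sketched in a few lines. The version below is a simplified illustration, not the DLM code previewed in the talk: it assumes a scalar observation, constant F, G, V, and W (the equations above allow time-varying F_t, V_t, and W_t), and known variances. The function name `kalman_filter` and all argument names are hypothetical.

```python
import numpy as np

def kalman_filter(y, F, G, V, W, m0, C0):
    """Filtered state means and covariances for a constant-parameter DLM.

    F is the (1, n) observation row, G the (n, n) state transition matrix,
    V the scalar observation variance, W the (n, n) state noise covariance.
    """
    F = np.atleast_2d(F).astype(float)
    G = np.atleast_2d(G).astype(float)
    W = np.atleast_2d(W).astype(float)
    m = np.asarray(m0, dtype=float).reshape(-1)
    C = np.atleast_2d(C0).astype(float)
    means, covs = [], []
    for obs in y:
        a = G @ m                          # prior state mean
        R = G @ C @ G.T + W                # prior state covariance
        f = (F @ a).item()                 # one-step-ahead forecast
        Q = (F @ R @ F.T).item() + V       # forecast variance
        A = (R @ F.T).reshape(-1) / Q      # Kalman gain
        m = a + A * (obs - f)              # posterior state mean
        C = R - np.outer(A, A) * Q         # posterior state covariance
        means.append(m)
        covs.append(C)
    return np.array(means), np.array(covs)

# Local linear trend (level + slope) on synthetic data; all values are assumptions.
rng = np.random.default_rng(0)
obs = 10.0 + 0.5 * np.arange(100) + rng.standard_normal(100)
F = [[1.0, 0.0]]
G = [[1.0, 1.0], [0.0, 1.0]]
means, covs = kalman_filter(obs, F, G, V=1.0, W=0.01 * np.eye(2),
                            m0=[0.0, 0.0], C0=100.0 * np.eye(2))
print(np.round(means[-1], 2))
```

With W set to zero and a diffuse prior, the filtered level of a one-dimensional model collapses to the running mean of the observations, which is a handy sanity check.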




Preview: DLM Example (Constant+Trend model)

   model = Polynomial(2)
   dlm = DLM(close_px['AAPL'], model.F, G=model.G, # model
             m0=m0, C0=C0, n0=n0, s0=s0, # priors
             state_discount=.95) # discount factor
          [Figure: Constant + Trend DLM; x-axis from Nov 2008 to Nov 2009]

Preview: Stochastic volatility models


          [Figure: JPY-USD Exchange Rate Volatility Process; x-axis from 0 to 1000]



Future: sandbox and beyond




          ARCH / GARCH models for volatility
          Structural VAR and error correction models (ECM) for cointegrated
          processes
          Models with non-normally distributed errors
          Better data description, visualization, and interactive research tools
          More sophisticated Bayesian time series models




Conclusions




          We’ve implemented many foundational models for time series
          analysis, but the field is very broad
          User interface can and should be much improved
          Repo: http://github.com/statsmodels/statsmodels
          Docs: http://statsmodels.sourceforge.net
          Contact: pystatsmodels@googlegroups.com





More Related Content

What's hot

Data Analysis and Statistics in Python using pandas and statsmodels
Data Analysis and Statistics in Python using pandas and statsmodelsData Analysis and Statistics in Python using pandas and statsmodels
Data Analysis and Statistics in Python using pandas and statsmodelsWes McKinney
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Exploratory data analysis with PythonDavis David
 
Exploratory data analysis in R - Data Science Club
Exploratory data analysis in R - Data Science ClubExploratory data analysis in R - Data Science Club
Exploratory data analysis in R - Data Science ClubMartin Bago
 
Machine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionMachine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionGianluca Bontempi
 
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...
Time Series Analysis - 2 | Time Series in R | ARIMA Model Forecasting | Data ...Simplilearn
 
Auto-Train a Time-Series Forecast Model With AML + ADB
Auto-Train a Time-Series Forecast Model With AML + ADBAuto-Train a Time-Series Forecast Model With AML + ADB
Auto-Train a Time-Series Forecast Model With AML + ADBDatabricks
 
Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...
Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...
Time Series Analysis - 1 | Time Series in R | Time Series Forecasting | Data ...Simplilearn
 
Introduction to predictive modeling v1
Introduction to predictive modeling v1Introduction to predictive modeling v1
Introduction to predictive modeling v1Venkata Reddy Konasani
 
Predictive Analytics - An Overview
Predictive Analytics - An OverviewPredictive Analytics - An Overview
Predictive Analytics - An OverviewMachinePulse
 
Prepare your data for machine learning
Prepare your data for machine learningPrepare your data for machine learning
Prepare your data for machine learningIvo Andreev
 
Exploratory data analysis
Exploratory data analysis Exploratory data analysis
Exploratory data analysis Peter Reimann
 
What is ARIMA Forecasting and How Can it Be Used for Enterprise Analysis?
What is ARIMA Forecasting and How Can it Be Used for Enterprise Analysis?What is ARIMA Forecasting and How Can it Be Used for Enterprise Analysis?
What is ARIMA Forecasting and How Can it Be Used for Enterprise Analysis?Smarten Augmented Analytics
 
Time Series Analysis: Theory and Practice
Time Series Analysis: Theory and PracticeTime Series Analysis: Theory and Practice
Time Series Analysis: Theory and PracticeTetiana Ivanova
 
Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...
Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...
Scikit-Learn Tutorial | Machine Learning With Scikit-Learn | Sklearn | Python...Simplilearn
 
Data Exploration, Validation and Sanitization
Data Exploration, Validation and SanitizationData Exploration, Validation and Sanitization
Data Exploration, Validation and SanitizationVenkata Reddy Konasani
 

What's hot (20)

Data Analysis and Statistics in Python using pandas and statsmodels
Data Analysis and Statistics in Python using pandas and statsmodelsData Analysis and Statistics in Python using pandas and statsmodels
Data Analysis and Statistics in Python using pandas and statsmodels
 
Exploratory data analysis with Python
Exploratory data analysis with PythonExploratory data analysis with Python
Scipy 2011 Time Series Analysis in Python

  • 1. Time Series Analysis in Python with statsmodels
      Wes McKinney (1), Josef Perktold (2), Skipper Seabold (3)
      (1) Department of Statistical Science, Duke University
      (2) Department of Economics, University of North Carolina at Chapel Hill
      (3) Department of Economics, American University
      10th Python in Science Conference, 13 July 2011
      McKinney, Perktold, Seabold (statsmodels), Python Time Series Analysis, SciPy Conference 2011, 1 / 29
  • 2. What is statsmodels?
      • A library for statistical modeling, implementing standard statistical models in Python using NumPy and SciPy
      • Includes:
          • Linear (regression) models of many forms
          • Descriptive statistics
          • Statistical tests
          • Time series analysis
          • ...and much more
  • 3. What is Time Series Analysis?
      • Statistical modeling of time-ordered data observations
      • Inferring structure, forecasting and simulation, and testing distributional assumptions about the data
      • Modeling dynamic relationships among multiple time series
      • Broad applications, e.g. in economics, finance, neuroscience, signal processing...
  • 4. Talk Overview
      • Brief update on statsmodels development
      • Aside: user interface and data structures
      • Descriptive statistics and tests
      • Auto-regressive moving average models (ARMA)
      • Vector autoregression (VAR) models
      • Filtering tools (Hodrick-Prescott and others)
      • Near future: Bayesian dynamic linear models (DLMs), ARCH / GARCH volatility models and beyond
  • 5. Statsmodels development update
      • We’re now on GitHub! Join us: http://github.com/statsmodels/statsmodels
      • Check out the slick Sphinx docs: http://statsmodels.sourceforge.net
      • Development focus has been largely computational, i.e. writing correct, tested implementations of all the common classes of statistical models
  • 6. Statsmodels development update
      • Major work to be done on providing a nice integrated user interface
      • We must work together to close the gap between R and Python!
      • Some important areas:
          • Formula framework, for specifying model design matrices
          • Need integrated rich statistical data structures (pandas)
          • Data visualization of results should always be a few keystrokes away
          • Write a “Statsmodels for R users” guide
  • 7. Aside: statistical data structures and user interface
      • While I have a captive audience...
      • Controversial fact: pandas is the only Python library currently providing data structures matching (and in many places exceeding) the richness of R’s data structures (for statistics)
          • Let’s have a BoF session so I can justify this statement
      • Feedback I hear is that end users find the fragmented, incohesive set of Python tools for data analysis and statistics to be confusing, frustrating, and certainly not compelling them to use Python...
          • (Not to mention the packaging headaches)
  • 8. Aside: statistical data structures and user interface
      • We need to “commit” ASAP (not 12 months from now) to a high level data structure(s) as the “primary data structure(s) for statistical data analysis” and communicate that clearly to end users
      • Or we might as well all start programming in R...
  • 9. Example data: EEG trace data
      [Figure: line plot of the EEG trace]
  • 10. Example data: Macroeconomic data
      [Figure: three stacked panels showing the cpi, m1, and realgdp series; x-axis in years, 1960 to 2008]
  • 11. Example data: Stock data
      [Figure: line plot of the AAPL, GOOG, MSFT, and YHOO series; x-axis in years, 2001 to 2009]
  • 12. Descriptive statistics
      • Autocorrelation, partial autocorrelation plots
      • Commonly used for identification in ARMA(p,q) and ARIMA(p,d,q) models

        acf = tsa.acf(eeg, 50)
        pacf = tsa.pacf(eeg, 50)

      [Figure: two panels, "Autocorrelation" and "Partial Autocorrelation", for lags 0 to 50]
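
To make the quantity plotted on this slide concrete, here is a minimal sketch of a sample autocorrelation function written with only the standard library. The function name `sample_acf` and the 1/n normalization are assumptions chosen for illustration; this is not the statsmodels implementation behind `tsa.acf`.

```python
def sample_acf(x, nlags):
    """Sample autocorrelations r_0..r_nlags (hypothetical helper).

    Uses the 1/n normalization for every lag, which is an assumption
    of this sketch rather than something stated on the slide.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n  # lag-0 autocovariance
    acf = []
    for k in range(nlags + 1):
        # lag-k autocovariance, then scale by the variance
        ck = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        acf.append(ck / c0)
    return acf


# Toy series that alternates between +1 and -1
series = [1.0, -1.0] * 10
r = sample_acf(series, 3)
```

For the alternating toy series the autocorrelations alternate in sign and shrink slowly with the lag, which is the kind of pattern the autocorrelation plot is meant to expose.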
  • 13. Statistical tests
      • Ljung-Box test for zero autocorrelation
      • Unit root test for cointegration (Augmented Dickey-Fuller test)
      • Granger-causality
      • Whiteness (iid-ness) and normality
      • See our conference paper (when the proceedings get published!)
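
As an illustration of the first test listed above, the Ljung-Box statistic for h lags is Q = n(n+2) Σ_{k=1..h} r_k² / (n−k), where r_k is the lag-k sample autocorrelation. The sketch below computes only the statistic, using the standard library; the function name `ljung_box_q` is hypothetical, and the p-value (from a chi-square distribution with h degrees of freedom under the null of no autocorrelation) is left out.

```python
def ljung_box_q(x, h):
    """Ljung-Box Q statistic for lags 1..h (hypothetical helper)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # the 1/n factors cancel in r_k
    q = 0.0
    for k in range(1, h + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q


# A strongly autocorrelated toy series gives a large Q
q_stat = ljung_box_q([1.0, -1.0] * 10, h=2)
```

A large Q relative to the chi-square reference distribution is evidence against the hypothesis of zero autocorrelation.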
  • 14. Autoregressive moving average (ARMA) models
      • One of the most common univariate time series models:
        $y_t = \mu + a_1 y_{t-1} + \dots + a_p y_{t-p} + \epsilon_t + b_1 \epsilon_{t-1} + \dots + b_q \epsilon_{t-q}$
        where $E(\epsilon_t \epsilon_s) = 0$ for $t \neq s$ and $\epsilon_t \sim N(0, \sigma^2)$
      • Exact log-likelihood can be evaluated via the Kalman filter, but the “conditional” likelihood is easier and commonly used
      • statsmodels has tools for simulating ARMA processes with known coefficients $a_i$, $b_i$ and also estimation given specified lag orders

        import scikits.statsmodels.tsa.arima_process as ap
        # lag-polynomial coefficients, each starting with the lag-0 term
        ar_coef = [1, .75, -.25]
        ma_coef = [1, -.5]
        nobs = 100
        y = ap.arma_generate_sample(ar_coef, ma_coef, nobs)
        y += 4  # add in constant
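
The recursion behind simulating an ARMA process can be sketched with the standard library alone. This is not the statsmodels code; the function `simulate_arma`, its `seed` argument, and the zero pre-sample values are assumptions of the sketch. It follows the lag-polynomial convention used by `arma_generate_sample`, in which `ar_coef = [1, c_1, ..., c_p]` means y_t + c_1 y_{t-1} + ... + c_p y_{t-p} on the left-hand side, so the coefficients in the slide's equation are a_i = −c_i.

```python
import random


def simulate_arma(ar_poly, ma_poly, nobs, sigma=1.0, seed=0):
    """Simulate ARMA data from lag polynomials (hypothetical helper).

    ar_poly = [1, c1, ..., cp] and ma_poly = [1, b1, ..., bq] define
        y_t + c1*y_{t-1} + ... + cp*y_{t-p}
            = e_t + b1*e_{t-1} + ... + bq*e_{t-q}
    Values before the start of the sample are taken to be zero.
    """
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma) for _ in range(nobs)]
    y = []
    for t in range(nobs):
        # moving-average part: weighted sum of current and past shocks
        value = sum(ma_poly[j] * eps[t - j]
                    for j in range(len(ma_poly)) if t - j >= 0)
        # autoregressive part: move past y terms to the right-hand side
        value -= sum(ar_poly[i] * y[t - i]
                     for i in range(1, len(ar_poly)) if t - i >= 0)
        y.append(value)
    return y, eps


y, eps = simulate_arma([1, .75, -.25], [1, -.5], nobs=100)
y = [v + 4 for v in y]  # add in constant, as on the slide
```

Returning the shocks alongside the series makes it easy to check the recursion by hand for the first few observations.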
  • 15. ARMA Estimation
      • Several likelihood-based estimators implemented (see docs)

        model = tsa.ARMA(y)
        result = model.fit(order=(2, 1), trend='c',
                           method='css-mle', disp=-1)
        result.params
        # array([ 3.97, -0.97, -0.05, -0.13])

      • Standard model diagnostics, standard errors, information criteria (AIC, BIC, ...), etc. are available in the returned ARMAResults object
  • 16. Vector Autoregression (VAR) models
      • Widely used model for modeling multiple (K-variate) time series, especially in macroeconomics:
        $Y_t = A_1 Y_{t-1} + \dots + A_p Y_{t-p} + \epsilon_t, \quad \epsilon_t \sim N(0, \Sigma)$
      • Matrices $A_i$ are $K \times K$.
      • $Y_t$ must be a stationary process (sometimes achieved by differencing). Related class of models (VECM) for modeling nonstationary (including cointegrated) processes
  • 17. Vector Autoregression (VAR) models

      >>> model = VAR(data); model.select_order(8)
                       VAR Order Selection
      =====================================================
                 aic          bic          fpe         hqic
      -----------------------------------------------------
      0       -27.83       -27.78    8.214e-13       -27.81
      1       -28.77       -28.57    3.189e-13       -28.69
      2       -29.00      -28.64*    2.556e-13       -28.85
      3       -29.10       -28.60    2.304e-13      -28.90*
      4       -29.09       -28.43    2.330e-13       -28.82
      5       -29.13       -28.33    2.228e-13       -28.81
      6      -29.14*       -28.18   2.213e-13*       -28.75
      7       -29.07       -27.96    2.387e-13       -28.62
      =====================================================
      * Minimum
  • 18. Vector Autoregression (VAR) models

      >>> result = model.fit(2)
      >>> result.summary()  # print summary for each variable
      <snip>
      Results for equation m1
      ====================================================
                   coefficient  std. error   t-stat   prob
      ----------------------------------------------------
      const           0.004968    0.001850    2.685  0.008
      L1.m1           0.363636    0.071307    5.100  0.000
      L1.realgdp     -0.077460    0.092975   -0.833  0.406
      L1.cpi         -0.052387    0.128161   -0.409  0.683
      L2.m1           0.250589    0.072050    3.478  0.001
      L2.realgdp     -0.085874    0.092032   -0.933  0.352
      L2.cpi          0.169803    0.128376    1.323  0.188
      ====================================================
      <snip>
  • 19. Vector Autoregression (VAR) models

      >>> result = model.fit(2)
      >>> result.summary()  # print summary for each variable
      <snip>
      Correlation matrix of residuals
                      m1    realgdp        cpi
      m1        1.000000  -0.055690  -0.297494
      realgdp  -0.055690   1.000000   0.115597
      cpi      -0.297494   0.115597   1.000000
  • 20. VAR: Impulse Response analysis
      • Analyze systematic impact of unit “shock” to a single variable

        irf = result.irf(10)
        irf.plot()

      [Figure: "Impulse responses", a 3 × 3 grid of panels (m1 → m1, realgdp → m1, cpi → m1; m1 → realgdp, realgdp → realgdp, cpi → realgdp; m1 → cpi, realgdp → cpi, cpi → cpi) over horizons 0 to 10]
  • 21. VAR: Forecast Error Variance Decomposition
      • Analyze contribution of each variable to forecasting error

        fevd = result.fevd(20)
        fevd.plot()

      [Figure: "Forecast error variance decomposition (FEVD)", one panel each for m1, realgdp, and cpi, with legend entries m1, realgdp, and cpi, over horizons 0 to 20]
  • 22. VAR: Statistical tests

      In [137]: result.test_causality('m1', ['cpi', 'realgdp'])
      Granger causality f-test
      =========================================================
         Test statistic   Critical Value   p-value           df
      ---------------------------------------------------------
               1.248787         2.387325     0.289     (4, 579)
      =========================================================
      H_0: ['cpi', 'realgdp'] do not Granger-cause m1
      Conclusion: fail to reject H_0 at 5.00% significance level
  • 23. Filtering
      • Hodrick-Prescott (HP) filter separates a time series $y_t$ into a trend $\tau_t$ and a cyclical component $\zeta_t$, so that $y_t = \tau_t + \zeta_t$.
      [Figure: "Inflation" plotted with its "Cyclical component" and "Trend component"; x-axis in years, 1962 to 2006]
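
The slide gives only the decomposition, so the following is a sketch under the standard definition of the HP filter, which the slide does not spell out: the trend minimizes Σ (y_t − τ_t)² + λ Σ (τ_{t+1} − 2τ_t + τ_{t−1})², which leads to the linear system (I + λ DᵀD) τ = y with D the second-difference matrix. The function name `hp_filter`, the dense Gaussian elimination, and the default λ = 1600 (the conventional choice for quarterly data) are assumptions of this illustration, not the statsmodels implementation.

```python
def hp_filter(y, lamb=1600.0):
    """Return (cycle, trend) for the HP filter (hypothetical helper).

    Solves (I + lamb * D'D) trend = y, where D is the (n-2) x n
    second-difference matrix. Dense elimination keeps the sketch
    short, so it is only suitable for short series.
    """
    n = len(y)
    # Build A = I + lamb * D'D one second-difference row at a time
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    weights = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for p in range(3):
            for q in range(3):
                a[r + p][r + q] += lamb * weights[p] * weights[q]
    # Gaussian elimination with partial pivoting
    b = [float(v) for v in y]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            if factor != 0.0:
                for c in range(col, n):
                    a[row][c] -= factor * a[col][c]
                b[row] -= factor * b[col]
    # Back substitution
    trend = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(a[row][c] * trend[c] for c in range(row + 1, n))
        trend[row] = (b[row] - s) / a[row][row]
    cycle = [yi - ti for yi, ti in zip(y, trend)]
    return cycle, trend


# A straight line has zero second differences, so it is all trend
line = [2.0 + 0.5 * t for t in range(12)]
line_cycle, line_trend = hp_filter(line)

# A bumpier toy series splits into a smooth trend plus a cycle
bumpy = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
bumpy_cycle, bumpy_trend = hp_filter(bumpy)
```

Larger values of λ penalize curvature more heavily and push the trend toward a straight line, while λ = 0 returns the series itself as the trend.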
  • 24. Filtering
      • In addition to the HP filter, two other filters popular in finance and economics, Baxter-King and Christiano-Fitzgerald, are available
      • We refer you to our paper and the documentation for details on these
      [Figure: two panels, "Inflation and Unemployment: BK Filtered" and "Inflation and Unemployment: CF Filtered", each showing the INFL and UNEMP series]
  • 25. Preview: Bayesian dynamic linear models (DLM)
      • A state space model by another name:
        $y_t = F_t \theta_t + \nu_t, \quad \nu_t \sim N(0, V_t)$
        $\theta_t = G \theta_{t-1} + \omega_t, \quad \omega_t \sim N(0, W_t)$
      • Estimation of basic model by Kalman filter recursions. Provides elegant way to do time-varying linear regressions for forecasting
      • Extensions: multivariate DLMs, stochastic volatility (SV) models, MCMC-based posterior sampling, mixtures of DLMs
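
The Kalman filter recursions mentioned on this slide can be sketched for the simplest special case, a scalar local-level model with F_t = 1, G = 1 and constant, known variances V and W. The function name `local_level_filter` and these simplifications are assumptions made for illustration; the discount-factor and unknown-variance machinery hinted at in the talk's DLM preview is not covered.

```python
def local_level_filter(y, V, W, m0=0.0, C0=1e6):
    """Kalman filter for a scalar local-level DLM (hypothetical helper).

    Model: y_t = theta_t + nu_t,          nu_t ~ N(0, V)
           theta_t = theta_{t-1} + omega_t, omega_t ~ N(0, W)
    with prior theta_0 ~ N(m0, C0).
    """
    m, C = m0, C0
    means, variances = [], []
    for obs in y:
        # Prior for theta_t given data through t-1
        a = m
        R = C + W
        # One-step-ahead forecast of y_t and its variance
        f = a
        Q = R + V
        # Posterior update using the Kalman gain
        gain = R / Q
        m = a + gain * (obs - f)
        C = R - gain * gain * Q
        means.append(m)
        variances.append(C)
    return means, variances


means, variances = local_level_filter(
    [1.0, 1.2, 0.9, 1.1], V=1.0, W=0.1, m0=0.0, C0=1.0)
```

Each pass through the loop is one predict and update step; the filtered means track the level of the series and the posterior variances shrink as observations accumulate.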
  • 26. Preview: DLM Example (Constant+Trend model)

        model = Polynomial(2)
        dlm = DLM(close_px['AAPL'], model.F, G=model.G,  # model
                  m0=m0, C0=C0, n0=n0, s0=s0,            # priors
                  state_discount=.95)                    # discount factor

      [Figure: "Constant + Trend DLM"; x-axis from Nov 2008 to Nov 2009]
  • 27. Preview: Stochastic volatility models
      [Figure: "JPY-USD Exchange Rate Volatility Process"; x-axis from 0 to 1000]
  • 28. Future: sandbox and beyond
      • ARCH / GARCH models for volatility
      • Structural VAR and error correction models (ECM) for cointegrated processes
      • Models with non-normally distributed errors
      • Better data description, visualization, and interactive research tools
      • More sophisticated Bayesian time series models
  • 29. Conclusions
      • We’ve implemented many foundational models for time series analysis, but the field is very broad
      • User interface can and should be much improved
      • Repo: http://github.com/statsmodels/statsmodels
      • Docs: http://statsmodels.sourceforge.net
      • Contact: pystatsmodels@googlegroups.com