
Time Series at Scale for Weak Memory Systems


by Francois Belletti

Published in: Data & Analytics

  1. Time series at scale for weak memory systems. Francois Belletti, Evan Sparks, Michael Franklin, Alexandre M. Bayen (UC Berkeley).
  2. Time series analysis: canonical time series analysis. Figure: time series analysis in the old days.
  3. Time series analysis: new systems to analyze. Figure: analyzing the El Niño pattern with wavelets.
  4. Time series analysis: life is a time series. Figure: today, time series analysis ranges from financial markets to smart advertising.
  5. Outline: embarrassingly parallel analysis of time series; weak memory time series analysis; the overlapping block framework; applications; extensions.
  6. Second order stationary models (weak memory time series analysis). Observed process $(X_t)_{t \in \mathbb{Z}} \in \mathbb{R}^d$. The process is ergodic and, more specifically:
     - $\mathbb{E}(X_t) = \mu_X \in \mathbb{R}^d$ (constant),
     - $\gamma_X(t, h) = \mathrm{Cov}(X_t, X_{t+h})$ is a function of $h$ only,
     - $h \mapsto \gamma_X(h) \in \mathbb{R}^{d \times d}$ is the autocovariance function.
     Example: multidimensional white noise,
     - $\forall t \in \mathbb{Z},\ \mathbb{E}(\varepsilon_t) = 0$,
     - $\forall t \in \mathbb{Z},\ \mathbb{E}(\varepsilon_t \varepsilon_t^T) = \Sigma_\varepsilon$,
     - $\forall t, s \in \mathbb{Z}$ with $t \neq s$, $\mathbb{E}(\varepsilon_t \varepsilon_s^T) = 0$.
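The white-noise moment conditions above can be checked empirically on simulated data. A minimal plain-Scala sketch for the scalar case ($d = 1$); the names are illustrative and unrelated to the SparkGeoTS API:

```scala
// Scalar white-noise sketch: E[e_t] = 0 and E[e_t e_s] = 0 for t != s.
// With n i.i.d. samples, both the empirical mean and the empirical
// lag-1 autocovariance should be on the order of 1 / sqrt(n).
val rngNoise = new scala.util.Random(7)
val nNoise = 50000
val es = Array.fill(nNoise)(rngNoise.nextGaussian())

val empiricalMean = es.sum / nNoise
val empiricalLag1 =
  (0 until nNoise - 1).map(k => es(k) * es(k + 1)).sum / (nNoise - 1)
```

Both quantities come out close to zero, consistent with the two expectations above.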
  7. The importance of autocorrelation.
  8. Different levels of memory in time series (1/3): Brownian motions, order-1 integrated processes.
  9. Different levels of memory in time series (2/3): trending processes with increasing amplitude and seasonality.
  10. Different levels of memory in time series (3/3): controversial partially integrated time series.
  11. How to erase memory? By differencing the time series: $(\Delta X_t)_{t \in \mathbb{Z}} = (X_t - X_{t-1})_{t \in \mathbb{Z}}$.
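The differencing operator is a one-liner; a plain-Scala sketch (the function name is illustrative, not part of the presented API):

```scala
// First order differencing: (Delta X)_t = X_t - X_{t-1}.
// Applied to an order-1 integrated process, it yields a weak-memory
// series of length N - 1.  Assumes xs has at least two elements.
def difference(xs: Array[Double]): Array[Double] =
  xs.sliding(2).map { case Array(a, b) => b - a }.toArray

// Example: differencing the squares 1, 4, 9, 16 yields the odd gaps.
val diffs = difference(Array(1.0, 4.0, 9.0, 16.0)) // 3.0, 5.0, 7.0
```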
  12. Essential statistics for second order stationary models. Frequentist estimates:
     - Mean: $\hat{\mu}_X\big((X_t)_{t \in \{1 \dots N\}}\big) = \frac{1}{N} \sum_{k=1}^{N} X_k$,
     - Autocovariance: $\hat{\gamma}_X(h)\big((X_t)_{t \in \{1 \dots N\}}\big) = \frac{1}{N - h - 1} \sum_{k=1}^{N-h} X_k X_{k+h}^T$,
     - Autocorrelation: $\hat{\rho}_X(h) = \widehat{\mathrm{Cor}}(X_t, X_{t+h}) = \mathrm{diag}\big(\hat{\gamma}_X(0)\big)^{-\frac{1}{2}} \, \hat{\gamma}_X(h) \, \mathrm{diag}\big(\hat{\gamma}_X(0)\big)^{-\frac{1}{2}}$,
     - Partial autocorrelation: solve
       $\begin{bmatrix} \gamma_X(0) & \cdots & \gamma_X(-(p-1)) \\ \vdots & \ddots & \vdots \\ \gamma_X(p-1) & \cdots & \gamma_X(0) \end{bmatrix} \begin{bmatrix} (U_1^{(p)})^T \\ \vdots \\ (U_p^{(p)})^T \end{bmatrix} = \begin{bmatrix} \gamma_X(1) \\ \vdots \\ \gamma_X(p) \end{bmatrix}$.
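In the scalar case ($d = 1$) the first two estimators reduce to a few lines. A sketch using the slide's $1/(N - h - 1)$ normalization, centering the series first; helper names are illustrative:

```scala
// Univariate versions of the frequentist estimates above.
def sampleMean(xs: Array[Double]): Double = xs.sum / xs.length

// gammaHat(h) = 1 / (N - h - 1) * sum_{k} x_k * x_{k+h},
// computed on a mean-centered copy of the series.
def autocovariance(xs: Array[Double], h: Int): Double = {
  val m = sampleMean(xs)
  val c = xs.map(_ - m)
  (0 until c.length - h).map(k => c(k) * c(k + h)).sum / (c.length - h - 1)
}

// Example: for 1, 2, 3, 4, 5 the centered values are -2, -1, 0, 1, 2,
// so the lag-0 estimate is (4 + 1 + 0 + 1 + 4) / 4 = 2.5.
val gamma0 = autocovariance(Array(1.0, 2.0, 3.0, 4.0, 5.0), 0)
```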
  13. Common computational structure: map-reduce computation of M-estimators.
  14. The VARMA family of models.
     Vector autoregressive (VAR) models, a linear predictor with i.i.d. noise:
     $X_t = A_1 X_{t-1} + \dots + A_p X_{t-p} + \varepsilon_t$.
     Vector moving average (VMA) models, autocorrelated noise:
     $X_t = \varepsilon_t + B_1 \varepsilon_{t-1} + \dots + B_q \varepsilon_{t-q}$.
     Vector autoregressive moving average (VARMA) models, a linear predictor with autocorrelated noise:
     $X_t = A_1 X_{t-1} + \dots + A_p X_{t-p} + \varepsilon_t + B_1 \varepsilon_{t-1} + \dots + B_q \varepsilon_{t-q}$.
     To estimate the parameters, compute $\hat{\gamma}_X(h)\big((X_t)_{t \in \{1 \dots N\}}\big)$ for $h = 1 \dots p + q$.
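For the scalar AR(1) member of this family the estimation step is concrete enough to sketch end to end: simulate $X_t = a X_{t-1} + \varepsilon_t$ and recover $a$ as the ratio of the lag-1 to the lag-0 autocovariance (a Yule-Walker estimate). Names are illustrative, not the presented API:

```scala
// Simulate a scalar AR(1) process with coefficient 0.6.
val rngAr = new scala.util.Random(42)
val coeff = 0.6
val nAr = 20000
val series = new Array[Double](nAr)
for (t <- 1 until nAr) series(t) = coeff * series(t - 1) + rngAr.nextGaussian()

// Lag-h autocovariance with a 1 / (N - h) normalization (sketch).
def gammaHat(xs: Array[Double], h: Int): Double = {
  val m = xs.sum / xs.length
  (0 until xs.length - h)
    .map(k => (xs(k) - m) * (xs(k + h) - m)).sum / (xs.length - h)
}

// Yule-Walker estimate for AR(1): aHat = gamma(1) / gamma(0).
val aHat = gammaHat(series, 1) / gammaHat(series, 0)
```

With 20,000 samples the estimate lands close to the true coefficient; this is the scalar analogue of solving the Yule-Walker system on the previous slide for the $A_i$ matrices.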
  15. The issue of naive partitioning (the overlapping block framework). Data is partitioned along the time axis. How do we compute $\frac{1}{N - h - 1} \sum_{k=1}^{N-h} X_k X_{k+h}^T$ with partitioned data? How do we enable some look-ahead or look-back with partitioned data?
  16. From informational structure to memory layout. With partitioned data, avoid joins and, more generally, avoid communication. Since only short-range dependencies appear in M-estimation and Z-estimation, data replication enables embarrassingly parallel computations.
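The replication idea can be sketched in plain Scala: extend each block with a few replicated points from its successor so that every lag-$h$ product can be formed locally. `blockSize` and `pad` are illustrative parameters, not the SparkGeoTS names:

```scala
// Split a series into blocks of `blockSize` genuine points, each
// extended with up to `pad` replicated points from the next block.
// Any kernel of width <= pad then runs embarrassingly in parallel:
// lag-h products X_k * X_{k+h} with h <= pad never cross a block.
def overlappingBlocks(xs: Array[Double], blockSize: Int, pad: Int): Array[Array[Double]] =
  xs.indices.by(blockSize).map { start =>
    xs.slice(start, math.min(start + blockSize + pad, xs.length))
  }.toArray

// Ten points, blocks of four genuine points, padding of two.
val blocks = overlappingBlocks(Array.tabulate(10)(_.toDouble), 4, 2)
```

Here the first block holds genuine points 0..3 plus replicated points 4 and 5, so a lag-2 kernel targeting any genuine point of the block needs no communication.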
  17. Computational accounting and the target system. A kernel is computed on a target; only genuine data points (partition index == origin index) can be targets, which guarantees there are no redundant computations.
  18. A simple programming paradigm. Second order essential statistics trait:
     - def kernelWidth: IntervalSize
     - def zero: ResultT
     - def kernel(slice: Array[(IndexT, ValueT)]): ResultT = ??? (your kernel)
     - def reducer(r1: ResultT, r2: ResultT): ResultT = ??? (your reducing operation)
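A concrete instance makes the trait easier to read. This is a hedged sketch assuming simplified type parameters (Long timestamps, Double values) and omitting kernelWidth; the real SparkGeoTS signatures may differ:

```scala
// Simplified stand-in for the slide's trait (kernelWidth omitted).
trait SecondOrderStats[IndexT, ValueT, ResultT] {
  def zero: ResultT
  def kernel(slice: Array[(IndexT, ValueT)]): ResultT
  def reducer(r1: ResultT, r2: ResultT): ResultT
}

// A mean estimator as (sum, count) partial results: the kernel maps a
// slice to a partial result and the reducer merges them associatively.
object MeanStats extends SecondOrderStats[Long, Double, (Double, Long)] {
  def zero: (Double, Long) = (0.0, 0L)
  def kernel(slice: Array[(Long, Double)]): (Double, Long) =
    (slice.map(_._2).sum, slice.length.toLong)
  def reducer(r1: (Double, Long), r2: (Double, Long)): (Double, Long) =
    (r1._1 + r2._1, r1._2 + r2._2)
}

// Map-reduce over two "partitions" of a toy series.
val partitions = Array(Array((0L, 1.0), (1L, 2.0)), Array((2L, 3.0), (3L, 6.0)))
val (total, count) =
  partitions.map(MeanStats.kernel).foldLeft(MeanStats.zero)(MeanStats.reducer)
val meanValue = total / count
```

Because the reducer is associative and has `zero` as its identity, the same kernel/reducer pair runs unchanged over Spark partitions.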
  19. R-like API. Create an overlapping block RDD:
     - SingleAxisBlockRDD((paddingMillis, paddingMillis), nPartitions, inSampleData)
     API calls for exploratory data analysis:
     - val mean = MeanEstimator(timeSeriesRDD)
     - val meanProfile = MeanProfileEstimator(timeSeriesRDD, hashFunction)
     - val (correlations, _) = CrossCorrelation(timeSeriesRDD, h)
     API calls for modeling:
     - val (estVARMatrices, _) = VARModel(timeSeriesRDD, p)
     - val residualVAR = VARPredictor(timeSeriesRDD, estVARMatrices, Some(mean))
  20. High dimensional data-intensive system identification (applications): prediction of Uber demand in New York from Uber ride requests.
  21. Statistical properties of Uber demand in New York: 40,515 samples, 314 dimensions; demand for Uber rides in New York, April 2014.
  22. Seasonality analysis of Uber rides: compute and subtract the weekly average profile.
  23. VAR coefficients (AR1, Uber rides). Once we identify the matrix $A_1$, we can predict demand $X_t$ (at time $t$) from demand $X_{t-1}$ (at time $t-1$): the best linear predictor for $X_t$ given $X_{t-1}$ is $A_1 X_{t-1}$.
  24. Univariate residuals (Uber rides): covariance of the univariate residuals.
  25. Multivariate residuals (Uber rides): covariance of the multivariate residuals.
  26. What any-scale time series analysis enables (1/3, further steps): the GDELT data set, interactions between news providers.
  27. What any-scale time series analysis enables (2/3): climate studies and geophysical systems.
  28. What any-scale time series analysis enables (3/3): large scale cyber-physical systems.
  29. Concluding remarks and questions (SparkGeoTS). Packages such as Thunder and SparkTS were optimized only for univariate time series analysis: partitioning was done only with respect to sensing dimensions. We enable partitioning along the time axis: with overlapping blocks we can calibrate all models of the ARMA family. This scheme will now be extended to FARIMA models.
