SlideShare a Scribd company logo
Time Observation
4/10/1990 23:54:12 4.5
4/10/1990 23:54:13 5.5
4/10/1990 23:54:14 6.6
4/10/1990 23:54:15 7.8
4/10/1990 23:54:16 3.3
Time Something Something Else
4/10/1990 23:54:12 4.5 100.4
4/10/1990 23:54:13 5.5 101.3
4/10/1990 23:54:14 6.6 450.2
4/10/1990 23:54:15 7.8 600
4/10/1990 23:54:16 3.3 2000
●
●
●
●
○
●
○
Time
vec = datestr(busdays('1/2/01','1/9/01','weekly'))
vec =
05-Jan-2001
12-Jan-2001
●
○
●
○
●
○
●
○
SELECT buyerid, saletime, qtysold,
LAG(qtysold,1) OVER (order by buyerid, saletime) AS prev_qtysold
FROM sales WHERE buyerid = 3 ORDER BY buyerid, saletime;
buyerid | saletime | qtysold | prev_qtysold
---------+---------------------+---------+--------------
3 | 2008-01-16 01:06:09 | 1 |
3 | 2008-01-28 02:10:01 | 1 | 1
3 | 2008-03-12 10:39:53 | 1 | 1
3 | 2008-03-13 02:56:07 | 1 | 1
3 | 2008-03-29 08:21:39 | 2 | 1
3 | 2008-04-27 02:39:01 | 1 | 2
windowSpec = 
Window
.partitionBy(df['category']) 
.orderBy(df['revenue'].desc()) 
.rangeBetween(-sys.maxsize, sys.maxsize)
dataFrame = sqlContext.table("productRevenue")
revenue_difference = 
(func.max(dataFrame['revenue']).over(windowSpec) - dataFrame['revenue'])
dataFrame.select(
dataFrame['product'],
dataFrame['category'],
dataFrame['revenue'],
revenue_difference.alias("revenue_difference"))
●
○
●
○
“Observations”
Timestamp Key Value
2015-04-10 A 2.0
2015-04-11 A 3.0
2015-04-10 B 4.5
2015-04-11 B 1.5
2015-04-10 C 6.0
“Instants”
Timestamp A B C
2015-04-10 2.0 4.5 6.0
2015-04-11 3.0 1.5 NaN
“Time Series”
DateTimeIndex: [2015-04-10, 2015-04-11]
Key Series
A [2.0, 3.0]
B [4.5, 1.5]
C [6.0, NaN]
●
●
○
○
○
rdd: RDD[String, Vector[Double]]
index: DateTimeIndex
5:00 PM 6:00 PM 7:00 PM 8:00 PM 9:00 PM 10:00 PM
GOOG $523 $524 $600 $574 $400
AAPL $384 $384 $385 $385 $378 $345
YHOO $40 $60 $70 $80
MSFT $134 $138 $175 $178 $123 $184
ORCL $23 $30 $35 $45 $38
5:00 PM 6:00 PM 7:00 PM 8:00 PM 9:00 PM 10:00 PM
GOOG $523 $524 $600 $574 $400
AAPL $384 $384 $385 $385 $378 $345
YHOO $40 $60 $70 $80
MSFT $134 $138 $175 $178 $123 $184
ORCL $23 $30 $35 $45 $38
val tsRdd: TimeSeriesRDD = ...
// Find a sub-slice between two dates
val subslice = tsRdd.slice(
ZonedDateTime.parse("2015-4-10", ISO_DATE),
ZonedDateTime.parse("2015-4-14", ISO_DATE))
// Fill in missing values based on linear interpolation
val filled = subslice.fill("linear")
// Use an AR(1) model to remove serial correlations
val residuals = filled.mapSeries(series =>
ar(series, 1).removeTimeDependentEffects(series))
val dtIndex = DateTimeIndex.uniform(
ZonedDateTime.parse("2015-4-10", ISO_DATE),
ZonedDateTime.parse("2015-5-14", ISO_DATE),
2 businessDays) // wowza that’s some syntactic sugar
dtIndex.dateTimeAtLoc(5)
ARIMA:
val modelsRdd = tsRdd.map { ts =>
ARIMA.autoFit(ts)
}
GARCH:
val modelsRdd = tsRdd.map { ts =>
GARCH.fitModel(ts)
}
val mostAutocorrelated = tsRdd.map { ts =>
(TimeSeriesStatisticalTests.dwtest(ts), ts)
}.max
●
●
○
●
○
●
○
●
●
●
●
●
○
Time Series Analysis with Spark

More Related Content

Viewers also liked

Introducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph ProcessingIntroducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph Processingsscdotopen
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab Create
Turi, Inc.
 
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
rhatr
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
Jeff Holoman
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
DataWorks Summit
 
Apache Arrow (Strata-Hadoop World San Jose 2016)
Apache Arrow (Strata-Hadoop World San Jose 2016)Apache Arrow (Strata-Hadoop World San Jose 2016)
Apache Arrow (Strata-Hadoop World San Jose 2016)
Wes McKinney
 
HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016
DataWorks Summit/Hadoop Summit
 
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
DataWorks Summit/Hadoop Summit
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Cloudera, Inc.
 
Next-generation Python Big Data Tools, powered by Apache Arrow
Next-generation Python Big Data Tools, powered by Apache ArrowNext-generation Python Big Data Tools, powered by Apache Arrow
Next-generation Python Big Data Tools, powered by Apache Arrow
Wes McKinney
 

Viewers also liked (10)

Introducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph ProcessingIntroducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph Processing
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab Create
 
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
 
Apache Arrow (Strata-Hadoop World San Jose 2016)
Apache Arrow (Strata-Hadoop World San Jose 2016)Apache Arrow (Strata-Hadoop World San Jose 2016)
Apache Arrow (Strata-Hadoop World San Jose 2016)
 
HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016
 
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0
 
Next-generation Python Big Data Tools, powered by Apache Arrow
Next-generation Python Big Data Tools, powered by Apache ArrowNext-generation Python Big Data Tools, powered by Apache Arrow
Next-generation Python Big Data Tools, powered by Apache Arrow
 

Recently uploaded

Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
TravisMalana
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 

Recently uploaded (20)

Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 

Time Series Analysis with Spark

  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8. Time Observation 4/10/1990 23:54:12 4.5 4/10/1990 23:54:13 5.5 4/10/1990 23:54:14 6.6 4/10/1990 23:54:15 7.8 4/10/1990 23:54:16 3.3
  • 9. Time Something Something Else 4/10/1990 23:54:12 4.5 100.4 4/10/1990 23:54:13 5.5 101.3 4/10/1990 23:54:14 6.6 450.2 4/10/1990 23:54:15 7.8 600 4/10/1990 23:54:16 3.3 2000
  • 10.
  • 11.
  • 13. Time
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 21. SELECT buyerid, saletime, qtysold, LAG(qtysold,1) OVER (order by buyerid, saletime) AS prev_qtysold FROM sales WHERE buyerid = 3 ORDER BY buyerid, saletime; buyerid | saletime | qtysold | prev_qtysold ---------+---------------------+---------+-------------- 3 | 2008-01-16 01:06:09 | 1 | 3 | 2008-01-28 02:10:01 | 1 | 1 3 | 2008-03-12 10:39:53 | 1 | 1 3 | 2008-03-13 02:56:07 | 1 | 1 3 | 2008-03-29 08:21:39 | 2 | 1 3 | 2008-04-27 02:39:01 | 1 | 2
  • 22. windowSpec = Window .partitionBy(df['category']) .orderBy(df['revenue'].desc()) .rangeBetween(-sys.maxsize, sys.maxsize) dataFrame = sqlContext.table("productRevenue") revenue_difference = (func.max(dataFrame['revenue']).over(windowSpec) - dataFrame['revenue']) dataFrame.select( dataFrame['product'], dataFrame['category'], dataFrame['revenue'], revenue_difference.alias("revenue_difference"))
  • 24.
  • 25.
  • 26.
  • 27. “Observations” Timestamp Key Value 2015-04-10 A 2.0 2015-04-11 A 3.0 2015-04-10 B 4.5 2015-04-11 B 1.5 2015-04-10 C 6.0
  • 28. “Instants” Timestamp A B C 2015-04-10 2.0 4.5 6.0 2015-04-11 3.0 1.5 NaN
  • 29. “Time Series” DateTimeIndex: [2015-04-10, 2015-04-11] Key Series A [2.0, 3.0] B [4.5, 1.5] C [6.0, NaN]
  • 32. 5:00 PM 6:00 PM 7:00 PM 8:00 PM 9:00 PM 10:00 PM GOOG $523 $524 $600 $574 $400 AAPL $384 $384 $385 $385 $378 $345 YHOO $40 $60 $70 $80 MSFT $134 $138 $175 $178 $123 $184 ORCL $23 $30 $35 $45 $38
  • 33. 5:00 PM 6:00 PM 7:00 PM 8:00 PM 9:00 PM 10:00 PM GOOG $523 $524 $600 $574 $400 AAPL $384 $384 $385 $385 $378 $345 YHOO $40 $60 $70 $80 MSFT $134 $138 $175 $178 $123 $184 ORCL $23 $30 $35 $45 $38
  • 34. val tsRdd: TimeSeriesRDD = ... // Find a sub-slice between two dates val subslice = tsRdd.slice( ZonedDateTime.parse("2015-4-10", ISO_DATE), ZonedDateTime.parse("2015-4-14", ISO_DATE)) // Fill in missing values based on linear interpolation val filled = subslice.fill("linear") // Use an AR(1) model to remove serial correlations val residuals = filled.mapSeries(series => ar(series, 1).removeTimeDependentEffects(series))
  • 35. val dtIndex = DateTimeIndex.uniform( ZonedDateTime.parse("2015-4-10", ISO_DATE), ZonedDateTime.parse("2015-5-14", ISO_DATE), 2 businessDays) // wowza that’s some syntactic sugar dtIndex.dateTimeAtLoc(5)
  • 36. ARIMA: val modelsRdd = tsRdd.map { ts => ARIMA.autoFit(ts) } GARCH: val modelsRdd = tsRdd.map { ts => GARCH.fitModel(ts) }
  • 37. val mostAutocorrelated = tsRdd.map { ts => (TimeSeriesStatisticalTests.dwtest(ts), ts) }.max