SlideShare a Scribd company logo
Time Observation
4/10/1990 23:54:12 4.5
4/10/1990 23:54:13 5.5
4/10/1990 23:54:14 6.6
4/10/1990 23:54:15 7.8
4/10/1990 23:54:16 3.3
Time Something Something Else
4/10/1990 23:54:12 4.5 100.4
4/10/1990 23:54:13 5.5 101.3
4/10/1990 23:54:14 6.6 450.2
4/10/1990 23:54:15 7.8 600
4/10/1990 23:54:16 3.3 2000
●
●
●
●
○
●
○
Time
vec = datestr(busdays('1/2/01','1/9/01','weekly'))
vec =
05-Jan-2001
12-Jan-2001
●
○
●
○
●
○
●
○
SELECT buyerid, saletime, qtysold,
LAG(qtysold,1) OVER (order by buyerid, saletime) AS prev_qtysold
FROM sales WHERE buyerid = 3 ORDER BY buyerid, saletime;
buyerid | saletime | qtysold | prev_qtysold
---------+---------------------+---------+--------------
3 | 2008-01-16 01:06:09 | 1 |
3 | 2008-01-28 02:10:01 | 1 | 1
3 | 2008-03-12 10:39:53 | 1 | 1
3 | 2008-03-13 02:56:07 | 1 | 1
3 | 2008-03-29 08:21:39 | 2 | 1
3 | 2008-04-27 02:39:01 | 1 | 2
windowSpec = 
Window
.partitionBy(df['category']) 
.orderBy(df['revenue'].desc()) 
.rangeBetween(-sys.maxsize, sys.maxsize)
dataFrame = sqlContext.table("productRevenue")
revenue_difference = 
(func.max(dataFrame['revenue']).over(windowSpec) - dataFrame['revenue'])
dataFrame.select(
dataFrame['product'],
dataFrame['category'],
dataFrame['revenue'],
revenue_difference.alias("revenue_difference"))
●
○
●
○
“Observations”
Timestamp Key Value
2015-04-10 A 2.0
2015-04-11 A 3.0
2015-04-10 B 4.5
2015-04-11 B 1.5
2015-04-10 C 6.0
“Instants”
Timestamp A B C
2015-04-10 2.0 4.5 6.0
2015-04-11 3.0 1.5 NaN
“Time Series”
DateTimeIndex: [2015-04-10, 2015-04-11]
Key Series
A [2.0, 3.0]
B [4.5, 1.5]
C [6.0, NaN]
●
●
○
○
○
rdd: RDD[String, Vector[Double]]
index: DateTimeIndex
5:00 PM 6:00 PM 7:00 PM 8:00 PM 9:00 PM 10:00 PM
GOOG $523 $524 $600 $574 $400
AAPL $384 $384 $385 $385 $378 $345
YHOO $40 $60 $70 $80
MSFT $134 $138 $175 $178 $123 $184
ORCL $23 $30 $35 $45 $38
5:00 PM 6:00 PM 7:00 PM 8:00 PM 9:00 PM 10:00 PM
GOOG $523 $524 $600 $574 $400
AAPL $384 $384 $385 $385 $378 $345
YHOO $40 $60 $70 $80
MSFT $134 $138 $175 $178 $123 $184
ORCL $23 $30 $35 $45 $38
val tsRdd: TimeSeriesRDD = ...
// Find a sub-slice between two dates
val subslice = tsRdd.slice(
ZonedDateTime.parse("2015-4-10", ISO_DATE),
ZonedDateTime.parse("2015-4-14", ISO_DATE))
// Fill in missing values based on linear interpolation
val filled = subslice.fill("linear")
// Use an AR(1) model to remove serial correlations
val residuals = filled.mapSeries(series =>
ar(series, 1).removeTimeDependentEffects(series))
val dtIndex = DateTimeIndex.uniform(
ZonedDateTime.parse("2015-4-10", ISO_DATE),
ZonedDateTime.parse("2015-5-14", ISO_DATE),
2 businessDays) // wowza that’s some syntactic sugar
dtIndex.dateTimeAtLoc(5)
ARIMA:
val modelsRdd = tsRdd.map { ts =>
ARIMA.autoFit(ts)
}
GARCH:
val modelsRdd = tsRdd.map { ts =>
GARCH.fitModel(ts)
}
val mostAutocorrelated = tsRdd.map { ts =>
(TimeSeriesStatisticalTests.dwtest(ts), ts)
}.max
●
●
○
●
○
●
○
●
●
●
●
●
○
Time Series Analysis with Spark

More Related Content

Viewers also liked

Introducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph ProcessingIntroducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph Processing
sscdotopen
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab Create
Turi, Inc.
 
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
rhatr
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
Jeff Holoman
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
DataWorks Summit
 
Apache Arrow (Strata-Hadoop World San Jose 2016)
Apache Arrow (Strata-Hadoop World San Jose 2016)Apache Arrow (Strata-Hadoop World San Jose 2016)
Apache Arrow (Strata-Hadoop World San Jose 2016)
Wes McKinney
 
HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016
DataWorks Summit/Hadoop Summit
 
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
DataWorks Summit/Hadoop Summit
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0
Cloudera, Inc.
 
Next-generation Python Big Data Tools, powered by Apache Arrow
Next-generation Python Big Data Tools, powered by Apache ArrowNext-generation Python Big Data Tools, powered by Apache Arrow
Next-generation Python Big Data Tools, powered by Apache Arrow
Wes McKinney
 

Viewers also liked (10)

Introducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph ProcessingIntroducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph Processing
 
Machine Learning with GraphLab Create
Machine Learning with GraphLab CreateMachine Learning with GraphLab Create
Machine Learning with GraphLab Create
 
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphXIntroduction into scalable graph analysis with Apache Giraph and Spark GraphX
Introduction into scalable graph analysis with Apache Giraph and Spark GraphX
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
 
Hadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache GiraphHadoop Graph Processing with Apache Giraph
Hadoop Graph Processing with Apache Giraph
 
Apache Arrow (Strata-Hadoop World San Jose 2016)
Apache Arrow (Strata-Hadoop World San Jose 2016)Apache Arrow (Strata-Hadoop World San Jose 2016)
Apache Arrow (Strata-Hadoop World San Jose 2016)
 
HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016HPE Keynote Hadoop Summit San Jose 2016
HPE Keynote Hadoop Summit San Jose 2016
 
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0
 
Next-generation Python Big Data Tools, powered by Apache Arrow
Next-generation Python Big Data Tools, powered by Apache ArrowNext-generation Python Big Data Tools, powered by Apache Arrow
Next-generation Python Big Data Tools, powered by Apache Arrow
 

Recently uploaded

University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
mkkikqvo
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
aguty
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
SaffaIbrahim1
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
nyvan3
 
Sample Devops SRE Product Companies .pdf
Sample Devops SRE  Product Companies .pdfSample Devops SRE  Product Companies .pdf
Sample Devops SRE Product Companies .pdf
Vineet
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
asyed10
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
ytypuem
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Kaxil Naik
 
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
slg6lamcq
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .
NABLAS株式会社
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
hyfjgavov
 
原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理
原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理 原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理
原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理
tzu5xla
 

Recently uploaded (20)

University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docxDATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
 
Sample Devops SRE Product Companies .pdf
Sample Devops SRE  Product Companies .pdfSample Devops SRE  Product Companies .pdf
Sample Devops SRE Product Companies .pdf
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
 
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
一比一原版南十字星大学毕业证(SCU毕业证书)学历如何办理
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .社内勉強会資料_Hallucination of LLMs               .
社内勉強会資料_Hallucination of LLMs               .
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理
原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理 原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理
原版一比一爱尔兰都柏林大学毕业证(UCD毕业证书)如何办理
 

Time Series Analysis with Spark

  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8. Time Observation 4/10/1990 23:54:12 4.5 4/10/1990 23:54:13 5.5 4/10/1990 23:54:14 6.6 4/10/1990 23:54:15 7.8 4/10/1990 23:54:16 3.3
  • 9. Time Something Something Else 4/10/1990 23:54:12 4.5 100.4 4/10/1990 23:54:13 5.5 101.3 4/10/1990 23:54:14 6.6 450.2 4/10/1990 23:54:15 7.8 600 4/10/1990 23:54:16 3.3 2000
  • 10.
  • 11.
  • 13. Time
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 21. SELECT buyerid, saletime, qtysold, LAG(qtysold,1) OVER (order by buyerid, saletime) AS prev_qtysold FROM sales WHERE buyerid = 3 ORDER BY buyerid, saletime; buyerid | saletime | qtysold | prev_qtysold ---------+---------------------+---------+-------------- 3 | 2008-01-16 01:06:09 | 1 | 3 | 2008-01-28 02:10:01 | 1 | 1 3 | 2008-03-12 10:39:53 | 1 | 1 3 | 2008-03-13 02:56:07 | 1 | 1 3 | 2008-03-29 08:21:39 | 2 | 1 3 | 2008-04-27 02:39:01 | 1 | 2
  • 22. windowSpec = Window .partitionBy(df['category']) .orderBy(df['revenue'].desc()) .rangeBetween(-sys.maxsize, sys.maxsize) dataFrame = sqlContext.table("productRevenue") revenue_difference = (func.max(dataFrame['revenue']).over(windowSpec) - dataFrame['revenue']) dataFrame.select( dataFrame['product'], dataFrame['category'], dataFrame['revenue'], revenue_difference.alias("revenue_difference"))
  • 24.
  • 25.
  • 26.
  • 27. “Observations” Timestamp Key Value 2015-04-10 A 2.0 2015-04-11 A 3.0 2015-04-10 B 4.5 2015-04-11 B 1.5 2015-04-10 C 6.0
  • 28. “Instants” Timestamp A B C 2015-04-10 2.0 4.5 6.0 2015-04-11 3.0 1.5 NaN
  • 29. “Time Series” DateTimeIndex: [2015-04-10, 2015-04-11] Key Series A [2.0, 3.0] B [4.5, 1.5] C [6.0, NaN]
  • 32. 5:00 PM 6:00 PM 7:00 PM 8:00 PM 9:00 PM 10:00 PM GOOG $523 $524 $600 $574 $400 AAPL $384 $384 $385 $385 $378 $345 YHOO $40 $60 $70 $80 MSFT $134 $138 $175 $178 $123 $184 ORCL $23 $30 $35 $45 $38
  • 33. 5:00 PM 6:00 PM 7:00 PM 8:00 PM 9:00 PM 10:00 PM GOOG $523 $524 $600 $574 $400 AAPL $384 $384 $385 $385 $378 $345 YHOO $40 $60 $70 $80 MSFT $134 $138 $175 $178 $123 $184 ORCL $23 $30 $35 $45 $38
  • 34. val tsRdd: TimeSeriesRDD = ... // Find a sub-slice between two dates val subslice = tsRdd.slice( ZonedDateTime.parse("2015-4-10", ISO_DATE), ZonedDateTime.parse("2015-4-14", ISO_DATE)) // Fill in missing values based on linear interpolation val filled = subslice.fill("linear") // Use an AR(1) model to remove serial correlations val residuals = filled.mapSeries(series => ar(series, 1).removeTimeDependentEffects(series))
  • 35. val dtIndex = DateTimeIndex.uniform( ZonedDateTime.parse("2015-4-10", ISO_DATE), ZonedDateTime.parse("2015-5-14", ISO_DATE), 2 businessDays) // wowza that’s some syntactic sugar dtIndex.dateTimeAtLoc(5)
  • 36. ARIMA: val modelsRdd = tsRdd.map { ts => ARIMA.autoFit(ts) } GARCH: val modelsRdd = tsRdd.map { ts => GARCH.fitModel(ts) }
  • 37. val mostAutocorrelated = tsRdd.map { ts => (TimeSeriesStatisticalTests.dwtest(ts), ts) }.max