SlideShare a Scribd company logo
1 of 31
Download to read offline
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
SPRINGONE2GX
WASHINGTON, DC
Implementing a highly scalable
Stock prediction system with R,
Apache Geode and Spring XD
Fred Melo

@fredmelo_br
William Markito

@william_markito
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
About us
Fred Melo
Technical Director for Data
fmelo@pivotal.io
@fredmelo_br
2
William Markito
Enterprise Architect for GemFire
wmarkito@pivotal.io
@william_markito
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 3
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 4
It's all about DATA
Data Sources
Look for patterns
Prediction
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
What do we want to build?
5
"Smart System"
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
… in our specific case
6
Trading Data
"Smart System"
Historical Data
Repository
Learns with historical trends
"How were the medium average
price and relative strength
reading when the latest failures
happened? "
Live data
becomes
historical
over time
Real-Time
Evaluates live data
“According to historical trends,
there’s an 80% chance this stock
prices might go downhill within
the next hour"
Historical
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
… in our specific case
7
Trading Data
"Smart System"
Historical Data
Repository
Learns with historical trends
"How were the medium average
price and relative strength reading
when the latest failures happened? "
Live data
becomes
historical
over time
Real-Time
Evaluates live data
“According to historical trends,
there’s an 80% chance this stock
prices might go downhill within the
next hour"
Historical
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 8
Live Data
Data
Temperature
Hot
Cold
Greenplum DB
Apache Geode / GemFire
1- Live data is ingested into the grid
3 - Results are pushed
immediately to deployed
applications
4 - “Hot" data ages,
becoming part of the
historical dataset
Machine Learning
model 5 - Re-training is triggered,
updating the model with
the latest historical data
Spring XD
Spring XD
The ML pipeline data flow
2 - Trained ML model compares
new data to historical patterns
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 9
Live Data
Apache Geode / GemFire
1- Live data is ingested into the grid
2 - Trained ML model compares
new data to historical patterns
3 - Results are pushed
immediately to deployed
applications
Machine Learning
model
4 - Re-training is triggered, updating
the model with the latest historical data
Spring XD
Spring XD
Simplified demo model Data
Temperature
Hot
Warm
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 10
Transform Sink
SpringXD
Extensible
Open-Source
Fault-Tolerant
Horizontally Scalable
Cloud-Native
Machine Learning
Enrich Filter
Split
Dashboard
Indicators
1
2
Predict
3
Real data
Simulator
/Stocks
/TechIndicators
/Predictions
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 11
Eating it in small bites…
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 12
SpringXD GemFire
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
• Cache
• Configurable through XML, ,Java
• Region
• Distributed j.u.Map on steroids
• Highly available, redundant
• Member
• Locator, Server, Client
• Callbacks
• Listener, Writer, AsyncEventListener, Parallel/Serial
Apache Geode & GemFire Concepts
13
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Apache Geode & GemFire, why ?
• Performance
• Consistency
• Resiliency
14
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Apache Geode & GemFire, why ?
15
© Copyright 2014 Pivotal. All rights reserved.
Pivotal GemFire High Availability and Fault Tolerance in 6 acts
Failing data copies are
replaced transparently
Data is replicated to other
clusters and sites (WAN)
Network segmentations are
identified and fixed automatically
Client and cluster disconnections
are handled gracefully
Data is persisted on local
disk for ultimate durability
“split brain”
Failed function executions
are restarted automatically
restart
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Some interesting cases…
16
China Railway

Corporation
5,700 train stations
4.5 million tickets per day
20 million daily users
1.4 billion page views per day
40,000 visits per second
* http://pivotal.io/big-data/pivotal-gemfire
Indian Railways
7,000 stations
72,000 miles of track
23 million passengers daily
120,000 concurrent users
10,000 transactions per minute
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Use cases and industries
17
Indian RailwaysChina Railway Corporation
World: ~7,349,000,000
~36% of the world population
Population: 1,251,695,6161,401,586,609
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
• Commercial product available since 2004
• Native clients in Java, C++, C#, REST
• Event Subscriptions and Continuous
Queries
• Configurable WAN Gateway between
clusters
• Enterprise Support, commercial features
Apache Geode & Pivotal GemFire
• Open Sourced in April/2015
• Java Native Client, REST
• 98% of GemFire API
• Event subscriptions
• ~30 contributors
• Under Incubation
18
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 19
SpringXD GemFire
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
SpringXD Basic Concepts
• Streams
• Pipelines
• Sources
• Sinks
• Filters
• Taps
20
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
SpringXD Basic Concepts
21
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
A simple example
22
twittersearch --consumerKey=XXX —consumerSecret=XXX --
query=SpringOne2GX --outputType=application/json | gemfire-json-
server --useLocator=true --host=localhost --port=10334 --
regionName=tweets --keyExpression=payload.getField('id_str')
twittersearch --query=SpringOne2GX | gemfire-json-server --host=localhost--regionName=tweets
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 23
SpringXD GemFire
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Apache Spark Concepts
• RDD
• Dataframe
• Driver
• Worker
24
"An RDD in Spark is simply an immutable distributed collection of objects.
Each RDD is split into multiple partitions, which may be computed on different nodes
of the cluster. RDDs can contain any type of Python, Java, or Scala objects,
including user-defined classes."
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Apache Spark Concepts
• RDD
• Dataframe
• Driver
• Worker
25
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 26
medium
avg (x+1)
relative
strength (x)
medium avg (x)
price(x)
Machine Learning Model
(e.g. Linear Regression)
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 27
medium
avg (x+1)
relative
strength (x)
medium avg (x)
price(x)
Machine Learning Model
(e.g. Linear Regression)
Features Label
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 28
Transform Sink
SpringXD
Extensible
Open-Source
Fault-Tolerant
Horizontally Scalable
Cloud-Native
Machine Learning
Enrich Filter
Split
Dashboard
Indicators
1
2
Predict
3
Real data
Simulator
/Stocks
/TechIndicators
/Predictions
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 29
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Learn more!
30
https://github.com/Pivotal-Open-Source-Hub/geode-security-samples
https://github.com/Pivotal-Open-Source-Hub/WifiAnalyticsIoT
https://github.com/Pivotal-Open-Source-Hub/geode-social-demo
http://pivotal-open-source-hub.github.io/StockInference-Spark/
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a

Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Thank you
31
@william_markito @fredmelo_br
Related: Building Highly-Scalable Spring Applications
with In-Memory, Distributed Data Grids
by John Blum & Luke Shannon
September 15, 2015 -10:30 - Salon M
http://pivotal-open-source-hub.github.io/StockInference-Spark/

More Related Content

What's hot

Pivotal HAWQ 소개
Pivotal HAWQ 소개Pivotal HAWQ 소개
Pivotal HAWQ 소개Seungdon Choi
 
Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!DataWorks Summit
 
Building Audi’s enterprise big data platform
Building Audi’s enterprise big data platformBuilding Audi’s enterprise big data platform
Building Audi’s enterprise big data platformDataWorks Summit
 
Empowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLEmpowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLDatabricks
 
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...Alex Zeltov
 
EDB's Migration Portal - Migrate from Oracle to Postgres
EDB's Migration Portal - Migrate from Oracle to PostgresEDB's Migration Portal - Migrate from Oracle to Postgres
EDB's Migration Portal - Migrate from Oracle to PostgresEDB
 
Spark Development Lifecycle at Workday - ApacheCon 2020
Spark Development Lifecycle at Workday - ApacheCon 2020Spark Development Lifecycle at Workday - ApacheCon 2020
Spark Development Lifecycle at Workday - ApacheCon 2020Pavel Hardak
 
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...DataWorks Summit/Hadoop Summit
 
SAM—streaming analytics made easy
SAM—streaming analytics made easySAM—streaming analytics made easy
SAM—streaming analytics made easyDataWorks Summit
 
Big Data Meets Learning Science: Keynote by Al Essa
Big Data Meets Learning Science: Keynote by Al EssaBig Data Meets Learning Science: Keynote by Al Essa
Big Data Meets Learning Science: Keynote by Al EssaSpark Summit
 
Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...Databricks
 
Uber's data science workbench
Uber's data science workbenchUber's data science workbench
Uber's data science workbenchRan Wei
 
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...Databricks
 
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...Dremio Corporation
 
Oracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorldOracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorldJeffrey T. Pollock
 
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...Spark Summit
 
DbyDx Software Corporate Presentation
DbyDx Software Corporate PresentationDbyDx Software Corporate Presentation
DbyDx Software Corporate PresentationDbyDx Software
 

What's hot (19)

Spark sql meetup
Spark sql meetupSpark sql meetup
Spark sql meetup
 
Pivotal HAWQ 소개
Pivotal HAWQ 소개Pivotal HAWQ 소개
Pivotal HAWQ 소개
 
Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!
 
Building Audi’s enterprise big data platform
Building Audi’s enterprise big data platformBuilding Audi’s enterprise big data platform
Building Audi’s enterprise big data platform
 
Empowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLEmpowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETL
 
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
Introduction to Big Data Analytics using Apache Spark and Zeppelin on HDInsig...
 
EDB's Migration Portal - Migrate from Oracle to Postgres
EDB's Migration Portal - Migrate from Oracle to PostgresEDB's Migration Portal - Migrate from Oracle to Postgres
EDB's Migration Portal - Migrate from Oracle to Postgres
 
Spark Development Lifecycle at Workday - ApacheCon 2020
Spark Development Lifecycle at Workday - ApacheCon 2020Spark Development Lifecycle at Workday - ApacheCon 2020
Spark Development Lifecycle at Workday - ApacheCon 2020
 
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
Modernizing Business Processes with Big Data: Real-World Use Cases for Produc...
 
SAM—streaming analytics made easy
SAM—streaming analytics made easySAM—streaming analytics made easy
SAM—streaming analytics made easy
 
Big Data Meets Learning Science: Keynote by Al Essa
Big Data Meets Learning Science: Keynote by Al EssaBig Data Meets Learning Science: Keynote by Al Essa
Big Data Meets Learning Science: Keynote by Al Essa
 
Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...Improving Python and Spark Performance and Interoperability with Apache Arrow...
Improving Python and Spark Performance and Interoperability with Apache Arrow...
 
Uber's data science workbench
Uber's data science workbenchUber's data science workbench
Uber's data science workbench
 
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
ML, Statistics, and Spark with Databricks for Maximizing Revenue in a Delayed...
 
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
The Future of Column-Oriented Data Processing With Apache Arrow and Apache Pa...
 
Fishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data Lake Fishing Graphs in a Hadoop Data Lake
Fishing Graphs in a Hadoop Data Lake
 
Oracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorldOracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorld
 
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
Spark SQL: Another 16x Faster After Tungsten: Spark Summit East talk by Brad ...
 
DbyDx Software Corporate Presentation
DbyDx Software Corporate PresentationDbyDx Software Corporate Presentation
DbyDx Software Corporate Presentation
 

Viewers also liked

Apache Geode (incubating) Introduction with Docker
Apache Geode (incubating) Introduction with DockerApache Geode (incubating) Introduction with Docker
Apache Geode (incubating) Introduction with DockerWilliam Markito Oliveira
 
Big and Fast Data - Building Infinitely Scalable Systems
Big and Fast Data - Building Infinitely Scalable SystemsBig and Fast Data - Building Infinitely Scalable Systems
Big and Fast Data - Building Infinitely Scalable SystemsFred Melo
 
OSGeo와 Open Data
OSGeo와 Open DataOSGeo와 Open Data
OSGeo와 Open Datar-kor
 
황성수 공공데이터 개방과 공공이슈 해결
황성수 공공데이터 개방과 공공이슈 해결황성수 공공데이터 개방과 공공이슈 해결
황성수 공공데이터 개방과 공공이슈 해결r-kor
 
Deciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsDeciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsR Systems International
 
Distributed R: The Next Generation Platform for Predictive Analytics
Distributed R: The Next Generation Platform for Predictive AnalyticsDistributed R: The Next Generation Platform for Predictive Analytics
Distributed R: The Next Generation Platform for Predictive AnalyticsJorge Martinez de Salinas
 
Optimizing Facebook Campaigns with R
Optimizing Facebook Campaigns with ROptimizing Facebook Campaigns with R
Optimizing Facebook Campaigns with RDomino Data Lab
 
Cloud Conf 2015 - Develop and Deploy IOT Applications
Cloud Conf 2015 - Develop and Deploy IOT ApplicationsCloud Conf 2015 - Develop and Deploy IOT Applications
Cloud Conf 2015 - Develop and Deploy IOT ApplicationsCorley S.r.l.
 
The Next List: R&D Breakthroughs that are Changing the World
The Next List: R&D Breakthroughs that are Changing the WorldThe Next List: R&D Breakthroughs that are Changing the World
The Next List: R&D Breakthroughs that are Changing the WorldGE
 
IMCSummit 2015 - Day 2 Developer Track - Implementing a Highly Scalable In-Me...
IMCSummit 2015 - Day 2 Developer Track - Implementing a Highly Scalable In-Me...IMCSummit 2015 - Day 2 Developer Track - Implementing a Highly Scalable In-Me...
IMCSummit 2015 - Day 2 Developer Track - Implementing a Highly Scalable In-Me...In-Memory Computing Summit
 
오픈데이터와 오픈소스 소프트웨어를 이용한 의료이용정보의 시각화
오픈데이터와 오픈소스 소프트웨어를 이용한 의료이용정보의 시각화오픈데이터와 오픈소스 소프트웨어를 이용한 의료이용정보의 시각화
오픈데이터와 오픈소스 소프트웨어를 이용한 의료이용정보의 시각화r-kor
 
Trading System Design
Trading System DesignTrading System Design
Trading System DesignMarketcalls
 
구조화된 데이터: Schema.org와 Microdata, RDFa, JSON-LD
구조화된 데이터: Schema.org와 Microdata, RDFa, JSON-LD구조화된 데이터: Schema.org와 Microdata, RDFa, JSON-LD
구조화된 데이터: Schema.org와 Microdata, RDFa, JSON-LDr-kor
 
Trading sentimental analysis
Trading sentimental analysisTrading sentimental analysis
Trading sentimental analysisMarketcalls
 
II-SDV 2016 Patrick Beaucamp - Data Science with R and Vanilla Air
II-SDV 2016 Patrick Beaucamp - Data Science with R and Vanilla AirII-SDV 2016 Patrick Beaucamp - Data Science with R and Vanilla Air
II-SDV 2016 Patrick Beaucamp - Data Science with R and Vanilla AirDr. Haxel Consult
 
Taking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudTaking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudRevolution Analytics
 
H2O World - Intro to R, Python, and Flow - Amy Wang
H2O World - Intro to R, Python, and Flow - Amy WangH2O World - Intro to R, Python, and Flow - Amy Wang
H2O World - Intro to R, Python, and Flow - Amy WangSri Ambati
 

Viewers also liked (20)

How to Contribute to Apache Geode
How to Contribute to Apache GeodeHow to Contribute to Apache Geode
How to Contribute to Apache Geode
 
Apache Geode (incubating) Introduction with Docker
Apache Geode (incubating) Introduction with DockerApache Geode (incubating) Introduction with Docker
Apache Geode (incubating) Introduction with Docker
 
Big and Fast Data - Building Infinitely Scalable Systems
Big and Fast Data - Building Infinitely Scalable SystemsBig and Fast Data - Building Infinitely Scalable Systems
Big and Fast Data - Building Infinitely Scalable Systems
 
OSGeo와 Open Data
OSGeo와 Open DataOSGeo와 Open Data
OSGeo와 Open Data
 
황성수 공공데이터 개방과 공공이슈 해결
황성수 공공데이터 개방과 공공이슈 해결황성수 공공데이터 개방과 공공이슈 해결
황성수 공공데이터 개방과 공공이슈 해결
 
Deciphering voice of customer through speech analytics
Deciphering voice of customer through speech analyticsDeciphering voice of customer through speech analytics
Deciphering voice of customer through speech analytics
 
Distributed R: The Next Generation Platform for Predictive Analytics
Distributed R: The Next Generation Platform for Predictive AnalyticsDistributed R: The Next Generation Platform for Predictive Analytics
Distributed R: The Next Generation Platform for Predictive Analytics
 
Optimizing Facebook Campaigns with R
Optimizing Facebook Campaigns with ROptimizing Facebook Campaigns with R
Optimizing Facebook Campaigns with R
 
Cloud Conf 2015 - Develop and Deploy IOT Applications
Cloud Conf 2015 - Develop and Deploy IOT ApplicationsCloud Conf 2015 - Develop and Deploy IOT Applications
Cloud Conf 2015 - Develop and Deploy IOT Applications
 
R lecture oga
R lecture ogaR lecture oga
R lecture oga
 
The Next List: R&D Breakthroughs that are Changing the World
The Next List: R&D Breakthroughs that are Changing the WorldThe Next List: R&D Breakthroughs that are Changing the World
The Next List: R&D Breakthroughs that are Changing the World
 
IMCSummit 2015 - Day 2 Developer Track - Implementing a Highly Scalable In-Me...
IMCSummit 2015 - Day 2 Developer Track - Implementing a Highly Scalable In-Me...IMCSummit 2015 - Day 2 Developer Track - Implementing a Highly Scalable In-Me...
IMCSummit 2015 - Day 2 Developer Track - Implementing a Highly Scalable In-Me...
 
오픈데이터와 오픈소스 소프트웨어를 이용한 의료이용정보의 시각화
오픈데이터와 오픈소스 소프트웨어를 이용한 의료이용정보의 시각화오픈데이터와 오픈소스 소프트웨어를 이용한 의료이용정보의 시각화
오픈데이터와 오픈소스 소프트웨어를 이용한 의료이용정보의 시각화
 
Trading System Design
Trading System DesignTrading System Design
Trading System Design
 
구조화된 데이터: Schema.org와 Microdata, RDFa, JSON-LD
구조화된 데이터: Schema.org와 Microdata, RDFa, JSON-LD구조화된 데이터: Schema.org와 Microdata, RDFa, JSON-LD
구조화된 데이터: Schema.org와 Microdata, RDFa, JSON-LD
 
Trading sentimental analysis
Trading sentimental analysisTrading sentimental analysis
Trading sentimental analysis
 
II-SDV 2016 Patrick Beaucamp - Data Science with R and Vanilla Air
II-SDV 2016 Patrick Beaucamp - Data Science with R and Vanilla AirII-SDV 2016 Patrick Beaucamp - Data Science with R and Vanilla Air
II-SDV 2016 Patrick Beaucamp - Data Science with R and Vanilla Air
 
Language R
Language RLanguage R
Language R
 
Taking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudTaking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the Cloud
 
H2O World - Intro to R, Python, and Flow - Amy Wang
H2O World - Intro to R, Python, and Flow - Amy WangH2O World - Intro to R, Python, and Flow - Amy Wang
H2O World - Intro to R, Python, and Flow - Amy Wang
 

Similar to Implementing a highly scalable stock prediction system with R, Geode, SpringXD and Spark

Building Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data GridsBuilding Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data GridsJohn Blum
 
Migrating from Big Data Architecture to Spring Cloud
Migrating from Big Data Architecture to Spring CloudMigrating from Big Data Architecture to Spring Cloud
Migrating from Big Data Architecture to Spring CloudVMware Tanzu
 
What We're Learning Adopting Spring Boot and PCF for Dell.com's eCommerce
What We're Learning Adopting Spring Boot and PCF for Dell.com's eCommerceWhat We're Learning Adopting Spring Boot and PCF for Dell.com's eCommerce
What We're Learning Adopting Spring Boot and PCF for Dell.com's eCommerceVMware Tanzu
 
Federated Queries with HAWQ - SQL on Hadoop and Beyond
Federated Queries with HAWQ - SQL on Hadoop and BeyondFederated Queries with HAWQ - SQL on Hadoop and Beyond
Federated Queries with HAWQ - SQL on Hadoop and BeyondChristian Tzolov
 
YugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
YugaByte DB—A Planet-Scale Database for Low Latency Transactional AppsYugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
YugaByte DB—A Planet-Scale Database for Low Latency Transactional AppsVMware Tanzu
 
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...VMware Tanzu
 
Containerizing a Data Warehouse for Kubernetes
Containerizing a Data Warehouse for KubernetesContainerizing a Data Warehouse for Kubernetes
Containerizing a Data Warehouse for KubernetesVMware Tanzu
 
Spring Cloud Gateway - Ryan Baxter
Spring Cloud Gateway - Ryan BaxterSpring Cloud Gateway - Ryan Baxter
Spring Cloud Gateway - Ryan BaxterVMware Tanzu
 
It’s a Multi-Cloud World, But What About The Data?
It’s a Multi-Cloud World, But What About The Data?It’s a Multi-Cloud World, But What About The Data?
It’s a Multi-Cloud World, But What About The Data?VMware Tanzu
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaJoe Stein
 
Cross-Platform Observability for Cloud Foundry
Cross-Platform Observability for Cloud FoundryCross-Platform Observability for Cloud Foundry
Cross-Platform Observability for Cloud FoundryVMware Tanzu
 
Developer Secure Containers for the Cyberspace Battlefield
Developer Secure Containers for the Cyberspace BattlefieldDeveloper Secure Containers for the Cyberspace Battlefield
Developer Secure Containers for the Cyberspace BattlefieldVMware Tanzu
 
Achieving High Throughput With Reliability In Transactional Systems
Achieving High Throughput With Reliability In Transactional SystemsAchieving High Throughput With Reliability In Transactional Systems
Achieving High Throughput With Reliability In Transactional SystemsVMware Tanzu
 
Beyond Caching: Extending Redis Enterprise for Real-Time Streams Processing
Beyond Caching: Extending Redis Enterprise for Real-Time Streams ProcessingBeyond Caching: Extending Redis Enterprise for Real-Time Streams Processing
Beyond Caching: Extending Redis Enterprise for Real-Time Streams ProcessingVMware Tanzu
 
Ratpack - SpringOne2GX 2015
Ratpack - SpringOne2GX 2015Ratpack - SpringOne2GX 2015
Ratpack - SpringOne2GX 2015Daniel Woods
 
Building a Secure App with Google Polymer and Java / Spring
Building a Secure App with Google Polymer and Java / SpringBuilding a Secure App with Google Polymer and Java / Spring
Building a Secure App with Google Polymer and Java / Springsdeeg
 
Living on the Edge With Spring Cloud Gateway - Cora Iberkleid
Living on the Edge With Spring Cloud Gateway - Cora IberkleidLiving on the Edge With Spring Cloud Gateway - Cora Iberkleid
Living on the Edge With Spring Cloud Gateway - Cora IberkleidVMware Tanzu
 
Living on the Edge With Spring Cloud Gateway - Cora Iberkleid
Living on the Edge With Spring Cloud Gateway - Cora IberkleidLiving on the Edge With Spring Cloud Gateway - Cora Iberkleid
Living on the Edge With Spring Cloud Gateway - Cora IberkleidVMware Tanzu
 
Building Data Environments for Production Microservices with Geode
Building Data Environments for Production Microservices with GeodeBuilding Data Environments for Production Microservices with Geode
Building Data Environments for Production Microservices with GeodeVMware Tanzu
 

Similar to Implementing a highly scalable stock prediction system with R, Geode, SpringXD and Spark (20)

Building Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data GridsBuilding Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data Grids
 
Migrating from Big Data Architecture to Spring Cloud
Migrating from Big Data Architecture to Spring CloudMigrating from Big Data Architecture to Spring Cloud
Migrating from Big Data Architecture to Spring Cloud
 
What We're Learning Adopting Spring Boot and PCF for Dell.com's eCommerce
What We're Learning Adopting Spring Boot and PCF for Dell.com's eCommerceWhat We're Learning Adopting Spring Boot and PCF for Dell.com's eCommerce
What We're Learning Adopting Spring Boot and PCF for Dell.com's eCommerce
 
Federated Queries with HAWQ - SQL on Hadoop and Beyond
Federated Queries with HAWQ - SQL on Hadoop and BeyondFederated Queries with HAWQ - SQL on Hadoop and Beyond
Federated Queries with HAWQ - SQL on Hadoop and Beyond
 
YugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
YugaByte DB—A Planet-Scale Database for Low Latency Transactional AppsYugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
YugaByte DB—A Planet-Scale Database for Low Latency Transactional Apps
 
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
Simple Data Movement Patterns: Legacy Application to Cloud-Native Environment...
 
S1P: Spring Cloud on PKS
S1P: Spring Cloud on PKSS1P: Spring Cloud on PKS
S1P: Spring Cloud on PKS
 
Containerizing a Data Warehouse for Kubernetes
Containerizing a Data Warehouse for KubernetesContainerizing a Data Warehouse for Kubernetes
Containerizing a Data Warehouse for Kubernetes
 
Spring Cloud Gateway - Ryan Baxter
Spring Cloud Gateway - Ryan BaxterSpring Cloud Gateway - Ryan Baxter
Spring Cloud Gateway - Ryan Baxter
 
It’s a Multi-Cloud World, But What About The Data?
It’s a Multi-Cloud World, But What About The Data?It’s a Multi-Cloud World, But What About The Data?
It’s a Multi-Cloud World, But What About The Data?
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Cross-Platform Observability for Cloud Foundry
Cross-Platform Observability for Cloud FoundryCross-Platform Observability for Cloud Foundry
Cross-Platform Observability for Cloud Foundry
 
Developer Secure Containers for the Cyberspace Battlefield
Developer Secure Containers for the Cyberspace BattlefieldDeveloper Secure Containers for the Cyberspace Battlefield
Developer Secure Containers for the Cyberspace Battlefield
 
Achieving High Throughput With Reliability In Transactional Systems
Achieving High Throughput With Reliability In Transactional SystemsAchieving High Throughput With Reliability In Transactional Systems
Achieving High Throughput With Reliability In Transactional Systems
 
Beyond Caching: Extending Redis Enterprise for Real-Time Streams Processing
Beyond Caching: Extending Redis Enterprise for Real-Time Streams ProcessingBeyond Caching: Extending Redis Enterprise for Real-Time Streams Processing
Beyond Caching: Extending Redis Enterprise for Real-Time Streams Processing
 
Ratpack - SpringOne2GX 2015
Ratpack - SpringOne2GX 2015Ratpack - SpringOne2GX 2015
Ratpack - SpringOne2GX 2015
 
Building a Secure App with Google Polymer and Java / Spring
Building a Secure App with Google Polymer and Java / SpringBuilding a Secure App with Google Polymer and Java / Spring
Building a Secure App with Google Polymer and Java / Spring
 
Living on the Edge With Spring Cloud Gateway - Cora Iberkleid
Living on the Edge With Spring Cloud Gateway - Cora IberkleidLiving on the Edge With Spring Cloud Gateway - Cora Iberkleid
Living on the Edge With Spring Cloud Gateway - Cora Iberkleid
 
Living on the Edge With Spring Cloud Gateway - Cora Iberkleid
Living on the Edge With Spring Cloud Gateway - Cora IberkleidLiving on the Edge With Spring Cloud Gateway - Cora Iberkleid
Living on the Edge With Spring Cloud Gateway - Cora Iberkleid
 
Building Data Environments for Production Microservices with Geode
Building Data Environments for Production Microservices with GeodeBuilding Data Environments for Production Microservices with Geode
Building Data Environments for Production Microservices with Geode
 

Recently uploaded

%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...masabamasaba
 
WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfonteinmasabamasaba
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationJuha-Pekka Tolvanen
 
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2
 
WSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...masabamasaba
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastPapp Krisztián
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in sowetomasabamasaba
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareJim McKeeth
 
WSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...masabamasaba
 
tonesoftg
tonesoftgtonesoftg
tonesoftglanshi9
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxalwaysnagaraju26
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...masabamasaba
 

Recently uploaded (20)

%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security Program
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
 
WSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AI
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...
Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...
Abortion Pill Prices Boksburg [(+27832195400*)] 🏥 Women's Abortion Clinic in ...
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
WSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - KanchanaWSO2Con2024 - Hello Choreo Presentation - Kanchana
WSO2Con2024 - Hello Choreo Presentation - Kanchana
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptxBUS PASS MANGEMENT SYSTEM USING PHP.pptx
BUS PASS MANGEMENT SYSTEM USING PHP.pptx
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 

Implementing a highly scalable stock prediction system with R, Geode, SpringXD and Spark

  • 1. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ SPRINGONE2GX WASHINGTON, DC Implementing a highly scalable Stock prediction system with R, Apache Geode and Spring XD Fred Melo
 @fredmelo_br William Markito
 @william_markito
  • 2. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ About us Fred Melo Technical Director for Data fmelo@pivotal.io @fredmelo_br 2 William Markito Enterprise Architect for GemFire wmarkito@pivotal.io @william_markito
  • 3. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 3
  • 4. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 4 It's all about DATA Data Sources Look for patterns Prediction
  • 5. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ What do we want to build? 5 "Smart System"
  • 6. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ … in our specific case 6 Trading Data "Smart System" Historical Data Repository Learns with historical trends "How were the medium average price and relative strength reading when the latest failures happened? " Live data becomes historical over time Real-Time Evaluates live data “According to historical trends, there’s an 80% chance this stock prices might go downhill within the next hour" Historical
  • 7. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ … in our specific case 7 Trading Data "Smart System" Historical Data Repository Learns with historical trends "How were the medium average price and relative strength reading when the latest failures happened? " Live data becomes historical over time Real-Time Evaluates live data “According to historical trends, there’s an 80% chance this stock prices might go downhill within the next hour" Historical
  • 8. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 8 Live Data Data Temperature Hot Cold Greenplum DB Apache Geode / GemFire 1- Live data is ingested into the grid 3 - Results are pushed immediately to deployed applications 4 - “Hot" data ages, becoming part of the historical dataset Machine Learning model 5 - Re-training is triggered, updating the model with the latest historical data Spring XD Spring XD The ML pipeline data flow 2 - Trained ML model compares new data to historical patterns
  • 9. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 9 Live Data Apache Geode / GemFire 1- Live data is ingested into the grid 2 - Trained ML model compares new data to historical patterns 3 - Results are pushed immediately to deployed applications Machine Learning model 4 - Re-training is triggered, updating the model with the latest historical data Spring XD Spring XD Simplified demo model Data Temperature Hot Warm
  • 10. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 10 Transform Sink SpringXD Extensible Open-Source Fault-Tolerant Horizontally Scalable Cloud-Native Machine Learning Enrich Filter Split Dashboard Indicators 1 2 Predict 3 Real data Simulator /Stocks /TechIndicators /Predictions
  • 11. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 11 Eating it in small bites…
  • 12. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 12 SpringXD GemFire
  • 13. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ • Cache • Configurable through XML, ,Java • Region • Distributed j.u.Map on steroids • Highly available, redundant • Member • Locator, Server, Client • Callbacks • Listener, Writer, AsyncEventListener, Parallel/Serial Apache Geode & GemFire Concepts 13
  • 14. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Apache Geode & GemFire, why ? • Performance • Consistency • Resiliency 14
  • 15. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Apache Geode & GemFire, why ? 15 © Copyright 2014 Pivotal. All rights reserved. Pivotal GemFire High Availability and Fault Tolerance in 6 acts Failing data copies are replaced transparently Data is replicated to other clusters and sites (WAN) Network segmentations are identified and fixed automatically Client and cluster disconnections are handled gracefully Data is persisted on local disk for ultimate durability “split brain” Failed function executions are restarted automatically restart
  • 16. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Some interesting cases… 16 China Railway
 Corporation 5,700 train stations 4.5 million tickets per day 20 million daily users 1.4 billion page views per day 40,000 visits per second * http://pivotal.io/big-data/pivotal-gemfire Indian Railways 7,000 stations 72,000 miles of track 23 million passengers daily 120,000 concurrent users 10,000 transactions per minute
  • 17. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Use cases and industries 17 Indian RailwaysChina Railway Corporation World: ~7,349,000,000 ~36% of the world population Population: 1,251,695,6161,401,586,609
  • 18. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ • Commercial product available since 2004 • Native clients in Java, C++, C#, REST • Event Subscriptions and Continuous Queries • Configurable WAN Gateway between clusters • Enterprise Support, commercial features Apache Geode & Pivotal GemFire • Open Sourced in April/2015 • Java Native Client, REST • 98% of GemFire API • Event subscriptions • ~30 contributors • Under Incubation 18
  • 19. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 19 SpringXD GemFire
  • 20. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ SpringXD Basic Concepts • Streams • Pipelines • Sources • Sinks • Filters • Taps 20
  • 21. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ SpringXD Basic Concepts 21
  • 22. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ A simple example 22 twittersearch --consumerKey=XXX —consumerSecret=XXX -- query=SpringOne2GX --outputType=application/json | gemfire-json- server --useLocator=true --host=localhost --port=10334 -- regionName=tweets --keyExpression=payload.getField('id_str') twittersearch --query=SpringOne2GX | gemfire-json-server --host=localhost--regionName=tweets
  • 23. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 23 SpringXD GemFire
  • 24. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Apache Spark Concepts • RDD • Dataframe • Driver • Worker 24 "An RDD in Spark is simply an immutable distributed collection of objects. Each RDD is split into multiple partitions, which may be computed on different nodes of the cluster. RDDs can contain any type of Python, Java, or Scala objects, including user-defined classes."
  • 25. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Apache Spark Concepts • RDD • Dataframe • Driver • Worker 25
  • 26. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 26 medium avg (x+1) relative strength (x) medium avg (x) price(x) Machine Learning Model (e.g. Linear Regression)
  • 27. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 27 medium avg (x+1) relative strength (x) medium avg (x) price(x) Machine Learning Model (e.g. Linear Regression) Features Label
  • 28. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 28 Transform Sink SpringXD Extensible Open-Source Fault-Tolerant Horizontally Scalable Cloud-Native Machine Learning Enrich Filter Split Dashboard Indicators 1 2 Predict 3 Real data Simulator /Stocks /TechIndicators /Predictions
  • 29. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ 29
  • 30. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Learn more! 30 https://github.com/Pivotal-Open-Source-Hub/geode-security-samples https://github.com/Pivotal-Open-Source-Hub/WifiAnalyticsIoT https://github.com/Pivotal-Open-Source-Hub/geode-social-demo http://pivotal-open-source-hub.github.io/StockInference-Spark/
  • 31. Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a
 Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ Thank you 31 @william_markito @fredmelo_br Related: Building Highly-Scalable Spring Applications with In-Memory, Distributed Data Grids by John Blum & Luke Shannon September 15, 2015 -10:30 - Salon M http://pivotal-open-source-hub.github.io/StockInference-Spark/