Spring XD 
Pivotal Confidential–Internal Use Only 
Glenn Renfro 
grenfro@pivotal.io
@CPPWFS
Volume 
Velocity 
Variety 
Veracity 
60-100 sensors in each car 
22 Billion sensors by 2020 
420 Million Wearables 
90% of enterprise data is unstructured
500 million tweets each day
2.3 Trillion GBs of data each day
86% suspect data inaccuracy
30% revenue loss due to bad data quality
Data Points: McKinsey, Twitter, Gartner, IBM
Batch and Streaming often handled by multiple platforms
Fragmented Big Data Ecosystem
Not all data Hadoop bound
SPRING XD 
EXTREME DATA 
“One stop shop for developing and deploying Big Data Applications”
Spring XD to the Rescue
Batch and Streaming often handled by multiple platforms
Fragmented Big Data Ecosystem
Not all data Hadoop bound
• Unified Stream and Batch Operations
• Hadoop Batch Workflow Orchestration
• Predictive Analytics and Model Scoring
• Portable: on-prem, YARN, EC2, PCF, Mesos, Docker etc.
• Easy to Use, Extend and Integrate with other Technologies
• Built on proven Spring EAI and Batch projects
(Volume, Velocity, Veracity, and Variety)
• XD: Stream, Taps, Jobs
• BOOT: Bootable, Minimal, Ops-Ready
• GRAILS: Full-stack, Web
• INTEGRATION: Channels, Adapters, Filters, Transformers
• BATCH: Jobs, Steps, Readers, Writers
• BIG DATA: Ingestion, Export, Orchestration, Hadoop
• WEB: Controllers, REST, WebSocket
• DATA: Relational Data Access, Non-Relational Data Access
• SPRING CORE: FRAMEWORK, SECURITY, GROOVY, REACTOR
• Layer labels: IO EXECUTION, IO FOUNDATION, IO COORDINATION, SPRING CLOUD
Spring XD - 10,000 Foot View 
Streams 
Sources: HTTP, Tail, File, Mail, Twitter, Gemfire, Syslog, TCP, UDP, JMS, RabbitMQ, MQTT, Trigger, Reactor TCP/UDP
Processors: Filter, Transformer, Object-to-JSON, JSON-to-Tuple, Splitter, Aggregator, HTTP Client, JPMML Evaluator, Shell, Groovy, Python, Java
Sinks: File, HDFS, JDBC, TCP, Log, Mail, RabbitMQ, Gemfire, Splunk, MQTT, Dynamic Router, Counters
Create a stream with http as a source and hdfs as a sink. The hdfs --rollover option is set to a small value so that the file is rolled over quickly and we can read it on hdfs.
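The stream described above is written as a source and a sink joined by a pipe. The following is a minimal Python sketch that only assembles the command text; it does not talk to Spring XD. The helper names, the stream name `httpToHdfs` and the rollover value `10` are hypothetical, and the `stream create --name ... --definition ... --deploy` form is recalled from the XD shell rather than taken from these slides, so check it against the reference guide for your version.

```python
def build_definition(source, sink, sink_options=None):
    """Join a source and a sink with the pipe symbol used by the stream DSL."""
    options = "".join(
        f" --{key}={value}" for key, value in (sink_options or {}).items()
    )
    return f"{source} | {sink}{options}"


def build_stream_command(name, definition):
    """Wrap a definition in a shell command string (assumed command form)."""
    return f'stream create --name {name} --definition "{definition}" --deploy'


# Hypothetical values: a small rollover so the HDFS file is closed quickly.
definition = build_definition("http", "hdfs", {"rollover": 10})
command = build_stream_command("httpToHdfs", definition)
print(command)
```

Keeping the definition separate from the command mirrors the slide's point: the pipeline itself is just `source | sink` plus options.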
Spring XD - Distributed Runtime 
[Diagram: the XD Shell sends HTTP POST /streams/aStream “M1 | M2” to the XD Admin (leader). Components shown: XD Admin (leader) plus further XD Admins, XD Containers, Message Bus, ZooKeeper (container state). Modules M1 and M2 each run in a Spring App Context inside an XD Container.]
Spring XD - Analytics 
• Counters and Gauges
  • Simple & Field Value Counter (how many tweets for #java)
  • Aggregate Counter (how many tweets for #java in the week/day/hr)
  • Gauge & Rich Gauge (how many requests / minute?)
  • Abstract API, implemented in Redis and in-memory
• Predictive Model Evaluation
  • JPMML
  • Is this transaction fraudulent?
  • What group does this user belong to?
  • Interoperable with R, Rattle, KNIME, RapidMiner, MADLib
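To make the counter types concrete, here is a minimal in-memory Python sketch of a field value counter and an aggregate counter. It is an illustration of the idea only, not Spring XD's counter API; the class names, method names and sample tweets are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime


class FieldValueCounter:
    """Counts how often each distinct value of a field occurs."""

    def __init__(self):
        self.counts = Counter()

    def increment(self, value):
        self.counts[value] += 1


class AggregateCounter:
    """Counts events in per-minute buckets and sums them per hour or day."""

    def __init__(self):
        self.per_minute = defaultdict(int)

    def increment(self, when):
        self.per_minute[when.replace(second=0, microsecond=0)] += 1

    def total_for_hour(self, when):
        hour = when.replace(minute=0, second=0, microsecond=0)
        return sum(
            n for t, n in self.per_minute.items() if t.replace(minute=0) == hour
        )

    def total_for_day(self, when):
        return sum(n for t, n in self.per_minute.items() if t.date() == when.date())


# Hypothetical sample data: (timestamp, hashtags) pairs.
tweets = [
    (datetime(2015, 1, 5, 9, 15, 30), ["java", "spring"]),
    (datetime(2015, 1, 5, 9, 45, 0), ["java"]),
    (datetime(2015, 1, 5, 10, 5, 0), ["scala"]),
]

hashtags = FieldValueCounter()
java_over_time = AggregateCounter()
for when, tags in tweets:
    for tag in tags:
        hashtags.increment(tag)
    if "java" in tags:
        java_over_time.increment(when)

print(hashtags.counts["java"])
print(java_over_time.total_for_hour(datetime(2015, 1, 5, 9, 0)))
```

The field value counter answers "how many tweets for #java", while the aggregate counter keeps time buckets so the same question can be asked per hour or per day.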
Jobs 
CSV to JDBC 
FTP to HDFS 
JDBC to HDFS 
HDFS to JDBC 
HDFS to MongoDB
[Architecture diagram. Data sources: SENSORS, SOCIAL, MOBILE, FILES. Layers: SPEED LAYER, BATCH LAYER, SERVING LAYER. Components: Spring XD (Ingest, Stream Processing, Analytics, Workflow Orchestration, Export), GemFire XD, Predictive Modeling, MASTER DATASET, REALTIME VIEWS, BATCH VIEWS, Spring BOOT apps. Deployment: PCF - BOSH Service, PCF - Apps.]
• Unified runtime for both Real-time and Batch use cases
• Scalable, Distributed and Fault Tolerant Runtime
• Increased Productivity through out-of-the-box components
• Closed Loop Analytics through online (stream) and offline (batch) data
• Swiss-army knife of data movement and data pipelines
• Repeatable ‘turnkey’ solution for next generation data-centric use cases
Agility: Easy to Set Up and Run
Writing HTTP Data to HDFS
…that simple!
Spring XD on YARN 
Spring XD Running on YARN!
Copies Files to HDFS
Creates manifest.yml
Spring Boot App 
‘xd-yarn start admin’ 
Spring Boot App 
‘xd-yarn start container’ 
Spring Boot App
Even easier with PCF
Natural Fit: Reactive Streaming Pipelines 
• Moving Average: ‘collect values every 500ms’
• Non-Blocking Backpressure
  OLD: “take all these items I have whether you can handle them or not”
  NEW: “give me the next N available items”
• Microbatching: ‘either 1024b or 350ms; trigger downstream processing’
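The microbatching rule quoted above (flush on size or on elapsed time, whichever comes first) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Reactor's or Spring XD's implementation: `MicroBatcher` is a hypothetical class, timestamps are passed in explicitly to keep the example deterministic, and the time limit is only checked when an item arrives (a real implementation would also use a timer).

```python
class MicroBatcher:
    """Buffers items and releases a batch once a size or time limit is hit."""

    def __init__(self, max_bytes=1024, max_wait_ms=350):
        self.max_bytes = max_bytes
        self.max_wait_ms = max_wait_ms
        self.buffer = []
        self.buffered_bytes = 0
        self.first_arrival_ms = None

    def offer(self, item, now_ms):
        """Add an item; return the batch if a limit was reached, else None."""
        if self.first_arrival_ms is None:
            self.first_arrival_ms = now_ms
        self.buffer.append(item)
        self.buffered_bytes += len(item)
        size_hit = self.buffered_bytes >= self.max_bytes
        time_hit = now_ms - self.first_arrival_ms >= self.max_wait_ms
        if size_hit or time_hit:
            return self.flush()
        return None

    def flush(self):
        """Hand the buffered items downstream and reset the buffer."""
        batch, self.buffer = self.buffer, []
        self.buffered_bytes = 0
        self.first_arrival_ms = None
        return batch


batcher = MicroBatcher(max_bytes=1024, max_wait_ms=350)
first = batcher.offer(b"x" * 600, now_ms=0)      # below both limits
by_size = batcher.offer(b"x" * 600, now_ms=10)   # 1200 bytes buffered: size limit
waiting = batcher.offer(b"a", now_ms=100)        # new batch starts at 100 ms
by_time = batcher.offer(b"b", now_ms=460)        # 360 ms elapsed: time limit
print(len(by_size), by_time)
```

Downstream processing is triggered once per returned batch rather than once per item, which is the point of microbatching.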
Deployment Manifest – Module Count 
• http | doWork | hdfs 
[Diagram: 2 http, 4 doWork and 3 hdfs module instances]
stream deploy --name s1
--properties
module.http.count=2,
module.doWork.count=4,
module.hdfs.count=3
Deployment Manifest – Module Placement 
• http | doWork | hdfs 
[Diagram: 2 http module instances placed in the WEB group, 4 doWork and 3 hdfs module instances]
stream deploy --name s1
--properties
module.http.count=2,
module.doWork.count=4,
module.hdfs.count=3,
module.http.criteria=groups.contains('WEB')
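The effect of a module count combined with a placement criteria can be sketched as follows. This Python example is a simplified illustration, not Spring XD's deployer: the container names and groups are hypothetical, the criteria is reduced to a single required group instead of an evaluated expression, and round-robin assignment is an assumption made for the sketch.

```python
from itertools import cycle


def place_instances(module, count, containers, required_group=None):
    """Assign `count` instances of a module to the containers that qualify."""
    candidates = [
        c for c in containers
        if required_group is None or required_group in c["groups"]
    ]
    if not candidates:
        raise ValueError(f"no container matches the criteria for {module}")
    targets = cycle(candidates)  # round-robin over matching containers (assumption)
    return [(f"{module}-{i}", next(targets)["name"]) for i in range(count)]


# Hypothetical cluster: two containers in the WEB group, one outside it.
containers = [
    {"name": "c1", "groups": {"WEB"}},
    {"name": "c2", "groups": {"WEB", "BATCH"}},
    {"name": "c3", "groups": {"BATCH"}},
]

http_placement = place_instances("http", 2, containers, required_group="WEB")
work_placement = place_instances("doWork", 4, containers)
print(http_placement)
print(work_placement)
```

Only the http module is restricted to WEB containers here; doWork has no criteria and may land on any container.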
Deployment Manifest – Data Partitioning 
• http | doWork | hdfs 
[Diagram: 2 http module instances placed in the WEB group, 4 doWork and 3 hdfs module instances]
stream deploy --name s1
--properties
...
module.http.producer.partitionKeyExpression=payload.customerId
Each doWork module instance will always process the same set of customer IDs
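Why a partition key keeps each customer on the same doWork instance can be shown with a short Python sketch. It assumes a stable hash of the key taken modulo the number of consumer instances; the hash function used by the real message bus may differ, and `partition_for` plus the sample messages are hypothetical.

```python
import zlib


def partition_for(key, partition_count):
    """Map a key to a partition index using a hash that is stable across runs."""
    return zlib.crc32(str(key).encode("utf-8")) % partition_count


# Hypothetical payloads; customerId plays the role of payload.customerId.
messages = [
    {"customerId": "c-17", "amount": 5},
    {"customerId": "c-42", "amount": 9},
    {"customerId": "c-17", "amount": 3},
    {"customerId": "c-99", "amount": 1},
    {"customerId": "c-42", "amount": 7},
]

DO_WORK_INSTANCES = 4  # matches module.doWork.count=4 on the earlier slide
routed = {}
for message in messages:
    index = partition_for(message["customerId"], DO_WORK_INSTANCES)
    routed.setdefault(index, []).append(message)

# Every customer ID ends up in exactly one partition.
owners = {}
for index, batch in routed.items():
    for message in batch:
        owners.setdefault(message["customerId"], set()).add(index)
print(owners)
```

Because the index depends only on the key and the instance count, all messages for one customer go to one instance, which lets that instance keep per-customer state locally.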
Learn More 
• Project: http://projects.spring.io/spring-xd/ 
• GitHub: https://github.com/spring-projects/spring-xd/ 
• Wiki: https://github.com/spring-projects/spring-xd/wiki 
• Samples: https://github.com/spring-projects/spring-xd-samples 
A NEW PLATFORM FOR A NEW ERA

 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 

Big Data Applications Made Easy: Fact Or Fiction?

  • 1. Spring XD. Glenn Renfro, grenfro@pivotal.io, @CPPWFS. Pivotal Confidential–Internal Use Only
  • 2. Data: Volume, Velocity, Variety, Veracity. 60-100 sensors in each car; 22 billion sensors by 2020; 420 million wearables; 90% of enterprise data is unstructured; 500 million tweets each day; 2.3 trillion GB of data each day; 86% suspect data inaccuracy; 30% revenue loss due to bad data quality. Data points: McKinsey, Twitter, Gartner, IBM
  • 3. Fragmented Big Data Ecosystem: batch and streaming are often handled by multiple platforms; not all data is Hadoop bound
  • 4. SPRING XD EXTREME DATA “One stop shop for developing and deploying Big Data Applications”
  • 5. Spring XD to Rescue. Problems: batch and streaming often handled by multiple platforms; fragmented Big Data ecosystem; not all data Hadoop bound. Spring XD offers: unified stream and batch operations; Hadoop batch workflow orchestration; predictive analytics and model scoring; portability across on-prem, YARN, EC2, PCF, Mesos, Docker, etc.; easy to use, extend, and integrate with other technologies; built on the proven Spring EAI and Batch projects (Volume, Velocity, Veracity, and Variety)
  • 6. Spring IO platform diagram. IO EXECUTION: XD (Stream, Taps, Jobs); BOOT (Bootable, Minimal, Ops-Ready); GRAILS (Full-stack, Web). IO COORDINATION: SPRING CLOUD. IO FOUNDATION: INTEGRATION (Channels, Adapters, Filters, Transformers); BATCH (Jobs, Steps, Readers, Writers); BIG DATA (Ingestion, Export, Orchestration, Hadoop); WEB (Controllers, REST, WebSocket); DATA (Relational Data Access, Non-relational Data Access). SPRING CORE: FRAMEWORK, SECURITY, GROOVY, REACTOR
  • 7. Spring XD - 10,000 Foot View
  • 8. Streams. Sources: HTTP, Tail, File, Mail, Twitter, Gemfire, Syslog, TCP, UDP, JMS, RabbitMQ, MQTT, Trigger, Reactor TCP/UDP. Processors: Filter, Transformer, Object-to-JSON, JSON-to-Tuple, Splitter, Aggregator, HTTP Client, JPMML Evaluator, Shell, Groovy, Python, Java. Sinks: File, HDFS, JDBC, TCP, Log, Mail, RabbitMQ, Gemfire, Splunk, MQTT, Dynamic Router, Counters
  • 9. Create a stream with http as a source and hdfs as a sink. The hdfs --rollover option is set to a small value so that we can read the file on HDFS right away.
  • 10. Spring XD - Distributed Runtime. Diagram components: XD Shell (HTTP POST /streams/aStream “M1 | M2”); XD Admin (leader) plus two further XD Admin instances; ZooKeeper (container state); two XD Containers (Spring App Context running modules M1 and M2); Message Bus
  • 13. Spring XD - Analytics. Counters and Gauges: Simple & Field Value Counter (how many tweets for #java); Aggregate Counter (how many tweets for #java in the week/day/hr); Gauge & Rich Gauge (how many requests / minute?); abstract API, implemented in Redis and in-memory. Predictive Model Evaluation: JPMML; is this transaction fraudulent?; what group does this user belong to?; interoperable with R, Rattle, KNIME, RapidMiner, MADlib
  • 14. Jobs: CSV to JDBC; FTP to HDFS; JDBC to HDFS; HDFS to JDBC; HDFS to MongoDB
  • 15. Architecture diagram labels. Inputs: SENSORS, SOCIAL, MOBILE, FILES. Layers: SPEED LAYER, BATCH LAYER, SERVING LAYER. Data: REALTIME VIEWS, BATCH VIEWS, MASTER DATASET. Spring XD: Stream Processing, Analytics, Predictive Modeling, Ingest, Workflow Orchestration, Export. Also shown: GemFire XD, Spring BOOT, PCF - BOSH Service, PCF - Apps
  • 16. Unified runtime for both real-time and batch use cases; scalable, distributed, and fault-tolerant runtime; increased productivity through out-of-the-box components; closed-loop analytics through online (stream) and offline (batch) data; Swiss-army knife of data movement and data pipelines; repeatable ‘turnkey’ solution for next-generation data-centric use cases
  • 17. Agility: Easy to Set Up and Run. Writing HTTP Data to HDFS …that simple!
  • 18. Spring XD on YARN: Spring XD running on YARN! Diagram labels: Creates manifest.yml; Copies Files to HDFS (Spring Boot App); ‘xd-yarn start admin’ (Spring Boot App); ‘xd-yarn start container’ (Spring Boot App)
  • 19. Even easier with PCF
  • 20. Natural Fit: Reactive Streaming Pipelines. Moving Average: ‘collect values every 500ms’. Microbatching: ‘either 1024b or 350ms; trigger downstream processing’. Non-Blocking Backpressure: OLD “take all these items I have whether you can handle them or not”; NEW “give me the next N available items”
  • 21. Deployment Manifest – Module Count. Stream: http | doWork | hdfs, deployed as 2 http, 4 doWork, and 3 hdfs instances: stream deploy --name s1 --properties module.http.count=2,module.doWork.count=4,module.hdfs.count=3
  • 22. Deployment Manifest – Module Placement. Stream: http | doWork | hdfs, deployed as 2 http, 4 doWork, and 3 hdfs instances, with the http modules restricted to the WEB group: stream deploy --name s1 --properties module.http.count=2,module.doWork.count=4,module.hdfs.count=3,module.http.criteria=groups.contains('WEB')
  • 23. Deployment Manifest – Data Partitioning. Stream: http | doWork | hdfs, deployed as 2 http, 4 doWork, and 3 hdfs instances: stream deploy --name s1 --properties ... module.http.producer.partitionKeyExpression=payload.customerId. Each doWork module will always process the same set of customer IDs
  • 24. Learn More • Project: http://projects.spring.io/spring-xd/ • GitHub: https://github.com/spring-projects/spring-xd/ • Wiki: https://github.com/spring-projects/spring-xd/wiki • Samples: https://github.com/spring-projects/spring-xd-samples
  • 25. A NEW PLATFORM FOR A NEW ERA
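
Slide 23 says that, with a partition key expression on payload.customerId, each doWork module always sees the same set of customer IDs. The Python sketch below illustrates that property only. It assumes a simple hash-modulo selection; the helper names select_partition and route are hypothetical, and this is not Spring XD's actual partition selector.

```python
import zlib


def select_partition(key, partition_count):
    """Map a partition key to an instance index (assumed hash-modulo rule)."""
    # crc32 is used because it is stable across runs, unlike Python's str hash.
    digest = zlib.crc32(str(key).encode("utf-8"))
    return digest % partition_count


def route(messages, partition_count, key_of):
    """Group messages by the instance that would receive them."""
    buckets = {i: [] for i in range(partition_count)}
    for message in messages:
        index = select_partition(key_of(message), partition_count)
        buckets[index].append(message)
    return buckets


if __name__ == "__main__":
    payloads = [{"customerId": c} for c in ["a", "b", "c", "a", "b", "a"]]
    # Four consumers, mirroring module.doWork.count=4 on the slide.
    for index, bucket in route(payloads, 4, lambda m: m["customerId"]).items():
        print(index, [m["customerId"] for m in bucket])
```

Because the index depends only on the key and the instance count, every message for a given customer lands on the same instance, which is what makes per-customer state in doWork safe.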

Editor's Notes

  1. Big Data Overview: Everything starts with data! Let’s look at the 4 V’s of Big Data. Volume: data generation is at massive scale. Velocity: the need for data agility is mandatory. Veracity: bad data quality poses enormous risk. Variety: heterogeneous data requirements.
  2. Flume, Storm, Spark, Oozie. List the top challenges. Hadoop isn’t always the target: it may be Mongo, an RDBMS, Redis, an in-memory data grid, or a stream to a microservice.
  3. Pitch Spring XD! Relate to the discussed problem and progress to the next slide for solutions.
  4. Let’s see how Spring XD tackles the described challenges. Demo (http client to hdfs), run from the XD shell:
       stream create foo --definition "http | hdfs --rollover=11" --deploy
       http post --target http://localhost:9000 --data "hello world"
       hadoop fs ls /xd/foo
       hadoop config fs --namenode hdfs://localhost:8020
  5. Brief overview on Spring IO platform.
  6. Architecture overview.
  7. Demos: http client, filter, hdfs; http client, filter, rdbms; http client, filter, count on hdfs; a job to move that data to Mongo.
  8. Closer look at Spring XD’s business value proposition: unified runtime, runtime features, productivity, closed-loop analytics, enterprise data pipelines, data-centric use cases.
  9. Easy to set up.
  10. Even on YARN, it’s that SIMPLE!
  11. A
  12. Spring Reactor’s NIO and async dispatcher fit the Spring XD model naturally.
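
The microbatching rule on slide 20 (‘either 1024b or 350ms; trigger downstream processing’) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Reactor's API: the function name microbatch is hypothetical, events carry their own millisecond timestamps, and the time limit is checked only when the next event arrives rather than by a background timer.

```python
def microbatch(events, max_bytes=1024, max_ms=350):
    """Yield lists of payloads, closing a batch on size or age.

    events: iterable of (timestamp_ms, payload_bytes) in time order.
    """
    batch, size, started = [], 0, None
    for timestamp, payload in events:
        # Age check: close the open batch if it is older than max_ms.
        if batch and timestamp - started >= max_ms:
            yield batch
            batch, size, started = [], 0, None
        if started is None:
            started = timestamp
        batch.append(payload)
        size += len(payload)
        # Size check: close the batch once it reaches max_bytes.
        if size >= max_bytes:
            yield batch
            batch, size, started = [], 0, None
    if batch:
        # Flush whatever is left when the input ends.
        yield batch


if __name__ == "__main__":
    sample = [(0, b"x" * 600), (10, b"x" * 600), (20, b"a"), (100, b"b"), (400, b"c")]
    for group in microbatch(sample):
        print(len(group), sum(len(p) for p in group))
```

A real reactive pipeline would also flush on a timer and apply backpressure, requesting the next N items instead of accepting everything pushed at it; both are left out here to keep the sketch short.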