Real time stock processing with apache nifi, apache flink and apache kafka

Timothy Spann
Timothy SpannDeveloper Advocate
Real-Time Stock Processing
With Apache NiFi, Apache Flink and Apache Kafka
Timothy Spann, Principal DataFlow Field Engineer @
Pierre Villard, Senior Product Manager @
Who?
Tim Spann
@PaasDev // Blog: www.datainmotion.dev
Principal DataFlow Field Engineer. Princeton Future of Data Meetup.
https://github.com/tspannhw/EverythingApacheNiFi
https://community.cloudera.com/t5/Community-Articles/Real-Time-Stock-Processing-With-A
pache-NiFi-and-Apache-Kafka/ta-p/249221
Pierre Villard
Twitter & Github - @pvillard31 // Blog: www.pierrevillard.com
Committer and PMC member for Apache NiFi (in the community since 2015)
Senior Product Manager at Cloudera for products around Apache NiFi, NiFi Registry and MiNiFi
Previously at Google & Hortonworks
What?
This talk is about ingesting real-time data from many sources and build a dashboard on top it to track in
real time what our stocks are.
This use case is a good example to show the combination of some of the best Apache solutions for
streaming applications.
NiFi, Kafka and Flink in a few numbers
- Apache NiFi (version 1.12.x) - created and open sourced by the NSA - initial release in 2006
350+ contributors, 1200+ people in Slack, 3.1M+ docker pulls
Many sub-projects: NiFi, MiNiFi Java, MiNiFi C++, NiFi Registry, etc
- Apache Kafka (version 2.6.x) - created and open sourced by LinkedIn - initial release in 2011
700+ contributors
- Apache Flink (version 1.11.x) - initial release in 2011
750+ contributors, 2nd top repository by number of commits, top active project on mailing lists
What is NiFi used for?
Analyze
Streaming OLAP
Analytics & Time Series
Store Powered by
Druid & Kudu
Buffer
Apache Kafka
Topics
Ingest Gateway
Powered by Kafka
Distribute
Apache NiFi
Data Flow Apps
Powered by NiFi
Buffer
Apache Kafka
Syndicate
topics
Syndicate Services
Powered by Kafka
Collect
Syndicate
topics
Syndicate Services
Powered by Kafka
Replication /
Data Deployment
Analyze
Streaming Analytics Apps
Stream Processing
Powered by Flink
Streaming Reference Architecture
Data Collection
at the Edge
Apache NiFi / MiNiFi
- sensors, IoT
- databases
- file systems
- app sidecar
- live streams
- MQ
- logs
- network
Anything… you
name it!
Where?
CDP services are optimized for the elastic
compute & ‘always-on’ storage services provided
by any cloud provider
Web service hosted and managed by Cloudera
Hosted in the your cloud environment, but
managed by the CDP Management Console
Shared Data Experience (SDX) technologies form
a secure and governed data lake backed by object
storage (S3, ADLS, GCS)
Flow Management Streams Messaging Streaming Analytics
How? This use case architecture
Stock Data
Logs
Errors
Aggregates
Other data
SQL
Analytics
Demonstration
Let’s see all of this in action…
Thanks! Questions?
Timothy Spann, Principal DataFlow Field Engineer @
Pierre Villard, Senior Product Manager @
1 of 10

Recommended

Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka by
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaKai Wähner
16.1K views49 slides
Stream processing using Kafka by
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
1.6K views44 slides
The Top 5 Apache Kafka Use Cases and Architectures in 2022 by
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022Kai Wähner
8.4K views64 slides
Integrating Apache Spark and NiFi for Data Lakes by
Integrating Apache Spark and NiFi for Data LakesIntegrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data LakesDataWorks Summit/Hadoop Summit
10.9K views18 slides
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G... by
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
1K views56 slides
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose by
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseAldrin Piri
2.3K views38 slides

More Related Content

What's hot

Introduction to Kafka Streams by
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
29.7K views136 slides
Data ingestion and distribution with apache NiFi by
Data ingestion and distribution with apache NiFiData ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFiLev Brailovskiy
8.5K views35 slides
ksqlDB: A Stream-Relational Database System by
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database Systemconfluent
1.4K views37 slides
Kafka 101 by
Kafka 101Kafka 101
Kafka 101Clement Demonchy
2.4K views41 slides
Apache Flink and what it is used for by
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used forAljoscha Krettek
1.4K views22 slides
A Thorough Comparison of Delta Lake, Iceberg and Hudi by
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiDatabricks
11.1K views27 slides

What's hot(20)

Introduction to Kafka Streams by Guozhang Wang
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
Guozhang Wang29.7K views
Data ingestion and distribution with apache NiFi by Lev Brailovskiy
Data ingestion and distribution with apache NiFiData ingestion and distribution with apache NiFi
Data ingestion and distribution with apache NiFi
Lev Brailovskiy8.5K views
ksqlDB: A Stream-Relational Database System by confluent
ksqlDB: A Stream-Relational Database SystemksqlDB: A Stream-Relational Database System
ksqlDB: A Stream-Relational Database System
confluent1.4K views
Apache Flink and what it is used for by Aljoscha Krettek
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
Aljoscha Krettek1.4K views
A Thorough Comparison of Delta Lake, Iceberg and Hudi by Databricks
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
Databricks11.1K views
Getting Started with Confluent Schema Registry by confluent
Getting Started with Confluent Schema RegistryGetting Started with Confluent Schema Registry
Getting Started with Confluent Schema Registry
confluent485 views
Introduction to Apache NiFi dws19 DWS - DC 2019 by Timothy Spann
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019
Timothy Spann1.8K views
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019 by confluent
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
confluent10.2K views
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree by Slim Baltagi
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision TreeApache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Slim Baltagi14.5K views
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic by DataScienceConferenc1
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog... by Hortonworks
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks44.9K views
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita... by Kai Wähner
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
Kai Wähner1.1K views
Introduction to Apache Kafka and Confluent... and why they matter by confluent
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matter
confluent4.6K views
Introduction to Apache Kafka by Jeff Holoman
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
Jeff Holoman52.2K views
Kafka Tutorial - Introduction to Apache Kafka (Part 1) by Jean-Paul Azar
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Jean-Paul Azar8.9K views
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi by Timothy Spann
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFiReal-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Real-time Twitter Sentiment Analysis and Image Recognition with Apache NiFi
Timothy Spann2.2K views
Scaling your Data Pipelines with Apache Spark on Kubernetes by Databricks
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks2.1K views

Similar to Real time stock processing with apache nifi, apache flink and apache kafka

Real time cloud native open source streaming of any data to apache solr by
Real time cloud native open source streaming of any data to apache solrReal time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solrTimothy Spann
759 views31 slides
Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi by
Apache-Flink-What-How-Why-Who-Where-by-Slim-BaltagiApache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Apache-Flink-What-How-Why-Who-Where-by-Slim-BaltagiSlim Baltagi
16.5K views116 slides
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks by
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksOverview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksSlim Baltagi
4.1K views56 slides
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks by
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksOverview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksDataWorks Summit/Hadoop Summit
2.1K views56 slides
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks by
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksOverview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksSlim Baltagi
2K views56 slides
Running Apache NiFi with Apache Spark : Integration Options by
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsTimothy Spann
6.2K views38 slides

Similar to Real time stock processing with apache nifi, apache flink and apache kafka(20)

Real time cloud native open source streaming of any data to apache solr by Timothy Spann
Real time cloud native open source streaming of any data to apache solrReal time cloud native open source streaming of any data to apache solr
Real time cloud native open source streaming of any data to apache solr
Timothy Spann759 views
Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi by Slim Baltagi
Apache-Flink-What-How-Why-Who-Where-by-Slim-BaltagiApache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Apache-Flink-What-How-Why-Who-Where-by-Slim-Baltagi
Slim Baltagi16.5K views
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks by Slim Baltagi
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksOverview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Slim Baltagi4.1K views
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks by Slim Baltagi
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksOverview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
Slim Baltagi2K views
Running Apache NiFi with Apache Spark : Integration Options by Timothy Spann
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
Timothy Spann6.2K views
Unified Batch and Real-Time Stream Processing Using Apache Flink by Slim Baltagi
Unified Batch and Real-Time Stream Processing Using Apache FlinkUnified Batch and Real-Time Stream Processing Using Apache Flink
Unified Batch and Real-Time Stream Processing Using Apache Flink
Slim Baltagi8K views
ApacheCon 2021 - Apache NiFi Deep Dive 300 by Timothy Spann
ApacheCon 2021 - Apache NiFi Deep Dive 300ApacheCon 2021 - Apache NiFi Deep Dive 300
ApacheCon 2021 - Apache NiFi Deep Dive 300
Timothy Spann690 views
Codeless pipelines with pulsar and flink by Timothy Spann
Codeless pipelines with pulsar and flinkCodeless pipelines with pulsar and flink
Codeless pipelines with pulsar and flink
Timothy Spann658 views
Using FLiP with influxdb for edgeai iot at scale 2022 by Timothy Spann
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022
Timothy Spann465 views
Robust stream processing with Apache Flink by Aljoscha Krettek
Robust stream processing with Apache FlinkRobust stream processing with Apache Flink
Robust stream processing with Apache Flink
Aljoscha Krettek101 views
Cloud lunch and learn real-time streaming in azure by Timothy Spann
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
Timothy Spann663 views
CoC23_ Looking at the New Features of Apache NiFi by Timothy Spann
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFi
Timothy Spann36 views
CoC23_ Looking at the New Features of Apache NiFi by ssuser73434e
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFi
ssuser73434e4 views
Flink Community Update 2015 June by Márton Balassi
Flink Community Update 2015 JuneFlink Community Update 2015 June
Flink Community Update 2015 June
Márton Balassi475 views
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp by Timothy Spann
Using Apache NiFi with Apache Pulsar for Fast Data On-RampUsing Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp
Timothy Spann163 views
ApacheCon 2021: Apache NiFi 101- introduction and best practices by Timothy Spann
ApacheCon 2021:   Apache NiFi 101- introduction and best practicesApacheCon 2021:   Apache NiFi 101- introduction and best practices
ApacheCon 2021: Apache NiFi 101- introduction and best practices
Timothy Spann887 views
Using the flipn stack for edge ai (flink, nifi, pulsar) by Timothy Spann
Using the flipn stack for edge ai (flink, nifi, pulsar)Using the flipn stack for edge ai (flink, nifi, pulsar)
Using the flipn stack for edge ai (flink, nifi, pulsar)
Timothy Spann491 views
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ... by StreamNative
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) - Pulsar Summit Asia ...
StreamNative258 views

More from Timothy Spann

Building Real-Time Travel Alerts by
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel AlertsTimothy Spann
165 views48 slides
JConWorld_ Continuous SQL with Kafka and Flink by
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and FlinkTimothy Spann
156 views36 slides
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines by
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data PipelinesTimothy Spann
150 views25 slides
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo by
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoTimothy Spann
162 views8 slides
CoC23_ Let’s Monitor The Conditions at the Conference by
CoC23_ Let’s Monitor The Conditions at the ConferenceCoC23_ Let’s Monitor The Conditions at the Conference
CoC23_ Let’s Monitor The Conditions at the ConferenceTimothy Spann
17 views17 slides
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf by
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdfOSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdfTimothy Spann
23 views43 slides

More from Timothy Spann(20)

Building Real-Time Travel Alerts by Timothy Spann
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
Timothy Spann165 views
JConWorld_ Continuous SQL with Kafka and Flink by Timothy Spann
JConWorld_ Continuous SQL with Kafka and FlinkJConWorld_ Continuous SQL with Kafka and Flink
JConWorld_ Continuous SQL with Kafka and Flink
Timothy Spann156 views
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines by Timothy Spann
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
[EN]DSS23_tspann_Integrating LLM with Streaming Data Pipelines
Timothy Spann150 views
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo by Timothy Spann
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines DemoEvolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Evolve 2023 NYC - Integrating AI Into Realtime Data Pipelines Demo
Timothy Spann162 views
CoC23_ Let’s Monitor The Conditions at the Conference by Timothy Spann
CoC23_ Let’s Monitor The Conditions at the ConferenceCoC23_ Let’s Monitor The Conditions at the Conference
CoC23_ Let’s Monitor The Conditions at the Conference
Timothy Spann17 views
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf by Timothy Spann
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdfOSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
OSSFinance_UnlockingFinancialDatawithReal-TimePipelines.pdf
Timothy Spann23 views
CoC23_Utilizing Real-Time Transit Data for Travel Optimization by Timothy Spann
CoC23_Utilizing Real-Time Transit Data for Travel OptimizationCoC23_Utilizing Real-Time Transit Data for Travel Optimization
CoC23_Utilizing Real-Time Transit Data for Travel Optimization
Timothy Spann31 views
The Never Landing Stream with HTAP and Streaming by Timothy Spann
The Never Landing Stream with HTAP and StreamingThe Never Landing Stream with HTAP and Streaming
The Never Landing Stream with HTAP and Streaming
Timothy Spann254 views
Meetup - Brasil - Data In Motion - 2023 September 19 by Timothy Spann
Meetup - Brasil - Data In Motion - 2023 September 19Meetup - Brasil - Data In Motion - 2023 September 19
Meetup - Brasil - Data In Motion - 2023 September 19
Timothy Spann319 views
Implement a Universal Data Distribution Architecture to Manage All Streaming ... by Timothy Spann
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Timothy Spann28 views
Building Real-time Pipelines with FLaNK_ A Case Study with Transit Data by Timothy Spann
Building Real-time Pipelines with FLaNK_ A Case Study with Transit DataBuilding Real-time Pipelines with FLaNK_ A Case Study with Transit Data
Building Real-time Pipelines with FLaNK_ A Case Study with Transit Data
Timothy Spann193 views
big data fest building modern data streaming apps by Timothy Spann
big data fest building modern data streaming appsbig data fest building modern data streaming apps
big data fest building modern data streaming apps
Timothy Spann317 views
OSSNA Building Modern Data Streaming Apps by Timothy Spann
OSSNA Building Modern Data Streaming AppsOSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming Apps
Timothy Spann155 views
GSJUG: Mastering Data Streaming Pipelines 09May2023 by Timothy Spann
GSJUG: Mastering Data Streaming Pipelines 09May2023GSJUG: Mastering Data Streaming Pipelines 09May2023
GSJUG: Mastering Data Streaming Pipelines 09May2023
Timothy Spann255 views
BestInFlowCompetitionTutorials03May2023 by Timothy Spann
BestInFlowCompetitionTutorials03May2023BestInFlowCompetitionTutorials03May2023
BestInFlowCompetitionTutorials03May2023
Timothy Spann11 views
Cloudera Sandbox Event Guidelines For Workflow by Timothy Spann
Cloudera Sandbox Event Guidelines For WorkflowCloudera Sandbox Event Guidelines For Workflow
Cloudera Sandbox Event Guidelines For Workflow
Timothy Spann32 views
Meet the Committers Webinar_ Lab Preparation by Timothy Spann
Meet the Committers Webinar_ Lab PreparationMeet the Committers Webinar_ Lab Preparation
Meet the Committers Webinar_ Lab Preparation
Timothy Spann32 views
Best Practices For Workflow by Timothy Spann
Best Practices For WorkflowBest Practices For Workflow
Best Practices For Workflow
Timothy Spann89 views
Meetup: Streaming Data Pipeline Development by Timothy Spann
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline Development
Timothy Spann337 views

Recently uploaded

Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... by
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...ShapeBlue
199 views20 slides
"Surviving highload with Node.js", Andrii Shumada by
"Surviving highload with Node.js", Andrii Shumada "Surviving highload with Node.js", Andrii Shumada
"Surviving highload with Node.js", Andrii Shumada Fwdays
58 views29 slides
Future of AR - Facebook Presentation by
Future of AR - Facebook PresentationFuture of AR - Facebook Presentation
Future of AR - Facebook PresentationRob McCarty
65 views27 slides
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ... by
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...ShapeBlue
171 views28 slides
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue by
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlueVNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlueShapeBlue
207 views54 slides
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ... by
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...Jasper Oosterveld
35 views49 slides

Recently uploaded(20)

Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... by ShapeBlue
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
ShapeBlue199 views
"Surviving highload with Node.js", Andrii Shumada by Fwdays
"Surviving highload with Node.js", Andrii Shumada "Surviving highload with Node.js", Andrii Shumada
"Surviving highload with Node.js", Andrii Shumada
Fwdays58 views
Future of AR - Facebook Presentation by Rob McCarty
Future of AR - Facebook PresentationFuture of AR - Facebook Presentation
Future of AR - Facebook Presentation
Rob McCarty65 views
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ... by ShapeBlue
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
How to Re-use Old Hardware with CloudStack. Saving Money and the Environment ...
ShapeBlue171 views
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue by ShapeBlue
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlueVNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue
VNF Integration and Support in CloudStack - Wei Zhou - ShapeBlue
ShapeBlue207 views
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ... by Jasper Oosterveld
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
ESPC 2023 - Protect and Govern your Sensitive Data with Microsoft Purview in ...
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue by ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlueElevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
ShapeBlue224 views
Business Analyst Series 2023 - Week 4 Session 8 by DianaGray10
Business Analyst Series 2023 -  Week 4 Session 8Business Analyst Series 2023 -  Week 4 Session 8
Business Analyst Series 2023 - Week 4 Session 8
DianaGray10145 views
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue by ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlueCloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
CloudStack Object Storage - An Introduction - Vladimir Petrov - ShapeBlue
ShapeBlue139 views
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And... by ShapeBlue
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
Enabling DPU Hardware Accelerators in XCP-ng Cloud Platform Environment - And...
ShapeBlue108 views
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti... by ShapeBlue
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
ShapeBlue141 views
Optimizing Communication to Optimize Human Behavior - LCBM by Yaman Kumar
Optimizing Communication to Optimize Human Behavior - LCBMOptimizing Communication to Optimize Human Behavior - LCBM
Optimizing Communication to Optimize Human Behavior - LCBM
Yaman Kumar38 views
Don’t Make A Human Do A Robot’s Job! : 6 Reasons Why AI Will Save Us & Not De... by Moses Kemibaro
Don’t Make A Human Do A Robot’s Job! : 6 Reasons Why AI Will Save Us & Not De...Don’t Make A Human Do A Robot’s Job! : 6 Reasons Why AI Will Save Us & Not De...
Don’t Make A Human Do A Robot’s Job! : 6 Reasons Why AI Will Save Us & Not De...
Moses Kemibaro35 views
The Role of Patterns in the Era of Large Language Models by Yunyao Li
The Role of Patterns in the Era of Large Language ModelsThe Role of Patterns in the Era of Large Language Models
The Role of Patterns in the Era of Large Language Models
Yunyao Li91 views
Initiating and Advancing Your Strategic GIS Governance Strategy by Safe Software
Initiating and Advancing Your Strategic GIS Governance StrategyInitiating and Advancing Your Strategic GIS Governance Strategy
Initiating and Advancing Your Strategic GIS Governance Strategy
Safe Software184 views
"Running students' code in isolation. The hard way", Yurii Holiuk by Fwdays
"Running students' code in isolation. The hard way", Yurii Holiuk "Running students' code in isolation. The hard way", Yurii Holiuk
"Running students' code in isolation. The hard way", Yurii Holiuk
Fwdays36 views
"Node.js Development in 2024: trends and tools", Nikita Galkin by Fwdays
"Node.js Development in 2024: trends and tools", Nikita Galkin "Node.js Development in 2024: trends and tools", Nikita Galkin
"Node.js Development in 2024: trends and tools", Nikita Galkin
Fwdays33 views
NTGapps NTG LowCode Platform by Mustafa Kuğu
NTGapps NTG LowCode Platform NTGapps NTG LowCode Platform
NTGapps NTG LowCode Platform
Mustafa Kuğu437 views
The Power of Heat Decarbonisation Plans in the Built Environment by IES VE
The Power of Heat Decarbonisation Plans in the Built EnvironmentThe Power of Heat Decarbonisation Plans in the Built Environment
The Power of Heat Decarbonisation Plans in the Built Environment
IES VE84 views

Real time stock processing with apache nifi, apache flink and apache kafka

  • 1. Real-Time Stock Processing With Apache NiFi, Apache Flink and Apache Kafka Timothy Spann, Principal DataFlow Field Engineer @ Pierre Villard, Senior Product Manager @
  • 2. Who? Tim Spann @PaasDev // Blog: www.datainmotion.dev Principal DataFlow Field Engineer. Princeton Future of Data Meetup. https://github.com/tspannhw/EverythingApacheNiFi https://community.cloudera.com/t5/Community-Articles/Real-Time-Stock-Processing-With-A pache-NiFi-and-Apache-Kafka/ta-p/249221 Pierre Villard Twitter & Github - @pvillard31 // Blog: www.pierrevillard.com Committer and PMC member for Apache NiFi (in the community since 2015) Senior Product Manager at Cloudera for products around Apache NiFi, NiFi Registry and MiNiFi Previously at Google & Hortonworks
  • 3. What? This talk is about ingesting real-time data from many sources and build a dashboard on top it to track in real time what our stocks are. This use case is a good example to show the combination of some of the best Apache solutions for streaming applications.
  • 4. NiFi, Kafka and Flink in a few numbers - Apache NiFi (version 1.12.x) - created and open sourced by the NSA - initial release in 2006 350+ contributors, 1200+ people in Slack, 3.1M+ docker pulls Many sub-projects: NiFi, MiNiFi Java, MiNiFi C++, NiFi Registry, etc - Apache Kafka (version 2.6.x) - created and open sourced by LinkedIn - initial release in 2011 700+ contributors - Apache Flink (version 1.11.x) - initial release in 2011 750+ contributors, 2nd top repository by number of commits, top active project on mailing lists
  • 5. What is NiFi used for?
  • 6. Analyze Streaming OLAP Analytics & Time Series Store Powered by Druid & Kudu Buffer Apache Kafka Topics Ingest Gateway Powered by Kafka Distribute Apache NiFi Data Flow Apps Powered by NiFi Buffer Apache Kafka Syndicate topics Syndicate Services Powered by Kafka Collect Syndicate topics Syndicate Services Powered by Kafka Replication / Data Deployment Analyze Streaming Analytics Apps Stream Processing Powered by Flink Streaming Reference Architecture Data Collection at the Edge Apache NiFi / MiNiFi - sensors, IoT - databases - file systems - app sidecar - live streams - MQ - logs - network Anything… you name it!
  • 7. Where? CDP services are optimized for the elastic compute & ‘always-on’ storage services provided by any cloud provider Web service hosted and managed by Cloudera Hosted in the your cloud environment, but managed by the CDP Management Console Shared Data Experience (SDX) technologies form a secure and governed data lake backed by object storage (S3, ADLS, GCS) Flow Management Streams Messaging Streaming Analytics
  • 8. How? This use case architecture Stock Data Logs Errors Aggregates Other data SQL Analytics
  • 9. Demonstration Let’s see all of this in action…
  • 10. Thanks! Questions? Timothy Spann, Principal DataFlow Field Engineer @ Pierre Villard, Senior Product Manager @