Cloudera Streaming with Flink, Oct 29, 2020, Meetup London

Timothy Spann, Developer Advocate
Future of Data: London presents:
Cloudera Streaming with Flink
Timothy Spann - Principal DataFlow Field Engineer
© 2020 Cloudera, Inc. All rights reserved. 2
More Information
Cloudera’s Next Presentation:
Validating a Jet Engine Predictive Model in a Cloud Environment
Wednesday, November 18, 2020
11:00 AM to 12:30 PM CST
https://www.meetup.com/futureofdata-austin/events/273929240/
Welcome to Future of Data - Virtual
@PaasDev
https://www.meetup.com/futureofdata-princeton/
From Big Data to AI to Streaming to Containers to Cloud to Analytics to Cloud Storage to Fast Data to Machine Learning to Microservices to ...
Today’s Lead
Who am I?
Data in Motion Field Engineer
@PaasDev
DZone Zone Leader and Big Data MVB
Princeton NJ Future of Data Meetup
ex-Pivotal Field Engineer
https://github.com/tspannhw https://www.datainmotion.dev/
Speaking at Flink Forward, AI Dev World, Nethope and OSS.
Yes, Franz, It’s Kafka
Let’s do a metamorphosis on your data. Don’t fear changing data.
You don’t need to be a brilliant writer to stream data.
"Franz Kafka was a German-speaking Bohemian novelist and short-story writer, widely regarded as one of the major figures of 20th-century literature. His work fuses elements of realism and the fantastic." (Wikipedia)
Apache Kafka
• Highly reliable distributed messaging system
• Decouples applications and enables many-to-many patterns
• Publish-subscribe semantics
• Horizontal scalability
• Efficient implementation that operates at speed with big data volumes
• Organized by topic to support several use cases
[Diagram: three Source Systems feeding Fraud Detection, Security Systems, and Real-Time Monitoring, shown two ways: through Kafka (many-to-many, publish-subscribe) and wired directly (point-to-point, request-response).]
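To make the publish-subscribe bullets concrete, here is a minimal in-memory sketch of topic-based fan-out. It is not Kafka and does not use a Kafka client: the `InMemoryBroker` class, the topic names, and the event fields are all hypothetical, and delivery here is a synchronous push, whereas Kafka persists events to a log that consumers read at their own pace.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class InMemoryBroker:
    """Toy topic-based broker (hypothetical; illustrates fan-out only)."""

    def __init__(self) -> None:
        # topic name -> handlers subscribed to that topic
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> int:
        """Deliver the event to every subscriber of the topic; return the count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


broker = InMemoryBroker()
fraud_events: List[Event] = []
monitoring_events: List[Event] = []

# Two independent consumers subscribe to the same topic.
broker.subscribe("payments", fraud_events.append)
broker.subscribe("payments", monitoring_events.append)

# The producer only knows the topic name, not who consumes it.
delivered = broker.publish("payments", {"account": "a-1", "amount": 42})
unrouted = broker.publish("weather", {"station": "nyc"})
```

The producer never references a consumer, which is the decoupling the slide describes: adding a third consumer means one more `subscribe` call and no change to the source system.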
Weather Streaming Pipeline
[Pipeline diagram. Labels: Sources, Weather, Sensors, Pollution, Climate, Aggregates, SQL, Analytics.]
Weather Streaming Pipeline
https://dzone.com/articles/lets-build-a-simple-ingest-to-cloud-data-warehouse
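The pipeline diagram names an "Aggregates" stage without showing its logic. As one plausible reading, the sketch below averages temperature readings per station over fixed (tumbling) time windows. The function name, the `(station, timestamp, temperature)` tuple layout, and the 60-second window are assumptions made for illustration; the talk's actual pipeline would do this in a streaming engine such as Flink rather than in plain Python.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Reading = Tuple[str, int, float]  # (station, epoch seconds, temperature): assumed layout


def tumbling_average(
    readings: Iterable[Reading], window_seconds: int = 60
) -> Dict[Tuple[str, int], float]:
    """Average temperature per (station, window start) over fixed-size windows."""
    totals: Dict[Tuple[str, int], List[float]] = defaultdict(lambda: [0.0, 0.0])
    for station, timestamp, temperature in readings:
        # Align each reading to the start of the window that contains it.
        window_start = timestamp - (timestamp % window_seconds)
        bucket = totals[(station, window_start)]
        bucket[0] += temperature
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


sample = [
    ("nyc", 0, 10.0),
    ("nyc", 30, 20.0),
    ("nyc", 60, 30.0),  # falls into the next window
    ("lon", 59, 5.0),
]
averages = tumbling_average(sample, window_seconds=60)
```

This batch version sees all readings at once; a streaming engine computes the same result incrementally and must also decide when a window is complete, which this sketch leaves out.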
Simplifying the User Experience
DATA-IN-MOTION COMPONENTS IN CONTEXT
SDX ENABLES GOVERNANCE FOR THE ENTIRE DATA LIFECYCLE
[Diagram. Labels: CDF, CDE, CML, Data Engineering Transformations, Hive Table, Fraud ML Project, Fraud Model Training, NiFi Processors, Fraud Model Running in Prod.]
End-to-End Streaming with Cloudera DataFlow
CLOUDERA FLOW AND EDGE MANAGEMENT
Enable easy ingestion, routing, management, and delivery of any data anywhere (edge, cloud, data center) to any downstream system, with built-in end-to-end security and provenance.
Advanced tooling to industrialize flow development (Flow Development Life Cycle)
ACQUIRE
• Over 300 Prebuilt Processors
• Easy to build your own
• Parse, Enrich & Apply Schema
• Filter, Split, Merge & Route
• Throttle & Backpressure
FTP, SFTP, HL7, UDP, XML, HTTP, EMAIL, HTML, IMAGE, SYSLOG
PROCESS
HASH, MERGE, EXTRACT, DUPLICATE, SPLIT, ENCRYPT, TAIL, EVALUATE, EXECUTE, GEOENRICH, SCAN, REPLACE, TRANSLATE, CONVERT, ROUTE TEXT, ROUTE CONTENT, ROUTE CONTEXT, ROUTE RATE, DISTRIBUTE LOAD
DELIVER
• Guaranteed Delivery
• Full data provenance from acquisition to delivery
• Diverse, Non-Traditional Sources
• Eco-system integration
FTP, SFTP, HL7, UDP, XML, HTTP, EMAIL, HTML, IMAGE, SYSLOG
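The "Throttle & Backpressure" bullet can be illustrated with a bounded queue between two processing steps: once the queue is full, the upstream step is told to stop sending instead of the queue growing without limit. This is a conceptual sketch only, not NiFi's implementation; the `BoundedConnection` class and its `offer`/`poll` methods are hypothetical names, and the limit of two events is chosen just to keep the example short.

```python
import queue
from typing import Optional


class BoundedConnection:
    """Hypothetical bounded link between two steps; rejects events when full."""

    def __init__(self, max_events: int) -> None:
        self._queue: "queue.Queue[object]" = queue.Queue(maxsize=max_events)

    def offer(self, event: object) -> bool:
        """Try to enqueue; False signals backpressure to the upstream step."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False

    def poll(self) -> Optional[object]:
        """Take the oldest event, or None if the connection is empty."""
        try:
            return self._queue.get_nowait()
        except queue.Empty:
            return None


connection = BoundedConnection(max_events=2)

# The third offer is rejected because the downstream step has not caught up.
accepted = [connection.offer(n) for n in range(3)]

# Once the downstream step drains one event, the upstream step can resume.
first = connection.poll()
resumed = connection.offer(99)
```

Returning `False` rather than blocking leaves the retry policy to the caller, which could wait, drop, or reroute the event.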
Processing one billion events per second with NiFi
https://blog.cloudera.com/benchmarking-nifi-performance-and-scalability/
Next Generation Streaming Analytics
New features delivered with Cloudera Streaming Analytics (CSA) 1.2
• Flink SQL Support: agile streaming app development using SQL
• New Flink Atlas Hook: capture operational Flink app metadata and lineage
• Single View of Flink YARN Jobs: improve developer experience and operational visibility
REFERENCES
● https://www.cloudera.com/tutorials/building-a-sentiment-analysis-application/3.html
● https://blog.cloudera.com/integrating-machine-learning-models-into-your-big-data-pipelines-in-real-time-with-no-coding/
● https://community.cloudera.com/t5/Community-Articles/Processing-Real-Time-Social-Media-Twitter-with-Apache-NiFi-1/ta-p/248354
● https://www.datainmotion.dev/2020/09/using-djlai-for-deep-learning-based.html
● https://github.com/tspannhw/airline-sentiment-streaming
● https://github.com/tspannhw/nifi-corenlp-processor
● https://github.com/tspannhw/nifi-cdsw
● https://github.com/tspannhw/nifi-djlsentimentanalysis-processor
● https://www.datainmotion.dev/2020/04/harnessing-data-lifecycle-for-customer.html
THANK YOU
Cloudera streaming with flink oct 29, 2020 meetup london

  • 1. Future of Data: London presents: Cloudera Streaming with Flink Timothy Spann - Principal DataFlow Field Engineer
  • 2. © 2020 Cloudera, Inc. All rights reserved. 2 More Information Cloudera’s Next Presentation: Validating a Jet Engine Predictive Model in a Cloud Environment Wednesday, November 18, 2020 11:00 AM to 12:30 PM CST https://www.meetup.com/futureofdata-austin/events/273929240/
  • 4. Welcome to Future of Data - Virtual. @PaasDev https://www.meetup.com/futureofdata-princeton/ From Big Data to AI to Streaming to Containers to Cloud to Analytics to Cloud Storage to Fast Data to Machine Learning to Microservices to ...
  • 5. Today’s Lead. Who am I? Data in Motion Field Engineer (@PaasDev); DZone Zone Leader and Big Data MVB; Princeton NJ Future of Data Meetup; ex-Pivotal Field Engineer. https://github.com/tspannhw https://www.datainmotion.dev/ Speaking at Flink Forward, AI Dev World, Nethope and OSS.
  • 6. Yes, Franz, It’s Kafka. Let’s do a metamorphosis on your data. Don’t fear changing data. You don’t need to be a brilliant writer to stream data. "Franz Kafka was a German-speaking Bohemian novelist and short-story writer, widely regarded as one of the major figures of 20th-century literature. His work fuses elements of realism and the fantastic." (Source: Wikipedia)
  • 7. Apache Kafka
      • Highly reliable distributed messaging system
      • Decouples applications and enables many-to-many patterns
      • Publish-subscribe semantics
      • Horizontal scalability
      • Efficient implementation that operates at speed with big data volumes
      • Organized by topic to support several use cases
      Diagram: many-to-many publish-subscribe (source systems → Kafka → Fraud Detection, Security Systems, Real-Time Monitoring) compared with point-to-point request-response (source systems connected directly to Fraud Detection, Security Systems, Real-Time Monitoring).
  • 8. Weather Streaming Pipeline (diagram labels: Sources, Sensors, Weather, Climate, Pollution, Aggregates, SQL Analytics)
  • 9. Weather Streaming Pipeline
  • 10. Weather Streaming Pipeline
  • 12. Simplifying the User Experience
  • 13. DATA-IN-MOTION COMPONENTS IN CONTEXT
  • 14. SDX ENABLES GOVERNANCE FOR THE ENTIRE DATA LIFECYCLE (diagram labels: CDF, CDE, CML; NiFi Processors; Data Engineering Transformations; Hive Table; Fraud ML Project; Fraud Model Training; Fraud Model Running in Prod)
  • 15. End-to-End Streaming with Cloudera DataFlow
  • 16. CLOUDERA FLOW AND EDGE MANAGEMENT
      Enable easy ingestion, routing, management and delivery of any data anywhere (edge, cloud, data center) to any downstream system, with built-in end-to-end security and provenance. Advanced tooling to industrialize flow development (Flow Development Life Cycle).
      ACQUIRE
      • Over 300 prebuilt processors
      • Easy to build your own
      • Parse, enrich & apply schema
      • Filter, split, merge & route
      • Throttle & backpressure
      FTP, SFTP, HL7, UDP, XML, HTTP, EMAIL, HTML, IMAGE, SYSLOG
      PROCESS
      HASH, MERGE, EXTRACT, DUPLICATE, SPLIT, ENCRYPT, TAIL, EVALUATE, EXECUTE, GEOENRICH, SCAN, REPLACE, TRANSLATE, CONVERT, ROUTE TEXT, ROUTE CONTENT, ROUTE CONTEXT, ROUTE RATE, DISTRIBUTE LOAD
      DELIVER
      • Guaranteed delivery
      • Full data provenance from acquisition to delivery
      • Diverse, non-traditional sources
      • Ecosystem integration
      FTP, SFTP, HL7, UDP, XML, HTTP, EMAIL, HTML, IMAGE, SYSLOG
  • 17. Processing one billion events per second with NiFi: https://blog.cloudera.com/benchmarking-nifi-performance-and-scalability/
  • 18. Next Generation Streaming Analytics: new features delivered with Cloudera Streaming Analytics (CSA) 1.2
      • Flink SQL Support: agile streaming app development using SQL
      • New Flink Atlas Hook: capture operational Flink app metadata and lineage
      • Single View of Flink YARN Jobs: improve developer experience and operational visibility
  • 19. REFERENCES
      ● https://www.cloudera.com/tutorials/building-a-sentiment-analysis-application/3.html
      ● https://blog.cloudera.com/integrating-machine-learning-models-into-your-big-data-pipelines-in-real-time-with-no-coding/
      ● https://community.cloudera.com/t5/Community-Articles/Processing-Real-Time-Social-Media-Twitter-with-Apache-NiFi-1/ta-p/248354
      ● https://www.datainmotion.dev/2020/09/using-djlai-for-deep-learning-based.html
      ● https://github.com/tspannhw/airline-sentiment-streaming
      ● https://github.com/tspannhw/nifi-corenlp-processor
      ● https://github.com/tspannhw/nifi-cdsw
      ● https://github.com/tspannhw/nifi-djlsentimentanalysis-processor
      ● https://www.datainmotion.dev/2020/04/harnessing-data-lifecycle-for-customer.html
  • 20. THANK YOU
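
The publish-subscribe point on slide 7 (topics decouple source systems from consumers, so any number of sources can feed any number of consumers) can be illustrated with a small in-memory sketch. This is not the Kafka client API: `ToyBroker`, the `events` topic, and the source and consumer names are hypothetical, chosen only to mirror the labels in the slide's diagram. It shows fan-out only and leaves out everything that makes Kafka durable and scalable (partitions, offsets, replication).

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class ToyBroker:
    """Hypothetical in-memory broker: topic name -> list of subscriber callbacks."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # A consumer registers interest in a topic, not in a specific producer.
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> int:
        # A producer writes to a topic without knowing who is listening.
        handlers = self._handlers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)  # number of subscribers that received the event


broker = ToyBroker()

# Three independent consumers, named after the boxes in the slide's diagram.
received: Dict[str, List[Event]] = {
    "fraud_detection": [],
    "security_systems": [],
    "real_time_monitoring": [],
}
for inbox in received.values():
    broker.subscribe("events", inbox.append)

# Two source systems publish to the same topic (many-to-many).
delivered = [
    broker.publish("events", {"source": "source_system_a", "value": 1}),
    broker.publish("events", {"source": "source_system_b", "value": 2}),
]

# Publishing to a topic nobody subscribed to reaches no consumer.
unrouted = broker.publish("unused_topic", {"source": "source_system_a"})
```

Adding a fourth consumer here means one more `subscribe` call and no change to either source system, which is the decoupling the slide contrasts with point-to-point request-response wiring.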