Live Demo Jam Expands: The Leading-Edge Streaming Data Platform with NiFi, Kafka, and Flink
Live demo by Tim Spann, Principal DataFlow Field Engineer
© 2021 Cloudera, Inc. All rights reserved. 2
Housekeeping items for today
• Ask your questions via the Q&A widget
• Select “YES” to the exit survey to be contacted by Cloudera
• Check out additional resources to learn more
Who?
Timothy Spann
Twitter - @PaasDev // Blog: www.datainmotion.dev
Frequent speaker at major conferences and events.
Principal DataFlow Field Engineer for streaming around Apache NiFi, NiFi Registry, MiNiFi, Kafka, Kafka Connect, Kafka Streams, Flink, Flink SQL, SMM, SRM, SR and EFM.
Previously at E&Y, HPE, Pivotal & Hortonworks
Cloudera Communities
Got questions? Leverage community.cloudera.com
Join our meetup:
www.meetup/pro/futureofdata
Cloudera Data Platform
A hybrid / multi-cloud data platform and an integrated suite of secure analytic apps
• Data sources: real-time, batch, structured, unstructured
• Data users: analysts, engineers, scientists, developers
• Data Lifecycle integration for better user productivity and faster time to value
• Hybrid & Multi-Cloud to leverage existing investments and reduce risk
• Secure & Governed to simplify data protection, sharing and compliance
• Open & Extensible to support more use cases faster and at lower cost
• Services: Data Visualization, Machine Learning, Data Warehouse, Virtual Data Lake, Data Engineering, Data Collection, Streaming Analytics, Operational Database
Where? Cloudera Public Cloud
• CDP services are optimized for the elastic compute & ‘always-on’ storage services provided by any cloud provider
• Web service hosted and managed by Cloudera
• Hosted in your cloud environment, but managed by the CDP Management Console
• Shared Data Experience (SDX) technologies form a secure and governed data lake backed by object storage (S3, ADLS, GCS)
• Services: Flow Management, Streams Messaging, Streaming Analytics
• Cloud providers: AWS, Azure, Google
What use cases would you like to see for the next demo jam?
• IoT Edge to Cloud: 47%
• Detect Event Streams: 17%
• Optimize Data Streams: 36%
IoT Edge to Cloud
• Kafka Topics: Voltage, Temp, Energy Usage
• Streaming Analytics: Alerting Conditions
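As a rough illustration of the "alerting conditions" step on this slide, the sketch below checks voltage, temperature and energy usage readings against fixed thresholds. It is plain Python standing in for the streaming analytics layer, not the demo's actual implementation. The `Reading` fields, their units, and the `MAX_TEMP` and `MIN_VOLTAGE` thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Reading:
    """One sensor record, as it might arrive from a Kafka topic (hypothetical schema)."""
    sensor_id: str
    voltage: float       # volts (assumed unit)
    temp: float          # degrees Celsius (assumed unit)
    energy_usage: float  # watts (assumed unit)


# Hypothetical thresholds; real values depend on the devices being monitored.
MAX_TEMP = 80.0
MIN_VOLTAGE = 110.0


def alerts(readings: Iterable[Reading]) -> Iterator[str]:
    """Yield one alert message for each condition a reading violates."""
    for r in readings:
        if r.temp > MAX_TEMP:
            yield f"{r.sensor_id}: temp {r.temp} above {MAX_TEMP}"
        if r.voltage < MIN_VOLTAGE:
            yield f"{r.sensor_id}: voltage {r.voltage} below {MIN_VOLTAGE}"


if __name__ == "__main__":
    sample = [
        Reading("sensor-a", voltage=120.0, temp=70.0, energy_usage=5.0),
        Reading("sensor-b", voltage=100.0, temp=90.0, energy_usage=5.0),
    ]
    for message in alerts(sample):
        print(message)
```

In a real pipeline the readings would be consumed from the Kafka topics and the conditions would more likely be expressed as a continuous SQL query; the generator here only shows the shape of the check.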
DISTRIBUTE DATA COLLECTION AND PROCESSING ACROSS PLATFORMS
Cloudera Edge Management and Cloudera Flow Management
• Diagram labels: IoT Edge, IoT Application, Energy Logs, Kafka Topics (Voltage, Temp), SQL Stream Builder, CDP
Question #1
- What is the most difficult part of an Edge Flow?
- Gateway Agent
- Edge Data Collection
- Processing Data
Question #2
- What do you want to see next?
- Deliver the data to a data store
- Deliver to Kafka topic
Question #3
- Do you want to see real-time decision making on streams?
- Yes
- No
Question #4
- What data store should I send my data to?
- Real-time Data Mart (such as Apache Impala and Apache Kudu)
- Object Storage (such as HDFS on S3, ADLSv2, GCS, Apache Ozone)
- Operational Database (such as Apache HBase)
- In Stream Only (Apache Kafka with Data Window in SQL Stream Builder)
- None
- All
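The "In Stream Only" option refers to a data window over the stream. As a minimal sketch of what such a window computes, the Python below groups timestamped values into fixed, non-overlapping (tumbling) windows and averages each one. It only illustrates the idea and is not SQL Stream Builder code. The function name `tumbling_window_avg`, the `(timestamp_seconds, value)` event shape, and the 60-second window are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_avg(
    events: Iterable[Tuple[int, float]], window_seconds: int
) -> Dict[int, float]:
    """Average values per tumbling window.

    Each event is (timestamp_seconds, value). The returned dict maps each
    window's start timestamp to the mean of the values that fell inside it.
    """
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for ts, value in events:
        # Align the timestamp down to the start of its window.
        start = ts - (ts % window_seconds)
        sums[start] += value
        counts[start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


if __name__ == "__main__":
    # Hypothetical temperature readings: (seconds since start, value).
    readings = [(0, 10.0), (30, 20.0), (60, 30.0), (119, 50.0)]
    print(tumbling_window_avg(readings, window_seconds=60))
```

A streaming engine keeps this state incrementally and emits each window once it closes, whereas this sketch processes a finished batch; the grouping logic is the same.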
Q&A
THANK YOU
