© Cloudera, Inc. All rights reserved.
INTRODUCING
CLOUDERA DATAFLOW (CDF)
Dinesh Chandrasekhar
Product Marketing Lead, Data-in-Motion BU
Cloudera
@AppInt4All
George Vetticaden
Product Management Lead, Data-in-Motion BU
Cloudera
@gvetticaden
MARKET OPPORTUNITIES
• Cloud: ~$410 B
• Streaming: ~$1.65 B
• Data Science: ~$180 B
• Big Data: ~$210 B
• IoT: ~$1.2 T
IOT MARKET
• By 2024, more than 24.9 billion IoT connections will be established
• An estimated $70 billion will be spent by global manufacturers on IoT solutions in 2020
• An estimated 646 million healthcare devices (excluding fitness trackers and wearable devices) will be connected by 2020
• An estimated 78% of cars shipped globally will be built with hardware that connects to the internet by 2020
• 50% of decision-makers in IT, services, utilities, and manufacturing have either deployed IoT, or will deploy it in the next 12-24 months
KEY CUSTOMER CHALLENGES
Visibility: Lack of visibility into end-to-end streaming data flows; inability to troubleshoot bottlenecks, consumption patterns, etc.
Data Ingestion: High-volume streaming sources, multiple message formats, diverse protocols, and multi-vendor devices create data ingestion challenges
Real-time Insights: Analyzing the continuous, rapid inflow (velocity) of streaming data at high volumes creates major challenges for gaining real-time insights
CLOUDERA DATAFLOW
WHAT IS CLOUDERA DATAFLOW (CDF)?
Cloudera DataFlow (CDF) is a scalable, real-time
streaming data platform that collects, curates, and
analyzes data so customers gain key insights for
immediate actionable intelligence.
HISTORY OF CDF
• Mid-2000s: NiFi was developed and used at NSA
• 2015: Onyara is acquired; HDF is born
• 2018: Strong streaming platform, with support for Kafka 2.0 and the introduction of SMM
• 2019: Cloudera merger
• Tomorrow: Edge-to-AI. Bring this to the edge with connected platforms
Data-in-Motion:
• Comprehensive real-time streaming data platform
• Manage data-in-motion from edge-to-enterprise
• Power IoT-scale streaming architectures
Enable next generation Modern Data Architecture
Enable Edge Intelligence
COMMON USE CASES
Data Movement
Optimize resource utilization by moving data between data centers or between on-premises infrastructure and cloud infrastructure
Optimize Log Collection & Analysis
Optimize log analytics solutions by using CDF as a single platform to collect and deliver multiple data sources
Gain key insights with Streaming Analytics
Accelerate big data ROI by analyzing streaming data for patterns, comparing with ML models and delivering actionable intelligence
Single view / 360° view of customer
Ingest, transform and combine customer data from multiple sources into a single data view / lake
Stream Processing
Combine multiple streams of data in real-time, enrich the data and route it to different end points based on rules
Capture IoT Data
Ingest sensor data from IoT devices and stream it for further processing and comprehensive analysis
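The Stream Processing use case (combine streams, enrich, route by rules) can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how CDF or NiFi implements it: the event fields (`ts`, `device`, `temp`), the lookup table, and the rule names `alerts` and `archive` are all assumptions made up for the example.

```python
import heapq

def merge_streams(*streams):
    """Combine several time-ordered streams into one, ordered by the 'ts' field."""
    return heapq.merge(*streams, key=lambda event: event["ts"])

def enrich(event, device_regions):
    """Return a copy of the event with a region attached from reference data."""
    return {**event, "region": device_regions.get(event["device"], "unknown")}

# Hypothetical routing rules: the first matching predicate wins.
RULES = [
    ("alerts", lambda event: event["temp"] >= 90),
    ("archive", lambda event: True),  # catch-all end point
]

def route(event, rules):
    """Pick the end point name for an event."""
    for endpoint, predicate in rules:
        if predicate(event):
            return endpoint
    raise ValueError("no rule matched")

def process(streams, device_regions, rules):
    """Merge, enrich, and route every event; return events grouped by end point."""
    endpoints = {name: [] for name, _ in rules}
    for event in merge_streams(*streams):
        enriched = enrich(event, device_regions)
        endpoints[route(enriched, rules)].append(enriched)
    return endpoints
```

Calling `process([stream_a, stream_b], {"d1": "west"}, RULES)` returns a dictionary with one list of enriched events per end point. A catch-all rule at the end keeps every event routable.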
COMMON IOT USE CASES BY INDUSTRY
Industries: Public Sector, Transportation, Utilities, Healthcare, Manufacturing, Retail
Top 5 Use cases: Fleet Management, Connected Cars, Smart Cities, Predictive Analytics, Inventory/Material Tracking, Utility Monitoring, Predictive Maintenance, Patient Monitoring, Usage-based Insurance, Asset Tracking/Monitoring, Edge Data Collection
• IoT is a $1.13T market opportunity in 2021.
• Americas - $329B IoT spending. Manufacturing and Transportation are top industries, accounting for 26% of total spending.
• APAC - $500B IoT spending. Manufacturing, Utilities and Transportation are top industries.
• EMEA - $264B IoT spending. Manufacturing is top industry, powered by Industry 4.0 initiatives.
• Worldwide IoT Analytics and Information Management Market = $573M
CUSTOMERS
Improving Healthcare with SMART data
Combine multi-format data streams, with hundreds of sources, into one platform
• Needed a platform that could combine multi-format data streaming
• Data scarcity & latency problems
• Machine learning & data science
Cloud-based systems architected to deliver SMART data, using HDP and CDF
• First to deliver SMART real-time streaming data
• Clearsense’s Inception™ product enables fast decisions for clinicians
• Customers have access to all data sources with HDP & CDF
Mission-critical data and relevant insight for 2,000 rural providers
• Mission critical data is now available for doctors to make critical decisions
• Cost efficiencies led to access for 2,000 rural providers
• Real-time data helps prevent “Code Blue”
Lack of medical expertise around patient care, post surgery
• Patient Code Blue status
• Possible cardiac arrest 4–6 hours post surgery
Photo by rawpixel on Unsplash
Slide panels: CHALLENGE, SOLUTION, RESULT, IMPACT
Positioning technology products & services empower companies worldwide
Provide accurate data for small carriers to improve business results
• 95% of small carriers (fewer than 50 trucks) have a deficit of data available
• Estimated data, price points and revenue base opportunity for controlling fuel cost
• Understanding of freight and lane movement
Big Data in the Cloud with HDP, CDF, and Microsoft Azure
• Leveraging big data powering Blockchain, with machine learning, to revolutionize Transportation and Logistics industries
• Analyzed fuel data; can consolidate data set for small carriers to generate community data lake
Double digit revenue increase, year over year
• Managing for 4 million trucks daily
• $31 billion in freight movement guides customers to profitability
• Blockchain driven architecture
Continuing on current path would slow organizational growth and impact customers
• Being unable to predict weather patterns would lead to delays and decreased product quality
• Operational inefficiencies prevent reaching business revenue goals, lack of insights
• Loss of product during transportation
Photo by rawpixel.com on Unsplash
Slide panels: CHALLENGE, SOLUTION, RESULT, IMPACT
PRODUCT OVERVIEW
CLOUDERA DATAFLOW Data-in-motion platform
EDGE DATA MANAGEMENT
• Edge data collection powered by Apache MiNiFi
• MiNiFi – smaller footprint than NiFi
• Guaranteed delivery
• Data buffering
• Prioritized queuing
• Flow-specific QoS
• Data provenance
• Designed for extension
• C++ / Java agents
• Designed for IoT
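Two of the bullets above, data buffering and prioritized queuing, can be illustrated with a small bounded buffer. This is a hedged sketch in plain Python and not MiNiFi's implementation: the `PriorityBuffer` class, its eviction policy (drop the lowest-priority record when full), and the integer priorities are assumptions chosen for the example.

```python
import heapq
import itertools

class PriorityBuffer:
    """Bounded buffer that drains high-priority records first.

    Hypothetical policy: when the buffer is full, the lowest-priority record
    is evicted to make room for a higher-priority arrival.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []                 # entries: (-priority, arrival_seq, payload)
        self._seq = itertools.count()   # unique tie-breaker, keeps FIFO order per priority

    def offer(self, payload, priority):
        """Buffer a record; return False if it was dropped instead."""
        entry = (-priority, next(self._seq), payload)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
            return True
        worst = max(self._heap)         # lowest priority, most recent arrival
        if entry < worst:               # the new record outranks the worst one
            self._heap.remove(worst)
            heapq.heapify(self._heap)
            heapq.heappush(self._heap, entry)
            return True
        return False

    def drain(self):
        """Yield buffered payloads, highest priority first."""
        while self._heap:
            yield heapq.heappop(self._heap)[2]
```

An edge agent with limited memory could call `offer` while the uplink is down and `drain` once it returns. Eviction here is O(n), which is acceptable for a sketch but would need a different structure at scale.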
FLOW MANAGEMENT
• Web-based user interface
• Highly configurable
• Out-of-the-box data provenance
• Designed for extensibility
• Secure
• NiFi Registry
• DevOps support
• FDLC (flow development life cycle)
• Versioning
• Deployment
280+ PROCESSORS FOR DEEPER ECOSYSTEM INTEGRATION
Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, Generate Table Fetch, Jolt Transform JSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute, Fetch
HTTP, Syslog, Email, HTML, Image, HL7, FTP, UDP, XML, SFTP, AMQP, WebSocket
All Apache project logos are trademarks of the ASF and the respective projects.
Streaming Analytics Reference Architecture
Kafka is Everywhere. Critical Component of Streaming Architectures
[Diagram: US West, US Central, and US East fleets (truck sensors with C++ agents); data flow apps powered by NiFi; Kafka producers, Kafka topics, Kafka consumers & producers, and Kafka consumers; Analytics Apps 1 to 5]
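The producer, topic, and consumer roles in this architecture can be illustrated with a toy in-memory model. It is not the Kafka client API and leaves out partitions, persistence, and networking; the `InMemoryBroker` class, the topic names, the group names, and the speed limit are all hypothetical.

```python
from collections import defaultdict

class InMemoryBroker:
    """Toy stand-in for a message broker: each topic is an append-only list."""

    def __init__(self):
        self.topics = defaultdict(list)
        self.offsets = defaultdict(int)   # (group, topic) -> next index to read

    def produce(self, topic, record):
        """Append a record to the end of a topic."""
        self.topics[topic].append(record)

    def poll(self, group, topic):
        """Return the records this consumer group has not read yet."""
        log = self.topics[topic]
        start = self.offsets[(group, topic)]
        self.offsets[(group, topic)] = len(log)
        return log[start:]

def flag_speeding(broker, limit=65):
    """Consumer and producer in one: read raw events, publish the fast ones."""
    for event in broker.poll("speed-filter", "truck-events"):
        if event["speed"] > limit:
            broker.produce("speeding-events", event)

broker = InMemoryBroker()
for fleet, speed in [("us-west", 58), ("us-central", 72), ("us-east", 80)]:
    broker.produce("truck-events", {"fleet": fleet, "speed": speed})   # producer side
flag_speeding(broker)                                                  # consumer & producer
speeding = broker.poll("analytics-app-1", "speeding-events")           # downstream consumer
```

Because each group tracks its own offset, a second analytics app can read the same topic from the beginning without affecting the first.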
Cloudera Streams Messaging Manager (SMM)
What is SMM?
• Kafka Management and Monitoring tool
• Cure the “Kafka Blindness”
• Single Monitoring Dashboard for all your Kafka Clusters across 4 entities
– Broker
– Producer
– Topic
– Consumer
• REST as a First Class Citizen
• Alerting
• Schema Management
• Integration with Schema Registry
STREAMING ANALYTICS
• Pattern matching
• Predictive and Prescriptive Analytics
• Complex Event Processing
• Continuous & Real-time Insights
3 Kafka Analytics Access Patterns
• Streaming Access Pattern
• SQL Access Pattern: Kafka SQL (new)
• OLAP Access Pattern: Kafka OLAP (new)
[Diagram: Streaming Event Storage Substrate made up of Kafka topics (Topic A, Topic B, Topic C, Topic D, Topic X)]
ENTERPRISE SERVICES
• Provisioning
• Management
• Monitoring
• Unified Security
• Single Sign-on
• Audit
• Compliance
• Edge-to-Enterprise Governance
KEY DIFFERENTIATORS
• Comprehensive streaming platform – Only big data vendor to offer a comprehensive streaming platform from real-time data ingestion, transformation, routing to descriptive, prescriptive and predictive analytics.
• 100% open source technology – Only vendor with this strategy; prevents vendor lock-in
• 280+ pre-built processors – Only product to offer such comprehensive connectivity from edge to enterprise
• Built-in data provenance – Only product in the market to offer out-of-the-box data provenance on data-in-motion
• 3 Streaming analytics engines – Only vendor to offer a choice of three streaming analytics engines to customers for all their streaming architecture needs
DEMO
QUESTIONS?

More Related Content

What's hot

Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for Dinner
Kent Graziano
 
Considerations for Data Access in the Lakehouse
Considerations for Data Access in the LakehouseConsiderations for Data Access in the Lakehouse
Considerations for Data Access in the Lakehouse
Databricks
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
Kai Wähner
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Dr. Arif Wider
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced Analytics
DATAVERSITY
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Denodo
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
Databricks
 
TechEvent Databricks on Azure
TechEvent Databricks on AzureTechEvent Databricks on Azure
TechEvent Databricks on Azure
Trivadis
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
DATAVERSITY
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data Intelligence
Alation
 
Data Governance and Metadata Management
Data Governance and Metadata ManagementData Governance and Metadata Management
Data Governance and Metadata Management
DATAVERSITY
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
Databricks
 
Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)
Adrien Blind
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & Delta
Databricks
 
A Practical Enterprise Feature Store on Delta Lake
A Practical Enterprise Feature Store on Delta LakeA Practical Enterprise Feature Store on Delta Lake
A Practical Enterprise Feature Store on Delta Lake
Databricks
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
Adam Doyle
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
Databricks
 

What's hot (20)

Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for Dinner
 
Considerations for Data Access in the Lakehouse
Considerations for Data Access in the LakehouseConsiderations for Data Access in the Lakehouse
Considerations for Data Access in the Lakehouse
 
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache KafkaThe Heart of the Data Mesh Beats in Real-Time with Apache Kafka
The Heart of the Data Mesh Beats in Real-Time with Apache Kafka
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced Analytics
 
Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)Building a Logical Data Fabric using Data Virtualization (ASEAN)
Building a Logical Data Fabric using Data Virtualization (ASEAN)
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
TechEvent Databricks on Azure
TechEvent Databricks on AzureTechEvent Databricks on Azure
TechEvent Databricks on Azure
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data Intelligence
 
Data Governance and Metadata Management
Data Governance and Metadata ManagementData Governance and Metadata Management
Data Governance and Metadata Management
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)
 
Moving to Databricks & Delta
Moving to Databricks & DeltaMoving to Databricks & Delta
Moving to Databricks & Delta
 
A Practical Enterprise Feature Store on Delta Lake
A Practical Enterprise Feature Store on Delta LakeA Practical Enterprise Feature Store on Delta Lake
A Practical Enterprise Feature Store on Delta Lake
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 

Similar to Introducing Cloudera DataFlow (CDF) 2.13.19

Addressing Challenges with IoT Edge Management
Addressing Challenges with IoT Edge ManagementAddressing Challenges with IoT Edge Management
Addressing Challenges with IoT Edge Management
DataWorks Summit
 
Powering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache HadoopPowering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache Hadoop
Cloudera, Inc.
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
Cameron. A. Bradbury
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
Cameron. A. Bradbury
 
Getting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixGetting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with Bluemix
Nicolas Morales
 
CWIN17 Frankfurt / Cloudera
CWIN17 Frankfurt / ClouderaCWIN17 Frankfurt / Cloudera
CWIN17 Frankfurt / Cloudera
Capgemini
 
Cloudera - IoT & Smart Cities
Cloudera - IoT & Smart CitiesCloudera - IoT & Smart Cities
Cloudera - IoT & Smart Cities
Cloudera, Inc.
 
Paris FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationParis FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant Presentation
Abdelkrim Hadjidj
 
Digital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility companyDigital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility company
Ilham Ahmed
 
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
actualtechmedia
 
Stl meetup cloudera platform - january 2020
Stl meetup   cloudera platform  - january 2020Stl meetup   cloudera platform  - january 2020
Stl meetup cloudera platform - january 2020
Adam Doyle
 
BIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in LogisticsBIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in Logistics
Skillspeed
 
CL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and PlanningCL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and Planning
Cisco
 
Why Infrastructure matters?!
Why Infrastructure matters?!Why Infrastructure matters?!
Why Infrastructure matters?!
Gabi Bauer
 
Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics
DataWorks Summit/Hadoop Summit
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Streamsets Inc.
 
Serverless service adoption for Thailand
Serverless service adoption for ThailandServerless service adoption for Thailand
Serverless service adoption for Thailand
Watcharin Yang-Ngam
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
 
IoT Connected Brewery
IoT Connected BreweryIoT Connected Brewery
IoT Connected Brewery
Jason Hubbard
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical Overview
VMware Tanzu
 

Similar to Introducing Cloudera DataFlow (CDF) 2.13.19 (20)

Addressing Challenges with IoT Edge Management
Addressing Challenges with IoT Edge ManagementAddressing Challenges with IoT Edge Management
Addressing Challenges with IoT Edge Management
 
Powering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache HadoopPowering the Internet of Things with Apache Hadoop
Powering the Internet of Things with Apache Hadoop
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 
Getting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixGetting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with Bluemix
 
CWIN17 Frankfurt / Cloudera
CWIN17 Frankfurt / ClouderaCWIN17 Frankfurt / Cloudera
CWIN17 Frankfurt / Cloudera
 
Cloudera - IoT & Smart Cities
Cloudera - IoT & Smart CitiesCloudera - IoT & Smart Cities
Cloudera - IoT & Smart Cities
 
Paris FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationParis FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant Presentation
 
Digital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility companyDigital Business Transformation for Energy & Utility company
Digital Business Transformation for Energy & Utility company
 
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
 
Stl meetup cloudera platform - january 2020
Stl meetup   cloudera platform  - january 2020Stl meetup   cloudera platform  - january 2020
Stl meetup cloudera platform - january 2020
 
BIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in LogisticsBIG Data & Hadoop Applications in Logistics
BIG Data & Hadoop Applications in Logistics
 
CL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and PlanningCL2015 - Datacenter and Cloud Strategy and Planning
CL2015 - Datacenter and Cloud Strategy and Planning
 
Why Infrastructure matters?!
Why Infrastructure matters?!Why Infrastructure matters?!
Why Infrastructure matters?!
 
Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
Serverless service adoption for Thailand
Serverless service adoption for ThailandServerless service adoption for Thailand
Serverless service adoption for Thailand
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
IoT Connected Brewery
IoT Connected BreweryIoT Connected Brewery
IoT Connected Brewery
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical Overview
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Cloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
Cloudera, Inc.
 
Cloudera SDX
Cloudera SDXCloudera SDX
Cloudera SDX
Cloudera, Inc.
 
Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18
Cloudera, Inc.
 


Introducing Cloudera DataFlow (CDF) 2.13.19

  • 1. © Cloudera, Inc. All rights reserved. INTRODUCING CLOUDERA DATAFLOW (CDF) Dinesh Chandrasekhar Product Marketing Lead, Data-in-Motion BU Cloudera @AppInt4All George Vetticaden Product Management Lead, Data-in-Motion BU Cloudera @gvetticaden
  • 2. MARKET OPPORTUNITIES Cloud: ~$410 B; Streaming: ~$1.65 B; Data Science: ~$180 B; Big Data: ~$210 B; IoT: ~$1.2 T
  • 3. IOT MARKET By 2024, more than 24.9 billion IoT connections will be established. An estimated $70 billion will be spent by global manufacturers on IoT solutions in 2020. An estimated 646 million healthcare devices (excluding fitness trackers and wearable devices) will be connected by 2020. An estimated 78% of cars shipped globally will be built with hardware that connects to the internet by 2020. 50% of decision-makers in IT, services, utilities, and manufacturing have either deployed IoT or will deploy it in the next 12-24 months.
  • 4. KEY CUSTOMER CHALLENGES Visibility: Lack of visibility into end-to-end streaming data flows; inability to troubleshoot bottlenecks, consumption patterns, etc. Data Ingestion: High-volume streaming sources, multiple message formats, diverse protocols, and multi-vendor devices create data ingestion challenges. Real-time Insights: Analyzing the continuous, rapid inflow (velocity) of streaming data at high volumes creates major challenges for gaining real-time insights.
  • 5. CLOUDERA DATAFLOW
  • 6. WHAT IS CLOUDERA DATAFLOW (CDF)? Cloudera DataFlow (CDF) is a scalable, real-time streaming data platform that collects, curates, and analyzes data so customers gain key insights for immediate actionable intelligence.
  • 7. HISTORY OF CDF Mid-2000s: NiFi was developed and used at NSA. 2015: Onyara is acquired; HDF is born. 2018: Strong streaming platform (support for Kafka 2.0; SMM is introduced). 2019: Cloudera merger. Tomorrow: Edge-to-AI; bring this to the edge with connected platforms. Data-in-Motion: • Comprehensive real-time streaming data platform • Manage data-in-motion from edge to enterprise • Power IoT-scale streaming architectures. Enable next generation Modern Data Architecture. Enable Edge Intelligence.
  • 8. COMMON USE CASES • Data Movement: Optimize resource utilization by moving data between data centers or between on-premises infrastructure and cloud infrastructure. • Optimize Log Collection & Analysis: Optimize log analytics solutions by using CDF as a single platform to collect and deliver multiple data sources. • Gain key insights with Streaming Analytics: Accelerate big data ROI by analyzing streaming data for patterns, comparing with ML models, and delivering actionable intelligence. • Single view / 360° view of customer: Ingest, transform, and combine customer data from multiple sources into a single data view / lake. • Stream Processing: Combine multiple streams of data in real time, enrich the data, and route it to different endpoints based on rules. • Capture IoT Data: Ingest sensor data from IoT devices and stream it for further processing and comprehensive analysis.
  • 9. COMMON IOT USE CASES BY INDUSTRY Industries: Public Sector, Transportation, Utilities, Healthcare, Manufacturing, Retail. Fleet Management, Connected Cars, Smart Cities, Predictive Analytics, Inventory/Material Tracking. • IoT is a $1.13T market opportunity in 2021. • Americas: $329B IoT spending. Manufacturing and Transportation are the top industries, accounting for 26% of total spending. • APAC: $500B IoT spending. Manufacturing, Utilities, and Transportation are the top industries. • EMEA: $264B IoT spending. Manufacturing is the top industry, powered by Industry 4.0 initiatives. • Worldwide IoT Analytics and Information Management Market = $573M. Top 5 Use cases. Utility Monitoring, Predictive Maintenance, Patient Monitoring, Usage-based Insurance, Asset Tracking/Monitoring, Edge Data Collection.
  • 10. CUSTOMERS
  • 11. Improving Healthcare with SMART data. CHALLENGE: Combine multi-format data streams, with hundreds of sources, into one platform • Needed a platform that could combine multi-format data streaming • Data scarcity & latency problems • Machine learning & data science. SOLUTION: Cloud-based systems architected to deliver SMART data, using HDP and CDF • First to deliver SMART real-time streaming data • Clearsense’s Inception™ product enables fast decisions for clinicians • Customers have access to all data sources with HDP & CDF. RESULT: Mission-critical data and relevant insight for 2,000 rural providers • Mission-critical data is now available for doctors to make critical decisions • Cost efficiencies led to access for 2,000 rural providers • Real-time data helps prevent “Code Blue”. IMPACT: Lack of medical expertise around patient care, post surgery • Patient Code Blue status • Possible cardiac arrest 4-6 hours post surgery. Photo by rawpixel on Unsplash.
  • 12. Positioning technology products & services empower companies worldwide. CHALLENGE: Provide accurate data for small carriers to improve business results • 95% of small carriers (fewer than 50 trucks) have a deficit of data available • Estimated data, price points and revenue base opportunity for controlling fuel cost • Understanding of freight and lane movement. SOLUTION: Big Data in the Cloud with HDP, CDF, and Microsoft Azure • Leveraging big data powering Blockchain, with machine learning, to revolutionize Transportation and Logistics industries • Analyzed fuel data; can consolidate data set for small carriers to generate community data lake. RESULT: Double-digit revenue increase, year over year • Managing for 4 million trucks daily • $31 billion in freight movement guides customers to profitability • Blockchain-driven architecture. IMPACT: Continuing on current path would slow organizational growth and impact customers • Being unable to predict weather patterns would lead to delays and decreased product quality • Operational inefficiencies prevent reaching business revenue goals, lack of insights • Loss of product during transportation. Photo by rawpixel.com on Unsplash.
  • 13. PRODUCT OVERVIEW
  • 14. CLOUDERA DATAFLOW
  • 15. CLOUDERA DATAFLOW: Data-in-motion platform
  • 16. EDGE DATA MANAGEMENT • Edge data collection powered by Apache MiNiFi • MiNiFi: smaller footprint than NiFi • Guaranteed delivery • Data buffering • Prioritized queuing • Flow-specific QoS • Data provenance • Designed for extension • C++ / Java agents • Designed for IoT
  • 17. CLOUDERA DATAFLOW: Data-in-motion platform
  • 18. FLOW MANAGEMENT • Web-based user interface • Highly configurable • Out-of-the-box data provenance • Designed for extensibility • Secure • NiFi Registry • DevOps support • FDLC • Versioning • Deployment
  • 19. 280+ PROCESSORS FOR DEEPER ECOSYSTEM INTEGRATION Processor labels: Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, Generate Table Fetch, Jolt Transform JSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute, Fetch. Formats and protocols: HTTP, Syslog, Email, HTML, Image, HL7, FTP, UDP, XML, SFTP, AMQP, WebSocket. All Apache project logos are trademarks of the ASF and the respective projects.
  • 20. CLOUDERA DATAFLOW: Data-in-motion platform
  • 21. Streaming Analytics Reference Architecture. Kafka is Everywhere: Critical Component of Streaming Architectures. Diagram labels: truck sensors with C++ agents for the US West, US Central, and US East fleets; Data Flow Apps Powered by NiFi; Kafka Producers; Kafka Topics; Kafka Consumers & Producers; Kafka Topics; Kafka Consumers; Analytics Apps 1-5.
  • 22. Cloudera Streams Messaging Manager (SMM) What is SMM? • Kafka Management and Monitoring tool • Cure the “Kafka Blindness” • Single Monitoring Dashboard for all your Kafka Clusters across 4 entities: Broker, Producer, Topic, Consumer • REST as a First Class Citizen • Alerting • Schema Management • Integration with Schema Registry
  • 23. CLOUDERA DATAFLOW: Data-in-motion platform
  • 24. STREAMING ANALYTICS • Pattern matching • Predictive and Prescriptive Analytics • Complex Event Processing • Continuous & Real-time Insights
  • 25. 3 Kafka Analytics Access Patterns. Diagram labels: Streaming Access Pattern; SQL Access Pattern; OLAP Access Pattern; New, KAFKA SQL, New, KAFKA OLAP, New; Streaming Event Storage Substrate with Kafka Topics A, B, C, D, and X.
  • 26. CLOUDERA DATAFLOW: Data-in-motion platform
  • 27. ENTERPRISE SERVICES • Provisioning • Management • Monitoring • Unified Security • Single Sign-on • Audit • Compliance • Edge-to-Enterprise Governance
  • 28. CLOUDERA DATAFLOW: Data-in-motion platform
  • 29. KEY DIFFERENTIATORS • Comprehensive streaming platform: Only big data vendor to offer a comprehensive streaming platform, from real-time data ingestion, transformation, and routing to descriptive, prescriptive, and predictive analytics. • 100% open source technology: Only vendor with this strategy; prevents vendor lock-in. • 280+ pre-built processors: Only product to offer such comprehensive connectivity from edge to enterprise. • Built-in data provenance: Only product in the market to offer out-of-the-box data provenance on data-in-motion. • 3 streaming analytics engines: Only vendor to offer a choice of three streaming analytics engines to customers for all their streaming architecture needs.
  • 30. DEMO
  • 31. QUESTIONS?
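
The Stream Processing use case on slide 8 (enrich incoming data and route it to different endpoints based on rules) can be illustrated with a minimal Python sketch. This is not NiFi or CDF code: the `Router` class, the `enrich` helper, the endpoint names, and the truck events are all hypothetical, chosen only to show the enrich-then-route pattern.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    # Ordered (predicate, endpoint) pairs; the first matching rule wins.
    rules: list = field(default_factory=list)
    default: str = "unmatched"

    def add_rule(self, predicate, endpoint):
        self.rules.append((predicate, endpoint))

    def route(self, event):
        for predicate, endpoint in self.rules:
            if predicate(event):
                return endpoint
        return self.default


def enrich(event, fleet_regions):
    """Return a copy of the event with a region looked up from its truck id."""
    enriched = dict(event)
    enriched["region"] = fleet_regions.get(event["truck_id"], "unknown")
    return enriched


# Hypothetical reference data and routing rules.
fleet_regions = {"t-1": "us-west", "t-2": "us-east"}
router = Router()
router.add_rule(lambda e: e["speed_mph"] > 80, "speeding-alerts")
router.add_rule(lambda e: e["region"] == "us-west", "us-west-analytics")

events = [
    {"truck_id": "t-1", "speed_mph": 62},
    {"truck_id": "t-2", "speed_mph": 91},
    {"truck_id": "t-9", "speed_mph": 40},
]
routed = [(e["truck_id"], router.route(enrich(e, fleet_regions))) for e in events]
print(routed)
```

Rules are evaluated in order, so a speeding truck in the US West fleet would go to the alert endpoint rather than the regional one; events that match no rule fall through to a default endpoint instead of being dropped.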

Editor's Notes

  1. Data ingestion, transformation, and routing done visually with no code using Apache NiFi & 260+ processors. Build streaming apps and analytics from edge to data lake / EDW using builder. Enable edge data collection and intelligence through MiNiFi agents. Support massive IoT infrastructures. Deliver perishable insights with pattern matching and Complex Event Processing (CEP) from real-time streams. Manage, monitor, secure, and govern streaming data.
  2. What it actually is and What is the main use/goal of [product]?
  3. Provide context on why we added this to our stack at the time. For CDF, it was to (a) create more value from HDP by making it easier to get data into HDP, (b) take advantage of growing IoT market opportunities, and (c) address a more encompassing view of data. It was then foundational for the next step (DataPlane). History can help strengthen mental models of where this fits.
  4. TALK TRACK We usually help our customers get started with one of these CDF use cases: They augment their Splunk systems with a wider variety of data (via CDF). They ingest logs for cyber security and threat detection. They feed data to streaming analytics engines like Apache Spark or Apache Storm. They move their own data internally between data centers on premises or to the cloud. And of course, they capture data from the Internet of Things. CDF was originally designed to be robust, so that it could continue to move data despite varying device footprints or fluctuating power or connectivity levels. The data keeps flowing, without being lost in transit. [NEXT SLIDE]
  5. Clearsense public case study, https://hortonworks.com/customers/clearsense/ Challenge: Needed a viable, economic, and secure platform that could combine multi-format data streaming. Data scarcity/latency problems for healthcare organizations. Clinicians wanted to use machine learning/data science to store/analyze data, but the technology didn’t exist. Solution: First to deliver SMART real-time streaming data to healthcare customers. The Inception product makes data available for clinical, financial, and operational decisions. Customers have access to all data sources, ingested with CDF, stored in HDP, and delivered to the point of decision. Result: Doctors and nurses now have a new level of mission-critical data and relevant insight that can be incorporated into clinical decisions. Cost efficiencies from running in the cloud have allowed Clearsense to offer healthcare predictive analytics to 2,000 rural providers that otherwise wouldn’t have access. Real-time data is displayed on the “Mission Control” dashboard, which helps prevent Code Blue with patients.
  6. TMW/Trimble case study, https://hortonworks.com/customers/tmw-systems/ Challenge: Accurate data for small carriers is needed to improve business results. 95% of small carriers have a deficit in the data available to them. They are estimating data, price points, and revenue-based opportunities, and controlling fuel cost. Solution: A new approach enables advanced analytics leveraging Big Data. Analytics like market rate index, national rate, fuel surcharge, and maintenance cost are important because small businesses were growing at a fast rate. Leveraging big data powering Blockchain, with machine learning, to revolutionize the Transportation and Logistics industries. Analyzed fuel data; can consolidate data sets for small carriers to generate a community data lake to drive revenue, fuel and freight cost, lane analysis, and pricing ranges. Results: Double-digit revenue growth Y/Y. Managing 4M trucks on the nation/state roads, daily. $31 billion in freight movement guides customers to profitability. Blockchain-driven architecture.
  7. Data ingestion, transformation, and routing done visually with no code using Apache NiFi & 260+ processors. Build streaming apps and analytics from edge to data lake / EDW using builder. Enable edge data collection and intelligence through MiNiFi agents. Support massive IoT infrastructures. Deliver perishable insights with pattern matching and Complex Event Processing (CEP) from real-time streams. Manage, monitor, secure, and govern streaming data.
  8. Web-based user interface: design, control, feedback & monitoring. Highly configurable: loss tolerant vs guaranteed delivery; low latency vs high throughput; dynamic prioritization; flow can be modified at runtime; back pressure. Data provenance: track dataflow from beginning to end. Designed for extension: build your own processors. Secure: SSL, SSH, HTTPS, etc.
  9. Web-based user interface: design, control, feedback & monitoring. Highly configurable: loss tolerant vs guaranteed delivery; low latency vs high throughput; dynamic prioritization; flow can be modified at runtime; back pressure. Data provenance: track dataflow from beginning to end. Designed for extension: build your own processors. Secure: SSL, SSH, HTTPS, etc.
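
Two of the flow-management behaviors listed in these notes, back pressure and dynamic prioritization, can be sketched in plain Python. This is an illustrative model under stated assumptions, not how NiFi implements its queues: `BoundedPriorityQueue`, `offer`, `poll`, and the sample items are hypothetical names.

```python
import heapq
import itertools


class BoundedPriorityQueue:
    """A queue that refuses new items when full and releases by priority."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._heap = []
        # Tie-breaker so items with equal priority leave in arrival order.
        self._counter = itertools.count()

    def offer(self, item, priority):
        """Try to enqueue; return False when full so the producer must back off."""
        if len(self._heap) >= self.max_items:
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), item))
        return True

    def poll(self):
        """Remove and return the most urgent item (lowest number), or None."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = BoundedPriorityQueue(max_items=2)
accepted = [
    queue.offer("routine-log", priority=5),
    queue.offer("engine-alarm", priority=1),
    queue.offer("gps-ping", priority=3),  # rejected: the queue is full
]
first = queue.poll()                          # most urgent item leaves first
retry = queue.offer("gps-ping", priority=3)   # space is available again
print(accepted, first, retry)
```

Returning `False` from `offer` rather than silently dropping the item is what lets an upstream producer slow down or buffer, which is the essence of back pressure; the priority ordering models the idea that urgent data can overtake routine data in the same queue.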