© Cloudera, Inc. All rights reserved.
Beyond ETL: How to Build Continuous Ingestion for IoT
Sean Anderson | Product Marketing, Cloudera
Kirit Basu | Director of Product Management, StreamSets
Agenda
The Internet of Things
I. Driving Data Growth
II. Real-time Capabilities
III. An IoT Data Platform
IV. Cloudera Enterprise for IoT
The IoT Use Case
I. Packaged Goods
II. Sensor Data
III. Real-time Processing
StreamSets Platform
I. Data Collector
II. Data KPIs
III. Containerized Architecture
IV. Real-time Analytics with Cloudera
Demo
Poll
Are you currently collecting sensor data?
• Yes
• No
• Plan to in the future
Internet of Things (IoT) – A Revolution In The Making
IoT Markets - 2020
• $1.7 Trillion In Value
• 20% Annual Growth
• 30 Billion Things
• 250 Million Connected Vehicles
Source - IDC & Gartner Estimates
IoT Will Drive An Explosion of Data…
• Data expected to explode to 44 ZB by 2020 (44 Trillion GB!)
• 80% of data will be unstructured
Source: IDC
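The "44 ZB" and "44 Trillion GB" figures on this slide describe the same quantity. A minimal sketch of the conversion, assuming decimal (SI) units where 1 ZB is 10^21 bytes and 1 GB is 10^9 bytes; the function name is illustrative and not from the deck:

```python
# Assumption: decimal (SI) units, 1 ZB = 10**21 bytes and 1 GB = 10**9 bytes.
BYTES_PER_ZB = 10**21
BYTES_PER_GB = 10**9


def zettabytes_to_gigabytes(zettabytes: int) -> int:
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * BYTES_PER_ZB // BYTES_PER_GB


total_gb = zettabytes_to_gigabytes(44)
print(f"{total_gb:,} GB")  # 44 followed by twelve zeros, i.e. 44 trillion GB
```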
Value is Maximized when Data is combined from other sources
Value of Data is multiplied when you combine and correlate it with other data from relevant sources
40%: Improvement in value that can be unlocked by combining data from multiple IoT applications and sources
“Interoperability would significantly improve performance by combining sensor data from different machines and systems to provide decision makers with an integrated view of performance across an entire factory or oil rig.”
SOURCE: McKinsey Global Institute analysis
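Combining sensor data with another source can be as simple as a keyed join. The sketch below is only an illustration under stated assumptions: the record layout, the field names (`machine_id`, `date`, `last_service`) and the maintenance log are hypothetical and do not come from the deck or from any Cloudera or StreamSets API.

```python
def combine_by_machine(sensor_readings, maintenance_log):
    """Attach each machine's most recent service date to its sensor readings.

    Both inputs are lists of dicts; all field names are hypothetical.
    """
    last_service = {}
    for entry in maintenance_log:
        machine = entry["machine_id"]
        # ISO-8601 date strings sort chronologically, so string comparison works.
        if machine not in last_service or entry["date"] > last_service[machine]:
            last_service[machine] = entry["date"]

    # Readings from machines with no maintenance record get last_service=None.
    return [
        {**reading, "last_service": last_service.get(reading["machine_id"])}
        for reading in sensor_readings
    ]


readings = [
    {"machine_id": "press-1", "vibration": 0.8},
    {"machine_id": "press-2", "vibration": 0.2},
]
log = [
    {"machine_id": "press-1", "date": "2016-01-10"},
    {"machine_id": "press-1", "date": "2016-03-02"},
]
combined = combine_by_machine(readings, log)
```

With both sources in one record, a downstream query could, for example, relate vibration levels to time since last service, which neither source supports on its own.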
The IoT Ecosystem
Consumer
Industrial
IoT Gateway
Cloud
Data Center
Data Analytics
Sensors/ Things
The IoT Ecosystem
Consumer
Industrial
Sensors/ Things
• To grow by 50X
• Drop in prices by 70% in last 5 years
Data Characteristics
• Un-structured
• Intermittent
• Volume & Variety
IoT Gateway
• Data Routing
• Edge-Processing
• Edge-Storage
Data Storage, Processing & Analytics (Cloud, Data Center)
IoT Data Characteristics
• More processing in the cloud
• Analytics on the cloud
IoT Data Characteristics
• Distributed Data Processing
• Cloud & On-Premise
IoT Data Analytics
• Key to Value Creation
• Combine data from multiple sources & types
• Drive business insights
Key Attributes For Next Gen IoT Data Platform
• Scale efficiently based on your data growth
• Effectively handle multiple data-types and structures
• Manage the complexity of real-time IoT data ingest
• Fundamentally Secure
• Real-Time Analytics – Combine and analyze data from multiple sources
• Flexible deployment options - Cloud & Distributed Data Processing
Cloudera Enterprise – Making Hadoop Fast, Easy, and Secure
CLOUDERA ENTERPRISE
OPERATIONS
• Cloudera Manager
• Cloudera Director
DATA MANAGEMENT
• Cloudera Navigator
• Encrypt and KeyTrustee
• Optimizer
INTEGRATE
• BATCH: Sqoop
• REAL-TIME: Kafka, Flume
PROCESS, ANALYZE, SERVE
• BATCH: Spark, Hive, Pig, MapReduce
• STREAM: Spark
• SQL: Impala
• SEARCH: Solr
• SDK: Partners
UNIFIED SERVICES
• RESOURCE MANAGEMENT: YARN
• SECURITY: Sentry, RecordService
STORE
• FILESYSTEM: HDFS
• RELATIONAL: Kudu
• NoSQL: HBase
Cloudera Enterprise – The Data & Analytics Platform for IoT
Data Sources: Sensor/ IoT Data, Internal Systems, External Sources
IoT Gateway
Data Center, Cloud
• Data Storage
• Data Processing
• Machine Learning
• Real-time Analytics
BI Solutions, Real-Time Apps, Search, EDW, Discover, Machine Learning
Cloudera Enterprise – Real Time Analytics for IoT
Sensor/ IoT Data, Internal Systems, External Sources
• Streaming Ingest: Kafka & Flume - Real-Time Data Ingest for streaming, high volume data
• Real-Time Data Processing: Spark Streaming, Leadership in Spark, Integrated with EDH
• Flexible Storage: Store any and all Data. Kudu - Fast Analytics on Fast Data
• Data Security: Four pillars of security: Perimeter, Access, Visibility, and Data + Record Service
• Centralized Mgmt.: Cloudera Manager for centralized cluster management
• Deployment Flexibility: Manage Multiple Clusters – On Premise or Cloud environment
BI Solutions, Real-Time Apps, Search, EDW, Discover, Machine Learning
The Cloudera Difference
Easy to Manage
• Powerful Cluster Ops: Trusted by the pros
• Cloud & Hybrid deployment: Integrated with AWS & Azure
• Expert Support: Dedicated prescriptive help, just a click away
Fast for Business
• Real-Time IoT Analytics: The most experience with Spark
• The Fastest Analytic SQL: Lowest latency, best concurrency
• Fast, Updateable Analytic Storage: High throughput, low latency, and updates
Security without Compromise
• Enterprise Encryption: Protects everything transparently
• Access Policy Enforcement: Full-stack row/column-based RBAC & dynamic masking
• Automated Data Management: Full-stack audit, lineage, discovery, and lifecycle
• Secure Operations: Separation of duties, log data redaction
Continuous Data Ingestion with Cloudera & StreamSets
StreamSets enables easy onboarding and effortless data ingest into all components of CDH
Reliable, Scalable, Always-on Data Ingest
IoT for Consumer Packaged Goods
Gathering Sensor Data from IoT Devices
Gathering Sensor Data from Freight Containers
The StreamSets Data Collector
Design Debug Execute
StreamSets Deployment Models
Poll
Where are you in your development effort for bringing IoT data into Hadoop?
• In Production
• Test and Development
• Planning (Already decided on the architecture)
• Not there yet (Need to decide on an architecture)
• Current Architecture doesn’t work, need a better way to do things
Challenges with IoT Data
• Multitude of Sensors
• Real-Time Streaming
• Multiple Firmware versions
• Bad data from damaged sensors
• Regulatory Constraints
• Data Quality
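Two of the challenges above, multiple firmware versions and bad data from damaged sensors, can be illustrated with a small validate-and-route step. This is a minimal sketch under stated assumptions, not the StreamSets Data Collector's implementation: the firmware versions, field names, and the plausible temperature range are all hypothetical.

```python
# Hypothetical mapping: each firmware version names the temperature field differently.
FIELD_ALIASES = {
    "1.0": {"temp": "temperature_c"},
    "2.0": {"temperature": "temperature_c"},
}

# Assumed plausible range in degrees Celsius; readings outside it are treated as bad.
VALID_RANGE_C = (-40.0, 85.0)


def normalize(record):
    """Rename firmware-specific fields to a common schema; None if firmware is unknown."""
    aliases = FIELD_ALIASES.get(record.get("firmware"))
    if aliases is None:
        return None
    normalized = {"sensor_id": record.get("sensor_id")}
    for old_name, new_name in aliases.items():
        if old_name in record:
            normalized[new_name] = record[old_name]
    return normalized


def route(records):
    """Split records into (good, bad): good ones are normalized, bad ones kept raw."""
    good, bad = [], []
    low, high = VALID_RANGE_C
    for record in records:
        normalized = normalize(record)
        value = normalized.get("temperature_c") if normalized else None
        if isinstance(value, (int, float)) and low <= value <= high:
            good.append(normalized)
        else:
            bad.append(record)  # keep the raw record for later inspection
    return good, bad


records = [
    {"sensor_id": "a", "firmware": "1.0", "temp": 21.5},
    {"sensor_id": "b", "firmware": "2.0", "temperature": 900},  # implausible value
    {"sensor_id": "c", "firmware": "9.9", "temp": 20},          # unknown firmware
]
good, bad = route(records)
```

Keeping rejected records in their raw form, rather than dropping them, leaves room to reprocess them once a new firmware mapping is added.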
Demo – Evolving DataSets
Getting Started is Easy
1. Watch the Beyond ETL Series
2. Download the StreamSets Data Collector
3. Contact Us to start a POC
Thank You

How to Build Continuous Ingestion for the Internet of Things
