
Leveraging the Cloud for Big Data Analytics 12.11.18

Learn how organizations are deriving unique customer insights, improving product and service efficiency, and reducing business risk with a modern big data architecture powered by Cloudera on AWS. In this webinar, you will see how fast and easy it is to deploy a modern data management platform in your cloud, on your terms.

BIG DATA ANALYTICS IN THE CLOUD
Sushant Rao
Cloud Product Marketing @ Cloudera
Rohit Pujari
Solutions Architect @ Amazon Web Services
© Cloudera, Inc. All rights reserved.
Primary Advantages of the Cloud
● Agility
○ Speed of making changes to meet business / technical needs
● Scalable & Elastic
○ Scale up and down quickly
● Reliable
○ Multiple options to ensure infrastructure / services are available
● Cost effectiveness
○ Pay for what you use (but may not be cheaper than on-prem)
Big Data Use Cases for Cloud
● Corporate directive to leverage the cloud
○ C-level has decided to utilize the cloud more
● Disaster Recovery “location” in the cloud
○ Back up all data to the cloud, without a second “physical” location
● On-demand data mart / data engineering
○ Separate environment for new, production workloads
○ Ad-hoc workloads that run intermittently
● Sandbox environment for workloads
○ Environment to test queries and algorithms
Cloudera’s Solution for Data Analytics / Engineering in Cloud
• The modern platform for machine learning and analytics
○ Numerous functions for all types of jobs and queries
• with multiple deployment options
○ On-premises, Public cloud (including multi-), and Hybrid
• and one shared data experience
○ Framework for consistent security, governance, and metadata management across applications and deployments
The Modern Platform for Machine Learning & Analytics
OPERATIONAL DATABASE | DATA ENGINEERING | DATA WAREHOUSE | DATA SCIENCE
DATA PROCESSING
• Cost efficient
• Reliable
• Scalable
• Based on Spark, MapReduce, Hive & Pig
• Supported by Workload Analytics
FAST BI & SQL
• Flexibility
• Elastic scale
• Go beyond SQL
• Based on Impala & Hive
• SQL dev environment
• Supported by Workload Analytics
MACHINE LEARNING
• Fast dev to production
• Secure self-serve
• Based on Python, R, and Spark
• ML dev environment (CDSW)
ONLINE & REAL-TIME
• High throughput, low latency
• Strongly consistent
• Based on HBase, Kudu & Spark Streaming
Cloudera’s Vision for AI and Machine Learning
Modern Enterprise Platform, Tools, and Expert Guidance to help you Unlock Business Value with ML / AI
• Agile platform to build, train, and deploy scalable ML applications
• Enterprise data science tools to accelerate team productivity
• Expert guidance, services & training to fast track value & scale

Recommended

Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionCloudera, Inc.
 
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 

More Related Content

What's hot

Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Cloudera, Inc.
 
Cloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera, Inc.
 
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18Cloudera, Inc.
 
PaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with AltusPaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with AltusCloudera, Inc.
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartchCloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...Cloudera, Inc.
 
Self-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureSelf-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureCloudera, Inc.
 
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseCloudera, Inc.
 
How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...Cloudera, Inc.
 
Big data journey to the cloud maz chaudhri 5.30.18
Big data journey to the cloud   maz chaudhri 5.30.18Big data journey to the cloud   maz chaudhri 5.30.18
Big data journey to the cloud maz chaudhri 5.30.18Cloudera, Inc.
 
Big data journey to the cloud rohit pujari 5.30.18
Big data journey to the cloud   rohit pujari 5.30.18Big data journey to the cloud   rohit pujari 5.30.18
Big data journey to the cloud rohit pujari 5.30.18Cloudera, Inc.
 
Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Cloudera, Inc.
 
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformCloudera, Inc.
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningCloudera, Inc.
 
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)Cloudera, Inc.
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera, Inc.
 

What's hot (20)

Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18
 
Cloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for Analytics
 
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
 
PaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with AltusPaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with Altus
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartch
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
 
Self-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureSelf-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft Azure
 
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
 
How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...
 
Big data journey to the cloud maz chaudhri 5.30.18
Big data journey to the cloud   maz chaudhri 5.30.18Big data journey to the cloud   maz chaudhri 5.30.18
Big data journey to the cloud maz chaudhri 5.30.18
 
Big data journey to the cloud rohit pujari 5.30.18
Big data journey to the cloud   rohit pujari 5.30.18Big data journey to the cloud   rohit pujari 5.30.18
Big data journey to the cloud rohit pujari 5.30.18
 
Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18
 
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
 
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
How Big Data Can Enable Analytics from the Cloud (Technical Workshop)
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made Easy
 

Similar to Leveraging the Cloud for Big Data Analytics 12.11.18

Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Cloudera, Inc.
 
Cloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera, Inc.
 
A deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloudA deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloudCloudera, Inc.
 
Five Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSFive Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSCloudera, Inc.
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...Cloudera, Inc.
 
Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...
Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...
Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...Matt Stubbs
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudCloudera, Inc.
 
Cloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the CloudCloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the CloudGoDataDriven
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaCloudera, Inc.
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Stefan Lipp
 
Turning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformTurning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformCloudera, Inc.
 
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsCloudera, Inc.
 
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Cloudera, Inc.
 
Get Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber SolutionGet Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber SolutionCloudera, Inc.
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadDataWorks Summit
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computingPUBLEAD (R)
 
Optimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analyticsOptimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analyticsCloudera, Inc.
 

Similar to Leveraging the Cloud for Big Data Analytics 12.11.18 (20)

Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
 
Cloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera Altus: Big Data in der Cloud einfach gemacht
Cloudera Altus: Big Data in der Cloud einfach gemacht
 
A deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloudA deep dive into running data analytic workloads in the cloud
A deep dive into running data analytic workloads in the cloud
 
Hybrid is the New Normal
Hybrid is the New NormalHybrid is the New Normal
Hybrid is the New Normal
 
Five Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSFive Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWS
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
Comment développer une stratégie Big Data dans le cloud public avec l'offre P...
 
Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...
Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...
Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...
 
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the CloudPart 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
Part 2: Cloudera’s Operational Database: Unlocking New Benefits in the Cloud
 
Cloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the CloudCloudera GoDataFest Deploying Cloudera in the Cloud
Cloudera GoDataFest Deploying Cloudera in the Cloud
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
 
Turning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformTurning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data Platform
 
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of Things
 
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera

 
Get Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber SolutionGet Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber Solution
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computing
 
Optimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analyticsOptimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analytics
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceCloudera, Inc.
 
When SAP alone is not enough
When SAP alone is not enoughWhen SAP alone is not enough
When SAP alone is not enoughCloudera, Inc.
 
Multi task learning stepping away from narrow expert models 7.11.18
Multi task learning stepping away from narrow expert models 7.11.18Multi task learning stepping away from narrow expert models 7.11.18
Multi task learning stepping away from narrow expert models 7.11.18Cloudera, Inc.
 
Cloudera training secure your cloudera cluster 7.10.18
Cloudera training secure your cloudera cluster 7.10.18Cloudera training secure your cloudera cluster 7.10.18
Cloudera training secure your cloudera cluster 7.10.18Cloudera, Inc.
 
The 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedThe 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedCloudera, Inc.
 
Delivering improved patient outcomes through advanced analytics 6.26.18
Delivering improved patient outcomes through advanced analytics 6.26.18Delivering improved patient outcomes through advanced analytics 6.26.18
Delivering improved patient outcomes through advanced analytics 6.26.18Cloudera, Inc.
 

More from Cloudera, Inc. (16)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR compliance
 
When SAP alone is not enough
When SAP alone is not enoughWhen SAP alone is not enough
When SAP alone is not enough
 
Multi task learning stepping away from narrow expert models 7.11.18
Multi task learning stepping away from narrow expert models 7.11.18Multi task learning stepping away from narrow expert models 7.11.18
Multi task learning stepping away from narrow expert models 7.11.18
 
Cloudera training secure your cloudera cluster 7.10.18
Cloudera training secure your cloudera cluster 7.10.18Cloudera training secure your cloudera cluster 7.10.18
Cloudera training secure your cloudera cluster 7.10.18
 
The 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: ExposedThe 5 Biggest Data Myths in Telco: Exposed
The 5 Biggest Data Myths in Telco: Exposed
 
Delivering improved patient outcomes through advanced analytics 6.26.18
Delivering improved patient outcomes through advanced analytics 6.26.18Delivering improved patient outcomes through advanced analytics 6.26.18
Delivering improved patient outcomes through advanced analytics 6.26.18
 

Leveraging the Cloud for Big Data Analytics 12.11.18

  • 1. BIG DATA ANALYTICS IN THE CLOUD Sushant Rao Cloud Product Marketing @ Cloudera Rohit Pujari Solutions Architect @ Amazon Web Services
  • 2. © Cloudera, Inc. All rights reserved. Primary Advantages for Cloud ● Agility ○ Speed of making changes to meet business / technical needs ● Scalable & Elastic ○ Scale up and down quickly ● Reliable ○ Multiple options to ensure infrastructure / services are available ● Cost effectiveness ○ Pay for what you use (but may not be cheaper than on-prem)
  • 3. Big Data Use Cases for Cloud ● Corporate directive to leverage the cloud ○ C-level has decided to utilize the cloud more ● Disaster Recovery “location” in the cloud ○ Back up all data to the cloud, without a second “physical” location ● On-demand data mart / data engineering ○ Separate environment for new, production workloads ○ Ad-hoc workloads that run intermittently ● Sandbox environment for workloads ○ Environment to test queries and algorithms
  • 4. Cloudera’s Solution for Data Analytics / Engineering in Cloud • The modern platform for machine learning and analytics ○ Numerous functions for all types of jobs and queries • with multiple deployment options ○ On-premises, Public cloud (including multi-), and Hybrid • and one shared data experience ○ Framework for consistent security, governance, and metadata management across applications and deployments
  • 5. The Modern Platform for Machine Learning & Analytics ● DATA ENGINEERING (data processing) • Cost efficient • Reliable • Scalable • Based on Spark, MapReduce, Hive & Pig • Supported by Workload Analytics ● DATA WAREHOUSE (fast BI & SQL) • Flexibility • Elastic scale • Go beyond SQL • Based on Impala & Hive • SQL dev environment • Supported by Workload Analytics ● DATA SCIENCE (machine learning) • Fast dev to production • Secure self-serve • Based on Python, R, and Spark • ML dev environment (CDSW) ● OPERATIONAL DATABASE (online & real-time) • High throughput, low latency • Strongly consistent • Based on HBase, Kudu & Spark Streaming
  • 6. Cloudera’s Vision for AI and Machine Learning: Modern Enterprise Platform, Tools, and Expert Guidance to help you Unlock Business Value with ML / AI ● Agile platform to build, train, and deploy scalable ML applications ● Enterprise data science tools to accelerate team productivity ● Expert guidance, services & training to fast track value & scale
  • 7. With Multiple Deployment Options [Diagram] ● Traditional infrastructure (combined storage and compute) ● Cloud infrastructure (decoupled storage and compute), via Cloudera Altus Director (infrastructure services) ● Cloud infrastructure (decoupled storage and compute), via Cloudera Altus Services. Workload labels shown: Operational Database, Data Engineering, Data Warehouse, Data Science.
  • 8. Cloudera Enterprise with SDX ● Benefits for IT infra & ops ○ Central control and security ○ Focus on curating, not firefighting ● Benefits for users ○ Value from single source of truth ○ Bring the best tools for each job [Diagram] Workloads: Data Engineering, Data Science, Data Warehouse, Operational Database, 3rd party services. Common services: Data Catalog, Governance, Security, Lifecycle Management. Storage: HDFS, Kudu, Amazon S3, Microsoft ADLS.
  • 9. Many Options for Data Analytics / Engineering in the Cloud: Altus Director, Altus Services, Existing On-Prem Deployment
  • 10. Many Options for Data Analytics / Engineering in the Cloud: Altus Director, Altus Services, Existing On-Prem Deployment, Starting New Deployment
  • 11. Journey from On-Prem Cluster to Cloud [Diagram] 0 - On premises: persistent Cloudera cluster on bare metal. Compute: Data Engineering, Data Warehouse, Data Science. Data context: Security, Metadata, Governance. Storage: HDFS.
  • 12. Journey from On-Prem Cluster to Cloud [Diagram] Adds 1 - Lift and shift: persistent Cloudera cluster in the customer cloud, with the same compute and data context components and HDFS storage.
  • 13. Journey from On-Prem Cluster to Cloud [Diagram] Adds 2 - Object storage: persistent Cloudera cluster in the customer cloud, with the same compute and data context components and storage in a cloud object store.
  • 14. Journey from On-Prem Cluster to Cloud [Diagram] Adds 3 - Cloud native architectures: transient Cloudera clusters (Altus) providing compute for Data Engineering and for Data Warehouse, a persistent Cloudera cluster (Director) with compute and data context, storage and data context in a cloud object store in the customer cloud, and the Cloudera Altus control plane in the Cloudera cloud.
  • 15. Customer Examples: Many Cloudera customers (Global 5K) use public cloud • Online retailer ○ Over 2,000 nodes with ~2PB of data on AWS running in an active-active configuration ○ Transforming data with Spark and then analyzing with Apache Hive • German chain of coffee retailers and cafés ○ 30+ nodes with 50TB of data on AWS ○ Modern Cloudera platform with an Impala data warehouse • Global information company ○ 50+ nodes on Microsoft Azure and 20+ nodes on AWS ○ Replaced Netezza with Hadoop and leveraging both Impala and Spark for analytics
  • 16. Security Use Case: Cloudera is using cloud as well • Altus-based solution saved more than 50% in cost compared to the initial implementation
  • 17. Cloudera Altus Key Differentiators • Multi-function: Unified platform for data engineering, data warehouse, and data science • Multi-cloud: Option for on-premises, Public cloud (including multi-), and Hybrid • SDX: Integrated shared data experience across multi-function clusters
  • 18. Rohit Pujari, Solutions Architect AWS Security & Compliance
  • 19. Why is security traditionally so hard? Lack of visibility Low degree of automation
  • 20. Before: move fast OR stay secure. Now: move fast AND stay secure.
  • 21. Making life easier Choosing security does not mean giving up on convenience or introducing complexity
  • 22. The most sensitive workloads run on AWS “We can be even more secure in the AWS cloud than in our own datacenters.” —Tom Soderstrom, CTO, NASA JPL “We knew the cloud was the only way to get the scalability, speed, and security our customers expect from 3M.” —Rick Austin, 3M Health Information Systems “We determined that security in AWS is superior to our on-premises data center across several dimensions, including patching, encryption, auditing and logging, entitlements, and compliance.” —John Brady, CISO, FINRA (Financial Industry Regulatory Authority)
  • 23. Benefits of a Data Lake - All Data is in One Place Analyze all of your data, from all of your sources, in one stored location “Why is the data distributed in many locations? Where is the single source of truth?”
  • 24. Why Amazon S3 for a Data Lake? ● Durable ▪ Designed for 11 9s of durability ● Available ▪ Designed for 99.99% availability ● High performance ▪ Multipart upload ▪ Range GET ▪ Scalable throughput ● Scalable ▪ Store as much as you need ▪ Scale storage and compute independently ▪ No minimum usage commitments ● Integrated Partner Tools ▪ Cloudera EDH ▪ Cloudera Altus ▪ Cloudera Impala ● Easy to use ▪ Simple REST API ▪ AWS SDKs ▪ Simple management tools ▪ Event notification ▪ Lifecycle policies
  • 25. Data Ingestion into Amazon S3 ▪ AWS Direct Connect ▪ AWS Snowball ▪ ISV Connectors ▪ Kafka/Flume ▪ Amazon Kinesis Firehose ▪ Amazon S3 Transfer Acceleration ▪ AWS Storage Gateway
  • 26. Strong Security Controls ● Security ▪ Identity and Access Management (IAM) policies ▪ Bucket policies ▪ Access Control Lists (ACLs) ▪ Private VPC endpoints to Amazon S3 ▪ Amazon S3 object tagging to manage access policies ● Encryption ▪ SSL endpoints ▪ Server-side encryption (SSE-S3) ▪ S3 server-side encryption with provided keys (SSE-C, SSE-KMS) ▪ Client-side encryption ● Compliance ▪ Bucket access logs ▪ Lifecycle management policies ▪ Access Control Lists (ACLs) ▪ Versioning and MFA deletes ▪ Certifications: HIPAA, PCI, SOC 1/2/3, etc.
  • 27. Move to AWS: Strengthen your security posture ▪ Automate with deeply integrated security tools and services ▪ Inherit global security and compliance controls ▪ Highest standards for privacy and data security ▪ Largest network of security partners and solutions ▪ Scale with superior visibility and control that satisfies the most risk-sensitive orgs
  • 28. Highest standards for privacy ▪ Encrypt data in transit and at rest with keys managed by the AWS Key Management Service (KMS), or manage your own encryption keys with AWS CloudHSM using FIPS 140-2 Level 3 validated HSMs ▪ Meet data residency requirements: choose an AWS Region and AWS will not replicate your data elsewhere unless you choose to do so ▪ Access services and tools that enable you to build GDPR-compliant infrastructure on top of AWS ▪ Comply with local data privacy laws by controlling who can access content, its lifecycle and disposal
  • 29. Inherit global security and compliance controls
  • 30. Data Analytics / Engineering with Cloudera • Lower risk of data breach • Analysts more productive on jobs • Self-service (no shadow IT) and more productive • IT more strategic, less admin time • Deployment choices and no lock-in • Same solution as on-premises and multi-cloud • Eliminate data copies • Single security framework with universally shared metadata • Easy to track data lineage • Unified services. Column headings: CLOUDERA ADVANTAGES, BUSINESS VALUE
  • 31. Ready to try Data Analytics / Engineering in the Cloud? ● Have an existing cluster for DW / DE • Up to $2K Free AWS Credits* • Email: awsoffer@cloudera.com ● Don’t have an existing cluster • Free Altus DE / DW Trial • https://sso.cloudera.com/register.html *Must work with AWS and Cloudera account managers on POC to be eligible for offer
  • 33. APPENDIX
  • 34. Cloudera Pricing / Acquisition • Acquisition Options ○ Pay-as-you-go usage-based pricing ○ Node-based license subscription ○ Free 30-day trial ○ Pre-pay of cloud credits ○ Free version that can be deployed in the cloud • Pricing - https://www.cloudera.com/products/pricing.html
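
The durability and availability targets on slide 24, and the choice between network transfer and AWS Snowball behind slide 25, are easier to judge with some quick arithmetic. The sketch below is illustrative only. The object count, data volume, and link speed are assumed values, not figures from the talk. It treats "11 9s" as an average annual per-object loss rate and ignores protocol overhead on the network link.

```python
from fractions import Fraction

# Design targets quoted on slide 24.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)  # "11 9s"
AVAILABILITY = Fraction(9_999, 10_000)                   # 99.99%

MINUTES_PER_YEAR = 365 * 24 * 60
SECONDS_PER_DAY = 86_400


def expected_objects_lost_per_year(num_objects: int) -> Fraction:
    """Average number of objects lost per year at the durability target."""
    return num_objects * (1 - DURABILITY)


def downtime_minutes_per_year() -> Fraction:
    """Minutes per year a service may be unavailable at the availability target."""
    return (1 - AVAILABILITY) * MINUTES_PER_YEAR


def transfer_days(terabytes: int, link_gbps: int) -> Fraction:
    """Days needed to push the data over a fully used link (decimal TB and Gbps)."""
    bits = terabytes * 10**12 * 8
    seconds = Fraction(bits, link_gbps * 10**9)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    objects = 10_000_000  # hypothetical object count
    lost = expected_objects_lost_per_year(objects)
    print(f"Expected loss: one object every {1 / lost} years")
    print(f"Allowed downtime: {float(downtime_minutes_per_year()):.2f} minutes per year")
    # Hypothetical migration: 100 TB over a dedicated 1 Gbps connection.
    print(f"100 TB over 1 Gbps: {float(transfer_days(100, 1)):.1f} days")
```

Under these assumptions, a transfer that takes more than a week over the network is the kind of case where physical transport may be the faster option.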

Editor's Notes

  1. Let’s keep this interactive. Please do ask questions as we go along
  2. Start with an overview of our strategy, which has 3 pillars. First is a multi-function platform which has both machine learning and analytics. For the work our customers are doing, siloed products won’t get it done. Next is the flexibility to choose the deployment that best meets the needs of their applications, data, and security / governance. Lastly, there is a framework to ensure consistency across applications and deployments. Let’s go deeper into these.
  3. Our customers are drawn from the Global 5K, and for these companies, the type of complex workloads they are running requires more than a point product. So, we provide a platform that covers data engineering, data warehouse, data science and operational analytics. The platform also includes data ingestion, such as with Kafka, and other components such as Apache Solr, which provides capabilities to analyze text and logs. Companies have the option of using these with pay-as-you-go usage-based pricing, a node-based license subscription, pre-pay of cloud credits, or a free version that can be deployed in the cloud.
  4. Hadoop and Spark are the starting point but it’s not everything they need. So, those are some of the kinds of applied machine learning Research & Advising capabilities that Cloudera focuses on to help our clients be successful with enterprise machine learning. We also couple this with Professional Services & Training, and with our modern, unified Data Platform and enterprise Data Science tooling. I’ll spend the rest of this talk focusing on the latter capabilities. *** Old notes / reference *** With our modern, open platform and enterprise tools, we enable clients to build and deploy AI solutions at scale, efficiently and securely, anywhere they want. And we couple that with Cloudera Fast Forward Labs expert guidance to help clients realize their AI future, faster. Ideal Foundation: Agile platform to build, train, and deploy scalable ML applications. Cloudera's modern platform with SDX enables secure, shared data access with consistent context, breaking down data & workflow silos. Combines data warehousing and ML on a single platform that runs anywhere, at scale. Built on open tech for future-proof innovation. Enterprise ML Made Easy: Enterprise data science tools to accelerate team productivity. CDSW eases the machine learning workflow. Supports modern, open data science and ML tooling and team collaboration for innovation & agility, with enterprise-grade data management, security and governance. Fast track to value & scale: Expert guidance, services & training to fast track value & scale. Cloudera Fast Forward Labs helps you design & execute your ML strategy. Enables rapid, practical application of emerging ML technologies to your business. Cloudera PS for proven delivery of scalable, production-grade ML systems.
  5. So we introduced Cloudera SDX, or shared data experience, the foundation of Cloudera Enterprise. SDX makes it possible for companies to run dozens, even hundreds, of analytic applications against a common pool of data. One logical cluster provides a shared data experience to multiple workloads and tenants. SDX applies a centralized, consistent framework for catalog, security, governance, management, data ingest and more. It makes it faster, easier, and safer for organizations, teams, and people to develop and deploy high-value, multi-function use cases like customer next best offer, clinical prediction, and risk modeling. SDX cuts through silos to unify data, analytics, management, security, and governance, and empowers self-service. It combines the strengths of on-premises and cloud-only deployments: * multi-function support * shared data experience * information security model * cost management * tenant isolation * workload elasticity * self service * speed of deployment
  6. - Cloudera Infosec wanted to use Apache Spot to analyze security events in our network - Our IT team didn't want them to run their workload on the production cluster due to typical isolation / uptime concerns on business-critical workloads. - They were running on their own cluster, but that was underutilized and a waste of money - So, they migrated the workload to Altus Services - After moving to Altus Services, the costs dropped by 50% due to better utilization.
  7. Since we’re discussing how to migrate Hadoop workloads to AWS, we’re aware how important it is to break down data silos and build a well-governed data lake to which different business units can subscribe to fulfill their analytics needs. AWS adds a global dimension to the concept of a data lake: you can build a policy-driven data lake that respects geographic boundaries not just from a data storage perspective but also from a data processing standpoint.
  8. Amazon S3 is a global service that allows you to store data in 18 regions around the world. S3 is a highly available, web-scale object store that is designed for 11 9s of durability. It is an infinitely scalable data storage infrastructure at very low cost compared to HDFS. S3 is designed to be highly flexible: you can store any data in any format you want, including Hadoop-compatible formats like Parquet, ORC, Avro, JSON, CSV, and others. And you can access it in a variety of ways, such as over the REST API, command line tools, the Hadoop S3A client, etc. Almost all AWS partner products that work with data are integrated with S3, including Cloudera EDH, Altus and Impala.
  9. And there are a host of options to bring data into S3. If the majority of your data is on-premises, you can use Direct Connect to establish a high-throughput dedicated connection from your premises to AWS. Once you have Direct Connect in place, you can use tools of your choice to send the data to S3. If you have data in the range of terabytes to petabytes, and sending data over the network is not time-efficient, you can use AWS Snowball devices for secure physical transport. For streaming data, you can use Flume, Kafka and Kinesis to land that data in S3. S3 Transfer Acceleration enables fast data transfer over long distances between your client and an S3 bucket. So, for example, a user in Australia who’s trying to upload data to an S3 bucket in the US can take advantage of S3 Transfer Acceleration, which makes use of globally distributed edge locations: once the data arrives at an edge location, it is routed to S3 over an optimized network path. You also have the option to use AWS Storage Gateway, which can expose an S3 bucket as an NFS mount that you can use to store and retrieve data. You can also use cloud-backed storage volumes to asynchronously back up point-in-time snapshots of your data to S3. As you can see, S3 allows you to build a truly global, policy-driven data lake.
  10. Also, you get strong security controls with S3. You can securely send your data to S3 via SSL endpoints. You can encrypt data at rest: with S3 server-side encryption, you can configure your S3 buckets to automatically encrypt data before storing it. You can use the Key Management Service from AWS if you wish to control the encryption keys. In addition to that, you can use your own encryption libraries to encrypt the data before storing it in S3. There are a number of ways in which you can control access to your data. You can use IAM policies and bucket policies, which define which user, group or role can access what resources and data. VPC endpoints allow you to further lock down your S3 buckets so that they can be accessed only from your logically isolated section of the AWS cloud. You can use tags to classify your data and define fine-grained access control based on them. From a compliance perspective, S3 captures access logs: a full audit trail of who has accessed what data, when, and from where. You can version your objects and set up MFA for delete as an extra layer of protection. S3 is compliant with HIPAA, PCI, and SOC 1, 2, and 3, giving you even more confidence that you can safely store and process sensitive data.
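
Several of the controls in note 10 (SSL endpoints, server-side encryption, VPC endpoints) can be enforced together in one bucket policy. The following is a minimal sketch, not a policy taken from the talk. The bucket name and VPC endpoint ID are hypothetical placeholders, and the choice of SSE-KMS as the required encryption mode is an assumption. The condition keys used (`aws:SecureTransport`, `s3:x-amz-server-side-encryption`, `aws:SourceVpce`) are standard AWS policy condition keys.

```python
import json


def build_bucket_policy(bucket: str, vpc_endpoint_id: str) -> dict:
    """Build a bucket policy that denies requests violating three controls."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    object_arn = f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject any request that does not use SSL/TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, object_arn],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Reject uploads that do not request SSE-KMS encryption.
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": object_arn,
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
                },
            },
            {
                # Reject requests that do not arrive through the given VPC endpoint.
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, object_arn],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}},
            },
        ],
    }


if __name__ == "__main__":
    # Both identifiers below are placeholders for illustration.
    policy = build_bucket_policy("example-data-lake", "vpce-0123456789abcdef0")
    print(json.dumps(policy, indent=2))
```

The resulting JSON could be attached with the S3 `PutBucketPolicy` operation. Note that the third statement also blocks access from outside the VPC, including the AWS console, so a policy like this should be tested on a non-production bucket first.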