1© Cloudera, Inc. All rights reserved.
Bringing Self-Service BI & SQL
Analytics to a Hybrid Cloud World
Alex Gutow | Product Marketing, Analytic Database
Cloudera
What’s Driving Analytics to the Cloud?
Big data deployments in cloud
are accelerating:
● Increased Agility: End-user self-service
● Elasticity: Optimize infrastructure usage
● Executive Mandate: Minimize on-prem
datacenter footprint
● Lower Overall TCO (workload dependent)
Analytic Database
● Self-Service BI & Data: More data of all types is being tapped for analytics, across environments
● Real-Time Analysis: Open up new possibilities for real-time insights as data changes
● Converged Workloads: BI & analytics are critical but only tell part of the story. Get more value by sharing data across workloads
Key Applications
● EDW Optimization: Use your EDW more efficiently by offloading workloads to Hadoop
● Data Preparation: Fast, flexible ETL over large data volumes, so data is always ready for your business
● Self-Service BI & Exploration: Fastest time-to-insights with a modern analytic database designed with Hadoop’s flexibility and agility
Key Benefits
An analytic database designed for Hadoop
High-Performance BI and SQL Analytics
Flexibility for Data and Use Case Variety
Cost-effective Scale for Today and Tomorrow
Go Beyond SQL with an Open Architecture
A Modern Analytic Database for
the Hybrid Cloud
Anatomy of an Analytic Database
Cloudera Decoupled by Design
Monolithic Analytic Database
● Query Engine
● Storage Engine
● Catalog
Modern Analytic Database
● Query Engine (Impala)
● Catalog (HMS)
● Storage (Kudu)
● Storage (S3)
● Storage (HDFS)
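The decoupled layout can be pictured with a small toy model. This is only a sketch of the idea, not Impala's or the Hive Metastore's actual API: the `Catalog` and `QueryEngine` classes, the table names, and the storage URIs are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Catalog:
    """Toy stand-in for a shared metastore: table name -> storage location."""
    tables: dict = field(default_factory=dict)

    def register(self, name: str, location: str) -> None:
        self.tables[name] = location

    def locate(self, name: str) -> str:
        return self.tables[name]


@dataclass
class QueryEngine:
    """Toy stand-in for a compute cluster; it holds no data of its own."""
    name: str
    catalog: Catalog

    def plan_scan(self, table: str) -> str:
        # Resolve the table through the shared catalog, then pick the
        # storage backend from the URI scheme.
        location = self.catalog.locate(table)
        backend = urlparse(location).scheme
        return f"{self.name}: scan {table} from {backend} at {location}"


# One catalog, several storage systems (hypothetical locations).
catalog = Catalog()
catalog.register("clicks", "s3a://example-bucket/warehouse/clicks")
catalog.register("orders", "hdfs://namenode.example/warehouse/orders")

# Two independent compute clusters share the same catalog and data.
bi_cluster = QueryEngine("bi", catalog)
etl_cluster = QueryEngine("etl", catalog)

print(bi_cluster.plan_scan("clicks"))
print(etl_cluster.plan_scan("orders"))
```

Because neither engine owns storage, a cluster can be added or removed without moving data, which is the property the monolithic layout lacks.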
Pain Points: Traditional Monolithic Analytic Databases
Limited to SQL only
• Maintain data copies for non-SQL
Rigid Data Model
• Tightly coupled storage and compute
Static Sizing
• Major maintenance to add capacity/nodes
Poorly Designed for Cloud
• No elasticity or integration with object storage
[Diagram: COMPUTE and STORE]
Benefits of Cloudera’s Modern Approach
Cloud-Native & On-Premise
Go Beyond SQL
• Open Architecture: Open
formats and open storage
• Shared data across SQL and
non-SQL workloads
Data Flexibility
• Faster, more agile data
acquisition
• Data portability: Open
formats and open storage
Cost-Effective Scalability
• Elastic scale on-prem or in
the cloud
• Cloud-native pay-per-use
and transience
• Proven at big data scale
Hybrid
• Runs across multi-cloud &
on-prem
• Multi-storage over S3, HDFS,
Kudu, Isilon, DSSD, etc.
Shared Data
Most Enterprises Are or Will be Heterogeneous
• 76% will embrace hybrid cloud (Gartner1)
• 82% will have a multi-cloud strategy (RightScale2)
• 50% will “repatriate” at least one public cloud workload back to private cloud or
on-prem for cost reasons (451 Research3)
• 50% of Cloudera’s cloud customers run a hybrid environment
1 Gartner, Market Trends: Cloud Adoption Trends Favor Public Cloud With a Hybrid Twist, 2015
2 RightScale 2016 State of the Cloud Report
3 451 Research: AWS Lambda: new and exciting, old and rehashed, more vendor lock-in (or all the above)?, November 22, 2016
Why is this a critical strategy?
Portability & Cost | Functionality | Data Gravity
Cost-Efficiencies & Flexibility in the Cloud
Primary Analytic Database Patterns
ETL: Reduce Operating Costs
Only pay for what you need, when you need it
▪ Transient clusters
▪ Object storage centric
▪ Cloud-native deployment
BI/Analytics: New Insights, New Revenue
Explore and analyze all data, wherever it lives
▪ Long-running clusters
▪ Object storage or local storage
▪ Lift-and-shift deployment
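A rough back-of-the-envelope sketch of why the two patterns are billed differently. Every number here is a made-up assumption (node counts, hours, and the per-node-hour price are hypothetical and not taken from the slides or from any provider's price list), and the function `monthly_compute_cost` is illustrative only.

```python
def monthly_compute_cost(nodes: int, hours_per_day: float,
                         price_per_node_hour: float, days: int = 30) -> float:
    """Compute-only cost of a cluster that is billed while it is running."""
    return nodes * hours_per_day * price_per_node_hour * days


PRICE = 0.50  # hypothetical price per node-hour

# ETL pattern: a transient cluster that exists only during the nightly batch.
etl_transient = monthly_compute_cost(nodes=20, hours_per_day=3,
                                     price_per_node_hour=PRICE)

# The same ETL cluster left running around the clock, for comparison.
etl_always_on = monthly_compute_cost(nodes=20, hours_per_day=24,
                                     price_per_node_hour=PRICE)

# BI/Analytics pattern: a smaller long-running cluster serving queries all day.
bi_long_running = monthly_compute_cost(nodes=10, hours_per_day=24,
                                       price_per_node_hour=PRICE)

print(f"ETL, transient:   {etl_transient:8.2f}")
print(f"ETL, always on:   {etl_always_on:8.2f}")
print(f"BI, long-running: {bi_long_running:8.2f}")
```

Under these assumed figures the transient ETL cluster costs one eighth of the always-on equivalent, simply because it runs 3 of 24 hours; the actual ratio depends entirely on the workload.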
With Additive Business Benefits in the Cloud
Add Use Cases, Analytics, and Data On-Demand
• Avoid the IT backlog with instant
access to all data
• On-demand clusters query directly
on shared object storage
Predictable Results Whenever You Want
• Consistent query performance,
even during peak times
• Multi-tenancy via isolated clusters
on shared data
Just-in-Time Resources
• Real-time capacity for your needs,
as they change
• Elastically grow/shrink your cluster
via decoupled architecture
Contention-Free ETL
• ETL anytime without impacting
other workloads or risking SLAs
• Separate ETL clusters as-needed on
shared data
True Self-Service BI with Seamless Productivity
• Automatic, always-on metadata management for context
and stewardship at scale
• Enterprise-wide consistency with partner integrations
• Visibility into common workload patterns and data usage for
proactive data modeling and optimizations
• Usage-enriched discovery, recommendations, and
intelligent query design assistance
Data Stewards | Database Admins | SQL Developers | Analysts | …and others
Helping Companies with a Global Value
Chain Save Millions
• 360° view of supply chain process in
seconds with data from suppliers,
manufacturing, equipment, field
service, IoT and repair
• Improved product quality by identifying
and addressing supply chain issues in
near-real time
• $15-$25 million savings annually for
Siemens clients
• Costs 90% less per TB than RDBMS; 75%
less per TB than Netezza
CUSTOMER 360
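The "percent less per TB" figures above can be restated as multiples, which some readers find easier to compare. A minimal sketch of that arithmetic, using exact fractions; the helper name `baseline_multiple` is hypothetical, and only the 90% and 75% figures come from the slide.

```python
from fractions import Fraction


def baseline_multiple(percent_less: int) -> Fraction:
    """If option A costs `percent_less`% less than a baseline,
    return how many times A's cost the baseline is."""
    return 1 / (1 - Fraction(percent_less, 100))


# 90% less per TB than RDBMS: the RDBMS costs this many times more per TB.
print(baseline_multiple(90))

# 75% less per TB than Netezza: Netezza costs this many times more per TB.
print(baseline_multiple(75))
```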
Providing a complete view of consumer
watching and buying habits
• Helps customers optimize their ad
spend for greater campaign ROI
• Improves processing performance as
data volumes double
• Boosts agility and flexibility and
reduces risk with hybrid and
multi-cloud strategy
CUSTOMER 360
Measure user interaction across the
ecosystem, help direct R&D and
development spend
• Virtuous cycle: Identify features that
facilitate sharing of content that drive
new customers
• Real-time streaming and batch data
from product logs, web analytics,
channel data and ERP
• Impala connects to third-party data
wrangling and BI tools for fast reporting
Cloudera’s Analytic Database
● Navigator Optimizer: Identify, offload, & optimize workloads to Hadoop
● Hue: Intelligent SQL editor
● Navigator: Audit, lineage, encryption, key management, & policy lifecycles
● BI Partners: Integration with the leading BI tools
● Impala: Interactive query engine for BI & SQL analytics
● Hive-on-Spark: Large-scale ETL & batch processing engine
Multi-Storage, Multi-Environment
● Kudu: Data Storage for Fast & Changing Data
One Enterprise Data Hub for Multiple Workloads
● Data Science & Engineering: Process data, develop and serve predictive models.
● Operational Database: Data-driven applications to deliver real-time insights.
● Analytic Database: Explore, analyze, and understand all your data.
Transform Your Business
DATA ENGINEERING | ANALYTIC DATABASE | OPERATIONAL DATABASE
MODERNIZE ARCHITECTURE
DRIVE CUSTOMER INSIGHTS | IMPROVE PRODUCTS & SERVICES EFFICIENCY | LOWER BUSINESS RISK
Learn More at Cloudera’s Booth #421
Check out a live demo of Cloudera’s Analytic Database
powering BI in the cloud
Thank You!

Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics to a Hybrid Cloud World
