© Cloudera, Inc. All rights reserved. 1
MODERN DATA WAREHOUSE
FUNDAMENTALS
Part III: Healthy in Production: Optimizing, Managing, and Troubleshooting
December, 2018
SPEAKERS
Raman Rajasekhar
Product Manager
rr@cloudera.com
David Dichmann
Director, Product Marketing
ddichmann@cloudera.com
TYPICAL MODERN DATA WAREHOUSE FOOTPRINTS
- LARGE BANK (FRAUD PREVENTION): Terabytes, Users, Databases, Queries / Month
- GLOBAL PHARMA (NEW PRODUCT DEVELOPMENT): Use Cases, Users, Fewer Silos, Diverse Data
- MAJOR TELCO (BUSINESS OPTIMIZATION): Query Responses, New Sources, Min. Data Sets, Users
Managing all this requires comprehensive
Workload Management
Workload Management is about proactively
assisting and de-risking every phase of the big
data application lifecycle, supporting all
stakeholders involved
Cloudera aims to empower our customers by delivering
self-service, intelligent, workload-centric software:
Workload XM
PERSONAS: WHO USES THE MODERN DATA WAREHOUSE?
- Hadoop / System / Database Admin
- Data Architects
- Data Engineering / BI Developer
- Data Consumers
- Central Support Team
- System Integrator
KEEPING YOUR DATA WAREHOUSE HEALTHY
- Self-service analysis
- Assist workload migration
- Visibility into clusters
- Baseline, performance tune & troubleshoot Impala/Hive/Spark
- Deep health checks
- Identify rogue applications
How do I share petabytes of verified data across thousands of
users with varied skill-sets while maintaining SLAs and cost?
- Fix data health issues like small files
- Better metadata handoff between ETL & BI
- Enhanced visibility within and across clusters
- Proactive de-risking through early detection & elimination of bottlenecks
- Reduced time-to-production through quick troubleshooting & tuning
- Self-service analytics through prescriptive recommendations
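As a rough illustration of the "small files" data health issue named above, the sketch below flags directories where most files fall well under a typical HDFS block size. It is not how Workload XM detects the problem; the function name, the thresholds, and the sample listing are all hypothetical, and the input is assumed to be a plain list of (path, size in bytes) pairs obtained from whatever file listing is available.

```python
from collections import defaultdict
from dataclasses import dataclass
from pathlib import PurePosixPath

# Assumed cutoff: files far below the common 128 MiB HDFS block size count as "small".
SMALL_FILE_BYTES = 16 * 1024 * 1024


@dataclass
class DirReport:
    directory: str
    total_files: int
    small_files: int


def find_small_file_dirs(listing, min_files=3, min_ratio=0.5,
                         threshold=SMALL_FILE_BYTES):
    """Return directories where at least `min_ratio` of files are below `threshold`.

    `listing` is an iterable of (path, size_in_bytes) pairs (an assumed input format).
    Directories with fewer than `min_files` files are ignored.
    """
    totals = defaultdict(int)
    smalls = defaultdict(int)
    for path, size in listing:
        parent = str(PurePosixPath(path).parent)
        totals[parent] += 1
        if size < threshold:
            smalls[parent] += 1

    reports = [
        DirReport(d, totals[d], smalls[d])
        for d in totals
        if totals[d] >= min_files and smalls[d] / totals[d] >= min_ratio
    ]
    # Worst offenders (most small files) first.
    return sorted(reports, key=lambda r: r.small_files, reverse=True)


# Hypothetical listing for illustration only.
listing = [
    ("/warehouse/sales/part-0000.parquet", 2_000_000),
    ("/warehouse/sales/part-0001.parquet", 1_500_000),
    ("/warehouse/sales/part-0002.parquet", 900_000),
    ("/warehouse/sales/part-0003.parquet", 300_000_000),
    ("/warehouse/customers/part-0000.parquet", 256_000_000),
]

flagged = find_small_file_dirs(listing)
for report in flagged:
    print(f"{report.directory}: {report.small_files}/{report.total_files} "
          f"files below threshold")
```

Directories flagged this way are candidates for compaction; the cutoff and ratio would need tuning per cluster and file format.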
CLOUDERA ENTERPRISE
The modern platform for machine learning and analytics optimized for the cloud
- Workloads: DATA WAREHOUSE, DATA SCIENCE, OPERATIONAL DATABASE, DATA ENGINEERING, EXTENSIBLE SERVICES
- Core Services: SECURITY, GOVERNANCE, WORKLOAD MANAGEMENT, INGEST & REPLICATION, DATA CATALOG
- Storage Services: Amazon S3, Microsoft ADLS, HDFS, KUDU
DEMO Video
CLOUDERA WORKLOAD XM
Proactively optimize Workloads, Application Performance, and Infrastructure Capacity for
Data Warehousing, Data Engineering, and Machine Learning Environments
Migrate Analyze Optimize Manage
MIGRATE
Comprehensive Risk Assessment
Easy Onboarding of New Applications
Reduced Time to Production
ANALYZE
Monitor Application Health
Self-service Analytics
Identify Anti-patterns & Bad
Hardware
OPTIMIZE
Achieve Performance Predictability
Prescriptive Tuning & Workload
Balancing
Resource-to-Cost Optimizer
MANAGE
Define, Monitor & Manage SLAs
Proactively Help Eliminate Bottlenecks
Assist in Capacity Planning
WHY WORKLOAD XM?
- No Installation, Restarts, Downtimes
- No Agents or Sensors deployed on cluster
- Instant availability of new features and enhancements
- Extensive Knowledge base on platform usage from customers across the globe
- Non-intrusive, secure, and stateless metrics collection
- SDX Advantage, tighter integration into the Cloudera ecosystem & enhanced UX
- No one understands our platform better!
WORKLOAD MANAGEMENT - WHAT’S COMING?
- Intelligent Workloads
- Prescriptive Recommendations
- Actionable Capabilities
THE WORKLOAD XM ADVANTAGE
- Cluster Utilization - Increased ROI on infrastructure
- Time to Detect Issues & Bottlenecks - Saving Time & Costs
- Faster RCA & Troubleshooting - Reducing expensive Overheads & Costs
- Intrinsic Value of Node
- Improved SLA Adherence - Equates to XXX Revenue
- Minimal Additional Resource Consumption
THE MODERN DATA WAREHOUSE
Deeper Business Insights at Extreme Speed and Scale While Managing Cost
- DEEPER business insights
- EXTREME speed & scale
- CONTROLLED resources & costs
THANK YOU
https://www.cloudera.com/products/data-warehouse.html
