SlideShare a Scribd company logo
CONSOLIDATE YOUR DATA MARTS FOR
FAST, FLEXIBLE ANALYTICS
Alex Gutow | Cloudera
Josh Klahr | AtScale
2 © Cloudera, Inc. All rights reserved.
TRENDS WITHIN DATA WAREHOUSING
Expand to
Do More
Nearly 40% want greater capacity for
growing data, users, reports,
analyses, etc1
Improve
Responsiveness
67% of users are requesting to do
more BI/analytics on their own2
Balance Business
Critical & Exploration
42% looking to augment EDW with
modern platform1
1Data Warehouse Modernization In the Age of Big Data Analytics, TDWI, 2016
2Achieving Greater Agility with Business Intelligence, TDWI, 2014
3 © Cloudera, Inc. All rights reserved.
FIRST GENERATION
EDWETL
4 © Cloudera, Inc. All rights reserved.
THE RISE OF NEW TOOLS
EDWET
L
ET
L
Analytic
Appliances
Curated Data
Sets/OLAP
Cubes
5 © Cloudera, Inc. All rights reserved.
POLL QUESTION #1
What analytic appliances or data marts does your organization use? (Select all that apply)
• Netezza
• Vertica
• AWS Redshift
• Snowflake
• Exadata
• BI Tool Extracts (Tableau, Qlik, etc)
• EDW
• None/Unknown
• Other
6 © Cloudera, Inc. All rights reserved.
RISE OF THE DATA LAKE
EDW
ET
L
Analytic
Appliances
Curated Data
Sets/OLAP
Cubes
Data Hub
/Data Lake
7 © Cloudera, Inc. All rights reserved.
LIMITATIONS OF EXISTING INFRASTRUCTURE
• Not able to take on more reports,
use cases, users, etc.
• Constrained exploration to prevent
risking critical SLAs
• Proliferation of data silos to address
additional workloads
• Maintain data copies causes
inefficiencies for storage,
processing, and people
• Need to contain costs for
existing workloads
• Difficult to justify budget and
maintenance for expansion
• Struggle to do more with less
• Designed for curated reports, not
iterative, self-service analytics
• Not built for elasticity or object
store integration
Compute
Store
8 © Cloudera, Inc. All rights reserved.
POLL QUESTION #2
What pain points or limitations are you facing with your data mart(s)? (Select all that apply)
• Cannot meet internal SLA for delivering data
• Struggle to contain costs
• Long wait times for new data
• Limited data for querying
• Data exists across silos
• Cannot connect BI tools
• Migrating to cloud
• Other
9 © Cloudera, Inc. All rights reserved.
DATA WAREHOUSING OPTIMIZATION & CONSOLIDATION
CLOUDERA DATA WAREHOUSE
ATSCALE UNIVERSAL SEMANTIC
LAYER
ENTERPRISE DATA
WAREHOUSE
DATA MART(S)
OPTIMIZE
THROUGH
OFFLOAD
COMPLETE
MIGRATION
10 © Cloudera, Inc. All rights reserved.
ADVANTAGES OF MODERN DATA WAREHOUSING
Data Flexibility
• Iterative modeling and
self-service accessibility
• Portability: No proprietary formats
or storage lock-in
Go Beyond SQL
• Consolidate data silos with
an open architecture
• Shared data across SQL
and non-SQL workloads
High-Performance SQL +
Cost-Effective Scalability
• Elastic scale in any environment
• Cloud-native integration for
optimized pay-per-use costs
• Proven at massive scale
Hybrid Decoupled Architecture
• Runs across multi-cloud & on-
prem
for zero lock-in
• Multi-
storage over S3, ADLS, HDFS,
Kudu, Isilon, etc
Shared Data
11 © Cloudera, Inc. All rights reserved.
MODERNIZING FOR THE DATA AGE
Fixed
Reports
DATA SOURCES MODERN DATA WAREHOUSE
Flexible
Reporting
Advanced
Analytics
Self-Service
BI/Ad Hoc
Dashboards/
Analytic Apps
EDW
UNIVERSAL SEMANTIC LAYER
12 © Cloudera, Inc. All rights reserved.
MORE VALUE AT A LOWER COST/TB
AverageCostperTB
Observed 80%
decrease
in TCO
Note: TCO calculations and migration opportunity is dependent on your environment
and types of workloads. Please work with Cloudera to calculate potential savings for
your environment
© Cloudera, Inc. All rights reserved.
PROOF OF PERFORMANCE & OLAP FUNCTIONALITY
14 © Cloudera, Inc. All rights reserved.
IMPALA OUTPERFORMS ANALYTIC DATABASES
• Impala outperforms on
both single and multi-
user tests at 10TB
• Impala lead expands
with concurrency
• Other SQL on Hadoop
engines failed at 10TB
Streams Impala QpH MPP DB QpH Impala Times Better
2 41 20 2.05x
4 75 20 3.75x
8 133 16 8.31x
Metric Impala MPP DB Impala Times Better
Total seconds 11,898 21,093 1.77x
Geometric mean 33 92 2.79x
Single-user Test – 10TB
Multi-user Test – 10TB
15 © Cloudera, Inc. All rights reserved.
TPC-DS 1TB: SINGLE-USER
• The analytical db cohort
(Impala / MPP) leads
SQL on Hadoop
• Impala outperforms
• Presto by 8.3x
• Hive w/ LLAP by 4x
• Spark SQL by 2.8x Impala Greenplum Spark SQL
Hive with
LLAP
Presto
Total 100% 97% 283% 405% 826%
Geomean 100% 250% 367% 450% 1317%
0%
200%
400%
600%
800%
1000%
1200%
1400%
RelativePerformance
TPC-DS 1TB Single-user (Lower is Better)
16 © Cloudera, Inc. All rights reserved.
TPC-DS 1TB: MULTI-USER
• At 1TB the analytic db
cohort (Impala / MPP)
expands lead with
concurrency
• With 16 streams Impala
outperforms
• Spark by ~22x
• Hive by ~20x
• Greenplum by ~3x
4 streams 8 streams 16 streams
Impala 495 865 1,315
Greenplum 381 417 462
Spark SQL 76 70 61
Hive with LLAP 58 62 66
0
200
400
600
800
1000
1200
1400
QueriesperHour
Multi-user TPC-DS 1TB Queries/Hour
(Higher is Better)
20x
17 © Cloudera, Inc. All rights reserved.
COST-EFFECTIVE PERFORMANCE
With Cloudera + AtScale
Cloudera + AtScale can meet or
exceed the performance needs of
traditional database users
• Impala provides leading analytic database
performance at high user concurrency
• AtScale’s Intelligence Platform improves
query performance
• Dimensional Calc Engine allows complex
OLAP-style calculations to run directly on
cluster
Fortune 50 Insurance Company
Use Case Netezza
AtScale +
Cloudera Speedup Factor
Metric Trends 4.0 0.3 15.6 times faster
Customer Growth 6.8 0.3 20.6 times faster
Segment Benchmark 2.4 0.9 2.7 times faster
Time Series 5.5 1.1 4.9 times faster
Transaction Dashboard 5.5 0.4 12.7 times faster
Top Customers 5.4 0.9 6.1 times faster
Product Heatmap 5.5 0.2 34.8 times faster
Top Products 5.5 1.9 2.9 times faster
Top Suppliers 5.5 0.8 6.9 times faster
18 © Cloudera, Inc. All rights reserved.
BI ON BIG DATA IS ABOUT MORE THAN JUST PERFORMANCE
AtScale adds OLAP to Cloudera’s Analytic DB
Benefits of AtScale + Impala
• OLAP is the language of BI
■ Provides business users access to functions like: time-
intelligence, drill-down, and slicing and dicing
• AtScale is optimized for Impala
■ Provides intelligent queries, optimized data structures
19 © Cloudera, Inc. All rights reserved.
DATA SCIENTISTS AND BUSINESS USE THE SAME DATA
Curate the data once - deliver self-service BI and Data Science
Benefits the AtScale Universal Semantic Layer
• Keeps the data in the Cloudera cluster
• Avoids extractions or data marts
• Simplifies and modernizes your data architecture
• Allows any BI tool to access the data, including Excel, PowerBI, Tableau,
etc.
20 © Cloudera, Inc. All rights reserved.
POLL QUESTION #3
Which BI tools are you using? (Select all that apply)
• Tableau
• SAP Business Objects
• Excel
• PowerBI
• Qlik
• Spotfire
• Microstrategy
• Zoomdata
• Cognos
• Other
21 © Cloudera, Inc. All rights reserved.
ATSCALE PRESENTS DATA FOR BI USERS, NOT DATA ENGINEERS
22 © Cloudera, Inc. All rights reserved.
DASHBOARDS DEMAND CONCURRENCY, THROUGHPUT
© Cloudera, Inc. All rights reserved.
JOINT CUSTOMER SUCCESS
24 © Cloudera, Inc. All rights reserved.
TOYOTA
Self-Service BI for Finance & IoT
Cloudera + AtScale
• Deliver sub-second queries
• Supports ad hoc Tableau and
PowerBI
• Data science embedded with
business units and goals
• Enable and protect enterprise
data consumption
Faster time-to-insights
Self-service users
Databases migrated
Self-delivered dashboards
25 © Cloudera, Inc. All rights reserved.
GLOBAL PHARMACEUTICAL
R&D Information Platform
Cloudera + AtScale
• Consistent, shared data access
through BI tools (Tableau, Tibco)
• Interactive query speeds
• OLAP capabilities
• HIPAA compliant
Time-to-insights in minutes, not years:
• Reduced cost and time to identify
clinical trial groups
• Accelerate new drug development
• 1st time metrics and monitoring on
compliance data
structured + unstructured
100% unstructured data
captured
Analytic Users
70% Execs/Managers
25% Analysts
5% Data Scientists
Reduction in silos
Use cases
© Cloudera, Inc. All rights reserved.
THE OFFLOAD PROCESS
27 © Cloudera, Inc. All rights reserved.
WHERE DO YOU START?
12 am 12 pm 12 am
1.6M
1.2M
800K
400K
6 am 6 pm
Query volume can be huge
• How do you determine what workloads to run
on Cloudera’s platform?
• Will the queries run efficiently?
• What does it take to migrate?
• How do you prioritize?
Numerous databases, thousands of tables,
many users and applications
Queries can be very complex
28 © Cloudera, Inc. All rights reserved.
DATA WAREHOUSE OPTIMIZATION
Tools & Framework to help you through the process
Plan Offload Optimize
Estimate Effort
Risk Analysis
Schema Design
Test & Validate
Evaluate
Identify Use Cases Impact Analysis
Set Objectives
Prioritized Plan
Initial POC
Offload each workloadEvaluate the need and scope
Impact analysis, prioritized
plan
Optimize for performance
Identify Suitable
Workloads
Offload Actions
Capacity Planning
Fine Tuning Data
Model on Hadoop
Optimize Queries for
Performance
Validate ROI, Cost
29 © Cloudera, Inc. All rights reserved.
WHERE TO LEARN MORE
Learn more how GSK and Toyota
brought BI to the Data Lake:
http://info.atscale.com/webinar_do_dont_bi_on_
data_lake_2018-0
Get the report for how to offload BI
workloads:
https://www.cloudera.com/content/dam/www/ma
rketing/resources/analyst-reports/efficiently-
offloading-and-optimizing-bi-workloads-to-
hadoop-with-cloudera-navigator-
optimizer.pdf.landing.html
Contact us or check out:
THANK YOU

More Related Content

What's hot

Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

Cloudera, Inc.
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
Cloudera, Inc.
 
How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...
Cloudera, Inc.
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
Cloudera, Inc.
 
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
Cloudera, Inc.
 
PaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with AltusPaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with Altus
Cloudera, Inc.
 
Cloudera SDX
Cloudera SDXCloudera SDX
Cloudera SDX
Cloudera, Inc.
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Cloudera, Inc.
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
Cloudera, Inc.
 
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
Cloudera, Inc.
 
Cloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for Analytics
Cloudera, Inc.
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

Cloudera, Inc.
 
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
 
Customer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWSCustomer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWS
Cloudera, Inc.
 
Self-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureSelf-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft Azure
Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Cloudera, Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
Cloudera, Inc.
 
The Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in ChurnThe Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in Churn
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
Cloudera, Inc.
 

What's hot (20)

Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
 
How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...How komatsu is driving operational efficiencies using io t and machine learni...
How komatsu is driving operational efficiencies using io t and machine learni...
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
 
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
What’s New in Cloudera Enterprise 6.0: The Inside Scoop 6.14.18
 
PaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with AltusPaaS or Fail: Rule the Cloud with Altus
PaaS or Fail: Rule the Cloud with Altus
 
Cloudera SDX
Cloudera SDXCloudera SDX
Cloudera SDX
 
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the CloudData Engineering: Elastic, Low-Cost Data Processing in the Cloud
Data Engineering: Elastic, Low-Cost Data Processing in the Cloud
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science Workbench
 
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
 
Cloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for AnalyticsCloudera - The Modern Platform for Analytics
Cloudera - The Modern Platform for Analytics
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

 
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Customer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWSCustomer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWS
 
Self-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureSelf-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft Azure
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
The Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in ChurnThe Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in Churn
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 

Similar to Consolidate your data marts for fast, flexible analytics 5.24.18

Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Cloudera, Inc.
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Cloudera, Inc.
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator Optimizer
Cloudera, Inc.
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Stefan Lipp
 
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
Cloudera, Inc.
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse OptimizationCloudera, Inc.
 
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Cloudera, Inc.
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsOracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analytics
jdijcks
 
The Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameThe Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the Same
Cloudera, Inc.
 
Unlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQLUnlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQL
Matt Lord
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
Cloudera, Inc.
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
Kent Graziano
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
Cloudera, Inc.
 
BigDataBx #1 - Atelier 1 Cloudera Datawarehouse Optimisation
BigDataBx #1 - Atelier 1 Cloudera Datawarehouse OptimisationBigDataBx #1 - Atelier 1 Cloudera Datawarehouse Optimisation
BigDataBx #1 - Atelier 1 Cloudera Datawarehouse Optimisation
Excelerate Systems
 
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and Manufacturing
Cloudera, Inc.
 
Large-Scale Data Science on Hadoop (Intel Big Data Day)
Large-Scale Data Science on Hadoop (Intel Big Data Day)Large-Scale Data Science on Hadoop (Intel Big Data Day)
Large-Scale Data Science on Hadoop (Intel Big Data Day)
Uri Laserson
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Cloudera, Inc.
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
jdijcks
 
Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic
IntelAPAC
 

Similar to Consolidate your data marts for fast, flexible analytics 5.24.18 (20)

Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator Optimizer
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
 
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
 
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsOracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analytics
 
The Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the SameThe Future of Data Warehousing: ETL Will Never be the Same
The Future of Data Warehousing: ETL Will Never be the Same
 
Unlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQLUnlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQL
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
 
BigDataBx #1 - Atelier 1 Cloudera Datawarehouse Optimisation
BigDataBx #1 - Atelier 1 Cloudera Datawarehouse OptimisationBigDataBx #1 - Atelier 1 Cloudera Datawarehouse Optimisation
BigDataBx #1 - Atelier 1 Cloudera Datawarehouse Optimisation
 
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and Manufacturing
 
Large-Scale Data Science on Hadoop (Intel Big Data Day)
Large-Scale Data Science on Hadoop (Intel Big Data Day)Large-Scale Data Science on Hadoop (Intel Big Data Day)
Large-Scale Data Science on Hadoop (Intel Big Data Day)
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
 
Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Cloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
Cloudera, Inc.
 
Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18
Cloudera, Inc.
 
Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18
Cloudera, Inc.
 
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloudera, Inc.
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR compliance
Cloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 
Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18
 
Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18
 
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR compliance
 

Recently uploaded

RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 

Recently uploaded (20)

RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 

Consolidate your data marts for fast, flexible analytics 5.24.18

  • 1. CONSOLIDATE YOUR DATA MARTS FOR FAST, FLEXIBLE ANALYTICS Alex Gutow | Cloudera Josh Klahr | AtScale
  • 2. 2 © Cloudera, Inc. All rights reserved. TRENDS WITHIN DATA WAREHOUSING Expand to Do More Nearly 40% want greater capacity for growing data, users, reports, analyses, etc1 Improve Responsiveness 67% of users are requesting to do more BI/analytics on their own2 Balance Business Critical & Exploration 42% looking to augment EDW with modern platform1 1Data Warehouse Modernization In the Age of Big Data Analytics, TDWI, 2016 2Achieving Greater Agility with Business Intelligence, TDWI, 2014
  • 3. 3 © Cloudera, Inc. All rights reserved. FIRST GENERATION EDWETL
  • 4. 4 © Cloudera, Inc. All rights reserved. THE RISE OF NEW TOOLS EDWET L ET L Analytic Appliances Curated Data Sets/OLAP Cubes
  • 5. 5 © Cloudera, Inc. All rights reserved. POLL QUESTION #1 What analytic appliances or data marts does your organization use? (Select all that apply) • Netezza • Vertica • AWS Redshift • Snowflake • Exadata • BI Tool Extracts (Tableau, Qlik, etc) • EDW • None/Unknown • Other
  • 6. 6 © Cloudera, Inc. All rights reserved. RISE OF THE DATA LAKE EDW ET L Analytic Appliances Curated Data Sets/OLAP Cubes Data Hub /Data Lake
  • 7. 7 © Cloudera, Inc. All rights reserved. LIMITATIONS OF EXISTING INFRASTRUCTURE • Not able to take on more reports, use cases, users, etc. • Constrained exploration to prevent risking critical SLAs • Proliferation of data silos to address additional workloads • Maintain data copies causes inefficiencies for storage, processing, and people • Need to contain costs for existing workloads • Difficult to justify budget and maintenance for expansion • Struggle to do more with less • Designed for curated reports, not iterative, self-service analytics • Not built for elasticity or object store integration Compute Store
  • 8. 8 © Cloudera, Inc. All rights reserved. POLL QUESTION #2 What pain points or limitations are you facing with your data mart(s)? (Select all that apply) • Cannot meet internal SLA for delivering data • Struggle to contain costs • Long wait times for new data • Limited data for querying • Data exists across silos • Cannot connect BI tools • Migrating to cloud • Other
  • 9. 9 © Cloudera, Inc. All rights reserved. DATA WAREHOUSING OPTIMIZATION & CONSOLIDATION CLOUDERA DATA WAREHOUSE ATSCALE UNIVERSAL SEMANTIC LAYER ENTERPRISE DATA WAREHOUSE DATA MART(S) OPTIMIZE THROUGH OFFLOAD COMPLETE MIGRATION
  • 10. 10 © Cloudera, Inc. All rights reserved. ADVANTAGES OF MODERN DATA WAREHOUSING Data Flexibility • Iterative modeling and self-service accessibility • Portability: No proprietary formats or storage lock-in Go Beyond SQL • Consolidate data silos with an open architecture • Shared data across SQL and non-SQL workloads High-Performance SQL + Cost-Effective Scalability • Elastic scale in any environment • Cloud-native integration for optimized pay-per-use costs • Proven at massive scale Hybrid Decoupled Architecture • Runs across multi-cloud & on- prem for zero lock-in • Multi- storage over S3, ADLS, HDFS, Kudu, Isilon, etc Shared Data
  • 11. 11 © Cloudera, Inc. All rights reserved. MODERNIZING FOR THE DATA AGE Fixed Reports DATA SOURCES MODERN DATA WAREHOUSE Flexible Reporting Advanced Analytics Self-Service BI/Ad Hoc Dashboards/ Analytic Apps EDW UNIVERSAL SEMANTIC LAYER
  • 12. 12 © Cloudera, Inc. All rights reserved. MORE VALUE AT A LOWER COST/TB AverageCostperTB Observed 80% decrease in TCO Note: TCO calculations and migration opportunity is dependent on your environment and types of workloads. Please work with Cloudera to calculate potential savings for your environment
  • 13. © Cloudera, Inc. All rights reserved. PROOF OF PERFORMANCE & OLAP FUNCTIONALITY
  • 14. 14 © Cloudera, Inc. All rights reserved. IMPALA OUTPERFORMS ANALYTIC DATABASES • Impala outperforms on both single and multi- user tests at 10TB • Impala lead expands with concurrency • Other SQL on Hadoop engines failed at 10TB Streams Impala QpH MPP DB QpH Impala Times Better 2 41 20 2.05x 4 75 20 3.75x 8 133 16 8.31x Metric Impala MPP DB Impala Times Better Total seconds 11,898 21,093 1.77x Geometric mean 33 92 2.79x Single-user Test – 10TB Multi-user Test – 10TB
  • 15. 15 © Cloudera, Inc. All rights reserved. TPC-DS 1TB: SINGLE-USER • The analytical db cohort (Impala / MPP) leads SQL on Hadoop • Impala outperforms • Presto by 8.3x • Hive w/ LLAP by 4x • Spark SQL by 2.8x Impala Greenplum Spark SQL Hive with LLAP Presto Total 100% 97% 283% 405% 826% Geomean 100% 250% 367% 450% 1317% 0% 200% 400% 600% 800% 1000% 1200% 1400% RelativePerformance TPC-DS 1TB Single-user (Lower is Better)
  • 16. 16 © Cloudera, Inc. All rights reserved. TPC-DS 1TB: MULTI-USER • At 1TB the analytic db cohort (Impala / MPP) expands lead with concurrency • With 16 streams Impala outperforms • Spark by ~22x • Hive by ~20x • Greenplum by ~3x 4 streams 8 streams 16 streams Impala 495 865 1,315 Greenplum 381 417 462 Spark SQL 76 70 61 Hive with LLAP 58 62 66 0 200 400 600 800 1000 1200 1400 QueriesperHour Multi-user TPC-DS 1TB Queries/Hour (Higher is Better) 20x
  • 17. 17 © Cloudera, Inc. All rights reserved. COST-EFFECTIVE PERFORMANCE With Cloudera + AtScale Cloudera + AtScale can meet or exceed the performance needs of traditional database users • Impala provides leading analytic database performance at high user concurrency • AtScale’s Intelligence Platform improves query performance • Dimensional Calc Engine allows complex OLAP-style calculations to run directly on cluster Fortune 50 Insurance Company Use Case Netezza AtScale + Cloudera Speedup Factor Metric Trends 4.0 0.3 15.6 times faster Customer Growth 6.8 0.3 20.6 times faster Segment Benchmark 2.4 0.9 2.7 times faster Time Series 5.5 1.1 4.9 times faster Transaction Dashboard 5.5 0.4 12.7 times faster Top Customers 5.4 0.9 6.1 times faster Product Heatmap 5.5 0.2 34.8 times faster Top Products 5.5 1.9 2.9 times faster Top Suppliers 5.5 0.8 6.9 times faster
  • 18. 18 © Cloudera, Inc. All rights reserved. BI ON BIG DATA IS ABOUT MORE THAN JUST PERFORMANCE AtScale adds OLAP to Cloudera’s Analytic DB Benefits of AtScale + Impala • OLAP is the language of BI ■ Provides business users access to functions like: time- intelligence, drill-down, and slicing and dicing • AtScale is optimized for Impala ■ Provides intelligent queries, optimized data structures
  • 19. 19 © Cloudera, Inc. All rights reserved. DATA SCIENTISTS AND BUSINESS USE THE SAME DATA Curate the data once - deliver self-service BI and Data Science Benefits the AtScale Universal Semantic Layer • Keeps the data in the Cloudera cluster • Avoids extractions or data marts • Simplifies and modernizes your data architecture • Allows any BI tool to access the data, including Excel, PowerBI, Tableau, etc.
  • 20. 20 © Cloudera, Inc. All rights reserved. POLL QUESTION #3 Which BI tools are you using? (Select all that apply) • Tableau • SAP Business Objects • Excel • PowerBI • Qlik • Spotfire • Microstrategy • Zoomdata • Cognos • Other
  • 21. 21 © Cloudera, Inc. All rights reserved. ATSCALE PRESENTS DATA FOR BI USERS, NOT DATA ENGINEERS
  • 22. 22 © Cloudera, Inc. All rights reserved. DASHBOARDS DEMAND CONCURRENCY, THROUGHPUT
  • 23. © Cloudera, Inc. All rights reserved. JOINT CUSTOMER SUCCESS
  • 24. 24 © Cloudera, Inc. All rights reserved. TOYOTA Self-Service BI for Finance & IoT Cloudera + AtScale • Deliver sub-second queries • Supports ad hoc Tableau and PowerBI • Data science embedded with business units and goals • Enable and protect enterprise data consumption Faster time-to-insights Self-service users Databases migrated Self-delivered dashboards
  • 25. 25 © Cloudera, Inc. All rights reserved. GLOBAL PHARMACEUTICAL R&D Information Platform Cloudera + AtScale • Consistent, shared data access through BI tools (Tableau, Tibco) • Interactive query speeds • OLAP capabilities • HIPAA compliant Time-to-insights in minutes, not years: • Reduced cost and time to identify clinical trial groups • Accelerate new drug development • 1st time metrics and monitoring on compliance data structured + unstructured 100% unstructured data captured Analytic Users 70% Execs/Managers 25% Analysts 5% Data Scientists Reduction in silos Use cases
  • 26. © Cloudera, Inc. All rights reserved. THE OFFLOAD PROCESS
  • 27. 27 © Cloudera, Inc. All rights reserved. WHERE DO YOU START? 12 am 12 pm 12 am 1.6M 1.2M 800K 400K 6 am 6 pm Query volume can be huge • How do you determine what workloads to run on Cloudera’s platform? • Will the queries run efficiently? • What does it take to migrate? • How do you prioritize? Numerous databases, thousands of tables, many users and applications Queries can be very complex
  • 28. 28 © Cloudera, Inc. All rights reserved. DATA WAREHOUSE OPTIMIZATION Tools & Framework to help you through the process Plan Offload Optimize Estimate Effort Risk Analysis Schema Design Test & Validate Evaluate Identify Use Cases Impact Analysis Set Objectives Prioritized Plan Initial POC Offload each workloadEvaluate the need and scope Impact analysis, prioritized plan Optimize for performance Identify Suitable Workloads Offload Actions Capacity Planning Fine Tuning Data Model on Hadoop Optimize Queries for Performance Validate ROI, Cost
  • 29. 29 © Cloudera, Inc. All rights reserved. WHERE TO LEARN MORE Learn more how GSK and Toyota brought BI to the Data Lake: http://info.atscale.com/webinar_do_dont_bi_on_ data_lake_2018-0 Get the report for how to offload BI workloads: https://www.cloudera.com/content/dam/www/ma rketing/resources/analyst-reports/efficiently- offloading-and-optimizing-bi-workloads-to- hadoop-with-cloudera-navigator- optimizer.pdf.landing.html Contact us or check out: