1© Cloudera, Inc. All rights reserved.
Better Analytics in Less Time
Enabling Data Discovery & Analytics with Cloudera
Trends in the Market
Too Much Data: Only 12% of enterprise data is leveraged for analytics. (Source: Forrester Research)
Early Returns: Analytics returns $13.01 for every dollar invested. (Source: Nucleus Research)
Strategic Direction: 80% of CEOs said analytics is a strategic objective for 2015. (Source: PWC CEO Survey)
Trends in the Market
Data Discovery is the fastest-growing workload for enterprise analytics.
Data Discovery & Analytics (DD&A) Overview
The ability to find enterprise data, quickly uncover new insights, and optimize existing analytics.
(AKA: Self-Service BI, BI, Data Discovery, Advanced Analytics, Machine Learning)
Data Discovery & Analytics (DD&A) Overview
Self-Service BI: Report and dashboard building for business user consumption (Tableau, Oracle)
Advanced Analytics: Statistical and machine learning analyses using advanced analytic tools (SAS, Matlab)
Data Science: Command line access for custom analytic development (Spark, Python)
Objectives of Data Discovery & Analytics
Report, Model, Rule
Marketing analysis, Log analysis, Churn analysis, Product recommendation, Predictive support, Trade recommendation, Ad targeting, Transaction classification, Lead scoring
It’s an Iterative Process
[Diagram: Data Discovery & Analytics Flow, a cycle through Access Data, Analysis Technique, Generation, and Implement]
Access Data (80% of Time): Ingest, Transformation; Diverse Ingest, Search and lineage, Agile Transforms
Analysis Technique (20% of Time): SQL, Statistical, Machine Learning
Generation: Report, Model, or Rules
Implement: Point Solution, Custom App
A Traditional Architecture
[Diagram: Traditional Architecture. Labels: Data Sources (Structured, Unstructured); Ingest; ETL; Store & Process (Storage #1, 2, N; Archive; Enterprise Data Warehouse, EDW); ELT; Serve; Access Data; Search; Analyze Data (SQL, Statistical, Machine Learning); Optimize; Implement (Custom Application, Point Solution). Banner labels: Access Data, Experiment Fast, Analyze Data.]
Challenges with Traditional Architectures
1) Limited Data
2) Long Time to Value
3) Compliance & Privacy Concerns
[Diagram: the Traditional Architecture diagram repeated, annotated with callouts 1, 2, and 3 for the challenges listed above.]
The New Way Forward
1) Unlimited Data Access
2) Reduce Time to Value
3) Secure and Compliant
[Diagram: Modern Architecture. Labels: Data Sources (Structured, Unstructured); Ingest; Cloudera EDH (Active, Archive); Store & Process; ETL; ELT; Load; Enterprise Data Warehouse (EDW); Serve; Access Data; Search; Analyze Data (SQL, Statistical, Machine Learning); Optimize; Implement (Custom Application, Point Solution). Callouts 1, 2, and 3 mark the improvements listed above.]
Concur Customer Story
About Concur
Who is Concur:
Leading provider of spend management solutions and services (Travel, Invoice, TripIt, etc.) in the world
What we do:
Global customer base of 20,000 clients and 25 million users
Processing more than $50 billion in Travel & Expense (T&E) spend each year
Our Objectives
Deeper BI Reports
Starbucks Store #3313
601 108th Ave NE
Bellevue, WA (425) 646-9602
-------------------------------
Chk 713452
05/14/2014 11:04 AM
1961558 Drawer: 1 Reg: 1
-------------------------------
Bacon Art Brkfst 3.45
Warmed
T1 Latte 2.70
Triple 1.50
Soy 0.60
Gr Vanilla Mac 4.15
Reload Card 50.00
AMEX $50.00
XXXXXXXXXXXXXXXXXX1004
SBUX Card $13.56
SUBTOTAL $62.40
New Caffe Espresso
Frappuccino(R) Blended beverage
Our Signature
Frappuccino(R) roast coffee and
fresh milk, blended with ice.
Topped with our new espresso
whipped cream and new
Italian roast drizzle
Better Classifications
Smarter Predictions
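The sample receipt on this slide suggests why line-item data matters for deeper reports and better classifications. As a minimal sketch only, and not a description of Concur's actual pipeline, the Python below parses the priced lines copied from that receipt and checks that they add up to the printed subtotal. The regular expression, the `parse_items` helper, and the decision to skip unpriced modifier lines such as "Warmed" are assumptions made for illustration.

```python
import re
from decimal import Decimal

# Priced lines and the subtotal, copied from the sample receipt on the slide.
RECEIPT_LINES = [
    "Bacon Art Brkfst 3.45",
    "Warmed",
    "T1 Latte 2.70",
    "Triple 1.50",
    "Soy 0.60",
    "Gr Vanilla Mac 4.15",
    "Reload Card 50.00",
    "SUBTOTAL $62.40",
]

# Assumed layout: a description, whitespace, then a price with two decimals.
ITEM_RE = re.compile(r"^(?P<name>.+?)\s+\$?(?P<price>\d+\.\d{2})$")


def parse_items(lines):
    """Return (items, subtotal), where items is a list of (name, price)."""
    items = []
    subtotal = None
    for line in lines:
        match = ITEM_RE.match(line.strip())
        if match is None:
            continue  # lines without a price (e.g. "Warmed") are skipped
        name = match.group("name")
        price = Decimal(match.group("price"))
        if name.upper() == "SUBTOTAL":
            subtotal = price
        else:
            items.append((name, price))
    return items, subtotal


items, subtotal = parse_items(RECEIPT_LINES)
total = sum(price for _, price in items)
print(f"{len(items)} line items, total {total}, subtotal {subtotal}")
```

Using `Decimal` rather than floats keeps the comparison between the summed items and the subtotal exact, which matters when reconciling currency amounts.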
The Pre-Hadoop Environment
Challenges
1. Difficult to connect to multiple sources
2. Couldn’t analyze new data sources
3. Advanced analytics was a dream
[Diagram labels: Travel, Invoice, Expense; BTS, Google Analytics, Weather; BI; Advanced Analytics; callouts 1, 2, and 3 for the challenges above.]
Concur’s Modern Hadoop Architecture
Improvements
1. Complete data access (structured and unstructured)
2. Large scale ad-hoc queries
3. Quick iterations for advanced analytics
[Diagram: layers labeled Store, Process, Discover, Model, and Access; components labeled New Data, HDFS, Pig, Hive, HUE, Impala, Solr, Spark, R, MLlib, and BI; callouts 1, 2, and 3 for the improvements above.]
The Modern Analyst Flow
Find Data, Prepare Data, Analyze Data
Business and Technical ROI
Business ROI
Customer 360 view leading to better experience
Insight into customer travel, expenses, and T&E
Single platform for emerging advanced use cases
Technology ROI
Merge together storage systems for simpler management
Unified access to disparate data
Scale affordably
Key Learnings
Crawl, walk, run
It takes time, so start now
Lean on experts in the community
It is a cultural shift as much as a technology shift
Thank you

More Related Content

What's hot

Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
Cloudera, Inc.
 
Rethink Analytics with an Enterprise Data Hub
Rethink Analytics with an Enterprise Data HubRethink Analytics with an Enterprise Data Hub
Rethink Analytics with an Enterprise Data Hub
Cloudera, Inc.
 
From Insight to Action: Using Data Science to Transform Your Organization
From Insight to Action: Using Data Science to Transform Your OrganizationFrom Insight to Action: Using Data Science to Transform Your Organization
From Insight to Action: Using Data Science to Transform Your Organization
Cloudera, Inc.
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
Cloudera, Inc.
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Cloudera, Inc.
 
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Technologies
 
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Cloudera, Inc.
 
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Cloudera, Inc.
 
Breakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with HadoopBreakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with Hadoop
Cloudera, Inc.
 
Better Together: The New Data Management Orchestra
Better Together: The New Data Management OrchestraBetter Together: The New Data Management Orchestra
Better Together: The New Data Management Orchestra
Cloudera, Inc.
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets

Cloudera, Inc.
 
Case study: Hadoop as ELT for Leading US Retailer - Happiest Minds
Case study: Hadoop as ELT for Leading US Retailer - Happiest MindsCase study: Hadoop as ELT for Leading US Retailer - Happiest Minds
Case study: Hadoop as ELT for Leading US Retailer - Happiest Minds
Happiest Minds Technologies
 
Emergence of MongoDB as an Enterprise Data Hub
Emergence of MongoDB as an Enterprise Data HubEmergence of MongoDB as an Enterprise Data Hub
Emergence of MongoDB as an Enterprise Data Hub
MongoDB
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Cloudera, Inc.
 
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

Cloudera, Inc.
 
It Takes a Village: Organizational Alignment to Deliver Big Data Value in Hea...
It Takes a Village: Organizational Alignment to Deliver Big Data Value in Hea...It Takes a Village: Organizational Alignment to Deliver Big Data Value in Hea...
It Takes a Village: Organizational Alignment to Deliver Big Data Value in Hea...
DataWorks Summit
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to Production
Contexti
 
Hortonworks Hybrid Cloud - Putting you back in control of your data
Hortonworks Hybrid Cloud - Putting you back in control of your dataHortonworks Hybrid Cloud - Putting you back in control of your data
Hortonworks Hybrid Cloud - Putting you back in control of your data
Scott Clinton
 
AURIN Data Hubs Supporting Smarter Cities - Phil Delaney, Locate14
AURIN Data Hubs Supporting Smarter Cities - Phil Delaney, Locate14AURIN Data Hubs Supporting Smarter Cities - Phil Delaney, Locate14
AURIN Data Hubs Supporting Smarter Cities - Phil Delaney, Locate14
Phillip Delaney
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
Cloudera, Inc.
 

What's hot (20)

Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
Rethink Analytics with an Enterprise Data Hub
Rethink Analytics with an Enterprise Data HubRethink Analytics with an Enterprise Data Hub
Rethink Analytics with an Enterprise Data Hub
 
From Insight to Action: Using Data Science to Transform Your Organization
From Insight to Action: Using Data Science to Transform Your OrganizationFrom Insight to Action: Using Data Science to Transform Your Organization
From Insight to Action: Using Data Science to Transform Your Organization
 
High-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache ImpalaHigh-Performance Analytics in the Cloud with Apache Impala
High-Performance Analytics in the Cloud with Apache Impala
 
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
 
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
 
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
 
Breakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with HadoopBreakout: Operational Analytics with Hadoop
Breakout: Operational Analytics with Hadoop
 
Better Together: The New Data Management Orchestra
Better Together: The New Data Management OrchestraBetter Together: The New Data Management Orchestra
Better Together: The New Data Management Orchestra
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets

 
Case study: Hadoop as ELT for Leading US Retailer - Happiest Minds
Case study: Hadoop as ELT for Leading US Retailer - Happiest MindsCase study: Hadoop as ELT for Leading US Retailer - Happiest Minds
Case study: Hadoop as ELT for Leading US Retailer - Happiest Minds
 
Emergence of MongoDB as an Enterprise Data Hub
Emergence of MongoDB as an Enterprise Data HubEmergence of MongoDB as an Enterprise Data Hub
Emergence of MongoDB as an Enterprise Data Hub
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
 
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

 
It Takes a Village: Organizational Alignment to Deliver Big Data Value in Hea...
It Takes a Village: Organizational Alignment to Deliver Big Data Value in Hea...It Takes a Village: Organizational Alignment to Deliver Big Data Value in Hea...
It Takes a Village: Organizational Alignment to Deliver Big Data Value in Hea...
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to Production
 
Hortonworks Hybrid Cloud - Putting you back in control of your data
Hortonworks Hybrid Cloud - Putting you back in control of your dataHortonworks Hybrid Cloud - Putting you back in control of your data
Hortonworks Hybrid Cloud - Putting you back in control of your data
 
AURIN Data Hubs Supporting Smarter Cities - Phil Delaney, Locate14
AURIN Data Hubs Supporting Smarter Cities - Phil Delaney, Locate14AURIN Data Hubs Supporting Smarter Cities - Phil Delaney, Locate14
AURIN Data Hubs Supporting Smarter Cities - Phil Delaney, Locate14
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
 

Similar to Breakout: Data Discovery with Hadoop

Breakout: Hadoop and the Operational Data Store
Breakout: Hadoop and the Operational Data StoreBreakout: Hadoop and the Operational Data Store
Breakout: Hadoop and the Operational Data Store
Cloudera, Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
Cloudera, Inc.
 
CS-Op Analytics
CS-Op AnalyticsCS-Op Analytics
CS-Op Analytics
Cloudera, Inc.
 
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and Manufacturing
Cloudera, Inc.
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
Denodo
 
Big data oracle_introduccion
Big data oracle_introduccionBig data oracle_introduccion
Big data oracle_introduccion
Fran Navarro
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop
Dr. Wilfred Lin (Ph.D.)
 
Keynote: The Journey to Pervasive Analytics
Keynote: The Journey to Pervasive AnalyticsKeynote: The Journey to Pervasive Analytics
Keynote: The Journey to Pervasive Analytics
Cloudera, Inc.
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
Cloudera, Inc.
 
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
DATAVERSITY
 
Accelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time AnalyticsAccelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time Analytics
Arcadia Data
 
How Hewlett Packard Enterprise Gets Real with IoT Analytics
How Hewlett Packard Enterprise Gets Real with IoT AnalyticsHow Hewlett Packard Enterprise Gets Real with IoT Analytics
How Hewlett Packard Enterprise Gets Real with IoT Analytics
Arcadia Data
 
Big Data: Myths and Realities
Big Data: Myths and RealitiesBig Data: Myths and Realities
Big Data: Myths and Realities
Toronto-Oracle-Users-Group
 
Impala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on HadoopImpala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on Hadoop
Cloudera, Inc.
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator Optimizer
Cloudera, Inc.
 
Evolution of Big Data at Intel - Crawl, Walk and Run Approach
Evolution of Big Data at Intel - Crawl, Walk and Run ApproachEvolution of Big Data at Intel - Crawl, Walk and Run Approach
Evolution of Big Data at Intel - Crawl, Walk and Run Approach
DataWorks Summit
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
jdijcks
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data Architecture
DATAVERSITY
 

Similar to Breakout: Data Discovery with Hadoop (20)

Breakout: Hadoop and the Operational Data Store
Breakout: Hadoop and the Operational Data StoreBreakout: Hadoop and the Operational Data Store
Breakout: Hadoop and the Operational Data Store
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
CS-Op Analytics
CS-Op AnalyticsCS-Op Analytics
CS-Op Analytics
 
Hadoop and Manufacturing
Hadoop and ManufacturingHadoop and Manufacturing
Hadoop and Manufacturing
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Big data oracle_introduccion
Big data oracle_introduccionBig data oracle_introduccion
Big data oracle_introduccion
 
6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop6 enriching your data warehouse with big data and hadoop
6 enriching your data warehouse with big data and hadoop
 
Keynote: The Journey to Pervasive Analytics
Keynote: The Journey to Pervasive AnalyticsKeynote: The Journey to Pervasive Analytics
Keynote: The Journey to Pervasive Analytics
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Accelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time AnalyticsAccelerating Data Lakes and Streams with Real-time Analytics
Accelerating Data Lakes and Streams with Real-time Analytics
 
How Hewlett Packard Enterprise Gets Real with IoT Analytics
How Hewlett Packard Enterprise Gets Real with IoT AnalyticsHow Hewlett Packard Enterprise Gets Real with IoT Analytics
How Hewlett Packard Enterprise Gets Real with IoT Analytics
 
Big Data: Myths and Realities
Big Data: Myths and RealitiesBig Data: Myths and Realities
Big Data: Myths and Realities
 
Impala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on HadoopImpala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on Hadoop
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator Optimizer
 
Evolution of Big Data at Intel - Crawl, Walk and Run Approach
Evolution of Big Data at Intel - Crawl, Walk and Run ApproachEvolution of Big Data at Intel - Crawl, Walk and Run Approach
Evolution of Big Data at Intel - Crawl, Walk and Run Approach
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
 
DataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data ArchitectureDataOps - The Foundation for Your Agile Data Architecture
DataOps - The Foundation for Your Agile Data Architecture
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
Cloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
Cloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
Cloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
Cloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
Breakout: Data Discovery with Hadoop

  • 1. Better Analytics in Less Time: Enabling Data Discovery & Analytics with Cloudera
  • 2. Trends in the Market. Too Much Data: Only 12% of enterprise data is leveraged for analytics (Source: Forrester Research). Early Returns: Analytics returns $13.01 for every dollar invested (Source: Nucleus Research). Strategic Direction: 80% of CEOs said analytics is a strategic objective for 2015 (Source: PWC CEO Survey).
  • 3. Trends in the Market: Data Discovery is the #1 fastest growing workload for enterprise analytics.
  • 4. Data Discovery & Analytics (DD&A) Overview: The ability to find enterprise data, quickly uncover new insights, and optimize existing analytics. (AKA: Self-service BI, BI, Data Discovery, Advanced Analytics, Machine Learning)
  • 5. Data Discovery & Analytics (DD&A) Overview. Self-Service BI: report and dashboard building for business user consumption (Tableau, Oracle). Advanced Analytics: statistical and machine learning analyses using advanced analytic tools (SAS, Matlab). Data Science: command line access for custom analytic development (Spark, Python).
  • 6. Objectives of Data Discovery & Analytics: Report, Model, Rule. Examples: Marketing analysis; Log analysis; Churn analysis; Product recommendation; Predictive support; Trade recommendation; Ad targeting; Transaction classification; Lead scoring.
  • 7. It’s an Iterative Process. [Diagram: Data Discovery & Analytics Flow. Labels: Report, Model, or Rules; Ingest; Transformation; 80% of Time (Diverse Ingest, Search and lineage, Agile Transforms); 20% of Time (SQL, Statistical, Machine Learning); Implement (Point Solution, Custom App); Analysis Technique; Access; Data Generation.]
  • 8. A Traditional Architecture. Objectives: Access Data, Analyze Data, Experiment Fast. [Diagram: Traditional Architecture. Labels: Data Sources (Structured, Unstructured); Ingest; ETL; Storage #1, 2, N; ELT; Store & Process; Enterprise Data Warehouse (EDW); Archive; Search; Serve; Optimize; Access Data; Analyze Data (SQL, Statistical, Machine Learning); Implement (Custom Application, Point Solution).]
  • 9. Challenges with Traditional Architectures: 1) Limited Data 2) Long Time to Value 3) Compliance & Privacy Concerns. [Diagram: the Traditional Architecture from slide 8, with numbered callouts 1 to 3 marking the challenges.]
  • 10. The New Way Forward: 1) Unlimited Data Access 2) Reduce Time to Value 3) Secure and Compliant. [Diagram: Modern Architecture, with numbered callouts 1 to 3. Labels: Data Sources (Structured, Unstructured); Ingest; ETL; ELT; Load; Store & Process; Cloudera; EDH; Active Archive; Enterprise Data Warehouse (EDW); Search; Serve; Optimize; Access Data; Analyze Data (SQL, Statistical, Machine Learning); Implement (Custom Application, Point Solution).]
  • 11. Concur Customer Story
  • 12. About Concur. Who is Concur: Leading provider of spend management solutions and services (Travel, Invoice, TripIt, etc.) in the world. What do we do: Global customer base of 20,000 clients and 25 million users, processing more than $50 billion in Travel & Expense (T&E) spend each year.
  • 13. Our Objectives: Deeper BI Reports; Better Classifications; Smarter Predictions. [Image: sample Starbucks receipt dated 05/14/2014 and a Frappuccino product description.]
  • 14. The Pre-Hadoop Environment. Challenges: 1) Difficult to connect to multiple sources. [Diagram labels: Travel, Invoice, Expense; BI.]
  • 15. The Pre-Hadoop Environment. Challenges: 1) Difficult to connect to multiple sources 2) Couldn’t analyze new data sources. [Diagram labels: Travel, Invoice, Expense; BTS, Google Analytics, Weather; BI.]
  • 16. The Pre-Hadoop Environment. Challenges: 1) Difficult to connect to multiple sources 2) Couldn’t analyze new data sources 3) Advanced analytics was a dream. [Diagram labels: Travel, Invoice, Expense; BTS, Google Analytics, Weather; BI; Advanced Analytics.]
  • 17. Concur’s Modern Hadoop Architecture. Improvements: 1) Complete data access (structured and unstructured). [Diagram labels: New Data; Process (Pig); Store (HDFS).]
  • 18. Concur’s Modern Hadoop Architecture. Improvements: 1) Complete data access (structured and unstructured) 2) Large scale ad-hoc queries. [Diagram labels: New Data; Process; Pig; Hive; Store; HDFS; Discover; HUE; Impala; Solr; Access; BI.]
  • 19. Concur’s Modern Hadoop Architecture. Improvements: 1) Complete data access (structured and unstructured) 2) Large scale ad-hoc queries 3) Quick iterations for advanced analytics. [Diagram labels: New Data; Process; Pig; Hive; Spark; Store; HDFS; Discover; HUE; Impala; Solr; Model; R; MLlib; Access; BI.]
  • 20. The Modern Analyst Flow: Find Data; Prepare Data; Analyze Data.
  • 21. Business and Technical ROI (Technology ROI, Business ROI): Customer 360 view leading to better experience; Insight into customer travel, expenses, and T&E; Single platform for emerging advanced use cases; Merge together storage systems for simpler management; Unified access to disparate data; Scale affordably.
  • 22. Key Learnings: Crawl, walk, run. It takes time, start now. Lean on experts in the community. It is a cultural shift as much as a technology shift.
  • 23. Thank you

Editor's Notes

  1. Key takeaway: It is not just a BI or analytics challenge; it is the way that data is managed. Keeping in mind the three main high-level objectives of an architecture built for Data Discovery (accessing data, analyzing data, and experimenting and iterating fast), we can examine a traditional architecture and see where organizations might run into issues. Question for customer: Does this look like your architecture?
  2. Limited Data Access: data silos; archived or deleted data; no unstructured data; only SQL. Long Time to Value: resource-intensive ad-hoc ELT; adding dimensions takes months; slow large-scale queries. Compliance & Privacy Concerns: data copied; managing users across systems; protecting data without limiting users.
  3. Key takeaway: An EDH provides the foundation to change the way you collect and manage data in order to give your analysts what they need in less time. ETL on the fly: Talk to schema-on-write vs schema-on-read (http://www.slideshare.net/awadallah/schemaonread-vs-schemaonwrite). 1) Unlimited Data Access (active archive, scalable storage, unstructured data) 2) Reduce Time to Value (ETL on the fly, parallel processing, complete data access) 3) Secure and Compliant (compliance-ready, tool integration, data lineage)
  4. Pulling from the “Insights Section”
  5. Why Hadoop slide content: Even with primarily relational systems, it involved hundreds of sources. Getting a BI tool to connect to so many sources is … not fun. More often than not, we needed to understand a subset or aggregate of this data, not all of the data! Can use Pig to process, extract, and filter the data. Can use Hive, which offers a SQL-like query language, to query my data.
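
The "Why Hadoop" notes above describe needing only a subset or aggregate of the data, with Pig used to filter it and Hive used to query it. As a rough illustration of that filter-then-aggregate pattern, here is a minimal sketch in plain Python. It is not Pig or Hive code, and the record layout, field names, and values are hypothetical, chosen only to echo the Travel, Invoice, and Expense sources from the slides.

```python
from collections import defaultdict

# Hypothetical records standing in for rows collected from several source systems.
# Amounts are integer cents to avoid floating-point rounding in the totals.
RECORDS = [
    {"source": "travel", "customer": "acme", "amount_cents": 42000},
    {"source": "expense", "customer": "acme", "amount_cents": 6240},
    {"source": "invoice", "customer": "globex", "amount_cents": 130000},
    {"source": "expense", "customer": "globex", "amount_cents": 1356},
    {"source": "expense", "customer": "acme", "amount_cents": 5000},
]


def filter_records(records, source):
    """Keep only the subset from one source (the kind of step a filter does)."""
    return [r for r in records if r["source"] == source]


def total_by_customer(records):
    """Sum amounts per customer (the kind of result a GROUP BY with SUM gives)."""
    totals = defaultdict(int)
    for r in records:
        totals[r["customer"]] += r["amount_cents"]
    return dict(totals)


# Filter first, then aggregate: only the expense subset is summarized.
expense_totals = total_by_customer(filter_records(RECORDS, "expense"))
print(expense_totals)
```

The point of the sketch is the ordering: narrowing the data to the relevant subset before aggregating means the analysis tool only ever sees a small summary rather than every row from every source.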