The Future of Data Management: 
The Enterprise Data Hub 
Dr. Amr Awadallah (@awadallah) 
Cofounder & Chief Technology Officer 
Cloudera, Inc. 
1 ©2014 Cloudera, Inc. All rights reserved.
Cloudera Snapshot 
Founded: 2008, by former employees of
Employees Today: ~700
World Class Support: 24x7 Global Staff; Pro-active & Predictive Support Programs
Mission Critical: Thousands of Enterprise Users; Over 400 Paying Subscription Customers
The Largest Ecosystem: 1,000+ Partners
Cloudera University: 100,000+ Trained
Open Source Leaders: Cloudera Employees are Leading Developers & Contributors
Total Capital Raised: A lot! (from Intel, Google, Dell, T. Rowe Price, Accel, Greylock)
Mission: Help Organizations Leverage the Power of All Their Data to Ask Bigger Questions.
Why is Big Data Happening Now? 
10TB to 10PB 
IT’S ALL 
(BIG) 
DATA 
(NOT) 
It Isn’t Just About Web 2.0 / Social
MEDIA / ENTERTAINMENT: Viewers / advertising effectiveness
ON-LINE SERVICES / SOCIAL MEDIA: People & career matching; Website optimization
HEALTH CARE: Patient sensors, monitoring, EHRs; Quality of care
FINANCIAL SERVICES: Risk & portfolio analysis; New products
CONSUMER PACKAGED GOODS: Sentiment analysis of what’s hot, customer service
TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows; Customer sentiment
RETAIL: Consumer sentiment; Optimized marketing
EDUCATION & RESEARCH: Experiment sensor analysis
LIFE SCIENCES: Clinical trials; Genomics
AUTOMOTIVE: Auto sensors reporting location, problems
COMMUNICATIONS: Location-based advertising
HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg quality; Warranty analysis
UTILITIES: Smart Meter analysis for network capacity
OIL & GAS: Drilling exploration sensor analysis
LAW ENFORCEMENT & DEFENSE: Threat analysis, Social media monitoring, Photo analysis
Customer Success Across Industries
• Financial & Business Services
• Telecom & Technology
• Healthcare & Life Sciences
• Media & Information
• Retail & Consumer
• Energy & Public Sector
Expanding Data Requires A New Approach
What we do: Copy Data to Applications
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
• Multiple copies of data
What we should do: Bring Applications to Data
Information-centric businesses use all data: multi-structured, internal & external data of all types
The Power of the Enterprise Data Hub is … 
THE OLD WAY EDH
Hadoop Changes the Game:
Storage and Compute on One Platform
The Old Way
Expensive, Special purpose, “Reliable” Servers
Expensive Licensed Software
Data Storage (SAN, NAS), Network, Compute (RDBMS, EDW)
• Hard to scale
• Network is a bottleneck
• Only handles relational data
• Difficult to add new fields & data types
Expensive & Unattainable: $30,000+ per TB
The Hadoop Way
Commodity “Unreliable” Servers
Hybrid Open Source Software
Compute (CPU), Memory, Storage (Disk)
• Scales out forever
• No bottlenecks
• Easy to ingest any data
• Agile data access
Affordable & Attainable: $300-$1,000 per TB
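As a rough illustration of the per-terabyte figures on this slide, the sketch below multiplies them out across the 10 TB to 10 PB range mentioned earlier in the deck. The function name, the decimal 1 PB = 1,000 TB convention, and treating $30,000 as a lower bound are assumptions made for the example; the deck itself gives only the per-TB prices.

```python
# Per-TB prices quoted on the slide (US dollars).
OLD_WAY_PER_TB = 30_000        # "$30,000+ per TB", treated here as a lower bound
HADOOP_PER_TB = (300, 1_000)   # "$300-$1,000 per TB"

TB_PER_PB = 1_000              # assumption: decimal units


def storage_cost(terabytes):
    """Return the illustrative cost of holding `terabytes` under each approach."""
    low, high = HADOOP_PER_TB
    return {
        "old_way_min": terabytes * OLD_WAY_PER_TB,
        "hadoop_low": terabytes * low,
        "hadoop_high": terabytes * high,
    }


if __name__ == "__main__":
    for label, tb in [("10 TB", 10), ("1 PB", TB_PER_PB), ("10 PB", 10 * TB_PER_PB)]:
        cost = storage_cost(tb)
        print(
            f"{label}: old way at least ${cost['old_way_min']:,.0f}, "
            f"Hadoop ${cost['hadoop_low']:,.0f} to ${cost['hadoop_high']:,.0f}"
        )
```

On the slide's own numbers the gap is a factor of 30 to 100 at any volume, because both prices scale linearly in this simplified model.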
The Old Way: Bringing Data to Compute
Complex Architecture
• Many special-purpose systems
• Moving data around
• No complete views
Cost of Analytics
• Existing systems strained
• No agility
• “BI backlog”
Time to Data
• Up-front modeling
• Transforms slow
• Transforms lose data
Missing Data
• Leaving data behind
• Risk and compliance
• High cost of storage
EDWS, MARTS, SERVERS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES
The New Way: Bringing Applications to Data
SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES
1. Active Compliance Archive
• Full fidelity original data
• Indefinite time, any source
• Lowest cost storage
2. Persistent Staging
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper
3. Self-Service Exploratory BI
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests
4. Diverse Analytic Platform
• Bring applications to data
• Combine different workloads on common data (e.g. SQL + Search)
• True analytic agility
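The “schema on read” idea named on this slide can be sketched in a few lines: raw records are stored untouched, and each reader applies its own field selection and types when it queries. This is a minimal, plain-Python illustration rather than how any Hadoop component implements it; the sample events and the `read_with_schema` helper are hypothetical.

```python
import json

# Raw events are stored exactly as they arrived; no table design up front.
RAW_EVENTS = [
    '{"user": "a", "action": "click", "ms": 120}',
    '{"user": "b", "action": "view"}',
    '{"user": "a", "action": "click", "ms": 95, "campaign": "spring"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: select fields and coerce types per query."""
    for line in lines:
        record = json.loads(line)
        yield {
            field: (cast(record[field]) if field in record else None)
            for field, cast in schema.items()
        }


# Two readers project different schemas over the same stored records.
latency_view = list(read_with_schema(RAW_EVENTS, {"user": str, "ms": float}))
campaign_view = list(read_with_schema(RAW_EVENTS, {"action": str, "campaign": str}))
```

Because the `campaign` field was never modeled in advance, nothing had to be reloaded or transformed when it first appeared; a reader that cares about it simply asks for it, and older records return `None`.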
Core Benefits of the Enterprise Data Hub (EDH) 
• Full-Fidelity Active Compliance Archive 
• Accelerate Time to Insight (Scale) 
• Unlock Agility and Innovation 
• Consolidate Silos for 360° View
• Enable Converged Analytics 
A Look Inside The Enterprise Data Hub 
CLOUDERA’S ENTERPRISE DATA HUB 
Open Source, Scalable, Flexible, and Cost-Effective ✔
Unified and Managed ✖ ✔
Open Architecture ✖ ✔
Secure and Governed ✖ ✔
3RD PARTY APPS (Many)
BATCH PROCESSING (MR, Hive, Pig)
INTERACTIVE SQL (Impala)
SEARCH ENGINE (SOLR)
MACHINE LEARNING (SPARK)
STREAM PROCESSING (SPARK)
WORKLOAD MANAGEMENT (YARN)
STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE (Sentry, Gazzang, Rhino)
FILESYSTEM (HDFS)
ONLINE NOSQL (HBASE)
DATA MANAGEMENT (Navigator)
SYSTEM MANAGEMENT (Cloudera Manager)
DATA COLLECTION (Flume, Sqoop, NFS)
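One way to read the diagram is as a lookup from platform layer to the components named on the slide. The sketch below transcribes those labels into a Python mapping; the dictionary name, the lowercase layer keys, and the `layers_using` helper are illustrative choices, not part of any Cloudera API.

```python
# Layer -> components, transcribed from the labels on the slide.
EDH_STACK = {
    "batch processing": ["MR", "Hive", "Pig"],
    "interactive sql": ["Impala"],
    "search engine": ["Solr"],
    "machine learning": ["Spark"],
    "stream processing": ["Spark"],
    "workload management": ["YARN"],
    "security": ["Sentry", "Gazzang", "Rhino"],
    "filesystem": ["HDFS"],
    "online nosql": ["HBase"],
    "data management": ["Navigator"],
    "system management": ["Cloudera Manager"],
    "data collection": ["Flume", "Sqoop", "NFS"],
}


def layers_using(component):
    """Return every layer in which `component` appears (case-insensitive)."""
    wanted = component.lower()
    return [
        layer
        for layer, components in EDH_STACK.items()
        if any(name.lower() == wanted for name in components)
    ]
```

For example, `layers_using("spark")` returns both the machine learning and stream processing layers, which reflects the slide's point that several workloads share the same engines and the same stored data.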
Enabling The App Store of Big Data 
• BI and Analytics Partners
• SI, Cloud, MSP Partners
• Database Partners
• Resellers
• Data Integration Partners
• Hardware Partners
2014 Gartner MQ for Data Warehouse DBMS 
“A data warehouse DBMS is now expected 
to coordinate data virtualization strategies, 
and distributed file and/or processing 
approaches, to address changes in data 
management and access requirements.” 
The Modern Information Architecture 
Data Architects System Operators Engineers Data Scientists Analysts Business Users 
BI / ANALYTICS REPORTING MACHINE 
LEARNING 
ENTERPRISE 
ENTERPRISE DATA 
WAREHOUSE 
ONLINE SERVING 
SYSTEM 
WEB/MOBILE APPLICATIONS 
CONVERGED 
APPLICATIONS 
CLOUDERA 
MANAGER 
META DATA / 
ETL TOOLS 
ENTERPRISE DATA HUB 
Customers & End Users 
SYS LOGS WEB LOGS FILES RDBMS 
A High Level View of the Journey
From IT to Business:
Operational Efficiency (Faster, Bigger, Cheaper): Cheap Storage, ETL Acceleration, EDW Optimization
Transformative Applications (New Business Value): Agile Exploration, Data Science, Converged Analytics
Core Benefits of the Enterprise Data Hub (EDH) 
• Full-Fidelity Active Compliance Archive 
• Accelerate Time to Insight (Scale) 
• Unlock Agility and Innovation 
• Consolidate Silos for 360° View
• Enable Converged Analytics 
Thank You! 
Amr Awadallah (@awadallah)

The Future of Data Management: The Enterprise Data Hub

  • 1. The Future of Data Management: The Enterprise Data Hub Dr. Amr Awadallah (@awadallah) Cofounder & Chief Technology Officer Cloudera, Inc. 1 ©2014 Cloudera, Inc. All rights reserved.
  • 2. Cloudera Snapshot. Founded: 2008, by former employees of; Employees Today: ~700; World Class Support: 24x7 global staff, proactive & predictive support programs; Mission Critical: thousands of enterprise users, over 400 paying subscription customers; The Largest Ecosystem: 1,000+ partners; Cloudera University: 100,000+ trained; Open Source Leaders: Cloudera employees are leading developers & contributors; Total Capital Raised: a lot! (from Intel, Google, Dell, T. Rowe Price, Accel, Greylock); Mission: help organizations leverage the power of all their data to ask bigger questions.
  • 3. Why is Big Data Happening Now?
  • 4. 10TB to 10PB IT’S ALL (BIG) DATA (NOT)
  • 5. It Isn’t Just About Web 2.0 / Social. MEDIA / ENTERTAINMENT: viewers / advertising effectiveness; ON-LINE SERVICES / SOCIAL MEDIA: people & career matching, website optimization; HEALTH CARE: patient sensors, monitoring, EHRs, quality of care; FINANCIAL SERVICES: risk & portfolio analysis, new products; CONSUMER PACKAGED GOODS: sentiment analysis of what’s hot, customer service; TRAVEL & TRANSPORTATION: sensor analysis for optimal traffic flows, customer sentiment; RETAIL: consumer sentiment, optimized marketing; EDUCATION & RESEARCH: experiment sensor analysis; LIFE SCIENCES: clinical trials, genomics; AUTOMOTIVE: auto sensors reporting location, problems; COMMUNICATIONS: location-based advertising; HIGH TECHNOLOGY / INDUSTRIAL MFG.: mfg quality, warranty analysis; UTILITIES: smart meter analysis for network capacity; OIL & GAS: drilling exploration sensor analysis; LAW ENFORCEMENT & DEFENSE: threat analysis, social media monitoring, photo analysis.
  • 6. Customer Success Across Industries: Financial & Business Services; Telecom & Technology; Healthcare & Life Sciences; Media & Information; Retail & Consumer; Energy & Public Sector.
  • 7. Expanding Data Requires A New Approach. What we do: Copy Data to Applications. Process-centric businesses use: structured data mainly; internal data only; “important” data only; multiple copies of data. What we should do: Bring Applications to Data. Information-centric businesses use all data: multi-structured, internal & external data of all types.
  • 8. The Power of the Enterprise Data Hub is … THE OLD WAY EDH
  • 9. Hadoop Changes the Game: Storage and Compute on One Platform. The Old Way: expensive, special-purpose, “reliable” servers; expensive licensed software; Network, Data Storage (SAN, NAS), Compute (RDBMS, EDW); hard to scale; network is a bottleneck; only handles relational data; difficult to add new fields & data types; Expensive & Unattainable: $30,000+ per TB. The Hadoop Way: commodity “unreliable” servers; hybrid open source software; Compute (CPU), Memory, Storage (Disk); scales out forever; no bottlenecks; easy to ingest any data; agile data access; Affordable & Attainable: $300-$1,000 per TB.
  • 10. The Old Way: Bringing Data to Compute. Complex Architecture: many special-purpose systems; moving data around; no complete views. Cost of Analytics: existing systems strained; no agility; “BI backlog”. Time to Data: up-front modeling; transforms slow; transforms lose data. Missing Data: leaving data behind; risk and compliance; high cost of storage. Diagram components: EDWS, MARTS, SERVERS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE. Sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES.
  • 11. The New Way: Bringing Applications to Data. (1) Active Compliance Archive: full fidelity original data; indefinite time, any source; lowest cost storage. (2) Persistent Staging: one source of data for all analytics; persist state of transformed data; significantly faster & cheaper. (3) Self-Service Exploratory BI: simple search + BI tools; “schema on read” agility; reduce BI user backlog requests. (4) Diverse Analytic Platform: bring applications to data; combine different workloads on common data (e.g. SQL + Search); true analytic agility. Diagram components: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE. Sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES.
  • 12. Core Benefits of the Enterprise Data Hub (EDH) • Full-Fidelity Active Compliance Archive • Accelerate Time to Insight (Scale) • Unlock Agility and Innovation • Consolidate Silos for 360° View • Enable Converged Analytics
  • 13. A Look Inside The Enterprise Data Hub (CLOUDERA’S ENTERPRISE DATA HUB). Properties: Open Source, Scalable, Flexible, and Cost-Effective; Unified and Managed; Open Architecture; Secure and Governed. Components: 3RD PARTY APPS (Many); BATCH PROCESSING (MR, Hive, Pig); INTERACTIVE SQL (Impala); SEARCH ENGINE (SOLR); MACHINE LEARNING (SPARK); STREAM PROCESSING (SPARK); WORKLOAD MANAGEMENT (YARN); STORAGE FOR ANY TYPE OF DATA: UNIFIED, ELASTIC, RESILIENT, SECURE (Sentry, Gazzang, Rhino); FILESYSTEM (HDFS); ONLINE NOSQL (HBASE); DATA MANAGEMENT (Navigator); SYSTEM MANAGEMENT (Cloudera Manager); DATA COLLECTION (Flume, Sqoop, NFS).
  • 14. Enabling The App Store of Big Data: BI and Analytics Partners; SI, Cloud, MSP Partners; Database Partners; Resellers; Data Integration Partners; Hardware Partners.
  • 15. 2014 Gartner MQ for Data Warehouse DBMS: “A data warehouse DBMS is now expected to coordinate data virtualization strategies, and distributed file and/or processing approaches, to address changes in data management and access requirements.”
  • 16. The Modern Information Architecture. Users: Data Architects, System Operators, Engineers, Data Scientists, Analysts, Business Users, Customers & End Users. Components: BI / ANALYTICS; REPORTING; MACHINE LEARNING; ENTERPRISE DATA WAREHOUSE; ONLINE SERVING SYSTEM; WEB/MOBILE APPLICATIONS; CONVERGED APPLICATIONS; CLOUDERA MANAGER; META DATA / ETL TOOLS; ENTERPRISE DATA HUB. Sources: SYS LOGS, WEB LOGS, FILES, RDBMS.
  • 17. A High Level View of the Journey. Themes: Operational Efficiency (Faster, Bigger, Cheaper); Transformative Applications (New Business Value). Steps shown: Data Science; Agile Exploration; ETL Acceleration; Cheap Storage; EDW Optimization; Converged Analytics. Axis: IT to Business.
  • 18. Core Benefits of the Enterprise Data Hub (EDH) • Full-Fidelity Active Compliance Archive • Accelerate Time to Insight (Scale) • Unlock Agility and Innovation • Consolidate Silos for 360° View • Enable Converged Analytics
  • 19. Thank You! Amr Awadallah (@awadallah)

Editor's Notes

  1. Cloudera has built and maintains the industry’s most extensive partner ecosystem to ensure that our customers can leverage their existing investments in tools and skills as they adopt new Hadoop technology. Our goal is to help you minimize disruption while delivering maximum value. The ecosystem includes over 800 partners across hardware, software and services, including the leaders in all major market segments. You can continue to leverage the technologies and solution providers you’ve already invested in, and drive additional value through integrated solutions.
  2. Alright, we have a partner we’d like to now bring up to the stage. It is with great pleasure that I introduce Sanjay Gojija from Intel to give you some insight on Accelerating Enterprise Big Data Success!