Welcome to Inside Cloudera’s Distribution including Apache Hadoop
Audio/Telephone: +1 (314) 627-1519
Access Code: 380-729-510
Audio PIN: Shown after joining the Webinar
Presenter: Charles Zedlewski, Cloudera VP of Product
Housekeeping
  • Ask questions at any time using the Questions panel
  • Problems? Use the Chat panel
  • Slides and recording will be available
Copyright 2011 Cloudera Inc. All rights reserved
What Cloudera set out to do with CDH3
  • Give organizations an integrated, complete data management system that is 100% Apache open source
  • Provide a platform that the rest of the enterprise IT ecosystem could integrate with
  • Continue to make Apache Hadoop even easier to adopt
  • Provide a level of release predictability that organizations can plan their maintenance and upgrade cycles on
An integrated data management system – what did Google do?
Diagram labels: Dremel, Evenflow, Evenflow, Dremel, Sawzall, Bigtable, MySQL Gateway, MapReduce / GFS, Chubby
The pattern repeats…
Diagram labels: HiPal, Databee, Databee, Hive, Hive, HBase, Scribe, Zookeeper
The pattern repeats…
Diagram labels: Oozie, Oozie, Hive, Pig & Hive, HBase, Data Highway, Zookeeper
The pattern repeats…
Diagram labels: Azkaban, Azkaban, Pig, Voldemort, Sqoop, Kafka, Zookeeper
CDH3 assembled the best of the Apache Hadoop ecosystem into an integrated system so you don’t have to
Cloudera’s Distribution Including Apache Hadoop
Diagram labels: Hue, Hue, Oozie, Oozie, Hive, Hive / Pig, HBase, Sqoop, Flume, Zookeeper
How CDH3 got created
  • Enhancements written and contributed to Apache projects
  • Customer and partner requirements
  • Integration, testing, & backporting
  • Releases selected or cut
  • Beta cycle, more backporting
  • Prioritization based on customer value, cost and (for CDH) community readiness
  • Diagram labels: HDFS, HBase, Flume, etc
Stable release!
  • Hadoop 0.20.2 +923
  • HBase 0.90.1 +15
  • Hive 0.7 +22
  • Pig 0.8 +20
  • Flume 0.9.3 +17
  • Oozie 20.2 +31
  • Hue 1.2.0 +0
  • Sqoop 1.2 +24
  • Zookeeper 3.3.3 +12
CDH2 to CDH3 Copyright 2011.   Cloudera confidential and proprietary.  Redistribution without permission is not permitted
So what?
Example 1 – clickstream sessions
  • Flume: Reliably collect logs
  • HDFS: Store in the filesystem
  • MapReduce: Process into sessions
  • Hive: Store table metadata
  • Sqoop: Export to EDW for BI reporting
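The "process into sessions" step can be illustrated with a small sketch. The slides do not show the actual job, so everything below is an assumption made for illustration: the tab-separated log layout, the 30-minute inactivity timeout, and the names `parse_line` and `sessionize` are all hypothetical. The code runs in plain Python, not on a cluster; it only mirrors the shape of a MapReduce job, grouping clicks by user and then splitting each user's clicks into sessions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity timeout; the slides do not specify one.
SESSION_GAP = timedelta(minutes=30)


def parse_line(line):
    """Parse a hypothetical 'user_id<TAB>ISO timestamp<TAB>url' log line."""
    user_id, ts, url = line.rstrip("\n").split("\t")
    return user_id, datetime.fromisoformat(ts), url


def sessionize(lines, gap=SESSION_GAP):
    """Group clicks by user, then cut a new session at every pause longer than `gap`."""
    # "Map" side: key each click by user.
    clicks_by_user = defaultdict(list)
    for line in lines:
        user_id, ts, url = parse_line(line)
        clicks_by_user[user_id].append((ts, url))

    # "Reduce" side: order each user's clicks by time and split on long pauses.
    sessions = []
    for user_id, clicks in clicks_by_user.items():
        clicks.sort()
        current = [clicks[0]]
        for prev, click in zip(clicks, clicks[1:]):
            if click[0] - prev[0] > gap:
                sessions.append((user_id, current))
                current = []
            current.append(click)
        sessions.append((user_id, current))
    return sessions


# Made-up sample data in the assumed log layout.
SAMPLE_LOG = [
    "alice\t2011-04-12T10:00:00\t/home",
    "alice\t2011-04-12T10:05:00\t/products",
    "bob\t2011-04-12T10:07:00\t/home",
    "alice\t2011-04-12T11:30:00\t/home",
]

SESSIONS = sessionize(SAMPLE_LOG)
for user, clicks in SESSIONS:
    print(user, [url for _, url in clicks])
```

In a pipeline like the one on the slide, the input lines would be the logs Flume delivered into HDFS, and the resulting session records would be what Hive describes and Sqoop exports.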
Example 2 – fraud analysis
Diagram labels: Sqoop, Hive, HBase, MapReduce, Sqoop, HDFS
  • Import of fact data into filesystem
  • Import regularly changing dimension data
  • Analytics performed using HQL
What Cloudera set out to do with CDH3
  • Give organizations an integrated, complete data management system that is 100% Apache open source
  • Provide a platform that the rest of the enterprise IT ecosystem could integrate with
  • Continue to make Apache Hadoop even easier to adopt
  • Provide a level of release predictability that organizations can plan their maintenance and upgrade cycles on
Investing in interfacing with the Enterprise IT ecosystem
Cloudera’s Distribution Including Apache Hadoop
  • Drivers, language enhancements, testing
  • Sqoop framework, adapters
  • Packaging, testing
  • More coming…
What Cloudera set out to do with CDH3
  • Give organizations an integrated, complete data management system that is 100% Apache open source
  • Provide a platform that the rest of the enterprise IT ecosystem could integrate with
  • Continue to make Apache Hadoop even easier to adopt
  • Provide a level of release predictability that organizations can plan their maintenance and upgrade cycles on
Ease of adoption - making CDH a more enterprise-quality artifact
  • Regular, non-disruptive updates
There are new features for each component too (partial list)
What’s next?
Key themes:
Improved availability

Webinar: Inside Cloudera's Distribution including Apache Hadoop v3

  • 1. Welcome to Inside Cloudera’s Distribution including Apache Hadoop. Audio/Telephone: +1 (314) 627-1519. Access Code: 380-729-510. Audio PIN: shown after joining the webinar. Presenter: Charles Zedlewski, Cloudera VP of Product
  • 2. Housekeeping: Ask questions at any time using the Questions panel. Problems? Use the Chat panel. Slides and recording will be available. Copyright 2011 Cloudera Inc. All rights reserved
  • 3. What Cloudera set out to do with CDH3: give organizations an integrated, complete data management system that is 100% Apache open source; provide a platform that the rest of the enterprise IT ecosystem could integrate with; continue to make Apache Hadoop even easier to adopt; provide a level of release predictability that organizations can plan their maintenance and upgrade cycles on
  • 4. An integrated data management system – what did Google do? Dremel, Evenflow, Sawzall, Bigtable, MySQL Gateway, MapReduce / GFS, Chubby
  • 5. The pattern repeats… HiPal, Databee, Hive, HBase, Scribe, Zookeeper
  • 6. The pattern repeats… Oozie, Hive, Pig & Hive, HBase, Data Highway, Zookeeper
  • 7. The pattern repeats… Azkaban, Pig, Voldemort, Sqoop, Kafka, Zookeeper
  • 8. CDH3 assembled the best of the Apache Hadoop ecosystem into an integrated system so you don’t have to. Cloudera’s Distribution Including Apache Hadoop: Hue, Oozie, Hive, Hive / Pig, HBase, Sqoop, Flume, Zookeeper
  • 9. How CDH3 got created: enhancements written and contributed to Apache projects; customer and partner requirements; integration, testing, & backporting; releases selected or cut; stable release! Hadoop 0.20.2 +923, HBase 0.90.1 +15, Hive 0.7 +22, Pig 0.8 +20, Flume 0.9.3 +17, Oozie 20.2 +31, Hue 1.2.0 +0, Sqoop 1.2 +24, Zookeeper 3.3.3 +12. HDFS; beta cycle, more backporting; prioritization based on customer value, cost and (for CDH) community readiness; HBase, Flume, etc.
  • 10. CDH2 to CDH3 Copyright 2011. Cloudera confidential and proprietary. Redistribution without permission is not permitted
  • 12. Example 1 – clickstream sessions. Flume: reliably collect logs. HDFS: store in the filesystem. MapReduce: process into sessions. Hive: store table metadata. Sqoop: export to EDW for BI reporting
  • 13. Example 2 – fraud analysis. Hive: analytics performed using HQL. Sqoop: import regularly changing dimension data. Sqoop: import of fact data into filesystem. Also shown: HBase, MapReduce, HDFS
  • 14. What Cloudera set out to do with CDH3: give organizations an integrated, complete data management system that is 100% Apache open source; provide a platform that the rest of the enterprise IT ecosystem could integrate with; continue to make Apache Hadoop even easier to adopt; provide a level of release predictability that organizations can plan their maintenance and upgrade cycles on
  • 15. Investing in interfacing with the Enterprise IT ecosystem: drivers, language enhancements, testing; Cloudera’s Distribution Including Apache Hadoop; Sqoop framework, adapters; packaging, testing; more coming…
  • 16. What Cloudera set out to do with CDH3: give organizations an integrated, complete data management system that is 100% Apache open source; provide a platform that the rest of the enterprise IT ecosystem could integrate with; continue to make Apache Hadoop even easier to adopt; provide a level of release predictability that organizations can plan their maintenance and upgrade cycles on
  • 17. Ease of adoption - making CDH a more enterprise-quality artifact: regular, non-disruptive updates
  • 18. There are new features for each component too (partial list)
  • 22. Lower TCO through harmonization
  • 24. Expand the community of users that can work with Apache Hadoop