Hadoop Grows Up!

Doug Cutting – Hadoop project founder
June 29th 2010
Hadoop is Almost 5 Years Old!




A Buzzword…




A Career…




Copyright 2010 Cloudera Inc. All rights reserved
An Ecosystem…




A Market…




An Emerging Platform for Applications…
Graph analysis, Machine learning, Scientific, Archive, Security,
Query & reporting, Complex ETL, Search quality, Fraud detection,
Clickstream analysis, POS analysis, Trade compliance, and more…




Hadoop Started From Humble Beginnings…


• MapReduce and HDFS only

• Good for experienced Java
  programmers

• Limited application set



Innovation: the Secret to Hadoop Success

• Projects & components develop around Hadoop
• User base grows
• More applications are made possible

“Provide common technical services”
“Provide more levels of abstraction & automation for job creation”
“Make it easier to get data in & out”
“Cover more data movements – inserts, appends, etc”




But Innovation Isn’t Free

• For every release of MapReduce and HDFS, there are >20 releases of related projects
• Every component has its own schedule, versioning, dependencies & patch requirements
• Hadoop community likes to build 2-3 of everything

[Chart: axis ticks 0, 10, 20; component labels HBase 0.89, HDFS 0.20, Pig 0.7, Hive 0.6, Oozie 2.0]
Announcing Cloudera’s Distribution for Hadoop v3
• Open source – 100% Apache licensed
• Simplified – Cloudera manages required versions & dependencies
• Integrated – all components interoperate
• Reliable – patched with fixes from future releases to improve stability
• Easy to consume – Debian, RPM, tarball, Virtual Machine, EC2, Rackspace, Softlayer

What’s New in CDH v3?
• Updates to existing Hadoop frameworks
   • Pig 0.7
   • Sqoop 1.0
   • Hadoop 0.20S (planned)
• Support for 3 new related components
   • HBase – with durability
   • ZooKeeper
   • Oozie – run workflows + support for Hive & Sqoop actions
• Introducing 2 new components
   • Flume – collect streaming data with centralized
      configuration & guaranteed delivery
   • Hue – web UI and SDK for Hadoop web applications
Charles Zedlewski,
Cloudera Product Management




Harnessing Hadoop Has Challenges

• Skill Set – experts only
• Complexity – more than ten components
• Manageability – hard to configure, monitor & administer
• Interoperability – limited support for DBMS & analytic tools
Announcing Cloudera Enterprise
• Reduces the risks of running Hadoop in production
• Improves consistency and compliance, and reduces administrative overhead

Management tools
• Monitoring & config for data integration
• Authorization mgmt & provisioning
• Resource mgmt


• Production support for CDH & certified integrations (e.g. Oracle,
  Vertica)
Demo


Some Announcements

• Party at our place
  • Hackathon on CDH3 – applications, enhancements, open
    source contributions
  • July 27th, 9:30am – 7:30pm
  • For invite: hackathon@cloudera.com
  • Free food & snacks


• Or stay home and read
  • Hadoop: The Definitive Guide, second edition
  • Available on October 12th at Hadoop World
Thank You!

• Stop by our table if you have questions!





Introducing the data science sandbox as a service 8.30.18
 

Hadoop summit cloudera keynote_v5

  • 1. Hadoop Grows Up! Doug Cutting – Hadoop project founder June 29th 2010
  • 2. Hadoop is Almost 5 Years Old!
  • 4. A Career…
  • 5. An Ecosystem…
  • 6. A Market…
  • 7. An Emerging Platform for Applications… Graph analysis, Machine learning, Scientific, Archive, Security, Query & reporting, Complex ETL, Search quality, Fraud detection, Clickstream analysis, POS analysis, Trade compliance, And more…
  • 8. Hadoop Started From Humble Beginnings…. • MapReduce and HDFS only • Good for experienced Java programmers • Limited application set
  • 9. Innovation: the Secret to Hadoop Success • Projects & components develop around Hadoop • User base grows • More applications are made possible • Callouts: “Provide common technical services”, “Provide more levels of abstraction & automation for job creation”, “Make it easier to get data in & out”, “Cover more data movements – inserts, appends, etc”
  • 10. But Innovation Isn’t Free • For every release of MapReduce and HDFS, there are >20 releases of related projects • Every component has its own schedule, versioning, dependencies & patch requirements • Hadoop community likes to build 2-3 of everything [Chart labels: HBase 0.89, HDFS 0.20, Pig 0.7, Hive 0.6, Oozie 2.0]
  • 11. Announcing Cloudera’s Distribution for Hadoop v3 • Open source – 100% Apache licensed • Simplified – Cloudera manages required versions & dependencies • Integrated – all components interoperate • Reliable – patched with fixes from future releases to improve stability • Easy to consume – Debian, RPM, tarball, Virtual Machine, EC2, Rackspace, Softlayer
  • 12. What’s New in CDH v3? • Updates to existing Hadoop frameworks • Pig 0.7 • Sqoop 1.0 • Hadoop 0.20S (planned) • Support for 3 new related components • HBase – with durability • ZooKeeper • Oozie – run workflows + support for Hive & Sqoop actions • Introducing 2 new components • Flume – collect streaming data with centralized configuration & guaranteed delivery • Hue – web UI and SDK for Hadoop web applications
  • 13. Charles Zedlewski, Cloudera Product Management
  • 14. Harnessing Hadoop Has Challenges • Skill Set – experts only • Complexity – more than ten components • Manageability – hard to configure, monitor & administer • Interoperability – limited support for DBMS & analytic tools
  • 15. Announcing Cloudera Enterprise • Reduces the risks of running Hadoop in production • Improves consistency, compliance and administrative overhead • Management tools: Monitoring & config for data integration; Authorization mgmt & provisioning; Resource mgmt • Production support for CDH & certified integrations (e.g. Oracle, Vertica)
  • 16. Demo
  • 17. Some Announcements • Party at our place • Hackathon on CDH3 – applications, enhancements, open source contributions • July 27th, 9:30am – 7:30pm • For invite: hackathon@cloudera.com • Free food & snacks • Or stay home and read • Hadoop: The Definitive Guide, second edition • Available on October 12th at Hadoop World
  • 18. Thank You! • Stop by our table if you have questions!
  • 19. Copyright 2010 Cloudera Inc. All rights reserved