How Apache Hadoop is Revolutionizing
Business Intelligence and Data Analytics

Strata Conference, Sept 22nd 2011, New York, NY

Dr. Amr Awadallah, Founder, CTO, VP of Engineering
aaa@cloudera.com, twitter: @awadallah
Business Intelligence Before Adopting Apache Hadoop

  Stack (top to bottom):
    BI Reports + Interactive Apps
    RDBMS (processed data)
    ETL Compute Grid
    Storage Only Grid (original raw data), Mostly Append
    Collection
    Instrumentation

  Problems:
    1. Can’t Explore Original High Fidelity Raw Data
    2. Moving Data To Compute Doesn’t Scale
    3. Archiving = Premature Data Death

                    Copyright © 2011, Cloudera, Inc. All Rights Reserved.             2
Business Intelligence After Adopting Apache Hadoop
  Stack (top to bottom):
    BI Reports + Interactive Apps, plus Data Exploration & Advanced Analytics
    RDBMS
    Hadoop: Storage + Compute Grid (ETL and Aggregations, Complex Data Processing)
    Collection
    Instrumentation

  Storage is Mostly Append: Keep Data Alive Forever

So What is Apache Hadoop?
• A scalable fault-tolerant distributed system for data storage and
  processing (open source under the Apache license)

• Core Hadoop has two main components:
    • Hadoop Distributed File System: self-healing high-bandwidth clustered storage
    • MapReduce: fault-tolerant distributed processing


• Key business values:
    •   Flexible – Store any data, Run any analysis (Mine First, Govern Later)
    •   Scalable – Start at 1TB/3-nodes then grow to petabytes/thousands of nodes
    •   Affordable – Cost per TB at a fraction of traditional options
    •   Open Source – No Lock-In, Rich Ecosystem, Large developer community
    •   Broadly adopted – A large and active ecosystem, Proven to run at scale

The Main Benefit: Agility/Flexibility

Schema-on-Write (RDBMS):
•   Schema must be created before data is loaded
•   Explicit load operation has to take place which transforms data
    to database internal structure
•   New columns must be added explicitly before data for such
    columns can be loaded into the database
•   Read is Fast
•   Benefit: Standards/Governance

Schema-on-Read (Hadoop):
•   Data is simply copied to the file store, no special
    transformation is needed
•   A SerDe (Serializer/Deserializer) is applied during read time to
    extract the required columns
•   New data can start flowing anytime and will appear retroactively
    once the SerDe is updated to parse them
•   Load is Fast
•   Benefit: Flexibility/Agility
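
The schema-on-read idea can be sketched in a few lines of Python. This is only an illustration, not Hive's actual SerDe interface: the log format, the field names, and the functions `serde_v1`, `serde_v2` and `read` are all hypothetical. The raw lines are never rewritten; only the deserializer changes, and the new column then appears for old and new records alike.

```python
# Raw records are stored exactly as they arrived; nothing is transformed at load time.
# (Hypothetical log format: date, user, path, status, and later an optional device field.)
RAW_LINES = [
    "2011-09-22 alice /home 200",
    "2011-09-22 bob /search 404 mobile",  # a new trailing field started flowing in
]

def serde_v1(line):
    """First deserializer: knows only the first four space-separated fields."""
    date, user, path, status = line.split(" ")[:4]
    return {"date": date, "user": user, "path": path, "status": int(status)}

def serde_v2(line):
    """Updated deserializer: also exposes the optional fifth field as 'device'."""
    row = serde_v1(line)
    parts = line.split(" ")
    row["device"] = parts[4] if len(parts) > 4 else None
    return row

def read(lines, serde, columns):
    """Apply the deserializer at read time and project the requested columns."""
    return [tuple(serde(line)[c] for c in columns) for line in lines]

# Same stored bytes, two different schemas applied at read time.
rows_v1 = read(RAW_LINES, serde_v1, ["user", "status"])
rows_v2 = read(RAW_LINES, serde_v2, ["user", "device"])
```

Records written before the field existed simply yield an empty value for `device`, which is the "appear retroactively" behaviour in miniature.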

What is Complex Data Processing?
1. Java MapReduce: Gives the most flexibility and performance,
   but potentially long development cycle (the “assembly
   language” of Hadoop).
2. Streaming MapReduce (also Pipes): Allows you to develop in
   any programming language of your choice, but slightly lower
   performance and less flexibility.
3. Pig: A high-level language out of Yahoo, suitable for batch data
   flow workloads.
4. Hive: A SQL interpreter out of Facebook, also includes a
   metastore mapping files to their schemas and associated SerDe.
5. Oozie: A workflow server engine, driven by an XML process
   definition language (PDL), that enables creating a workflow of
   jobs composed of any of the above.
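
As a rough illustration of the MapReduce model behind options 1 and 2, here is a word count written in Python and run locally. It is a sketch under stated assumptions, not a Hadoop job: with Streaming the mapper and reducer would be separate scripts reading stdin and writing tab-separated key/value lines to stdout, and the framework would do the sort between them. The function names and the `run_local` driver are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def mapper(lines):
    """Map phase: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(sorted_pairs):
    """Reduce phase: sum the counts per key; input must already be sorted by key."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)

def run_local(lines):
    """Local stand-in for the framework: map, sort by key (the shuffle), reduce."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))

counts = run_local(["Hadoop stores data", "Hadoop processes data"])
```

Pig and Hive generate this kind of map and reduce logic from a higher-level script or query, which is why they shorten the development cycle.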

What This Means For You: Agility

Up Front Design                                                Just in Time




What This Means For You: Innovation

   Data Committee                                              Data Scientist




What This Means For You: Consolidation

        Silos                                                           Sharing




What This Means For You: Extract Value from Latent Data

  Archive to Tape                                         Keep Data Alive




What This Means For You: Ability to Grow Fluidly
Benefit #2: Scalability




What This Means For You: Data Beats Algorithm

  Smarter Algos                                            More Data




Where Does Hadoop Fit in the Enterprise Data Stack?
[Diagram: Hadoop in the enterprise data stack]
  Users:            Data Scientists, Analysts, Business Users, System Operators,
                    Data Architects, Customers
  Development Tools:            IDEs
  Business Intelligence Tools:  BI, Analytics, Enterprise Reporting
  Operations:       Cloudera Mgmt Suite
  Integration:      ETL Tools
  Adjacent systems: Enterprise Data Warehouse, Low-Latency Serving Systems,
                    Web Application
  Data sources:     Logs, Files, Web Data, Relational Databases

Use The Right Tool For The Right Job

    Relational Databases:                             Hadoop:




Use when:                                              Use when:
•   Interactive OLAP Analytics (<1sec)                 •   Structured or Not (Agility)
•   Multistep ACID Transactions                        •   Scalability of Storage/Compute
•   100% SQL Compliance                                •   Complex Data Processing
Two Core Use Cases Common Across Many Industries

Use Case: ADVANCED ANALYTICS      Industry            Use Case: DATA PROCESSING
Social Network Analysis           Web                 Clickstream Sessionization
Content Optimization              Media               Clickstream Sessionization
Network Analytics                 Telco               Mediation
Loyalty & Promotions              Retail              Data Factory
Fraud Analysis                    Financial           Trade Reconciliation
Entity Analysis                   Federal             SIGINT
Sequencing Analysis               Bioinformatics      Genome Mapping
Product Quality                   Manufacturing       Mfg Process Tracking



CDH: Cloudera’s Distribution Including Apache Hadoop
    UI Framework:             HUE
    SDK:                      HUE SDK
    Workflow:                 OOZIE
    Scheduling:               OOZIE
    Metadata:                 HIVE
    Languages / Compilers:    PIG, HIVE
    Data Integration:         FLUME, SQOOP, ODBC
    Fast Read/Write Access:   HBASE
    Coordination:             ZOOKEEPER




•   Open Source – 100% Apache licensed, 100% Open Source, 100% Free.
•   Enterprise Ready – Predictable releases, Documentation, Hotfix Patches, Intensive QA
•   Integrated – All required component versions & dependencies are managed for you
•   Industry Standard – Existing RDBMS, ETL and BI systems work best with it
•   Many Form Factors – Public Cloud, Private Cloud, Ubuntu, RHEL, 32/64bit, etc

SCM Express: Simplifies Installation and Configuration

    Service & Configuration Manager
    (SCM) Express takes the complexity out of
    deploying and configuring CDH.

•   Provision a complete Hadoop stack in minutes
•   Centrally manage system services through a user-friendly interface
•   Manages services for up to 50 nodes
•   FREE to download


KEY FEATURES
1. Automated, wizard-based installation of the complete Hadoop stack
2. Central, real-time dashboard for configuration management
3. Ability to configure the cluster while it’s running
4. Incorporates comprehensive validation and error checking
5. Automates the expansion of services to new nodes when they come online
What is Cloudera Enterprise?

Cloudera Enterprise makes open source
Apache Hadoop enterprise-easy

•   Simplify and Accelerate Hadoop Deployment
•   Reduce Adoption Costs and Risks
•   Lower the Cost of Administration
•   Increase the Transparency & Control of Hadoop
•   Leverage the Experience of Our Experts

CLOUDERA ENTERPRISE COMPONENTS
•   Cloudera Management Suite: Comprehensive Toolset for Hadoop Administration
•   Production-Level Support: Our Team of Experts On-Call to Help You Meet Your SLAs

   3 of the top 5 telecommunications, mobile services, defense & intelligence,
     banking, media and retail organizations depend on Cloudera Enterprise

EFFECTIVENESS: Ensuring Repeatable Value from Apache Hadoop Deployments
EFFICIENCY: Enabling Apache Hadoop to be Affordably Run in Production



Hadoop World 2011

    The largest gathering of Hadoop practitioners, developers,
    business executives, industry luminaries and innovative
    companies in the Hadoop ecosystem.

    November 8-9, Sheraton New York Hotel & Towers, NYC

•    1400 attendees, 25+ sponsors
•    60 sessions across 5 tracks for:
      – Business Decision Makers
      – Enterprise Architects
      – IT Operators
      – Data Scientists
      – Developers
•    Cloudera Training and Certification (November 7, 10, 11)

    Learn more and register at www.hadoopworld.com
    $50 discount for Strata attendees



What I Would Like You To Remember:
• The Key Benefits of the Apache Hadoop Data Platform:
   • Agility/Flexibility (Enables Innovation/Exploration).
   • Complex Data Processing (Any Language, Any Problem).
   • Scalability of Storage/Compute (Freedom to Grow).
   • Economical Active Archive (Keep All Your Data Alive).

• Cloudera Enterprise provides:
   •   Lower Cost of Management and Administration.
   •   Simpler and Faster Hadoop Deployment.
   •   Greater Transparency & Control of Hadoop.
   •   Firm SLAs on Issue Resolution.
Contact Information:



          Amr Awadallah
        aaa@cloudera.com
           650-644-3921
   http://twitter.com/awadallah




Appendix



Hadoop Timeline

Timeline axis: 2002 to 2009. Events, left to right:

•   Doug Cutting & Mike Cafarella started working on Nutch
•   Google publishes GFS & MapReduce papers
•   Doug Cutting adds DFS & MapReduce support to Nutch
•   Yahoo! hires Cutting, Hadoop spins out of Nutch
•   NY Times converts 4TB of image archives over 100 EC2s
•   Facebook launches Hive: SQL Support for Hadoop
•   Fastest sort of a TB, 3.5mins over 910 nodes
•   Cloudera Founded
•   Fastest sort of a TB, 62secs over 1,460 nodes
•   Sorted a PB in 16.25hours over 3,658 nodes
•   Hadoop Summit 2009, 750 attendees
•   Doug Cutting joins Cloudera


Cloudera’s Track Record
• Customers: Multiple customers with >1,000 Hadoop nodes under management
• Supporting dozens of diverse production use cases, including ones that are revenue critical
  with tight SLAs

• Community: years of demonstrated leadership in the Apache Hadoop ecosystem.
  Cloudera employees are:
    • The largest contributor to the Hadoop ecosystem in patches
    • Founders of 70% of the projects in the Apache Hadoop ecosystem including Apache
      Hadoop itself
    • The first to build & integrate what is now the reference Hadoop stack

• Industry: Multiple years of experience providing Hadoop solutions across industries:
    • 2 of the top 5 payments companies run Cloudera
    • 3 of the top 5 commercial banks run Cloudera
    • 2 of the top 4 online travel companies run Cloudera


Cloudera Enterprise Management Suite

Activity Monitor
    It Helps You…   • Consolidate all user activities into a real-time view
                    • Diagnose user performance
                    • Track activity metrics
    So You Can…     • Improve performance
                    • Improve conformance to SLAs
                    • Improve QOS
    It’s Like…      • MySQL Enterprise Monitor
                    • Quest Foglight for Oracle / SQL Server

Service & Configuration Manager
    It Helps You…   • Manage system services
                    • Automate changes
                    • Validate settings
                    • 1-click security
    So You Can…     • Lower cost of administration
                    • Improve uptime
    It’s Like…      • Red Hat Satellite Server
                    • Microsoft System Center
                    • Oracle Enterprise Manager

Resource Manager
    It Helps You…   • Report on the usage of scarce resources
                    • Plan for capacity expansion
    So You Can…     • Improve quality of service
                    • Extend the life of the cluster
    It’s Like…      • VMware vCenter

Authorization Manager
    It Helps You…   • Centralize management of all users, groups and privileges
                    • Manage permissions via delegated administration
    So You Can…     • Lower the costs of administration
                    • Improve compliance
    It’s Like…      • Teradata security administration




CDH Integrates with Existing IT Infrastructure

   BI/Analytics   ETL                   Databases                 Cloud/OS      Hardware




