Hortonworks & Systems Integrators
Mitch Ferguson
VP, Business Development

Rikin Shah
Dir, Field Engineering


September 5, 2012



© Hortonworks Inc. 2012         Page 1
Big data changes the game

[Figure: data volume (megabytes, gigabytes, terabytes, petabytes) plotted against increasing data variety and complexity, with regions labeled ERP, CRM, WEB and BIG DATA.]

Transactions + Interactions + Observations = BIG DATA

Data types shown in the figure: purchase detail, purchase record, payment record, segmentation, customer touches, support contacts, offer details, offer history, web logs, A/B testing, behavioral targeting, dynamic pricing, search marketing, affiliate networks, dynamic funnels, mobile web, sentiment, user click stream, SMS/MMS, speech to text, social interactions & feeds, spatial & GPS coordinates, sensors / RFID / devices, business data feeds, external demographics, user generated content, HD video, audio, images, product/service logs.


Hortonworks Snapshot
The industry-leading and only 100% open source Apache Hadoop distribution

•  Headquarters: Sunnyvale, CA
•  100+ employees
•  Formed with the core Apache Hadoop engineering team from Yahoo!
•  40+ engineers and architects, including 25+ Hadoop committers

Most experienced open source leadership team
   –  Rob Bearden: CEO (JBoss, SpringSource, i2, Oracle)
   –  Shaun Connolly: VP Strategy (VMware, SpringSource, Red Hat, JBoss)
   –  Mitch Ferguson: VP BD (SpringSource, VMware)
   –  John Kreisa: VP Marketing (Red Hat, Cloudera, MarkLogic, Business Objects)
   –  Ari Zilka: CPO (Terracotta, Accenture, Walmart.com)
   –  Greg Pavlik: VP Eng. (Oracle SOA & Integration platform)

Business model focused on customer success: Hadoop support, services & training
   –  Subscription support for Hortonworks Data Platform
   –  Training business: private and public classes available for developers & administrators



Hortonworks Business Strategy

  Enable the next-gen data management platform

• Accelerate the adoption of Apache Hadoop

• Create a vibrant ecosystem
   – ISVs, IHVs, Systems Integrators

• Provide world-class enterprise Support & Training




Hortonworks Vision & Role

                                We believe that by the end of 2015,
                                more than half the world's data will be
                                processed by Apache Hadoop.



  1       Be diligent stewards of the open source core

  2       Be tireless innovators beyond the core

  3       Provide robust data platform services & open APIs

  4       Enable the ecosystem at each layer of the stack

  5       Make the platform enterprise-ready & easy to use


Enabling Hadoop as Enterprise Big Data Platform



[Figure: Hortonworks Data Platform at the center, flanked by two columns of capabilities and a DEVELOPER layer.]

•  Left column: applications, business tools, development tools, open APIs and access, data movement & integration, data management systems, systems management
•  Right column: installation & configuration, administration, monitoring, high availability, replication, multi-tenancy, ..
•  DEVELOPER: Data Platform Services & Open APIs (metadata, indexing, search, security, management, data extract & load, APIs)




Hortonworks Partner Eco-System




Hortonworks & SIs
 Our business models are 100% complementary

• Systems Integrators are a cornerstone of our business model
• Enable high-value & repeatable solutions
• Leverage multi-party relationships to accelerate business

[Figure: relationship diagram with nodes labeled Systems Integrator and Customer.]

Why Hortonworks?
•  The most Apache Hadoop experience and expertise
   –  Reliable Hadoop from the experts, project leaders, architects and
      builders
   –  Collectively over 90 years of operational Hadoop experience
      (at least double that of the closest competitor)

•  Influence community direction
   –  Provides a direct connection to drive innovation in the community

•  Focus on the ecosystem
   –  Roadmap and vision to provide access to the wide ecosystem of
      enterprise applications, such as Teradata

•  Industry momentum
   –  Collaborate across partners (ISVs/IHVs/SIs) to enable high-value
      solutions



Hortonworks Apache Hadoop Leadership

Hortonworkers… the builders, operators and core architects of Apache Hadoop

•  Most experienced team running Hadoop in production at scale (> 5 years, 42000 nodes)

•  All “stable” releases of Apache Hadoop have been shipped by Hortonworkers

“We have noticed more activity over the last year from Hortonworks’ engineers on building out Apache Hadoop’s more innovative features. These include YARN, Ambari and HCatalog..”
                                   - Jeff Kelly: Wikibon

Leadership
•  VP and PMC of Hadoop: Arun Murthy
•  Core Architect of YARN: Arun Murthy
•  Core Architect of MapReduce2: Arun Murthy
•  VP & PMC of Pig: Daniel Dai
•  VP of ZooKeeper: Mahadev Konar
•  Inventor of HCatalog: Alan Gates
•  Project Lead for Ambari: Mahadev Konar
•  Original Project Lead: Eric Baldeschwieler


Hortonworks Data Platform




Hortonworks Data Platform

[Figure: Hortonworks Data Platform architecture. Corner labels: Operate, Develop, Interact, Integrate.]

•  Management & Monitoring Services (Ambari, ZooKeeper)
•  Workflow & Scheduling (Oozie)
•  Non-Relational Database (HBase)
•  Scripting (Pig)
•  Query (Hive)
•  Metadata Services (HCatalog)
•  Distributed Processing (MapReduce)
•  Distributed Storage (HDFS)
•  Data Integration Services (HCatalog APIs, WebHDFS, Talend Open Studio for Big Data, Sqoop, Flume)

Apache Hadoop Release Management

[Figure: Hadoop 1 release timeline showing 1.0.1, 1.0.2, 1.0.3 and HDP 1.0 on one line, with 1.1.1 and 1.1.2 on the line above.]

•    Apache Hadoop release management is run by Hortonworks
      •  Matt Foley, Release Manager for Hadoop 1
      •  Arun Murthy, Release Manager for Hadoop 2
      •  Ashutosh Chauhan, Release Manager for Hive
      •  Daniel Dai, Release Manager for Pig
      •  Alan Gates, Release Manager for HCatalog
•    Hadoop Core releases validated (and fixed) by Hortonworks
      •  ~1300 end-to-end system tests run in-house using our IP before any release can be made
•    Hortonworks Data Platform is released directly from Apache Hadoop branches


Full Stack High Availability




Full Stack High Availability

•  Failover and restart for
     •  NameNode
     •  JobTracker
     •  HBase and other services to come…

•  Open API allows use of proven HA from multiple vendors (Red Hat & VMware)

•  Minimized changes to clients and configuration

•  Complementary to 2.0 HA efforts

•  Server & operating system failure detection and VM restart

•  Smart resource management ensures sufficient resources are available to restart VMs

[Figure: HA cluster. A core switch feeds two rack switches; under each sits one half of the HA pairs: NameNode, JobTracker and other daemons, each with its own HA Manager.]

                            Addresses HA needs on stable Apache Hadoop 1.0

Capacity Scheduler Delivers Multi-tenancy

• Queue definition
  – % of total system memory
  – % CPU utilization (not slot count)
• Queues per team
  – Soft and hard limits, so you can use the entire cluster if available
  – Ownership and security built in
• Proactive resource management
  – Lots of rules and observation points
  – Don’t start another task if it will blow up the node
  – Don’t start another task if other workloads are spinning up
• Better than Fair + Preemption (HDP supports all)
  – Utilization is not measured by slot count (which can blow up a node /
    cluster)
  – Doesn’t start all tasks automatically (proactive vs. reactive)
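
The proactive checks listed above can be sketched in a few lines. This is a minimal illustration, not the Capacity Scheduler's actual code: the `Queue` and `Node` classes, their field names and the `can_start_task` function are all hypothetical, and only memory is modeled (CPU utilization is left out).

```python
from dataclasses import dataclass


@dataclass
class Queue:
    """A per-team queue; every field name here is hypothetical."""
    name: str
    soft_limit_pct: float   # guaranteed share of cluster memory
    hard_limit_pct: float   # ceiling the queue may grow to when the cluster is idle
    used_mb: int = 0


@dataclass
class Node:
    """A worker node, described only by its memory."""
    total_mb: int
    used_mb: int = 0


def can_start_task(queue, node, task_mb, cluster_mb, cluster_used_mb):
    """Decide before launch whether a task may start (proactive, not reactive)."""
    # Never start a task that would push the node past its physical memory.
    if node.used_mb + task_mb > node.total_mb:
        return False
    # Never let a queue grow past its hard limit.
    if queue.used_mb + task_mb > cluster_mb * queue.hard_limit_pct / 100:
        return False
    # Within the soft limit the queue is using its guaranteed share.
    if queue.used_mb + task_mb <= cluster_mb * queue.soft_limit_pct / 100:
        return True
    # Between soft and hard limits, borrow only capacity that is actually idle.
    return cluster_used_mb + task_mb <= cluster_mb
```

A queue under its soft limit is always admitted if the node has room; above it, the task starts only when the cluster as a whole still has free memory.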
HCatalog
                          METADATA




Metadata Services
Apache HCatalog provides flexible metadata
services across tools and external access
 •  Consistency of metadata and data models across tools
    (MapReduce, Pig, HBase and Hive)
 •  Accessibility: share data as tables in and out of HDFS
 •  Availability: enables flexible, thin-client access via REST API




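[Figure: HCatalog sits between raw Hadoop data and the tools that consume it.]
   •  Without HCatalog: raw Hadoop data; inconsistent, unknown; tool-specific access
   •  With HCatalog: table access, aligned metadata, REST API

Shared table and schema management opens the platform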
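
A thin client for the REST access described above might look like the sketch below. The deck does not give the endpoint, so the URL layout (`/templeton/v1/ddl/database/<db>/table/<table>`), the default port 50111 and the `columns` field of the response are assumptions taken from the WebHCat (Templeton) interface of later HCatalog releases; the host and user names are placeholders.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def describe_table_url(host, database, table, user, port=50111):
    """Build the 'describe table' URL (assumed WebHCat/Templeton layout)."""
    path = "/templeton/v1/ddl/database/{}/table/{}".format(
        quote(database, safe=""), quote(table, safe=""))
    return "http://{}:{}{}?{}".format(host, port, path, urlencode({"user.name": user}))


def column_names(describe_response_text):
    """Extract column names from the JSON body of a describe-table response."""
    doc = json.loads(describe_response_text)
    return [col["name"] for col in doc.get("columns", [])]


def describe_table(host, database, table, user):
    """Fetch a table's column names over HTTP; needs a reachable server."""
    with urlopen(describe_table_url(host, database, table, user)) as resp:
        return column_names(resp.read().decode("utf-8"))
```

Nothing here depends on Java or on Hadoop client libraries, which is the point of the thin-client bullet.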



Options Lead to Complexity
 Feature          MapReduce          Pig                                              Hive
 Record format    Key-value pairs    Tuple                                            Record
 Data model       User defined       int, float, string, bytes, maps, tuples, bags    int, float, string, maps, structs, lists
 Schema           Encoded in app     Declared in script or read by loader             Read from metadata
 Data location    Encoded in app     Declared in script                               Read from metadata
 Data format      Encoded in app     Declared in script                               Read from metadata


•    Pig and MR users need to know a lot to write their apps
•    When data schema, location, or format change Pig and MR apps must be
     rewritten, retested, and redeployed
•    Hive users have to load data from Pig/MR users to have access to it
Hadoop Ecosystem

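[Figure: Hive (SQL), Pig (scripting) and MapReduce (Java) each reach HDFS through their own interface.]

•  Hive: SQL interface; DDL goes to the metastore, DML goes through the SerDe interface and Input/OutputFormat
•  Pig: Load/Store interface over Input/OutputFormat
•  MapReduce: Input/OutputFormat directly
•  metastore: tables, partitions, files, types
•  HDFS: data nodes dn1, dn2, dn3 … dnN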



Opening up Metadata to MR & Pig

[Figure: Hive (SQL), Pig (scripting) and MapReduce (Java) over a shared HCat metadata layer.]

•  Hive: SQL interface
•  Pig: HCatLoad/Store interface
•  MapReduce: HCatInput/OutputFormat
•  HCat metadata layer: SerDe interface in front of the metastore and HDFS
•  metastore: tables, partitions, files, types
•  HDFS: data nodes dn1, dn2, dn3 … dnN



Tools With HCatalog
 Feature          MapReduce + HCatalog                        Pig + HCatalog                                   Hive
 Record format    Record                                      Tuple                                            Record
 Data model       int, float, string, maps, structs, lists    int, float, string, bytes, maps, tuples, bags    int, float, string, maps, structs, lists
 Schema           Read from metadata                          Read from metadata                               Read from metadata
 Data location    Read from metadata                          Read from metadata                               Read from metadata
 Data format      Read from metadata                          Read from metadata                               Read from metadata

•    Pig/MR users can read schema from metadata
•    Pig/MR users are insulated from schema, location, and format changes
•    All users have access to other users’ data as soon as it is committed
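
The insulation described above can be illustrated with a toy lookup. This is a sketch of the idea only: `METASTORE`, `read_table` and the `open_file` callback are hypothetical stand-ins, not the HCatalog API, and CSV stands in for whatever storage format the table really uses.

```python
import csv
import io

# Toy stand-in for the shared metastore: table name -> location, format, schema.
METASTORE = {
    "web_logs": {
        "location": "/data/raw/web_logs",
        "format": "csv",
        "delimiter": ",",
        "schema": [("user_id", int), ("url", str)],
    }
}


def read_table(table, open_file, metastore=METASTORE):
    """Yield records as dicts, taking schema, location and format from metadata.

    The caller names only the table; nothing about paths, delimiters or
    column types is encoded in the application.
    """
    meta = metastore[table]
    if meta["format"] != "csv":
        raise ValueError("unsupported format: " + meta["format"])
    with open_file(meta["location"]) as handle:
        for row in csv.reader(handle, delimiter=meta["delimiter"]):
            yield {name: cast(value)
                   for (name, cast), value in zip(meta["schema"], row)}


# Usage: an in-memory "file system" keeps the sketch self-contained.
files = {"/data/raw/web_logs": "1,/home\n2,/cart\n"}
records = list(read_table("web_logs", lambda path: io.StringIO(files[path])))
```

If the data moves or its delimiter changes, only the metastore entry is updated; callers of `read_table` stay as they are.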
Metadata Services



[Figure: existing infrastructure (applications, data stores, visualization) calls HCatalog over REST (ddl, dml); HCatalog issues DML to Hive, HBase and Pig in the Hadoop cluster, and create / describe operations to the metastore.]




Services Integration

Provides RESTful API as “front door” for Hadoop

•    Opens the door to languages other than Java

•    Thin clients via web services vs. fat clients in gateway

•    Insulation from interface changes release to release

[Figure: existing & new applications call WebHDFS and HCatalog RESTful web services; beneath them sit MapReduce, Pig and Hive over HCatalog, backed by HDFS, HBase and an external store.]

     Opens Hadoop to integration with existing and new applications
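
As an illustration of the "front door", a directory listing over WebHDFS needs only an HTTP client. The sketch below assumes the standard WebHDFS `LISTSTATUS` operation and the NameNode HTTP port 50070 used by Hadoop 1.x; the host, path and user values are placeholders, and the helper names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def liststatus_url(namenode, hdfs_path, user, port=50070):
    """Build a WebHDFS LISTSTATUS URL for an absolute HDFS path."""
    query = urlencode({"op": "LISTSTATUS", "user.name": user})
    return "http://{}:{}/webhdfs/v1{}?{}".format(namenode, port, quote(hdfs_path), query)


def entry_names(liststatus_response_text):
    """Pull the entry names out of a LISTSTATUS JSON response body."""
    doc = json.loads(liststatus_response_text)
    return [status["pathSuffix"] for status in doc["FileStatuses"]["FileStatus"]]


def list_directory(namenode, hdfs_path, user):
    """List a directory over HTTP; needs a reachable NameNode."""
    with urlopen(liststatus_url(namenode, hdfs_path, user)) as resp:
        return entry_names(resp.read().decode("utf-8"))
```

The same request can be issued from any language with an HTTP library, with no Hadoop client installed on the calling machine.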


Data Integration Services

•  Intuitive graphical data
   integration tools for HDFS,
   Hive, HBase, HCatalog and Pig

•  Oozie scheduling allows you to
   manage and stage jobs

•  Connectors for any database,
   business application or system

•  Integrated HCatalog storage

 Bridge the gap between
 legacy data & Hadoop

 Simplify and speed development

Teradata and Hortonworks Partner to Provide the First Enterprise Reference Architecture for Hadoop and Big Data

        Partnership provides clear path to enterprise for Hadoop
    •  Reference architecture that provides guidance on best applications
         for Teradata, Teradata Aster, and Hadoop

    •  Clear partnership between industry and community leaders

    •  Deeper integration to ease data movement in/out of Hadoop

    •  Joint R&D and go-to-market


Ambari
Cluster Provisioning
Configuration Management
Monitoring




Ambari Architecture
•  Installs your cluster onto target HW for you

•  Manage, reconfigure from one place

•  Monitor key and meaningful Hadoop metrics, not just OS / HW

•  Scalable in line w/ Hadoop itself

[Figure: the operator views a PHP portal on the Ambari controller (Nagios, Ganglia, Puppet); the controller manages the Hadoop worker nodes n1 … nN, each running Puppet, Nagios and Ganglia; a "data and task sink" is also shown.]



Ambari
                          Live Demonstration




Why HDP?

ONLY Hortonworks Data Platform provides…
•  Tight alignment with the core Apache Hadoop development line
   - Reduces risk for customers who may add custom coding or projects

•  Enterprise integration
  - HCatalog provides a scalable, extensible integration point to Hadoop data

•  The most reliable Hadoop distribution
  - Full stack high availability on v1 delivers the strongest SLA guarantees

•  Multi-tenant scheduling and resource management
  - Capacity and fair scheduling optimize cluster resources

•  Integration with operations that eases cluster management
  - Ambari is the most open/complete operations platform for Hadoop clusters




Hortonworks Support Subscriptions
Objective: help organizations to successfully develop
and deploy solutions based upon Apache Hadoop
• Full-lifecycle technical support available
  – Developer support for design, development and POCs
  – Production support for staging and production environments
       – Up to 24x7 with 1-hour response times

• Delivered by the Apache Hadoop experts
  – Backed by development team that has released every major
    version of Apache Hadoop since 0.1

• Forward-compatibility
  – Hortonworks’ leadership role helps ensure bug fixes and patches
    can be included in future versions of Hadoop projects



Hortonworks Training
Objective: help organizations overcome Hadoop
knowledge gaps
• Expert role-based training for developers,
  administrators & data analysts
  – Heavy emphasis on hands-on labs
  – Extensive schedule of public training courses available
    (hortonworks.com/training)

• Comprehensive certification programs



• Customized, on-site courses available

Thank You!
Questions & Answers




                              Page 33
    © Hortonworks Inc. 2012

More Related Content

What's hot

Data Discovery, Visualization, and Apache Hadoop
Data Discovery, Visualization, and Apache HadoopData Discovery, Visualization, and Apache Hadoop
Data Discovery, Visualization, and Apache Hadoop
Hortonworks
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack Europe
Hortonworks
 
Apache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudApache Hadoop on the Open Cloud
Apache Hadoop on the Open Cloud
Hortonworks
 
Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2
Hortonworks
 
YARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopYARN: Future of Data Processing with Apache Hadoop
YARN: Future of Data Processing with Apache HadoopHortonworks
 
Hortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinarHortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinar
Hortonworks
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
Hortonworks
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
Hortonworks
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Hortonworks
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Hortonworks
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
Hortonworks
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Hortonworks
 
Software Architecture and Predictive Models in R
Software Architecture and Predictive Models in RSoftware Architecture and Predictive Models in R
Software Architecture and Predictive Models in R
Harlan Harris
 
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Hortonworks
 
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
Hortonworks
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Hortonworks
 
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Hortonworks
 
Hortonworks roadshow
Hortonworks roadshowHortonworks roadshow
Hortonworks roadshowAccenture
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
Hortonworks
 
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Hortonworks
 

Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx

  • 1. Hortonworks & Systems Integrators Mitch Ferguson VP, Business Development Rikin Shah Dir, Field Engineering September 5, 2012 © Hortonworks Inc. 2012 Page 1
  • 2. Big data changes the game
      Transactions + Interactions + Observations = BIG DATA
      [Figure: data volume (megabytes, gigabytes, terabytes, petabytes) against increasing data variety and complexity, with the tiers ERP, CRM, WEB and BIG DATA.]
      Labels in the figure: Mobile Web; Sentiment; User Click Stream; SMS/MMS; Speech to Text; Social Interactions & Feeds; Web logs; Spatial & GPS Coordinates; A/B testing; Sensors / RFID / Devices; Behavioral Targeting; Business Data Feeds; Dynamic Pricing; Segmentation; External Demographics; Search Marketing; Customer Touches; User Generated Content; Affiliate Networks; Purchase detail; Support Contacts; HD Video, Audio, Images; Dynamic Funnels; Purchase record; Offer details; Offer history; Product/Service Logs; Payment record
  • 3. Hortonworks Snapshot
      The industry-leading and only 100% open source Apache Hadoop distribution
      • Headquarters: Sunnyvale, CA
      • 100+ employees
      • Formed with core Apache Hadoop engineering team from Yahoo!
      • 40+ engineers and architects, including 25+ Hadoop committers
      Most experienced open source leadership team
      – Rob Bearden – CEO (JBoss, SpringSource, i2, Oracle)
      – Shaun Connolly – VP Strategy (VMware, SpringSource, Red Hat, JBoss)
      – Mitch Ferguson – VP BD (SpringSource, VMware)
      – John Kreisa – VP Marketing (Red Hat, Cloudera, MarkLogic, Business Objects)
      – Ari Zilka – CPO (Terracotta, Accenture, Walmart.com)
      – Greg Pavlik – VP Eng. (Oracle SOA & Integration platform)
      Business model focused on customer success: Hadoop support, services & training
      – Subscription support for Hortonworks Data Platform
      – Training business: private and public classes available for developers & administrators
  • 4. Hortonworks Business Strategy
      Enable the next-gen data management platform
      • Accelerate the adoption of Apache Hadoop
      • Create a vibrant ecosystem: ISVs, IHVs, Systems Integrators
      • Provide world-class enterprise Support & Training
  • 5. Hortonworks Vision & Role
      We believe that by the end of 2015, more than half the world's data will be processed by Apache Hadoop.
      1. Be diligent stewards of the open source core
      2. Be tireless innovators beyond the core
      3. Provide robust data platform services & open APIs
      4. Enable the ecosystem at each layer of the stack
      5. Make the platform enterprise-ready & easy to use
  • 6. Enabling Hadoop as Enterprise Big Data Platform
      [Diagram: Hortonworks Data Platform at the center, with the label DEVELOPER and the surrounding groups below.]
      • Applications, Business Tools, Development Tools, Open APIs and access
      • Data Movement & Integration, Data Management Systems, Systems Management
      • Installation & Configuration, Administration, Monitoring, High Availability, Replication, Multi-tenancy, ..
      • Data Platform Services & Open APIs: Metadata, Indexing, Search, Security, Management, Data Extract & Load, APIs
  • 7. Hortonworks Partner Eco-System
  • 8. Hortonworks & SIs
      Our business models are 100% complementary
      • Systems Integrators are a cornerstone of our business model
      • Enable high-value & repeatable solutions
      • Leverage multi-party relationships to accelerate business
      [Diagram labels: Systems Integrator, Customer]
  • 9. Why Hortonworks?
      • The most Apache Hadoop experience and expertise
          – Reliable Hadoop from the experts, project leaders, architects and builders
          – Collectively over 90 years of operational Hadoop experience (at least double that of the closest competitor)
      • Influence community direction
          – Provides a direct connection to drive innovation in the community
      • Focus on the ecosystem
          – Roadmap and vision to provide access to the wide ecosystem of enterprise applications, such as Teradata
      • Industry momentum
          – Collaborate across partners (ISVs/IHVs/SIs) to enable high-value solutions
  • 10. Hortonworks Apache Hadoop Leadership
      Hortonworkers… the builders, operators and core architects of Apache Hadoop
      • Most experienced team running Hadoop in production at scale (> 5 years, 42000 nodes)
      • All “stable” releases of Apache Hadoop have been shipped by Hortonworkers
      “We have noticed more activity over the last year from Hortonworks’ engineers on building out Apache Hadoop’s more innovative features. These include YARN, Ambari and HCatalog..” - Jeff Kelly: Wikibon
      Leadership
      • VP and PMC of Hadoop: Arun Murthy
      • Core Architect of YARN: Arun Murthy
      • Core Architect of MapReduce2: Arun Murthy
      • VP & PMC of Pig: Daniel Dai
      • VP of Zookeeper: Mahadev Konar
      • Inventor of HCatalog: Alan Gates
      • Project Lead for Ambari: Mahadev Konar
      • Original Project Lead: Eric Baldeschwieler
  • 11. Hortonworks Data Platform
  • 12. Hortonworks Data Platform
      [Diagram: platform stack, with the side labels Develop, Interact, Operate and Integrate.]
      • Scripting (Pig)
      • Query (Hive)
      • Non-Relational Database (HBase)
      • Metadata Services (HCatalog)
      • Workflow & Scheduling (Oozie)
      • Management & Monitoring Services (Ambari, Zookeeper)
      • Data Integration Services (HCatalog APIs, WebHDFS, Talend Open Studio for Big Data, Sqoop, Flume)
      • Distributed Processing (MapReduce)
      • Distributed Storage (HDFS)
  • 13. Apache Hadoop Release Management
      [Timeline labels: Hadoop 1 releases 1.0.1, 1.0.2, 1.0.3, 1.1.1, 1.1.2; HDP 1.0]
      • Apache Hadoop release management is run by Hortonworks
          – Matt Foley, Release Manager for Hadoop 1
          – Arun Murthy, Release Manager for Hadoop 2
          – Ashutosh Chauhan, Release Manager for Hive
          – Daniel Dai, Release Manager for Pig
          – Alan Gates, Release Manager for HCatalog
      • Hadoop Core releases validated (and fixed) by Hortonworks
          – ~1300 end-to-end system tests run in house using our IP before any release can be made
      • Hortonworks Data Platform is released directly from Apache Hadoop branches
  • 14. Full Stack High Availability
  • 15. Full Stack High Availability
      • Failover and restart for
          – NameNode
          – JobTracker
          – HBase and other services to come…
      • Open API allows use of proven HA from multiple vendors (Red Hat & VMware)
      • Minimized changes to clients and configuration
      • Complementary to 2.0 HA efforts
      • Server & operating system failure detection and VM restart
      • Smart resource management ensures sufficient resources are available to restart VMs
      Addresses HA needs on stable Apache Hadoop 1.0
      [Diagram: HA cluster with a core switch and two rack switches; HA pairs of NameNode, JobTracker and other daemons, each with an HA Manager.]
  • 16. Capacity Scheduler Delivers Multi-tenancy
      • Queue definition
          – % of total system memory
          – % CPU utilization (not slot count)
      • Queues per team
          – Soft and hard limits, so you can use the entire cluster if available
          – Ownership and security built in
      • Proactive resource management
          – Lots of rules and observation points
          – Don’t start another task if it will blow up the node
          – Don’t start another task if other workloads are spinning up
      • Better than Fair + Preemption (HDP supports all)
          – Utilization not measured by slot count (can blow up a node / cluster)
          – Doesn’t start all tasks automatically (proactive vs. reactive)
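The soft and hard queue limits on slide 16 can be pictured with a small toy model. This is only a sketch under stated assumptions: the `Queue` record, its field names and the `admit` function are hypothetical and are not the Capacity Scheduler's real configuration or code. It assumes each queue has a guaranteed share (soft limit) and a ceiling (hard limit), both as whole-number percentages of cluster memory.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    """Hypothetical per-team queue; all field names are illustrative."""
    name: str
    guaranteed_pct: int  # soft limit: share the team is entitled to
    max_pct: int         # hard limit: ceiling even on an idle cluster
    used_pct: int = 0    # share of cluster memory currently in use


def admit(queue: Queue, task_pct: int, cluster_free_pct: int) -> str:
    """Decide whether a task needing task_pct of cluster memory may start."""
    after = queue.used_pct + task_pct
    if after > queue.max_pct:
        return "deny"              # never exceed the hard limit
    if task_pct > cluster_free_pct:
        return "deny"              # proactive: do not oversubscribe the cluster
    if after <= queue.guaranteed_pct:
        return "within-guarantee"  # inside the team's own share
    return "borrowing"             # past the soft limit, using spare capacity


analytics = Queue("analytics", guaranteed_pct=30, max_pct=60, used_pct=25)
print(admit(analytics, task_pct=20, cluster_free_pct=50))
```

The point of the two limits is that a team can grow past its guaranteed share while the cluster has room, but is stopped at the ceiling regardless of how idle the cluster is.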
  • 17. HCatalog METADATA
  • 18. Metadata Services
      Apache HCatalog provides flexible metadata services across tools and external access
      • Consistency of metadata and data models across tools (MapReduce, Pig, HBase and Hive)
      • Accessibility: share data as tables in and out of HDFS
      • Availability: enables flexible, thin-client access via REST API
      Shared table and schema management opens the platform
      [Diagram: without HCatalog: raw Hadoop data; inconsistent, unknown; tool-specific access. With HCatalog: table access; aligned metadata; REST API.]
  • 19. Options Lead to Complexity
      Feature       | MapReduce       | Pig                                           | Hive
      Record format | Key value pairs | Tuple                                         | Record
      Data model    | User defined    | int, float, string, bytes, maps, tuples, bags | int, float, string, maps, structs, lists
      Schema        | Encoded in app  | Declared in script or read by loader          | Read from metadata
      Data location | Encoded in app  | Declared in script                            | Read from metadata
      Data format   | Encoded in app  | Declared in script                            | Read from metadata
      • Pig and MR users need to know a lot to write their apps
      • When data schema, location, or format change, Pig and MR apps must be rewritten, retested, and redeployed
      • Hive users have to load data from Pig/MR users to have access to it
  • 20. Hadoop Ecosystem
      [Diagram: three tools over HDFS data nodes dn1, dn2, dn3 … dnN.]
      • Hive (SQL). Interface: SQL (DDL, DML), SerDe, Input/OutputFormat; uses the metastore (tables, partitions, files, types)
      • Pig (scripting). Interface: Load/Store, Input/OutputFormat
      • MapReduce (Java). Interface: Input/OutputFormat
  • 21. Opening up Metadata to MR & Pig
      [Diagram: an HCat metadata layer sits between the three tools and the metastore (tables, partitions, files, types), over HDFS data nodes dn1, dn2, dn3 … dnN.]
      • Hive (SQL). Interface: SQL, SerDe
      • Pig (scripting). Interface: HCatLoad/Store
      • MapReduce (Java). Interface: HCatInput/OutputFormat
  • 22. Tools With HCatalog
      Feature       | MapReduce + HCatalog                     | Pig + HCatalog                                | Hive
      Record format | Record                                   | Tuple                                         | Record
      Data model    | int, float, string, maps, structs, lists | int, float, string, bytes, maps, tuples, bags | int, float, string, maps, structs, lists
      Schema        | Read from metadata                       | Read from metadata                            | Read from metadata
      Data location | Read from metadata                       | Read from metadata                            | Read from metadata
      Data format   | Read from metadata                       | Read from metadata                            | Read from metadata
      • Pig/MR users can read schema from metadata
      • Pig/MR users are insulated from schema, location, and format changes
      • All users have access to other users’ data as soon as it is committed
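The insulation claimed on slide 22 can be illustrated with a toy lookup. This is a minimal sketch and not the HCatalog API: the `METASTORE` dictionary, the table name `web_logs`, the paths and both function names are hypothetical, standing in for a shared metadata service that jobs consult at run time.

```python
# Toy stand-in for a shared metastore. This is NOT the HCatalog API;
# the table name, paths and field names are hypothetical.
METASTORE = {
    "web_logs": {
        "location": "/data/raw/web_logs",
        "format": "text",
        "schema": [("ts", "string"), ("url", "string"), ("status", "int")],
    }
}


def describe(table):
    """Look up where a table lives, how it is stored, and its column names."""
    entry = METASTORE[table]
    columns = [name for name, _type in entry["schema"]]
    return entry["location"], entry["format"], columns


def plan_read(table):
    """What a job would do: resolve everything from metadata at run time."""
    location, fmt, columns = describe(table)
    return f"read {', '.join(columns)} from {location} as {fmt}"


before = plan_read("web_logs")

# An administrator moves the data and changes its storage format.
METASTORE["web_logs"]["location"] = "/data/warehouse/web_logs"
METASTORE["web_logs"]["format"] = "rcfile"

after = plan_read("web_logs")  # same job code, new location and format
print(before)
print(after)
```

Because the job names only the table, the move and the format change require no edit to `plan_read`; with location and format encoded in the application, as in the MapReduce column of slide 19, the same change would mean rewriting and redeploying the job.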
  • 23. Metadata Services
      [Diagram: applications, data stores and visualization tools connect to HCatalog through REST (ddl and dml; create, describe); Hive, Pig and HBase connect through DML; HCatalog uses the existing metastore on the Hadoop infrastructure cluster.]
  • 24. Services Integration
      Provides RESTful API as “front door” for Hadoop
      • Opens the door to languages other than Java
      • Thin clients via web services vs. fat clients in gateway
      • Insulation from interface changes release to release
      Opens Hadoop to integration with existing and new applications
      [Diagram: Existing & New Applications call RESTful Web Services (WebHDFS, HCatalog), which front MapReduce, Pig, Hive and HCatalog over HDFS, HBase and an External Store.]
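As a rough illustration of the thin-client idea on slide 24, the sketch below only builds a WebHDFS request URL; it makes no network call. The URL layout (`/webhdfs/v1/<path>?op=...`) and the `LISTSTATUS` operation come from the WebHDFS REST API. The host name, the port 50070, the path, the user name and the helper `webhdfs_url` are assumptions chosen for the example.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path.

    Hypothetical helper: it only formats the URL, it does not contact a cluster.
    """
    if not path.startswith("/"):
        raise ValueError("path must be absolute, e.g. /data/web_logs")
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Host, port, path and user name below are placeholders for the example.
url = webhdfs_url(
    "namenode.example.com", 50070, "/data/web_logs",
    "LISTSTATUS", **{"user.name": "alice"},
)
print(url)
# http://namenode.example.com:50070/webhdfs/v1/data/web_logs?op=LISTSTATUS&user.name=alice
```

Any HTTP client in any language can issue a GET against such a URL, which is what lets applications outside the Java ecosystem reach HDFS without a Hadoop client library installed.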
  • 25. Data Integration Services
      • Intuitive graphical data integration tools for HDFS, Hive, HBase, HCatalog and Pig
      • Oozie scheduling allows you to manage and stage jobs
      • Connectors for any database, business application or system
      • Integrated HCatalog storage
      Bridge the gap between legacy data & Hadoop
      Simplify and speed development
  • 26. Teradata and Hortonworks Partner to Provide the First Enterprise Reference Architecture for Hadoop and Big Data
      Partnership provides clear path to enterprise for Hadoop
      • Reference architecture that provides guidance on best applications for Teradata, Teradata Aster, and Hadoop
      • Clear partnership between industry and community leaders
      • Deeper integration to ease data movement in/out of Hadoop
      • Joint R&D and go-to-market
  • 28. Ambari Architecture
      • Installs your cluster onto target HW for you
      • Manage, reconfigure from one place
      • Monitor key and meaningful Hadoop metrics, not just OS / HW
      • Scalable in line w/ Hadoop itself
      [Diagram: an operator views the Ambari controller (PHP portal, Puppet, Nagios, Ganglia); worker nodes n1, n2, n3 … nN run Hadoop with Puppet, Nagios and Ganglia; further labels: data and task, sink.]
  • 29. Ambari Live Demonstration
  • 30. Why HDP? ONLY Hortonworks Data Platform provides…
      • Tightly aligned to core Apache Hadoop development line
          – Reduces risk for customers who may add custom coding or projects
      • Enterprise integration
          – HCatalog provides scalable, extensible integration point to Hadoop data
      • Most reliable Hadoop distribution
          – Full stack high availability on v1 delivers the strongest SLA guarantees
      • Multi-tenant scheduling and resource management
          – Capacity and fair scheduling optimizes cluster resources
      • Integration with operations, eases cluster management
          – Ambari is the most open/complete operations platform for Hadoop clusters
  • 31. Hortonworks Support Subscriptions
      Objective: help organizations to successfully develop and deploy solutions based upon Apache Hadoop
      • Full-lifecycle technical support available
          – Developer support for design, development and POCs
          – Production support for staging and production environments
          – Up to 24x7 with 1-hour response times
      • Delivered by the Apache Hadoop experts
          – Backed by development team that has released every major version of Apache Hadoop since 0.1
      • Forward-compatibility
          – Hortonworks’ leadership role helps ensure bug fixes and patches can be included in future versions of Hadoop projects
  • 32. Hortonworks Training
      Objective: help organizations overcome Hadoop knowledge gaps
      • Expert role-based training for developers, administrators & data analysts
          – Heavy emphasis on hands-on labs
          – Extensive schedule of public training courses available (hortonworks.com/training)
      • Comprehensive certification programs
      • Customized, on-site courses available
  • 33. Thank You! Questions & Answers