The document discusses Hortonworks and its strategy for supporting Apache Hadoop. Hortonworks aims to make Hadoop easy to use and deployable at enterprise scale, offering the Hortonworks Data Platform, training, support subscriptions, and consulting services to help organizations adopt Hadoop. Its goal is to establish Hadoop as the next-generation data platform so that more of the world's data is processed with Apache Hadoop.
Razorfish Multi-Channel Marketing: Better Customer Segmentation and Targeting (Teradata Aster)
Matt Comstock, Vice President Business Intelligence Office, Razorfish, presents at the Big Analytics 2012 Roadshow.
From search to email to social, customers interact with your brand across a variety of channels. But what do people do once they view an advertisement or receive an email? What common behaviors do they display once they're on your site? By combining media exposure and behavior, site-side media, and in-store purchase data, you can better understand the impact media has on driving value to your business. Come to this session to learn how data-driven multi-channel analysis lets you see what consumers do before they become customers and understand which content influences which segments of users by media audience. Discover new segmentation and targeting strategies to improve engagement with your brand and increase advertising lift. See how a leader in digital marketing uses a combination of technologies, including Teradata Aster, Hadoop, and Amazon Web Services, to handle big data and deliver big analytics that improve business value.
Introduction to SQL Server Master Data Services (Eduardo Castro)
In this presentation we give an introduction to SQL Server 2008 R2 Master Data Services.
Regards,
Ing. Eduardo Castro Martínez, PhD – Microsoft SQL Server MVP
http://mswindowscr.org
http://comunidadwindows.org
Costa Rica
This session gives a brief introduction to Fusion Applications and then discusses some highlights of the Fusion MDM for Customer application.
To Each Their Own: How to Solve Analytic Complexity (Inside Analysis)
The Briefing Room with Shawn Rogers and Noetix
Slides from the Live Webcast on Aug. 14, 2012
One size will never fit all in the complex world of information management. In fact, the variety of information systems in use continues to expand. That includes all kinds of systems: data-producing applications, data-processing apps, and the downstream tools used for reporting and analytics. How can data-savvy organizations stay ahead of the curve?
Check out this episode of The Briefing Room to learn from Analyst Shawn Rogers of Enterprise Management Associates, who will explain how effective use of standard data models can solve the complexity of increasingly heterogeneous information architectures. Rogers will be briefed by Daryl Orts of Noetix who will tout his company’s wide range of industry and application-specific data models which can be used to satisfy the particular needs of today’s diverse user community.
For more information, visit: http://www.insideanalysis.com
The Business Data Catalog (BDC) provides a method of integrating business data from back-end server applications, such as SAP, Siebel, or other line-of-business applications, into Microsoft Office SharePoint Server 2007 without writing any code. Business intelligence with Office SharePoint Server 2007 provides a framework for accessing that data and a toolset that lets business decision makers turn raw data into critical business information.
This presentation is designed to provide the audience with an overview of how BI can be used to create visual dashboards that assemble and display business information from multiple sources (e.g. Excel Services, SQL Reporting) using built-in web parts.
This is a business session and does not cover the technical implementation of BDC.
Scaling MySQL: Benefits of Automatic Data Distribution (ScaleBase)
In this webinar we cover how ScaleBase provides transparent data distribution to its clients: overcoming the usual caveats, hiding the complexity involved in distributing data, and keeping the distribution invisible to the application.
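ScaleBase's internals aren't described here; as a general illustration of how a routing layer can make hash-based data distribution transparent to the application, here is a minimal Python sketch (the shard names and key scheme are hypothetical):

```python
import hashlib

# Hypothetical shard endpoints; a real router would hold live DB connections.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]

def shard_for(key: str) -> str:
    """Map a shard key to a shard deterministically via a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The router hides the distribution: the application issues the same call
# for any key and never needs to know which shard holds the row.
target = shard_for("user:42")
```

Because the hash is stable, every query for the same key lands on the same shard, which is what lets the layer sit transparently between the application and the distributed data.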
Envision IT Seminar Presentation - Microsoft Office 365 (Envision IT)
On May 5, 2011, Envision IT, Leaders in SharePoint Solutions, presented an introductory seminar on BPOS and Microsoft Office 365 at Microsoft Canada headquarters. Visit our website at www.envisionit.com for more details.
Operational BI addresses the decision-making challenges organizations face today by enabling the exchange of information between decision makers and alerting the right people at the right time, transcending organizational boundaries.
Database Architechs has been a database-focused consulting company for 17 years, bringing you the most skilled and experienced data and database experts with a wide variety of service offerings covering all database- and data-related aspects.
Talend Open Studio and Hortonworks Data Platform (Hortonworks)
Data integration is a key step in a Hadoop solution architecture. It is the first obstacle encountered once your cluster is up and running: OK, I have a cluster…now what? Complex scripts? For wide-scale adoption of Apache Hadoop, an intuitive set of tools that abstracts away the complexity of integration is necessary.
The Next Generation of Big Data Analytics (Hortonworks)
Apache Hadoop has evolved rapidly to become a leading platform for managing and processing big data. If your organization is examining how you can use Hadoop to store, transform, and refine large volumes of multi-structured data, please join us for this session, where we will discuss: the emergence of "big data" and opportunities for deriving business value; the evolution of Apache Hadoop and future directions; essential components required in a Hadoop-powered platform; and solution architectures that integrate Hadoop with existing data discovery and data warehouse platforms.
Introduction to Hortonworks Data Platform for Windows (Hortonworks)
According to IDC, Windows Server runs on more than 50% of the servers in the enterprise data center. Hortonworks has worked closely with Microsoft to port Apache Hadoop to Windows, enabling organizations to take advantage of this emerging Big Data technology. Join us in this informative webinar to hear about the new Hortonworks Data Platform for Windows.
In less than an hour, you’ll learn:
-Key capabilities available in Hortonworks Data Platform for Windows
-How HDP for Windows integrates with Microsoft tools
-Key workloads and use cases driving Hadoop today
Big Data, Hadoop, Hortonworks and Microsoft HDInsight (Hortonworks)
Big Data is everywhere. And at the center of the big data discussion is Apache Hadoop, a next-generation enterprise data platform that allows you to capture, process and share the enormous amounts of new, multi-structured data that doesn’t fit into traditional systems.
With Microsoft HDInsight, powered by Hortonworks Data Platform, you can bridge this new world of unstructured content with the structured data we manage today. Together, we bring Hadoop to the masses as an addition to your current enterprise data architectures so that you can amass net new insight without net new headache.
Break Through the Traditional Advertisement Services with Big Data and Apache... (Hortonworks)
Entravision Communications Corporation (NYSE: EVC) is a diversified Spanish-language media company with a unique group of media assets including television stations, radio stations and digital platforms. In 2011, it made the strategic decision to build a data analytics, modeling and insights division to expand the value of its traditional advertisement services. Join us in this session with Franklin Rios, President of Luminar (an Entravision company), and Oscar Padilla, VP of Strategy, Luminar, along with Impetus and Hortonworks, as we discuss key implementations, results and lessons learned from their big data services operations.
The Briefing Room with Mark Madsen and Hortonworks
Slides from the Live Webcast on Oct. 16, 2012
The power of Hadoop cannot be denied, as evidenced by the fact that all the biggest closed-source vendors in the world of data management have embraced this open-source project with virtually open arms. But Hadoop is not a data warehouse, nor is it ever likely to be. Rather, its ideal role for now is to augment traditional data warehousing and business intelligence. As an adjunct, Hadoop provides an amazing mechanism for storing and analyzing Big Data. The key is to manage expectations and move forward carefully.
Check out this episode of The Briefing Room to hear veteran Analyst Mark Madsen of Third Nature, who will explain how, where, when and why to leverage the open-source elephant in the enterprise. He'll be briefed by Jim Walker of Hortonworks who will tout his company's vision for the future of Big Data management. He'll provide details on their data platform and how it can be used to complete the picture of information management. He'll also discuss how the Hortonworks partner network can help companies get big value from Big Data.
Visit: http://www.insideanalysis.com
“Apache Hadoop, Now and Beyond”, Jim Walker, Director of Product Marketing, Hortonworks
Hadoop is an open source project that allows you to gain insight from massive amounts of structured and unstructured data quickly and without significant investment. It is shifting the way many traditional organizations think about analytics and business models. While it is designed to take advantage of cheap commodity hardware, it is also a natural fit for the cloud, as it is built to scale up or down without system interruption. In this presentation, Jim Walker will provide an overview of Apache Hadoop and its current state of adoption in and out of the cloud.
The Comprehensive Approach: A Unified Information Architecture (Inside Analysis)
The Briefing Room with Richard Hackathorn and Teradata
Slides from the Live Webcast on May 29, 2012
The worlds of Business Intelligence (BI) and Big Data Analytics can seem at odds, but only because we have yet to fully experience a comprehensive approach to managing big data: a Unified Big Data Architecture. The dynamics continue to change as vendors begin to emphasize the importance of leveraging SQL, engineering and operational skills, as well as incorporating novel uses of MapReduce to improve distributed analytic processing.
Register for this episode of The Briefing Room to learn the value of taking a strategic approach for managing big data from veteran BI and data warehouse consultant Richard Hackathorn. He'll be briefed by Chris Twogood of Teradata, who will outline his company's recent advances in bridging the gap between Hadoop and SQL to unlock deeper insights and explain the role of Teradata Aster and SQL-MapReduce as a Discovery Platform for Hadoop environments.
For more information visit: http://www.insideanalysis.com
Watch us on YouTube: http://www.youtube.com/playlist?list=PL5EE76E2EEEC8CF9E
Trending use cases have pointed out the complementary nature of Hadoop and existing data management systems—emphasizing the importance of leveraging SQL, engineering, and operational skills, as well as incorporating novel uses of MapReduce to improve distributed analytic processing. Many vendors have provided interfaces between SQL systems and Hadoop but have not been able to semantically integrate these technologies while Hive, Pig and SQL processing islands proliferate. This session will discuss how Teradata is working with Hortonworks to optimize the use of Hadoop within the Teradata Analytical Ecosystem to ingest, store, and refine new data types, as well as exciting new developments to bridge the gap between Hadoop and SQL to unlock deeper insights from data in Hadoop. The use of Teradata Aster as a tightly integrated SQL-MapReduce® Discovery Platform for Hadoop environments will also be discussed.
Karya develops mobile application services that fit the unique needs of your business. Our mobile application services help users better utilize the power of mobile technology.
Introducing the Big Data Ecosystem with Caserta Concepts & Talend (Caserta)
In this one-hour webinar, Caserta Concepts and Talend described an approach to building an architectural framework and roadmap for extending a traditional enterprise data warehouse environment into a Big Data ecosystem.
They illustrated the architectural components involved in collecting, analyzing and delivering Big Data, with a focus on the importance of Hadoop, Data Integration, Machine Learning, NoSQL, Business Intelligence and Analytics.
Attendees learned:
Which Big Data technologies can’t be ignored
Considerations when extending the data ecosystem
What happens to your existing investment
What are the points of integration
Does Big Data = better data?
To access the recorded webinar or to learn more, visit http://www.casertaconcepts.com/.
FinOps Data (FR), by Matthieu Rousseau & Ismael Goulani
Matthieu Rousseau, CEO & Data Engineer, Modeo.
Ismael Goulani, CTO & Data Engineer, Modeo.
A look back at the first prize in the "Innovative Solution" category of the #LaNuitdelaData challenge, won with their solution Stach, a platform that helps data teams better understand how their data is used by consumers, what it costs, and its carbon footprint.
Dremio: a simple, high-performance architecture for your data lakehouse.
In the world of data, Dremio defies classification! It is at once a data-delivery platform, a powerful SQL engine built on Apache Arrow, Apache Calcite and Apache Parquet, an active data catalog, and an open data lakehouse! After getting acquainted with the platform, we will look at how Dremio helps organizations meet their data management and governance challenges, making it easier to run their analytics in the cloud (and/or on-premises) without the cost, complexity and lock-in of data warehouses.
Tomer Shiran is the founder and Chief Product Officer (CPO) of Dremio. Tomer was the fourth employee and VP of Product at MapR, a pioneer of big data analytics. He has also held numerous product management and engineering positions at IBM Research and Microsoft, and founded several websites that served millions of users. He holds a Master's in computer engineering from Carnegie Mellon University and a Bachelor of Science in computer science from the Technion - Israel Institute of Technology.
The Modern Data Stack meetup is delighted to welcome Tomer Shiran. From Apache Drill and Apache Arrow to, now, Apache Iceberg, he and his teams anchor Dremio's choices in a vision of an "open" data platform built on open-source technologies. Beyond these values, which spare customers lock-in to proprietary formats, he is also mindful of the costs such platforms incur. He also champions features that transform data management, through initiatives such as Nessie, which opens the road to Data as Code and multi-process transactions.
The Modern Data Stack Meetup gives Tomer Shiran carte blanche to share his experience and his vision of the Open Data Lakehouse.
Hadoop meetup: HUG FR - Building the fastest cluster for data analysis... (Modern Data Stack France)
Building the fastest cluster for data analysis: benchmarks on a regressor, by Christopher Bourez (Axa Global Direct)
The latest parallel-computing technologies make it possible to compute prediction models over big data in record time. The cloud eases access to modern hardware configurations, with the option of ephemeral scalability during the computation. Benchmarks were run on several hardware configurations, ranging from a single instance to a cluster of 100 instances.
Christopher Bourez, developer & manager, expert in modern information systems at Axa Global Direct. Alien thinker. Blog: http://christopher5106.github.io/
HUG France Feb 2016 - Migrating structured data between Hadoop and RDBMS... (Modern Data Stack France)
Migrating structured data between Hadoop and RDBMS, by Louis Rabiet (Squid Solution)
By extracting data stored in a relational database with an advanced BI tool and sending it via Kafka to Tachyon, several Spark sessions can work on the same dataset while limiting duplication. This yields cost-controlled communication between the source database and Spark, which makes it possible to dynamically reintroduce modified data with MLlib while working on up-to-date data. Preliminary results will be shared during this presentation.
A product recommendation system for an e-commerce site, by Koby KARP, Data Scientist (Equancy) & Hervé MIGNOT, Partner at Equancy
Recommendation remains a key tool for personalizing e-commerce sites, and the subject is far from exhausted. Taking the particularities of a market into account may require adapting the processing and the algorithms used. After a review of recommendation techniques, we will present the specific approach we adopted. The system was developed on Spark, both for data preparation and for computing the recommendation models. A simple API and its service were developed to deliver the recommendations to client applications.
The Model as Code approach, by Benoit Grossin (EDF-R&D) and Matthieu Vautrot (Quantmetry)
Putting models into production is a pivotal stage in the life cycle of a data science project carried out within a company.
We observe that this part is still rarely industrialized, even though it is essential for the continuous exploitation of model results.
When a finalized model shows satisfactory predictive power in the development phase, industrializing its deployment makes it possible to run it continuously and automatically while minimizing the workload.
Our talk will present our experience, in the EDF context, of setting up an approach capable of shortening, or even eliminating, the time to production in a Hadoop environment, and more specifically with Hive.
Benoit Grossin is a Research Engineer at EDF-R&D ICAM
Matthieu Vautrot is an Analytics & Big Data Consultant at Quantmetry
Industrializing Big Data processes at CANAL+, by Pascal PERISSEAU and Stephen CLAIRVILLE (CanalPlus)
Integrating the Big Data technical building block into an existing business-intelligence architecture. Feedback on the developments carried out to ease the integration, supervision and operation of Hadoop flows in our BI ecosystem, and a presentation of the preparatory phase of making the data available to data analysts and data scientists.
Pascal PERISSEAU, technical lead of the BI and Big Data division, at CANAL+ for 10 years
Stephen CLAIRVILLE, project manager and Big Data tech lead, at CANAL+ for 2 years
Presentation given at the Hadoop User Group France meetup of January 14, 2016.
Real-time analytics with Riak and Spark, by Michael Carney (Basho) and Olivier Girardot of Lateral Thoughts
According to a Salesforce report, the number of data sources analyzed by companies will grow by 83% over the next five years, and organizations now want to deliver real-time insights, even on mobile devices. Real-time processing is therefore the future of big data analytics.
This talk will present what's new in real-time analytics around the Riak database family and Spark.
Michael Carney is Basho's Sales Director for Southern Europe. Founder of MySQL France and of MariaDB, Michael joined Basho in January 2015 to explore the world of data without tables!
Olivier Girardot is the CTO of Lateral Thoughts; he is a developer and trainer on Spark, and a Java/Python specialist in the capital-markets finance domain.
HUG France: HBase in Financial Industry, by Pierre Bittner (Scaled Risk CTO) (Modern Data Stack France)
HBase in Financial Industry, by Pierre Bittner (Scaled Risk CTO)
Processing and analyzing large volumes of data are at the heart of banks' activities. Many financial-market players have already adopted Hadoop for numerous use cases: risk management, identification of business opportunities, fraud detection, market surveillance…
An incredible diversity of formats must be handled. From this point of view, HBase is a natural choice of distributed database thanks to its dynamic data model.
After a general presentation of HBase's characteristics, this talk shows how to model the information being processed to fit different usage contexts.
Pierre Bittner is the CTO of Scaled Risk, publisher of a Big Data platform dedicated to financial institutions. Scaled Risk is built on HBase. Pierre has worked on banking information systems for 10 years.
Getting started quickly with Apache Flink, by Bilal Baltagi
- Overview of the Apache Flink ecosystem
- Quick hands-on introduction
Bilal Baltagi holds a master's degree in data analysis from Université Paris Nord - Paris 13. He is currently a business-intelligence consultant at Sarenza in Paris, working on every phase of BI and Big Data projects: requirements gathering, design, implementation and user support. Bilal is increasingly interested in the intersection of Big Data and Business Intelligence, and enjoys playing with Apache Flink!
Datalab 101 (Hadoop, Spark, ElasticSearch) by Jonathan Winandy - Paris Spark... (Modern Data Stack France)
Datalab 101 (Hadoop, Spark, ElasticSearch), by Jonathan Winandy
Feedback on setting up a Datalab with Hadoop, Spark and ElasticSearch in a constrained environment. We will present the methods that allowed us to improve the design, development, performance and acceptance testing of a complex Spark application.
Jonathan Winandy is an implementation lead and Java/Scala developer specialized in data pipelines.
Record linkage, a real use case with Spark ML - Paris Spark meetup Dec 2015 (Modern Data Stack France)
Record Linkage, a use case in Spark ML, by Alexis Seigneurin
Record linkage is the process of finding, within a data set, the records that represent the same entity. This operation is particularly complicated when, as in our case, you are working with anonymized data. That is where machine learning comes to the rescue! We implemented a record-linkage algorithm in Spark SQL (DataFrames) and Spark ML rather than relying on static rules. We will cover the feature-engineering process, why we had to extend Spark DataFrames to preserve metadata through the processing pipeline, and how we used machine learning to reconcile the records. Finally, we will see how we industrialized this application.
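The pairwise-similarity features at the heart of this kind of approach can be sketched outside Spark; here is a minimal pure-Python illustration (the field names and the fixed threshold are hypothetical stand-ins — the talk's system learns the decision with Spark ML rather than using a static rule):

```python
from difflib import SequenceMatcher

def features(a: dict, b: dict) -> list:
    """String-similarity features for a candidate record pair."""
    return [SequenceMatcher(None, a[f], b[f]).ratio()
            for f in ("name", "city")]

def same_entity(a: dict, b: dict, threshold: float = 0.8) -> bool:
    """Stand-in for a learned classifier: mean similarity vs. a threshold."""
    feats = features(a, b)
    return sum(feats) / len(feats) >= threshold

r1 = {"name": "Jon Smith",  "city": "Paris"}
r2 = {"name": "John Smith", "city": "Paris"}
r3 = {"name": "Ana Gomez",  "city": "Madrid"}
print(same_entity(r1, r2))  # True: near-identical fields
print(same_entity(r1, r3))  # False: dissimilar fields
```

In the Spark version described above, each feature would be a DataFrame column and the threshold rule would be replaced by a trained Spark ML classifier.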
Alexis Seigneurin: a developer for 15 years, I care deeply about data processing, analysis and storage. At Ippon, I mainly work on consulting and architecture engagements around big data technologies. I also run the Spark training course at Ippon.
Spark meetup www.meetup.com/Paris-Spark-Meetup/events/222607538/
The latest version of Spark brings a new API inspired by statistical-analysis libraries and languages. We will see how Spark DataFrames let us easily manipulate and explore data while retaining the scalability of Spark RDDs.
Full-text search and recommendation: two separate worlds? We will see that it is possible to marry Lucene (Elasticsearch/Solr) with collaborative filtering to produce a flexible and scalable recommendation system. Along the way we will take a look at the latest releases: the Confluent platform (Kafka) and Mahout 0.10 (with Samsara).
Matthieu Blanc will present spark.ml. Spark 1.2 introduced this new package, which provides a high-level API for building machine-learning pipelines. We will walk through the basic concepts of this API with an example.
http://hugfrance.fr/spark-meetup-a-la-sg-avec-cloudera-xebia-et-influans-le-jeudi-11-juin/
Key Trends Shaping the Future of Infrastructure (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
This keynote covers the key trends across hardware, cloud and open source, explores how these areas are likely to mature and develop over the short and long term, and considers how organisations can position themselves to adapt and thrive.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 (Albert Hoitingh)
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Smart TV Buyer Insights Survey 2024 - 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
JMeter webinar - integration with InfluxDB and Grafana - RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
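Under the hood, the JMeter-to-InfluxDB integration described above works by writing metrics in InfluxDB's line protocol, the text format Grafana then queries and charts. The sketch below builds such a record in plain Python; the measurement, tag, and field names are illustrative, not the exact schema JMeter's Backend Listener emits.

```python
# Hedged sketch: formatting a JMeter-style sample into InfluxDB's line
# protocol (measurement,tag=v,... field=v,... timestamp). Names are
# illustrative, not the exact Backend Listener schema.

def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one InfluxDB line-protocol record."""
    tag_str = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_str = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_str} {field_str} {timestamp_ns}"

line = to_line_protocol(
    "jmeter",                                         # measurement
    {"application": "demo", "transaction": "login"},  # tags (indexed)
    {"count": 42, "avg": 123.5},                      # fields (values)
    1717000000000000000,                              # nanosecond timestamp
)
print(line)
```

Each record is one line; the listener batches them into HTTP writes against the InfluxDB endpoint, and Grafana dashboards are just queries over the resulting time series.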
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality - Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure OpenAI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... - Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas struggle to keep up with the competition. Fostering a culture of innovation, however, takes work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl... - DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
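To make "power flow" concrete: a power flow study takes the power injected or consumed at each bus and solves for the voltage angles and line flows across the network. The toy below is not PowSyBl (whose real Python entry point is the pypowsybl binding); it is a dependency-free DC power flow on an invented 3-bus network, just to illustrate what such a simulation computes.

```python
# Toy DC power flow on a 3-bus network. Bus 0 is the slack bus (angle 0);
# we solve the reduced system B * theta = P for the other two buses.
# Network topology and per-unit values are invented for illustration.

def dc_power_flow(b, injections):
    """Solve for bus voltage angles given line susceptances and injections.

    b: dict mapping line (i, j) -> susceptance (1/reactance, per unit)
    injections: net injected power at buses 1 and 2 (per unit)
    """
    # Reduced susceptance matrix for the two non-slack buses.
    b11 = b[(0, 1)] + b[(1, 2)]
    b22 = b[(0, 2)] + b[(1, 2)]
    b12 = -b[(1, 2)]
    det = b11 * b22 - b12 * b12
    p1, p2 = injections
    theta1 = (b22 * p1 - b12 * p2) / det   # Cramer's rule on the 2x2 system
    theta2 = (b11 * p2 - b12 * p1) / det
    return [0.0, theta1, theta2]

susceptance = {(0, 1): 10.0, (0, 2): 10.0, (1, 2): 10.0}
theta = dc_power_flow(susceptance, [-1.0, 0.5])  # 1 p.u. load, 0.5 p.u. gen
flow_0_1 = susceptance[(0, 1)] * (theta[0] - theta[1])  # flow on line 0-1
print(theta, flow_0_1)
```

PowSyBl's load-flow, security, and sensitivity analyses solve far richer AC formulations of this same problem, which is why the webinar can demonstrate them interactively from a Python notebook.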
State of ICS and IoT Cyber Threat Landscape Report 2024 preview - Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an early stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Life used to be simple and very transactional in nature:
- Early 90's, ERP: transactions count your sales by customer, by location
- Late 90's: the age of segmentation and targeted offers; merge customer operations with marketing
Now, life is more complex, connected, and interactional in nature!
- Digital marketing enables measurement of interactions across channels
- Social networks, mobile commerce, and user-generated content increase the TYPES and VOLUMES of data generated by system-to-system communication and by data exhaust from customer behavior, like click-streams
- And big data is just beginning: we haven't even listed all the sensors, telematics, and other machine-generated data, which is predicted to eclipse even that generated by social networks
Facts that bolster this vision include:
- 80% to 90% of the world's data is unstructured or semi-structured (Forrester, IDC, and Gartner all agree)
- Data volumes have increased exponentially over the past decade and continue to do so (IDC, McKinsey reports)
- Hadoop is uniquely designed to store and process this type of data, at scale, across commodity systems
- The major server and storage platform vendors are all creating Hadoop-focused strategies
Apache Hadoop Leadership
- Sanjay Radia: HDFS Core Lead Architect, 4+ years on Hadoop. Major projects include Append v2, Capacity Scheduler, Federation, and HA.
- Owen O'Malley: The leading committer of code to Hadoop, 5+ years on Hadoop. Original Hadoop architect at Yahoo!. Drove the implementation of security throughout the project.
- Arun Murthy: Original MapReduce lead, 5+ years on Hadoop. Currently lead architect and release manager of Apache Hadoop 0.23.
- Matt Foley: Release manager of Apache Hadoop 0.20.205. Former Director of Engineering for Yahoo! Mail, now running Hortonworks' quality and release efforts.
- Devaraj Das: Built the original MapReduce development team at Yahoo!, 5+ years on Hadoop. Now leading the Apache Ambari (Hadoop management) project.
- Alan Gates: Lead of Pig and HCatalog, 3+ years on Hadoop.
- Infrastructure Platform (Servers, Storage, Network, Operating System, Virtualization, Cloud)
- Systems Management (Installation, Configuration, Administration, Monitoring, Performance, Security Mgmt, Capacity Mgmt, Quality of Service)
- Data Management Systems (SQL, NoSQL, NewSQL, EDW, Datamarts, MPP DBs, Search, Indexing, MDM, etc.)
- Data Movement & Integration (ETL, Data Quality, Integration Middleware, Event Processing)
- Tools & Languages (IDEs, Programming Languages, other tools)
- Business Intelligence & Analytics (Analytics, Reporting, Visualization, and Dashboards)
- Applications & Solutions (SaaS offerings, bundled solutions, etc.)
In the graphic above, Apache Hadoop acts as the Big Data Refinery. It's great at storing, aggregating, and transforming multi-structured data into more useful and valuable formats.

Apache Hive is a Hadoop-related component that fits within the Business Intelligence & Analytics category, since it is commonly used for querying and analyzing data within Hadoop in a SQL-like manner. Apache Hadoop can also be integrated with other EDW, MPP, and NewSQL components such as Teradata, Aster Data, HP Vertica, IBM Netezza, EMC Greenplum, SAP HANA, Microsoft SQL Server PDW, and many others.

Apache HBase is a Hadoop-related NoSQL key/value store that is commonly used for building highly responsive next-generation applications. Apache Hadoop can also be integrated with other SQL, NoSQL, and NewSQL technologies such as Oracle, MySQL, PostgreSQL, Microsoft SQL Server, IBM DB2, MongoDB, DynamoDB, MarkLogic, Riak, Redis, Neo4j, Terracotta, GemFire, SQLFire, VoltDB, and many others.

Finally, data movement and integration technologies help ensure data flows seamlessly between the systems in the diagrams above; the lines in the graphic are powered by technologies such as WebHDFS, Apache HCatalog, Apache Sqoop, Talend Open Studio for Big Data, Informatica, Pentaho, SnapLogic, Splunk, Attunity, and many others.
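WebHDFS, one of the data-movement technologies named here, exposes HDFS over plain HTTP: every operation is a REST call against `/webhdfs/v1/<path>?op=...`. The sketch below only builds such URLs (it makes no network calls); the host, port, and file path are placeholders, not real endpoints.

```python
# Sketch of the WebHDFS REST URL scheme: /webhdfs/v1/<path>?op=<OPERATION>.
# Host, port, and path below are illustrative placeholders.

from urllib.parse import urlencode

def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL (e.g. op=OPEN to read, op=LISTSTATUS to list)."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"

url = webhdfs_url("namenode.example.com", 50070, "/data/raw/clicks.log", "OPEN")
print(url)
# http://namenode.example.com:50070/webhdfs/v1/data/raw/clicks.log?op=OPEN
```

Because the interface is just HTTP, any tool or language with an HTTP client can move data in and out of Hadoop without a native Hadoop library, which is what makes it useful as integration glue between the systems in the diagram.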
At the highest level, I describe three broad areas of data processing and outline how these areas interconnect. The three areas are:
1. Business Transactions & Interactions
2. Business Intelligence & Analytics
3. Big Data Refinery

The graphic illustrates a vision for how these three types of systems can interconnect in ways aimed at deriving maximum value from all forms of data.

Enterprise IT has been connecting systems via classic ETL processing, as illustrated in Step 1 above, for many years in order to deliver structured and repeatable analysis. In this step, the business determines the questions to ask, and IT collects and structures the data needed to answer those questions.

The "Big Data Refinery", as highlighted in Step 2, is a new system capable of storing, aggregating, and transforming a wide range of multi-structured raw data sources into usable formats that help fuel new insights for the business. The Big Data Refinery provides a cost-effective platform for unlocking the potential value within data and discovering the business questions worth answering with this data. A popular example of big data refining is processing Web logs, clickstreams, social interactions, social feeds, and other user-generated data sources into more accurate assessments of customer churn or more effective creation of personalized offers.

More interestingly, there are businesses deriving value from processing large video, audio, and image files. Retail stores, for example, are leveraging in-store video feeds to help them better understand how customers navigate the aisles as they find and purchase products. Retailers that provide optimized shopping paths and intelligent product placement within their stores are able to drive more revenue for the business.
In this case, while the video files may be big in size, the refined output of the analysis is typically small in size but potentially big in value. The Big Data Refinery platform provides fertile ground for new types of tools and data processing workloads to emerge in support of rich multi-level data refinement solutions.

With that as backdrop, Step 3 takes the model further by showing how the Big Data Refinery interacts with the systems powering Business Transactions & Interactions and Business Intelligence & Analytics. Interacting in this way opens up the ability for businesses to get a richer and more informed 360° view of customers, for example.

By directly integrating the Big Data Refinery with existing Business Intelligence & Analytics solutions that contain much of the transactional information for the business, companies can enhance their ability to more accurately understand the customer behaviors that lead to the transactions.

Moreover, systems focused on Business Transactions & Interactions can also benefit from connecting with the Big Data Refinery. Complex analytics and calculations of key parameters can be performed in the refinery and flow downstream to fuel runtime models powering business applications, with the goal of more accurately targeting customers with the best and most relevant offers, for example.

Since the Big Data Refinery is great at retaining large volumes of data for long periods of time, the model is completed with the feedback loops illustrated in Steps 4 and 5. Retaining the past 10 years of historical "Black Friday" retail data, for example, can benefit the business, especially if it's blended with other data sources such as 10 years of weather data accessed from a third-party data provider. The point here is that the opportunities for creating value from multi-structured data sources available inside and outside the enterprise are virtually endless if you have a platform that can do it cost-effectively and at scale.
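The "refining" step described here can be made concrete with a toy example: reducing raw clickstream lines (big, multi-structured input) into a small per-user summary that a downstream BI system could consume. The log format and field names below are invented for illustration.

```python
# Toy "Big Data Refinery" step: aggregate raw clickstream events into a
# compact per-user summary. Log format and fields are invented.

from collections import defaultdict

RAW_LOG = """\
u1 2012-11-23 view /product/42
u1 2012-11-23 add-to-cart /product/42
u2 2012-11-23 view /product/7
u1 2012-11-24 purchase /product/42
"""

def refine(log_text):
    """Count events per user, per event type, from raw log lines."""
    summary = defaultdict(lambda: defaultdict(int))
    for line in log_text.strip().splitlines():
        user, _date, event, _path = line.split()
        summary[user][event] += 1
    return {user: dict(events) for user, events in summary.items()}

print(refine(RAW_LOG))
# {'u1': {'view': 1, 'add-to-cart': 1, 'purchase': 1}, 'u2': {'view': 1}}
```

At scale this is exactly the shape of work Hadoop MapReduce or Hive performs: the raw input is large, but the refined output is small and ready to feed churn models or personalized offers downstream.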
- "Node" means a Server or Virtual Machine capable of running the Software.
- "Server" means a single hardware system capable of running the Software. A hardware partition or blade is considered a separate hardware system.
- "Virtual Machine" means a software container that can run its own operating system and execute applications like a physical machine.
- "Cluster" means two or more Nodes that are interconnected for the purposes of executing application programs and sharing data.
- "Storage" means the total available storage space, also known as raw capacity, within the cluster.
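One practical consequence of defining "Storage" as raw capacity: what applications can actually use is smaller, because HDFS replicates each block (3x by default). The sketch below illustrates the arithmetic; the node count and disk sizes are made up.

```python
# Raw vs. usable cluster capacity under HDFS-style n-way replication.
# Node count and disk sizes are illustrative, not from the license text.

def cluster_capacity_tb(nodes, disks_per_node, tb_per_disk, replication=3):
    """Return (raw, usable) capacity in TB; usable = raw / replication."""
    raw = nodes * disks_per_node * tb_per_disk
    return raw, raw / replication

raw, usable = cluster_capacity_tb(nodes=10, disks_per_node=12, tb_per_disk=2)
print(raw, usable)  # 240 TB raw, 80.0 TB usable at 3x replication
```

When sizing or licensing a cluster against a raw-capacity definition like the one above, this replication factor (plus headroom for intermediate data) is why usable space is typically a third or less of the quoted number.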
I want to be careful with how we present services….they do want people to come onsite for extended engagements