This document contains a presentation about using open-source software and commodity hardware to process big data cost-effectively. It discusses how Apache Hadoop can be used to collect, store, process, and analyze large amounts of data without expensive proprietary software or hardware. The presentation gives examples of how companies are using Hadoop and walks through three approaches for working with data in Hadoop: refining, exploring, and enriching.
Bigger Data For Your Budget
1. Dave Porter – SproutCore Architect, Appnovation
davep@appnovation.com
Bigger Data For Your Budget
CANADIAN HEADQUARTERS
152 West Hastings Street
Vancouver BC, V6B 1G8
UNITED STATES OFFICE
3414 Peachtree Road, #1600
Atlanta Georgia, 30326-1164
UNITED KINGDOM OFFICE
3000 Hillswood Drive
Hillswood Business Park
Chertsey KT16 0RS, UK
www.appnovation.com
info@appnovation.com
How to turn your Big Data into Big Insights
without breaking the bank
2. Speakers
John Kreisa – VP Marketing, Hortonworks
Dave Porter – SproutCore Architect, Appnovation Technologies
4. LOCATIONS
VANCOUVER OFFICE
152 West Hastings Street
Vancouver BC, V6B 1G8
ATLANTA OFFICE
3414 Peachtree Road, #1600
Atlanta Georgia, 30326-1164
LONDON OFFICE
3000 Hillswood Drive
Hillswood Business Park
Chertsey KT16 0RS, UK
39. Thank You For Your Participation!
CANADIAN HEADQUARTERS
152 West Hastings Street
Vancouver BC, V6B 1G8
UNITED STATES OFFICE
3414 Peachtree Road, #1600
Atlanta Georgia, 30326-1164
UNITED KINGDOM OFFICE
3000 Hillswood Drive
Hillswood Business Park
Chertsey KT16 0RS, UK
www.appnovation.com
info@appnovation.com
Editor's Notes
Big Data is made up of traditional structured data in databases, but increasingly it’s also coming in from unstructured sources – server logs, sensor logs, raw transaction logs – and if you’re analyzing Twitter for market sentiment or searching the web for signs of terrorist plots, you’re digging through reams of human-quality input.
Where’s it coming from? As computers and networks speed up, their ability to capture and store more of what’s happening in the real world has gone up, and it’s kicked off a feedback loop. As high-speed trading has taken over the finance industry, the volume of transactions has skyrocketed. More scientific data points were generated in the last five years than in the previous 100,000 years of human existence, and that’s likely to be true again in five years. And it’s not just MIT and Wall Street. We’re increasingly living our lives through machines that can capture and aggregate more of our actions than ever before.
Data, meaning structured or unstructured information collected and stored in computing systems, is increasing exponentially.
Big Data is literally promising to cure cancer and fight off drug-resistant tuberculosis. It found the Higgs boson, and it’s going to find life on other planets. And of course it promises to let you see deeper into your business: insights into real-world problems that we never had the data to collect, or the tools to analyze, before. Understanding everything the way Amazon understands your taste in movies. Google can track the flu better than the CDC. Big Data is promising to be a kind of Magical Insight Portal.
Image courtesy of http://www.greenbookblog.org. Of course, magic doesn’t pay the bills, so the question is what can big data do for your business? I’d like to start with a very simple example:
Let’s say you’re a regional retail giant with an inventory system that tracks all of the transactions, then batch-processes them for your chief inventory manager overnight. Let’s say a radio DJ in Framingham plugs Widget A, and suddenly your Framingham location is sold out by 11 AM. Your inventory guy won’t find out about the unexpected spike until the next morning, and it’s probably day two before a truck can arrive, by which time the DJ is talking about something else. And that’s sort of okay, right? Waking up to discover your sales were through the roof yesterday is a sort of nice, 1990’s-style victory.
But instead of overnight, let’s restructure our processing with Big Data techniques to run on an hourly cycle. The system can tell by 10 AM that Framingham is selling through widgets faster than normal, and it knows they’re out by noon. Before noon, the inventory guy gets an alert on his HTML5 dashboard and an email on his phone, and he’s got a truck en route from the warehouse in time to restock the shelves the next morning. He’s cut his response time from 24 hours down to 1, and he’s restocked the shelves in hours instead of days. Most importantly, you doubled your sales of Widget A.
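To make that concrete, here is a minimal sketch of the kind of hourly aggregation job involved, written in Python for Hadoop Streaming. The file name (hourly_sales.py), the tab-separated transaction layout (store_id, sku, qty, timestamp), and the key format are illustrative assumptions, not details from the presentation.

    #!/usr/bin/env python
    # hourly_sales.py (hypothetical name): Hadoop Streaming mapper/reducer that
    # counts units sold per store, SKU and hour from tab-separated transaction logs.
    import sys

    def mapper():
        # Input lines: store_id <TAB> sku <TAB> qty <TAB> ISO-8601 timestamp
        for line in sys.stdin:
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 4:
                continue  # skip malformed records
            store_id, sku, qty, ts = fields
            hour = ts[:13]  # e.g. "2013-06-12T10"
            print("%s|%s|%s\t%s" % (store_id, sku, hour, qty))

    def reducer():
        # Streaming sorts mapper output by key, so equal keys arrive as a run.
        current, total = None, 0
        for line in sys.stdin:
            key, qty = line.rstrip("\n").split("\t")
            if key != current and current is not None:
                print("%s\t%d" % (current, total))
                total = 0
            current = key
            total += int(qty)
        if current is not None:
            print("%s\t%d" % (current, total))

    if __name__ == "__main__":
        mapper() if sys.argv[1:] == ["map"] else reducer()

The same pair can be dry-run locally (cat transactions.tsv | ./hourly_sales.py map | sort | ./hourly_sales.py reduce) before it ever touches a cluster; the dashboard alert is then just a comparison of the hourly totals against a baseline.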
The big data challenge is twofold: Collecting and storing the data, and then chewing through it to produce the valuable insights.
Existing solutions work great, but they’re costly: custom “enterprise”-grade hardware (which is code for expensive) running expensive licensed software. The regional retail giant can’t scale that way on its budget.
Here’s the promise we’re delivering today. You can have the same insights into your accumulating data at a fraction of the price.
Scaling for the same budget requires a paradigm shift. Enter Hadoop. Hadoop is free & open-source software running on commodity hardware like you pick up at Best Buy (slight exaggeration). On a commodity hardware budget, the retail inventory system is able to run hourly and will allow dramatically faster reaction to inventory events.
Not just retail, and not just speeding processes up. Review a couple of other use cases.
Still planning on having a better analogy for Wednesday. This one is really growing on me though.
I can’t really talk about Hortonworks without first taking a moment to talk about the history of Hadoop. What we now know of as Hadoop really started back in 2005, when Eric Baldeschwieler – known as “E14” – started work on a project to build a large-scale data storage and processing technology that would allow Yahoo to store and process massive amounts of data to underpin its most critical application, Search. The initial focus was on building out the technology – the key components being HDFS and MapReduce – that would become the Core of what we think of as Hadoop today, and continuing to innovate it to meet the needs of this specific application.

By 2008, Hadoop usage had greatly expanded inside of Yahoo, to the point that many applications were now using this data management platform, and as a result the team’s focus extended to include Operations: now that applications were beginning to propagate around the organization, sophisticated capabilities for operating it at scale were necessary. It was also at this time that usage began to expand well beyond Yahoo, with many notable organizations (including Facebook and others) adopting Hadoop as the basis of their large-scale data processing and storage applications, necessitating a focus on operations to support what was by now a large variety of critical business applications.

In 2011, recognizing that more mainstream adoption of Hadoop was beginning to take off, and with an objective of facilitating it, the core team left – with the blessing of Yahoo – to form Hortonworks. The goal of the group was to facilitate broader adoption by addressing the Enterprise capabilities that would enable a larger number of organizations to adopt and expand their usage of Hadoop.

[note: if useful as a talk track, Cloudera was formed in 2008, well BEFORE the operational expertise of running Hadoop at scale was established inside of Yahoo]
At Hortonworks today, our focus is very clear: we Develop, Distribute and Support a 100% open source distribution of Enterprise Apache Hadoop.

We employ the core architects, builders and operators of Apache Hadoop and drive the innovation in the open source community. We distribute the only 100% open source Enterprise Hadoop distribution: the Hortonworks Data Platform. Given our operational expertise of running some of the largest Hadoop infrastructure in the world at Yahoo, our team is uniquely positioned to support you.

Our approach is also uniquely endorsed by some of the biggest vendors in the IT market. Yahoo is both an investor and a customer, and most importantly, a development partner. We partner to develop Hadoop, and no distribution of HDP is released without first being tested on Yahoo’s infrastructure, using the same regression suite they have used for years as they grew to have the largest production cluster in the world. Microsoft has partnered with Hortonworks to include HDP in both their off-premise offering on Azure and their on-premise offering under the product name HDInsight; this also includes integration with Visual Studio for application development and with System Center for operational management of the infrastructure. Teradata includes HDP in their products in order to provide the broadest possible range of options for their customers.
So how does this get brought together into our distribution? It is really pretty straightforward, but also very unique. We start with this group of open source projects that I described and that we are continually driving in the OSS community. [CLICK] We then package the appropriate versions of those open source projects, integrate and test them using a full suite, including all the IP for regression testing contributed by Yahoo, and [CLICK] contribute all of the bug fixes back to the open source tree. From there, we package and certify a distribution in the form of the Hortonworks Data Platform (HDP) that includes both Hadoop Core and the related projects required by the Enterprise user, and provide it to our customers. Through this application of Enterprise software development process to the open source projects, the result is a 100% open source distribution that has been packaged, tested and certified by Hortonworks. It is also 100% in sync with the open source trees.
At its core, Hadoop is about HDFS and MapReduce: two projects, for distributed storage and distributed data processing, that are the underpinnings of the platform. In addition to Core Hadoop, we must identify and include the requisite “Platform Services” that are central to any piece of enterprise software. These include High Availability, Disaster Recovery, Security, etc., which enable use of the technology for a much broader (and mission-critical) problem set. This is accomplished not by introducing new open source projects, but rather by ensuring that these aspects are addressed within existing projects.
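As a rough illustration of how those two pieces divide the work, the sketch below stages data into HDFS and then submits the hypothetical hourly job from earlier through the Hadoop Streaming jar. The paths, dates, and jar location are assumptions; they vary by distribution and cluster layout.

    # Sketch: HDFS handles distributed storage, MapReduce (here via Hadoop Streaming)
    # handles distributed processing. All paths below are placeholders.
    import subprocess

    # 1. Land the raw transaction logs in HDFS.
    subprocess.check_call(["hadoop", "fs", "-mkdir", "-p", "/data/retail/raw/2013-06-12"])
    subprocess.check_call(["hadoop", "fs", "-put", "transactions.tsv",
                           "/data/retail/raw/2013-06-12/"])

    # 2. Run the hourly aggregation as a MapReduce job over that directory.
    subprocess.check_call([
        "hadoop", "jar", "/usr/lib/hadoop-mapreduce/hadoop-streaming.jar",  # location varies
        "-files", "hourly_sales.py",
        "-mapper", "hourly_sales.py map",
        "-reducer", "hourly_sales.py reduce",
        "-input", "/data/retail/raw/2013-06-12",
        "-output", "/data/retail/hourly/2013-06-12",
    ])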
Beyond Core and Platform Services, we must add a set of Data Services that enable the full data lifecycle: capabilities to store, process and access data. For example: how do we maintain the consistent metadata needed to determine how best to query data stored in HDFS? The answer is a project called Apache HCatalog. Or how do we access data stored in Hadoop from SQL-oriented tools? With projects such as Hive, which is the de facto standard for accessing data stored in HDFS. All of these are broadly captured under the category of “data services”.
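For instance, once table definitions are registered through HCatalog / the Hive metastore, an analyst can reach the same HDFS data with plain SQL. The sketch below uses the PyHive client; the host, port, credentials, and the retail.transactions table are placeholders, not anything from the presentation.

    # Hedged sketch: querying Hadoop-resident data through Hive's SQL interface.
    from pyhive import hive  # assumes HiveServer2 is running and PyHive is installed

    conn = hive.Connection(host="hive-gateway.example.com", port=10000, username="analyst")
    cursor = conn.cursor()
    cursor.execute("""
        SELECT store_id, sale_hour, SUM(qty) AS units_sold
        FROM retail.transactions
        WHERE sale_date = '2013-06-12'
        GROUP BY store_id, sale_hour
    """)
    for store_id, sale_hour, units_sold in cursor.fetchall():
        print(store_id, sale_hour, units_sold)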
Any data management platform that is operated at any reasonable scale requires a management technology – for example SQL Server Management Studio for SQL Server, or Oracle Enterprise Manager for Oracle DB. Hadoop is no exception, and for Hadoop that means Apache Ambari, which is increasingly being recognized as foundational to the operation of Hadoop infrastructures. It allows users to provision, manage and monitor a cluster and provides a set of tools to visualize and diagnose operational issues. There are other projects in this category (such as Oozie), but Ambari is really the most influential.
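A concrete example of what that management layer enables: Ambari exposes a REST API, so routine health checks can be scripted. The host, credentials, and cluster name below are placeholders; check the endpoints against the Ambari version you actually deploy.

    # Hedged sketch: listing service states for a cluster through Ambari's REST API.
    import requests

    AMBARI = "http://ambari.example.com:8080/api/v1"
    AUTH = ("admin", "admin")  # placeholder credentials
    HEADERS = {"X-Requested-By": "ops-check"}  # Ambari requires this on writes; harmless on reads

    resp = requests.get(AMBARI + "/clusters/retail_cluster/services?fields=ServiceInfo/state",
                        auth=AUTH, headers=HEADERS)
    resp.raise_for_status()
    for item in resp.json()["items"]:
        info = item["ServiceInfo"]
        print(info["service_name"], info.get("state", "UNKNOWN"))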
And finally, because any enterprise runs a heterogeneous set of infrastructures, we ensure that HDP runs on your choice of infrastructure. Whether this is Linux, Windows (HDP is the only distribution certified for Windows), on a cloud platform such as Azure or Rackspace, or in an appliance, we ensure that all of them are supported and that this work is all contributed back to the open source community.
In summary, by addressing these elements, we can provide an Enterprise Hadoop distribution that includes the Core Services, Platform Services, Data Services and Operational Services required by the Enterprise user. And all of this is done in 100% open source, and tested at scale by our team (together with our partner Yahoo) to bring Enterprise process to an open source approach. And finally, this is the distribution that is endorsed by the ecosystem to ensure interoperability in your environment.
While overly simplistic, this graphic represents what we commonly see as a general data architecture:
A set of data sources producing data.
A set of data systems to capture and store that data: most typically a mix of RDBMS and data warehouses.
A set of applications that leverage the data stored in those data systems. These could be packaged BI applications (Business Objects, Tableau, etc.), Enterprise Applications (e.g. SAP) or Custom Applications (e.g. custom web applications), ranging from ad-hoc reporting tools to mission-critical enterprise operations applications.
Your environment is undoubtedly more complicated, but conceptually it is likely similar.
As the volume of data has exploded, we increasingly see organizations acknowledge that not all data belongs in a traditional database. The drivers are both cost (as volumes grow, database licensing costs can become prohibitive) and technology (databases are not optimized for very large datasets). Instead, we increasingly see Hadoop – and HDP in particular – being introduced as a complement to the traditional approaches. It is not replacing the database but rather complementing it, and as such it must integrate easily with existing tools and approaches. This means it must interoperate with:
Existing applications, such as Tableau, SAS, Business Objects, etc.
Existing databases and data warehouses, for loading data to and from the data warehouse (see the Sqoop sketch below).
Development tools used for building custom applications.
Operational tools for managing and monitoring.
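One common tool for that to-and-from movement is Apache Sqoop, which ships with HDP. As a hedged illustration, the sketch below pulls an archival table out of an existing warehouse into HDFS; the JDBC URL, credentials, and table are placeholders, and the matching JDBC driver must be available to Sqoop.

    # Hedged sketch: offloading a warehouse table into HDFS with Sqoop import.
    import subprocess

    subprocess.check_call([
        "sqoop", "import",
        "--connect", "jdbc:mysql://dw.example.com/retail",    # placeholder warehouse
        "--username", "etl_user",
        "--password-file", "/user/etl/.dw_password",          # password kept in HDFS
        "--table", "transactions_archive",
        "--target-dir", "/data/retail/archive/transactions",
        "--num-mappers", "4",
    ])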
It is for that reason that we focus on HDP interoperability across all of these categories:
Data systems: HDP is endorsed and embedded with SQL Server, Teradata and more.
BI tools: HDP is certified for use with the packaged applications you already use, from Microsoft to Tableau, MicroStrategy, Business Objects and more.
Development tools: for .NET developers, Visual Studio, used to build more than half the custom applications in the world, certifies with HDP to enable Microsoft app developers to build custom apps with Hadoop. For Java developers, Spring for Apache Hadoop enables them to quickly and easily build Hadoop-based applications with HDP.
Operational tools: integration with System Center, and with Teradata Viewpoint.
Across all of our user base, we have identified just 3 separate usage patterns – sometimes more than one is used in concert during a complex project, but the patterns are distinct nonetheless. These are Refine, Explore and Enrich.

The first of these, the Refine case, is probably the most common today. It is about taking very large quantities of data and using Hadoop to distill the information down into a more manageable data set that can then be loaded into a traditional data warehouse for usage with existing tools. This is relatively straightforward and allows an organization to harness a much larger data set for their analytics applications while leveraging their existing data warehousing and analytics tools. Using the graphic here, in step 1 data is pulled from a variety of sources into the Hadoop platform in step 2, and then in step 3 loaded into a data warehouse for analysis by existing BI tools.
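Step 3 of the Refine pattern, loading the distilled result into the warehouse, is also commonly handled with Sqoop. The sketch below is illustrative only: the warehouse URL, target table, and export directory are assumptions, and the column layout of the HDFS files has to match the target table.

    # Hedged sketch: exporting refined, Hadoop-produced data into a warehouse table.
    import subprocess

    subprocess.check_call([
        "sqoop", "export",
        "--connect", "jdbc:mysql://dw.example.com/retail",    # placeholder warehouse
        "--username", "etl_user",
        "--password-file", "/user/etl/.dw_password",
        "--table", "HOURLY_SALES",
        "--export-dir", "/data/retail/hourly/2013-06-12",
        "--input-fields-terminated-by", "\t",
    ])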
A second use case is what we would refer to as Data Exploration – this is the use case people most commonly mean when they talk about “Data Science”. In simplest terms, it is about using Hadoop as the primary data store rather than performing the secondary step of moving data into a data warehouse. To support this use case you’ve seen all the BI tool vendors rally to add support for Hadoop – and most commonly HDP – as a peer to the database, allowing rich analytics on extremely large datasets that would be both unwieldy and costly in a traditional data warehouse. Hadoop allows for interaction with a much richer dataset and has spawned a whole new generation of analytics tools that rely on Hadoop (HDP) as the data store. To use the graphic, in step 1 data is pulled into HDP, it is stored and processed in step 2, before being surfaced directly into the analytics tools for the end user in step 3.
The final use case is called Application Enrichment. This is about incorporating data stored in HDP to enrich an existing application. This could be an online application in which we want to surface custom information to a user based on their particular profile. For example: if a user has been searching the web for information on home renovations, in the context of your application you may want to use that knowledge to surface a custom offer for a product that you sell related to that category. Large web companies such as Facebook and others are very sophisticated in the use of this approach. In the diagram, this is about pulling data from disparate sources into HDP in Step 1, storing and processing it in Step 2, and then interacting with it directly from your applications in Step 3, typically in a bi-directional manner (e.g. request data, return data, store response).
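In practice, “interacting with it directly from your applications” often means a low-latency lookup against a serving store such as Apache HBase, which is included in HDP. The sketch below uses the happybase client against an HBase Thrift gateway; the table name, column family, and offer logic are illustrative assumptions rather than details from the talk.

    # Hedged sketch: an application enriching a page view with a profile kept in HBase.
    import happybase

    conn = happybase.Connection("hbase-gateway.example.com")  # placeholder Thrift gateway
    profiles = conn.table("user_profiles")                    # hypothetical table

    def offer_for(user_id):
        # Read the interest computed offline (e.g. by a batch MapReduce job) for this user.
        row = profiles.row(user_id.encode("utf-8"))
        interest = row.get(b"behavior:top_interest", b"").decode("utf-8")
        # The application decides what to surface; this mapping is purely illustrative.
        return "10% off power tools" if interest == "home_renovation" else None

    print(offer_for("user-42"))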