Munich HUG 21.11.2013

1. Hortonworks: We Do Hadoop.
Our mission is to enable your Modern Data Architecture
by delivering One Enterprise Hadoop
November 2013
© Hortonworks Inc. 2013 - Confidential
3. A Brief History of Apache Hadoop

Timeline, 2004–2013:
• Focus on INNOVATION: 2005, Hadoop created at Yahoo!; the Apache project is established and Yahoo! begins to operate at scale.
• Focus on OPERATIONS: 2008, the Yahoo! team extends its focus to operations to support multiple projects & growing clusters.
• Enterprise Hadoop STABILITY: 2011, Hortonworks created to focus on "Enterprise Hadoop", starting with 24 key Hadoop engineers from Yahoo!, leading to the Hortonworks Data Platform.
4. Our Mission: Enable your Modern Data Architecture by delivering One Enterprise Hadoop

Headquarters: Palo Alto, CA
Employees: 240+ and growing
Customers: 120+ and growing
Investors: Benchmark, Index, Yahoo, Dragoneer, Tenaya

Our Commitment:
• Innovate in the Open: We employ the core architects and operators of Hadoop and drive innovation through open source Apache Foundation projects to avoid vendor lock-in.
• Certify for the Enterprise: We engineer, test and certify the Hortonworks Data Platform for enterprise usage and deliver the highest quality of support.
• Interoperate with the Ecosystem: We work with trusted partners to deeply integrate Hadoop with key technologies so you can leverage existing skills and investments.
5. Goal: Interoperable and Familiar

[Architecture diagram: APPLICATIONS (e.g. BusinessObjects BI), DEV & DATA TOOLS and OPERATIONAL TOOLS sit on top of DATA SYSTEMS (RDBMS, HANA, EDW, MPP) over shared INFRASTRUCTURE, fed by SOURCES: existing sources (CRM, ERP, Clickstream, Logs) and emerging sources (Sensor, Sentiment, Geo, Unstructured).]
6. Betting on Hortonworks…

Microsoft: HDInsight & HDP for Windows
• Only Hadoop distribution for Windows Azure & Windows Server
• Native integration with SQL Server, Excel, and System Center
• Extends Hadoop to the .NET community
• Flexible deployment options

Teradata: Teradata Portfolio for Hadoop
• Seamless data access between Teradata and Hadoop (SQL-H)
• Simple management & monitoring with Viewpoint integration
• Complete portfolio for Hadoop, including the UDA and appliances

SAP: Instant Access + Infinite Scale
• SAP can assure their customers they are deploying an SAP HANA + Hadoop architecture fully supported by SAP
• Enables analytics apps (BOBJ) to interact with Hadoop
7. Hortonworks Approach to Enterprise Hadoop

Community Driven Enterprise Apache Hadoop:
• Identify and introduce enterprise requirements into the public domain
• Work with the community to advance and incubate open source projects
• Apply enterprise rigor to provide the most stable and reliable distribution
8. Driving Hadoop Innovation

Hortonworks engineers focus on making Apache Hadoop an enterprise-viable platform that powers modern data architectures and deeply integrates with existing data center technologies.

[Charts: total net lines contributed to Apache Hadoop, with segments of 614,041, 449,768 and 147,933 lines plus end users; and total number of committers to Apache Hadoop, 63 total, including Yahoo: 10, Cloudera: 7, Facebook: 5, IBM: 3, LinkedIn: 3, 10 others, and a 21-committer segment.]
9. HDP: Enterprise Hadoop Platform

Hortonworks Data Platform (HDP):
• The ONLY 100% open source and complete platform
• Integrates full range of enterprise-ready services
• Certified and tested at scale
• Engineered for deep ecosystem interoperability

[Stack diagram: HADOOP CORE (HDFS, YARN, MapReduce, Tez); LOAD & EXTRACT (NFS, WebHDFS, Flume, Sqoop); DATA SERVICES (Pig, Hive & HCatalog, HBase); OPERATIONAL SERVICES (Ambari, Oozie, Falcon*); PLATFORM SERVICES (Knox*; enterprise readiness: high availability, disaster recovery, rolling upgrades, security and snapshots). Runs on OS/VM, cloud or appliance.]
10. Hortonworks: The Value of "Open" for You

Connect with the Hadoop community
We employ a large number of Apache project committers & innovators so that you are represented in the open source community.

Avoid vendor lock-in
The Hortonworks Data Platform remains as close to the open source trunk as possible and is developed 100% in the open, so you are never locked in.

The partners you rely on rely on Hortonworks
We work with partners to deeply integrate Hadoop with data center technologies so you can leverage existing skills and investments.

Certified for the enterprise
We engineer, test and certify the Hortonworks Data Platform at scale to ensure the reliability and stability you require for enterprise use.

Support from the experts
We provide the highest quality of support for deploying at scale. You are supported by hundreds of years of combined Hadoop experience.
11. SQL-in-Hadoop with Apache Hive

[Diagram: business analytics and custom apps issue SQL against Hadoop via Hive, which runs on MapReduce or Tez over YARN and HDFS.]

• Apache Hive is the standard for SQL interaction with Hadoop
  – Enterprises make the final purchasing decision on two key characteristics: 'compatibility' with existing investments (60%) and skills (20%)
  – Most applications claim Hive compatibility TODAY*
• Stinger Initiative: simple focus — improves existing tools & preserves investments
  – Performance
  – SQL compatibility

* Claims publicly made by: Teradata, Microsoft, Oracle, MicroStrategy, IBM, Information Builders, SAS, QlikTech, SAP, Tableau, Tibco, Actuate, Jaspersoft, Alteryx, Datameer, Pentaho
13. Stinger: Hive For All Analytics

100X faster + SQL compatible: moving Hive from batch to interactive.

Workloads covered: parameterized reports, enterprise reports, dashboards / scorecards, data mining, visualization.
14. Stinger Roadmap

• Join optimizations
• ORCFile
• SQL:2003 windowing functions
• Subqueries for IN, NOT IN, HAVING
• Datatypes: CHAR, VARCHAR, DATETIME
• Improvements to the DECIMAL datatype
• Integration with Tez and the Tez Service
• Vectorization preview
• Intelligent optimizer
• Column statistics
• Authentication and authorization enhancements
• Full vector query
15. Stinger: Some Early Results
• Query engine work ONLY
• Uses a TPC-"style" benchmark
• Just a few weeks of work
• OTHER work coming
16. Apache Tez: Accelerating Hadoop Query Processing
17. Tez – Introduction
• Distributed execution framework targeted towards data-processing applications.
• Based on expressing a computation as a dataflow graph.
• Built on top of YARN – the resource management framework for Hadoop.
• Open source Apache Incubator project, Apache licensed.
19. Fundamentals of YARN
• The fundamental idea of YARN is to split up the responsibilities of the JobTracker/TaskTracker into separate entities:
  – a global ResourceManager
  – a per-application ApplicationMaster
  – a per-node slave NodeManager, and
  – per-application Containers running on NodeManagers
20. New School Hadoop with YARN

[Diagram: clients submit jobs to the ResourceManager; each application gets its own App Master running in a container on a NodeManager; App Masters request resources from the ResourceManager and run tasks in containers across NodeManagers. Arrows show job submission, resource requests, node status and MapReduce status.]
21. Tez – Design Themes
• Empowering End Users
• Execution Performance
22. Tez – Empowering End Users
• Expressive dataflow definition APIs
• Flexible Input-Processor-Output runtime model
• Data type agnostic
• Simplifying deployment
23. Tez – Empowering End Users
• Expressive dataflow definition APIs
  – Enable definition of complex dataflow pipelines using simple graph connection APIs. Tez expands the logical plan at runtime.
  – Targeted towards data processing applications like Hive/Pig but not limited to them. Hive/Pig query plans naturally map to Tez dataflow graphs with no translation impedance.

[Diagram: an example DAG in which tasks A-1/A-2, B-1/B-2 and C-1/C-2 feed tasks D-1/D-2 and E-1/E-2.]
24. Tez – Empowering End Users
• Expressive dataflow definition APIs

[Diagram: a distributed sort as a Tez dataflow. A preprocessor stage (Task-1, Task-2) sends samples to a sampler, which computes partition ranges; a partition stage (Task-1, Task-2) then routes data to an aggregate stage (Task-1, Task-2) to complete the sort.]
25. Tez – Empowering End Users
• Flexible Input-Processor-Output runtime model
  – Construct physical runtime executors dynamically by connecting different inputs, processors and outputs.
  – End goal is to have a library of inputs, outputs and processors that can be programmatically composed to generate useful tasks.

[Diagram: example compositions — IntermediateReduce = ShuffleInput + ReduceProcessor + FileSortedOutput; FinalReduce = ShuffleInput + ReduceProcessor + HDFSOutput; PairwiseJoin = Input1 + Input2 + JoinProcessor + FileSortedOutput.]
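The Input-Processor-Output composition above can be sketched in plain Java. These interface and class names (Input, Processor, Output, ComposedTask) are illustrative stand-ins, not the actual Tez runtime classes:

```java
import java.util.List;

// Hypothetical sketch of the Input-Processor-Output composition idea.
interface Input { List<String> fetch(); }                            // where records come from
interface Processor { List<String> process(List<String> records); } // the work itself
interface Output { void emit(List<String> records); }               // where results go

// A physical task is just one input, one processor and one output wired together.
class ComposedTask {
    private final Input input;
    private final Processor processor;
    private final Output output;

    ComposedTask(Input input, Processor processor, Output output) {
        this.input = input;
        this.processor = processor;
        this.output = output;
    }

    void run() {
        output.emit(processor.process(input.fetch()));
    }
}
```

Swapping a different Output into the same processor is what turns an "IntermediateReduce" into a "FinalReduce" in the diagram above.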
26. Tez – Empowering End Users
• Data type agnostic
  – Tez is only concerned with the movement of data: files and streams of bytes.
  – Does not impose any data format on the user application. An MR application can use key-value pairs on top of Tez. Hive and Pig can use tuple-oriented formats that are natural and native to them.

[Diagram: a Tez task moves bytes via files and streams; user code layers key-value pairs or tuples on top.]
27. Tez – Empowering End Users
• Simplifying deployment
  – Tez is a completely client-side application.
  – No deployments to do. Simply upload the Tez libraries to any accessible FileSystem and change the local Tez configuration to point to that.
  – Enables running different versions concurrently. Easy to test new functionality while keeping stable versions for production.
  – Leverages YARN local resources.

[Diagram: two client machines point their TezClient at different Tez library versions (Tez Lib 1, Tez Lib 2) stored in HDFS; NodeManagers localize the right version for each TezTask.]
28. Tez – Empowering End Users
• Expressive dataflow definition APIs
• Flexible Input-Processor-Output runtime model
• Data type agnostic
• Simplifying usage

With great power APIs come great responsibilities: Tez is a framework on which end-user applications can be built.
29. Tez – Execution Performance
• Performance gains over Map Reduce
• Optimal resource management
• Plan reconfiguration at runtime
• Dynamic physical data flow decisions
30. Tez – Execution Performance
• Performance gains over Map Reduce
  – Eliminate the replicated write barrier between successive computations.
  – Eliminate the job launch overhead of workflow jobs.
  – Eliminate the extra stage of map reads in every workflow job.
  – Eliminate the queue and resource contention suffered by workflow jobs that are started after a predecessor job completes.

[Diagram: the same Pig/Hive workflow as a chain of MR jobs vs. a single Tez DAG.]
31. Tez – Execution Performance
• Optimal resource management
  – Reuse YARN containers to launch new tasks.
  – Reuse YARN containers to enable shared objects across tasks.

[Diagram: the Tez Application Master starts a task in a YARN container; when the task is done, the container is reused for the next task, and TezTask1/TezTask2 share objects within the TezTask host's container.]
32. Tez – Execution Performance
• Plan reconfiguration at runtime
  – Dynamic runtime concurrency control based on data size, user operator resources, available cluster resources and locality.
  – Advanced changes in dataflow graph structure.
  – Progressive graph construction in concert with the user optimizer.

[Diagram: Stage 1 (50 maps, 100 partitions) reads HDFS blocks and feeds Stage 2, planned with 100 reducers. At runtime only 10 GB of data is observed, so Stage 2 is reconfigured from 100 to 10 reducers based on available YARN resources.]
33. Tez – Execution Performance
• Dynamic physical data flow decisions
  – Decide the type of physical byte movement and storage on the fly.
  – Store intermediate data on a distributed store, local store or in memory.
  – Transfer bytes via blocking files or streaming, and the spectrum in between.

[Diagram: at runtime, a producer whose output turns out to be small switches from writing a local file to handing data to the consumer in memory.]
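The decision above reduces to a size check; a minimal sketch follows, where the class, method and threshold are illustrative and not Tez configuration:

```java
// Illustrative sketch: pick the physical transport for intermediate data at
// runtime based on its observed size. Names and the threshold are made up.
class DataFlowDecision {
    enum Transport { IN_MEMORY, LOCAL_FILE }

    static Transport choose(long outputBytes, long inMemoryThresholdBytes) {
        // Small outputs skip the disk round-trip and stream directly in memory.
        return outputBytes <= inMemoryThresholdBytes
                ? Transport.IN_MEMORY
                : Transport.LOCAL_FILE;
    }
}
```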
34. Tez – Deep Dive – API

Simple DAG definition API:

DAG dag = new DAG();
Vertex map1 = new Vertex(MapProcessor.class);
Vertex map2 = new Vertex(MapProcessor.class);
Vertex reduce1 = new Vertex(ReduceProcessor.class);
Vertex reduce2 = new Vertex(ReduceProcessor.class);
Vertex join1 = new Vertex(JoinProcessor.class);
…….
Edge edge1 = new Edge(map1, reduce1, SCATTER_GATHER, PERSISTED, SEQUENTIAL, MOutput.class, RInput.class);
Edge edge2 = new Edge(map2, reduce2, SCATTER_GATHER, PERSISTED, SEQUENTIAL, MOutput.class, RInput.class);
Edge edge3 = new Edge(reduce1, join1, SCATTER_GATHER, PERSISTED, SEQUENTIAL, MOutput.class, RInput.class);
Edge edge4 = new Edge(reduce2, join1, SCATTER_GATHER, PERSISTED, SEQUENTIAL, MOutput.class, RInput.class);
…….
dag.addVertex(map1).addVertex(map2)
   .addVertex(reduce1).addVertex(reduce2)
   .addVertex(join1)
   .addEdge(edge1).addEdge(edge2)
   .addEdge(edge3).addEdge(edge4);

[Diagram: map1 and map2 connect to reduce1 and reduce2 via Scatter_Gather (bipartite, sequential) edges; reduce1 and reduce2 connect to join1 the same way.]
35. Tez – Deep Dive – API

Edge properties define the connection between producer and consumer vertices in the DAG.

• Data movement – defines the routing of data between tasks
  – One-To-One: data from the ith producer task routes to the ith consumer task.
  – Broadcast: data from a producer task routes to all consumer tasks.
  – Scatter-Gather: producer tasks scatter data into shards and consumer tasks gather the data. The ith shard from all producer tasks routes to the ith consumer task.
• Scheduling – defines when a consumer task is scheduled
  – Sequential: consumer task may be scheduled after a producer task completes.
  – Concurrent: consumer task must be co-scheduled with a producer task.
• Data source – defines the lifetime/reliability of a task output
  – Persisted: output will be available after the task exits. Output may be lost later on.
  – Persisted-Reliable: output is reliably stored and will always be available.
  – Ephemeral: output is available only while the producer task is running.
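The scatter-gather routing rule can be illustrated with a small sketch (plain Java, not Tez code): each producer writes one shard per consumer, and consumer i gathers shard i from every producer.

```java
// Illustrative sketch of scatter-gather routing between two vertices.
class ScatterGather {
    // producerShards[p][c] = the shard written by producer task p for consumer c.
    // Returns consumerInputs[c][p] = what consumer task c receives from producer p.
    static String[][] route(String[][] producerShards, int numConsumers) {
        int numProducers = producerShards.length;
        String[][] consumerInputs = new String[numConsumers][numProducers];
        for (int p = 0; p < numProducers; p++) {
            for (int c = 0; c < numConsumers; c++) {
                // Shard c from every producer lands at consumer c.
                consumerInputs[c][p] = producerShards[p][c];
            }
        }
        return consumerInputs;
    }
}
```

One-To-One would instead map producer i straight to consumer i, and Broadcast would hand every producer's full output to every consumer.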
36. Tez – Deep Dive – Scheduling

• Vertex Scheduler: determines when tasks in a vertex can start.
• DAG Scheduler: determines the priority of a task.
• Task Scheduler: allocates containers from YARN and assigns them to tasks.

[Diagram: for each vertex (e.g. map1, reduce1), the Vertex Scheduler starts the vertex and its tasks, the DAG Scheduler supplies a priority, and the Task Scheduler gets a container from YARN.]
37. Tez – Deep Dive – Task Execution

• Start the task shell with user-specified env, cmd line and resources.
• Fetch and instantiate the Input, Processor and Output objects.
• Receive (incremental) input information and process the input.
• Provide output information.

[Diagram: the logical Task Attempt in the AM ships env, cmd line and resources to start a container; the real Task Attempt on the machine runs a Task JVM that gets the task, wires Input → Processor → Output, and exchanges data information and data events with the AM.]
38. Tez - Sessions
• The amount of work programmed into a script/query may not be doable within a single Tez DAG.
39. Tez - Sessions
• Even better performance gains may be achieved through caching within the session, in the AM or in containers.
40. Tez – Automatic Reduce Parallelism

• Event model: map tasks send data-statistics events to the Reduce Vertex Manager.
• Vertex Manager: pluggable user logic that understands the data statistics and can formulate the correct parallelism. Advises the vertex controller on parallelism.

[Diagram: the Map Vertex sends data size statistics to the Vertex Manager in the App Master; the manager sets the parallelism on the Reduce Vertex's state machine, which re-routes data and cancels now-unneeded reduce tasks.]
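The vertex manager's decision can be sketched as follows; the class, method and parameter names are illustrative, not the actual Tez plug-in API. With the numbers from the earlier runtime-reconfiguration slide (100 planned reducers, only 10 GB observed, roughly 1 GB per reducer) this shrinks the reduce vertex to 10 tasks.

```java
// Hedged sketch of an automatic-reduce-parallelism decision: aggregate the
// data-size statistics reported by map tasks, then shrink the reduce task
// count so each reducer handles roughly targetBytesPerReducer of input.
class ReduceParallelism {
    static int decide(long totalShuffleBytes, long targetBytesPerReducer,
                      int configuredReducers) {
        int needed = (int) Math.ceil((double) totalShuffleBytes / targetBytesPerReducer);
        // Only shrink relative to the compiled plan; never exceed it, and
        // always keep at least one reducer.
        return Math.max(1, Math.min(configuredReducers, needed));
    }
}
```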
41. Tez – Reduce Slow Start/Pre-launch

• Event model: map-completion events are sent to the Reduce Vertex Manager.
• Vertex Manager: pluggable user logic that understands the data size. Advises the vertex controller to launch the reducers before all maps have completed, so that shuffle can start.

[Diagram: the Map Vertex reports task completions to the Vertex Manager in the App Master; the manager tells the Reduce Vertex's state machine to start reduce tasks early.]
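The pre-launch decision reduces to a threshold check, sketched below; the names and the fraction are illustrative, not Tez configuration. Once enough maps have finished, reducers are started so shuffle overlaps with the remaining maps.

```java
// Illustrative sketch of reduce slow start: launch reducers once a configured
// fraction of map tasks has completed, so shuffle can begin early.
class SlowStart {
    static boolean shouldLaunchReducers(int completedMaps, int totalMaps,
                                        double slowStartFraction) {
        if (totalMaps <= 0) return false;
        return (double) completedMaps / totalMaps >= slowStartFraction;
    }
}
```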
42. Tez – Current Status
• Apache Incubator project
  – Rapid development: over 330 JIRAs opened, over 220 resolved.
  – Growing community.
• Focus on stability
  – Testing and quality are the highest priority.
  – Working on Tez+YARN to fix basic performance overheads.
  – Code ready and deployed on multi-node environments.
• DAG of MR processing is working
  – Already functionally equivalent to Map Reduce. Existing Map Reduce jobs can be executed on Tez with few or no changes.
  – Working Hive prototype that can target Tez for execution of queries (HIVE-4660).
  – Work started on a prototype of Pig that can target Tez.
43. Tez – Current Status

[Diagram: a typical pattern in a TPC-DS query. A fact table is joined with Dimension Table 1 (producing Result Table 1), then with Dimension Table 2 (Result Table 2), then with Dimension Table 3 (Result Table 3); an optimized variant for small data sets collapses the chain. Both can now run as a single Tez job.]
44. Tez – MRR Performance

[Bar chart: TPC-DS Query 12 with Hive on Tez. Elapsed time in seconds (smaller is better) for Traditional Map-Reduce vs. Tez Map-Reduce-Reduce, across RC File and ORC File formats at Scale 200 and Scale 1000.]
45. Tez – Roadmap
• Full DAG support
  – Multi-way input and output.
  – Other graph connection patterns.
• Performance optimizations
  – Container reuse
  – Cross-task shared resources
  – Using HDFS data caching
• Runtime plan optimizations
  – Automatic input (map) parallelism
  – Automatic aggregation (reduce) parallelism
• Usability
  – Stability and testability
  – Recovery and history
46. Tez – Community
• Early adopters and contributors welcome
  – Adopters to drive more scenarios; contributors to make them happen.
  – Hive and Pig communities are on board and making great progress – HIVE-4660 and PIG-3446.
• Stay tuned for Tez meetups with deep dives on Tez architecture and using Tez
  – http://www.meetup.com/Apache-Tez-User-Group
• Useful links
  – Work tracking: https://issues.apache.org/jira/browse/TEZ
  – Code: https://github.com/apache/incubator-tez
  – Developer list: dev@tez.incubator.apache.org
  – User list: user@tez.incubator.apache.org
  – Issues list: issues@tez.incubator.apache.org
47. Tez – Takeaways
• Distributed execution framework that works on computations represented as dataflow graphs.
• Naturally maps to execution plans produced by query optimizers.
• Execution architecture designed to enable dynamic performance optimizations at runtime.
• Open source Apache project – your use cases and code are welcome.
• It works and is already being used by Hive.
Editor's Notes

I can't really talk about Hortonworks without first taking a moment to talk about the history of Hadoop. What we now know of as Hadoop really started back in 2005, when the team at Yahoo! started work on a project to build a large-scale data storage and processing technology that would allow them to store and process massive amounts of data to underpin Yahoo!'s most critical application, Search. The initial focus was on building out the technology – the key components being HDFS and MapReduce – that would become the core of what we think of as Hadoop today, and continuing to innovate it to meet the needs of this specific application.

By 2008, Hadoop usage had greatly expanded inside of Yahoo!, to the point that many applications were now using this data management platform, and as a result the team's focus extended to include operations: now that applications were beginning to propagate around the organization, sophisticated capabilities for operating it at scale were necessary. It was also at this time that usage began to expand well beyond Yahoo!, with many notable organizations (including Facebook and others) adopting Hadoop as the basis of their large-scale data processing and storage applications, necessitating a focus on operations to support what was by now a large variety of critical business applications.

In 2011, recognizing that more mainstream adoption of Hadoop was beginning to take off and with an objective of facilitating it, the core team left – with the blessing of Yahoo! – to form Hortonworks. The goal of the group was to facilitate broader adoption by addressing the enterprise capabilities that would enable a larger number of organizations to adopt and expand their usage of Hadoop. [Note: if useful as a talk track, Cloudera was formed in 2008, well BEFORE the operational expertise of running Hadoop at scale was established inside of Yahoo!]

Our commitment:
- Make Hadoop an enterprise data platform
- Innovate core platform, data, & operational services
- Integrate deeply with the enterprise ecosystem
- Provide world-class enterprise support
- Drive 100% open source software development and releases through the core Apache projects
- Address enterprise needs in community projects
- Establish Apache Foundation projects as "the standard"
- Promote open community vs. vendor control / lock-in
- Enable the Hadoop market to function
- Make it easy for enterprises to deploy at scale
- Be the best at enabling deep ecosystem integration
- Create a pull market with key strategic partners

Stinger: there is buzz about low-latency access in Hadoop. Hortonworks unveiled the Stinger Initiative to make Apache Hive 100X faster for interactive queries, leading the effort with a group of community contributors focused on enhancing Apache Hive, the de facto standard for SQL access to Hadoop.

- Enterprise Reports – your cell phone bill is an example
- Dashboard – KPI tracking
- Parameterized Reports – what are the hot prospects in my region?
- Visualization – visual exploration of data
- Data Mining – large-scale data processing and extraction, usually fed to other tools

How?
- Improve latency & throughput: query engine improvements, new "Optimized RCFile" column store, next-gen runtime (eliminates M/R latency)
- Extend deep analytical ability: analytics functions, improved SQL coverage, continued focus on core Hive use cases

Time (y-axis) in seconds. Smaller is better.