We’ll get started soon… 
Q&A box is available for your questions 
Webinar will be recorded for future viewing 
Thank you for joining! 
Combine SAS High-Performance Capabilities with Hadoop YARN
We do Hadoop.
Your speakers… 
Arun Murthy, Founder and Architect 
Hortonworks 
@acmurthy 
Paul Kent, Vice President Big Data 
SAS 
@hornpolish 
Agenda 
• Introduction to YARN 
• SAS Workloads on the Cluster 
• SAS Workloads: Resource Settings 
• SAS and YARN 
• YARN Futures 
• Next Steps 
The 1st Generation of Hadoop: Batch
HADOOP 1.0
Built for Web-Scale Batch Apps
[Diagram: separate single-application silos (BATCH, INTERACTIVE, ONLINE), each running its own app on its own HDFS]
• All other usage patterns must leverage that same infrastructure
• Forces the creation of silos for managing mixed workloads
Hadoop MapReduce Classic 
JobTracker 
§ Manages cluster resources and job scheduling 
TaskTracker
§ Per-node agent
§ Manages tasks
MapReduce Classic: Limitations 
Scalability 
§ Maximum Cluster size – 4,000 nodes 
§ Maximum concurrent tasks – 40,000 
§ Coarse synchronization in JobTracker 
Availability 
§ Failure kills all queued and running jobs 
Hard partition of resources into map and reduce slots 
§ Low resource utilization 
Lacks support for alternate paradigms and services 
§ Iterative applications implemented using MapReduce are 10x slower 
Our Vision: Hadoop as Next-Gen Platform
Hadoop 1
• Silos & largely batch
• Single processing engine
Hadoop 2 w/ YARN
• Multiple engines, single data set
• Batch, interactive & real-time
[Diagram: Hadoop 1 stacks MapReduce (cluster resource management & data processing) over HDFS, with Script (Pig) and SQL (Hive) on top. Hadoop 2 makes YARN the Data Operating System (cluster resource management) over HDFS, hosting Script (Pig), SQL (Hive) and Java (Cascading) on Tez, real-time HBase, other engines (Accumulo, Storm, Solr, Spark) and ISV engines side by side across nodes 1…N]
YARN: Taking Hadoop Beyond Batch
Applications Run Natively IN Hadoop
[Diagram: YARN (Cluster Resource Management) over HDFS2 (Redundant, Reliable Storage), hosting BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4, …), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), ONLINE (HBase), and OTHER (Search, Weave, …)]
Store ALL DATA in one place…
Interact with that data in MULTIPLE WAYS
with Predictable Performance and Quality of Service
YARN
Hortonworks Data Platform
[Diagram: YARN: Data Operating System (Cluster Resource Management) over HDFS across nodes 1…N, hosting Batch (MR); Script (Pig), SQL (Hive), Java (Cascading) and other engines on Tez; NoSQL (HBase, Accumulo), Stream (Storm) and other engines on Slider; In-Memory (Spark); PaaS (Kubernetes); and SAS LASR / HPA]
5 Key Benefits of YARN 
1. Scale 
2. New Programming Models & Services
3. Improved cluster utilization 
4. Agility 
5. Beyond Java 
Concepts
Application
§ An application is a temporal job or a long-running service submitted to YARN
§ Examples
– MapReduce job (job)
– HBase cluster (service)
Container
§ Basic unit of allocation
§ Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
– container_0 = 2 GB, 1 CPU
– container_1 = 1 GB, 6 CPUs
§ Replaces the fixed map/reduce slots
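To make the container concept concrete, Hadoop ships a small DistributedShell application that asks YARN for a set of containers with explicit sizes and runs a shell command in each. A minimal sketch, assuming a Hadoop 2.x install (the jar path is illustrative and exact flags vary slightly by version):

# Request two 2 GB / 1-vcore containers that each run `date`.
yarn jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar \
  -shell_command date \
  -num_containers 2 \
  -container_memory 2048 \
  -container_vcores 1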
Design Centre 
Split up the two major functions of JobTracker 
§ Cluster resource management 
§ Application life-cycle management 
MapReduce becomes user-land library 
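In practice, "user-land library" means the execution framework is chosen per job by client-side configuration rather than baked into the cluster. A hedged sketch (the example jar name is illustrative):

# MapReduce is now client-side code: choose the framework per job via
# mapreduce.framework.name (yarn, or local for in-process testing).
hadoop jar hadoop-mapreduce-examples.jar wordcount \
  -D mapreduce.framework.name=yarn in/ out/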
YARN Architecture - Walkthrough
[Diagram: Client2 submits an application to the ResourceManager (Scheduler). A per-application ApplicationMaster (AM1, AM2) starts in a container on a NodeManager; each AM then negotiates further containers (1.1–1.3 for app 1, 2.1–2.4 for app 2) from the Scheduler, and the NodeManagers launch and monitor those containers across the cluster]
Multi-Tenancy with YARN 
Economics as queue-capacity 
§ Hierarchical Queues
SLAs 
§ Preemption 
Resource Isolation 
§ Linux: cgroups 
§ MS Windows: Job Control 
§ Roadmap: Virtualization (Xen, KVM) 
Administration 
§ Queue ACLs 
§ Run-time re-configuration for queues 
§ Charge-back 
[Diagram: ResourceManager Scheduler with Capacity Scheduler hierarchical queues: root splits into Adhoc 10%, DW 70% and Mrkting 20%; children subdivide further, e.g. Dev / Reserved / Prod splits and a Prod queue split into P0 70% / P1 30%]
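For orientation, a hedged sketch of how the diagram's top-level queues map onto capacity-scheduler.xml (the keys are the real CapacityScheduler property names; the values are read off the slide), plus the command that reloads queue definitions at run time:

# Excerpt for capacity-scheduler.xml (inside <configuration>):
#   <property><name>yarn.scheduler.capacity.root.queues</name>
#             <value>Adhoc,DW,Mrkting</value></property>
#   <property><name>yarn.scheduler.capacity.root.Adhoc.capacity</name>
#             <value>10</value></property>
#   <property><name>yarn.scheduler.capacity.root.DW.capacity</name>
#             <value>70</value></property>
#   <property><name>yarn.scheduler.capacity.root.Mrkting.capacity</name>
#             <value>20</value></property>
# Reload queues without restarting the ResourceManager:
yarn rmadmin -refreshQueues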
YARN Applications 
Data processing applications and services 
§ Services - Slider 
§ Real-time event processing – Storm, S4, other commercial platforms 
§ Tez – Generic framework to run a complex DAG 
§ MPI: OpenMPI, MPICH2 
§ Master-Worker 
§ Machine Learning: Spark 
§ Graph processing: Giraph 
§ Enabled by allowing the use of paradigm-specific application master 
Run all on the same Hadoop cluster! 
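As a quick taste of "all on the same cluster": submit a classic MapReduce job, which is now just one YARN application among many, and list everything currently sharing the cluster (the examples-jar path varies by distribution):

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 16 1000
yarn application -list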
SHARE! 
Customers are: 
wrapping up POCs 
building Bigger Clusters 
assembling their Data { Lake, Reservoir } 
and want their software to SHARE the cluster
SAS Workloads on the Cluster 
SAS Workloads on the Cluster - Video 
SAS Workloads on the Cluster 
Some Requests are for a significant slice of the cluster 
Reservation will be ALL DAY, ALL WEEK, ALL MONTH? 
Memory typically fixed (15% of cluster) 
CPU floor, would like the spare capacity when available 
Some Requests are more short term 
Memory can be estimated 
Duration can be capped 
CPU floor, would like spare capacity 
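A hypothetical illustration of that fixed slice: on a 20-node cluster with 256 GB per node (5,120 GB total), a 15% memory reservation comes to roughly 768 GB, or about 38 GB on each node.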
SAS Workloads on the Cluster 
SAS Workloads – Resource Settings 
How much should you reserve? 
not a perfect science yet 
Long Running? 
LASR server by percent of total memory 
More like a batch request? 
HPA procedure by anecdotal experience 
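One hedged starting point for the long-running case, assuming TKMPI_MEMSIZE is expressed in MB per node (the next slide shows where it is set):

# Reserve roughly 15% of each node's physical memory for a
# long-running LASR server (assumes TKMPI_MEMSIZE is MB per node).
NODE_MEM_MB=$(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo)
export TKMPI_MEMSIZE=$(( NODE_MEM_MB * 15 / 100 ))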
SAS Workloads – Resource Settings 
if [ "$USER" = "lasradm" ]; then 
# Custom settings for running under the lasradm account. 
export TKMPI_ULIMIT="-v 50000000” 
export TKMPI_MEMSIZE=50000 
export TKMPI_CGROUP="cgexec -g cpu:75” 
fi 
# if [ "$TKMPI_APPNAME" = "lasr" ]; then 
# Custom settings for a lasr process running under any account. 
# export TKMPI_ULIMIT="-v 50000000" 
# export TKMPI_MEMSIZE=50000 
# export TKMPI_CGROUP="cgexec -g cpu:75" 
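TKMPI_CGROUP above presumes a cpu cgroup named "75" already exists; a hypothetical libcgroup setup for it (cpu.shares is a relative weight, so 768 against the default 1024 approximates a 75% share):

sudo cgcreate -g cpu:/75           # create the cpu-controller group "75"
sudo cgset -r cpu.shares=768 75    # ~75% relative CPU weight vs. the 1024 default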
YARN: Taking Hadoop Beyond Batch
Store ALL DATA in one place…
Interact with that data in MULTIPLE WAYS
with Predictable Performance and Quality of Service
[Diagram: reprise of the earlier architecture: applications run natively IN Hadoop on YARN over HDFS2, hosting BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4, …), GRAPH (Giraph), IN-MEMORY (Spark), ONLINE (HBase)]
YARN Futures 
YARN – Delegated Container Model
[Diagram: the current flow: (1) AM1 sends allocate! to the ResourceManager's Scheduler, (2) the Scheduler returns a container!, (3) AM1 issues startContainer! to the NodeManager hosting container 1.1]
YARN – Delegated Container Model
[Diagram: the delegated flow: (1) AM1 sends allocate! to the Scheduler, (2) the Scheduler returns a container!, (3) AM1 sends delegateContainer! to a long-running ServiceX, (4) which accepts the delegated container]
YARN – Delegated Container Model
[Diagram: (5) ServiceX launches work in the delegated container via the hosting NodeManager]
YARN – Delegated Container Model
[Diagram: (6) ServiceX now runs inside the delegated container on the NodeManager]
PaaS - Kubernetes-on-YARN
YARN as the default enterprise-class scheduler and resource manager for Kubernetes and OpenShift 3
§ First-class support for containerization and mainstream PaaS
§ Updated Go language bindings for YARN
§ Uses the container delegation model
Labels – Constraint Specifications
[Diagram: several NodeManagers carry a "w/ GPU" label. The MR AM's tasks (map 1.1, map 1.2, reduce 1.1) schedule onto any node, while the DL-AM's tasks (DL 1.1, DL 1.2, DL 1.3) are constrained by label to the GPU-equipped NodeManagers]
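Node labels later shipped in Apache Hadoop 2.6; for orientation, the eventual admin CLI looks like this (the node address and label name are illustrative):

yarn rmadmin -addToClusterNodeLabels "gpu"
yarn rmadmin -replaceLabelsOnNode "gpunode01:45454=gpu"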
Reservations - SLAs via Allocation Planning 
YARN
Hortonworks Data Platform
[Diagram: reprise of the earlier Hortonworks Data Platform view: YARN as the Data Operating System over HDFS, hosting Batch (MR), Tez-based engines (Pig, Hive, Cascading), Slider-hosted services (HBase, Accumulo, Storm), Spark, PaaS (Kubernetes), and SAS LASR / HPA]
Next Steps… 
More about SAS & Hortonworks 
http://hortonworks.com/partner/SAS/ 
Download the Hortonworks Sandbox 
Learn Hadoop 
Build Your Analytic App 
Try Hadoop 2 
Contact us: events@hortonworks.com 