
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS

This is the presentation from the "Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS" webinar on May 28, 2014. Rohit Bakhshi, a senior product manager at Hortonworks, and Vinod Vavilapalli, a PMC member for Apache Hadoop, give an overview of YARN in HDFS and present the new features in HDP 2.1: HDFS extended ACLs, HTTPS wire encryption, HDFS DataNode caching, ResourceManager high availability, the Application Timeline Server, and Capacity Scheduler preemption.

Published in: Software, Technology, Business

Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS

  1. Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS. Hortonworks. We do Hadoop. (© Hortonworks Inc. 2014)
  2. Speakers: Justin Sears (Hortonworks Product Marketing Manager); Rohit Bakhshi (Hortonworks Senior Product Manager & PM for Apache Hadoop & Apache Solr in Hortonworks Data Platform); Vinod Vavilapalli (Foundational Hadoop Architect, Hortonworks Engineer, PMC for Apache Hadoop & Leads YARN Development at Hortonworks)
  3. Agenda: Overview of YARN in HDFS; New YARN & HDFS Features in HDP 2.1; Q & A
  4. A Modern Data Architecture (diagram). APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications. DATA SYSTEM: REPOSITORIES (RDBMS, EDW, MPP) and ENTERPRISE HADOOP (Governance & Integration, Security, Operations, Data Access, Data Management). SOURCES: OLTP, ERP, CRM Systems; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data. OPERATIONS TOOLS: Provision, Manage & Monitor. DEV & DATA TOOLS: Build & Test.
  5. HDP 2.1: Enterprise Hadoop (Hortonworks Data Platform diagram). OPERATIONS: Provision, Manage & Monitor (Ambari, ZooKeeper); Scheduling (Oozie). GOVERNANCE & INTEGRATION: Data Workflow, Lifecycle & Governance (Falcon, Sqoop, Flume, NFS, WebHDFS). DATA ACCESS: Batch (MapReduce); Script (Pig); SQL (Hive/Tez, HCatalog); NoSQL (HBase, Accumulo); Stream (Storm); Search (Solr); Others (In-Memory Analytics, ISV engines). DATA MANAGEMENT: YARN: Data Operating System; HDFS (Hadoop Distributed File System), nodes 1 … N. SECURITY: Authentication, Authorization, Accounting, Data Protection; Storage: HDFS; Resources: YARN; Access: Hive, …; Pipeline: Falcon; Cluster: Knox.
  6. HDP 2.1: Data Management (Hortonworks Data Platform diagram). OPERATIONS: Provision, Manage & Monitor (Ambari, ZooKeeper); Scheduling (Oozie). GOVERNANCE & INTEGRATION: Data Workflow, Lifecycle & Governance (Falcon, Sqoop, Flume, NFS, WebHDFS). DATA ACCESS: Batch (MapReduce); Script (Pig); SQL (Hive/Tez, HCatalog); NoSQL (HBase, Accumulo); Stream (Storm); Search (Solr); Others (In-Memory Analytics, ISV engines). SECURITY: Authentication, Authorization, Accounting, Data Protection; Storage: HDFS; Resources: YARN; Access: Hive, …; Pipeline: Falcon; Cluster: Knox. DATA MANAGEMENT: YARN: Data Operating System; HDFS (Hadoop Distributed File System), nodes 1 … N.
  7. Agenda: Overview, Features, Q & A
  8. Apache Hadoop YARN and HDFS: The Data Operating System for Hadoop 2.0. Flexible: enables other purpose-built data processing models beyond MapReduce (batch), such as interactive and streaming. Efficient: double processing IN Hadoop on the same hardware while providing predictable performance & quality of service. Shared: provides a stable, reliable, secure foundation and shared operational services across multiple workloads. Data Processing Engines Run Natively IN Hadoop: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm), IN-MEMORY (Spark), GRAPH (Giraph), SAS (LASR, HPA), ONLINE (HBase, Accumulo), OTHERS. HDFS: Redundant, Reliable Storage. YARN: Cluster Resource Management.
  9. Agenda: Overview, Features, Q & A
  10. HDP 2.1 HDFS: What’s New. HDFS Extended ACLs (theme: Security): provides granular access control to datasets in HDFS. HTTPS Wire Encryption (theme: Security): swebhdfs (HTTPS support for WebHDFS); HTTPS support for the Hadoop Web UI. HDFS DataNode Caching (theme: Performance): enhanced read performance via in-memory caching of files.
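
The extended ACLs on slide 10 follow the POSIX ACL model. As a minimal sketch (not HDFS code), the Python below parses an ACL spec string of the kind accepted by `hdfs dfs -setfacl -m` and runs a simplified access check. The names `parse_acl_spec` and `is_allowed`, and all users and groups, are hypothetical. The evaluation order (owner, then named user, then groups filtered by the mask, then other) is an assumption based on the HDFS permissions guide and should be verified there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclEntry:
    kind: str   # "user", "group", "mask", or "other"
    name: str   # "" means the file owner / owning group entry
    perms: str  # three characters, e.g. "r-x"


def parse_acl_spec(spec):
    """Parse 'user::rwx,user:alice:r-x,...' into AclEntry objects."""
    entries = []
    for part in spec.split(","):
        kind, name, perms = part.strip().split(":")
        if kind not in {"user", "group", "mask", "other"} or len(perms) != 3:
            raise ValueError(f"bad ACL entry: {part!r}")
        entries.append(AclEntry(kind, name, perms))
    return entries


def _grants(perms, wanted, mask=None):
    """True if perms (optionally filtered by mask) include 'r', 'w', or 'x'."""
    idx = "rwx".index(wanted)
    ok = perms[idx] == wanted
    if mask is not None:
        ok = ok and mask[idx] == wanted
    return ok


def is_allowed(entries, owner, owning_group, user, groups, wanted):
    """Simplified check: owner, named user, matching groups, then other."""
    mask = next((e.perms for e in entries if e.kind == "mask"), None)
    if user == owner:
        return any(_grants(e.perms, wanted)
                   for e in entries if e.kind == "user" and e.name == "")
    named = [e for e in entries if e.kind == "user" and e.name == user]
    if named:
        return _grants(named[0].perms, wanted, mask)
    matching = [e for e in entries if e.kind == "group"
                and (e.name in groups or (e.name == "" and owning_group in groups))]
    if matching:
        # A matching group entry decides the outcome; "other" is not consulted.
        return any(_grants(e.perms, wanted, mask) for e in matching)
    return any(_grants(e.perms, wanted) for e in entries if e.kind == "other")
```

In this sketch a named group with `rw-` still cannot write when the mask is `r-x`, because the mask caps every named-user and group entry.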
  11. HDFS Coordinated DataNode Caching: in-memory cache for HDFS files, for enhanced read performance; identify files to be cached through centralized management controls; manage caching through pools and directives.
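
To make the "pools and directives" on slide 11 concrete, here is a small sketch that only builds `hdfs cacheadmin` command lines; it does not run them, since that needs a live cluster. The helper names, the pool name `reports`, and the path are hypothetical. The flags shown (`-addPool`, `-limit`, `-addDirective`, `-path`, `-pool`, `-replication`, `-ttl`) are the ones I believe the cacheadmin tool accepts; check `hdfs cacheadmin -help` on your version before relying on them.

```python
import shlex


def add_pool_cmd(pool, limit_bytes=None):
    """Build the command that creates a cache pool (a group of directives)."""
    cmd = ["hdfs", "cacheadmin", "-addPool", pool]
    if limit_bytes is not None:
        cmd += ["-limit", str(limit_bytes)]
    return cmd


def add_directive_cmd(path, pool, replication=None, ttl=None):
    """Build the command that asks DataNodes to cache a file or directory."""
    cmd = ["hdfs", "cacheadmin", "-addDirective", "-path", path, "-pool", pool]
    if replication is not None:
        cmd += ["-replication", str(replication)]
    if ttl is not None:
        cmd += ["-ttl", ttl]
    return cmd


if __name__ == "__main__":
    # Hypothetical pool and path, printed rather than executed.
    for cmd in (add_pool_cmd("reports"),
                add_directive_cmd("/data/reports/daily", "reports", replication=2)):
        print(shlex.join(cmd))
```

Returning argument lists instead of a single string keeps the commands safe to pass to `subprocess.run` later without shell quoting issues.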
  12. HDP 2.1 YARN: What’s New. Resource Manager High Availability (theme: Reliability): no service disruption in YARN. Application Timeline Server (theme: Monitoring): operational monitoring across all YARN applications. Capacity Scheduler Pre-emption (theme: Scheduling): enforce SLAs across applications and organizations.
  13. YARN Resource Manager (RM) HA. Automated failover: HDP detects and reacts to Resource Manager host & process failures. Active/Standby: standby ResourceManager with access to shared state store. Fencing: protection against split brain. Full stack resiliency: entire HDP stack certified with ResourceManager HA; RM Restart enables application recovery. Integrated into HDP stack: no external HA frameworks; no external storage needed.
  14. YARN RM HA: Architecture (diagram). Components: Client, Active RM, Standby RM, ZooKeeper Service, and a Cluster of NodeManagers. Diagram labels: Active RM: monitor and maintain active lock, store state; Standby RM: monitor and try to take active lock.
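
The active-lock behaviour in the slide 14 diagram can be illustrated with a toy simulation. This is a sketch under loud assumptions: `ActiveLock` is an in-memory stand-in for the lock held in ZooKeeper, the class and method names are hypothetical, and fencing and the shared state store are not modelled at all.

```python
class ActiveLock:
    """In-memory stand-in for the ZooKeeper-held active lock (assumption)."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, rm_id):
        """Take the lock if it is free; report whether rm_id now holds it."""
        if self.holder is None:
            self.holder = rm_id
        return self.holder == rm_id

    def release(self, rm_id):
        """Free the lock, but only if rm_id is the current holder."""
        if self.holder == rm_id:
            self.holder = None


class ResourceManager:
    """Toy RM that is active while it holds the lock, standby otherwise."""

    def __init__(self, rm_id, lock):
        self.rm_id = rm_id
        self.lock = lock
        self.alive = True

    @property
    def state(self):
        if not self.alive:
            return "failed"
        return "active" if self.lock.holder == self.rm_id else "standby"

    def tick(self):
        """One monitoring round: keep the lock if held, else try to take it."""
        if self.alive:
            self.lock.try_acquire(self.rm_id)

    def crash(self):
        """Simulate a host or process failure; the lock becomes free again."""
        self.alive = False
        self.lock.release(self.rm_id)


if __name__ == "__main__":
    lock = ActiveLock()
    rm1, rm2 = ResourceManager("rm1", lock), ResourceManager("rm2", lock)
    rm1.tick(); rm2.tick()
    print(rm1.state, rm2.state)
    rm1.crash(); rm2.tick()
    print(rm1.state, rm2.state)
```

In a real deployment the lock release on failure would come from ZooKeeper noticing the lost session, not from the failed process itself; the `crash` method collapses that step for brevity.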
  15. Application Timeline Server. Entity and Event collection: applications of all types can create entities and send events. Pluggable store: depending on site requirements. REST APIs: applications and user interfaces can access information via REST. Visualizations: users can build tools and visualizations using the APIs. Users and Admins: applications as well as the system entities/events.
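
As a rough illustration of the entity and event model on slide 15, the sketch below builds the JSON body an application might POST to the timeline server. Nothing is sent over the network. The helper names, host, entity type, and event type are hypothetical. The field names (`entity`, `entitytype`, `starttime`, `events`, `eventtype`, `eventinfo`, `primaryfilters`, `otherinfo`), the `/ws/v1/timeline` path, and port 8188 reflect my understanding of the timeline server's REST API and defaults; treat them as assumptions and confirm against the documentation for your Hadoop version.

```python
import json
import time


def make_timeline_entity(entity_id, entity_type, event_type,
                         event_info=None, now_ms=None):
    """Build one entity carrying a single event (timestamps in milliseconds)."""
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    return {
        "entity": entity_id,
        "entitytype": entity_type,
        "starttime": ts,
        "events": [{
            "timestamp": ts,
            "eventtype": event_type,
            "eventinfo": event_info or {},
        }],
        "primaryfilters": {},
        "otherinfo": {},
    }


def timeline_url(host, port=8188, entity_type=None):
    """URL for posting entities, or for reading one entity type when given."""
    base = f"http://{host}:{port}/ws/v1/timeline"
    return f"{base}/{entity_type}" if entity_type else base


if __name__ == "__main__":
    entity = make_timeline_entity("job_0001", "MY_APP_JOB", "JOB_STARTED",
                                  {"user": "alice"})
    body = json.dumps({"entities": [entity]})
    print("POST", timeline_url("ats.example.com"))
    print(body)
```

A monitoring client like the one on slide 16 would then read the same entities back with a GET on the per-type URL.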
  16. Application Timeline Server (diagram): App Timeline Server, AMBARI, Custom App Monitoring Client.
  17. Capacity Scheduler Preemption: enforce SLAs; preempt across queues. STEP 1, gather queue state: (1) current capacity; (2) guaranteed capacity; (3) pending requests. STEP 2, identify set of preemptions: (1) figure out what is needed to achieve capacity balance; (2) select applications to preempt: over-capacity queues, in FIFO order; (3) respect bounds on the amount of preemption allowed for each round. STEP 3, preempt application(s): (1) remove reservations from the most recently assigned app; (2) issue preemptions for containers of the same app (reverse chronological order, last assigned container first); (3) App Master pre-emption is the last resort. STEP 4, kill containers: (1) track containers that have been issued a preemption but have not yet executed it; (2) after a set of execution periods, forcibly kill these containers.
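
Steps 1 to 3 on slide 17 can be sketched in a few lines. This is a deliberately simplified model, not the Capacity Scheduler's implementation: capacities are counted in whole containers, over-capacity queues are visited in list order, and the names `Queue`, `plan_preemptions`, `pick_containers`, and `max_per_round` are hypothetical. Step 4 (killing containers after a grace period) is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    guaranteed: int  # containers the queue is entitled to
    used: int        # containers currently running
    pending: int     # containers requested but not yet allocated


def plan_preemptions(queues, max_per_round):
    """Steps 1-2: decide how many containers each over-capacity queue gives up."""
    # Demand that under-served queues may reclaim, up to their guarantee.
    needed = sum(min(q.pending, q.guaranteed - q.used)
                 for q in queues if q.used < q.guaranteed)
    # Respect the bound on preemption allowed in a single round.
    budget = min(needed, max_per_round)
    plan = {}
    for q in queues:
        if budget <= 0:
            break
        surplus = q.used - q.guaranteed
        if surplus > 0:
            take = min(surplus, budget)
            plan[q.name] = take
            budget -= take
    return plan


def pick_containers(containers, count):
    """Step 3: choose victims from (container_id, assigned_at, is_app_master)."""
    # Last-assigned first; ApplicationMaster containers only as a last resort.
    ordered = sorted(containers, key=lambda c: (c[2], -c[1]))
    return [c[0] for c in ordered[:count]]


if __name__ == "__main__":
    queues = [Queue("etl", guaranteed=10, used=16, pending=0),
              Queue("adhoc", guaranteed=10, used=4, pending=5)]
    print(plan_preemptions(queues, max_per_round=3))
    print(pick_containers([("c1", 100, True), ("c2", 200, False),
                           ("c3", 300, False)], 2))
```

The per-round bound is what keeps preemption gradual: a large imbalance is corrected over several rounds rather than by reclaiming everything at once.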
  18. Agenda: Overview, Features, Q & A
  19. Learn More About the Hadoop Operating System: Hortonworks.com/labs/yarn/. Register for the remaining 3 Discover HDP 2.1 Webinars: Hortonworks.com/webinars. Next Webinar: Apache Solr for Hadoop Search, Thursday, June 12, 10am Pacific.