Hadoop bangalore-meetup-dec-2011-hadoop nextgen

Hadoop NextGen/YARN/MRv2

Presentation Transcript

• Hadoop NextGen/MRv2/YARN
  Sharad Agarwal, sharad@apache.org
• About me
  • Apache Foundation
    – Hadoop Committer and PMC member
    – Hadoop MR contributor for ~4 years
    – Author of Hadoop NextGen core
  • Head of Technology Platforms @InMobi
    – Formerly Architect @Yahoo!
• Hadoop Map-Reduce Today
  • JobTracker
    – Manages cluster resources and job scheduling
  • TaskTracker
    – Per-node agent
    – Manages tasks
• Current Limitations
  • Scalability
    – Maximum cluster size: 4,000 nodes
    – Maximum concurrent tasks: 40,000
    – Coarse synchronization in JobTracker
  • Single point of failure
    – Failure kills all queued and running jobs
    – Jobs need to be re-submitted by users
  • Restart is very tricky due to complex state
  • Hard partition of resources into map and reduce slots
• Current Limitations
  • Lacks support for alternate paradigms
    – Iterative applications implemented using Map-Reduce are 10x slower
    – Examples: K-Means, PageRank
  • Lack of wire-compatible protocols
    – Client and cluster must be of the same version
    – Applications and workflows cannot migrate to different clusters
• Next Generation Map-Reduce: Requirements
  • Reliability
  • Availability
  • Scalability: clusters of 6,000 machines
    – Each machine with 16 cores, 48 GB RAM, 24 TB of disk
    – 100,000 concurrent tasks
    – 10,000 concurrent jobs
  • Wire Compatibility
  • Agility & Evolution: ability for customers to control upgrades to the grid software stack
• Next Generation Map-Reduce: Architecture
  • Split up the two major functions of the JobTracker
    – Cluster resource management
    – Application life-cycle management
  • Map-Reduce becomes a user-land library (see the sketch below)
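
To make "user-land library" concrete, here is a minimal word-count driver sketch: the job code is ordinary MapReduce, and the only YARN-specific piece is selecting the runtime via mapreduce.framework.name. The class names are placeholders, and the Job.getInstance call follows the later Hadoop 2.x API rather than the exact 0.23 interfaces.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountOnYarn {

      // Placeholder mapper: emits (word, 1) for every token in a line.
      public static class TokenizerMapper
          extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
            throws IOException, InterruptedException {
          for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
              word.set(token);
              ctx.write(word, ONE);
            }
          }
        }
      }

      // Placeholder reducer: sums the counts for each word.
      public static class SumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable v : values) sum += v.get();
          ctx.write(key, new IntWritable(sum));
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The only YARN-specific line: use the YARN runtime instead of the classic JobTracker.
        conf.set("mapreduce.framework.name", "yarn");
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountOnYarn.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }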
• Architecture
  [Diagram: Clients submit jobs to the Resource Manager; per-node Node Managers host each application's App Mstr and its Containers. Arrows show Job Submission, Node Status, Resource Request, and MapReduce Status flows.]
• Architecture
  • Resource Manager
    – Global resource scheduler
    – Hierarchical queues
  • Node Manager
    – Per-machine agent
    – Manages the life-cycle of containers
    – Container resource monitoring
  • Application Master
    – Per-application
    – Manages application scheduling and task execution
    – e.g. the Map-Reduce Application Master
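
A sketch of how a client talks to this architecture, assuming the YarnClient API from the later Hadoop 2.x client libraries (not the exact 0.23 interfaces): the client asks the Resource Manager for an application id and submits a container spec for the Application Master; from then on scheduling is between the AM, the RM, and the Node Managers. The application name and the /bin/sleep command are placeholders.

    import java.util.Collections;
    import org.apache.hadoop.yarn.api.records.ApplicationId;
    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;
    import org.apache.hadoop.yarn.util.Records;

    public class SubmitToYarn {
      public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // Ask the Resource Manager for a new application id.
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
        appContext.setApplicationName("demo-app");

        // Launch spec for the Application Master container (placeholder command).
        ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
        amContainer.setCommands(Collections.singletonList("/bin/sleep 60"));
        appContext.setAMContainerSpec(amContainer);

        // Resources the Application Master container itself needs.
        Resource amResource = Records.newRecord(Resource.class);
        amResource.setMemory(512);
        appContext.setResource(amResource);

        ApplicationId appId = yarnClient.submitApplication(appContext);
        System.out.println("Submitted application " + appId);
      }
    }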
• Improvements vis-à-vis current Map-Reduce
  • Scalability
    – Application life-cycle management is very expensive
    – Partition resource management and application life-cycle management
    – Application management is distributed
    – Hardware trends: currently run clusters of 4,000 machines
      • 6,000 machines (2012 hardware) > 12,000 machines (2009 hardware)
      • <8 cores, 16 GB, 4 TB> vs. <16+ cores, 48/96 GB, 24 TB>
• Improvements vis-à-vis current Map-Reduce
  • Availability
    – Application Master
      • Optional failover via application-specific checkpoint
      • Map-Reduce applications pick up where they left off
    – Resource Manager
      • No single point of failure: failover via ZooKeeper
      • Application Masters are restarted automatically
• Improvements vis-à-vis current Map-Reduce
  • Wire Compatibility
    – Protocols are wire-compatible
    – Old clients can talk to new servers
    – Rolling upgrades
• Improvements vis-à-vis current Map-Reduce
  • Agility / Evolution
    – Map-Reduce now becomes a user-land library
    – Multiple versions of Map-Reduce can run in the same cluster (à la Apache Pig)
      • Faster deployment cycles for improvements
    – Customers upgrade Map-Reduce versions on their own schedule (see the sketch below)
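
One way to read "multiple versions in the same cluster": a job can point at its own MapReduce framework build staged in HDFS. The sketch below uses the mapreduce.application.framework.path and mapreduce.application.classpath properties from later Hadoop 2.x releases (an assumption relative to the 0.23 timeframe), and the HDFS path is hypothetical.

    import org.apache.hadoop.conf.Configuration;

    public class PerJobMrVersion {
      public static Configuration withOwnFramework() {
        Configuration conf = new Configuration();
        // Run on YARN rather than the classic JobTracker runtime.
        conf.set("mapreduce.framework.name", "yarn");
        // Ship a specific MapReduce framework tarball with this job, so other jobs
        // on the same cluster can use a different MR version on their own schedule.
        conf.set("mapreduce.application.framework.path",
            "hdfs:///frameworks/mapreduce-2.x.tar.gz#mrframework");
        conf.set("mapreduce.application.classpath",
            "$PWD/mrframework/share/hadoop/mapreduce/*,"
            + "$PWD/mrframework/share/hadoop/mapreduce/lib/*");
        return conf;
      }
    }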
• Improvements vis-à-vis current Map-Reduce
  • Utilization
    – Generic resource model (see the sketch below)
      • Memory
      • CPU
      • Disk bandwidth
      • Network bandwidth
    – Removes the fixed partition of map and reduce slots
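
A sketch of a container ask against the generic resource model, using the AMRMClient ContainerRequest from the later Hadoop 2.x client libraries (the initial model was essentially memory-only; virtual cores and other resources came later). No map or reduce slots appear anywhere: the scheduler just sees a capability and a priority.

    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.util.Records;

    public class GenericResourceAsk {
      public static ContainerRequest build() {
        // Capability is just a bag of resources, not a "map slot" or "reduce slot".
        Resource capability = Records.newRecord(Resource.class);
        capability.setMemory(2048);     // MB of RAM for the container
        capability.setVirtualCores(2);  // CPU share (added after the memory-only model)

        Priority priority = Records.newRecord(Priority.class);
        priority.setPriority(1);

        // No node or rack constraints: any machine with the capacity will do.
        return new ContainerRequest(capability, null, null, priority);
      }
    }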
• Improvements vis-à-vis current Map-Reduce
  • Support for programming paradigms other than Map-Reduce
    – MPI
    – Master-Worker
    – Machine Learning
    – Iterative processing
    – Enabled by allowing the use of a paradigm-specific Application Master (see the sketch below)
    – All run on the same Hadoop cluster
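
A rough sketch of what a paradigm-specific Application Master might look like, again assuming the later Hadoop 2.x AMRMClient API: register with the Resource Manager, ask for worker containers with the same ContainerRequest pattern as above, and pick up grants from the allocate heartbeat. The container count and sizes are placeholders, and launching the workers (via an NMClient) is omitted.

    import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;
    import org.apache.hadoop.yarn.util.Records;

    public class MasterWorkerAppMaster {
      public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(new YarnConfiguration());
        rmClient.start();

        // Register with the Resource Manager; this process now owns scheduling
        // for its own application, whatever the programming paradigm is.
        rmClient.registerApplicationMaster("", 0, "");

        // Ask for a handful of worker containers (master-worker, not map/reduce slots).
        int wanted = 4;
        Resource workerSize = Records.newRecord(Resource.class);
        workerSize.setMemory(1024);
        Priority priority = Records.newRecord(Priority.class);
        priority.setPriority(0);
        for (int i = 0; i < wanted; i++) {
          rmClient.addContainerRequest(new ContainerRequest(workerSize, null, null, priority));
        }

        // Heartbeat loop: allocate() reports progress and returns newly granted containers.
        int granted = 0;
        while (granted < wanted) {
          AllocateResponse response = rmClient.allocate((float) granted / wanted);
          for (Container c : response.getAllocatedContainers()) {
            granted++;
            // A real AM would start its worker process in this container via an NMClient.
            System.out.println("Got container " + c.getId() + " on " + c.getNodeId());
          }
          Thread.sleep(1000);
        }

        rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "done", "");
      }
    }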
• Summary
  • The next generation of Map-Reduce takes Hadoop to the next level
    – Scale out even further
    – High availability
    – Cluster utilization
    – Support for paradigms other than Map-Reduce
• Status
  • Apache Hadoop 0.23 release is out
    – HDFS Federation
    – MRv2
  • Currently undergoing tests at small scale (~500 nodes)
  • Alpha
    – 2,000 nodes
    – Q1 2012
  • Beta/Production
    – Variety of applications and loads
    – 4,000+ nodes
    – Q2 2012
• Questions?
  Follow me on Twitter: @sharad_ag