BranchReduce Distributed Branch-and-Bound on YARN

Published in: Technology, Education

  1. BranchReduce: Distributed Branch-and-Bound on YARN (June 14, 2012)
  2. About Me. Copyright 2012 Cloudera Inc. All rights reserved
  3. Hadoop Distributed Processing Frameworks
  4. Lots of Other Parallel Processing Platforms
  5. Hadoop 2.0: Resource Scheduling with YARN
  6. The Data Deluge and the Cambrian Explosion
  7. Parallel Distributed Processing For Everyone
  8. Building a New Processing Framework on YARN
  9. A Terrifyingly Accurate Paraphrasing of JWZ: Some people, when confronted with a tedious problem, say, “I know, I’ll write a framework.” Now they have two tedious problems.
  10. On Designing Frameworks
  11. The Example YARN App: Distributed Shell
  12. Do We Need a New Programming Language for Developing YARN Applications?
  13. Do We Need a New Programming Language for Developing YARN Applications?
  14. Leverage Existing Frameworks
      • Popular RPC libraries with support for multiple languages
      • C++, Java, Python
      • We need to make it easy to deploy existing applications on YARN
  15. Kitten: Playing with YARN
  16. Design Pattern: The Unified Application Master
      • Contains business logic and YARN logic
      • Primary reason: Communication
      • Also: dynamic resource allocation
      • Develop our master/worker applications locally and then deploy them on YARN
  17. YARN Lifecycle Management as a Service
      • Specifically, extensions of Guava’s Service interface
      • YarnClientService
      • AppMasterService
      • Contains all of the logic for creating applications and keeping an eye on them
  18. Moving the Configuration Logic Out of Java
  19. Lua as a Configuration Language
      • Small and Simple
      • Looks like a configuration file
      • Functions are there when/if you need them
      • Inheritance
      • Don’t Repeat Yourself
      • Forgiving of undefined values
      • Java/C++ Integration
  20. First Kitten Utility: The cat Function
  21. Second Kitten Utility: The yarn Function
  22. BranchReduce
  23. Branch-and-Bound
  24. The Challenge of Parallel Branch and Bound: Unbalanced Search Space
      • Some branches are pruned quickly
      • Can be difficult to determine the best splits a priori
      • Easy to revert to a de facto single-threaded search
  25. The Solution: Work Stealing
  26. You Write Three Classes
      • A Task class that implements Writable
      • A GlobalState class that implements Writable and has a mergeWith(GlobalState other) method
      • A Processor class that defines:
        • execute(T task, BranchReduceContext<T, GlobalState> ctxt);
        • With optional initialize and cleanup methods
      • Configuration is done via BranchReduceJob
  27. Example: The Knapsack Problem
  28. 0-1 Integer Programming Problems
      • NP-Hard Resource Allocation Problem
      • Portfolio Optimization
      • Asset Securitization
  29. Problem Formulation: (Simplified) LP Format
  30. Questions? @josh_wills
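
The three-class model on slide 26 and the knapsack example on slide 27 can be pictured with a small single-process sketch. Everything named here is an assumption made for illustration: `KnapsackTask`, `BestValue`, `SketchContext`, and `KnapsackProcessor` are hypothetical stand-ins, not BranchReduce classes. The slides say Task and GlobalState implement Hadoop's `Writable`; serialization is left out, and since the slides do not show the methods of `BranchReduceContext`, the `emit` method below is invented. The bound assumes items are sorted by value-to-weight ratio, highest first.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical Task: one node of the search tree (Writable serialization omitted).
final class KnapsackTask {
    final int index;   // next item to decide on
    final int weight;  // total weight packed so far
    final int value;   // total value packed so far

    KnapsackTask(int index, int weight, int value) {
        this.index = index;
        this.weight = weight;
        this.value = value;
    }
}

// Hypothetical GlobalState: the best feasible value seen so far.
final class BestValue {
    int best;

    void mergeWith(BestValue other) {
        best = Math.max(best, other.best);
    }
}

// Hypothetical stand-in for BranchReduceContext: shared state plus pending tasks.
final class SketchContext {
    final BestValue state = new BestValue();
    final Deque<KnapsackTask> pending = new ArrayDeque<>();

    void emit(KnapsackTask task) {
        pending.push(task);
    }
}

// Hypothetical Processor: bounds the task, then branches on the next item.
final class KnapsackProcessor {
    private final int[] weights;
    private final int[] values;
    private final int capacity;

    // Assumes items are sorted by value/weight ratio, highest first.
    KnapsackProcessor(int[] weights, int[] values, int capacity) {
        this.weights = weights;
        this.values = values;
        this.capacity = capacity;
    }

    // Optimistic bound: fill the remaining room greedily, allowing one fractional item.
    double upperBound(KnapsackTask t) {
        double bound = t.value;
        int room = capacity - t.weight;
        for (int i = t.index; i < weights.length && room > 0; i++) {
            if (weights[i] <= room) {
                room -= weights[i];
                bound += values[i];
            } else {
                bound += (double) values[i] * room / weights[i];
                room = 0;
            }
        }
        return bound;
    }

    void execute(KnapsackTask t, SketchContext ctxt) {
        BestValue seen = new BestValue();
        seen.best = t.value;  // every emitted task is a feasible partial packing
        ctxt.state.mergeWith(seen);
        if (t.index == weights.length || upperBound(t) <= ctxt.state.best) {
            return;  // leaf reached, or branch pruned by the bound
        }
        ctxt.emit(new KnapsackTask(t.index + 1, t.weight, t.value));  // skip the item
        int w = t.weight + weights[t.index];
        if (w <= capacity) {
            ctxt.emit(new KnapsackTask(t.index + 1, w, t.value + values[t.index]));  // take it
        }
    }
}

final class KnapsackSketch {
    static int solve(int[] weights, int[] values, int capacity) {
        KnapsackProcessor processor = new KnapsackProcessor(weights, values, capacity);
        SketchContext ctxt = new SketchContext();
        ctxt.emit(new KnapsackTask(0, 0, 0));
        while (!ctxt.pending.isEmpty()) {
            processor.execute(ctxt.pending.pop(), ctxt);
        }
        return ctxt.state.best;
    }

    public static void main(String[] args) {
        // Three items, capacity 50: the best packing takes the last two items.
        System.out.println(solve(new int[] {10, 20, 30}, new int[] {60, 100, 120}, 50)); // prints 220
    }
}
```

In this sketch `mergeWith` only ever raises the best value, so merging states in any order gives the same result; that property is what would let workers exchange state without coordination.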

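
Slides 24 and 25 name the problem (an unbalanced search space) and the remedy (work stealing) without showing a mechanism. As an in-process analogy only, the sketch below runs the same knapsack search on `java.util.concurrent.ForkJoinPool`, whose worker threads steal queued subtasks from one another when they run dry. This is an assumption-laden illustration: the slides do not describe how BranchReduce moves tasks between YARN containers, and `StealingSketch` and its `Node` class are hypothetical names. The bound here is the deliberately simple "take every remaining item" estimate, so no item ordering is assumed.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicInteger;

final class StealingSketch {
    private final int[] weights;
    private final int[] values;
    private final int capacity;
    // Shared incumbent: it only ever increases, so a stale read can delay pruning but never prune wrongly.
    private final AtomicInteger best = new AtomicInteger(0);

    StealingSketch(int[] weights, int[] values, int capacity) {
        this.weights = weights;
        this.values = values;
        this.capacity = capacity;
    }

    // One search-tree node. Forked children wait on the worker's deque, where idle workers can steal them.
    final class Node extends RecursiveAction {
        private final int index;
        private final int weight;
        private final int value;

        Node(int index, int weight, int value) {
            this.index = index;
            this.weight = weight;
            this.value = value;
        }

        @Override
        protected void compute() {
            best.accumulateAndGet(value, Math::max);
            int optimistic = value;  // loose bound: pretend every remaining item fits
            for (int i = index; i < values.length; i++) {
                optimistic += values[i];
            }
            if (index == weights.length || optimistic <= best.get()) {
                return;  // leaf, or pruned
            }
            Node skip = new Node(index + 1, weight, value);
            int w = weight + weights[index];
            if (w <= capacity) {
                invokeAll(new Node(index + 1, w, value + values[index]), skip);
            } else {
                skip.compute();  // only one child: no need to fork
            }
        }
    }

    static int solve(int[] weights, int[] values, int capacity, int workers) {
        StealingSketch search = new StealingSketch(weights, values, capacity);
        ForkJoinPool pool = new ForkJoinPool(workers);
        try {
            pool.invoke(search.new Node(0, 0, 0));
        } finally {
            pool.shutdown();
        }
        return search.best.get();
    }

    public static void main(String[] args) {
        System.out.println(solve(new int[] {10, 20, 30}, new int[] {60, 100, 120}, 50, 4)); // prints 220
    }
}
```

Which worker explores which subtree varies from run to run, but the returned optimum does not, because pruning only discards branches that cannot beat a value already achieved. `ForkJoinPool.getStealCount()` gives a rough view of how much stealing took place.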