BranchReduce: Distributed Branch-and-Bound on YARN


Transcript

  • 1. BranchReduce: Distributed Branch-and-Bound on YARN, June 14, 2012
  • 2. About Me
    Copyright 2012 Cloudera Inc. All rights reserved
  • 3. Hadoop Distributed Processing Frameworks
  • 4. Lots of Other Parallel Processing Platforms
  • 5. Hadoop 2.0: Resource Scheduling with YARN
  • 6. The Data Deluge and the Cambrian Explosion
  • 7. Parallel Distributed Processing For Everyone
  • 8. Building a New Processing Framework on YARN
  • 9. A Terrifyingly Accurate Paraphrasing of JWZ
    Some people, when confronted with a tedious problem, say, “I know, I’ll write a framework.” Now they have two tedious problems.
  • 10. On Designing Frameworks
  • 11. The Example YARN App: Distributed Shell
  • 12. Do We Need a New Programming Language for Developing YARN Applications?
  • 13. Do We Need a New Programming Language for Developing YARN Applications?
  • 14. Leverage Existing Frameworks
      • Popular RPC libraries with support for multiple languages
      • C++, Java, Python
      • We need to make it easy to deploy existing applications on YARN
  • 15. Kitten: Playing with YARN
  • 16. Design Pattern: The Unified Application Master
      • Contains business logic and YARN logic
      • Primary reason: Communication
      • Also: dynamic resource allocation
      • Develop our master/worker applications locally and then deploy them on YARN
  • 17. YARN Lifecycle Management as a Service
      • Specifically, extensions of Guava’s Service interface
      • YarnClientService
      • AppMasterService
      • Contains all of the logic for creating applications and keeping an eye on them
  • 18. Moving the Configuration Logic Out of Java
  • 19. Lua as a Configuration Language
      • Small and Simple
      • Looks like a configuration file
      • Functions are there when/if you need them
      • Inheritance
      • Don’t Repeat Yourself
      • Forgiving of undefined values
      • Java/C++ Integration
  • 20. First Kitten Utility: The cat Function
  • 21. Second Kitten Utility: The yarn Function
  • 22. BranchReduce
  • 23. Branch-and-Bound
  • 24. The Challenge of Parallel Branch and Bound: Unbalanced Search Space
      • Some branches are pruned quickly
      • Can be difficult to determine the best splits a priori
      • Easy to revert to a de facto single-threaded search
  • 25. The Solution: Work Stealing
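
The slide names work stealing without showing how it works. The following is a minimal single-process sketch of the idea, not BranchReduce's implementation: each simulated worker keeps its own deque, takes its own newest task first, and steals the oldest task from the busiest peer when it runs dry. The function names and the toy task tree are assumptions made for illustration.

```python
from collections import deque


def expand(task):
    """Toy unbalanced search: task n branches into n-1 and n-2 until n <= 1."""
    return [task - 1, task - 2] if task > 1 else []


def run_with_stealing(root_task, num_workers):
    """Simulate workers in lock-step; return the number of tasks each one processed."""
    queues = [deque() for _ in range(num_workers)]
    queues[0].append(root_task)  # all work starts on a single worker
    processed = [0] * num_workers

    while any(queues):
        for w in range(num_workers):
            if not queues[w]:
                # Idle worker: steal the oldest task from the busiest peer.
                victim = max(range(num_workers), key=lambda v: len(queues[v]))
                if not queues[victim]:
                    continue  # nothing left anywhere to steal
                queues[w].append(queues[victim].popleft())
            task = queues[w].pop()  # own work is taken newest-first (depth-first)
            processed[w] += 1
            queues[w].extend(expand(task))
    return processed
```

Stealing from the old end of a peer's deque tends to hand over nodes near the root of the tree, which usually carry more remaining work than freshly created leaves.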
  • 26. You Write Three Classes
      • A Task class that implements Writable
      • A GlobalState class that implements Writable and has a mergeWith(GlobalState other) method
      • A Processor class that defines:
          • execute(T task, BranchReduceContext<T, GlobalState> ctxt);
          • With optional initialize and cleanup methods
      • Configuration is done via BranchReduceJob
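
To make the three roles concrete, here is a hedged sketch that mirrors their shape in plain Python. The real API is Java and built on Hadoop's Writable; everything below (`Context`, `emit`, `run_local`, and the subset-sum toy problem) is a hypothetical stand-in, and only the names Task, GlobalState, Processor, `execute`, and the merge method come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """One node of the search tree: the next item to decide and the sum so far."""
    index: int
    total: int


@dataclass
class GlobalState:
    """Best feasible total seen so far; merged whenever workers exchange state."""
    best: int = 0

    def merge_with(self, other):
        self.best = max(self.best, other.best)


class Context:
    """Hypothetical stand-in for BranchReduceContext: holds state and new tasks."""

    def __init__(self, state):
        self.state = state
        self.pending = []

    def emit(self, task):
        self.pending.append(task)


class SubsetSumProcessor:
    """Toy problem: maximize a subset sum that stays at or below `limit`."""

    def __init__(self, items, limit):
        self.items = items
        self.limit = limit
        # suffix[i] = sum of items[i:], an upper bound on what can still be added
        self.suffix = [0] * (len(items) + 1)
        for i in range(len(items) - 1, -1, -1):
            self.suffix[i] = self.suffix[i + 1] + items[i]

    def execute(self, task, ctxt):
        ctxt.state.merge_with(GlobalState(task.total))
        if task.index == len(self.items):
            return  # leaf: every item has been decided
        if task.total + self.suffix[task.index] <= ctxt.state.best:
            return  # bound: this branch cannot beat the incumbent
        with_item = task.total + self.items[task.index]
        if with_item <= self.limit:
            ctxt.emit(Task(task.index + 1, with_item))
        ctxt.emit(Task(task.index + 1, task.total))


def run_local(processor, root):
    """Single-process driver; a real deployment would spread tasks over workers."""
    ctxt = Context(GlobalState())
    ctxt.emit(root)
    while ctxt.pending:
        processor.execute(ctxt.pending.pop(), ctxt)
    return ctxt.state
```

For example, `run_local(SubsetSumProcessor([8, 6, 5, 3], 15), Task(0, 0)).best` evaluates to 14. The merge method is what lets independently running workers share their best bound, so a good solution found in one part of the tree prunes branches everywhere else.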
  • 27. Example: The Knapsack Problem
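
The slide only names the knapsack problem, so the following is a sequential sketch of branch-and-bound on a 0-1 knapsack, assuming positive weights and using the fractional-relaxation upper bound. That bound is a common textbook choice and is not necessarily the one BranchReduce uses; the function name is hypothetical.

```python
def knapsack_branch_and_bound(values, weights, capacity):
    """Return the best total value for a 0-1 knapsack via depth-first branch-and-bound."""
    # Consider items in decreasing value density so the bound is tight.
    order = sorted(range(len(values)), key=lambda i: values[i] / weights[i], reverse=True)
    vals = [values[i] for i in order]
    wts = [weights[i] for i in order]

    def upper_bound(index, value, room):
        """Fractional relaxation: fill greedily, allowing a fraction of one item."""
        for v, w in zip(vals[index:], wts[index:]):
            if w <= room:
                value, room = value + v, room - w
            else:
                return value + v * room / w
        return value

    best = 0
    stack = [(0, 0, capacity)]  # (next item index, value so far, remaining capacity)
    while stack:
        index, value, room = stack.pop()
        best = max(best, value)
        if index == len(vals) or upper_bound(index, value, room) <= best:
            continue  # leaf reached, or branch cannot improve on the incumbent
        stack.append((index + 1, value, room))  # branch: skip the item
        if wts[index] <= room:
            # branch: take the item (pushed last so it is explored first)
            stack.append((index + 1, value + vals[index], room - wts[index]))
    return best
```

With values `[60, 100, 120]`, weights `[10, 20, 30]`, and capacity 50, the function returns 220. Each stack entry is an independent unit of work, which is what makes this kind of search a candidate for distribution across workers.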
  • 28. 0-1 Integer Programming Problems
      • NP-Hard Resource Allocation Problem
      • Portfolio Optimization
      • Asset Securitization
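
For reference, the 0-1 knapsack problem from the previous slide can be written as a 0-1 integer program. The symbols are notation chosen here, not taken from the slides: $v_i$ and $w_i$ are the value and weight of item $i$, $W$ is the capacity, and $x_i$ indicates whether item $i$ is selected.

```latex
\begin{aligned}
\text{maximize}   \quad & \sum_{i=1}^{n} v_i x_i \\
\text{subject to} \quad & \sum_{i=1}^{n} w_i x_i \le W, \\
                        & x_i \in \{0, 1\}, \qquad i = 1, \dots, n.
\end{aligned}
```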
  • 29. Problem Formulation: (Simplified) LP Format
  • 30. Questions? @josh_wills