Hadoop Summit 2012 | BranchReduce: Distributed Branch-and-Bound on YARN
Session Abstract

Branch-and-bound is a widely used technique for efficiently searching for solutions to combinatorial optimization problems. In this session, we will introduce BranchReduce, an open-source Java library for performing distributed branch-and-bound on a Hadoop cluster under YARN. Applications only need to write code that is specific to their optimization problem (namely the branching rule, the lower bound computation, and the upper bound computation), and BranchReduce handles deploying the application to the cluster, managing the execution, and periodically rebalancing the search space across the machines. We will give an overview of how BranchReduce works and then walk through an example that solves a scheduling problem with a near-linear speedup over a single machine implementation.
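To make the three problem-specific pieces named in the abstract concrete, here is a minimal single-machine sketch of branch-and-bound for the 0-1 knapsack problem. It is not BranchReduce code: the class name `KnapsackBnB` and its methods are made up for illustration, and the fractional-relaxation bound is one common choice rather than anything the session prescribes. Because knapsack is a maximization problem, the upper bound is what prunes a subtree and the lower bound is the best feasible value found so far.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;

/** Illustrative single-machine branch-and-bound for 0-1 knapsack (assumes positive weights). */
final class KnapsackBnB {
    /** A search-tree node: items [0, next) are decided, the rest are still open. */
    private static final class Node {
        final int next, weight, value;
        Node(int next, int weight, int value) {
            this.next = next; this.weight = weight; this.value = value;
        }
    }

    /** Returns the largest total value whose total weight fits in capacity. */
    static int solve(int[] values, int[] weights, int capacity) {
        int n = values.length;
        // Order items by value density, best first, so the fractional bound below is valid.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order,
            Comparator.comparingDouble((Integer i) -> -(double) values[i] / weights[i]));
        int[] v = new int[n], w = new int[n];
        for (int i = 0; i < n; i++) { v[i] = values[order[i]]; w[i] = weights[order[i]]; }

        int best = 0; // lower bound: value of the best feasible solution seen so far
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(new Node(0, 0, 0));
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            best = Math.max(best, node.value); // every node on the stack is feasible
            if (node.next == n) continue;      // leaf: nothing left to decide
            if (upperBound(node, v, w, capacity) <= best) continue; // prune this subtree
            // Branching rule: either skip item node.next or take it (if it fits).
            stack.push(new Node(node.next + 1, node.weight, node.value));
            int taken = node.weight + w[node.next];
            if (taken <= capacity) {
                stack.push(new Node(node.next + 1, taken, node.value + v[node.next]));
            }
        }
        return best;
    }

    /** Upper bound: fill the remaining room greedily, allowing one fractional item. */
    private static double upperBound(Node node, int[] v, int[] w, int capacity) {
        double bound = node.value;
        int room = capacity - node.weight;
        for (int i = node.next; i < v.length && room > 0; i++) {
            if (w[i] <= room) { bound += v[i]; room -= w[i]; }
            else { bound += (double) v[i] * room / w[i]; room = 0; }
        }
        return bound;
    }
}
```

For example, `KnapsackBnB.solve(new int[]{60, 100, 120}, new int[]{10, 20, 30}, 50)` returns 220 (the second and third items). A distributed version would split the stack of open nodes across machines, which is where the rebalancing mentioned in the abstract comes in.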

Published in: Technology, Education
  • 1600+ Lines of Java Code
    • 796 Client, 837 Application Master
  • Relatively Simple Lifecycle Patterns
    • Start Up: RPCs and Resource Definitions
    • Periodic Heartbeat Checks
    • Shut Down: Resource Cleanup
  • No advanced features
    • Checkpointing
    • Application Master failures
    • Dynamic resource allocation
  • Transcript

    • 1. BranchReduce: Distributed Branch-and-Bound on YARN. June 14, 2012
    • 2. About Me. Copyright 2012 Cloudera Inc. All rights reserved
    • 3. Hadoop Distributed Processing Frameworks
    • 4. Lots of Other Parallel Processing Platforms
    • 5. Hadoop 2.0: Resource Scheduling with YARN
    • 6. The Data Deluge and the Cambrian Explosion
    • 7. Parallel Distributed Processing For Everyone
    • 8. Building a New Processing Framework on YARN
    • 9. A Terrifyingly Accurate Paraphrasing of JWZ: Some people, when confronted with a tedious problem, say, “I know, I’ll write a framework.” Now they have two tedious problems.
    • 10. On Designing Frameworks
    • 11. The Example YARN App: Distributed Shell
    • 12. Do We Need a New Programming Language for Developing YARN Applications?
    • 13. Do We Need a New Programming Language for Developing YARN Applications?
    • 14. Leverage Existing Frameworks
      • Popular RPC libraries with support for multiple languages
      • C++, Java, Python
      • We need to make it easy to deploy existing applications on YARN
    • 15. Kitten: Playing with YARN
    • 16. Design Pattern: The Unified Application Master
      • Contains business logic and YARN logic
      • Primary reason: Communication
      • Also: dynamic resource allocation
      • Develop our master/worker applications locally and then deploy them on YARN
    • 17. YARN Lifecycle Management as a Service
      • Specifically, extensions of Guava’s Service interface
      • YarnClientService
      • AppMasterService
      • Contains all of the logic for creating applications and keeping an eye on them
    • 18. Moving the Configuration Logic Out of Java
    • 19. Lua as a Configuration Language
      • Small and Simple
      • Looks like a configuration file
      • Functions are there when/if you need them
      • Inheritance
      • Don’t Repeat Yourself
      • Forgiving of undefined values
      • Java/C++ Integration
    • 20. First Kitten Utility: The cat Function
    • 21. Second Kitten Utility: The yarn Function
    • 22. BranchReduce
    • 23. Branch-and-Bound
    • 24. The Challenge of Parallel Branch and Bound: Unbalanced Search Space
      • Some branches are pruned quickly
      • Can be difficult to determine the best splits a priori
      • Easy to revert to a de facto single-threaded search
    • 25. The Solution: Work Stealing
    • 26. You Write Three Classes
      • A Task class that implements Writable
      • A GlobalState class that implements Writable and has a mergeWith(GlobalState other) method
      • A Processor class that defines:
        • execute(T task, BranchReduceContext<T, GlobalState> ctxt);
        • With optional initialize and cleanup methods
      • Configuration is done via BranchReduceJob
    • 27. Example: The Knapsack Problem
    • 28. 0-1 Integer Programming Problems
      • NP-Hard Resource Allocation Problem
      • Portfolio Optimization
      • Asset Securitization
    • 29. Problem Formulation: (Simplified) LP Format
    • 30. Questions? @josh_wills
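
The three classes listed on slide 26 might look roughly like the sketch below for a knapsack search. This is a guess at the shape, not the library's actual API. The slides give only the `execute(T task, BranchReduceContext<T, GlobalState> ctxt)` signature and the `mergeWith` method, so the context methods `emit`, `updateGlobalState`, and `getGlobalState` are hypothetical, as are all of the class names. `Writable` here is a local stand-in with the same two methods as Hadoop's `org.apache.hadoop.io.Writable`, so the block compiles without Hadoop on the classpath. `LocalContext` is a toy in-memory driver that exists only to exercise the sketch on one machine.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Local stand-in for Hadoop's Writable (same two methods), so this compiles without Hadoop.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// HYPOTHETICAL: the slide names this type but not its methods; these three are assumptions.
interface BranchReduceContext<T, G> {
    void emit(T task);               // hand a new subproblem back to the framework
    void updateGlobalState(G state); // report a newly found solution
    G getGlobalState();              // read the best solution known so far
}

// 1. The Task: one open subproblem, serializable so it can move between machines.
final class KnapsackTask implements Writable {
    int nextItem, weight, value;
    KnapsackTask() {}
    KnapsackTask(int nextItem, int weight, int value) {
        this.nextItem = nextItem; this.weight = weight; this.value = value;
    }
    @Override public void write(DataOutput out) throws IOException {
        out.writeInt(nextItem); out.writeInt(weight); out.writeInt(value);
    }
    @Override public void readFields(DataInput in) throws IOException {
        nextItem = in.readInt(); weight = in.readInt(); value = in.readInt();
    }
}

// 2. The GlobalState: the incumbent, merged across workers by keeping the larger value.
final class BestValue implements Writable {
    int best;
    BestValue() {}
    BestValue(int best) { this.best = best; }
    void mergeWith(BestValue other) { best = Math.max(best, other.best); }
    @Override public void write(DataOutput out) throws IOException { out.writeInt(best); }
    @Override public void readFields(DataInput in) throws IOException { best = in.readInt(); }
}

// 3. The Processor: bounds one task, then prunes it or branches into child tasks.
final class KnapsackProcessor {
    private final int[] values, weights;
    private final int capacity;
    KnapsackProcessor(int[] values, int[] weights, int capacity) {
        this.values = values; this.weights = weights; this.capacity = capacity;
    }
    void execute(KnapsackTask task, BranchReduceContext<KnapsackTask, BestValue> ctxt) {
        ctxt.updateGlobalState(new BestValue(task.value)); // every task is feasible
        if (task.nextItem == values.length) return;
        // Deliberately crude upper bound: pretend every remaining item fits.
        int optimistic = task.value;
        for (int i = task.nextItem; i < values.length; i++) optimistic += values[i];
        if (optimistic <= ctxt.getGlobalState().best) return; // prune
        ctxt.emit(new KnapsackTask(task.nextItem + 1, task.weight, task.value));
        int taken = task.weight + weights[task.nextItem];
        if (taken <= capacity) {
            ctxt.emit(new KnapsackTask(task.nextItem + 1, taken,
                                       task.value + values[task.nextItem]));
        }
    }
}

// Toy single-machine context used only to exercise the sketch (not part of any library).
final class LocalContext implements BranchReduceContext<KnapsackTask, BestValue> {
    final List<KnapsackTask> pending = new ArrayList<>();
    final BestValue state = new BestValue(0);
    @Override public void emit(KnapsackTask task) { pending.add(task); }
    @Override public void updateGlobalState(BestValue s) { state.mergeWith(s); }
    @Override public BestValue getGlobalState() { return state; }
}
```

To run it locally, emit a root task `new KnapsackTask(0, 0, 0)` into a `LocalContext` and call `execute` on pending tasks until none remain; the answer ends up in `state.best`. In a distributed setting, a work-stealing scheme like the one on slide 25 would presumably move pending tasks between workers, which is why both the task and the global state need to be serializable.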
