Isda
  • Simple == just two functions: Map and Reduce. Scalable == automatic parallelization across machines, fault tolerance, speculative execution.
  • Modify this slide to show partitioning function
  • Transcript

    • 1. Scaling Genetic Algorithms using MapReduce
      Abhishek Verma, Xavier Llora,
      David E. Goldberg, Roy H. Campbell
    • 2. Motivation
      Genetic Algorithms (GAs) applied to very large scale data-intensive problems
      Current approach: MPI
      Requires detailed knowledge of h/w architecture
      Complicated to program, debug, checkpoint
      Does not scale on commodity clusters
      MapReduce: simple and scalable abstraction
      Use MapReduce to scale GAs
      Intelligent Systems Design and Applications 2009
    • 3. Outline
      Motivation
      MapReduce
      Genetic Algorithm
      Approach
      Experimental Results
      Conclusion
    • 4. MapReduce Overview
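      [Figure: input records are split among Map tasks; each Map emits key-value pairs such as <k1, v1>, <k2, v2>, <k1, v3>, <k2, v4>, <k1, v5>; the Shuffle routes every pair to a Reduce task chosen by the hash h(k) of its key, so h(k1) and h(k2) decide where each pair goes; each Reduce writes output records.]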
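      The Split, Map, Shuffle, Reduce flow on this slide can be sketched in a few lines. This is a toy single-process model, not how a real MapReduce runtime is implemented: `run_mapreduce`, the word-count functions, and `num_reducers` are hypothetical names chosen for illustration, and Python's built-in `hash` stands in for the partitioning function h(k).

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn, num_reducers=2):
    """Toy, single-process model of the Map -> Shuffle -> Reduce flow."""
    # Map phase: every input record yields zero or more (key, value) pairs.
    intermediate = []
    for record in records:
        intermediate.extend(map_fn(record))

    # Shuffle phase: a pair (k, v) goes to reducer h(k) = hash(k) % num_reducers,
    # so all values that share a key end up on the same reducer.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in intermediate:
        partitions[hash(key) % num_reducers][key].append(value)

    # Reduce phase: each reducer processes its keys with their grouped values.
    output = {}
    for partition in partitions:
        for key, values in partition.items():
            output[key] = reduce_fn(key, values)
    return output


def word_count_map(line):
    # Emit <word, 1> for every word in the line.
    return [(word, 1) for word in line.split()]


def word_count_reduce(word, counts):
    # Sum all counts collected for one word.
    return sum(counts)


counts = run_mapreduce(["a b a", "b a"], word_count_map, word_count_reduce)
```

      Word count is used only because it is the smallest example that exercises all three phases; the grouping-by-key step is what the h(k1) and h(k2) labels in the figure refer to.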
    • 5. Genetic Algorithm
      1. Initialize population with random individuals.
      2. Evaluate fitness value of individuals.
      3. Select good solutions by using tournament selection without replacement.
      4. Create new individuals by recombining the selected population using uniform crossover.
      5. Evaluate the fitness value of all offspring.
      6. Repeat steps 3-5 until some convergence criteria are met.
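      The steps on this slide can be sketched as a sequential loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the OneMax fitness (count of 1 bits), the tournament size of 2, the stopping rule, and all function names and default parameters are illustrative choices that the slide does not specify.

```python
import random


def onemax(bits):
    # Assumed fitness function: number of 1 bits in the individual.
    return sum(bits)


def tournament_without_replacement(population, fitnesses, size, rng):
    """Shuffle the population, split it into groups of `size`, keep each
    group's best; repeat passes until len(population) winners are chosen."""
    winners = []
    indices = list(range(len(population)))
    while len(winners) < len(population):
        rng.shuffle(indices)
        for start in range(0, len(indices) - size + 1, size):
            group = indices[start:start + size]
            best = max(group, key=lambda i: fitnesses[i])
            winners.append(population[best])
    return winners[:len(population)]


def uniform_crossover(a, b, rng):
    """Swap each bit position between the two parents with probability 0.5."""
    child1, child2 = [], []
    for x, y in zip(a, b):
        if rng.random() < 0.5:
            x, y = y, x
        child1.append(x)
        child2.append(y)
    return child1, child2


def run_ga(num_bits=20, pop_size=40, max_generations=200, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population (pop_size is assumed to be even).
    population = [[rng.randint(0, 1) for _ in range(num_bits)]
                  for _ in range(pop_size)]
    # Step 2: evaluate fitness.
    fitnesses = [onemax(ind) for ind in population]
    for _ in range(max_generations):
        # Step 6: assumed convergence criterion, an all-ones individual exists.
        if max(fitnesses) == num_bits:
            break
        # Step 3: tournament selection without replacement.
        parents = tournament_without_replacement(population, fitnesses, 2, rng)
        # Step 4: recombine consecutive pairs of selected parents.
        offspring = []
        for i in range(0, pop_size - 1, 2):
            offspring.extend(uniform_crossover(parents[i], parents[i + 1], rng))
        population = offspring
        # Step 5: evaluate the offspring.
        fitnesses = [onemax(ind) for ind in population]
    return population, fitnesses
```

      As on the slide, there is no mutation step, so the loop is also capped by `max_generations` in case the population converges to a non-optimal string.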
    • 6. Genetic Algorithm
      1. Initialize population with random individuals.
      2. Evaluate fitness value of individuals. (Map)
      3. Repeat steps 4-5 and 2 until some convergence criteria are met.
      4. Select good solutions by using tournament selection without replacement. (Reduce)
      5. Create new individuals by recombining the selected population using uniform crossover. (Reduce)
    • 7. MapReducing Genetic Algorithm
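      [Figure: one GA iteration as a MapReduce job. Individuals stored as bit strings in the Distributed File System (e.g. 00010, 10000, 01001 and 10101, 10000, 00000) are read by the Map tasks, which emit <individual, fitness> pairs (<00010, 1>, <10000, 1>, <01001, 2> and <10101, 3>, <10000, 1>, <00000, 0>). A random partitioner routes the pairs to the Reduce tasks, which produce new individuals (e.g. 10110, 00001, 10001, 01000).]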
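      One iteration of this scheme can be sketched in a single process as follows. This is a simplified illustration, not the authors' Hadoop code: the fitness is assumed to be the count of 1 bits (which matches the <individual, fitness> pairs in the figure), and the function names, the tournament size of 2, the window size, and the way each reducer consumes its window are all hypothetical.

```python
import random


def ga_map(individual):
    # Map: evaluate fitness (assumed: number of '1' characters) and
    # emit the pair <individual, fitness>.
    return individual, individual.count("1")


def random_partition(pairs, num_reducers, rng):
    # Random partitioner: ignore the key and pick a reducer uniformly at
    # random, instead of hashing the key.
    buckets = [[] for _ in range(num_reducers)]
    for pair in pairs:
        buckets[rng.randrange(num_reducers)].append(pair)
    return buckets


def tournament(window, rng):
    # Two shuffled passes of pairwise tournaments give len(window) winners.
    winners = []
    for _ in range(2):
        order = list(window)
        rng.shuffle(order)
        for i in range(0, len(order), 2):
            group = order[i:i + 2]
            winners.append(max(group, key=lambda pair: pair[1])[0])
    return winners[:len(window)]


def uniform_crossover(a, b, rng):
    # Swap each bit position between the parents with probability 0.5.
    child1, child2 = [], []
    for x, y in zip(a, b):
        if rng.random() < 0.5:
            x, y = y, x
        child1.append(x)
        child2.append(y)
    return "".join(child1), "".join(child2)


def ga_reduce(pairs, window_size, rng):
    # Reduce: process the reducer's input one window at a time, selecting
    # within the window and recombining consecutive winners.
    offspring = []
    for start in range(0, len(pairs), window_size):
        winners = tournament(pairs[start:start + window_size], rng)
        for i in range(0, len(winners) - 1, 2):
            offspring.extend(uniform_crossover(winners[i], winners[i + 1], rng))
        if len(winners) % 2:
            # An unpaired winner is copied unchanged.
            offspring.append(winners[-1])
    return offspring


rng = random.Random(1)
population = ["00010", "10000", "01001", "10101", "10000", "00000"]
pairs = [ga_map(individual) for individual in population]
buckets = random_partition(pairs, num_reducers=2, rng=rng)
next_population = [child for bucket in buckets
                   for child in ga_reduce(bucket, window_size=4, rng=rng)]
```

      With a hash partitioner, identical individuals would always meet in the same reducer; picking the reducer at random is what lets each tournament draw from a mixed sample of the population.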
    • 8. MapReducing Genetic Algorithm (2)
      Modifications
      Mappers write to DFS so that clients can evaluate convergence criteria and control next iteration
      Random partitioner function
      Maintain a window of individuals in each reducer
      Optimizations
      Create the initial population in 0th MapReduce
      Compactly represent bits in array of long ints
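      The compact representation mentioned under Optimizations can be illustrated as follows. This is a sketch assuming 64-bit words (the width of a Java `long`); Python integers are unbounded, so the word size is enforced only by convention here, and `pack_bits`, `get_bit`, and `count_ones` are hypothetical helper names.

```python
WORD_BITS = 64  # assumed word width, matching a 64-bit long


def pack_bits(bits):
    """Pack a sequence of 0/1 values into a list of 64-bit words."""
    words = [0] * ((len(bits) + WORD_BITS - 1) // WORD_BITS)
    for i, bit in enumerate(bits):
        if bit:
            # Bit i lives in word i // 64 at offset i % 64.
            words[i // WORD_BITS] |= 1 << (i % WORD_BITS)
    return words


def get_bit(words, i):
    """Read bit i back out of the packed representation."""
    return (words[i // WORD_BITS] >> (i % WORD_BITS)) & 1


def count_ones(words):
    """Count set bits word by word (e.g. to evaluate a OneMax-style fitness)."""
    return sum(bin(word).count("1") for word in words)


bits = [1, 0, 1] * 30          # a 90-bit individual
packed = pack_bits(bits)       # stored in two 64-bit words
```

      Packing stores 64 variables per word instead of one value per variable, which matters when individuals have millions of bits.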
    • 9. Experimental Results
      Experimental setup
      52 nodes: 16GB RAM, 2TB hard drives
      Each node runs 5 mappers + 3 reducers
      Population size set to n log(n), where n is the number of variables
    • 10. Scaling GAs to 100 million variables
    • 11. Conclusion
      Modeled GAs in MapReduce
      Scales on commodity clusters to 100 million variables
      Can also use Pthreads (Phoenix), GPUs (Mars), …
      Future Work
      Demonstrate scalability for practical applications
      MapReduce Compact GAs and Extended Compact GAs
      Comparison with MPI implementation
    • 12. Questions?
    • 13. Thank You
