Sparse Data Structures for Weighted Bipartite Matching

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    1 Group

    Sparse Data Structures for Weighted Bipartite Matching - Presentation Transcript

    1. Sparse Data Structures for Weighted Bipartite Matching E. Jason Riedy Dr. James Demmel (. . . and thanks to the BeBOP group) SIAM Workshop on Combinatorial Scientific Computing 2004
    2. Use Sparse Matrix Optimizations... Take a fixed, simple algorithm: Auction alg. for matchings Repeated iterations over a sparse graph. What’s expensive, and is there anything we can do about it? Take an idea from optimizing sparse matrix-vector products. A little speed-up in some cases, but there are more ideas available...
    3. Where’s the Time Going? Auction algorithm: Iterative, greedy algorithm bipartite matching: 1. An unmatched row i finds a “most profitable” column j π(i) = maxj b(i, j) − p(i) 2. Row i places a bid for column j. Bid price raised until j is no longer the best choice. (Min. increment µ) 3. Highest bid gets the matching (i, j).
    4. Time linear in entries examined... Number of entries examined is problem-dependent. 4.5 4 3.5 3 time (s) 2.5 2 1.5 1 0.5 0 0 5e+06 1e+07 1.5e+07 2e+07 2.5e+07 3e+07 number of entries examined
    5. Expensive Inner Loop! 1.3 GHz Itanium 2 10 8 6 4 2 0 100 150 200 250 300 350 400 450 cycles per entry
    6. Verifying... Using kcachegrind (N.Nethercote and J.Weidendorfer) and valgrind (J. Seward).
    7. And Locating... No obvious culprits in the instructions...
    8. And Locating... But considering cache effects!
    9. Auction’s Inner Loop Price Index Entry value = entry - price save largest...
    10. Auction’s Inner Loop Same accesses as sparse matrix-vector multiplication! Price Index Entry y += a(i,j) * x(j)
    11. Performance Through Blocking? (Images swiped from Berkeley’s BeBOP group.)
    12. Performance Through Blocking? (Images swiped from Berkeley’s BeBOP group.)
    13. Performance Through Blocking? More entries, but 1.5× performance on Pentium 3! (Images swiped from Berkeley’s BeBOP group.)
    14. Blocking Speeds Some Matches Finite element matrix from Vavasis (in UF collection):
    15. Blocking Speeds Some Matches
    16. Blocking Speeds Some Matches
    17. Observations A blocked graph data structure may provide additional performance if: you iterate over whole rows, the graph / matrix has runs of columns, and you’re willing to use an automated tuning system. Maximizing the runs: linear arrangement. Hard, but there may be cheap heuristics. Only worth-while if you’re performing many iterations. (For mat-vec, often > 50 computations of Ax.)

    + Jason RiedyJason Riedy, 2 years ago

    custom

    395 views, 0 favs, 0 embeds more stats

    A stab at optimizing the inner loop for auctiion-ba more

    More info about this document

    © All Rights Reserved

    Go to text version

    • Total Views 395
      • 395 on SlideShare
      • 0 from embeds
    • Comments 0
    • Favorites 0
    • Downloads 5
    Most viewed embeds

    more

    All embeds

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories