Introduction to HaLoop
Published in: Technology, Travel

Transcript

  • 1. Introduction To HaLoop
    xiafei.qiu@PCA
  • 2. How Hadoop works
  • 3. PageRank in Hadoop
  • 4.
  • 5. PageRank in Hadoop
  • 6. Differences
  • 7.
  • 8. Loop-aware Scheduling
    Place map and reduce tasks that occur in different iterations but access the same data on the same physical machines.
  • 9. Scheduling Algorithm
    The number of reduce tasks must stay invariant across iterations, so that the hash function assigning mapper outputs to reducer nodes remains unchanged.
    The master node maintains a mapping from each slave node to the data partitions that the node processed in the previous iteration.
  • 10. Caches
    Reducer Input Cache
      The same key must hash to the same reducer in every iteration.
      The partition function f must be deterministic, stay the same across iterations, and take the tuple t as its only input.
      The number of reducers must remain unchanged.
    Reducer Output Cache
      If two Reduce function calls produce the same output key from two different reducer input keys, both reducer input keys must be in the same partition so that they are sent to the same reduce task.
    Mapper Input Cache
  • 11. Inspirations
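
The scheduling rules on slides 8 and 9 can be illustrated with a small simulation. This is only a sketch under stated assumptions, not HaLoop's actual scheduler: the names `partition`, `LoopAwareMaster`, `NUM_REDUCERS`, and the node names are hypothetical, CRC32 stands in for whatever hash function the job uses, and load balancing and node failures are ignored. It shows a partition function that depends only on the key, plus a master that remembers which node processed each partition in the previous iteration and reuses that placement.

```python
import zlib
from collections import defaultdict

NUM_REDUCERS = 3  # assumption: fixed for the whole job, never changed between iterations


def partition(key: str, num_reducers: int = NUM_REDUCERS) -> int:
    """Deterministic partition function that depends only on the key."""
    # zlib.crc32 is stable across runs, unlike Python's salted built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


class LoopAwareMaster:
    """Toy master that remembers which node handled each partition (hypothetical)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        # node -> set of partitions that node processed in the previous iteration
        self.previous = defaultdict(set)

    def assign(self, partitions):
        """Return {partition: node}, preferring the node used in the last iteration."""
        owner = {p: n for n, parts in self.previous.items() for p in parts}
        assignment = {}
        for i, p in enumerate(sorted(partitions)):
            # Reuse the previous node if one is recorded; otherwise fall back to round-robin.
            assignment[p] = owner.get(p, self.nodes[i % len(self.nodes)])
        # Record this iteration's placement so the next iteration can reuse it.
        self.previous = defaultdict(set)
        for p, n in assignment.items():
            self.previous[n].add(p)
        return assignment


master = LoopAwareMaster(["node-a", "node-b", "node-c"])
first = master.assign(range(NUM_REDUCERS))   # iteration 1: round-robin placement
second = master.assign(range(NUM_REDUCERS))  # iteration 2: same nodes as iteration 1
```

Because the reducer count is fixed and `partition` takes nothing but the key, a given key lands in the same partition in every iteration, and the master sends that partition back to the node that may already hold its cached data.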
