Your SlideShare is downloading. ×
0
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Introduction of Apache Hama - 2011
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Introduction of Apache Hama - 2011

22,116

Published on

for the lecture.

for the lecture.

Published in: Technology
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
22,116
On Slideshare
0
From Embeds
0
Number of Embeds
7
Actions
Shares
0
Downloads
86
Comments
0
Likes
3
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Introduction of Apache Hama Edward J. Yoon, October 11, 2011 <edwardyoon@apache.org>
  • 2. About Me Founder of Apache Hama. Committer of Apache Bigtop. Employee for KT. http://twitter.com/eddieyoon
  • 3. What Is Hama? Apache Incubator Project. BSP (Bulk Synchronous Parallel) for massive scientific computations. Written In Java. Currently 2 releases, 3 main committers.
  • 4. Hama Characteristics Provides a Pure BSP model .  Job submission and management interface.  Multiple tasks per node.  Checkpoint recovery. Supports to run in the Clouds using Apache Whirr. Supports to run with Hadoop nextGen.
  • 5. Bulk Synchronous Parallel? Parallel programming model introduced by Valiant. Consist of a sequence of supersteps. Conceptually simple and intuitive from a programming standpoint. Used for a variety of applications e.g., scientific computing, genetic programming, …
  • 6. Schematic diagram of a superstepLocal Computation ………. Idle Communication ………. Idle Barrier Synchronization
  • 7. Internals Proactive Task Scheduling. Hadoop RPC is used for BSP tasks to communicate each other. Collection and bundling of messages as a technique to reduce network overheads and contentions. Zookeeper is used for Barrier Synchronization.
  • 8. Pi Calculation Each task executes locally its portion of the loop a number of times. One task acts as master and collects the results through the BSP communication interface.
  • 9. Structural Analysis of Network Traffic Flows Traffic flows in KT clouds.  traffic engineering, anomaly detection, traffic forecasting and capacity planning Currently BSP jobs are experimentally running on 512 multi-cores machines.
  • 10. Random Communication Benchmarks Benchmarked on 16 1U servers using 10 tasks per server. X axis is the time (sec.)of BSP job execution (32 supersteps). Y axis is the number of messages to be sent to random BSP tasks in each superstep.
  • 11. What’s Next? Support Input/Output Formatter like MapReduce. Message Compression for High Performance. Add some frameworks on top of Hama.
  • 12. More Information http://incubator.apache.org/hama http://wiki.apache.org/hama

×