The Zoo Expands: Labrador *Loves* Elephant, Thanks to Hamster


The refactoring of the Hadoop MapReduce framework to separate resource management (YARN) from job execution (MapReduce) has allowed multiple programming paradigms to take advantage of massive-scale Hadoop Distributed File System (HDFS) clusters. Hamster (Hadoop And Mpi on the same cluSTER) is a port of OpenMPI that uses YARN as its resource manager. Hamster allows applications written using MPI (Message Passing Interface) to run alongside other YARN applications and frameworks, such as MapReduce, on the same Hadoop cluster. In this talk, I will describe the architecture of Hamster and present a few MPI applications that have been demonstrated to run in Hadoop. GraphLab uses MPI as one of its supported communication libraries and can read/write data from/to HDFS. I will describe how GraphLab runs on top of Hadoop using Hamster, and present a few graph-analytics benchmarks comparing GraphLab with other machine learning frameworks.


  1. The Zoo Expands: Labrador 💛 Elephant, Thanks to Hamster. Milind Bhandarkar, Chief Scientist, Pivotal Software, Inc.
  2. About Me • Founding member of Hadoop team at Yahoo! [2005-2010] • Contributor to Apache Hadoop since v0.1 • Built and led Grid Solutions Team at Yahoo! [2007-2010] • Parallel Programming Paradigms [1989-today] (PhD) • Center for Development of Advanced Computing (C-DAC), National Center for Supercomputing Applications (NCSA), Center for Simulation of Advanced Rockets, Siebel Systems (acquired by Oracle), Pathscale Inc. (acquired by QLogic), Yahoo!, LinkedIn, and Pivotal (formerly Greenplum)
  3. Hamster • Hadoop and MPI on the same cluster • Runtime for OpenMPI applications on YARN • Available on Pivotal HD
  4. Why MPI? • Hadoop dataflow paradigms (MapReduce, Tez, etc.) not suitable for iterative applications • Message Passing Interface (MPI) • Mature standard • Used extensively in HPC • Huge ecosystem
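The dataflow-vs-MPI argument on this slide can be sketched with a toy cost model. Every constant below is invented purely for illustration: the point is only that a dataflow engine pays job setup and HDFS materialization on every iteration, while an MPI job pays its startup once and then exchanges messages in memory.

```python
# Illustrative cost model (all numbers are made-up assumptions, in seconds).

def dataflow_runtime(iters, setup=20.0, io=15.0, compute=5.0):
    """Each iteration is a separate job: setup + disk I/O + compute."""
    return iters * (setup + io + compute)

def mpi_runtime(iters, startup=60.0, msg=0.5, compute=5.0):
    """One startup, then cheap in-memory message exchanges per iteration."""
    return startup + iters * (msg + compute)

for iters in (1, 10, 100):
    print(iters, dataflow_runtime(iters), mpi_runtime(iters))
```

Under this (made-up) model, dataflow wins for a single pass but loses badly once an algorithm iterates many times, which is the slide's point about iterative applications.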
  5. MPI in Science & Engineering: Earth, Atmosphere, Chemistry, Biology, Math, Nuclear
  6. MPI in Industry: Mechanical, Finance/Banking, Oil Exploration, Cryptography, Spacecraft
  7. OpenMPI • Mature open-source implementation of the MPI 3.0 standard • New BSD license • 30+ contributing organizations from academia, research, and industry
  8. OpenMPI Architecture
  9. Pluggable
  10. Hamster Design • YARN as Resource Manager • Hamster Application Master • Manages MPI jobs • (tries to) Implement gang scheduling • Leverages OMPI/ORTE strengths • Wire-up, task monitoring, fast interconnect
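The gang-scheduling goal above can be sketched as follows. All names are invented, and this ignores YARN's actual allocate/heartbeat API: the idea is simply that the AM accumulates container grants across allocation rounds and refuses to launch any MPI rank until the whole gang is available, since MPI processes must all start to make progress.

```python
# Hypothetical sketch of a gang-scheduling condition (names invented).

def gang_ready(requested, allocated):
    """Launch is permitted only when the full gang is available."""
    return len(allocated) >= requested

def schedule(requested, allocation_rounds):
    """Accumulate container grants round by round; launch when complete."""
    allocated = []
    for round_no, grants in enumerate(allocation_rounds, start=1):
        allocated.extend(grants)
        if gang_ready(requested, allocated):
            return round_no, allocated
    return None, allocated  # gang never completed

rounds = [["c1", "c2"], ["c3"], ["c4", "c5"]]
print(schedule(4, rounds))  # gang of 4 completes on round 3
```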
  11. Hamster Architecture [diagram: a Client submits to the Resource Manager (Scheduler + AM Service) over the Client-RM protocol; the MPI AM (MPI Scheduler + HNP) negotiates containers over the RM-AM protocol and launches tasks over AM-NM; each Node Manager runs a framework daemon (Node Service) as an auxiliary service and hosts MPI processes in containers, heartbeating to the RM over the RM-NodeManager protocol]
  12. Hamster AppMaster • Master daemon for MPI (similar to JobTracker in MapReduce) • Implements and participates in the YARN-RM app lifecycle protocol • Maintains heartbeat with RM to ensure liveness • MPI Scheduler: negotiates resource allocation with YARN-RM • Head Node Process (HNP): manages job execution
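A minimal sketch of the AM-RM liveness heartbeat mentioned above, with invented class and method names (in real YARN the allocation requests are multiplexed onto the same periodic heartbeat): the RM declares the AM dead if no beat arrives within a liveness timeout.

```python
import time

# Toy liveness tracking (hypothetical names, not the YARN API).
class ResourceManager:
    def __init__(self, timeout):
        self.timeout = timeout
        self.last_beat = time.monotonic()

    def heartbeat(self):
        """Called periodically by the AppMaster to prove liveness."""
        self.last_beat = time.monotonic()

    def is_live(self):
        """RM-side check: has a beat arrived within the timeout?"""
        return time.monotonic() - self.last_beat < self.timeout

rm = ResourceManager(timeout=0.2)
rm.heartbeat()
print(rm.is_live())   # True right after a beat
time.sleep(0.3)
print(rm.is_live())   # False once the timeout lapses
```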
  13. Hamster Node Service • User-level daemon per MPI job • Manages task execution • Coarse-grained container management • Bootstrapped by YARN-NM • Implemented as a YARN Auxiliary Service
  14. Why GraphLab on Hadoop? • Graph analytics & machine learning are only one stage in an E2E data pipeline • ETL/preprocessing • Building graphs from fact & dimension tables • Publishing analytics results, post-processing
  15. GraphLab 2.2 • Communication patterns based on data • Several toolkits (graph analytics + ML algorithms) available • Graph-programming API • Uses MPI for communication
  16. Pivotal HD [stack diagram: Apache components (HDFS, MapReduce, YARN, Zookeeper, HBase, Pig, Hive, Mahout, Sqoop, Flume, Oozie) plus Pivotal additions: Pivotal Command Center (configure, deploy, monitor, manage), Spring XD and the Spring Xtension Framework, HAWQ advanced database services (ANSI SQL + analytics, query optimizer, dynamic pipelining, catalog services), GemFire XD real-time database services (ANSI SQL + in-memory, distributed in-memory store), MADlib algorithms, and GraphLab / Open MPI]
  17. Performance
  18. Test Environment • Pivotal Analytics Workbench cluster • Pivotal HD 1.1 (Apache Hadoop 2.0.5) • Hamster 1.0, OpenMPI 1.7.2 • 515 nodes • Per node: 2x 6-core Westmere, 48GB RAM, 12x 2TB SATA, Mellanox FDR InfiniBand
  19. Null Job • Measures overhead of launching MPI jobs • Tests scalability of resource allocation, launching, and wire-up • Sub-linear scalability (slightly worse than O(log N)) • Overhead of launching 15,000 processes = 1 minute
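One plausible source of the roughly O(log N) launch scaling above is tree-based spawning: every already-running daemon helps launch the next wave, so the number of launch rounds grows logarithmically in the process count. A toy model follows; the fanout of 32 is an arbitrary assumption, not Hamster's actual value.

```python
# Toy model of tree-based process launch (fanout is an assumption).

def launch_rounds(nprocs, fanout=32):
    """Rounds needed when every live process spawns `fanout` more per round."""
    live, rounds = 1, 0
    while live < nprocs:
        live += live * fanout
        rounds += 1
    return rounds

for n in (1000, 15000):
    print(n, launch_rounds(n))
```

With fanout 1 this degenerates to binary doubling (14 rounds for 15,000 processes); higher fanouts keep the round count nearly flat as N grows, consistent with sub-linear launch overhead.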
  20. [chart] Total runtime: E2E time (sec., 5-60) vs. number of processes (0-16,000)
  21. [chart] Allocation time (sec., 1-6) vs. number of processes (0-16,000)
  22. [chart] Launch time (sec., 0-30) vs. number of processes (0-16,000)
  23. Comparison with OpenMPI • HPL (High-Performance Linpack, the Top 500 benchmark) • Number of processes: 50 to 1000 • Hamster ~1% slower than OpenMPI
  24. [chart] HPL, Hamster vs. OpenMPI: time (sec., 0-120) at 50, 200, 500, and 1000 processes
  25. GraphLab ALS • Wikipedia dataset • 4.3M terms, 3.3M documents, 513M occurrences • 17 processes • 5 iterations
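The ALS benchmark above alternates closed-form least-squares updates between the two factor sides. A rank-1, pure-Python sketch on an invented 2x2 ratings matrix follows; GraphLab's real toolkit is distributed, higher-rank, and regularized, so this only shows the alternating-update structure.

```python
# Tiny rank-1 ALS sketch on invented toy data: R is approximated by
# the outer product u v^T, alternating exact least-squares updates.

R = [[5.0, 3.0], [10.0, 6.0]]  # 2 users x 2 items, exactly rank 1

def als_step(R, u, v):
    """Solve for u with v fixed, then for v with the new u fixed."""
    u = [sum(R[i][j] * v[j] for j in range(len(v))) /
         sum(x * x for x in v) for i in range(len(R))]
    v = [sum(R[i][j] * u[i] for i in range(len(u))) /
         sum(x * x for x in u) for j in range(len(R[0]))]
    return u, v

def error(R, u, v):
    """Squared reconstruction error of the rank-1 fit."""
    return sum((R[i][j] - u[i] * v[j]) ** 2
               for i in range(len(u)) for j in range(len(v)))

u, v = [1.0, 1.0], [1.0, 1.0]
for _ in range(5):
    u, v = als_step(R, u, v)
print(error(R, u, v))  # exact rank-1 data, so the fit error reaches 0.0
```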
  26. [chart] GraphLab ALS runtime (sec., 0-1340): Hamster vs. OpenMPI
  27. GraphLab PageRank • Twitter dataset • 4.1M nodes, 1.4B edges • Data size: 26GB • NP = 17 • 50 iterations: 297 seconds • 100 iterations: 339 seconds
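The PageRank benchmark above is the classic power iteration. A toy, single-machine sketch on an invented 4-node graph follows; GraphLab distributes the same per-vertex update across MPI processes instead of looping over a dict.

```python
# Toy power-iteration PageRank on an invented 4-node graph.

def pagerank(links, d=0.85, iters=50):
    """links maps each node to its outgoing neighbor list."""
    n = len(links)
    pr = {u: 1.0 / n for u in links}
    for _ in range(iters):
        nxt = {u: (1 - d) / n for u in links}
        for u, outs in links.items():
            if not outs:  # dangling node: spread its mass evenly
                for v in links:
                    nxt[v] += d * pr[u] / n
            else:
                for v in outs:
                    nxt[v] += d * pr[u] / len(outs)
        pr = nxt
    return pr

links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
pr = pagerank(links)
print({k: round(v, 3) for k, v in sorted(pr.items())})
```

Node "c" ends up ranked highest since three of the four nodes link to it; the total rank mass stays 1 across iterations, a useful sanity check at any scale.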
  28. Questions?