Transcript of "Hadoop on a personal supercomputer"
Hadoop on a Personal Supercomputer
Paul Dingman – Chief Technologist, Integration Division
pdingman@pervasive.com
PERVASIVE DATA INNOVATION
Pervasive and Hadoop
• Pervasive Software develops software products to manage, integrate and analyze data.
• Innovation Lab projects around big data include:
  – Hadoop
    • Accelerate MapReduce (DataRush Community Edition)
    • High-speed add-ons for HBase, Avro, Hive (TurboRush)
    • Augment Sqoop
    • Enhance ETL capabilities
  – Benchmarks
    • Terasort
    • TPC-H
    • SIEM/Log Analytics EPS
    • Genomics
Why are many-core systems interesting?
• Many-core processors make it possible to concentrate large amounts of processing power in a single machine. Coupled with newer storage technologies, these systems can have high-speed access to tremendous amounts of storage.
• We have done a lot of work with multi-core systems at Pervasive Software. Our Pervasive DataRush™ Dataflow Engine takes advantage of all available processor cores to efficiently process large volumes of data:
  – Analytics
  – Data mining
  – Genomics
• Potential cost and energy savings due to the need for fewer nodes.
• Potential performance gains by eliminating inter-node data exchange.
Pervasive DataRush™ Speed and Scalability
• World-record performance set running the Smith-Waterman algorithm
• Code written on an 8-core machine scaled to 384 cores with no changes!
Malstone-B10* Scalability — run time for 10B rows

  Core count   Run time (minutes)
  2 cores      370.0
  4 cores      192.4  (3.2 hours)
  8 cores      90.3   (1.5 hours)
  16 cores     51.6   (under 1 hour)
  32 cores     31.5

* Cyber security benchmark from the Open Cloud Consortium
How well does Hadoop work on many-core systems?
• One of the areas we wanted to explore with Hadoop was how well it works on systems with lots of cores. In other words, is it possible to run Hadoop in an environment where you could exploit the cores for complex operations, but still have the benefits of the distributed environment provided by Hadoop and HDFS?
Master Node (NameNode/JobTracker) — Commodity Box
• 2 Intel Xeon L5310 CPUs @ 1.6 GHz (8 cores)
• 16 GB DRAM (ECC)
• 8 SATA hard disks, 500 GB each (4 TB total, 8 local spindles)
• Mellanox ConnectX-2 VPI dual-port InfiniBand adapter
Slave Nodes (DataNode/TaskTracker)
• 4 AMD Opteron 6172 CPUs (48 cores)
• Supermicro motherboard
• 1 LSI 8-port HBA (6 Gbps)
• 256 GB DRAM (ECC)
• 2 SATA SSDs (512 GB)
• 32 SATA hard disks, 2 TB each (64 TB total): 24 spindles for HDFS (JBOD), 8 spindles local
• Mellanox ConnectX-2 VPI dual-port InfiniBand adapter
Hadoop Tuning
• We worked from the bottom up:
  – Linux (various kernels and kernel settings)
  – File systems (EXT2, EXT3, EXT4)
  – Drivers (HBA)
  – JVMs
• Initial tests were done using a single “fat” node (same configuration as the worker nodes). This made it easier to test different disk configurations.
• For Hadoop tests we primarily used 100 GB Terasort jobs. This test exercises all phases of the MapReduce process while not being too large to run frequently.
Lessons Learned with Single-Node Tuning
• We found we could comfortably run 40 map tasks and 20 reducers given memory and CPU constraints.
• Use a large block size for HDFS.
  – Execution time for map tasks was around 1 minute using a 512 MB block size.
• More spindles is better.
  – A 1:1 ratio of map tasks to local HDFS spindles works well.
  – EXT2 seems to work well with JBOD.
• Use dedicated spindles for temporary files on each worker node.
• Configure JVM settings for a larger heap size to avoid spills.
  – Parallel GC seemed to help as well.
• Compression of map outputs is a huge win (LZO).
• HBase scales well on fat nodes with DataRush (>5M rows/sec bulk load; >10M rows/sec sequential scan).
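The tuning points above map onto a handful of standard Hadoop 0.20/1.x configuration properties. A minimal sketch — the property names are the stock Hadoop ones, but the exact values (and the LZO codec from the hadoop-lzo add-on) are assumptions inferred from the slide, not a published configuration from this deck:

```xml
<!-- hdfs-site.xml: large HDFS block size (512 MB, per the slide) -->
<property>
  <name>dfs.block.size</name>
  <value>536870912</value>
</property>

<!-- mapred-site.xml: slot counts, map-output compression, larger task heaps -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>40</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>20</value>
</property>
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2g -XX:+UseParallelGC</value> <!-- heap size is an assumed example -->
</property>
```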
Varying Spindles for HDFS
[Chart: Terasort average execution time in seconds vs. number of 2 TB HDFS disks (8, 16, 24, 32, 40, 48)]
Varying Spindles for Intermediate Outputs
[Chart: Terasort average execution time in seconds vs. drives used for intermediate map output: 4 × 2 TB, 8 × 2 TB, 16 × 2 TB, Fusion-io drive, flash RAID 0 (4 × 2 TB)]
Clustering the Nodes
• We had a total of 64 hard disks for the cluster and had to split them between the two nodes.
• Installed and configured OpenFabrics OFED to enable IP over InfiniBand (IPoIB).
• Reconfigured Hadoop to cluster the nodes.
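Reconfiguring Hadoop to run over the IPoIB fabric essentially means pointing the HDFS and JobTracker endpoints at the master's InfiniBand-side address (and listing slave hostnames that resolve over IPoIB in conf/slaves). A sketch using standard Hadoop 1.x property names — the address and ports are hypothetical, as the slides don't give them:

```xml
<!-- core-site.xml: HDFS endpoint on the master's IPoIB address (hypothetical) -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.10.1:9000</value>
</property>

<!-- mapred-site.xml: JobTracker on the same fabric -->
<property>
  <name>mapred.job.tracker</name>
  <value>192.168.10.1:9001</value>
</property>
```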
Comparisons with Amazon Clusters
• The Amazon clusters were used to get a better idea of what to expect using more conventionally sized Hadoop nodes (non-EMR).
• We used “Cluster Compute Quadruple Extra Large” instances:
  – 23 GB of memory
  – 33.5 EC2 Compute Units (dual Intel Xeon X5570 quad-core “Nehalem” processors; 8 cores total)
  – 1690 GB of instance storage (2 spindles)
  – Very high I/O performance (10 GbE)
• Used a similar Hadoop configuration, but dialed back the number of maps and reducers due to the lower core count.
• Used cluster sizes that were roughly core-count equivalent for comparison.
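The core-count equivalence works out to a small piece of arithmetic. This is my own back-of-the-envelope sketch of the sizing logic, not a figure from the slides:

```python
# How many 8-core EC2 "Cluster Compute Quadruple Extra Large" instances
# roughly match the fat nodes' total core count.
FAT_NODE_CORES = 48       # 4 x AMD Opteron 6172 per slave node (slide spec)
EC2_INSTANCE_CORES = 8    # dual quad-core Xeon X5570 per instance
fat_nodes = 2             # the two-node cluster described earlier

equivalent_instances = fat_nodes * FAT_NODE_CORES // EC2_INSTANCE_CORES
print(equivalent_instances)  # -> 12
```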
Conclusions
• From what we have seen, Hadoop works very well on many-core systems. In fact, Hadoop runs quite well even on a single-node many-core system.
• Using denser nodes may make failures more expensive for some system components. When using disk arrays, the handling of hard disk failures should be comparable to smaller nodes.
• The MapReduce framework treats all intermediate outputs as remote resources, so the copy phase of MapReduce doesn’t benefit from locality of data.
Questions?
Follow up / more information:
• Visit our booth
• Pervasive DataRush for Hadoop: www.pervasivedatarush.com/Technology/PervasiveDataRushforHadoop.aspx
• Presentation content: email@example.com
PERVASIVE DATA INNOVATION