Pivotal: Operationalizing 1000 Node Hadoop Cluster - Analytics Workbench

Pivotal has set up and operationalized a 1000-node Hadoop cluster called the Analytics Workbench. Managing a deployment this large takes special setup and skills. This session shares how we set it up and how we manage it.

Published in: Technology, Business


  1. Operationalizing 1000 Node Hadoop Cluster - Analytics Workbench. Clinton Ooi, Bhavin Modi. © Copyright 2013 EMC Corporation. All rights reserved.
  2. Agenda
     • Introduction
     • Tools
       - Kickstart
       - Parallel SSH
       - Puppet
     • Q & A
  3. Meet AWB: Introduction to the Analytics Workbench
  4. Vision Statement: provide a collaborative platform that is:
     • AGILE: Support platform for proving mixed-mode enterprise readiness at scale.
     • INNOVATIVE: Showcase groundbreaking data science.
     • ACCESSIBLE: Create a shared environment for rapid innovation of big data and cloud computing technologies.
     • EDUCATIONAL: Provide a resource for educating developers, partners, and customers on big data and cloud technologies.
  5. Partners
     • Intel - contributed 2,000 hex-core CPUs
     • Mellanox - contributed 72 switches, 1000+ network cards, 1400+ cables
     • Micron - contributed 6,000 memory modules
     • Seagate - contributed 12,000 2TB drives
     • Supermicro - contributed 1,000+ servers
     • Switch - contributed the hosting facility in its state-of-the-art data center
     • VMware - provided operational support
  6. Quick facts
     • Largest Hadoop cluster of its kind
     • Operational since July 2012
     • Single multi-tenant cluster
     • Physical cluster (no virtualization)
     • 25 projects - 12 active, 8 in pipeline
  7. Use-case
     • Pivotal Demonstration
     • Partner Engagements
     • Industry and Academia Collaboration
  8. Tools: Scalable Tool Chain & Standardization
  9. AWB Cluster Lifecycle
  10. AWB Cluster Lifecycle
  11. Kickstart
      • Generic tool to automate OS installation
      • Requires DHCP, TFTP and HTTP services
      • TFTP serves the PXELINUX configuration file (named with the client's IP address in hex), the Linux kernel (vmlinuz) and the in-memory file system (initrd)
      • HTTP serves the kickstart configuration (kickstart.cfg)
  12. Kickstart (continued)
      • Example of PXELINUX file - /tftpboot/pxelinux.cfg/AC1C0401
          default install
          label install
          kernel centos/6.2/vmlinuz
          append initrd=centos/6.2/initrd.img ramdisk_size=9025 text console=ttyS2,115200,n,1 sshd=1 install=http://10.1.25.51/centos/6.2/os/x86_64 ks=http://10.1.25.51/centos/6.2/kickstart/conf/kickstart.cfg
          implicit 1
          display message
          prompt 1
          timeout 10
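The file name AC1C0401 in the example above follows the PXELINUX convention of looking up a per-host configuration file named after the client's IPv4 address, written as eight uppercase hexadecimal digits. A minimal Python sketch of that mapping is shown here. The function names are hypothetical, and the address 172.28.4.1 is simply what AC1C0401 decodes to, not a value stated in the slides.

```python
import ipaddress


def pxelinux_config_name(ip: str) -> str:
    """Return the hex file name PXELINUX looks up for a client IPv4 address."""
    # Convert the dotted-quad address to a 32-bit integer, then format it
    # as eight uppercase hex digits (two per octet).
    return f"{int(ipaddress.IPv4Address(ip)):08X}"


def pxelinux_config_ip(name: str) -> str:
    """Inverse mapping: decode a hex file name back to a dotted-quad address."""
    return str(ipaddress.IPv4Address(int(name, 16)))


print(pxelinux_config_name("172.28.4.1"))  # AC1C0401
print(pxelinux_config_ip("AC1C0401"))      # 172.28.4.1
```

A generator script such as the one on the next slide presumably performs a similar conversion when it writes files under /tftpboot/pxelinux.cfg/, though the slides do not show how.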
  13. Kickstart (continued)
      • Example of kickstart config
          …
          url --url http://10.1.25.51/centos/6.2/os/x86_64
          ...
          %packages
          @core
          @performance
          …
          %post --log=/root/kickstart-post.log
          wget -O /root/post-install.tgz http://10.1.25.51/centos/6.2/post-install.tgz
          …
  14. Kickstart (continued)
      • Generate PXELINUX and kickstart files
          [cooi@ks ~]$ ./kickstart --generate --os centos --osver 6.2 --restart pxe node0945
          Generating /tftpboot/pxelinux.cfg/AC1C0401
          Setting bootdev on node0945.sp
          Set Boot Device to pxe
          Restarting node0945.sp
          Chassis Power Control: Cycle
          [cooi@ks ~]$ for i in `seq -w 1 200`; do ./kickstart --generate --os centos --osver 6.2 --restart pxe node0$i; done
          … Skipping
  15. Kickstart (continued)
      • Enable switching or upgrading OS easily
      • Kickstart 60 nodes in ~45 minutes:
        - 1 kickstart server with software RAID5
        - 100Mbps TOR and aggregator switches
        - Saturated the 100Mbps network
      • Kickstart 200 nodes in ~45 minutes:
        - 2 kickstart servers with software RAID5
        - 100Mbps TOR switches and 1Gbps aggregator switches
      • Estimate to do >1000 nodes with full 1Gbps network
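The first measurement above gives a rough upper bound on how much data each node can pull during an install. The sketch below works that out under loudly stated assumptions that are not in the slides: all 60 nodes share a single 100 Mbps path, that path is saturated for the full 45 minutes, 1 Mbps means 10^6 bits per second, and protocol overhead is ignored. The function name is hypothetical.

```python
def max_megabytes_per_node(link_mbps: float, minutes: float, nodes: int) -> float:
    """Upper bound on data delivered to each node over one shared, saturated link."""
    total_bits = link_mbps * 1_000_000 * minutes * 60  # bits moved in the window
    total_megabytes = total_bits / 8 / 1_000_000       # bits -> bytes -> MB
    return total_megabytes / nodes


# 60 nodes sharing one saturated 100 Mbps path for ~45 minutes
print(max_megabytes_per_node(100, 45, 60))  # 562.5
```

Under those assumptions each node receives at most about 560 MB. The 200-node case has a different bottleneck (two servers, 1 Gbps aggregation), so the same formula is not applied to it here.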
  16. Parallel SSH
      • Sys admin's lightsaber
  17. Parallel SSH (continued)
      • Start/Stop Hadoop services
      • Orchestrate cluster deployments
      • Perform manual cluster administration tasks
      • Pick one that is user-friendly and scalable, e.g.
        - Massh: http://m.a.tt/er/massh/
        - ClusterShell: https://github.com/cea-hpc/clustershell
        - Parallel Distributed Shell (pdsh): https://code.google.com/p/pdsh
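To make the idea behind these tools concrete, here is a minimal Python sketch of fanning one command out to many hosts in parallel. It is not how Massh, ClusterShell, or pdsh are implemented. The function names, the worker count, and the host names are all hypothetical, and the demo uses a dry-run runner so that it executes without a cluster.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, command, timeout=60):
    """Run one command on one host via the system ssh client."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.returncode, result.stdout


def dry_run(host, command):
    """Stand-in runner that only reports what would be executed."""
    return 0, f"{host}: would run {command!r}"


def run_on_hosts(hosts, command, runner=ssh_run, max_workers=32):
    """Fan a command out to many hosts; return {host: (exit_code, stdout)}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(runner, host, command) for host in hosts}
        return {host: future.result() for host, future in futures.items()}


# Hypothetical host names in the node0001 style used in the slides.
hosts = [f"node{n:04d}" for n in range(1, 4)]
for host, (code, out) in run_on_hosts(hosts, "uptime", runner=dry_run).items():
    print(host, code, out)
```

Passing `runner=ssh_run` (the default) would invoke the real ssh client. A production tool also needs per-host error handling, output aggregation, and host-group selection, which is why the slide recommends picking an existing one.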
  18. Puppet
      • Configuration Management framework
      • Install and configure all applications on the cluster
      • Configure monitoring system
      • Currently running Puppet 2.7.x
  19. Puppet (continued)
  20. Puppet (continued)
  21. Puppet (continued)
  22. Puppet (continued)
  23. Puppet (continued)
      • Puppet sync 600 nodes in ~15 minutes:
        - Use parallel SSH tool to trigger Puppet sync across the cluster
        - 1 Puppet master with dual hex-core CPU
        - Saturated CPU on the Puppet master
      • Switch versions of Hadoop in 2 hours
      • Manifests and modules are version-controlled
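The sync figures above imply a rough per-node cost on the Puppet master. The sketch below derives it under assumptions that are not in the slides: "dual hex-core" is taken as 12 physical cores, hyper-threading is ignored, and every core is assumed to be fully busy for the whole 15 minutes. The function name is hypothetical.

```python
def sync_cost(nodes: int, minutes: float, cores: int) -> dict:
    """Average sync throughput and master CPU time spent per node."""
    wall_seconds = minutes * 60
    return {
        "nodes_per_minute": nodes / minutes,
        "wall_seconds_per_node": wall_seconds / nodes,
        "core_seconds_per_node": wall_seconds * cores / nodes,
    }


# 600 nodes in ~15 minutes on a master with 2 x 6 cores
print(sync_cost(600, 15, 12))
```

With these inputs the master averages 40 nodes per minute, or about 18 core-seconds per node. That is an estimate implied by the slide's numbers, not a measured value.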
  24. Puppet (continued)
      • One quarter to learn, deploy and design our Puppet infrastructure.
        - It is an iterative process.
      • Tasks managed outside of Puppet:
        - User account management
        - Start/Stop Hadoop services
        - Orchestrate deployment
        - Rollback/uninstall applications
  25. Cluster Management Tools (matrix of tasks against tools)
      • Tools: Kickstart, Parallel SSH, Puppet, Nagios, Ganglia
      • Tasks: Install OS, Install Apps, Configure Apps, Start / Stop Services, Monitoring
  26. Q & A: http://www.analyticsworkbench.com
  27. Pivotal Sessions at EMC World (session, presenter, dates/times)
      • The Pivotal Platform: A Purpose-Built Platform for Big-Data-Driven Applications. Josh Klahr. Tue 5:30 - 6:30, Palazzo E; Wed 11:30 - 12:30, Delfino 4005
      • Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action. Noelle Sio. Tue 10:00 - 11:00, Lando 4205; Thu 8:30 - 9:30, Palazzo F
      • Pivotal: Operationalizing 1000-node Hadoop Cluster - Analytics Workbench. Clinton Ooi, Bhavin Modi. Tue 11:30 - 12:30, Palazzo L; Thu 10:00 - 11:00 am, Delfino 4001A
      • Pivotal: for Powerful Processing of Unstructured Data For Valuable Insights. SK Krishnamurthy. Mon 4:00 - 5:00, Lando 4201 A; Tue 4:00 - 5:00, Palazzo M
      • Pivotal: Big & Fast data - merging real-time data and deep analytics. Michael Crutcher. Mon 1:00 - 2:00, Lando 4201 A; Wed 10:00 - 11:00, Palazzo M
      • Pivotal: Virtualize Big Data to Make The Elephant Dance. June Yang, Dan Baskette. Mon 11:30 - 12:30, Marcello 4401A; Wed 4:00 - 5:00, Palazzo E
      • Hadoop Design Patterns. Don Miner. Mon 2:30 - 3:30, Palazzo F; Wed 8:30 - 9:30, Delfino 4005
