Hadoop Operations: Starting Out Small
Presentation Transcript

  • Hadoop Operations: Starting Out Small. So Your Cluster Isn't Yahoo-sized (yet). Michael Arnold, Principal Systems Engineer, Apollo Group. 14 June 2012. © 2012 Apollo Group
  • Agenda Who What (Definitions) Decisions for Now Decisions for Later Lessons LearnedAPOLLO GROUP © 2012 Apollo Group 2
  • Who
  • Who is Apollo? Apollo Group is a leading provider of higher education programs for working adults.
  • Who is Michael Arnold? Systems Administrator. Automation geek. 13 years in IT. I deal with:
    –Server hardware specification/configuration
    –Server firmware
    –Server operating system
    –Hadoop application health
    –Monitoring of all the above
  • What (Definitions)
  • Definitions. Q: What is a tiny/small/medium/large cluster? A:
    –Tiny: 1-9 nodes
    –Small: 10-99 nodes
    –Medium: 100-999 nodes
    –Large: 1000+ nodes
    –Yahoo-sized: 4000 nodes
  • Definitions. Q: What is a “headnode”? A: A server that runs one or more of the following Hadoop processes:
    –NameNode
    –JobTracker
    –Secondary NameNode
    –ZooKeeper
    –HBase Master
  • Decisions for Now. What decisions should you make now, and which can you postpone for later?
  • Which Hadoop distribution? Amazon, Apache, Cloudera, Greenplum, Hortonworks, IBM, MapR, Platform Computing.
  • Should you virtualize? It can be OK for small clusters, BUT:
    –virtualization adds overhead
    –it can cause performance degradation
    –it cannot take advantage of Hadoop rack locality
    Virtualization can be good for:
    –functional testing of M/R job or workflow changes
    –evaluation of Hadoop upgrades
  • What sort of hardware should you be considering? Inexpensive, not “enterprisey” hardware:
    –No RAID*
    –No redundant power*
    –Low power consumption
    –No optical drives (get systems that can boot off the network)
    * except in headnodes
  • Plan for capacity expansion Start at the bottom and work your way up Leave room in your cabinets for more machinesAPOLLO GROUP © 2012 Apollo Group 13
  • Plan for capacity expansion (cont.) Deploy your initial cluster in two cabinets –One headnode, one switch, and several (five) datanodes per cabinetAPOLLO GROUP © 2012 Apollo Group 14
  • Plan for capacity expansion (cont.) Install a second cluster in the empty space in the upper half of the cabinetAPOLLO GROUP © 2012 Apollo Group 15
  • Decisions for Later
  • What size cluster? Depends upon your budget, data size, workload characteristics, and SLA.
  • What size cluster? (cont.) Are your MapReduce jobs compute-intensive, or reading lots of data? See http://www.cloudera.com/blog/2010/08/hadoophbase-capacity-planning/
  • Should you implement rack awareness? If there is more than one switch in the cluster: YES.
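Rack awareness is wired up by pointing the NameNode and JobTracker at a topology script (the `topology.script.file.name` property in Hadoop 1.x/CDH3-era configs). A minimal sketch, with made-up subnet-to-rack assignments; Hadoop passes one or more addresses and expects one rack path per line:

```shell
#!/bin/sh
# Hypothetical topology script; the subnet-to-rack mapping is an example.
# Point topology.script.file.name at this file.
rack_of() {
  case "$1" in
    10.1.1.*) echo "/rack1" ;;
    10.1.2.*) echo "/rack2" ;;
    *)        echo "/default-rack" ;;
  esac
}
# Hadoop invokes the script with one or more IPs/hostnames as arguments:
for host in "$@"; do
  rack_of "$host"
done
```

With two cabinets (as in the deployment layout above), each cabinet's subnet would map to its own rack path, so HDFS places replicas in both cabinets.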
  • Should you use automation? If not in the beginning, then as soon as possible. Boot disks will fail. Automated OS and application installs:
    –Save time
    –Reduce errors
    •Cobbler/Spacewalk/Foreman/xCAT/etc.
    •Puppet/Chef/Cfengine/shell scripts/etc.
  • Lessons Learned
  • Keep It Simple. Don't add redundancy and features (server/network) that will make things more complicated and expensive. Hadoop has built-in redundancies; don't overlook them.
  • Automate the Hardware. Twelve hours of manual work in the datacenter is not fun. Make sure all server firmware is configured identically:
    –HP SmartStart Scripting Toolkit
    –Dell OpenManage Deployment Toolkit
    –IBM ServerGuide Scripting Toolkit
  • Rolling upgrades are possible (just not of the Hadoop software). Datanodes can be decommissioned, patched, and added back into the cluster without service downtime.
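That decommission flow can be sketched in shell, assuming CDH3-era tooling; the excludes path and hostname are examples, and `dfs.hosts.exclude` in hdfs-site.xml must already point at the excludes file:

```shell
# Hypothetical decommission helper; EXCLUDES path is an example.
EXCLUDES=${EXCLUDES:-/etc/hadoop/conf/hosts.exclude}
decommission() {  # usage: decommission <datanode-fqdn>
  echo "$1" >> "$EXCLUDES"
  hadoop dfsadmin -refreshNodes   # NameNode re-reads the excludes file
                                  # and re-replicates the node's blocks
}
# Once replication drains, patch and reboot the node, remove it from
# $EXCLUDES, and run "hadoop dfsadmin -refreshNodes" again to
# recommission it.
```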
  • The smallest thing can have a big impact on the cluster. A bad NIC or switchport can cause cluster-wide slowness. Slow disks can cause intermittent job slowdowns.
  • HDFS blocks are weird. On ext3/ext4:
    –Small blocks are not padded to the HDFS block size, but occupy only the actual size of the data.
    –Each HDFS block is actually two files on the datanode's filesystem: the actual data and a metadata/checksum file.
    # ls -l blk_1058778885645824207*
    -rw-r--r-- 1 hdfs hdfs 35094 May 14 01:26 blk_1058778885645824207
    -rw-r--r-- 1 hdfs hdfs   283 May 14 01:26 blk_1058778885645824207_19155994.meta
  • Do not prematurely optimize. Be careful tuning your datanode filesystems.
    –mkfs -t ext4 -T largefile4 ... (probably bad)
    –mkfs -t ext4 -i 131072 -m 0 ... (better)
    /etc/mke2fs.conf:
    [fs_types]
      hadoop = {
        features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
        inode_ratio = 131072
        blocksize = -1
        reserved_ratio = 0
        default_mntopts = acl,user_xattr
      }
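A rough sketch of why `-T largefile4` is risky here: that fs_type sets inode_ratio=4194304 (one inode per 4 MiB), and since every HDFS block costs two files (data plus .meta), a datanode can exhaust inodes long before it runs out of space. The `-i 131072` ratio leaves far more headroom. Arithmetic for a hypothetical 2 TB data partition:

```shell
# Rough inode arithmetic for a hypothetical 2 TB datanode partition.
bytes=$((2 * 1024 * 1024 * 1024 * 1024))
# -T largefile4 implies inode_ratio=4194304 (one inode per 4 MiB):
echo $((bytes / 4194304))   # only ~512K inodes; each HDFS block
                            # consumes two (data file + .meta file)
# -i 131072 (one inode per 128 KiB) leaves ample headroom:
echo $((bytes / 131072))
```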
  • Use DNS-friendly names for services:
    hdfs://hdfs.delta.hadoop.apollogrp.edu:8020/
    mapred.delta.hadoop.apollogrp.edu:8021
    http://oozie.delta.hadoop.apollogrp.edu:11000/
    hiveserver.delta.hadoop.apollogrp.edu:10000
    Yes, the names are long, but I bet you can figure out how to connect to Bravo Cluster.
  • Use a parallel, remote execution tool: pdsh/Cluster SSH/mussh/etc., or FUNC/MCollective. SSH in a for loop is so 2010.
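For contrast with pdsh (which runs the same thing in parallel, e.g. `pdsh -w 'dn[001-040]' uptime`), the 2010-era for loop looks like the sketch below. `run_on_hosts` is a hypothetical helper, and `RUNNER` defaults to a dry-run `echo ssh` so the sketch runs without real hosts; set `RUNNER=ssh` for real use:

```shell
# Serial "ssh in a for loop" baseline that pdsh replaces.
# RUNNER defaults to "echo ssh" (dry run) so this is self-contained.
run_on_hosts() {  # usage: run_on_hosts <command> <host>...
  cmd=$1; shift
  for h in "$@"; do
    ${RUNNER:-echo ssh} "$h" "$cmd"
  done
}
run_on_hosts uptime dn001 dn002 dn003
```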
  • Make your log directories as large as you can: 20-100 GB for /var/log. Implement log-purging cronjobs or your log directories will fill up. Beware: M/R jobs can fill up /tmp as well.
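A log-purging cronjob can be as simple as the sketch below; the directory and 7-day retention are assumptions, so adjust both to your logging volume. The demonstration runs against a scratch directory rather than a real /var/log/hadoop:

```shell
# Sketch of a log-purge helper; retention policy is an assumption.
# In production, run daily from cron against /var/log/hadoop.
purge_old_logs() {  # usage: purge_old_logs <dir> <days>
  find "$1" -type f -mtime +"$2" -delete
}
# Demonstration against a scratch directory:
d=$(mktemp -d)
touch "$d/recent.log"
touch -t 200001010000 "$d/ancient.log"   # mtime far in the past
purge_old_logs "$d" 7
```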
  • Insist on IPMI 2.0 for out-of-band management of server hardware. Serial Over LAN is awesome when booting a system. Standardized hardware/temperature monitoring. Simple remote power control.
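For reference, these capabilities map onto `ipmitool` roughly as follows; the BMC hostname and credentials are placeholders, and `-I lanplus` selects the IPMI 2.0 interface:

```shell
# Example ipmitool invocations against a hypothetical BMC (dn007-bmc).
ipmitool -I lanplus -H dn007-bmc -U admin -P secret chassis power status
ipmitool -I lanplus -H dn007-bmc -U admin -P secret sol activate         # Serial Over LAN console
ipmitool -I lanplus -H dn007-bmc -U admin -P secret sdr type Temperature # temperature sensors
ipmitool -I lanplus -H dn007-bmc -U admin -P secret chassis power cycle  # remote power control
```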
  • Spanning-tree is the devil. Enable portfast on your server switch ports or the BMCs may never get a DHCP lease.
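On Cisco switches, for instance, that looks like the fragment below (the interface range is a placeholder; other vendors have equivalent "edge port" settings):

```
interface range GigabitEthernet1/0/1 - 24
 spanning-tree portfast
```

Portfast skips the listening/learning delay on host-facing ports, so the link forwards traffic before the BMC's DHCP requests time out.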
  • Apollo has re-built its cluster four times. You may end up doing so as well.
  • Apollo Timeline: First build. Cloudera Professional Services helped install CDH. Four nodes. Manually built OS via USB CDROM. CDH2.
  • Apollo Timeline: Second build. Cobbler. All software deployment via kickstart; very little in puppet. Config files deployed via wget. CDH2.
  • Apollo Timeline: Third build. OS filesystem partitioning needed to change. Most software deployment still via kickstart. CDH3b2.
  • Apollo Timeline: Fourth build. HDFS filesystem inodes needed to be increased. Full puppet automation. Added redundant/hotswap enterprise hardware for headnodes. CDH3u1.
  • Cluster failures at Apollo. Hardware:
    –disk failures (40+)
    –disk cabling (6)
    –RAM (2)
    –switch port (1)
    Software:
    –Cluster: NFS (NN -> 2NN metadata)
    –Job: TT java heap; running out of /tmp or /var/log/hadoop; running out of HDFS space
  • Know your workload. You can spend all the time in the world trying to get the best CPU/RAM/HDD/switch/cabinet configuration, but you are running on pure luck until you understand your cluster's workload.
  • Questions?