Hadoop Operations: Presentation Transcript

• Hadoop Operations: Starting Out Small. So Your Cluster Isn't Yahoo-sized (Yet). Michael Arnold, Principal Systems Engineer, 14 June 2012
    • Agenda Who What (Definitions) Decisions for Now Decisions for Later Lessons LearnedAPOLLO GROUP © 2012 Apollo Group 2
• Who
• Who is Apollo? Apollo Group is a leading provider of higher education programs for working adults.
• Who is Michael Arnold? Systems administrator. Automation geek. 13 years in IT. I deal with:
  –server hardware specification/configuration
  –server firmware
  –the server operating system
  –Hadoop application health
  –monitoring all of the above
• What (Definitions)
• Definitions. Q: What is a tiny/small/medium/large cluster? A:
  –Tiny: 1-9 nodes
  –Small: 10-99 nodes
  –Medium: 100-999 nodes
  –Large: 1000+ nodes
  –Yahoo-sized: 4,000 nodes
• Definitions. Q: What is a “headnode”? A: A server that runs one or more of the following Hadoop processes:
  –NameNode
  –JobTracker
  –Secondary NameNode
  –ZooKeeper
  –HBase Master
• Decisions for Now. What decisions should you make now, and which can you postpone until later?
• Which Hadoop distribution? Amazon, Apache, Cloudera, Greenplum, Hortonworks, IBM, MapR, Platform Computing.
• Should you virtualize? It can be OK for small clusters, BUT:
  –virtualization adds overhead
  –it can cause performance degradation
  –it cannot take advantage of Hadoop rack locality
  Virtualization can be good for:
  –functional testing of M/R job or workflow changes
  –evaluation of Hadoop upgrades
• What sort of hardware should you be considering? Inexpensive, not “enterprisey” hardware:
  –no RAID*
  –no redundant power*
  –low power consumption
  –no optical drives; get systems that can boot off the network
  (* except in headnodes)
    • Plan for capacity expansion Start at the bottom and work your way up Leave room in your cabinets for more machinesAPOLLO GROUP © 2012 Apollo Group 13
    • Plan for capacity expansion (cont.) Deploy your initial cluster in two cabinets –One headnode, one switch, and several (five) datanodes per cabinetAPOLLO GROUP © 2012 Apollo Group 14
    • Plan for capacity expansion (cont.) Install a second cluster in the empty space in the upper half of the cabinetAPOLLO GROUP © 2012 Apollo Group 15
• Decisions for Later
• What size cluster? Depends upon your budget, data size, workload characteristics, and SLA.
• What size cluster? (cont.) Are your MapReduce jobs compute-intensive, or reading lots of data? See http://www.cloudera.com/blog/2010/08/hadoophbase-capacity-planning/
• Should you implement rack awareness? If there is more than one switch in the cluster: YES.
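  In the Hadoop 1.x releases this deck targets, rack awareness is enabled by pointing the core-site.xml property topology.script.file.name at a script that maps node addresses to rack paths. A minimal sketch, assuming the two-cabinet layout above; the script path, subnets, and rack names are made up:

    #!/bin/sh
    # Hypothetical topology script (e.g. /etc/hadoop/conf/topology.sh).
    # Hadoop passes one or more IPs or hostnames as arguments and
    # expects one rack path per argument on stdout.
    for node in "$@"; do
      case "$node" in
        10.1.1.*) echo "/dc1/rack1" ;;  # cabinet 1 subnet (assumed)
        10.1.2.*) echo "/dc1/rack2" ;;  # cabinet 2 subnet (assumed)
        *)        echo "/default-rack" ;;
      esac
    done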
• Should you use automation? If not in the beginning, then as soon as possible. Boot disks will fail. Automated OS and application installs:
  –save time
  –reduce errors
  •Cobbler/Spacewalk/Foreman/xCAT/etc.
  •Puppet/Chef/Cfengine/shell scripts/etc.
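  As one hedged illustration of how the two layers hand off (not the deck's actual tooling; the Puppet server name is a placeholder), a Cobbler-served kickstart can bootstrap the configuration-management agent from its %post section:

    # %post section of a hypothetical kickstart template served by
    # Cobbler. The freshly installed node gets a Puppet agent and
    # converges to its Hadoop role on first boot.
    %post
    yum -y install puppet
    printf '[agent]\nserver = puppet.example.com\n' >> /etc/puppet/puppet.conf
    chkconfig puppet on
    %end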
• Lessons Learned
• Keep it simple. Don't add redundancy and features (server/network) that will make things more complicated and expensive. Hadoop has built-in redundancies; don't overlook them.
• Automate the hardware. Twelve hours of manual work in the datacenter is not fun. Make sure all server firmware is configured identically:
  –HP SmartStart Scripting Toolkit
  –Dell OpenManage Deployment Toolkit
  –IBM ServerGuide Scripting Toolkit
• Rolling upgrades are possible (just not of the Hadoop software). Datanodes can be decommissioned, patched, and added back into the cluster without service downtime.
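  A sketch of that cycle with the stock HDFS admin tooling; the excludes-file path is whatever your hdfs-site.xml dfs.hosts.exclude points at, and the hostname is invented:

    # On the namenode: list the node in the excludes file named by
    # dfs.hosts.exclude, then tell the namenode to re-read it.
    echo "datanode05.example.com" >> /etc/hadoop/conf/dfs.exclude
    hadoop dfsadmin -refreshNodes
    # Wait until 'hadoop dfsadmin -report' shows the node as
    # Decommissioned, patch and reboot it, remove it from the excludes
    # file, and run -refreshNodes again to rejoin it to the cluster.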
• The smallest thing can have a big impact on the cluster. A bad NIC or switchport can cause cluster slowness. Slow disks can cause intermittent job slowdowns.
• HDFS blocks are weird. On ext3/ext4:
  –Small blocks are not padded to the HDFS block size; they occupy only the actual size of the data.
  –Each HDFS block is actually two files on the datanode's filesystem: the actual data, and a metadata/checksum file.

    # ls -l blk_1058778885645824207*
    -rw-r--r-- 1 hdfs hdfs 35094 May 14 01:26 blk_1058778885645824207
    -rw-r--r-- 1 hdfs hdfs   283 May 14 01:26 blk_1058778885645824207_19155994.meta
• Do not prematurely optimize. Be careful when tuning your datanode filesystems:

    mkfs -t ext4 -T largefile4 ...   (probably bad)
    mkfs -t ext4 -i 131072 -m 0 ...  (better)

  /etc/mke2fs.conf:

    [fs_types]
      hadoop = {
        features = has_journal,extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize
        inode_ratio = 131072
        blocksize = -1
        reserved_ratio = 0
        default_mntopts = acl,user_xattr
      }
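  With a stanza like that under [fs_types], the tuned settings can be applied by usage type instead of by remembering flags; the device name below is an example:

    # mke2fs looks up the 'hadoop' usage type under [fs_types] in
    # /etc/mke2fs.conf and applies its inode_ratio and other settings.
    mkfs -t ext4 -T hadoop /dev/sdb1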
• Use DNS-friendly names for services:

    hdfs://hdfs.delta.hadoop.apollogrp.edu:8020/
    mapred.delta.hadoop.apollogrp.edu:8021
    http://oozie.delta.hadoop.apollogrp.edu:11000/
    hiveserver.delta.hadoop.apollogrp.edu:10000

  Yes, the names are long, but I bet you can figure out how to connect to the Bravo cluster.
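  One hedged way such names might be published, assuming BIND zone-file syntax; the target headnode hostnames are invented for illustration:

    ; Hypothetical fragment of the delta.hadoop.apollogrp.edu zone.
    ; Service aliases point at whichever hosts currently hold the
    ; role, so clients never hard-code physical machine names.
    hdfs       IN CNAME headnode01.delta.hadoop.apollogrp.edu.
    mapred     IN CNAME headnode01.delta.hadoop.apollogrp.edu.
    oozie      IN CNAME headnode02.delta.hadoop.apollogrp.edu.
    hiveserver IN CNAME headnode02.delta.hadoop.apollogrp.edu.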
• Use a parallel, remote execution tool: pdsh/Cluster SSH/mussh/etc., or FUNC/MCollective. SSH in a for loop is so 2010.
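  For example, with pdsh (the hostname pattern is an assumption about your naming scheme):

    # Run one command on datanodes 01-10 in parallel; -w takes a
    # hostlist expression, and dshbak -c coalesces identical output.
    pdsh -w datanode[01-10] uptime | dshbak -c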
• Make your log directories as large as you can: 20-100 GB for /var/log.
  –Implement log-purging cron jobs or your log directories will fill up.
  Beware: M/R jobs can fill up /tmp as well.
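  A minimal sketch of such a purge job; the path and the 14-day retention are assumptions, not a recommendation from the deck:

    #!/bin/sh
    # Hypothetical /etc/cron.daily/hadoop-log-purge: remove Hadoop
    # logs older than 14 days so /var/log cannot slowly fill up.
    find /var/log/hadoop -type f -mtime +14 -delete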
• Insist on IPMI 2.0 for out-of-band management of server hardware. Serial-over-LAN is awesome when booting a system. Standardized hardware/temperature monitoring. Simple remote power control.
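  For instance, with ipmitool; the BMC hostname and credentials are placeholders:

    # Remote power control and a Serial-over-LAN console via the BMC.
    # 'lanplus' selects the IPMI 2.0 interface, which SOL requires.
    ipmitool -I lanplus -H bmc01.example.com -U admin -P secret power status
    ipmitool -I lanplus -H bmc01.example.com -U admin -P secret sol activate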
• Spanning tree is the devil. Enable PortFast on your server switch ports or the BMCs may never get a DHCP lease.
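  On a Cisco-style switch that looks roughly like the following (a sketch; the interface number is made up):

    ! Server- or BMC-facing access port: PortFast skips the spanning
    ! tree listening/learning delay, so DHCP succeeds during boot.
    interface GigabitEthernet0/5
     spanning-tree portfast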
• Apollo has re-built its cluster four times. You may end up doing so as well.
• Apollo Timeline: First build. Cloudera Professional Services helped install CDH. Four nodes. OS built manually via USB CDROM. CDH2.
• Apollo Timeline: Second build. Cobbler. All software deployment is via kickstart; very little is in Puppet. Config files are deployed via wget. CDH2.
• Apollo Timeline: Third build. OS filesystem partitioning needed to change. Most software deployment still via kickstart. CDH3b2.
• Apollo Timeline: Fourth build. HDFS filesystem inodes needed to be increased. Full Puppet automation. Added redundant/hot-swap enterprise hardware for headnodes. CDH3u1.
• Cluster failures at Apollo.
  Hardware:
  –disk failures (40+)
  –disk cabling (6)
  –RAM (2)
  –switch port (1)
  Software:
  –Cluster:
    •NFS (NN -> 2NN metadata)
  –Job:
    •TT Java heap
    •running out of /tmp or /var/log/hadoop
    •running out of HDFS space
• Know your workload. You can spend all the time in the world trying to get the best CPU/RAM/HDD/switch/cabinet configuration, but you are running on pure luck until you understand your cluster's workload.
• Questions?