Hado“OPS” or Had “oops”


Maintaining large-scale distributed systems is a herculean task, and Hadoop is no exception. The scale and velocity at which we operate at Rocket Fuel present a unique challenge. We saw five-fold growth in our data volume (in petabytes) and five-fold growth in the number of machines, all in just one year. As Hadoop became critical infrastructure at Rocket Fuel, we had to ensure scale and high availability so that our reporting, data mining, and machine learning could continue to excel. We also had to ensure business continuity with disaster recovery plans in the face of this drastic growth. In this presentation, we will discuss what worked well for us and what we learned (the hard way). Specifically, we will (a) describe how we automated installation and dynamic configuration using Puppet and InfraDB, (b) describe the performance tuning needed to scale Hadoop, (c) talk about the good, the bad, and the ugly of scheduling and multi-tenancy, (d) detail some of the hard-fought issues, (e) outline our business continuity plans and disaster recovery, (f) touch upon how we monitor our monster Hadoop cluster, and finally (g) share our experience of YARN at scale at Rocket Fuel.

Published in: Technology, Design


  1. Proprietary & Confidential. Copyright © 2014. Hado’ops’ or Had’oops’. Kishore Kumar Yellamraju, Abhijit Pol. We’re hiring: rocketfuel.com
  2. The Web Is Monetized By Advertising
  3. Delivery Methods » Display » Video » Mobile » Social
  4. Overview (diagram). Request flow: 1. Page Request, 2. Ad Request, 3. Bid Request, 4. Bid & Ad, 5. Rocketfuel Winning Ad, 6. Ad Served. Participants: Browser, Publishers, Ad Exchange (some exchange partners), Advertisers, Data Partners. Rocket Fuel Platform: Real-time Bidder, Automated Decisions, Data Store, Optimize, Refresh learning. Data shown: User Segments, User Engagements, Ads & Budget, Models, Model Scores, Events.
  5. Real Time Bidding and Serving: always buying the best impressions & serving the best ad. Factors: User, Brand Affinity, Time of Day, Geo/Weather, Site/Page. (Figure: sample per-impression bid values between $0.00 and $2.78.)
  6. Real Time Bidding and Serving: Impression Scorecard. Campaign goals: leads & sales, coupon downloads, brand awareness. Scored factors: Demo, Brand Affinity, Time of Day, Geo/Weather, Site/Page, Ad Position, In-Market Behavior, Response. (Figure: three example scorecards with per-factor scores and responses of +9.7%, +0.7%, and +1.4%; one impression is accepted and two are rejected.)
  7. Overview (same diagram as slide 4)
  8. Throughput
  9. Latency
  10. Architecture and Scale » Datacenters » Scale » Growth » Architecture
  11. Data Center Expansion » abc
  12. Data Center Design • Racks custom built at Rocket Fuel • Leased space/bandwidth in colocation facilities. Hadoop servers: 20 2U servers (8.5 kW). Bidders: 40 2U Twin 2 servers (17 kW).
  13. Rocket Fuel Scale » 34,474 CPU processor cores – 2,655 servers – 187.4 teraflops of computing » 188 terabytes of memory – 13X the memory of Watson, the IBM computer that played Jeopardy » 42 petabytes of storage – 106X the data volume of the entire Library of Congress
  14. Hadoop at Rocket Fuel » 1,400 servers » 15K disks » 15K cores » 90 TB » 30K MR slots » 12K daily MR jobs
  15. Growth: 200 servers to 1,400 servers; 5 PB to 41 PB; 8x growth
  16. Data Architecture 3.0
  17. Hadoop Setup (diagram): Active NN and Standby NN, each with a ZKFC; JT; QJM; ZK Quorum; DN/TT worker nodes. Hardware profiles listed: » 6x2TB disks, 2x6 cores, 196 GB RAM, 2x1G NIC » 12x3TB disks, 2x6 cores, 64 GB RAM, 10G NIC » Same as DNs, with a dedicated disk for ZK or JN
  18. Operations » Maintenance » Performance Tuning » Monitoring » BCP » YARN
  19. Maintenance is Not Easy. Automation is key: Puppet + Infradb
  20. Puppet and Infradb » Automate as much as you can » Adding a slave node to a Hadoop cluster: < 120 seconds » Bringing up a new Hadoop cluster: < 500 seconds » MR slots are automatically determined based on hardware config. Just define once. Isn’t it cool?
  21. Performance Tuning: no issues when the cluster is small. Problems start when it grows.
  22. Performance Tuning » HDFS: dfs.namenode.handler.count, dfs.image.transfer.timeout » MR: mapred.reduce.parallel.copies, mapreduce.reduce.shuffle.parallelcopies, mapred.job.tracker.handler.count, io.sort.mb, io.sort.factor » ZK: maxClientCnxns » JVM: -XX:+UseConcMarkSweepGC -XX:CMSFullGCsBeforeCompaction=1 -XX:CMSInitiatingOccupancyFraction=60 » IMP: MAPREDUCE-2026 » ha.*-timeout.ms
  23. We Have an Issue! MAPREDUCE-5351, MAPREDUCE-5508, "keep.failed.task.files=true"
  24. JT OOM: number of "JobInProgress" class instances = number of users who submitted jobs × mapred.jobtracker.completeuserjobs.maximum. Settings to watch: mapred.jobtracker.completeuserjobs.maximum, mapred.jobtracker.retirejob.interval, mapred.jobtracker.retiredjobs.cache.size
  25. Operations » Maintenance » Performance Tuning » Monitoring » BCP » YARN
  26. Monitoring: Wall of Ops
  27. Monitoring: don’t fly blind, you will crash! Example metrics: hadoop.namenode.CallQueueLength, hadoop.jobtracker.jvm.memheapusedm
  28. MR Workload Monitoring
  29. Network Monitoring: don’t blame the network; monitor it instead. A network mesh can be a mess.
  30. Alerting: monitoring is not enough; you need better alerting.
  31. Alerts >> Checking whether NN and JT are up is a no-brainer >> Reduce alert noise by having summary/aggregate alerts >> We rely heavily on custom scripts that query /jmx for NN and JT, for example http://hostname:port/jmx?qry=Hadoop:service=NameNode,name=NameNodeInfo (NameDirStatuses, DeadNodes, NumberOfMissingBlocks); qry=Hadoop:service=NameNode,name=FSNamesystemState (FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks); qry=hadoop:service=JobTracker,name=JobTrackerInfo (blacklisted TTs, #jobs, #slots_used, ThreadCount); qry=java.lang:type=Memory (used JVM memory, free JVM memory, etc.)
  32. MR Workload Alerting » Monitor the MR workload and alert – In-house tool that uses the “houdah” Ruby gem – Watches for long-running jobs, jobs with a large number of map tasks, blacklisted TTs with high failure counts, etc. » Collect details and auto-restart blacklisted TTs » Parse the JT log file for rogue jobs » Parse the JT log and collect all job-related info » White Elephant or hRaven could help » Parse the scheduler HTML page or use the metrics page: http://<JT-hostname>:50030/scheduler?advanced http://<JT-hostname>:50030/metrics
  33. Multi Tenancy: Modeling, OPS, ETL, Ad-hoc
  34. Scheduling: no scheduler is perfect unless you understand and tune it properly
  35. Operations » Maintenance » Performance Tuning » Monitoring » BCP » YARN
  36. BCP » BCP → Business Continuity Plan » Near-real-time reporting over 15+ TB of daily data » Freshness of models trained over petabytes of data
  37. Data BCP (diagram): INW Data Cluster; LSV Data Cluster; US, EU, and HK Serving Clusters; Amazon Backup. Workloads shown: Modeling, Reporting, User Queries, Research, Ad-hoc Queries, Processed Data.
  38. YARN » Resource Manager - Global resource scheduler - Hierarchical queues - Application management » Node Manager - Per-machine agent - Manages the life cycle of containers - Container resource monitoring » Application Master - Per-application - Manages application scheduling and task execution
  39. YARN at Rocket Fuel » YARN is in production » 700+ nodes » 31 TB RAM, 8,500 disks, 8,500 cores » Primary use case: MapReduce » No more static slots. YAY!!! » Tez, Spark, and Storm are in the race
  40. Obligatory “we are hiring” slide! http://rocketfuel.com/careers
  41. THANKS kishore@rocketfuel.com apol@rocketfuel.com
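
Slide 31 recommends summary/aggregate alerts built from custom scripts that query /jmx. A minimal Python sketch of that idea follows; it is not the speakers' script. It assumes the /jmx servlet returns JSON with a top-level "beans" list and that a healthy FSState reads "Operational". The function names, the thresholds, and the example host and port are all hypothetical. The attribute names (FSState, CapacityRemaining, NumDeadDataNodes, UnderReplicatedBlocks) are the ones listed on the slide.

```python
import json
from typing import List, Optional
from urllib.parse import quote

# Bean name taken from slide 31.
FS_STATE_BEAN = "Hadoop:service=NameNode,name=FSNamesystemState"


def jmx_url(host: str, port: int, bean: str) -> str:
    """Build the /jmx query URL for a single bean (host and port are placeholders)."""
    return "http://{}:{}/jmx?qry={}".format(host, port, quote(bean, safe=":=,"))


def namenode_summary_alert(payload: str,
                           max_dead_nodes: int = 0,
                           max_under_replicated: int = 1000,
                           min_capacity_bytes: int = 10 * 1024 ** 4) -> Optional[str]:
    """Fold several NameNode checks into one alert line, or None if healthy.

    All thresholds are illustrative assumptions, not values from the talk.
    """
    beans = json.loads(payload).get("beans", [])
    if not beans:
        return "NameNode: FSNamesystemState bean missing from /jmx response"
    state = beans[0]
    problems: List[str] = []
    # Assumption: a healthy NameNode reports FSState == "Operational".
    if state.get("FSState") != "Operational":
        problems.append("FSState={}".format(state.get("FSState")))
    if state.get("NumDeadDataNodes", 0) > max_dead_nodes:
        problems.append("{} dead DataNodes".format(state["NumDeadDataNodes"]))
    if state.get("UnderReplicatedBlocks", 0) > max_under_replicated:
        problems.append("{} under-replicated blocks".format(state["UnderReplicatedBlocks"]))
    if state.get("CapacityRemaining", 0) < min_capacity_bytes:
        problems.append("{} bytes of capacity remaining".format(state.get("CapacityRemaining", 0)))
    if not problems:
        return None
    return "NameNode: " + "; ".join(problems)


if __name__ == "__main__":
    # Canned payload so the sketch runs without a cluster.
    sample = json.dumps({"beans": [{"FSState": "safeMode",
                                    "CapacityRemaining": 20 * 1024 ** 4,
                                    "NumDeadDataNodes": 3,
                                    "UnderReplicatedBlocks": 5}]})
    print(jmx_url("namenode.example.com", 50070, FS_STATE_BEAN))
    print(namenode_summary_alert(sample))
```

In a real check, the payload would come from fetching the URL built by `jmx_url`. Parsing is kept separate from fetching so the alert logic can be tested against canned responses, and a single summary line per service is one way to cut the alert noise the slide mentions.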

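
The JT OOM relation on slide 24 (JobInProgress instances = users who submitted jobs × mapred.jobtracker.completeuserjobs.maximum) can be turned into a rough sizing check. The sketch below is only arithmetic on that relation. The per-job heap footprint `mb_per_job` is a hypothetical input that would have to be measured (for example from a heap dump); the talk gives no such figure. The function names are also illustrative.

```python
def max_retained_jobs(users_with_jobs: int, completeuserjobs_maximum: int) -> int:
    """Completed-job objects the JobTracker may hold, per the slide's relation."""
    if users_with_jobs < 0 or completeuserjobs_maximum < 0:
        raise ValueError("inputs must be non-negative")
    return users_with_jobs * completeuserjobs_maximum


def retained_heap_mb(users_with_jobs: int, completeuserjobs_maximum: int,
                     mb_per_job: float) -> float:
    """Heap held by retained jobs, given an assumed (measured) per-job footprint."""
    return max_retained_jobs(users_with_jobs, completeuserjobs_maximum) * mb_per_job


def max_setting_for_budget(users_with_jobs: int, heap_budget_mb: float,
                           mb_per_job: float) -> int:
    """Largest completeuserjobs.maximum that keeps retained jobs within a heap budget."""
    if users_with_jobs <= 0 or mb_per_job <= 0:
        raise ValueError("users_with_jobs and mb_per_job must be positive")
    return int(heap_budget_mb // (users_with_jobs * mb_per_job))


if __name__ == "__main__":
    # Illustrative numbers only: 50 users, 100 retained jobs per user, 2 MB per job.
    print(max_retained_jobs(50, 100))
    print(retained_heap_mb(50, 100, 2.0))
    print(max_setting_for_budget(50, 4000, 2.0))
```

Because the retained-job count grows with the number of distinct users, a per-user limit that is harmless on a small cluster can exhaust the JobTracker heap as tenancy grows, which is consistent with the slide listing the retire interval and retired-jobs cache size next to the per-user maximum.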