
Scaling Apache Storm - Strata + Hadoop World 2014

152,257 views


Scaling Apache Storm: Cluster Sizing and Performance Optimization

Slides from my presentation at Strata + Hadoop World 2014

Published in: Technology


  1. 1. Scaling Apache Storm P. Taylor Goetz, Hortonworks @ptgoetz
  2. 2. About Me Member of Technical Staff / Storm Tech Lead @ Hortonworks Apache Storm PMC Chair @ Apache
  3. 3. About Me Member of Technical Staff / Storm Tech Lead @ Hortonworks Apache Storm PMC Chair @ Apache Volunteer Firefighter since 2004
  4. 4. 1M+ messages / sec. on a 10-15 node cluster How do you get there?
  5. 5. How do you fight fire?
  6. 6. Put the wet stuff on the red stuff. Water, and lots of it.
  7. 7. When you're dealing with big fire, you need big water.
  8. 8. Static Water Sources Lakes Streams Reservoirs, Pools, Ponds
  9. 9. Data Hydrant Active source Under pressure
  10. 10. How does this relate to Storm?
  11. 11. Little’s Law L=λW The long-term average number of customers in a stable system L is equal to the long-term average effective arrival rate, λ, multiplied by the average time a customer spends in the system, W; or expressed algebraically: L = λW. http://en.wikipedia.org/wiki/Little's_law
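Little's Law can be turned into a quick back-of-the-envelope calculation. The sketch below is only an illustration: the class and method names are made up for this example, the 1,000,000 messages/sec figure is the target quoted earlier in the talk, and the 50 ms time-in-system is a purely hypothetical value.

```java
// Little's Law sketch: L = lambda * W.
// All names and the 50 ms figure are illustrative assumptions, not from the slides.
final class LittlesLaw {

    /** Average number of items in the system for a given arrival rate and time in system. */
    static double itemsInSystem(double arrivalRatePerSec, double avgSecondsInSystem) {
        return arrivalRatePerSec * avgSecondsInSystem;
    }

    public static void main(String[] args) {
        // 1,000,000 messages/sec, each spending a hypothetical 50 ms in the system.
        double inFlight = itemsInSystem(1_000_000.0, 0.050);
        System.out.printf("Average messages in flight: %.0f%n", inFlight);
    }
}
```

Read this way, the law says that if the time each message spends in the system grows while the arrival rate stays the same, the number of messages held in the system grows in proportion.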
  12. 12. Batch vs. Streaming
  13. 13. Batch Processing Operates on data at rest Velocity is a function of performance Poor performance costs you time
  14. 14. Stream Processing Data in motion At the mercy of your data source Velocity fluctuates over time Poor performance….
  15. 15. Poor performance bursts the pipes. Buffers fill up and eat memory Timeouts / Replays “Sink” systems overwhelmed
  16. 16. What can developers do?
  17. 17. Keep tuple processing code tight
        public class MyBolt extends BaseRichBolt {
            public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
                // initialize task
            }
            public void execute(Tuple input) {
                // process input, QUICKLY!
            }
            public void declareOutputFields(OutputFieldsDeclarer declarer) {
                // declare output
            }
        }
      Worry about this!
  18. 18. Keep tuple processing code tight
        public class MyBolt extends BaseRichBolt {
            public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
                // initialize task
            }
            public void execute(Tuple input) {
                // process input, QUICKLY!
            }
            public void declareOutputFields(OutputFieldsDeclarer declarer) {
                // declare output
            }
        }
      Not this.
  19. 19. Know your latencies
        L1 cache reference                            0.5 ns
        Branch mispredict                               5 ns
        L2 cache reference                              7 ns           14x L1 cache
        Mutex lock/unlock                              25 ns
        Main memory reference                         100 ns           20x L2 cache, 200x L1 cache
        Compress 1K bytes with Zippy                3,000 ns
        Send 1K bytes over 1 Gbps network          10,000 ns  0.01 ms
        Read 4K randomly from SSD*                150,000 ns  0.15 ms
        Read 1 MB sequentially from memory        250,000 ns  0.25 ms
        Round trip within same datacenter         500,000 ns   0.5 ms
        Read 1 MB sequentially from SSD*        1,000,000 ns     1 ms  4X memory
        Disk seek                              10,000,000 ns    10 ms  20x datacenter roundtrip
        Read 1 MB sequentially from disk       20,000,000 ns    20 ms  80x memory, 20X SSD
        Send packet CA->Netherlands->CA       150,000,000 ns   150 ms
      https://gist.github.com/jboner/2841832
  20. 20. Use a Cache Guava is your friend.
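The slide recommends Guava for caching. To keep the example free of external dependencies, the sketch below swaps in a small LRU cache built on the JDK's `LinkedHashMap`; it is a stand-in for the idea, not Guava's API. The class name `LookupCache` and the loader function are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// A minimal LRU cache sketch using only the JDK (a stand-in for a Guava cache).
// Not thread-safe: intended to be used from a single thread.
final class LookupCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;
    private int loads = 0;

    LookupCache(final int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true keeps the least recently used entry first in iteration order.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, calling the (presumably slow) loader only on a miss. */
    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    /** Number of times the loader was invoked; useful as a gauge. */
    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        // String::length stands in for an expensive lookup (database, remote service).
        LookupCache<String, Integer> cache = new LookupCache<>(2, String::length);
        cache.get("storm");
        cache.get("storm"); // served from the cache
        System.out.println("loader calls: " + cache.loadCount());
    }
}
```

The point of the sketch is the one from the latency table: a hit costs a memory reference, while a miss may cost a datacenter round trip or a disk seek.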
  21. 21. Expose your knobs and gauges. DevOps will appreciate it.
  22. 22. Externalize Configuration
      Hard-coded values require recompilation/repackaging.
        conf.setNumWorkers(3);
        builder.setSpout("spout", new RandomSentenceSpout(), 5);
        builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
        builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
      Values from external config. No repackaging!
        conf.setNumWorkers(props.get("num.workers"));
        builder.setSpout("spout", new RandomSentenceSpout(), props.get("spout.parallelism"));
        builder.setBolt("split", new SplitSentence(), props.get("split.parallelism")).shuffleGrouping("spout");
        builder.setBolt("count", new WordCount(), props.get("count.parallelism")).fieldsGrouping("split", new Fields("word"));
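The slide does not show where `props` comes from. One possible approach, sketched here under the assumption that the values live in a standard Java properties file, is to read them with `java.util.Properties` and parse them as integers. The helper name `intSetting` and the default values are hypothetical; the property text is inlined only to keep the sketch self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Sketch: reading parallelism settings from external configuration.
// The helper and defaults are illustrative assumptions, not part of Storm.
final class TopologySettings {

    /** Looks up an integer setting, falling back to a default when the key is absent. */
    static int intSetting(Properties props, String key, int defaultValue) {
        String raw = props.getProperty(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // In practice this would be loaded from a file outside the topology jar.
        props.load(new StringReader("num.workers=3\nspout.parallelism=5\nsplit.parallelism=8\n"));

        int workers = intSetting(props, "num.workers", 1);
        int spouts = intSetting(props, "spout.parallelism", 1);
        int counters = intSetting(props, "count.parallelism", 1); // absent, so the default is used
        System.out.println(workers + " workers, " + spouts + " spouts, " + counters + " counters");
    }
}
```

The parsed values could then be passed to calls such as `conf.setNumWorkers(...)` and the parallelism hints shown on the slide.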
  23. 23. What can DevOps do?
  24. 24. How big is your hose?
  25. 25. Find out!
  26. 26. Performance testing is essential!
  27. 27. How to deal with small pipes? (i.e. When your output is more like a garden hose.)
  28. 28. Parallelize Slow sinks
  29. 29. Parallelism == Manifold Take input from one big pipe and distribute it to many smaller pipes The bigger the size difference, the more parallelism you will need
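The manifold analogy suggests a simple first estimate: divide the incoming rate by what a single sink task can sustain and round up. The sketch below assumes both rates have been measured; the class name and the example numbers are hypothetical.

```java
// Sketch: estimating how many parallel sink tasks are needed to keep up with the input.
// Names and numbers are illustrative assumptions.
final class SinkParallelism {

    /** Smallest number of tasks whose combined rate covers the input rate. */
    static int tasksNeeded(double inputPerSec, double perTaskSinkPerSec) {
        if (inputPerSec < 0 || perTaskSinkPerSec <= 0) {
            throw new IllegalArgumentException("rates must be non-negative and per-task rate positive");
        }
        return (int) Math.ceil(inputPerSec / perTaskSinkPerSec);
    }

    public static void main(String[] args) {
        // Hypothetical: 100,000 messages/sec arriving, each sink task sustains 2,500 writes/sec.
        System.out.println("sink tasks needed: " + tasksNeeded(100_000, 2_500));
    }
}
```

This ignores headroom for bursts, so in practice the result is a floor rather than a target.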
  30. 30. Sizeup Initial assessment
  31. 31. Every fire is different.
  33. 33. Every streaming use case is different.
  34. 34. Sizeup — Fire What are my water sources? What GPM can they support? How many lines (hoses) do I need? How much water will I need to flow to put this fire out?
  35. 35. Sizeup — Storm What are my input sources? At what rate do they deliver messages? What size are the messages? What's my slowest data sink?
  36. 36. There is no magic bullet.
  37. 37. But there are good starting points.
  38. 38. Numbers Where to start.
  39. 39. 1 Worker / Machine / Topology Keep unnecessary network transfer to a minimum
  40. 40. 1 Acker / Worker Default in Storm 0.9.x
  41. 41. 1 Executor / CPU Core Optimize Thread/CPU usage
  42. 42. 1 Executor / CPU Core (for CPU-bound use cases)
  43. 43. 1 Executor / CPU Core Multiply by 10x-100x for I/O bound use cases
  44. 44. Example 10 Worker Nodes 16 Cores / Machine 10 * 16 = 160 “Parallelism Units” available
  45. 45. Example 10 Worker Nodes 16 Cores / Machine 10 * 16 = 160 “Parallelism Units” available. Subtract # Ackers: 160 - 10 = 150 Units.
  46. 46. Example 10 Worker Nodes 16 Cores / Machine (10 * 16) - 10 = 150 “Parallelism Units” available
  47. 47. Example 10 Worker Nodes 16 Cores / Machine (10 * 16) - 10 = 150 “Parallelism Units” available (* 10-100 if I/O bound) Distribute this among tasks in topology. Higher for slow tasks, lower for fast tasks.
  48. 48. Example 150 “Parallelism Units” available Emit Calculate Persist 10 40 100
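The arithmetic from the example slides can be written out as a small sketch. The class and method names are made up for illustration; the numbers (10 nodes, 16 cores, 10 ackers, and the 10/40/100 split) are the ones from the slides.

```java
// Sketch of the "parallelism units" starting-point arithmetic from the example.
// Class and method names are illustrative.
final class ParallelismBudget {

    /** One unit per CPU core across the cluster, minus one per acker. */
    static int units(int workerNodes, int coresPerNode, int ackers) {
        return workerNodes * coresPerNode - ackers;
    }

    public static void main(String[] args) {
        int budget = units(10, 16, 10); // (10 * 16) - 10

        // The split from the example: fewer units for fast tasks, more for slow ones.
        int emit = 10;
        int calculate = 40;
        int persist = 100;

        System.out.println("budget = " + budget + ", allocated = " + (emit + calculate + persist));
    }
}
```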
  49. 49. Watch Storm’s “capacity” metric This tells you how hard components are working. Adjust parallelism unit distribution accordingly.
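As a rough guide to reading the metric: to my understanding, the Storm UI derives a bolt's capacity as the number of tuples executed, multiplied by the average execute latency, divided by the length of the measurement window. The sketch below assumes that formula; the names and sample numbers are hypothetical, so check the documentation for your Storm version before relying on it.

```java
// Sketch: approximating a bolt's "capacity" from its execute count and latency.
// Assumes capacity = executed * avg execute latency / window; names are illustrative.
final class BoltCapacity {

    /** Fraction of the window the bolt spent executing tuples (values near 1.0 mean saturated). */
    static double capacity(long executedCount, double avgExecuteLatencyMs, long windowMs) {
        if (windowMs <= 0) {
            throw new IllegalArgumentException("windowMs must be positive");
        }
        return (executedCount * avgExecuteLatencyMs) / windowMs;
    }

    public static void main(String[] args) {
        // Hypothetical: 1,200,000 tuples at 0.4 ms each over a 10-minute window.
        double c = capacity(1_200_000L, 0.4, 600_000L);
        System.out.printf("capacity = %.2f%n", c);
    }
}
```

A component whose value approaches 1.0 is a candidate for more parallelism units; one that sits far below it may be able to give some up.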
  50. 50. This is just a starting point. Test, test, test. Measure, measure, measure.
  51. 51. Internal Messaging Handling backpressure.
  52. 52. Internal Messaging (Intra-worker)
  53. 53. Key Settings topology.max.spout.pending Spout/Bolt API: Controls how many tuples are in-flight (not ack’ed) Trident API: Controls how many batches are in flight (not committed)
  54. 54. Key Settings topology.max.spout.pending When reached, Storm will temporarily stop emitting data from Spout(s) WARNING: Default is “unset” (i.e. no limit)
  55. 55. Key Settings topology.max.spout.pending Spout/Bolt API: Start High (~1,000) Trident API: Start Low (~1-5)
  56. 56. Key Settings topology.message.timeout.secs Controls how long a tuple tree (Spout/Bolt API) or batch (Trident API) has to complete processing before Storm considers it timed out and fails it. Default value is 30 seconds.
  57. 57. Key Settings topology.message.timeout.secs Q: “Why am I getting tuple/batch failures for no apparent reason?” A: Timeouts due to a bottleneck. Solution: Look at the “Complete Latency” metric. Increase timeout and/or increase component parallelism to address the bottleneck.
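The two settings above can be collected into a starting-point configuration. The sketch uses a plain `Map` so it runs without Storm on the classpath; Storm's own `Config` class is itself a map keyed by these setting names. The class and method names are hypothetical, and the values are the starting points from the slides (the upper end of the 1-5 range is picked for Trident).

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: starting-point values for the two key settings discussed above.
// A plain Map stands in for Storm's Config; names here are illustrative.
final class KeySettings {

    static Map<String, Object> startingPoint(boolean trident) {
        Map<String, Object> conf = new HashMap<>();
        // Spout/Bolt API counts tuples in flight; Trident counts batches, so start much lower.
        conf.put("topology.max.spout.pending", trident ? 5 : 1000);
        // 30 seconds is the default mentioned above; raise it only after checking Complete Latency.
        conf.put("topology.message.timeout.secs", 30);
        return conf;
    }

    public static void main(String[] args) {
        System.out.println("Spout/Bolt API: " + startingPoint(false));
        System.out.println("Trident API:    " + startingPoint(true));
    }
}
```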
  58. 58. Turn knobs slowly, one at a time.
  59. 59. Don't mess with settings you don't understand.
  60. 60. Storm ships with sane defaults Override only as necessary
  61. 61. Hardware Considerations
  62. 62. Nimbus Generally light load Can collocate Storm UI service m1.xlarge (or equivalent) should suffice Save the big metal for Supervisor/Worker machines…
  63. 63. Supervisor/Worker Nodes Where hardware choices have the most impact.
  64. 64. CPU Cores More is usually better The more you have the more threads you can support (i.e. parallelism) Storm potentially uses a lot of threads
  65. 65. Memory Highly use-case specific How many workers (JVMs) per node? Are you caching and/or holding in-memory state? Tests/metrics are your friends
  66. 66. Network Use bonded NICs if necessary Keep nodes “close”
  67. 67. Other performance considerations
  68. 68. Don’t “Pancake!” Separate concerns.
  69. 69. Don’t “Pancake!” Separate concerns. CPU Contention I/O Contention Disk Seeks (ZooKeeper)
  70. 70. Keep this guy happy. He has big boots and a shovel.
  71. 71. ZooKeeper Considerations Use dedicated machines, preferably bare-metal if an option Start with 3 node ensemble (can tolerate 1 node loss) I/O is ZooKeeper’s main bottleneck Dedicated disk for ZK storage SSDs greatly improve performance
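The "3 nodes can tolerate 1 loss" figure follows from ZooKeeper needing a strict majority of the ensemble to stay available. A small sketch of that arithmetic, with illustrative names:

```java
// Sketch: how many node losses a majority-quorum ensemble can survive.
// Class and method names are illustrative.
final class EnsembleSizing {

    /** A strict majority must remain, so an ensemble of n tolerates (n - 1) / 2 losses. */
    static int tolerableFailures(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble must have at least one node");
        }
        return (ensembleSize - 1) / 2;
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 5; n++) {
            System.out.println(n + " nodes tolerate " + tolerableFailures(n) + " failure(s)");
        }
    }
}
```

Note that a fourth node adds no extra fault tolerance over three, which is why ensembles are usually sized with an odd number of nodes.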
  72. 72. Recap Know/track your latencies and code appropriately Externalize configuration Scaling is a factor of balancing the I/O and CPU requirements of your use case Dev + DevOps + Ops coordination and collaboration is essential
  73. 73. Thanks! P. Taylor Goetz, Hortonworks @ptgoetz
