London hug


Published on

Talk given at the London Hadoop users' group meeting in May of 2012 about how to do real-time and batch computation on the same stream of information.

Published in: Technology, Business
  • Be the first to comment

London hug

  1. 1. Real-time and Long-time with Storm and Hadoop©MapR Technologies - Confidential 1
  2. 2. Real-time and Long-time with Storm and Hadoop MapR©MapR Technologies - Confidential 2
  3. 3.  Contact: – – @ted_dunning Slides and such: – Hash tag: #mapr_uk Collective notes:©MapR Technologies - Confidential 3
  4. 4. Company Background MapR provides the industry’s best Hadoop Distribution – Combines the best of the Hadoop community contributions with significant internally financed infrastructure development Background of Team – Deep management bench with extensive analytic, storage, virtualization, and open source experience – Google, EMC, Cisco, VMWare, Network Appliance, IBM, Microsoft, Apache Foundation, Aster Data, Brio, ParAccel Proven – MapR used across industries (Financial Services, Media, Telcom, Health Care, Internet Services, Government) – Strategic OEM relationship with EMC and Cisco – Over 1,000 installs ©MapR Technologies - Confidential 4
  5. 5. Expanding Hadoop Use Cases Hadoop APIs for Hadoop Applications NFS for file- ODBC (JDBC) based for SQL-based applications applications Mission Real-time Critical and SLA Applications dependent Applications Blue = MapR Innovations©MapR Technologies - Confidential 5
  6. 6. MapR’s Complete Distribution for Apache Hadoop Integrated, tested, hardened and MapR Control System Supported MapR LDAP, NIS Quotas, CLI, Heatmap™ Integration Alerts, Alarms REST APT 100% Hadoop, HBase, HDFS API compatible Hive Pig Oozle Sqoop HBase Whirr Easy portability/ migration between Zoo- Mahout Cascading Naglos Ganglia Flume distributions Integration Integration keeper Unique advanced features No changes required Direct Real-Time Snap- Data Access Streaming Volumes Mirrors shots Placement to Hadoop applications NFS Runs on commodity No NameNode High Performance Stateful Failover Architecture Direct Shuffle and Self Healing hardware 2.7 MapR’s Storage Services™ ©MapR Technologies - Confidential 6
  7. 7. So what about that real-time stuff?©MapR Technologies - Confidential 7
  8. 8. The Challenge Hadoop is great of processing vats of data – But sucks for real-time (by design!) Storm is great for real-time processing – But lacks any way to deal with batch processing It sounds like there isn’t a solution – Neither fashionable solution handles everything©MapR Technologies - Confidential 8
  9. 9. This is not a problem. It’s an opportunity!©MapR Technologies - Confidential 9
  10. 10. Hadoop is Not Very Real-time Unprocessed now Data t Fully Latest full Hadoop job processed period takes this long for this data©MapR Technologies - Confidential 10
  11. 11. Need to Plug the Hole in Hadoop We have real-time data with limited state – Exactly what Storm does – And what Hadoop does not We also have long-term analytics with lots of state – Exactly what Hadoop does – And what Storm does not Can Storm and Hadoop be combined?©MapR Technologies - Confidential 11
  12. 12. Real-time and Long-time together Blended now View view t Hadoop works Storm great back here works here©MapR Technologies - Confidential 12
  13. 13. An Example I want to know how many queries I get – Per second, minute, day, week Results should be available – within <2 seconds 99.9+% of the time – within 30 seconds almost always History should last >3 years Should work for 0.001 q/s up to 100,000 q/s Failure tolerant, yadda, yadda©MapR Technologies - Confidential 13
  14. 14. Rough Design – Data Flow Search Query Event Query Event Counter Counter Logger Engine Spout Spout Bolt Bolt Bolt Logger Logger Bolt Semi Snap Bolt Agg Raw Hadoop Logs Aggregator Long agg©MapR Technologies - Confidential 14
  15. 15. Counter Bolt Detail Input: Labels to count Output: Short-term semi-aggregated counts – (time-window, label, count) Input is logged until next flush Non-zero counts emitted on flush if – event count reaches threshold (typical 100K) – time since last count reaches threshold (typical 1-10s) Tuples acked when counts emitted Double count probability is > 0 but very small©MapR Technologies - Confidential 15
  16. 16. Counter Bolt Counterintuitivity Counts are emitted for same label, same time window many times – these are semi-aggregated – this is a feature – tuples can be acked within 1s – time windows can be much longer than 1s No need to send same label to same bolt – speeds failure recovery©MapR Technologies - Confidential 16
  17. 17. Design Flexibility Tuples can be ack’ed as soon as they hit the log – counter can recover state on failure – log is burn after write Count flush interval can be extended without extending tuple timeout – Decreases currency of counts in semi-aggregates Total bandwidth for log is typically not huge – All of twitter @10,000 messages per second = 10K x 2KB = 20MB/s©MapR Technologies - Confidential 17
  18. 18. Counter Bolt No-nos Cannot accumulate entire period in-memory – Tuples must be ack’ed much sooner – State must be persisted before ack’ing – State can easily grow too large to handle without disk access Cannot persist entire count table at once – Incremental persistence required©MapR Technologies - Confidential 18
  19. 19. Guarantees Counter output volume is small-ish – the greater of k tuples per 100K inputs or k tuple/s – 1 tuple/s/label/bolt for this exercise Persistence layer must provide guarantees – distributed against node failure – must have either readable flush or closed-append HDFS is distributed, but provides no guarantees and strange semantics MapRfs is distributed, provides all necessary guarantees©MapR Technologies - Confidential 19
  20. 20. Failure Modes Bolt failure – buffered tuples will go un’acked – after timeout, tuples will be resent – timeout ≈ 10s – if failure occurs after persistence, before acking, then double-counting is possible Storage (with MapR) – most failures invisible – a few continue within 0-2s, some take 10s – catastrophic cluster restart can take 2-3 min – logger can buffer this much easily©MapR Technologies - Confidential 20
  21. 21. Presentation Layer Presentation must – read recent output of Logger bolt – read relevant output of Hadoop jobs – combine semi-aggregated records User will see – counts that increment within 0-2 s of events – seamless meld of short and long-term data©MapR Technologies - Confidential 21
  22. 22. Example 2 – Real-time learning My system has to – learn a response model and – select training data – in real-time Data rate up to 100K queries per second©MapR Technologies - Confidential 22
  23. 23. Door Number 3 – AB testing in real-time I have 15 versions of my landing page Each visitor is assigned to a version – Which version? A conversion or sale or whatever can happen – How long to wait? Some versions of the landing page are horrible – Don’t want to give them traffic©MapR Technologies - Confidential 23
  24. 24. Real-time Constraints Selection must happen in <20 ms almost all the time Training events must be handled in <20 ms Failover must happen within 5 seconds Client should timeout and back-off – no need for an answer after 500ms State persistence required©MapR Technologies - Confidential 24
  25. 25. Rough Design Selector Query Event Counter DRPC Spout Timed Join Model Layer Spout Bolt Conversion Logger Logger Model Detector Bolt Bolt State Raw Logs©MapR Technologies - Confidential 25
  26. 26. A Quick Diversion You see a coin – What is the probability of heads? – Could it be larger or smaller than that? I flip the coin and while it is in the air ask again I catch the coin and ask again I look at the coin (and you don’t) and ask again Why does the answer change? – And did it ever have a single value?©MapR Technologies - Confidential 26
  27. 27. A First Conclusion Probability as expressed by humans is subjective and depends on information and experience©MapR Technologies - Confidential 27
  28. 28. A Second Diversion What is the mass of the moon? – 1/2 degree @ 385 Mm = ~ 3.8 Mm diameter (really about 3.4-ish) – V = 1/6 x pi x 3.83 x 1018 m3 = ~ 29 x 1018 m3 (really about 22) – m = rho V = 4 Mg/m3 x 29 x 1018 m3 = 1.2 x 1023 kg (really about 0.7) Is that the exact number? – Shouldn’t we have confidence bounds? Wikipedia says: 7.3477 × 1022 kg – Is that the exact number? – Shouldn’t they have confidence bounds?©MapR Technologies - Confidential 28
  29. 29. A Second Conclusion A single number is a bad way to express uncertain knowledge A distribution of values might be better©MapR Technologies - Confidential 29
  30. 30. I Dunno©MapR Technologies - Confidential 30
  31. 31. 5 and 5©MapR Technologies - Confidential 31
  32. 32. 2 and 10©MapR Technologies - Confidential 32
  33. 33. Bayesian Bandit Compute distributions based on data Sample p1 and p2 from these distributions Put a coin in bandit 1 if p1 > p2 Else, put the coin in bandit 2©MapR Technologies - Confidential 33
  34. 34. And it works! 0.12 0.11 0.1 0.09 0.08 0.07 regret 0.06 ε- greedy, ε = 0.05 0.05 0.04 Bayesian Bandit with Gam m a- Norm al 0.03 0.02 0.01 0 0 100 200 300 400 500 600 700 800 900 1000 1100 n©MapR Technologies - Confidential 34
  35. 35. Video Demo©MapR Technologies - Confidential 35
  36. 36. The Code Select an alternative n = dim(k)[1] p0 = rep(0, length.out=n) for (i in 1:n) { p0[i] = rbeta(1, k[i,2]+1, k[i,1]+1) } return (which(p0 == max(p0))) Select and learn for (z in 1:steps) { i = select(k) j = test(i) k[i,j] = k[i,j]+1 } return (k) But we already know how to count!©MapR Technologies - Confidential 36
  37. 37. The Basic Idea We can encode a distribution by sampling Sampling allows unification of exploration and exploitation Can be extended to more general response models©MapR Technologies - Confidential 37
  38. 38.  Contact: – – @ted_dunning Slides and such: –©MapR Technologies - Confidential 39
  39. 39. MapR’s Innovations©MapR Technologies - Confidential 40
  40. 40. Thank You©MapR Technologies - Confidential 41