Tech mesh london 2012

  1. TechMesh London, December 2012. Copyright Push Technology 2012
  2. About me?
     • Distributed Systems / HPC guy.
     • Chief Scientist at Push Technology
     • Alumnus of Motorola, IONA, BeLair, JPMC, StreamBase.
     • School: Trinity College Dublin.
       - BA (Mod). Comp. Sci.+
       - M.Sc. Networks & Distributed Systems
     • Responds to: Guinness, Whisky
     Darach@PushTechnology.com
  3. Conversational Big Data
  4. Big Data. The most important V is Value.
     • 4Vs: Volume, Velocity, Variety, Variability
     • “It’s the connections that matter most” - David Evans, Cisco
  5. Connect all the things
  6. An Internet Minute. 10k, 1ms. 1k, 100us.
  7. The Last Mile is HARD
     • Buffer Bloat. Network Saturation. Battery!
     [Chart: Download Bandwidth, expected vs. measured download (Mbps), Sep-06 to Feb-07]
     [Chart: Latency, measured vs. expected (ms), log scale, Sep-06 to Feb-07; linear fit y = -1.1667x + 45703, R² = 0.02496]
  8. Conversations
     • M2M: Traditional Messaging. MQTT, AMQP
     • M2H: Bidirectional Real Time Data Distribution. WebSockets, W3C Server Sent Events
  9. Traditional Messaging
     [Diagram: producers (A, B), consumers (ba, bb), with a "?" between them]
     Pros
     • Loosely coupled.
     • All you can eat messaging patterns
     • Familiar
     Cons
     • No data model. Slinging blobs
     • Fast producer, slow consumer? Ouch.
     • No data ‘smarts’. A blob is a blob.
  10. Invented yonks ago…
     • Before the InterWebs
     • For ‘reliable’ networks
     • For machine to machine
     • Remember DEC Message Queues? - That basically. Vomit!
  11. When fallacies were simple
     - The network is reliable
     - Latency is zero
     - Bandwidth is infinite
     - There is one administrator
     - Topology does not change
     - The network is secure
     - Transport cost is zero
     - The network is homogeneous
  12. Then in 1992, this happened:
     • The phrase ‘surfing the internet’ was coined by Jean Armour Polly.
     • First SMS sent
     • First base.
  13. It grew, and it grew
  14. Then in 2007, this happened: The god phone:
     • Surfing died. Touching happened.
     • Second base unlocked.
  15. Then in 2007, this happened: So we took all the things and put them in the internet:
     • Cloud happened.
     • So we could touch all the things.
     • They grew and they grew like all the good things do.
     [Diagram: Virtualize all the things: Messaging, Apps, Hardware, Services, Skills, Specialties]
  16. Then in 2007, this happened: And it grew and it grew.
     • Like all the good things do.
     [Diagram: Virtualize all the things: Messaging, Apps, Hardware, Services, Skills, Specialties]
  17. So we scaled all the things
     • Big data is fundamentally about extracting value, understanding implications, gaining insight so we can learn and improve continuously.
     • It’s nothing new.
     [Graphic: big DATA]
  18. But, problem: The bird, basically.
     • Immediately Inconsistent. But, Eventually Consistent … Maybe. (IIBECM)
     • Humans are grumpy
  19. Stop.
     - The network is not reliable, nor is it cost free.
     - Latency is not zero, nor is it a democracy.
     - Bandwidth is not infinite, nor predictable, especially the last mile!
     - There is not only one administrator: trust, relationships are key.
     - Topology does change. It should, however, converge eventually.
     - The network is not secure, nor is the data that flows through it.
     - Transport cost is not zero, but what you don’t do is free.
     - The network is not homogeneous, nor is it smart.
  20. Look.
     - What and How are what geeks do.
     - Why gets you paid
     - Business Value and Trust dictate What and How
     - Policies, Events and Content implement Business Value
     - Science basically. But think like a carpenter:
     - Measure twice. Cut once.
  21. Listen.
     - Every nuance comes with a set of tradeoffs.
     - Choosing the right ones can be hard, but it pays off.
     - Context, Environment are critical
     - Break all the rules, one benchmark at a time.
     - Benchmark Driven Development
  22. Action: Data Distribution. Messaging remixed around:
     • Relevance - Queue depth for conflatable data should be 0 or 1. No more
     • Responsiveness - Use HTTP/REST for things. Stream the little things
     • Timeliness - It’s relative. M2M != M2H.
     • Context - Packed binary, deltas mostly, snapshot on subscribe.
     • Environment - Don’t send 1M 1K events to a mobile phone with 0.5 Mbps.
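
The "Environment" point is easy to check with back-of-envelope arithmetic. The sketch below assumes "1M 1K events" means 1,000,000 events of 1,000 bytes each, that 0.5 Mbps means 500,000 bits per second, and that protocol overhead is zero; the function name is made up for illustration.

```python
def transfer_seconds(events: int, bytes_per_event: int, link_bits_per_second: int) -> float:
    """Seconds needed to push all payloads over the link, ignoring protocol overhead."""
    total_bits = events * bytes_per_event * 8
    return total_bits / link_bits_per_second


# Assumed readings of the slide's figures (see the text above).
seconds = transfer_seconds(1_000_000, 1_000, 500_000)
print(f"{seconds:.0f} s, about {seconds / 3600:.1f} hours")
```

Under those assumptions the phone needs roughly four and a half hours of sustained download, which is why relevance and conflation matter before anything reaches the last mile.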
  23. Action. Virtualize Client Queues
     [Diagram: producers (A, B), consumers (ba, bb); label: Buffer Bloat]
     • Nuance: Client telemetry.
     • Tradeoff: Durable subscriptions harder
  24. Action. Add data caching
     [Diagram: producers (A, B), consumers (ba, bb), cached values (x); label: Is it a cache?]
     • One hop closer to the edge …
  25. Action. Exploit data structure
     [Diagram: producers (A, B), consumers (ba, bb); labels: Snapshot, Delta, State!]
     • Snapshot recovery.
     • Deltas or Changes mostly.
     • Conserves bandwidth
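
A minimal sketch of "snapshot on subscribe, deltas mostly", assuming topic state is a flat key/value map. The `Topic` and `Client` classes and their method names are hypothetical and are not taken from Diffusion or any other product.

```python
class Topic:
    """Holds current state; new subscribers get a snapshot, existing ones get deltas."""

    def __init__(self):
        self.state = {}
        self.subscribers = []

    def subscribe(self, client):
        self.subscribers.append(client)
        client.on_snapshot(dict(self.state))  # full copy, sent once per subscription

    def publish(self, update):
        # Only fields that are new or whose value changed go on the wire.
        delta = {k: v for k, v in update.items()
                 if k not in self.state or self.state[k] != v}
        if not delta:
            return
        self.state.update(delta)
        for client in self.subscribers:
            client.on_delta(delta)


class Client:
    def __init__(self):
        self.view = {}
        self.received = []  # log of what was sent to this client

    def on_snapshot(self, snapshot):
        self.view = dict(snapshot)
        self.received.append(("snapshot", dict(snapshot)))

    def on_delta(self, delta):
        self.view.update(delta)
        self.received.append(("delta", dict(delta)))


topic = Topic()
topic.publish({"bid": 100, "ask": 101})
late = Client()
topic.subscribe(late)                     # late joiner recovers from the snapshot
topic.publish({"bid": 100, "ask": 102})   # only "ask" changed
print(late.received)
```

The late subscriber recovers full state from one snapshot, and the second publish sends only the changed field, which is where the bandwidth saving comes from.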
  26. Action. Behaviors
     [Diagram: producers (A, B), consumers (ba, bb); label: The topic is the cloud!]
     • Extensible.
     • Nuance? Roll your own protocols.
     • Tradeoff? 3rd party code in the engine :/
  27. Action. Structural Conflation
     [Diagram: producers (A, B), consumers (ba, bb); label: Current data!]
     • Ensures only current + consistent data is distributed.
     • Actively soaks up bloat.
     • Extensible!
  28. Action. Structural Conflation
     [Diagram: queued updates C2, A2, C1, B1, A1 labelled 5, 4, 3, 2, 1. Replace (R) yields C2, B1, A2 labelled 5, 1, 4: Current. Merge (M+) yields 8, 1, 5, where 8 = C2 + C1 = 5 + 3 and 5 = A2 + A1 = 4 + 1: Current + Consistent.]
     Replace
     • Replace/Overwrite ‘A1’ with ‘A2’
     • ‘Current data right now’
     • Fast
     Merge
     • Merge A1, A2 under some operation.
     • Operation is pluggable
     • Tunable consistency
     • Performance f(operation)
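
Replace and merge conflation can be sketched as a queue that holds at most one pending value per key. The class below is an illustrative assumption, not Diffusion's implementation, and the sample values are chosen so the merged totals match the slide's 4 + 1 and 5 + 3.

```python
import operator


class ConflatingQueue:
    """Per-client queue holding at most one pending value per key."""

    def __init__(self, merge=None):
        # merge=None means replace; otherwise merge(old, new) combines pending values.
        self.merge = merge
        self.pending = {}  # dicts keep insertion order, so keys stay in arrival order

    def put(self, key, value):
        if key in self.pending and self.merge is not None:
            value = self.merge(self.pending[key], value)
        self.pending[key] = value

    def drain(self):
        """Hand over everything pending and start empty again."""
        items = list(self.pending.items())
        self.pending.clear()
        return items


updates = [("A", 1), ("B", 2), ("C", 3), ("A", 4), ("C", 5)]

replace_q = ConflatingQueue()
merge_q = ConflatingQueue(merge=operator.add)
for key, value in updates:
    replace_q.put(key, value)
    merge_q.put(key, value)

print(replace_q.drain())  # latest value per key
print(merge_q.drain())    # pending values combined per key
```

Five queued updates collapse to three in both modes. Replace keeps only the latest value per key, while merge trades some speed for a pluggable combining operation.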
  29. Benchmark Driven Development. ‘Measuring Embedded CEP’
  30. Performance is not a number
  31. Benchmark Models
     [Diagram: Box A, Box B]
     Throughput (Worst case)
     • Ramp clients continuously
     • 100 messages per second per client
     • Payload: 125 .. 2000 bytes
     • Message style vs with conflation
     Latency (Best case)
     • Really simple. 1 client
     • Ping – Pong. Measure RTT time.
     • Payload: 125 .. 2000 bytes
  32. Throughput. Non-conflated
     • Cold start server
     • Ramp 750 clients ‘simultaneously’ at 5 second intervals
     • 5 minute benchmark duration
     • Clients onboarded linearly until IO (here) or compute saturation occurs.
     • What the ‘server’ sees
  33. Throughput. Non-conflated
     • What the ‘client’ sees
     • At and beyond saturation of some resource?
     • Things break!
     • New connections fail. Good.
     • Long established connections ok.
     • Recent connections timeout and client connections are dropped. Good.
     • Diffusion handles smaller message sizes more gracefully
     • Back-pressure ‘waveform’ can be tuned out. Or, you could use structural conflation!
  34. Throughput. Replace-conflated
     • Cold start server
     • Ramp 750 clients ‘simultaneously’ at 5 second intervals
     • 5 minute benchmark duration
     • Clients onboarded linearly until IO (here) or compute saturation occurs.
     • Takes longer to saturate than non-conflated case
     • Handles more concurrent connections
     • Again, ‘server’ view
  35. Throughput. Replace-Conflated
     • What the ‘client’ sees
     • Once saturation occurs Diffusion adapts actively by degrading messages per second per client
     • This is good. Soak up the peaks through fairer distribution of data.
     • Handles spikes in number of connections, volume of data or saturation of other resources using a single load adaptive mechanism
  36. Throughput. Stats
     • Data / (Data + Overhead)?
       - 1.09 GB/sec payload at saturation.
       - 10GbE offers a theoretical maximum of 1.25 GB/sec.
       - Ballpark 87.2% of traffic represents business data.
     • Benefits?
       - Optimize for business value of distributed data.
       - Most real-world high-frequency real-time data is recoverable
       - Stock prices, Gaming odds
       - Don’t use conflation for transactional data, natch!
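
The 87.2% figure follows directly from the two numbers on the slide. A quick check, assuming the line rate is taken as 10 Gbit/s divided by 8 with no allowance for framing:

```python
payload_gb_per_s = 1.09        # measured payload at saturation (from the slide)
line_rate_gb_per_s = 10 / 8    # 10 Gbit/s Ethernet expressed in GB/s
efficiency = payload_gb_per_s / line_rate_gb_per_s
print(f"{efficiency:.1%}")
```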
  37. Latency (Best case)
     • Cold start server
     • 1 solitary client
     • 5 minute benchmark duration
     • Measure ping-pong round trip time
     • Results vary by network card make, model, OS and Diffusion tuning.
  38. Latency. Round Trip Stats
     • Sub-millisecond at the 99.99th percentile
     • As low as 15 microseconds
     • On average significantly sub 100 microseconds for small data
  39. Latency. Single Hop Stats
     • Sub 500 microseconds at the 99.99th percentile
     • As low as 8 microseconds
     • On average significantly sub 50 microseconds for small data
  40. Latency in perspective
     [Scale: 2.4 us, 5.0 us, 8.0 us, 50.0 us; markers at ~125 bytes (avg) and ~1000 bytes (avg)]
     • 2.4 us. A tuned benchmark in C with low latency 10GbE NIC, with kernel bypass, with FPGA acceleration
     • 5.0 us. A basic Java benchmark – as good as it gets in Java
     • Diffusion is measurably ‘moving to the left’ release on release
     • We’ve been actively tracking and continuously improving since 4.0
  41. Embedded Event Processing
  42. CEP, basically
     [Diagram: S, C, Q, with windows (w)]
     What is eep.js?
     • Add aggregate functions and window operations to Node.js
     • 4 window types: tumbling, sliding, periodic, monotonic
     • Node makes evented IO easy. So just add windows.
     • Fast. 8-40 million events per second (upper bound).
  43. Tumbling Windows
     [Diagram: init(), then x() per event, then emit(), repeated for successive windows along t0 .. t11]
     What is a tumbling window?
     • Every N events, give me an average of the last N events
     • Does not overlap windows
     • ‘Closing’ a window, ‘Emits’ a result (the average)
     • Closing a window, Opens a new window
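
The tumbling behaviour described above can be sketched in a few lines. This is an illustration in Python rather than eep.js code, and the class and callback names are assumptions.

```python
class TumblingAverage:
    """Emits the average of every N events; windows never overlap."""

    def __init__(self, size, on_emit):
        self.size = size
        self.on_emit = on_emit
        self.events = []

    def push(self, value):
        self.events.append(value)
        if len(self.events) == self.size:
            # Closing the window emits its result...
            self.on_emit(sum(self.events) / self.size)
            # ...and opens a fresh, empty window.
            self.events = []


results = []
window = TumblingAverage(4, results.append)
for value in range(1, 9):
    window.push(value)
print(results)
```

Eight events with N = 4 produce exactly two results, one per closed window.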
  44. Aggregate Functions
     What is an aggregate function?
     • A function that computes values over events.
     • The cost of calculations is amortized per event
     • Just follow the above recipe
     • Example: Aggregate 2M events (equity prices), send to GPU on emit, receive 2M options put/call prices as a result.
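
The recipe itself was on the slide image and is not in the transcript. The window diagrams label the calls init(), x() and emit(), so the sketch below assumes an aggregate is an object with those three steps, with `accumulate` standing in for x(). The names and the `tumble` driver are hypothetical, not the eep.js API.

```python
class Mean:
    """Constant work per event: keep a running sum and count, divide on emit."""

    def init(self):
        self.total = 0.0
        self.count = 0

    def accumulate(self, value):
        self.total += value
        self.count += 1

    def emit(self):
        return self.total / self.count if self.count else None


class Max:
    def init(self):
        self.best = None

    def accumulate(self, value):
        if self.best is None or value > self.best:
            self.best = value

    def emit(self):
        return self.best


def tumble(aggregate, size, events):
    """Drive any aggregate through non-overlapping windows of `size` events."""
    results = []
    aggregate.init()
    seen = 0
    for value in events:
        aggregate.accumulate(value)
        seen += 1
        if seen == size:
            results.append(aggregate.emit())
            aggregate.init()  # open the next window
            seen = 0
    return results


print(tumble(Mean(), 2, [1, 3, 5, 7]))
print(tumble(Max(), 2, [1, 3, 7, 5]))
```

Because each event only touches running state, the cost is spread evenly across events and the window never has to store them.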
  45. meh: Fumbling Windows
  46. Lesser Fumbling Windows
  47. Event Windows, tumbling.
  48. Sliding Windows
     [Diagram: init(), then overlapping windows each receiving x() per event along t0 .. t11]
     What is a sliding window?
     • Like tumbling, except can overlap. But O(N²), keep N small.
     • Every event opens a new window.
     • After N events, every subsequent event emits a result.
     • Like all windows, cost of calculation amortized over events
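
A naive sliding window, sketched under the assumption that the first result is emitted once N events have arrived. It re-scans the whole window on every event, so the work per event grows with N, which is the cost the slide warns about.

```python
from collections import deque


def sliding_mean_naive(events, size):
    """Mean over the last `size` events, recomputed from scratch on every event."""
    window = deque(maxlen=size)  # the oldest event falls off automatically
    results = []
    for value in events:
        window.append(value)
        if len(window) == size:
            results.append(sum(window) / size)  # touches all N events each time
    return results


print(sliding_mean_naive([1, 2, 3, 4, 5], 3))
```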
  49. Event Windows, sliding.
  50. Periodic Windows
     [Diagram: init(), then x() per event, then emit(), repeated for successive windows along t0 .. t3]
     What is a periodic window?
     • Driven by ‘wall clock time’ in milliseconds
     • Not monotonic, natch. Beware of NTP
  51. Event Windows, periodic.
  52. Monotonic Windows
     [Diagram: init(), then x() per event, then emit(), repeated for successive windows along t0 .. t3, each labelled "my"]
     What is a monotonic window?
     • Driven mad by ‘wall clock time’? Need a logical clock?
     • No worries. Provide your own clock! E.g. Vector clock
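
A sketch of a window driven by a caller-supplied clock. The `ClockedWindow` and `LogicalClock` names are made up for illustration, and a simple counter stands in for the vector clock the slide mentions.

```python
class LogicalClock:
    """A counter that only moves forward when the application says so."""

    def __init__(self):
        self.value = 0

    def advance(self):
        self.value += 1

    def __call__(self):
        return self.value


class ClockedWindow:
    """Sums events and emits once the supplied clock has advanced by `interval`."""

    def __init__(self, interval, clock, on_emit):
        self.interval = interval
        self.clock = clock
        self.on_emit = on_emit
        self.opened_at = clock()
        self.total = 0

    def push(self, value):
        self.total += value

    def tick(self):
        now = self.clock()
        if now - self.opened_at >= self.interval:
            self.on_emit(self.total)
            self.total = 0
            self.opened_at = now  # the next window opens at the current reading


clock = LogicalClock()
emitted = []
window = ClockedWindow(2, clock, emitted.append)
for value in [1, 2, 5]:
    window.push(value)
    clock.advance()
    window.tick()
print(emitted)
```

Passing a wall-clock function such as `time.time` as `clock` would give periodic-style behaviour, with the caveat from the periodic slide that wall-clock time can jump when NTP adjusts it.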
  53. Event Windows, monotonic.
  54. Event Windows, monotonic.
  55. Event Windows, monotonic.
  56. EEP for Node.js (eep.js). Embedded Event Processing:
     • Simple to use. Aggregate functions and windowed event processing.
     • npm install eep. Use it. Fork it. I’m not a JS guy, I’m learning because of Node.js.
     • Fast. CEP engines typically handle ~250K/sec.
     • For small N (most common) is 34x - 200x faster than commercial CEP engines.
     • But, at a small price. Simple. No multi-dimensional, infinite or predicate windows
     • Reduces a flood of events into a few in near real time
     • Can handle 8-40 million events per second (max, on my laptop). YMMV.
     • Combinators may be added. [Ugh, if I need combinators]
  57. Le Old Performance?
     [Chart: millions of events per second (log scale, 0.001 to 1000) vs. fixed window size (2 to 16384); series: Java Tumbling, Java Sliding, Node Tumbling, Node Sliding]
  58. One More Thing …
     • 1 tablespoon of Mechanical Sympathy
     • 1 dash of revisiting first principles
     • 1 simple observation
     • Sliding Windows. O(N²) <- Wat?
     • … It’s the Algorithm, Stoopid!
  59. It’s the Algorithm, Stoopid!
  60. It’s the Algorithm, Stoopid!
  61. Reversible Aggregation
     • Takes us from O(N²) to O(N) for Sliding windows
  62. Compensating Aggregates
     [Diagram: init(), then x() per event in a do { … } while (…) loop over a single sliding window, with compensate() applied to the oldest event, along t0 .. t11]
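
A sketch of a compensating aggregate for a sliding sum, assuming "compensate" means undoing the oldest event's contribution as it leaves the window. The class is illustrative rather than eep.js code.

```python
from collections import deque


class SlidingSum:
    """Sliding window sum with constant work per event."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.total = 0

    def accumulate(self, value):
        self.window.append(value)
        self.total += value

    def compensate(self):
        # Reverse the oldest event instead of re-scanning the window.
        self.total -= self.window.popleft()

    def push(self, value):
        if len(self.window) == self.size:
            self.compensate()
        self.accumulate(value)
        return self.total if len(self.window) == self.size else None


s = SlidingSum(3)
print([s.push(v) for v in [1, 2, 3, 4, 5]])
```

This only works for aggregates whose effect can be reversed, such as sum, count or mean. Something like max cannot be undone by subtracting the departing event and needs a different structure.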
  63. Le New Performance?
  64. EEP looking forward?
     • Streams & Pipes
     • Eep for Erlang, Java, …
     • Infinite windows (never open, never close, always ‘open’)
     • Peep – Data parallel eep
     • Continuous Streaming Map Reduce
     • Deep – Data analytics with Peep
     • Because I’ve spent 4+ weeks up to my fourth arse in bash, sed, awk, excel and gnuplot.
     Eep all the things, basically.
  65. Questions?
     • Thank you for listening to, having me
     • Le twitter: @darachennis
     • Github: https://github.com/darach/eep-js
     • npm install eep
     • EEP.erl ‘soon’
     • Fork it, Port it, Enjoy it!
     Darach@PushTechnology.com
