Data distribution in the cloud with Node.js
Messaging becomes data distribution and gets embedded event processing (not complex, made simple). Bending all the rules one benchmark at a time. Push Technology, Waratek and other things.

Published in: Technology
  • Transcript

    • 1. Data Distribution in the cloud with Node.js. Copyright Push Technology 2012.
    • 2. Introducing Push Technology (Twitter: @push_technology)
      • British startup. Founded in 2006.
      • ‘Last mile’ data distribution specialist.
      • Data-centric approach to messaging/caching.
      • Preferred by 6 of the top 10 online eGaming exchanges.
      • Growing fast. 400% year on year.
      • Focus: Better bang for your bytes!
    • 3. About me? (Darach@PushTechnology.com)
      • Distributed Systems / HPC guy.
      • Chief Scientist at Push Technology.
      • Alumnus of Motorola, IONA, Betfair, JPMC, StreamBase.
      • School: Trinity College Dublin. BA (Mod) Comp. Sci. and M.Sc. Networks & Distributed Systems.
      • Responds to: Guinness, Whisky.
    • 4. About me?
      • Favorite language: Erlang. Favorite bits? OTP behaviors and the bit syntax.
      • Least favorite language: Java. Paid to write this stuff. Love the JVM.
      • Liking Node a lot.
      • Small fast data guy. I work in microseconds, measure in nanoseconds. On my critical path, micro-benchmarking is a way of life.
    • 5. 1st clean room certified JVM in 10 years. Built in Dublin! It rocks.
      • [Diagram: Tenant #1, Tenant #2 … Tenant #N, each running Push Technology Diffusion on the Waratek Cloud VM for Java]
      • Benefits:
      • High density deployments.
      • Elastic. Scalable on demand.
      • Meterability: bandwidth and compute utilization.
      • Multi-tenant. Each tenant fully isolated.
    • 6. A US Cap Market second?
      • 174 microseconds round trip time rules out High Frequency Trading applications. Not on the critical path!
      • Source: Me, former life @StreamBase. http://slidesha.re/guZOVe
    • 7. Data Distribution. Wat?
    • 8. Traditional Messaging
      • [Diagram: Producers → ? → Consumers, with labels A, B, ba, bb]
      • Pros: Loosely coupled. All you can eat messaging patterns. Familiar.
      • Cons: No data model. Slinging blobs. Fast producer, slow consumer? Ouch. No data ‘smarts’. A blob is a blob.
    • 9. Invented yonks ago… Before the InterWebs. For ‘reliable’ networks. For machine to machine. Remember DEC Message Queues? That, basically. Vomit!
    • 10. When fallacies were simple
      • The network is reliable
      • Latency is zero
      • Bandwidth is infinite
      • There is one administrator
      • The network is secure
      • Transport cost is zero
      • The network is homogeneous
    • 11. Then in 1992, this happened: The phrase ‘surfing the internet’ was coined by Jean Armour Polly. First base.
    • 12. It grew, and it grew
    • 13. Then in 2007, this happened: The god phone. Surfing died. Touching happened. Second base unlocked.
    • 14. Then in 2007, this happened: So we took all the things and put them in the internet. Cloud happened. So we could touch all the things.
      • [Diagram: ‘Virtualize all the things’: Messaging, Apps, Hardware, Services, Skills, Specialties]
    • 15. Then in 2009, this happened: Ryan Dahl, basically. Tyrannically asynchronous. Devilishly event oriented. Amazoidingly non-blocking.
    • 16. It grew, and it grew. Like all the good things do.
    • 17. Stop. Fallacies? Reality:
      • The network is not reliable, nor is it cost free.
      • Latency is not zero, nor is it a democracy.
      • Bandwidth is not infinite, nor predictable, especially the last mile!
      • There is not only one administrator: trust and relationships are key.
      • The network is not secure, nor is the data that flows through it.
      • Transport cost is not zero, but what you don’t do is free.
      • The network is not homogeneous, nor is it smart.
    • 18. Look. What, How & Why?
      • What and How are what geeks do.
      • Why gets you paid.
      • Business Value and Trust dictate What and How.
      • Policies, Events and Content implement Business Value.
      • Science, basically. But think like a carpenter: measure twice, cut once.
    • 19. The Problem: The bird, basically. Immediately Inconsistent. But, Eventually Consistent … Maybe.
    • 20. Listen.
      • Every nuance comes with a set of tradeoffs.
      • Choosing the right ones can be hard, but it pays off.
      • Context and Environment are critical.
      • Break all the rules, one benchmark at a time.
      • Benchmark Driven Development FTW.
    • 21. Act.
      • You measured twice, right?
      • So get cutting!
      • Simples.
    • 22. Act. Telepathy? Telemetry!
      • [Diagram: Producers → Buffer Bloat → Consumers, with labels A, B, ba, bb]
      • Virtualize client queues?
      • Nuance: ‘See’ backlog, client affinity.
      • Tradeoff: GD harder :/
    • 23. Act. Stateless or Stateful Topics
      • [Diagram: Producers → ‘Is it a cache?’ → Consumers, with labels A, B, ba, bb]
      • Data one hop closer to consumers.
      • Good state? Touch it! Exploit it! Use it!
    • 24. Act. Finagle the data
      • [Diagram: Producers → ‘State!’ → Consumers, with Snapshot and Delta messages]
      • Last value cached. Tradeoff? Memory.
      • Snapshot on subscribe. Deltas thereafter.
    • 25. Act. ‘Smart data’
      • [Diagram: list A, B, C at t0; list A, C, D at t1]
      • Don’t repeat yourself. Send the changes, not the whole list, after the initial ‘snapshot’.
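The ‘snapshot on subscribe, deltas thereafter’ idea from slides 24 and 25 might be sketched as follows. This is only an illustration, not Diffusion's API: the `Topic` class and its `publish` and `subscribe` methods are hypothetical names, and the state is assumed to be a flat object of primitive values.

```javascript
// Hypothetical sketch: a stateful topic that caches the last value,
// sends a full snapshot on subscribe, and only changed fields afterwards.
class Topic {
  constructor() {
    this.state = {};        // last value cached (this is the memory tradeoff)
    this.subscribers = [];
  }

  subscribe(onMessage) {
    this.subscribers.push(onMessage);
    // A new subscriber receives the whole current state exactly once.
    onMessage({ type: 'snapshot', data: { ...this.state } });
  }

  publish(update) {
    // Keep only the fields whose value actually changed.
    const delta = {};
    for (const [key, value] of Object.entries(update)) {
      if (this.state[key] !== value) {
        delta[key] = value;
        this.state[key] = value;
      }
    }
    if (Object.keys(delta).length === 0) return; // nothing new: send nothing
    for (const notify of this.subscribers) {
      notify({ type: 'delta', data: delta });
    }
  }
}

const topic = new Topic();
topic.publish({ bid: 100, ask: 101 });

const received = [];
topic.subscribe((msg) => received.push(msg));
topic.publish({ bid: 100, ask: 102 }); // only `ask` changed
topic.publish({ bid: 100, ask: 102 }); // no change, nothing sent
console.log(JSON.stringify(received));
```

The subscriber sees one snapshot followed by one delta that carries only the changed field; the repeated publish produces no traffic at all.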
    • 26. Act. Behaviors
      • [Diagram: Producers → ‘The topic is the cloud!’ → Consumers, with labels A, B, ba, bb]
      • Extensible.
      • Nuance? Roll your own protocols.
      • Tradeoff? 3rd party code in the engine :/
    • 27. Data Distribution. Messaging remixed around:
      • Relevance: Queue depth for conflatable data should be 0 or 1. No more.
      • Responsiveness: Use HTTP/REST for things. Stream the little things.
      • Timeliness: It’s relative. M2M != M2H.
      • Context: Packed binary, deltas mostly, snapshot on subscribe.
      • Environment: Don’t send 1M 1 KB events to a mobile phone with 0.5 Mbps.
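The ‘queue depth 0 or 1’ rule for conflatable data can be illustrated with a small sketch. The `ConflatingQueue` class and its `offer` and `drain` methods are hypothetical names chosen for this example, and it assumes that each update is identified by a key (here a currency pair) and that only the latest value per key matters.

```javascript
// Hypothetical sketch of conflation: per key, hold at most one pending update.
class ConflatingQueue {
  constructor() {
    this.pending = new Map(); // key -> latest value not yet delivered
  }

  offer(key, value) {
    // A newer value replaces the stale one, so depth per key never exceeds 1.
    this.pending.set(key, value);
  }

  drain() {
    // Hand the consumer everything pending, then drop back to depth 0.
    const batch = [...this.pending.entries()];
    this.pending.clear();
    return batch;
  }
}

const queue = new ConflatingQueue();
queue.offer('EURUSD', 1.3012);
queue.offer('GBPUSD', 1.6101);
queue.offer('EURUSD', 1.3015); // replaces 1.3012 before it was ever sent
const batch = queue.drain();
console.log(batch);
```

A slow consumer that drains rarely still receives at most one message per key, which is the point of conflating rather than buffering.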
    • 28. An Example
      • [Diagram labels: Operations :> Tenants :> Gaming Live Internet Apps Finance QA + Dev + UAT]
    • 29. Either way? It’s about the data. Period. The rest (analysis, storage, transformation) is sugar.
    • 30. Sugar? Streams
      • [Diagram: stream operations, with labels w, w, S, C, Q, w, w]
      • Stream Operations:
      • Mapping. Change/enrich the data structurally.
      • Aggregation. A ‘window of’ data. E.g. a second’s worth.
      • Splitting & Filtering.
      • Combining multiple streams. E.g. temporal pattern matching.
      • Access/Store. E.g. CRUD, variable, file, …
    • 31. Sugar? Streams
      • [Diagram: stream operations, with labels w, w, S, C, Q, w, w]
      • Stream Operations:
      • Mapping. Just a function call in Node.js.
      • Aggregation. A ‘window of’ data. E.g. a second’s worth.
      • Splitting & Filtering. An expression or a set thereof.
      • Combining multiple streams. It depends. Can be ‘complex’.
      • Access/Store. Trivial.
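To make ‘mapping is just a function call’ and ‘filtering is an expression’ concrete, here is a minimal sketch using plain callbacks. The `map` and `filter` helpers and the trade event shape are assumptions made for the example; they do not come from the slides.

```javascript
// Hypothetical helpers: each stream operation wraps a downstream callback.
const map = (fn, next) => (event) => next(fn(event));
const filter = (predicate, next) => (event) => {
  if (predicate(event)) next(event);
};

const out = [];

// Pipeline: keep trades over 100 units, then enrich each with a notional value.
const pipeline = filter(
  (trade) => trade.qty > 100,
  map(
    (trade) => ({ ...trade, notional: trade.qty * trade.price }),
    (event) => out.push(event)
  )
);

pipeline({ symbol: 'ABC', qty: 50, price: 10 });  // filtered out
pipeline({ symbol: 'ABC', qty: 200, price: 10 }); // passes, gains `notional`
console.log(out);
```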
    • 32. Embedded Event Processing with Node.js. eep.js
    • 33. Introducing eep.js. What is eep.js?
      • Adds aggregate functions and window operations to Node.js.
      • 4 window types: tumbling, sliding, periodic, monotonic.
      • Node makes evented IO easy. So just add windows.
      • Fast. 8-40 million events per second (upper bound).
    • 34. eep.js: Tumbling Windows. What is a tumbling window?
      • [Diagram: events arriving at t0 … t11; each window runs init(), then x() per event, then emit() when it closes]
      • Every N events, give me an average of the last N events.
      • Does not overlap windows.
      • ‘Closing’ a window ‘emits’ a result (the average).
      • Closing a window opens a new window.
    • 35. eep.js: Aggregate Functions. What is an aggregate function?
      • A function that computes values over events.
      • The cost of calculation is amortized per event.
      • Just follow the above recipe.
      • Example: Aggregate 2M events (equity prices), send to GPU on emit, receive 2M options put/call prices as a result.
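An aggregate function in this style could look roughly like the sketch below. The recipe the slide refers to was on the slide image and is not in the transcript, so this is an assumption: the method names `init`, `accumulate` and `emit` simply mirror the init(), x() and emit() labels on the window diagrams and are not taken from the eep.js source.

```javascript
// Hypothetical aggregate function: a running mean.
function makeMean() {
  let sum = 0;
  let count = 0;
  return {
    // Called when a window opens: reset the state.
    init() { sum = 0; count = 0; },
    // Called once per event: constant work, so the cost is spread per event.
    accumulate(value) { sum += value; count += 1; },
    // Called when a window closes: produce the result.
    emit() { return count === 0 ? NaN : sum / count; }
  };
}

const mean = makeMean();
mean.init();
[2, 4, 6, 8].forEach((value) => mean.accumulate(value));
const meanResult = mean.emit();
console.log(meanResult); // 5
```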
    • 36. meh: Fumbling Windows
    • 37. Lesser Fumbling Windows
    • 38. Event Windows, tumbling.
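A count-based tumbling window, as described on slide 34, might be sketched like this. The `tumbling` function and the aggregate shape (`init`, `accumulate`, `emit`) are hypothetical and only follow the labels on the diagram; the real eep.js interface may differ.

```javascript
// Hypothetical mean aggregate, shaped after the init()/x()/emit() diagram labels.
function makeAverage() {
  let sum = 0;
  let count = 0;
  return {
    init() { sum = 0; count = 0; },
    accumulate(value) { sum += value; count += 1; },
    emit() { return sum / count; }
  };
}

// Hypothetical tumbling window: emits once every `size` events, no overlap.
function tumbling(aggregate, size, onEmit) {
  let seen = 0;
  aggregate.init();
  return function push(value) {
    aggregate.accumulate(value);
    seen += 1;
    if (seen === size) {
      onEmit(aggregate.emit()); // closing the window emits a result
      aggregate.init();         // and opens a new, non-overlapping window
      seen = 0;
    }
  };
}

const tumblingResults = [];
const pushTumbling = tumbling(makeAverage(), 4, (avg) => tumblingResults.push(avg));
for (let i = 1; i <= 8; i += 1) pushTumbling(i);
console.log(tumblingResults); // [ 2.5, 6.5 ]
```

Eight events with a window of four produce two results, one per closed window.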
    • 39. eep.js: Sliding Windows. What is a sliding window?
      • [Diagram: events arriving at t0 … t11; after init(), each x() opens a further overlapping window]
      • Like tumbling, except windows can overlap. But O(N²), so keep N small.
      • Every event opens a new window.
      • After N events, every subsequent event emits a result.
      • Like all windows, the cost of calculation is amortized over events.
    • 40. Event Windows, sliding.
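The sliding behaviour can be sketched with a simple buffer of the last N events. Note the substitution: this is not the overlapping-windows scheme the slide describes, just a plain buffer that recomputes the aggregate on every emit, which still shows why N should stay small. The `sliding` function is a hypothetical name, and the sketch assumes the first result is emitted as soon as N events have been seen.

```javascript
// Hypothetical sliding window over the last `size` events.
// Each emit recomputes over the buffer, so work per event grows with `size`.
function sliding(size, aggregateFn, onEmit) {
  const buffer = [];
  return function push(value) {
    buffer.push(value);
    if (buffer.length > size) buffer.shift(); // drop the oldest event
    if (buffer.length === size) onEmit(aggregateFn(buffer));
  };
}

const average = (values) => values.reduce((a, b) => a + b, 0) / values.length;

const slidingResults = [];
const pushSliding = sliding(3, average, (avg) => slidingResults.push(avg));
[1, 2, 3, 4, 5].forEach((value) => pushSliding(value));
console.log(slidingResults); // [ 2, 3, 4 ]
```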
    • 41. eep.js: Periodic Windows. What is a periodic window?
      • [Diagram: windows closing at t0, t1, t2, t3 …; each window runs init(), then x() per event, then emit()]
      • Driven by ‘wall clock time’ in milliseconds.
      • Not monotonic, natch. Beware of NTP.
    • 42. Event Windows, periodic.
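A wall-clock-driven window could be sketched as below. The `periodic` function is hypothetical, and the optional `now` parameter is an assumption added only so the example can be run deterministically; by default it reads `Date.now()`. In this sketch a window closes when the first event after the interval arrives, rather than on a timer.

```javascript
// Hypothetical periodic window: averages events per wall-clock interval.
function periodic(intervalMs, onEmit, now = Date.now) {
  let sum = 0;
  let count = 0;
  let openedAt = now();
  return function push(value) {
    const t = now();
    // If the wall clock steps backwards (e.g. an NTP adjustment), this
    // difference shrinks and the window simply stays open longer.
    if (t - openedAt >= intervalMs) {
      if (count > 0) onEmit(sum / count); // close the window, emit the average
      sum = 0;
      count = 0;
      openedAt = t;                       // open the next window
    }
    sum += value;
    count += 1;
  };
}

// Usage with a fake clock so the result does not depend on real time.
let fakeTime = 0;
const periodicResults = [];
const pushPeriodic = periodic(1000, (avg) => periodicResults.push(avg), () => fakeTime);
pushPeriodic(10);
fakeTime = 500;
pushPeriodic(20);
fakeTime = 1000;
pushPeriodic(30); // closes the window holding 10 and 20
fakeTime = 2100;
pushPeriodic(50); // closes the window holding 30
console.log(periodicResults); // [ 15, 30 ]
```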
    • 43. eep.js: Monotonic Windows. What is a monotonic window?
      • [Diagram: windows closing at t0, t1, t2, t3 … on a user-supplied clock; each window runs init(), then x() per event, then emit()]
      • Driven mad by ‘wall clock time’? Need a logical clock?
      • No worries. Provide your own clock! E.g. a vector clock.
    • 44. Event Windows, monotonic.
    • 45. Event Windows, monotonic.
    • 46. Event Windows, monotonic.
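‘Provide your own clock’ might look like the following sketch. It uses a simple counter as the logical clock rather than the vector clock the slide mentions, and both `CounterClock` and `monotonic` are hypothetical names; the clock interface eep.js actually expects is not shown in the transcript.

```javascript
// Hypothetical logical clock: it only advances when the caller ticks it.
class CounterClock {
  constructor() { this.time = 0; }
  tick() { this.time += 1; return this.time; }
  now() { return this.time; }
}

// Hypothetical monotonic window: closes after `width` ticks of the given clock
// and emits the maximum value seen in that window.
function monotonic(width, clock, onEmit) {
  let values = [];
  let openedAt = clock.now();
  return function push(value) {
    if (clock.now() - openedAt >= width) {
      if (values.length > 0) onEmit(Math.max(...values));
      values = [];
      openedAt = clock.now();
    }
    values.push(value);
  };
}

const clock = new CounterClock();
const monotonicResults = [];
const pushMonotonic = monotonic(2, clock, (max) => monotonicResults.push(max));

pushMonotonic(3);
clock.tick();
pushMonotonic(7);
clock.tick();
pushMonotonic(5); // two ticks elapsed: closes the window holding 3 and 7
clock.tick();
clock.tick();
pushMonotonic(1); // two more ticks: closes the window holding 5
console.log(monotonicResults); // [ 7, 5 ]
```

Because the clock never moves unless the application ticks it, the window boundaries are unaffected by wall-clock adjustments.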
    • 47. eep.js: Embedded Event Processing
      • Simple to use. Aggregate functions and windowed event processing.
      • Get it from GitHub/npm soon. Use it. Fork it.
      • Fast. CEP engines typically handle ~250K/sec.
      • For small N (the most common case) it is 34x to 200x faster than commercial CEP engines.
      • But, at a small price. Simple. No multi-dimensional, infinite or predicate windows.
      • Reduces a flood of events into a few in near real time.
      • Can handle 8-40 million events per second (max, on my laptop). YMMV.
      • Combinators may be added. [Ugh, if I need combinators]
    • 48. Le Performance?
      • [Chart: millions of events per second (log scale, 0.001 to 100) against fixed window size (2 to 16384); series: Java Tumbling, Java Sliding, Node Tumbling, Node Sliding]
    • 49. Performance? In perspective
      • A 1 producer, 1 consumer lock-free wait-free full duplex queue implementation on a 2.3GHz Intel Sandy Bridge can:
      • Distribute ~300M events between hyperthreads per second.
      • Distribute ~50M events between two hardware threads on two cores on the same physical die.
      • Distribute ~30M events between two hardware threads on two cores on separate physical dies.
      • With a fully lock-free wait-free system (and bypassing the operating system kernel) you can manage, maybe, ~1M 1K events/second.
      • There’s no point being capable of > 30M events/second on a thread if you’re going over a wire.
      • So, 8-40 million events/second in Node is a pleasant sufficiency.
      • It’s not the algorithm. It’s the mechanical sympathy, stoopid!
      • Lock-free wait-free concurrency is easier than lock based concurrency. Try it.
    • 50. Thank you
      • Thank you for listening.
      • Thank you for having me.
      • Thank you Push for the beer budget.
      • Le twitter: @darachennis
      • Expect eep.js in GitHub soon.
      • I’ll hashtag it #nodedublin
      • Thank you @Waratek geeks.
