
Reactive Application Design for High Volume Multi-dimensional Temporal Data Series

Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/1LyicGY.

Stuart Williams examines some of the problems faced (and solutions developed) during the development of a real-world application that uses Spring Integration, Spring Expression Language, Reactor (and the LMAX Disruptor) to process billions of events per day, and demonstrates some of the ways that the JVM can be (occasionally Unsafely) used to get there. Filmed at qconlondon.com.

With over 15 years of application development experience, Stuart Williams currently leads the development of RTI, a high-performance stream processing application for Pivotal.

Published in: Technology

  1. 1. Reactive Application Design For High-Volume Multi-dimensional Temporal Data Series @pidster
  2. 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/reactive-app-design-spring
  3. 3. Presented at QCon London www.qconlondon.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  4. 4. Stuart Williams == ‘Pid’ • Lead Engineer, RTI • SpringSource / VMware / Pivotal • A little OSS: Apache, Eclipse @pidster
  5. 5. Mumbling Isn’t a Sign of Laziness — It’s a Clever Data-Compression Trick Source: http://nautil.us/blog/mumbling-isnt-a-sign-of-lazinessits-a-clever-data_compression-trick
  6. 6. Practical • We built a real-time* application (RTI) – Using Spring (yes! I know!) – Share some lessons – Do some demos – Reveal a secret go-faster switch! * for some definition of real-time
  7. 7. Built with… • Spring IO Platform – Boot, Data, Integration, Reactor, AMQP, SpEL, Shell (and a little Groovy) • GemFire, RabbitMQ • C24
  8. 8. Questions… • Do you know your system load & input rates? – No. – Yes! • Up to 1k/s? • Up to 5k/s? • Up to 10k/s? • Up to 100k/s? • Up to 1M/s? • Above 1M/s?
  9. 9. Questions… • Heard of Spring Integration? – Tried it? – Used it in production? • Heard of Spring Reactor? – Tried it? – Used it in production?
  10. 10. DESIGN Some discussion about…
  11. 11. Design Goals & Challenges • High throughput – versus • Low latency • Enable low-impact analysis on live streams – We don’t know in advance what this will be… – User accessible API for analytics
  12. 12. Many whiteboards later…
  13. 13. BIG PICTURE! Show me the…
  14. 14. The Big Picture Ingester Ingest Grid Distribution Analytics Stream AMQP Metrics Firehose HTTP HTTP Databases Analytics Queries Queries - Unstructured - Structured Query, Adapt, React feedback loop Reactive expressions End User Applications Ingestion & Filtering Analytics & Distribution End-user / Consumers
  15. 15. Input Data Rates RTI • 100k/s baseline • ~120k/s daily peak • >1M/s annual peak Twitter* • 6k/s average • 9k/s daily peak • 30k/s large events *Source: @catehstn, twitter.com/catehstn/status/494918021358813184 OK, so Twitter’s internal fan-out, timeline access rates and storage problems are vastly different! (see also Redis…)
  16. 16. Load Characteristics • Low numbers of inbound connections • High rates, micro-bursts • Occasional peaks of nearly 2x, rare peaks of 10x • Variable payload size (200B – 300KB) • Internal fan-outs multiply event rates
  17. 17. More statistics… • 100k/s order of magnitude – 8,640,000,000 (per day) – An Integer-based counter will ‘roll over’ in ~6 hours at this rate • 400Mbps of raw data – 10Gbps NICs required to support traffic peaks – Logging! Verbose errors can fill a disk quickly – Queues backing up == #fail • Upcoming 10x existing rates!
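
The figures on slide 17 can be checked with a few lines of arithmetic. The sketch below assumes a perfectly steady 100k/s rate and treats 400Mbps as 400,000,000 bits per second; the class and method names are illustrative and do not come from the talk. A counter fed at a lower rate lasts proportionally longer.

```java
import java.time.Duration;

/** Back-of-envelope arithmetic for the "More statistics" slide (names are illustrative). */
final class RateMath {

    /** Events accumulated over one day at a steady rate. */
    static long eventsPerDay(long eventsPerSecond) {
        return eventsPerSecond * Duration.ofDays(1).getSeconds();
    }

    /** Time until a counter with the given maximum value overflows at a steady rate. */
    static Duration timeToOverflow(long maxValue, long eventsPerSecond) {
        return Duration.ofSeconds(maxValue / eventsPerSecond);
    }

    /** Average payload size in bytes implied by a bit rate and an event rate. */
    static long impliedAveragePayloadBytes(long bitsPerSecond, long eventsPerSecond) {
        return bitsPerSecond / 8 / eventsPerSecond;
    }

    public static void main(String[] args) {
        long rate = 100_000L; // assumed steady baseline from the slide
        System.out.println("events per day:      " + eventsPerDay(rate));
        System.out.println("int counter lasts:   " + timeToOverflow(Integer.MAX_VALUE, rate));
        System.out.println("long counter lasts:  "
                + timeToOverflow(Long.MAX_VALUE, rate).toDays() / 365 + " years");
        System.out.println("avg payload (bytes): " + impliedAveragePayloadBytes(400_000_000L, rate));
    }
}
```

At this rate a signed 32-bit counter runs out after 21,474 seconds, a little under six hours, while a `long` counter lasts for millions of years, which is why event counters at these volumes are normally declared as `long`.
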
  18. 18. REACTIVE APPLICATIONS? What’s all this fuss about…
  19. 19. Reactive Applications www.reactivemanifesto.org Responsive Resilient Scalable Event (or message) driven Depends on Depends on
  20. 20. Reactive Streams • Collaboration between key industry players
  21. 21. Reactive Streams: Specification • Semantics – Single document listing all rules – Open enough to allow for various patterns • 4 API interfaces • TCK to verify implementation behaviour
  22. 22. Reactive Streams github.com/reactive-streams org.reactivestreams.Processor org.reactivestreams.Publisher org.reactivestreams.Subscriber org.reactivestreams.Subscription
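
The four interfaces on slide 22 describe a demand-driven protocol: a subscriber tells the publisher how many elements it is ready for, and the publisher sends no more than that. The sketch below is not code from the talk. To stay runnable on a plain JDK (9 or later) it uses `java.util.concurrent.Flow`, whose nested `Publisher`, `Subscriber`, `Subscription` and `Processor` interfaces mirror the `org.reactivestreams` ones, and the JDK's `SubmissionPublisher`. The class names `BackpressureDemo` and `OneAtATime` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

final class BackpressureDemo {

    /** Subscriber that requests one element at a time, so it is never sent more than it asked for. */
    static final class OneAtATime<T> implements Flow.Subscriber<T> {
        final List<T> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);               // signal demand for the first element
        }
        @Override public void onNext(T item) {
            received.add(item);
            subscription.request(1);    // ask for the next element only after handling this one
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    /** Publishes 0..count-1 and returns what the subscriber received, in order. */
    static List<Integer> run(int count) {
        OneAtATime<Integer> subscriber = new OneAtATime<>();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 0; i < count; i++) {
                publisher.submit(i);    // blocks when the subscriber's buffer is full
            }
        }                               // close() delivers onComplete after pending items
        try {
            if (!subscriber.done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("subscriber did not complete in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

Requesting one element at a time is the simplest possible demand policy; a real subscriber would normally request in larger chunks to reduce signalling overhead.
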
  23. 23. REACTIVE STREAMS API A quick look at the…
  24. 24. Spring Reactor LMAX Disruptor – a RingBuffer reactor.bus.EventBus reactor.core.Dispatcher reactor.rx.Stream
  25. 25. REACTOR API A quick look at the…
  26. 26. Spring Integration Enterprise Integration Patterns See http://www.enterpriseintegrationpatterns.com/ by Hohpe & Woolf Messages Channels Endpoints (pipes & filters architecture)
  27. 27. SI Pipeline Example
  28. 28. SPRING INTEGRATION JAVA DSL A quick look at the…
  29. 29. Spring Integration Performance • 3.x – Take out all the SpEL • 4.x – 4.0 • Improved – 4.1 (Q4 2014) • Put back all the SpEL – 4.2 (late 2015) • Rather good
  30. 30. BIG PICTURE AGAIN Back to the…
  31. 31. The Big Picture Ingester Ingest Grid Distribution Analytics Stream AMQP Metrics Firehose HTTP HTTP Databases Analytics Queries Queries - Unstructured - Structured Query, Adapt, React feedback loop Reactive expressions End User Applications Ingestion & Filtering Analytics & Distribution End-user / Consumers Key Reactor usages
  32. 32. Reactor Usage • UDP/TCP Servers (or clients) • Outputs – batching • Dispatchers & Streams – Expression evaluation engine
  33. 33. Spring Integration + Reactor • Batching – Adaptive sizing • Routing – With … batching
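
Slide 33 mentions batching with adaptive sizing but the transcript does not say how sizes are chosen. The sketch below shows one possible policy, purely as an assumption: double the batch size while the queue backlog exceeds it, and halve it when the backlog falls below half of it. `AdaptiveBatcher` is a hypothetical class; it uses neither Spring Integration nor Reactor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical batcher: the sizing policy is an illustrative assumption, not RTI's. */
final class AdaptiveBatcher<T> {
    private final int minSize;
    private final int maxSize;
    private int batchSize;

    AdaptiveBatcher(int minSize, int maxSize) {
        if (minSize < 1 || maxSize < minSize) {
            throw new IllegalArgumentException("need 1 <= minSize <= maxSize");
        }
        this.minSize = minSize;
        this.maxSize = maxSize;
        this.batchSize = minSize;
    }

    /** Adjusts the batch size to the current backlog, then drains up to that many items. */
    List<T> nextBatch(Queue<T> queue) {
        int backlog = queue.size();
        if (backlog > batchSize) {
            batchSize = Math.min(maxSize, batchSize * 2);   // falling behind: take bigger batches
        } else if (backlog < batchSize / 2) {
            batchSize = Math.max(minSize, batchSize / 2);   // nearly idle: favour latency
        }
        List<T> batch = new ArrayList<>(batchSize);
        T item;
        while (batch.size() < batchSize && (item = queue.poll()) != null) {
            batch.add(item);
        }
        return batch;
    }

    int currentBatchSize() {
        return batchSize;
    }
}
```

Larger batches amortise per-send overhead when the system is under load, and smaller ones keep latency low when it is quiet, which matches the throughput versus latency tension named in the design goals.
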
  34. 34. SPRING INTEGRATION + REACTOR An example or two…
  35. 35. TEMPORAL DATA SERIES? But what about the…
  36. 36. Temporal Data RingBuffer New DataOld Data Expressions
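
Slide 36 shows a ring buffer in which new data displaces old data while expressions are evaluated against the buffer's contents. A minimal single-threaded sketch of that idea follows. It is a plain array-backed window, not the LMAX Disruptor, and the "expression" is an ordinary Java function standing in for a SpEL expression; `TemporalWindow` and its methods are hypothetical names.

```java
import java.util.function.ToDoubleFunction;
import java.util.stream.DoubleStream;

/** Fixed-capacity window over a series: each new sample overwrites the oldest one. */
final class TemporalWindow {
    private final double[] samples;
    private long written;   // total samples ever written; a long, so it will not roll over

    TemporalWindow(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        samples = new double[capacity];
    }

    /** Appends a sample, overwriting the oldest once the window is full. */
    void add(double value) {
        samples[(int) (written % samples.length)] = value;
        written++;
    }

    /** Number of samples currently held (at most the capacity). */
    int size() {
        return (int) Math.min(written, samples.length);
    }

    /** Copy of the window contents, ordered oldest to newest. */
    double[] snapshot() {
        int n = size();
        double[] out = new double[n];
        long start = written - n;
        for (int i = 0; i < n; i++) {
            out[i] = samples[(int) ((start + i) % samples.length)];
        }
        return out;
    }

    /** Evaluates an expression (here a plain function) against the current window. */
    double evaluate(ToDoubleFunction<DoubleStream> expression) {
        return expression.applyAsDouble(DoubleStream.of(snapshot()));
    }
}
```

For example, after adding 1, 2, 3, 4 and 5 to a window of capacity 3, `snapshot()` holds 3, 4, 5 and `evaluate(DoubleStream::sum)` yields 12.0. The sketch is not thread-safe; a concurrent version would need the sequencing guarantees that a Disruptor-style ring buffer provides.
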
  37. 37. SECRET ‘GO-FASTER’ SWITCH? And there was something about a…
  38. 38. Spring Expression Language (SpEL) • Powerful expression language • Supports querying and manipulating an object graph at runtime • Similar to Unified EL – Additional features include method invocation and string templating
  39. 39. SpEL is slow!
  40. 40. Enter SpEL Compilation • 3 modes – Immediate – Mixed – Off • -Dspring.expression.compiler.mode=mixed
  41. 41. SPEL DEMO And now for a quick
  42. 42. SpEL is fast!
  43. 43. Fin And relax
  44. 44. QUESTIONS? And now for some… @pidster @smaldini
