How Formula One are turbo-charging telemetry with real-time processing - Connect Europe 2017

Juan Manuel Ventura, speaking at Connect Europe 2017, reveals how Spindox has collaborated with Couchbase to build real time, scalable solutions for on demand analytics.

Visit our website for more information: https://www.couchbase.com

Published in: Software

  1. REAL-TIME EVENT STREAM PROCESSING WITH COUCHBASE
  2. ABOUT SPINDOX
     • ~500 employees
     • 5 offices in Italy, 3 subsidiaries abroad
     • Spindox Labs as R&D hub in Trento (IT)
     • Couchbase’s strongest partner in the country
  3. ABOUT ME
     • Digital & Integration specialist who enjoys unpacking complex systems into manageable pieces
     • Passionate about databases
     • Focused on Digital Transformation, helping companies free their data from technical constraints and create engaging experiences
  4. THE CHALLENGE
     - A customer obsessed with performance and secrecy
     - Build a real-time solution for on-demand analytics, able to scale from a laptop to a cross-region datacenter
     - Guarantee data availability and consistency while continuously ingesting data
     - Accommodate different data models and access patterns while keeping a simple architecture and a single database
     - Handle exceptions gracefully and consistently, e.g. out-of-order data, replayability, data immutability, etc.
  5. THE CHALLENGE – TIME SERIES ON COUCHBASE
     Time Series is a hot topic, and a different and very specific paradigm
  6. THE GOALS: PHASE 1
     - Overtake the current InfluxDB-based offline/test bed system, plus:
     - Real-time data streaming, pushing events from server to clients for on-demand analytics
     - Real-time stream processing: a rules engine that lets the end user program the system, extending functionality such as alerting and calculations and enabling new behaviors
     - Model Couchbase as a Time Series, an RDBMS and an Object Database all at once, leveraging its unique features, like XDCR, N1QL and Memory First Architecture, among others
  7. THE GOALS: PHASE 2
     - Apply this technology to replace the current F1 Telemetry System
     - Handle 30,000+ channels plus n virtual (calculated) channels
     - Handle variable sample frequencies from 1 Hz to 100 kHz
     - Handle full nanosecond resolution
     - Work locally (at the race location) while streaming data to the headquarters with a combination of application-level streaming (real time, but ephemeral) backed by XDCR (delayed, but persisted)
  8. WHY COUCHBASE, THE NEED FOR…
     - Support high performance, asymmetrical scalability, high availability, distributed replication and low cost of operation
     - A cluster-level Memory First Architecture (In Memory Data Grid)
     - Support different data models from a single database
       - Time Series for data points (maps)
       - Object Model (document) for metadata
       - Relational for master data
  9. WHY COUCHBASE, THE NEED FOR…
     - Support different data access patterns from a single database
       - Key-Value for accessing maps deterministically
       - N1QL for issuing any kind of query on master data and metadata
       - MapReduce for data aggregation
         • [x]: ms, ss, mm, hh…
         • [y]: count, min, max, mean, mode, stddev…
       - Full Text Search for log messages and non-numerical values
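The aggregation axes listed on the slide above ([x] time buckets, [y] statistics) can be illustrated with a small map/reduce sketch. This is plain Python standing in for a MapReduce view, not Couchbase's view API. The function names, the reading of `ms`/`ss`/`mm`/`hh` as milliseconds, seconds, minutes and hours, and the use of population standard deviation are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, multimode, pstdev

# Assumed meaning of the slide's [x] units, expressed in nanoseconds.
NS_PER_UNIT = {
    "ms": 1_000_000,
    "ss": 1_000_000_000,
    "mm": 60 * 1_000_000_000,
    "hh": 3_600 * 1_000_000_000,
}


def map_event(event, unit):
    """Map step: emit (bucket_start_ns, value) for one [x, y] event."""
    x_ns, y = event
    width = NS_PER_UNIT[unit]
    return (x_ns // width) * width, y


def reduce_bucket(values):
    """Reduce step: compute the [y] statistics for one time bucket."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "mode": multimode(values)[0],  # first mode if there are ties
        "stddev": pstdev(values),      # population standard deviation
    }


def aggregate(events, unit):
    """Group events into time buckets, then reduce each bucket."""
    buckets = defaultdict(list)
    for event in events:
        key, value = map_event(event, unit)
        buckets[key].append(value)
    return {key: reduce_bucket(vals) for key, vals in sorted(buckets.items())}
```

For example, `aggregate([(0, 1.0), (500_000_000, 3.0), (1_200_000_000, 3.0)], "ss")` yields two one-second buckets, the first holding two samples and the second holding one.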
  10. THE SOLUTION: ARGUS
  11. THE SOLUTION: ARGUS
      - Everything is seen and treated by the system as a stream of events
      - An event is just a value observed at a certain point in time [x, y]
      - A sequence of events with the same metadata makes a series
      - Both real-time data and historical data (K/V + N1QL queries)
  12. THE SOLUTION: ARGUS
      - Data segregation by access pattern, with different buckets & indexes
        - Master Data & Series: GSI + N1QL queries
        - Events (points): Key-Value, with keys calculated deterministically
        - Logs: Full Text Search for matching patterns
  13. THE SOLUTION: ARGUS
      - To consume data (from any client), the application domain complexity is reduced to a two-step operation
        1. Isolate the target series by a combination of N1QL and a deterministic algorithm for hash calculation (even from the front end, without a server trip)
        2. Subscribe to the resulting series to receive both real-time and historical event streams (data pushed from the back end, plus on-demand queries when needed)
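The slides do not show how Argus derives its keys, so the following is only a sketch of what a deterministic key calculation could look like: a series identifier hashed from canonicalized metadata, combined with a time-bucket start. The key layout (`<series>::<bucket>`), the one-second bucket width, the truncated SHA-256 digest and all function names are hypothetical.

```python
import hashlib
import json


def series_id(metadata: dict) -> str:
    """Hash canonicalized metadata so equal metadata always gives the same ID."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def event_key(metadata: dict, timestamp_ns: int, bucket_ns: int = 1_000_000_000) -> str:
    """Key of the document holding the events around timestamp_ns (hypothetical layout)."""
    bucket_start = (timestamp_ns // bucket_ns) * bucket_ns
    return f"{series_id(metadata)}::{bucket_start}"


def keys_for_range(metadata: dict, start_ns: int, end_ns: int,
                   bucket_ns: int = 1_000_000_000) -> list:
    """All keys covering [start_ns, end_ns], computed without any lookup."""
    sid = series_id(metadata)
    first = (start_ns // bucket_ns) * bucket_ns
    return [f"{sid}::{bucket}" for bucket in range(first, end_ns + 1, bucket_ns)]
```

Because the keys depend only on the metadata and the time range, any client that knows both can compute them locally and fetch the documents by key, which is the property the slide describes as working "without a server trip".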
  14. ARCHITECTURE
      - Argus is a reactive, microservices-based distributed system
      - Communication is handled by message passing using two patterns
        - request/reply: blocking, synchronous, command-based
        - event-emitter: non-blocking, asynchronous, fire and forget
      - Event driven: every microservice emits & reacts to events through functions, signals and data packets
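The two messaging patterns named on the architecture slide can be contrasted with a toy in-process bus. This is a sketch under stated assumptions, not Argus code: the `MessageBus` class and its method names are hypothetical, and an explicit `dispatch_pending()` call stands in for the background delivery a real distributed bus would perform.

```python
from collections import defaultdict, deque


class MessageBus:
    """Toy in-process bus illustrating request/reply versus event-emitter."""

    def __init__(self):
        self._handlers = {}                    # command name -> single handler
        self._subscribers = defaultdict(list)  # topic -> list of callbacks
        self._pending = deque()                # emitted events not yet delivered

    def register(self, command, handler):
        self._handlers[command] = handler

    def request(self, command, payload):
        """request/reply: the caller blocks until the handler returns a reply."""
        return self._handlers[command](payload)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def emit(self, topic, payload):
        """event-emitter: enqueue and return at once; no reply, no delivery check."""
        self._pending.append((topic, payload))

    def dispatch_pending(self):
        """Deliver queued events to subscribers; returns the number of deliveries."""
        delivered = 0
        while self._pending:
            topic, payload = self._pending.popleft()
            for callback in self._subscribers[topic]:
                callback(payload)
                delivered += 1
        return delivered
```

With this shape, a command such as a query goes through `request()` and yields a value, while a telemetry sample goes through `emit()` and the emitter never learns whether anyone was listening.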
  15. HIGH LEVEL DIAGRAM
      [Diagram. Components: Source (×5), Event Bus, Collector, Authenticator, API, Rules Engine, Action Dispatcher, Data Gateway. Flow labels: Commands, Events, CRUD, Query, Events samples, Notifications, Authenticate.]
  16. DEMO
  17. BENEFITS
      - Completely decoupled and modularized system with a flexible general-purpose database modelled as a time series
      - ~10x read improvement [preliminary benchmark] against InfluxDB
      - Mighty N1QL queries to overcome InfluxDB limitations
      - Unbeatable XDCR technology against Influx Relay (poor man’s sync)
  18. NEXT STEPS
      - Performance improvements (yes, we still have room for them)
      - Grafana plugin
      - Argus as a platform
      - Other application domains
        - medical trials
        - business events
        - IoT (telemetry, preventive and predictive maintenance)
  19. SUMMARY
      - Distributed systems are hard
      - Modeling data is key and the hardest part (it took 3 months to get it right)
      - Apply replayability instead of guaranteed delivery
      - Apply idempotency instead of exactly-once delivery
      - Apply commutativity instead of ordered delivery
      - Use the right tool for the job: K/V, N1QL, FTS, MapReduce
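The three delivery substitutions in the summary can be shown with a minimal sketch. It assumes events are immutable, so two events for the same series and timestamp always carry the same value; under that assumption, writing each event to a key derived from (series, timestamp) is both idempotent and commutative, and replaying the log rebuilds the same state. The function names and the tuple layout are hypothetical.

```python
def apply_event(state: dict, event: tuple) -> dict:
    """Write one (series, x_ns, y) event at a key derived from the event itself."""
    series, x_ns, y = event
    state[(series, x_ns)] = y  # the same event always lands on the same key
    return state


def replay(events) -> dict:
    """Rebuild state from scratch by applying every event in the log."""
    state = {}
    for event in events:
        apply_event(state, event)
    return state


log = [("rpm", 10, 9000.0), ("rpm", 20, 9100.0), ("speed", 10, 280.5)]

in_order = replay(log)
duplicated = replay(log + log)           # duplicates: idempotency
out_of_order = replay(list(reversed(log)))  # reordering: commutativity
```

All three replays produce the same dictionary, so the consumer does not need exactly-once or ordered delivery from the transport; a lost event is recovered by replaying the source.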
  20. THANK YOU
