
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-Real Time Aggregation and Notifications



The financial sector presents an exciting mix of challenges in throughput and high availability, along with specific constraints on latency and consistency. In the continuous evolution of its platform, Murex relies on open source technologies such as Apache Geode and Apache Storm, in a "kind of" lambda architecture, to provide storage, near-real-time aggregation (on the order of milliseconds) of thousands of events per second, advanced notification mechanisms, and on-demand deployments. This talk focuses on the technical architecture, the underlying principles, and the technologies used to support this mix of functional and non-functional requirements.

Published in: Technology


  1. Olivier MALLASSI / March 9, 2016
  2. Combining Stream Processing and In-Memory Data Grids for Near-Real Time Aggregation and Notifications
     • @omallassi
     • Principal Architect @ Murex
     • Backbone for Capital Markets; Front-Office to Back-Office to Risk, across multiple asset classes
  3. 10 000 foot view of finance…
     • Post-trade and consolidated information are inputs for decision making
     • Cycle time tends to be more and more « near real time » (depends on the asset class)
     • Our mission:
       - Process an increasing volume of trades/events
       - Aggregate trade and event data based on use-case specific criteria
       - Accommodate real-time and historical data inputs
     • Decision Making: determination of the best investments based on market trends and existing investments in portfolio (FO)
     • Acquisition & Verification: procurement of the assets, e.g. the gold, the equities, the futures, etc. (MO)
     • Operation and Maintenance: management and use of assets (BO)
     • Risk Management: risk control on these decisions
  4. Solution Summary
     • Store immutable events
     • Filter & aggregate these events based on the demanded perspectives on these real-time or historical events
     • Notify about updates on aggregates
  5. And, of course…
     • « As A Service »: perspectives can be requested at any time, on any type of events
     • Be scalable, resilient to failure, and ensure low latency (sub-millisecond)
  6. Key Architectural Principles
     • Flexibility: this is a framework to build and manage perspectives
     • Historical and real-time events:
       - Are stored in an Event Log; each event is identified by a unique and strictly monotonic offset
       - Are aggregated through the same graph of computation (DAG)
     • Ensure horizontal scalability (and distribution)
     • Avoid locking and move back to a single-threaded model (per aggregate)
     • Limit the number of TCP hops
     • Limit the usage of disk
  7. High Level Architecture
     • Apache Geode (Continuous Query)
     • Aeron
     • Apache Storm
     • Perspectives are described using a custom DSL (on top of Storm Flux + JEXL)
     • Apache Geode
  8. Apache Storm 101
     • Apache Storm: stream processing engine
     • Not micro batch
     • Aggregations are expressed as a (distributed) DAG
     • The framework ensures routing of the events:
       - Based on groupBy, to well-known threads
       - Routing strategies can be custom
     • select * from source where … group by x.y.z
  9. Why Apache Geode?
     • The framework on which the Event Log is built
     • High availability and resilience
     • Horizontal scalability and distribution
     • Control of data partitioning and region collocation
     • Advanced storage configuration: in-memory, overflow on disk, etc.
     • Advanced notifications (via Continuous Queries)
  10. Why Apache Storm?
     • Distributed & scalable « Query Engine » (DAG)
     • Routing of events through the DAG
     • Cluster management
     • On-demand perspective deployment
     • Resilience (failed engines are automatically restarted)
  11. « Usual » Deployment Pattern
     • Storm / Geode are running in their dedicated JVMs
     • Storm groupings ensure:
       - Distribution across multiple threads and multiple JVMs
       - Single-threaded model
     • Horizontal scalability with the number of threads / JVMs
     • Multiple TCP hops
  12. « Low latency » Deployment Pattern
     • Storm/Geode collocated inside the same JVM
     • Events are routed to the right JVM based on a routing key:
       - Use first element of groupBy as Partition Resolver
     • Storm custom groupings enable:
       - Multi-threading
       - Single-threaded model (per aggregate)
     • Regions (events, aggregates) are collocated
     • Horizontal scalability with the number of threads / JVMs
     • Limited and known number of TCP hops
  13. To conclude
     • Event Log provides a way to work on real-time and historical data with the same code
     • Collocation of Storm and Geode is powerful
     • This is a powerful and general pattern implementation which gives us an efficient and open framework:
       - Efficiency: performance requirements are reached
       - Openness: DAG can be easily extended with CEP engines, rules engines
       - Notifications based on solutions like Aeron or Geode continuous queries
  14. Join the Apache Geode Community Today!
     • Check out:
     • Subscribe:
     • Download:
  15. Thank you!
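
As an illustration of the principles on slides 6 and 12, the sketch below shows one way an append-only event log with unique, strictly increasing offsets can feed aggregates that are each owned by a single thread, so that replayed historical events and live events go through the same code without locking the aggregate state. It is plain Java, not the Murex framework and not the Storm or Geode API: `PerspectiveSketch`, `EventLog`, `Aggregator` and `workerFor` are hypothetical names, and hashing the group key is only a stand-in for the routing key / Partition Resolver idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these types come from Storm, Geode, or the talk.
final class PerspectiveSketch {

    /** Immutable event; the offset is assigned by the log on append. */
    static final class Event {
        final long offset;
        final String groupKey;
        final long amount;

        Event(long offset, String groupKey, long amount) {
            this.offset = offset;
            this.groupKey = groupKey;
            this.amount = amount;
        }
    }

    /** Append-only log whose offsets are unique and strictly increasing. */
    static final class EventLog {
        private final List<Event> events = new ArrayList<>();

        synchronized Event append(String groupKey, long amount) {
            Event e = new Event(events.size(), groupKey, amount);
            events.add(e);
            return e;
        }

        /** Snapshot of every event at or after the given offset (historical replay). */
        synchronized List<Event> readFrom(long offset) {
            return new ArrayList<>(events.subList((int) offset, events.size()));
        }
    }

    /** Sums amounts per group key; each key is owned by exactly one single-threaded worker. */
    static final class Aggregator {
        private final List<ExecutorService> workers = new ArrayList<>();
        private final List<Map<String, Long>> partials = new ArrayList<>();

        Aggregator(int parallelism) {
            for (int i = 0; i < parallelism; i++) {
                workers.add(Executors.newSingleThreadExecutor());
                partials.add(new HashMap<>()); // touched only by worker i, so no lock
            }
        }

        /** The same key always maps to the same worker. */
        int workerFor(String groupKey) {
            return Math.floorMod(groupKey.hashCode(), workers.size());
        }

        void submit(Event e) {
            int w = workerFor(e.groupKey);
            Map<String, Long> partial = partials.get(w);
            workers.get(w).execute(() -> partial.merge(e.groupKey, e.amount, Long::sum));
        }

        /** Stops the workers, waits for queued events, and returns the totals per key. */
        Map<String, Long> drain() {
            try {
                for (ExecutorService worker : workers) {
                    worker.shutdown();
                    if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                        throw new IllegalStateException("worker did not finish in time");
                    }
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(ex);
            }
            Map<String, Long> totals = new HashMap<>();
            for (Map<String, Long> partial : partials) {
                totals.putAll(partial); // key sets are disjoint across workers
            }
            return totals;
        }
    }

    public static void main(String[] args) {
        EventLog log = new EventLog();
        log.append("EUR", 10);
        log.append("USD", 7);
        log.append("EUR", 5);

        Aggregator aggregator = new Aggregator(4);
        for (Event e : log.readFrom(0)) { // replay from the start of the log
            aggregator.submit(e);
        }
        System.out.println(aggregator.drain());
    }
}
```

Because a key is always handled by the same worker, events for one aggregate are applied in submission order. Replaying from a later offset, for example `log.readFrom(2)`, uses exactly the same aggregation path as live events.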