Thesis Final Presentation

In recent years, the need for distributed data storage has driven the design of new large-scale systems. The growth of unbounded streams of data, and the necessity to store and analyze them in real time, reliably, scalably, and fast, explain the appearance of such systems in the financial sector, and in particular at the stock exchange Nasdaq OMX.
Furthermore, an internally designed, reliable, totally ordered message bus is used at Nasdaq OMX by almost all internal subsystems. Extensive theoretical and practical studies on reliable totally ordered multicast have been carried out in academia, and it has been proven to serve as a fundamental building block for constructing distributed fault-tolerant applications.
In this work, we leverage the NOMX low-latency, reliable, totally ordered message bus, with a capacity of at least 2 million messages per second, to build a high-performance distributed data store. Consistency of data operations is easily achieved by using the messaging bus, since it forwards all messages in reliable total order. Moreover, relying on the reliable totally ordered messaging, active in-memory replication is integrated for fault tolerance and load balancing. A prototype was developed against production environment requirements to demonstrate its feasibility.
Experimental results show great scalability and performance, serving around 400,000 insert operations per second over 6 data nodes with 100-microsecond latency. Latency for single-record read operations is bounded by half a millisecond, while data ranges are retrieved at sub-100 Mbps capacity from one node. Moreover, performance improvements with a greater number of data store nodes are shown for both writes and reads. It is concluded that uniformly totally ordered, sequenced input data can be used in real time for large-scale distributed data storage while maintaining strong consistency, fault tolerance, and high performance.
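The core idea of the abstract, that a reliable totally ordered message bus makes replica consistency straightforward, can be illustrated with a minimal sketch. This is not the GDS implementation; the class and message format below are hypothetical, but they show why two nodes fed the same sequenced stream of deterministic operations end up in identical states.

```python
# Sketch (assumed names, not the GDS code): each node applies the sequenced
# message stream in order, so all actively replicated nodes stay consistent.

class DataStoreNode:
    """An in-memory replica driven by a totally ordered message stream."""

    def __init__(self):
        self.records = {}   # key -> value
        self.next_seq = 1   # next sequence number expected from the bus

    def deliver(self, seq, op, key, value=None):
        # Total order: messages must be applied strictly in sequence order.
        assert seq == self.next_seq, "gap in the totally ordered stream"
        self.next_seq += 1
        if op == "insert":
            self.records[key] = value

# Two replicas fed the same totally ordered stream converge to the same state.
stream = [(1, "insert", "a", 10), (2, "insert", "b", 20), (3, "insert", "a", 30)]
n1, n2 = DataStoreNode(), DataStoreNode()
for msg in stream:
    n1.deliver(*msg)
    n2.deliver(*msg)
assert n1.records == n2.records == {"a": 30, "b": 20}
```

Because the bus fixes one global order, no node-to-node coordination is needed for writes; determinism of the applied operations does the rest.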



Presentation Transcript

  • GDS: Genium Data Store. Real Time, Low Latency, Reliable. Iuliia Proskurnia, EMDC, KTH, 2013.
  • 2.
  • 3.
  • 4. 3,900 companies, 39 countries, over 1,500 corporate products. Use case: write events; retrieve ranges of records.
  • 5. Fault-tolerant? Consistent? Fast? Scalable?
  • 6. Approaches: consensus based; ...; total order multicast (symmetric, token site).
  • 7. Uniform Reliable Total Order: Validity, Uniform Integrity, Uniform Agreement, Uniform Total Order.
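The Uniform Total Order property on slide 7 states that any two processes delivering the same pair of messages deliver them in the same relative order. A small illustrative checker (not part of the thesis artifacts) makes the property concrete by validating per-process delivery logs:

```python
from itertools import combinations

def consistent_total_order(deliveries):
    """Check Uniform Total Order over per-process delivery logs: any two
    messages delivered by two processes appear in the same relative order.
    (Illustrative checker; log format is an assumption.)"""
    for log_a, log_b in combinations(deliveries, 2):
        pos_a = {m: i for i, m in enumerate(log_a)}
        pos_b = {m: i for i, m in enumerate(log_b)}
        common = set(pos_a) & set(pos_b)
        for m1, m2 in combinations(common, 2):
            if (pos_a[m1] < pos_a[m2]) != (pos_b[m1] < pos_b[m2]):
                return False
    return True

# A process may lag (deliver a prefix / subset), but never reorder.
assert consistent_total_order([["m1", "m2", "m3"], ["m1", "m3"]])
assert not consistent_total_order([["m1", "m2"], ["m2", "m1"]])
```

Note the checker only constrains messages delivered by both processes, which matches the "if p and q both deliver m and m'" quantification of the property.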
  • 8. Genium INET Message Bus: uniform reliable total order multicast. Similar to the Amoeba protocol, but fault-tolerant.
  • 9. GDS: Genium Data Store. Uses the Genium INET Message Bus abstraction. Components: clients, sequencer, data store nodes. Rewinders and sequencer replication. Active replication.
  • 10. GDS high-level abstraction: LEDS.
  • 11. LEDS: column based; BLOBs; appends; range queries; not distributed; not fault-tolerant.
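Slide 11 characterizes LEDS as an append-only, column-based store of BLOBs supporting range queries. A toy sketch of that feature list, with a hypothetical API (the real LEDS interface is not shown in the slides), looks like:

```python
import bisect

class ColumnLog:
    """Sketch of an append-only column log with range queries, mirroring the
    LEDS feature list (appends, BLOBs, range queries). Hypothetical API."""

    def __init__(self):
        self.seqs = []    # monotonically increasing sequence numbers
        self.blobs = []   # opaque payloads (BLOBs), stored column-style

    def append(self, seq, blob):
        # Append-only: sequence numbers must be strictly increasing.
        assert not self.seqs or seq > self.seqs[-1], "out-of-order append"
        self.seqs.append(seq)
        self.blobs.append(blob)

    def range_query(self, lo, hi):
        """Return blobs whose sequence numbers fall in [lo, hi]."""
        i = bisect.bisect_left(self.seqs, lo)
        j = bisect.bisect_right(self.seqs, hi)
        return self.blobs[i:j]

log = ColumnLog()
for s in range(1, 6):
    log.append(s, b"rec%d" % s)
assert log.range_query(2, 4) == [b"rec2", b"rec3", b"rec4"]
```

The append-only discipline is what lets GDS layer distribution on top: since LEDS itself is neither distributed nor fault-tolerant, those properties come from the sequenced bus and replication above it.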
  • 12. Properties: consistent; failure resilient; replication; rewinders; site replication; total order.
  • 13. Possible failure scenarios: client failure, sequencer failure.
  • 14. Scalability: natural load balancing; partitioning (manual).
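Slide 14 names two scalability mechanisms: natural load balancing and manual partitioning. A sketch under assumed names (the partition map format and node names are hypothetical) shows how the two could compose: a manually configured map assigns key prefixes to replica groups, and reads rotate across a partition's replicas.

```python
import itertools

# Hypothetical configuration: key prefix -> replica group (manual partitioning).
PARTITION_MAP = {
    "orders": ["node1", "node2"],
    "trades": ["node3", "node4"],
}

# Natural load balancing: rotate reads across the replicas of each partition.
read_cursors = {p: itertools.cycle(nodes) for p, nodes in PARTITION_MAP.items()}

def node_for_read(key):
    partition = key.split(":", 1)[0]        # e.g. "orders:42" -> "orders"
    return next(read_cursors[partition])    # round-robin across replicas

assert node_for_read("orders:1") == "node1"
assert node_for_read("orders:2") == "node2"
assert node_for_read("trades:9") == "node3"
```

Active replication makes every replica equally able to serve reads, which is why load balancing falls out "naturally"; writes still go through the single sequenced stream.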
  • 15. Evaluation: inserts (throughput/latency); range queries (throughput); range transmission failure.
  • 16. Set-up.
  • 17. Writes throughput.
  • 18. Writes limits.
  • 19. Writes latency.
  • 20. Range queries throughput.
  • 21. Range queries scalability: 8 concurrent users.
  • 22. Range queries: link failure.
  • 23. Summary: uniform reliable total order multicast; scales well; low latency; consistent, fault-tolerant.
  • 24. Future work: generality; send compressed chunks; automated partitioning; long-running tests.
  • 25. Comments? Questions? Thesis writing process.
  • 26. Single record read without load.
  • 27. Single record read with load (10,000 inserts).
  • 28. Single record read scalability.
  • 29. Discussion.