Thesis finalpresentation

In recent years, the need for distributed data storage has driven the design of new systems for large-scale environments. The growth of unbounded data streams, together with the need to store and analyze them in real time, reliably, scalably, and fast, explains the appearance of such systems in the financial sector, especially at the stock exchange Nasdaq OMX.
Furthermore, an internally designed, totally ordered reliable message bus is used in Nasdaq OMX by almost all internal subsystems. Reliable totally ordered multicast has been studied extensively in academia, both theoretically and practically, and has been shown to serve as a fundamental building block for constructing distributed fault-tolerant applications.
In this work, we leverage the NOMX low-latency reliable totally ordered message bus, with a capacity of at least 2 million messages per second, to build a high-performance distributed data store. Consistency of data operations follows readily from the message bus, because it forwards all messages in reliable total order. Moreover, relying on the reliable totally ordered messaging, support for active in-memory replication is integrated for fault tolerance and load balancing. A prototype was developed against production environment requirements to demonstrate feasibility.
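
The way total order yields consistency under active replication can be illustrated with a minimal sketch. It is not the Genium INET implementation: `Sequencer`, `Replica`, `deliver`, and `run_demo` are hypothetical names, and the bus is reduced to a single in-process counter that stamps each insert with a global sequence number. Each replica applies messages strictly in sequence order, so two replicas end up with identical state even if the messages reach them in different orders.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (table name, record), an assumed message shape


@dataclass
class Sequencer:
    """Hypothetical stand-in for the bus sequencer: stamps messages with a global order."""
    next_seq: int = 0

    def order(self, message: Message) -> Tuple[int, Message]:
        stamped = (self.next_seq, message)
        self.next_seq += 1
        return stamped


@dataclass
class Replica:
    """Hypothetical data store node that applies inserts strictly in sequence order."""
    store: Dict[str, List[str]] = field(default_factory=dict)
    applied: int = 0  # next sequence number this replica expects
    pending: Dict[int, Message] = field(default_factory=dict)

    def deliver(self, seq: int, message: Message) -> None:
        if seq < self.applied:
            return  # duplicate of an already applied message
        self.pending[seq] = message
        # Apply every message that is now contiguous with what was applied so far.
        while self.applied in self.pending:
            table, record = self.pending.pop(self.applied)
            self.store.setdefault(table, []).append(record)
            self.applied += 1


def run_demo() -> List[Replica]:
    sequencer = Sequencer()
    replicas = [Replica(), Replica()]
    inserts = [("orders", "e1"), ("trades", "e2"), ("orders", "e3")]
    stamped = [sequencer.order(m) for m in inserts]
    for seq, msg in stamped:            # first replica sees messages in order
        replicas[0].deliver(seq, msg)
    for seq, msg in reversed(stamped):  # second replica sees them reordered
        replicas[1].deliver(seq, msg)
    return replicas
```

Because every replica holds the same data after applying the same ordered stream, a read could in principle be served by any of them, which is one way to read the load-balancing remark above.
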
Experimental results show good scalability and performance: around 400,000 insert operations per second are served over 6 data nodes with a latency of 100 microseconds. Latency for single-record read operations is bounded below half a millisecond, while data ranges are retrieved at sub-100 Mbps capacity from one node. Moreover, performance improvements with a greater number of data store nodes are shown for both writes and reads. It is concluded that uniform totally ordered sequenced input data can be used in real time for large-scale distributed data storage to maintain strong consistency, fault tolerance, and high performance.

Published in: Technology, Business

Thesis finalpresentation

  1. GDS: Genium Data Store. Real Time, Low Latency, Reliable. Iuliia Proskurnia, EMDC, KTH, 2013
  2. (slide without text)
  3. (slide without text)
  4. 3900 companies; 39 countries; over 1500 corporate products. Use case: write events; retrieve ranges of records
  5. Fault-Tolerant? Consistent? Fast? Scalable?
  6. Approaches: Consensus based; ...; Total Order Multicast; Symmetric; Token Site
  7. Uniform Reliable Total Order: Validity; Uniform Integrity; Uniform Agreement; Uniform Total Order
  8. Genium INET Message Bus: Uniform Reliable Total Order Multicast. Similar to Amoeba protocol. However... Fault Tolerant!
  9. GDS: Genium Data Store. Uses Genium INET Message Bus abstraction; Clients, Sequencer, Data store; Rewinders and sequencer replication; Active replication. Diagram labels: Client, Data store node, Data store node
  10. GDS high level abstraction LEDS
  11. LEDS: Column based; BLOBs; Appends; Range Queries; Not Distributed; Not fault-tolerant
  12. Properties: Consistent; Failure Resilient; Replication; Rewinders; Cite Replication; Total Order
  13. Possible Failure Scenarios: Client Failure; Sequencer Failure
  14. Scalability: Natural Load Balancing; Partitioning (manual)
  15. Evaluation: Inserts (throughput/latency); Range Queries (throughput); Range transmission failure
  16. Set Up
  17. Writes Throughput
  18. Writes Limits
  19. Writes Latency
  20. Range Queries Throughput
  21. Range Queries Scalability: 8 Concurrent Users
  22. Range Queries Link Failure
  23. Summary: uniform reliable total order multicast; scales fine; low latency; consistent, fault-tolerant
  24. Future Work: Generality; Send compressed chunks; Automated partitioning; Long-running tests
  25. Comments? Questions? Thesis Writing Process
  26. Single record read without load
  27. Single record read with load (10 000 inserts)
  28. Single record read scalability
  29. Discussion
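
The use case on slide 4 (write events, retrieve ranges of records) and the LEDS traits on slide 11 (appends, range queries) can be sketched as a small append-only store. This is an illustrative assumption, not the LEDS API: the class name `AppendOnlyLog`, the integer keys, and the inclusive range bounds are all hypothetical. Since keys only grow, the key list stays sorted and a range lookup reduces to two binary searches.

```python
import bisect
from typing import List, Tuple


class AppendOnlyLog:
    """Hypothetical append-only store keyed by strictly increasing integers."""

    def __init__(self) -> None:
        self._keys: List[int] = []      # sorted, because appends only grow
        self._values: List[bytes] = []  # opaque payloads (BLOB-like)

    def append(self, key: int, value: bytes) -> None:
        # Reject out-of-order keys so the key list remains sorted.
        if self._keys and key <= self._keys[-1]:
            raise ValueError("keys must be strictly increasing")
        self._keys.append(key)
        self._values.append(value)

    def range(self, low: int, high: int) -> List[Tuple[int, bytes]]:
        """Return all records with low <= key <= high, in key order."""
        start = bisect.bisect_left(self._keys, low)
        end = bisect.bisect_right(self._keys, high)
        return list(zip(self._keys[start:end], self._values[start:end]))
```

If a sequence number assigned in total order were used as the key, appends would arrive already sorted, which is why this sketch does not need an index beyond the key list itself.
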
