How to write your database: the story about Event Store


This story is about a distributed open-source database called Event Store. Event Store is developed by a distributed team, some of whom are ELEKS employees. I am going to talk about the purpose of Event Store and how it works, share some lessons we learned during development, and describe how it feels to develop a distributed high-performance system of that complexity. The talk will be interesting for technical people: software architects and engineers in general, and .NET developers in particular, as Event Store is written in C#.

Published in: Technology

  1. 1. HOW TO WRITE YOUR DATABASE: The story about Event Store. By Victor Haydin, Head of R&D, ELEKS
  2. 2. Agenda 1. Problem definition 2. Event Store: introduction 3. Under the hood 4. Lessons learned 5. Q&A
  3. 3. A few notes before we start 1. This story is about the development, technology and lessons learned, not the product 2. Nor is it about Event Sourcing 3. Outsourcing point of view 4. I was there during the pre-sales and initiation phases. When the team was formed, I left the project.
  4. 4. Survey
  5. 5. How many of you can understand code?
  6. 6. Event Sourcing?
  7. 7. Event Store?
  8. 8. SEDA?
  9. 9. A 140-character definition: Event sourcing - persisting entities by appending all business events to a transaction log. To rebuild the state, we replay this log.
  10. 10. Example: BankAccount. GotSalary: +100; GotCashFromAtm: -10; MadeInternetPayment: -15; MadeTerminalPayment: -5; PutCashToAccount: +20. Account balance: 100 - 10 - 15 - 5 + 20 = 90
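The bank account example can be sketched in a few lines. This is only an illustration of the replay idea, not Event Store code; the `Event` class and the `replay` function are names made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A business event; hypothetical shape used only for this sketch."""
    name: str
    amount: int  # signed change to the account balance


def replay(events):
    """Rebuild the current balance by folding over the whole event log."""
    balance = 0
    for event in events:
        balance += event.amount
    return balance


# The log is append-only: new events are added at the end, never edited.
log = [
    Event("GotSalary", +100),
    Event("GotCashFromAtm", -10),
    Event("MadeInternetPayment", -15),
    Event("MadeTerminalPayment", -5),
    Event("PutCashToAccount", +20),
]

balance = replay(log)
print(balance)  # 90
```

Replaying a prefix of the log gives the balance at that earlier point in time, which is where the audit-trail benefit comes from.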
  11. 11. Event Sourcing. Pros: Performance; Simplification; Audit Trail; Integration with other systems; Troubleshooting; Fixing errors; Testing; Flexibility. Cons: Performance; Versioning; Querying; Not a common approach; Tastes better with CQRS
  12. 12. The marketing says
  13. 13. and time-series data
  14. 14. for single-node configuration
  15. 15. AtomPub protocol for HTTP
  16. 16. 33000 writes per second on single node
  17. 17. Master-Slave replication for Premium version
  18. 18. I’ll talk more about Mono later
  19. 19. web-based
  20. 20. Client API 1. Append to Stream 2. Read Events 3. Subscribe 4. Transactions 5. Stream Metadata 6. Delete Stream 7. System Settings 8. Connection Lifecycle
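To make the first three operations concrete, here is a toy in-memory model of append, read and subscribe. It is not the Event Store client API: the class, the method names and the `expected_version` check (a common optimistic-concurrency convention, assumed here) are all hypothetical.

```python
from collections import defaultdict


class WrongExpectedVersion(Exception):
    """Raised when the caller's view of the stream is stale."""


class InMemoryEventStore:
    """Hypothetical stand-in for a stream store; everything lives in memory."""

    def __init__(self):
        self._streams = defaultdict(list)      # stream name -> list of events
        self._subscribers = defaultdict(list)  # stream name -> callbacks

    def append_to_stream(self, stream, expected_version, events):
        current = len(self._streams[stream]) - 1  # -1 means "no events yet"
        if expected_version != current:
            raise WrongExpectedVersion(
                f"{stream}: expected {expected_version}, actual {current}")
        for event in events:
            self._streams[stream].append(event)
            for callback in self._subscribers[stream]:
                callback(event)  # push the new event to live subscribers
        return len(self._streams[stream]) - 1  # version of the last event

    def read_events(self, stream, start=0, count=None):
        events = self._streams.get(stream, [])
        end = len(events) if count is None else start + count
        return list(events[start:end])

    def subscribe(self, stream, callback):
        self._subscribers[stream].append(callback)


store = InMemoryEventStore()
seen = []
store.subscribe("account-1", seen.append)
last = store.append_to_stream("account-1", -1, ["GotSalary", "GotCashFromAtm"])
```

A second writer that still believes the stream is empty (expected version -1) would get `WrongExpectedVersion` instead of silently interleaving its events.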
  21. 21. Under the hood
  22. 22. Disclaimer
  23. 23. SEDA
  24. 24. Time to take a look at code
  25. 25. SEDA outcome 1. Very few locks in code 2. Simple state management 3. Easy to understand system composition 4. Easy to extend 5. Easy to monitor (DEMO)
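A minimal sketch of the staged, event-driven idea behind these outcomes: each stage owns a queue and a single worker thread, so its state needs no locks, and monitoring reduces to reading queue lengths and counters. The `Stage` class and the stage names are invented for this sketch and do not mirror the actual Event Store code.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a worker to exit


class Stage:
    """One stage: a private inbox drained by exactly one worker thread."""

    def __init__(self, name, handler, next_stage=None):
        self.name = name
        self.handler = handler
        self.next_stage = next_stage
        self.inbox = queue.Queue()
        self.processed = 0  # written only by the worker, so no lock is needed
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def publish(self, message):
        self.inbox.put(message)

    def stop(self):
        self.inbox.put(_STOP)  # queued behind any pending messages
        self._thread.join()

    def _run(self):
        while True:
            message = self.inbox.get()
            if message is _STOP:
                return
            result = self.handler(message)
            self.processed += 1
            if self.next_stage is not None:
                self.next_stage.publish(result)


results = []
store_stage = Stage("store", results.append)
validate_stage = Stage("validate", str.strip, next_stage=store_stage)
store_stage.start()
validate_stage.start()

for raw in ["  a ", "b  "]:
    validate_stage.publish(raw)

validate_stage.stop()  # drains the first stage, then the second
store_stage.stop()
```

Adding a stage means adding one more `Stage` object to the chain, which is roughly what "easy to extend" looks like in this style.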
  26. 26. License is MIT, code is available on github. Enjoy!
  27. 27. Storage Engine
  28. 28. All events are stored in one transaction log (split into chunks)
  29. 29. Storage pipeline: Writer, Chaser, Readers, Scavenger
  30. 30. Writer 1. Single-threaded 2. Append-only 3. Writes data in chunks 4. Flushes depending on load and settings
  31. 31. Chaser 1. Checks what the writer has appended to the log and sends acknowledgements 2. Updates the index
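The writer/chaser split can be sketched as two cursors over one append-only buffer. This is a simplified model under stated assumptions: the record layout, the checkpoint names and the in-memory `bytearray` are made up for illustration, and real chunk files, flushing and threading are left out.

```python
import struct

# Assumed record layout: stream-name length, payload length, name, payload.
HEADER = struct.Struct("<HI")


class TransactionLog:
    def __init__(self):
        self.data = bytearray()
        self.writer_checkpoint = 0  # everything before this offset is written
        self.chaser_checkpoint = 0  # everything before this offset is indexed

    def append(self, stream, payload):
        """Writer side: append only, never seek back."""
        name = stream.encode("utf-8")
        position = len(self.data)
        self.data += HEADER.pack(len(name), len(payload)) + name + payload
        self.writer_checkpoint = len(self.data)
        return position


def chase(log, index, versions, acks):
    """Chaser side: follow the writer, update the index, acknowledge."""
    while log.chaser_checkpoint < log.writer_checkpoint:
        position = log.chaser_checkpoint
        name_len, payload_len = HEADER.unpack_from(log.data, position)
        name_start = position + HEADER.size
        stream = bytes(log.data[name_start:name_start + name_len]).decode("utf-8")
        version = versions.get(stream, -1) + 1
        versions[stream] = version
        index[(stream, version)] = position  # (stream, version) -> log position
        acks.append(position)                # stands in for an ack message
        log.chaser_checkpoint = name_start + name_len + payload_len


log = TransactionLog()
p0 = log.append("account-1", b"+100")
p1 = log.append("account-1", b"-10")

index, versions, acks = {}, {}, []
chase(log, index, versions, acks)
```

Because only the writer moves `writer_checkpoint` and only the chaser moves `chaser_checkpoint`, each cursor has a single owner, which fits the "very few locks" outcome listed for SEDA.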
  32. 32. Readers 1. Read events from the transaction log with the help of the Read Index
  33. 33. Read Index 1. Maps stream name and version to a position in the transaction log 2. Index = MemTable + PTables 3. MemTable = Dictionary of Lists 4. Index entry – fixed-length record 5. PTable = file with a sorted sequence of entries and cached midpoints 6. Binary search FTW!
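A rough sketch of the MemTable + PTables layout described above. The class shapes and the `max_memtable_entries` threshold are assumptions for illustration; PTables are kept in memory here rather than in files, and the cached midpoints are omitted, so this shows only the lookup order and the binary search.

```python
import bisect


class MemTable:
    """Dictionary of lists: stream -> [(version, position), ...]."""

    def __init__(self):
        self.entries = {}
        self.count = 0

    def add(self, stream, version, position):
        self.entries.setdefault(stream, []).append((version, position))
        self.count += 1

    def find(self, stream, version):
        for v, position in self.entries.get(stream, []):
            if v == version:
                return position
        return None


class PTable:
    """Immutable, sorted sequence of entries searched with binary search."""

    def __init__(self, memtable):
        self.entries = sorted(
            (stream, version, position)
            for stream, items in memtable.entries.items()
            for version, position in items)
        self.keys = [(s, v) for s, v, _ in self.entries]

    def find(self, stream, version):
        i = bisect.bisect_left(self.keys, (stream, version))
        if i < len(self.keys) and self.keys[i] == (stream, version):
            return self.entries[i][2]
        return None


class ReadIndex:
    def __init__(self, max_memtable_entries=2):
        self.max_memtable_entries = max_memtable_entries
        self.memtable = MemTable()
        self.ptables = []  # newest first

    def add(self, stream, version, position):
        self.memtable.add(stream, version, position)
        if self.memtable.count >= self.max_memtable_entries:
            # Freeze the full MemTable into a sorted PTable, start a new one.
            self.ptables.insert(0, PTable(self.memtable))
            self.memtable = MemTable()

    def find(self, stream, version):
        position = self.memtable.find(stream, version)
        if position is not None:
            return position
        for ptable in self.ptables:
            position = ptable.find(stream, version)
            if position is not None:
                return position
        return None


read_index = ReadIndex(max_memtable_entries=2)
read_index.add("account-1", 0, 0)
read_index.add("account-2", 0, 19)
read_index.add("account-1", 1, 40)
```

Lookups compare against `None` rather than truthiness because position 0 is a valid log offset.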
  34. 34. Scavenger 1. Background process that goes through transaction log chunks and cleans them of obsolete data (e.g. deleted streams, events beyond $maxCount or $maxAge, etc.) 2. Simply writes the surviving data to a new chunk file 3. Last one out shuts the lights off
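The copy-what-survives approach can be sketched as a pure function over one chunk. The record shape `(stream, version, payload)` and the function signature are assumptions for this sketch; it handles only deleted streams and a per-stream `$maxCount`, and it looks at a single chunk in isolation, which a real scavenger could not do.

```python
def scavenge(chunk, deleted_streams=frozenset(), max_count=None):
    """Return a new chunk holding only the records that are still needed.

    chunk: list of (stream, version, payload) tuples (hypothetical layout).
    deleted_streams: names of streams that have been deleted.
    max_count: optional dict, stream -> number of newest events to keep.
    """
    max_count = max_count or {}

    # Find the newest version per stream so $maxCount can be applied.
    latest = {}
    for stream, version, _ in chunk:
        latest[stream] = max(version, latest.get(stream, -1))

    new_chunk = []
    for record in chunk:
        stream, version, _ = record
        if stream in deleted_streams:
            continue  # whole stream is gone
        limit = max_count.get(stream)
        if limit is not None and version <= latest[stream] - limit:
            continue  # older than the newest `limit` events
        new_chunk.append(record)
    return new_chunk  # the old chunk is left untouched for current readers


old_chunk = [
    ("a", 0, b"x"),
    ("b", 0, b"y"),
    ("a", 1, b"z"),
    ("a", 2, b"w"),
]
new_chunk = scavenge(old_chunk, deleted_streams={"b"}, max_count={"a": 2})
```

Writing to a new chunk instead of editing in place keeps the log append-only; readers still using the old chunk can finish, and the old file can be removed once the last of them is done.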
  35. 35. High Availability
  36. 36. High Availability 1. Master-Slave replication 2. Automatic master elections (N/2 + 1 quorum) 3. The node with the most up-to-date state wins 4. Always write to master; read from master or slave, depending on the client’s choice 5. Byte stream replication 6. Available in commercial version
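The quorum rule and the "most up-to-date node wins" rule can be written down directly. This is a sketch of the arithmetic only, not the election protocol: the function names are invented, the "state" is reduced to a last-written log position, and the tie-break by node id is an assumption.

```python
def quorum_size(cluster_size):
    """Smallest majority: N/2 + 1 with integer division."""
    return cluster_size // 2 + 1


def elect_master(reachable, cluster_size):
    """Pick a master among reachable nodes, or None without a quorum.

    reachable: dict of node id -> last written log position.
    """
    if len(reachable) < quorum_size(cluster_size):
        return None  # a minority partition must not elect a master
    # Highest log position wins; node id breaks ties (assumed rule).
    return max(reachable, key=lambda node: (reachable[node], node))


master = elect_master({"node-a": 100, "node-b": 250}, cluster_size=3)
no_master = elect_master({"node-a": 100}, cluster_size=3)
```

With three nodes the quorum is two, so the cluster survives the loss of one node; with four nodes the quorum is three, so it still survives only one.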
  37. 37. Projections
  38. 38. Projection – the process of taking an event stream and converting it to some other form (e.g. another event stream or state object)
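A projection in this sense is a fold over the event stream. The sketch below is written in Python for illustration only (the built-in engine uses JavaScript), and the `project` function, the handler table and the event shapes are all hypothetical.

```python
def project(events, handlers, initial_state):
    """Fold an event stream into a state object.

    events: iterable of (event_type, data) pairs.
    handlers: dict of event_type -> function(state, data) returning new state.
    """
    state = initial_state
    for event_type, data in events:
        handler = handlers.get(event_type)
        if handler is not None:  # events without a handler are skipped
            state = handler(state, data)
    return state


def on_payment(state, amount):
    # Return a new state object instead of mutating the old one.
    return {"payments": state["payments"] + 1,
            "spent": state["spent"] - amount}


events = [
    ("GotSalary", 100),
    ("MadeInternetPayment", -15),
    ("MadeTerminalPayment", -5),
]
handlers = {
    "MadeInternetPayment": on_payment,
    "MadeTerminalPayment": on_payment,
}
summary = project(events, handlers, {"payments": 0, "spent": 0})
print(summary)  # {'payments': 2, 'spent': 20}
```

The same fold could emit new events instead of a state object, which gives the "another event stream" form mentioned on the slide.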
  39. 39. Projections Engine 1. Built-in query language – JavaScript 2. Based on Google’s V8
  40. 40. Demo: Projections Engine
  41. 41. Lessons Learned
  42. 42. Immutability is good
  43. 43. Pure async is fast, although hard to develop
  44. 44. Lockless programming is hard
  45. 45. Distributed programming is even harder
  46. 46. Debugger is almost useless
  47. 47. Unit-testing is a limited tool
  48. 48. Verbose logs rock
  49. 49. Real-life testing rocks like nothing else
  50. 50. Bugs in .NET

          [System.Security.SecuritySafeCritical]
          public virtual void Flush(Boolean flushToDisk) {
              // This code is duplicated in Dispose
              if (_handle.IsClosed) __Error.FileNotOpen();
              if (_writePos > 0) {
                  FlushWrite(false);
                  if (flushToDisk) {
                      if (!Win32Native.FlushFileBuffers(_handle)) {
                          __Error.WinIOError();
                      }
                  }
              }
              else if (_readPos < _readLen && CanSeek) {
                  FlushRead();
              }
              _readPos = 0;
              _readLen = 0;
          }

      *Appears to be fixed in 4.5
  51. 51. Bugs in Mono: TCP/IP stack deadlocks; Concurrent Stack/Queue; XML Serializer; File Stream issues; general slowness: 20K writes/s
  52. 52. Disk caches: fake flush and reordering *By default, can be disabled
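The difference between flushing a buffer and actually reaching the disk shows up in any language. A minimal Python sketch, assuming a hypothetical `durable_append` helper: `flush()` only moves data from the process buffer to the operating system, while `os.fsync()` asks the operating system to push it to the device. Whether the device then really persists it is exactly the disk-cache question raised on this slide.

```python
import os


def durable_append(path, payload):
    """Append bytes and request that they reach stable storage.

    Returns the file size after the write. Hypothetical helper for this sketch.
    """
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()              # process buffer -> OS page cache
        os.fsync(f.fileno())   # OS page cache -> storage device (requested)
        return os.fstat(f.fileno()).st_size
```

Calling `durable_append` for every event is safe but slow; batching several appends before one `fsync` is the usual trade-off, in line with the writer flushing "depending on load and settings".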
  53. 53. Mind the HDD/SSD difference *It is possible to optimize for both
  54. 54. There is a long way to simplicity
  55. 55. Experiments: The way it works
  56. 56. Further plans 1. Horizontal scalability 2. Hosted service 3. Adding other features
  57. 57. Links 1. Official web site 2. Source code 3. A deep look into The Event Store by Greg Young 4. Event Sourcing overview by Martin Fowler 5. Google <Event Sourcing|Event Store>
  58. 58. Got a question? Ask! @victor_haydin