How to write your database: the story about Event Store

This story is about a distributed open-source database called Event Store. Event Store is developed by a distributed team, some of whom are ELEKS employees. I am going to talk about the purpose of Event Store and how it works, share some lessons we learned during development, and describe how it feels to develop a distributed high-performance system of that complexity. The talk will be interesting for technical people: software architects and engineers in general, and .NET developers in particular, as Event Store is written in C#.



Presentation Transcript

  • HOW TO WRITE YOUR DATABASE: The story about Event Store. By Victor Haydin, Head of R&D, ELEKS
  • Agenda
    1. Problem definition
    2. Event Store: introduction
    3. Under the hood
    4. Lessons learned
    5. Q&A
  • A few notes before we start
    1. This story is about the development, technology, and lessons learned, not about a product
    2. Nor is it about Event Sourcing
    3. It reflects an outsourcing point of view
    4. I was there during the pre-sales and initiation phases. When the team was formed, I left the project.
  • Survey
  • How many of you can understand code?
  • Event Sourcing?
  • Event Store?
  • SEDA?
  • A 140-character definition: Event sourcing is persisting entities by appending all business events to a transaction log. To rebuild the state, we replay this log.
  • Example: BankAccount
    GotSalary: +100
    GotCashFromAtm: -10
    MadeInternetPayment: -15
    MadeTerminalPayment: -5
    PutCashToAccount: +20
    Account balance: 100 - 10 - 15 - 5 + 20 = 90
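The bank account example can be sketched in a few lines of Python. This is a minimal illustration of replaying an append-only event log, not Event Store code; the `Event` class and the `replay` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An immutable business event (hypothetical type for this sketch)."""
    name: str
    amount: int  # signed change to the account balance


def replay(events):
    """Rebuild the current balance by applying every event in log order."""
    balance = 0
    for event in events:
        balance += event.amount
    return balance


# The log is only ever appended to; past events are never edited.
log = []
log.append(Event("GotSalary", 100))
log.append(Event("GotCashFromAtm", -10))
log.append(Event("MadeInternetPayment", -15))
log.append(Event("MadeTerminalPayment", -5))
log.append(Event("PutCashToAccount", 20))

print(replay(log))  # 90
```

Because the state is derived rather than stored, replaying a prefix of the log gives the balance at any earlier point in time.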
  • Event Sourcing
    Pros: Performance, Simplification, Audit Trail, Integration with other systems, Troubleshooting, Fixing errors, Testing, Flexibility
    Cons: Performance, Versioning, Querying, Not a common approach, Tastes better with CQRS
  • The marketing says
  • and time-series data
  • for single-node configuration
  • AtomPub protocol for HTTP
  • 33000 writes per second on single node
  • Master-Slave replication for Premium version
  • I’ll talk more about Mono later
  • web-based
  • Client API
    1. Append to Stream
    2. Read Events
    3. Subscribe
    4. Transactions
    5. Stream Metadata
    6. Delete Stream
    7. System Settings
    8. Connection Lifecycle
  • Under the hood
  • Disclaimer
  • SEDA
  • Time to take a look at code
  • SEDA outcome
    1. Very few locks in code
    2. Simple state management
    3. Easy to understand system composition
    4. Easy to extend
    5. Easy to monitor (DEMO)
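To make the SEDA (staged event-driven architecture) idea concrete, here is a rough Python sketch: each stage owns a queue and a single worker thread, and stages talk only by publishing messages. The `Stage` class and `run_pipeline` function are assumptions made for illustration and do not mirror Event Store's actual C# classes.

```python
import queue
import threading


class Stage:
    """One stage: an inbox queue drained by a single dedicated worker thread."""

    _STOP = object()  # sentinel that tells the worker to exit

    def __init__(self, name, handler, next_stage=None):
        self.name = name
        self.handler = handler
        self.next_stage = next_stage
        self.inbox = queue.Queue()
        self.processed = 0  # written only by the worker thread, so no lock is needed
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def publish(self, message):
        self.inbox.put(message)

    def stop(self):
        """Let the worker finish everything queued so far, then wait for it."""
        self.inbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        while True:
            message = self.inbox.get()
            if message is self._STOP:
                break
            result = self.handler(message)
            self.processed += 1
            if self.next_stage is not None:
                self.next_stage.publish(result)


def run_pipeline(messages):
    """Push messages through strip -> upper -> sink and report per-stage counts."""
    results = []
    sink = Stage("sink", results.append)
    upper = Stage("upper", str.upper, next_stage=sink)
    strip = Stage("strip", str.strip, next_stage=upper)
    for stage in (sink, upper, strip):
        stage.start()
    for message in messages:
        strip.publish(message)
    for stage in (strip, upper, sink):  # stop upstream stages first
        stage.stop()
    counts = {stage.name: stage.processed for stage in (strip, upper, sink)}
    return results, counts
```

All shared state lives in the queues, which is why the stage code itself needs no locks, and the per-stage counters (or queue lengths) give a natural place to hang monitoring.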
  • License is MIT, code is available on GitHub. Enjoy!
  • Storage Engine
  • All events are stored in one transaction log (split into chunks)
  • Storage pipeline: Writer, Chaser, Readers, Scavenger
  • Writer
    1. Single-threaded
    2. Append-only
    3. Writes data in chunks
    4. Flushes depending on load and settings
  • Chaser
    1. Checks what the writer has appended to the log and sends acknowledgements
    2. Updates the index
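The interplay between writer and chaser can be sketched as follows. This is a simplified, single-threaded Python model under stated assumptions: the "checkpoint" counters, the in-memory list standing in for chunk files, and all class names are hypothetical and chosen only to show the flow of append, flush, acknowledge, and index update.

```python
class TransactionLog:
    """Append-only log shared by the writer and the chaser."""

    def __init__(self):
        self.records = []  # flushed (stream, payload) records, never modified
        self.flushed = 0   # writer checkpoint: number of flushed records


class Writer:
    """Buffers appends and makes them visible to the chaser only on flush."""

    def __init__(self, log):
        self.log = log
        self.buffer = []

    def append(self, stream, payload):
        self.buffer.append((stream, payload))

    def flush(self):
        self.log.records.extend(self.buffer)
        self.buffer.clear()
        self.log.flushed = len(self.log.records)


class Chaser:
    """Follows the writer checkpoint, acknowledging and indexing new records."""

    def __init__(self, log):
        self.log = log
        self.position = 0  # chaser checkpoint: next record to process
        self.index = {}    # stream name -> list of log positions
        self.acks = []     # log positions acknowledged so far

    def chase(self):
        while self.position < self.log.flushed:
            stream, _payload = self.log.records[self.position]
            self.index.setdefault(stream, []).append(self.position)
            self.acks.append(self.position)
            self.position += 1
```

In this model a write is acknowledged only after it has been flushed, so nothing that is still sitting in the writer's buffer is ever reported as committed.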
  • Readers
    1. Read events from the transaction log with the help of the Read Index
  • Read Index
    1. Maps stream name and version to a position in the transaction log
    2. Index = MemTable + PTables
    3. MemTable = Dictionary of Lists
    4. Index entry – record of fixed length
    5. PTable = file with a sorted sequence of entries and cached midpoints
    6. Binary search FTW!
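A rough Python sketch of the MemTable plus PTable idea follows. The 16-byte entry layout, the CRC32 stream hash, and every class name are assumptions made for illustration (hash collisions between streams are ignored, and the cached midpoints are left out); a `bytes` object stands in for the PTable file.

```python
import struct
import zlib

# Assumed fixed-length entry: stream hash (4 bytes), version (4), log position (8).
ENTRY = struct.Struct(">IIQ")


def stream_hash(stream):
    return zlib.crc32(stream.encode("utf-8"))


class PTable:
    """Immutable sorted run of fixed-length entries, searched by binary search."""

    def __init__(self, data):
        self.data = data
        self.count = len(data) // ENTRY.size

    def _key(self, i):
        h, version, _position = ENTRY.unpack_from(self.data, i * ENTRY.size)
        return (h, version)

    def find(self, stream, version):
        target = (stream_hash(stream), version)
        lo, hi = 0, self.count
        while lo < hi:  # classic lower-bound binary search
            mid = (lo + hi) // 2
            if self._key(mid) < target:
                lo = mid + 1
            else:
                hi = mid
        if lo < self.count and self._key(lo) == target:
            return ENTRY.unpack_from(self.data, lo * ENTRY.size)[2]
        return None


class MemTable:
    """In-memory part of the index: a dictionary of lists."""

    def __init__(self):
        self.entries = {}  # stream hash -> list of (version, position)

    def add(self, stream, version, position):
        self.entries.setdefault(stream_hash(stream), []).append((version, position))

    def find(self, stream, version):
        for v, position in self.entries.get(stream_hash(stream), []):
            if v == version:
                return position
        return None

    def to_ptable(self):
        rows = sorted((h, v, p) for h, items in self.entries.items() for v, p in items)
        return PTable(b"".join(ENTRY.pack(*row) for row in rows))


class ReadIndex:
    """Looks in the MemTable first, then in PTables from newest to oldest."""

    def __init__(self):
        self.memtable = MemTable()
        self.ptables = []

    def add(self, stream, version, position):
        self.memtable.add(stream, version, position)

    def flush_memtable(self):
        self.ptables.insert(0, self.memtable.to_ptable())
        self.memtable = MemTable()

    def find(self, stream, version):
        position = self.memtable.find(stream, version)
        if position is not None:
            return position
        for ptable in self.ptables:
            position = ptable.find(stream, version)
            if position is not None:
                return position
        return None
```

Fixed-length entries are what make the binary search cheap: the offset of entry `i` is simply `i * ENTRY.size`, so no parsing is needed to jump to the middle of the table.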
  • Scavenger
    1. Background process that goes through transaction log chunks and cleans them of obsolete data (e.g. deleted streams, $maxCount, $maxAge, etc.)
    2. Simply writes data to a new chunk file
    3. Last one out, shut the lights off
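The scavenging step can be illustrated with a short Python sketch that copies only the live records of one chunk into a new chunk. The record shape `(stream, version, payload)`, the `scavenge` function, and the per-chunk treatment of `$maxCount` are simplifying assumptions; `$maxAge` is omitted.

```python
from collections import Counter


def scavenge(chunk, deleted_streams=frozenset(), max_count=None):
    """Return a new chunk without obsolete records; the old chunk is untouched.

    chunk           -- list of (stream, version, payload) records in log order
    deleted_streams -- names of streams that have been deleted
    max_count       -- dict: stream name -> number of newest events to keep
    """
    max_count = max_count or {}
    remaining = Counter(stream for stream, _version, _payload in chunk)
    new_chunk = []
    for record in chunk:
        stream = record[0]
        remaining[stream] -= 1  # events of this stream still ahead of this one
        if stream in deleted_streams:
            continue
        limit = max_count.get(stream)
        if limit is not None and remaining[stream] >= limit:
            continue  # at least `limit` newer events exist, so this one is obsolete
        new_chunk.append(record)
    return new_chunk
```

Writing survivors to a fresh chunk instead of editing in place keeps the log append-only, so readers of the old chunk are never disturbed while the scavenger runs.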
  • High Availability
  • High Availability
    1. Master-Slave replication
    2. Automatic master elections (N/2 + 1 quorum)
    3. The node with the most up-to-date state wins
    4. Always write to the master; read from the master or a slave, depending on the client’s choice
    5. Byte stream replication
    6. Available in the commercial version
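The quorum rule and the "most up-to-date state wins" rule can be written down as a small Python sketch. Interpreting "state" as the last log position, breaking ties by node name, and the function names themselves are assumptions for illustration; the real election protocol involves message exchange that is not modelled here.

```python
def quorum(cluster_size):
    """Smallest number of nodes that forms a majority: N/2 + 1 (integer division)."""
    return cluster_size // 2 + 1


def elect_master(cluster_size, reachable):
    """Pick a master among reachable nodes, or None if there is no quorum.

    reachable -- dict: node name -> last log position known for that node
    """
    if len(reachable) < quorum(cluster_size):
        return None
    # Highest log position wins; ties go to the alphabetically first node name.
    return max(sorted(reachable), key=lambda node: reachable[node])
```

With three nodes the quorum is two, so the cluster keeps electing a master after losing one node but stops accepting writes if it is split down to a single node.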
  • Projections
  • Projection – the process of taking an event stream and converting it to some other form (e.g. another event stream or state object)
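Both forms of projection mentioned in this definition can be sketched as plain functions over a list of events. The sketch is in Python for illustration and is not the projections engine's own language or API; the event tuples and the names `project_state` and `project_stream` are hypothetical.

```python
def project_state(events, initial, handlers):
    """Fold an event stream into a state object.

    events   -- iterable of (event_type, data) pairs
    handlers -- dict: event_type -> function(state, data) returning the new state
    """
    state = initial
    for event_type, data in events:
        handler = handlers.get(event_type)
        if handler is not None:  # event types without a handler are skipped
            state = handler(state, data)
    return state


def project_stream(events, predicate):
    """Derive another event stream by keeping only the matching events."""
    return [event for event in events if predicate(event)]


events = [
    ("Deposited", 100),
    ("Withdrew", 10),
    ("Renamed", "savings"),
    ("Withdrew", 15),
]

balance = project_state(
    events,
    0,
    {
        "Deposited": lambda state, amount: state + amount,
        "Withdrew": lambda state, amount: state - amount,
    },
)
withdrawals = project_stream(events, lambda event: event[0] == "Withdrew")
```

Here `balance` is a state object computed from the stream, while `withdrawals` is itself a new, narrower event stream.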
  • Projections Engine
    1. Built-in query language – JavaScript
    2. Based on Google’s V8
  • Demo: Projections Engine
  • Lessons Learned
  • Immutability is good
  • Pure async is fast, although hard to develop
  • Lockless programming is hard
  • Distributed programming is even harder
  • Debugger is almost useless
  • Unit-testing is a limited tool
  • Verbose logs rock
  • Real-life testing rocks like nothing else
  • Bugs in .NET
      [System.Security.SecuritySafeCritical]
      public virtual void Flush(Boolean flushToDisk)
      {
          // This code is duplicated in Dispose
          if (_handle.IsClosed) __Error.FileNotOpen();
          if (_writePos > 0)
          {
              FlushWrite(false);
              if (flushToDisk)
              {
                  if (!Win32Native.FlushFileBuffers(_handle))
                  {
                      __Error.WinIOError();
                  }
              }
          }
          else if (_readPos < _readLen && CanSeek)
          {
              FlushRead();
          }
          _readPos = 0;
          _readLen = 0;
      }
    *Appears to be fixed in 4.5
  • Bugs in Mono TCP/IP Stack deadlocks Concurrent Stack/Queue XML Serializer File Stream issues General slowness: 20K w/s
  • Disk caches: fake flush and reordering *By default, can be disabled
  • Mind the HDD/SSD difference *It is possible to optimize for both
  • There is a long way to simplicity
  • Experiments: The way it works
  • Further plans
    1. Horizontal scalability
    2. Hosted service
    3. Adding other features
  • Links
    1. – official web site
    2. – source code
    3. – A deep look into The Event Store by Greg Young
    4. – Event Sourcing overview by Martin Fowler
    5. Google <Event Sourcing|Event Store>
  • Got a question? Ask! @victor_haydin