Building FoundationDB


Presented by our co-founder David Rosenthal at a NYC Database Month meetup on October 24th, 2012. Apply for alpha at

Published in: Technology

  1. Building a next-generation database
     david [dot] Twitter: @FoundationDB
  2. Motivation
     Ease of building successful applications:
     • High performance
     • Ease of scaling out
     • Ease of building abstractions
     • Ease of operation
  3. History, Tools, Design, Results
  4. Historical Perspective: 2008 Future NoSQL doesn’t really exist yet
  5. Databases in 2008
     Relational is entrenched; NoSQL emerging with some interesting advantages:
     • Voldemort
     • Cassandra
     • HBase
     …but the fine print about data guarantees doesn’t look so good.
  6. The CAP2008 theorem
     • Brewer: Pick 2 out of 3
     • Werner Vogels (CTO): “Data inconsistency in large-scale reliable distributed systems has to be tolerated … [for performance and to handle faults]”
     • Wrong descriptions all over the web: “The availability property means that the system is ‘online’ and the client of the system can expect to receive a response for its request.”
  7. CAP2008 Conclusions?
     • Scaling requires distributed design
     • Distributed requires high availability
     • Availability requires no C
     So, if we want scalability we have to give up C, the cornerstone of ACID. Right?
  8. Thinking about CAP2008
     • Is a partition worse than a failure?
     • Three computers can’t agree?
     • Keyword: Availability…
     Availability != high availability
  9. Flash forward to CAP2012
     • Brewer: “Why ‘2 of 3’ is misleading”
     • Brewer: “CAP prohibits … perfect availability”
     • Vogels: “Achieving strict consistency can come at a cost in update or read latency, and may result in lower throughput…”
     • Google (Spanner): “…it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.”
  10. The FoundationDB concept
      • Attack CAP2008 and deliver transactions at NoSQL performance and scale
      • Reduce core to minimal feature set
      • Add features back with higher-level abstractions (“Layers”)
      • Decouple choice of data model and choice of storage technology
  11. FoundationDB
      Database software:
      • Ordered key-value API
      • Scalable
      • Transactional
      • Fault tolerant
      [Diagram labels: Application, Layer, Key-value API]
  12. History, Tools, Design, Results
  13. Engineering pressures
      Engineering challenge | Strategy
      Engineering for extreme reliability and fault tolerance of large clusters under adverse conditions | Simulation
      Many asynchronous communicating processes | Erlang?
      Fast algorithms; efficient I/O | C++
      We need new tools!
  14. First tool: Flow
      • A new programming language
      • Adds actor-model concurrency to C++11
      • New keywords: ACTOR, future, promise, wait, choose, when, streams
      • Flow code -> C++11 code -> binary
      Seriously?
  15. Flow allows…
      • Testability by enabling simulation.
      • Performance by compiling to native code.
      • Easier ACTOR-model coding.
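The slides list Flow's keywords (future, promise, wait, choose, when) without showing Flow source. As a rough analogy only, and not Flow syntax, the sketch below expresses the same ideas with Python's `asyncio`: awaiting a future, fulfilling a promise, and acting on whichever of several pending results arrives first. The function names `delayed_reply`, `first_reply`, and `promise_demo` are hypothetical and chosen for illustration.

```python
import asyncio


async def delayed_reply(name: str, delay: float) -> str:
    """Hypothetical actor: resolves with its own name after `delay` seconds."""
    await asyncio.sleep(delay)  # suspend until the timer future is ready
    return name


async def first_reply() -> str:
    """Wait on two pending results and act on whichever is ready first."""
    fast = asyncio.ensure_future(delayed_reply("fast", 0.01))
    slow = asyncio.ensure_future(delayed_reply("slow", 1.0))
    done, pending = await asyncio.wait(
        {fast, slow}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # drop the branch that lost the race
    await asyncio.gather(*pending, return_exceptions=True)
    return done.pop().result()


async def promise_demo() -> int:
    """A promise/future pair: one side sets the value, the other awaits it."""
    loop = asyncio.get_running_loop()
    promise = loop.create_future()
    loop.call_soon(promise.set_result, 42)  # fulfil the promise on a later tick
    return await promise
```

Calling `asyncio.run(first_reply())` runs both actors on one thread; the slower one is cancelled once the faster one has answered.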
  16. Flow eases development
  17. Flow output
  18. Flow performance
      Joe Armstrong (author of “Programming Erlang”): “Write a ring benchmark. Create N processes in a ring. Send a message round the ring M times so that a total of N * M messages get sent. Time how long this takes for different values of N and M. Write a similar program in some other programming language you are familiar with. Compare the results. Write a blog, and publish the results on the internet!”
  19. Flow performance (N=1000, M=1000)
      • Ruby (using threads): 1990 seconds
      • Ruby (queues): 360 seconds
      • Objective C (using threads): 26 seconds
      • Java (threads): 12 seconds
      • Stackless Python: 1.68 seconds
      • Erlang: 1.09 seconds
      • Google Go: 0.87 seconds
      • Flow: 0.075 seconds
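For readers who want to try the ring benchmark Armstrong describes, here is a minimal sketch using Python's `asyncio` queues. It is not the code behind any of the numbers above; the names `ring_node`, `run_ring`, and `time_ring` are hypothetical, and timings from it are not comparable with the slide's figures.

```python
import asyncio
import time


async def ring_node(inbox: asyncio.Queue, outbox: asyncio.Queue, hops: int) -> int:
    """Forward each received token to the next node, `hops` times."""
    sent = 0
    for _ in range(hops):
        token = await inbox.get()
        await outbox.put(token)
        sent += 1
    return sent


async def run_ring(n: int, m: int) -> int:
    """Build a ring of n nodes and send one token around it m times."""
    queues = [asyncio.Queue() for _ in range(n)]
    nodes = [
        asyncio.ensure_future(ring_node(queues[i], queues[(i + 1) % n], m))
        for i in range(n)
    ]
    await queues[0].put("token")  # inject the single token into the ring
    counts = await asyncio.gather(*nodes)
    return sum(counts)  # total messages forwarded: n * m


def time_ring(n: int, m: int):
    """Return (messages_sent, elapsed_seconds) for one run."""
    start = time.perf_counter()
    total = asyncio.run(run_ring(n, m))
    return total, time.perf_counter() - start
```

`time_ring(1000, 1000)` reproduces the slide's parameters; smaller values are enough to check that exactly N * M messages are forwarded.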
  20. Second Tool: Lithium
      • Enabled by Flow
      • Simulate physical interfaces
      • Simulate failure modes
      • Deterministic simulation of entire system
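The slides do not show how Lithium works internally. The general idea of deterministic simulation can be illustrated with a toy sketch: every source of nondeterminism (send times, latency, injected faults) is drawn from one seeded random number generator, and events are processed in virtual-time order, so a given seed always replays the same run. The function `simulate` and its parameters are hypothetical.

```python
import heapq
import random


def simulate(seed: int, messages: int = 20, drop_rate: float = 0.2) -> list:
    """Run a toy network simulation; all randomness comes from `seed`."""
    rng = random.Random(seed)  # the single source of randomness
    events = []  # min-heap of (virtual_send_time, tie_breaker, message_id)
    for msg_id in range(messages):
        send_time = rng.uniform(0.0, 10.0)
        heapq.heappush(events, (send_time, msg_id, msg_id))

    trace = []
    while events:
        now, _, msg_id = heapq.heappop(events)  # next event in virtual time
        if rng.random() < drop_rate:
            trace.append((round(now, 6), msg_id, "dropped"))  # injected fault
        else:
            latency = rng.uniform(0.001, 0.5)  # simulated network latency
            trace.append((round(now + latency, 6), msg_id, "delivered"))
    return trace
```

Because the trace depends only on the seed, a failing run can be replayed exactly by rerunning with the same seed, which is what makes simulated failures debuggable.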
  21. Testability: Quicksand
  22. Third tool: Magnesium
  23. History, Tools, Design, Results
  24. Traditional approaches
      • Glue together smaller transactional systems
        – Two-phase commit (X/Open XA)
        – Paxos
      • Build on a distributed file system
        – BigTable/HBase
  25. The FoundationDB approach
      • Deconstruct a traditional transactional database and scale the individual parts
      • Each part must also be fault tolerant
      • The parts:
        – Accept requests
        – Check for transaction conflicts
        – Log transactions
        – Store data
  26. Key insight
      Checking for transaction conflicts:
      • Problem is scalable
      • When highly optimized, is a small percentage of the total work
      • Is tricky to make fault tolerant…
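The slides do not describe FoundationDB's conflict-checking algorithm. As a generic illustration of what "checking for transaction conflicts" can mean, the sketch below implements a simple optimistic check: a transaction commits only if none of the keys it read were written after its read version. The class `ConflictChecker` and its methods are hypothetical and are not FoundationDB's API.

```python
class ConflictChecker:
    """Toy resolver: tracks the version at which each key was last written."""

    def __init__(self):
        self.version = 0
        self.last_write = {}  # key -> commit version of its most recent write

    def begin(self) -> int:
        """Return the read version for a new transaction."""
        return self.version

    def try_commit(self, read_version: int, read_keys, write_keys) -> bool:
        """Commit unless a key the transaction read changed after it began."""
        for key in read_keys:
            if self.last_write.get(key, 0) > read_version:
                return False  # conflict: the caller should retry
        self.version += 1  # assign the next commit version
        for key in write_keys:
            self.last_write[key] = self.version
        return True
```

Note that the check touches only key names and version numbers, never the stored values, which is one reason such a check can be a small share of the total work.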
  27. Training montage
      • Paxos coordination algorithm
      • Multi-versioned data structures
      • SSD optimizations
      • Application-managed page cache
      • Prioritization deeply integrated
      • Control theory for queue sizes
      • Testing, testing, testing
  28. History, Tools, Design, Results
  29. Did we reach our big goals?
      • High performance
      • Ease of scaling out
      • Ease of building abstractions
      • Ease of operation
  30. High performance
      FoundationDB delivers performance exceeding other NoSQL databases, but with transactions!
  31. Ease of scaling out
      • Add and remove nodes on-the-fly
      • Single key-space with global transactions
      • Validated to 96 cores, 48 SSDs
  32. Ease of building abstractions
      • Transactions enable abstraction
      • Abstractions very hard to build on non-transactional systems
      • Ordered data model for performance
      Abstractions built on a scalable, fault tolerant, transactional foundation inherit those properties.
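To make the "layer" idea concrete, here is a minimal sketch of an indexed table built on an ordered key-value store. The store is a toy in-memory stand-in, not the FoundationDB client; `OrderedKV`, `IndexedTable`, and the key layout are assumptions made for illustration. The ordered keys are what make the index lookup a single range read.

```python
import bisect


class OrderedKV:
    """Toy in-memory ordered key-value store with tuple keys."""

    def __init__(self):
        self._keys = []  # keys kept in sorted order
        self._data = {}

    def set(self, key: tuple, value) -> None:
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get_range(self, begin: tuple, end: tuple) -> list:
        """Return (key, value) pairs with begin <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, begin)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._data[k]) for k in self._keys[lo:hi]]


class IndexedTable:
    """Hypothetical layer: rows plus a secondary index on the city column."""

    def __init__(self, kv: OrderedKV):
        self.kv = kv

    def insert(self, row_id: int, city: str) -> None:
        # On a transactional store these two writes would commit atomically,
        # so the index could never disagree with the rows.
        self.kv.set(("row", row_id), city)
        self.kv.set(("idx", city, row_id), "")

    def ids_in_city(self, city: str) -> list:
        # All index entries for one city are adjacent in key order.
        pairs = self.kv.get_range(("idx", city), ("idx", city + "\x00"))
        return [key[2] for key, _ in pairs]
```

Without transactions, a crash between the two writes in `insert` would leave a row with no index entry, which is the kind of problem the slide alludes to when it calls abstractions hard to build on non-transactional systems.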
  33. Examples of “ease”
      • SQL database in one day
      • Indexed table layer (3 days * 1 intern)
      • Fractal spatial index in 200 lines:
  34. Ease of operation
      • Automatic data partitioning/replication
      • Highly fault-tolerant
      • Minimal management
      Try to break it yourself!
  35. Conclusion
      • Our mission is to solve the problem of state management so that developers can focus on building their applications
      • 3+ years in the making, now ready for your applications
      • Bindings for C, Python, JVM, Node.js, Ruby
  36. Free at
  37. Join our Alpha community
  38. Building a next-generation database
      david [dot] Twitter: @FoundationDB