Arakoon: A distributed consistent key-value store


Talk at the OCaml Users and Developers (OUD) workshop during ICFP 2012


  1. Arakoon: A distributed consistent key-value store
     - Romain Slootmaekers, Nicolas Trangez
     - Incubaid BVBA
     - {romain,nicolas}
     - Twitter: @incubaid
     - Team Blog:
     - September 14, 2012
  2. Introduction
     - Researchers at Incubaid
     - Incubaid is a technology incubator, active in datacenter & cloud computing
     - Prior exits through Terremark, Telenet (Belgian telco), Veritas/Symantec, Sun Microsystems, ...
     - Talk about general use of FP in our companies tomorrow at CUFP
  3. Arakoon
     - Distributed, consistent, persistent key-value store
     - OCaml using Lwt
     - Multi-Paxos consensus protocol implementation
     - Guaranteed consistency across cluster nodes
     - Available as long as a majority (N/2 + 1) of members is reachable
     - Handles message loss or duplication, split-brain networking, ...
     - TokyoCabinet backend
     - Open Source (AGPL-3), see
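The majority rule on this slide (N/2 + 1, with integer division) can be made concrete with a small sketch. The function names `quorum` and `tolerated_failures` are hypothetical helpers for illustration and are not taken from the Arakoon code base.

```ocaml
(* Hypothetical helper: the smallest number of reachable members that
   forms a strict majority of an n-node cluster (integer division). *)
let quorum n = n / 2 + 1

(* Hypothetical helper: how many members may be unreachable while a
   majority can still be formed. *)
let tolerated_failures n = n - quorum n

let () =
  List.iter
    (fun n ->
      Printf.printf "nodes=%d quorum=%d tolerated_failures=%d\n"
        n (quorum n) (tolerated_failures n))
    [1; 3; 4; 5]
```

Under this rule a 4-node cluster tolerates no more failures than a 3-node one, which is one common reason to prefer odd cluster sizes.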
  4. Arakoon Features
     Arakoon feature-set goes beyond basic key-value CRUD interface:
     - Range & prefix lookups on keys (incl. paging)
     - Transactional sequences
     - Test-and-set / CAS
     - Server-side extensions, “user functions”
     - Simple binary protocol with clients in OCaml, C, Python, PHP
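To illustrate what prefix lookups and test-and-set mean semantically, here is a minimal in-memory model built on the standard library `Map`. It is only a sketch: the names `test_and_set` and `prefix_keys`, their signatures, and the choice to return the previous value are assumptions made for this example, not Arakoon's client API or wire protocol.

```ocaml
module SMap = Map.Make (String)

(* In-memory stand-in for the store; the real system is distributed
   and persistent. *)
type store = string SMap.t

(* Hypothetical test-and-set: if the current value of [key] equals
   [expected], replace it with [wanted] (None deletes the key).
   Returns the value seen before the call plus the resulting store. *)
let test_and_set key expected wanted (s : store) =
  let current = SMap.find_opt key s in
  if current = expected then
    let s' =
      match wanted with
      | Some v -> SMap.add key v s
      | None -> SMap.remove key s
    in
    (current, s')
  else (current, s)

let has_prefix prefix key =
  String.length key >= String.length prefix
  && String.sub key 0 (String.length prefix) = prefix

(* Hypothetical prefix lookup: all keys starting with [prefix],
   in ascending key order. *)
let prefix_keys prefix (s : store) =
  SMap.fold (fun k _ acc -> if has_prefix prefix k then k :: acc else acc) s []
  |> List.rev
```

A caller can detect whether the swap took place by comparing the returned previous value with the expected one.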
  5. Arakoon Deployments
     - X000 deployments in several products by different companies
     - Created primarily to store metadata of large-scale storage
     - Also used as “NoSQL” store for IAAS platforms
  6. Baardskeerder
     - Append-only B-tree(ish) database
     - OCaml
     - Replace TokyoCabinet in Arakoon 2.x cycle
     - “SSD-friendly”
     - LGPL-3,
  7. Why OCaml?
     - Short prototype-to-production cycle
     - FP suits problem domain
     - Availability of cooperative threads (Lwt)
       - “Async” was released after project incubation
     - Fast compiler
     - Native binaries, good performance
     - Prior experience in Amplidata storage product (CUFP talk tomorrow)
  8. Experiences
     - Fairly easy to get contributors up to speed
       - ... but does require some mental effort
     - Hard (if possible at all) to hire people with prior OCaml knowledge
       - ... but not strictly necessary
     - Got to stable version within a couple of months (used in production deployments)
     - Fixing bugs or adding features didn’t introduce more bugs
       - ... yet this is mostly thanks to an extensive “system” testsuite
     - Lots of “bugs” caused by deployment issues or sysadmin interventions
  9. Lessons learned
     - Don’t let “developer familiarity” influence app design: using OCaml OO features doesn’t make it easier to contribute (seems to add confusion)
     - Providing a single script to bootstrap an OCaml environment + dependencies is a big plus
     - Since “monadic threading” is new to most contributors, using the Lwt syntax extension might be a bad idea (as experienced prior to Arakoon incubation)
     - Don’t let “RealWorld# IO” creep into the parts which could/should be kept pure
       - Most likely the #1 mistake in Arakoon 1.x
       - Fixed in 2.x: Paxos state-machine is pure
       - Helps in testing & manual correctness validation
     - AGPL-3 might not be an ideal license, yet driven by business-needs
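The point about keeping the state machine pure can be sketched as a transition function that takes a state and an event and returns the next state together with a list of actions for an impure driver to execute. The types below are a deliberately tiny, hypothetical model (counting promises until a quorum is reached); they are not Arakoon's actual Paxos states, events, or actions.

```ocaml
(* Hypothetical, heavily simplified states of a node trying to become leader. *)
type state =
  | Collecting_promises of { needed : int; received : int }
  | Leader

(* Hypothetical inputs fed in by the impure driver (network, timers). *)
type event =
  | Promise_received
  | Timeout

(* Side effects are only described here; the driver performs them. *)
type action =
  | Broadcast_accept
  | Restart_election

(* Pure transition function: no IO, no threads, no mutable state. *)
let step state event =
  match state, event with
  | Collecting_promises { needed; received }, Promise_received ->
    let received = received + 1 in
    if received >= needed then (Leader, [Broadcast_accept])
    else (Collecting_promises { needed; received }, [])
  | Collecting_promises { needed; _ }, Timeout ->
    (Collecting_promises { needed; received = 0 }, [Restart_election])
  | Leader, _ -> (Leader, [])
```

Because `step` is an ordinary function from values to values, a test can feed it any sequence of events and compare the results directly, without a network or a scheduler. That is the testing benefit the slide refers to.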
  10. Experiences: OCaml
      - Language not too hard to grok
      - Stable & fast compiler
      - Non-trivial but not impossible to debug binaries using GDB & read/interpret assembly, if required
        - on-segmentation-faults-stack-overflows-gdb-and-ocaml
      - Stable runtime, except e.g. select fdset bug causing runtime memory corruption (print_bug_page.php?bug_id=5563)
      - Memory (leak) issues hard to pinpoint and debug
      - Limited standard library, many “basic” functions need ad-hoc implementation
      - Standard library and Lwt provide lots of bindings to low-level procedures for system-level programming
      - Not “open”ing some module at compile time can result in segfaults at runtime?!? (cgi-bin/bugreport.cgi?bug=602170)
  11. Experiences: Infrastructure
      - ocamlbuild works OK for the basic cases, but once you need, you’re on your own.
        - Ever tried including C++ code in an ocamlbuild setup?
      - Not convinced oasis improves things (“Oasis is the new Maven”)
      - Preliminary experiments with OPAM on Monday were promising/encouraging!
  12. Experiences: Lwt - The upside
      - OK to work with, API-wise
      - Lots of built-in functions
      - Active maintainers on mailing list, bug reports are handled quickly
      - Documentation OK’ish - unless it’s completely missing (e.g. Lwt_pqueue)
  13. Experiences: Lwt - The downside
      - Irregular and unpredictable release schedule
      - Regressions!
        - Native binary stack-overflow at runtime: not reproducible, only corrupt core dumps available
        - Took > 2 man-weeks to pinpoint
        - Reduced to 2 very small test-cases both exposing the bug in different ways
        - Fixed using work-around in Arakoon, reported to Lwt list including tests, quickly fixed in Lwt-darcs and next release
        - 2 releases afterwards, the exact same issue was re-introduced, both test-cases failing again
  14. Experiences: Lwt - The downside
      - Significant refactoring/re-implementation changes in-between releases
        - Hard to test
        - Hard to validate correctness
      - (Performance impact: Baardskeerder IO abstracted, can use “Unix” or “Lwt_unix” (or others). Lwt-backed benchmark is 20x slower than sync version)
      - src/unix/lwt_libev_stubs.c:

            /* Extract the event loop now.

               It seems to crash if we don't do that (??). */
            struct ev_loop *loop = Ev_loop_val(val_loop);
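The remark that Baardskeerder's IO is abstracted, so that a synchronous or an Lwt-based backend can be plugged in, suggests a functor over a small IO signature. The sketch below shows one way such an abstraction could look using only the standard library. The signature `IO`, the modules `Sync_io` and `Make_log`, and the `append` operation are hypothetical and are not Baardskeerder's actual interface. An Lwt-backed instance would presumably implement the same signature with `'a Lwt.t` as the carrier type.

```ocaml
(* Hypothetical IO signature: a monad plus one storage primitive. *)
module type IO = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : 'a t -> ('a -> 'b t) -> 'b t
  (* Append a string and yield the offset at which it was written. *)
  val append : Buffer.t -> string -> int t
end

(* Synchronous instance: the "monad" is the identity, so every
   operation completes immediately. A Buffer stands in for a file. *)
module Sync_io : IO with type 'a t = 'a = struct
  type 'a t = 'a
  let return x = x
  let bind x f = f x
  let append buf s =
    let off = Buffer.length buf in
    Buffer.add_string buf s;
    off
end

(* Storage logic written once against the signature, independent of
   whether the backend is blocking or cooperative. *)
module Make_log (Io : IO) = struct
  let append_all buf entries =
    let rec loop acc = function
      | [] -> Io.return (List.rev acc)
      | e :: rest ->
        Io.bind (Io.append buf e) (fun off -> loop (off :: acc) rest)
    in
    loop [] entries
end

module Sync_log = Make_log (Sync_io)
```

With this layout the same `append_all` code can be benchmarked against different backends, which is the kind of comparison the slide's 20x figure refers to.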
  15. Conclusion
      - Overall: positive experience
      - Build infrastructure could use some love
      - Lwt is a great project, but releng/testing/(docs) could improve
      - Convincing others a non-standard language like OCaml is/was a good choice for Arakoon, especially to non-coders, can be hard, but it’s worth the effort
      - Still unclear how to get more contributors & users on board. Ideas welcome!
      - Questions?