Implementing data center to data center replication for a distributed database

Implementing data center to data center replication for a distributed database by Ewout Prangsma

ArangoDB is a scalable, distributed multi-model database. However, for this talk, it is not necessary to know what this means. Rather the only crucial fact is that it is distributed and written in C++.

Before you stop reading: this talk is about a Go success story. We had to implement resilient data center to data center (DC2DC) replication for ArangoDB clusters from scratch within 6 weeks (plus some time for testing and debugging). To do so, we built upon ArangoDB's HTTP-based API for asynchronous replication, the existing Go driver, the fault-tolerant, scalable message queue system Kafka, many existing Go libraries, and Go's fantastic capabilities for parallelism, communication and data manipulation, and we pulled the task off. This talk is the story of this project with its many challenges and successes, and it ends with a surprising revelation about which of the above we did not actually need in the end.




  1. One Engine, one Query Language. Multiple Data Models. (Copyright © ArangoDB GmbH, 2018 - Confidential)
  2. ArangoDB in Large Scale Enterprise Environments
     Ewout Prangsma, @ewoutp
  3. Contents
     ● Introduction to ArangoDB
     ● ArangoDB Revolutions
       ○ Single to Cluster
       ○ Cluster to Datacenter
       ○ Orchestrated Deployments
     ● Demo
     ● Questions
  4. Introduction to ArangoDB
     ► Native multi-model database
     ► Documents, Graphs, Key-Value
     ► One Query Language for all of them
     ► Some stats:
       ► Almost 5M downloads
       ► Many community contributors
       ► > 40K commits
       ► Fortune 10 & F500 customers
       ► 5600+ Stars on GH
  5. Single Model World
     Documents - JSON:
       { "type": "pants", "waist": 32, "length": 34, "color": "blue", "material": "cotton" }
       { "type": "television", "diagonal size": 46, "hdmi inputs": 3, "wall mountable": true, "built-in tuner": true, "dynamic contrast": "50,000:1", "Resolution": "1920x1080" }
     Graphs
     Key Values:
       K => V (repeated)
     Relational:
       ID    Branch     Type              Count  SalesAssistant  Product
       2313  Boston     Shoes             36     324             756
       2654  Worcester  Shoes             39     354             567
       6546  Weymouth   ShoeCareProducts  127    654             445

       ID   Color  Size
       756  red    20-28
       567  blue   30-41
       787  brown  36-48

       ID   Type   LeatherType
       445  Cream  Velour
       676  Brush  Smooth leather
       987  Spray  all
     Columnar:
       Row Key | Column A | Column B | Column C | Column n
       Row Key | Column A | Column D | Column F | Column T | Column n
       Row Key | Column D | Column F | Column U | Column n
       Row Key | Column B | Column F | Column T | Column U | Column n
     Time Series:
       { "_inital_time": 1252546736436, "event": "start successful" }
       { "event": "start subsystem", "name": "explorer" }
       { "_delta": 124 }
       { "event": "start subsystem", "type": "webserver" }
       { "_delta": 23 }
  6. Native Multi-Model Approach
     Graphs
     Documents - JSON:
       { "type": "pants", "waist": 32, "length": 34, "color": "blue", "material": "cotton" }
       { "type": "television", "diagonal size": 46, "hdmi inputs": 3, "wall mountable": true, "built-in tuner": true, "dynamic contrast": "50,000:1", "Resolution": "1920x1080" }
     Key Values:
       K => V (repeated)
     One Engine. One Query Language. Multiple Data Models.
  7. Tame Complexity
     ► Different Query Languages to Learn -> One Query Language to Learn
     ► Many Databases to Administer -> One Database to Administer
     ► Complex Code Base -> Streamlined Code Base
     ► Increased Costs -> Lower Total Cost of Ownership
     ► Productivity Hindered -> Improved Productivity
     Diagram labels (shown on both sides): Shopping Cart, Product Catalog, Recommendation, Transactions
  8. AQL - A Query Language That Feels like Coding
     ‣ For a native multi-model database, a common query language for all data models is crucial
     ‣ AQL aims to be human-readable
     ‣ Same language for all clients, no matter what programming language people use
     ‣ Easy to understand for anyone with a SQL background
     SQL:
       SELECT customers.name
       FROM customers
       WHERE customers.id = ? AND customers.isActive
     AQL:
       FOR c IN customers
         FILTER c.id == @id AND c.isActive
         RETURN c.name
     SQL:
       SELECT customers.name, orders.date, orders.amount
       FROM customers
       JOIN orders ON customers.customer_id = orders.customer_id
       WHERE customers.id = ?
     AQL:
       FOR c IN customers
         FILTER c.customer_id == @id
         RETURN {
           customer: c.name,
           orders: (
             FOR o IN orders
               FILTER o.customer_id == c.customer_id
               RETURN { date: o.date, amount: o.amount }
           )
         }
  9. Recent years: ArangoDB is attracting Enterprise customers
  10. Revolution 1: ArangoDB Cluster
     ► Why
       ► Scale beyond single machine
       ► High availability
       ► Tolerate single machine failures
     ► What is it
       ► Data in every collection is sharded
       ► Shards “live” in a dbserver
       ► Coordinators coordinate data modification & retrieval
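The cluster slide above says that collections are sharded, shards live in dbservers, and coordinators route requests. As a rough illustration only, the Go sketch below shows one way a coordinator could map a document key to a shard and a shard to a dbserver. The FNV hash, the round-robin placement, and the names `shardFor` and `serverFor` are assumptions made for this example; the slides do not describe ArangoDB's actual algorithm.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a document key to one of numShards shards using an
// FNV-1a hash. Hypothetical helper; not ArangoDB's real hash function.
func shardFor(key string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(numShards))
}

// serverFor returns the dbserver that hosts a shard, assuming a simple
// round-robin placement of shards over the given servers.
func serverFor(shard int, dbservers []string) string {
	return dbservers[shard%len(dbservers)]
}

func main() {
	dbservers := []string{"dbserver-1", "dbserver-2", "dbserver-3"}
	const numShards = 9

	// A coordinator would do this lookup for every read and write.
	for _, key := range []string{"customers/1001", "customers/1002", "orders/77"} {
		shard := shardFor(key, numShards)
		fmt.Printf("%s -> shard %d on %s\n", key, shard, serverFor(shard, dbservers))
	}
}
```

Because the mapping is deterministic, any coordinator computes the same location for a key without consulting the others.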
  11. 2017: ArangoDB is attracting global Enterprise customers
  12. Revolution 2: ArangoDB at Global Scale
     ► ArangoDB cluster has grown up
     ► Customers want multiple clusters
       ► Geographic reasons
       ► Disaster recovery reasons
       ► Functional reasons
     ► Customers want to share / replicate data between clusters
       ► Some latency is acceptable
  13. ArangoDB Datacenter 2 Datacenter Replication
     ► Summer of 2017: DC2DC as a concept was born
     ► Lead customer was identified
     ► How to build it … quickly … very quickly
     ► What did we have
  14. DC2DC Requirements
     ► Synchronize an entire cluster
       ► With all its data
       ► With all its structure & metadata
     ► Don’t delay local writes
       ► Asynchronous replication
     ► Ready for testing by end of October 2017
  15. DC2DC Building blocks
     ► Shard synchronization protocol
     ► Drivers for various programming languages
     ► Large C++ codebase
       ► Many very experienced C++ programmers
     ► Small Go codebase
       ► Few very experienced Go programmers
  16. DC2DC Decision time
     ► Which language
       ► C++ was well known, but development in it was slow
       ► Go was known by fewer people, but suitable
       ► Something else
     ► 2018 on the horizon -> Security first
       ► TLS all the way
       ► Authentication, yes, always
     ► Given our TLS experience in C++, this favored Go
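To make the "TLS all the way" and "authentication, always" points concrete, here is a minimal Go sketch using only the standard library. It starts an HTTPS test server and rejects any request that lacks the expected bearer token. The token scheme and the names `requireToken`, `newSecureServer` and `statusFor` are assumptions for illustration; the slides do not say how ArangoSync authenticates.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requireToken wraps a handler and rejects requests that do not carry
// the expected bearer token (hypothetical scheme for this sketch).
func requireToken(token string, next http.Handler) http.Handler {
	want := []byte("Bearer " + token)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := []byte(r.Header.Get("Authorization"))
		// Constant-time comparison avoids leaking the token via timing.
		if subtle.ConstantTimeCompare(got, want) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newSecureServer starts an HTTPS test server with a self-signed
// certificate; srv.Client() returns a client that trusts it.
func newSecureServer(token string) *httptest.Server {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	return httptest.NewTLSServer(requireToken(token, ok))
}

// statusFor sends a GET request, optionally with a token, and returns
// the HTTP status code.
func statusFor(client *http.Client, url, token string) (int, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	srv := newSecureServer("secret")
	defer srv.Close()

	withToken, err := statusFor(srv.Client(), srv.URL, "secret")
	if err != nil {
		panic(err)
	}
	withoutToken, err := statusFor(srv.Client(), srv.URL, "")
	if err != nil {
		panic(err)
	}
	fmt.Println(withToken, withoutToken) // prints: 200 401
}
```

In a production setup the certificate would come from a real CA or a private one rather than from `httptest`, and both sides could additionally verify client certificates.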
  17. DC2DC Initial Design Draft
     [Diagram: Datacenter A and Datacenter B each contain an ArangoDB Cluster, ArangoSync and Kafka. Connections are labeled “Control” and “Bulk data”.]
  18. DC2DC Components
     ► ArangoDB Cluster
       ► No changes needed … so we hoped
     ► ArangoSync
       ► “All” Go standalone binary
       ► Implements Master & Worker role
     ► Kafka (& Zookeeper)
       ► Message queue
       ► Buffer for downtime, link failure etc.
  19. Development team
     ► 1 Go programmer
     ► 1 Go reviewer
     ► 1 Lead customer
       ► VERY helpful
     ► Later:
       ► 2 dedicated testers
       ► Various C++ programmers for cluster additions
  20. Development challenges (1)
     ► Time pressure
     ► Missing database APIs
       ► Identify missing API
       ► Develop API in database (build, review, merge)
       ► Deliver test version
       ► Repeat for next API
     ► Automated testing
       ► Reproducible cluster setup takes time (we needed 2 each time)
       ► Testing failure scenarios is hard
  21. Development challenges (2)
     ► Kafka & Go
       ► Several drivers exist
       ► Either pure Go, but not 100% stable and/or recent
       ► Or the Confluent Go-C++ hybrid: stable, but C++
  22. Static Go & C++ binary
     ► Lots of experience with dynamically linked C++ binaries
       ► Long build times, distribution dependent
     ► Go binary used to be statically linked
       ► Now it was not.
     ► Solution:
       ► Build Confluent Kafka client library
         ► In Alpine docker image
         ► With musl libc
       ► Build combined binary in dedicated build container
     ► Side effect: Server binary is Linux only
  23. Automated testing (1)
     ► We used:
       ► Lots of “small” cloud machines (e.g. Scaleway)
       ► Ansible for provisioning machines
       ► Jenkins for triggering & running tests
     ► Lots of integration tests (go test …)
     ► Too few unit tests
     ► Split tests up:
       ► Short: Run for every PR
       ► Long: Nightly, taking hours
  24. Automated testing (2)
     ► Debug trace of all messages proved a lifesaver
     ► Synchronizing clocks on machines also helps…
     ► Monitoring (Prometheus) can be a debug tool
     ► Cloud machines spread all over the EU show real-world problems
  25. November 2017 - Success ...
     ► We made the “end of October” deadline
     ► We underestimated testing effort
       ► kill -STOP is nasty
     ► Stable release on Jan 5, 2018
  26. But ...
     ► It was complex
       ► Has lots of moving parts
       ► Significant outside dependency (Kafka)
       ► Triggers additional validation & certification at customer
  27. Simplification was needed!
     ► Let’s drop Kafka
     ► Replace with in-memory message queue
     ► Are you out of your mind?
  28. Direct Message Queue was born
     ► Direct message queue:
       ► In-memory queue of messages
       ► Send over “open” HTTPS connection
     ► In case of failure, re-synchronize
       ► Shard synchronization protocol is quick
       ► Only send over missing / modified data
  29. DC2DC Direct Message Queue Design Draft
     [Diagram: Datacenter A and Datacenter B each contain an ArangoDB Cluster, an ArangoSync Master and an ArangoSync Worker. The connection is labeled “Control & Data”.]
  30. Direct message queue benefits
     ► Fewer moving parts
       ► No more Kafka & Zookeeper
     ► HTTPS for all
       ► No protocol distinction, easy operations
     ► Server can run on “any” OS again
  31. DC2DC Conclusions
     ► The project was amazing
       ► Fast, furious, late
     ► Lead customer brings
       ► Focus & dedication
       ► Invaluable input
     ► Choosing Go was the right choice
       ► Good libraries (except Kafka)
       ► Very good TLS & Certificate support
       ► Quick to develop
  32. 2018: Kubernetes is everywhere
  33. Revolution 3: ArangoDB Deployment
     ► Deployment options until early 2018
       ► Per machine, e.g. systemd
       ► Per machine but easier: ArangoDB Starter
       ► Managed cluster: DC/OS
     ► Clearly Kubernetes is missing
  34. Kubernetes Deployment: Why
     ► Kubernetes is “hot”
       ► and winning...
     ► Customers are asking for it
       ► Not always for the right reasons :-)
     ► It simplifies deployment & operations
  35. Kubernetes Deployment: How
     ► Kubernetes is easy for stateless apps
     ► But a database is … not stateless
     ► StatefulSets to the rescue
       ► Or not...
  36. StatefulSets
     ► Good because
       ► It is really stateful
     ► Not good because
       ► A database is complex
       ► A database has special upgrade requirements
     ► Additionally
       ► Writing lots of YAML is no fun
  37. Simplification is needed
     ► “Simple” cluster using StatefulSet takes more than 500 lines!
     ► Upgrade process is a nightmare
     ► Does generating YAML help?
       ► Yes ... initially
       ► No ... for scaling, upgrading etc.
  38. Kube-ArangoDB: Introduction
     ► Introducing Kube-ArangoDB
       ► An operator for running ArangoDB deployments
       ► A CustomResource
     apiVersion: "database.arangodb.com/v1alpha"
     kind: "ArangoDeployment"
     metadata:
       name: "example-simple-cluster"
     spec:
       mode: Cluster
  39. Kube-ArangoDB: Scaling
     ► Modify the deployment resource
     ► kubectl apply ...
     apiVersion: "database.arangodb.com/v1alpha"
     kind: "ArangoDeployment"
     metadata:
       name: "example-simple-cluster"
     spec:
       mode: Cluster
       dbservers:
         count: 7
  40. Kube-ArangoDB: Upgrading
     ► Modify the deployment resource
     ► kubectl apply ...
     apiVersion: "database.arangodb.com/v1alpha"
     kind: "ArangoDeployment"
     metadata:
       name: "example-simple-cluster"
     spec:
       mode: Cluster
       image: arangodb/arangodb:3.3.9
  41. Kube-ArangoDB: Features
     ► Supports all deployment modes
       ► Single, Active Failover, Cluster
     ► Automatic TLS
     ► Automatic Authentication
     ► Locally attached persistent storage with custom resource
     ► DC2DC support
       ► sync.enabled: true
     ► Supports Kubernetes Federation
       ► Easy configuration with additional custom resource
  42. Kube-ArangoDB: Status
     ► Deployment operator
       ► Not ready for production
       ► But … why not try it anyway
     ► DC2DC Support
       ► Not released yet
     ► Deployment Replication operator
       ► Brand new
  43. DEMO
  44. Closing words
     ● ArangoDB feature set available everywhere
       ○ Small machine … Cluster … Datacenter … Multiple DCs
     ● Deploy easily in Kubernetes
     ● Mixing languages & developers can be done …
     ● Distributed systems are fun!
  45. Links
     ► ArangoDB: https://www.arangodb.com
     ► Documentation: https://docs.arangodb.com
     ► Github: https://github.com/arangodb/ArangoDB (please star us!)
     ► Twitter: @arangodb (please follow us!)
  46. Thank you!