Mitchell Hashimoto: Building Robust Systems w/ Service Discovery & Configuration

Building Robust Systems with Service Discovery and Configuration

There is no scenario in the future where we have fewer servers. Whether you consider a server a physical machine, a virtual machine, or even a container, the number of each is growing at an extremely fast rate. It is becoming increasingly important in this view of the world to build robust systems that can ideally run anywhere, recover from crashes, distribute load, etc.

In this talk, I discuss these problems and how a powerful system for service discovery and configuration can get you a fairly robust system without additional modifications. Equipped with this knowledge, it becomes much easier to imagine migrating legacy and new infrastructures over to this modern world of many commodity machines.

https://twitter.com/mitchellh
http://mitchellh.com

Published in: Technology

  1. Building Robust Systems With Consul
  2. I’m Mitchell Hashimoto Also known as @mitchellh
  3. HashiCorp Towards a Software Managed Datacenter
  4. Vagrant http://www.vagrantup.com Packer http://www.packer.io SERF http://www.serfdom.io Consul http://www.consul.io
  5. Consul
  6. Take a Step Back Taking a look at the big picture.
  7. Node Service Service Service
  8. Hypervisor Node Node Node S S S S S S S S S
  9. Hypervisor Node Node Node Container S S Container S Container S S S S S S
  10. Hypervisor Node Node Node Container S S Container S Container S S S S S S
  11. Modern Ops More everything, more problems.
  12. • Where is service foo? • Is service foo healthy/available? • What is service foo’s configuration? • Where is the service foo leader?
  13. Meta: What happens when the thing that answers these questions is unavailable?
  14. Robust Systems Stem from the ability to answer these questions.
  15. • Start services in any order • Destroy services with confidence • Restart servers safely • Reconfigure services easily Practical Goals
  16. • Where is service foo? • Is service foo healthy/available? • What is service foo’s configuration? • Where is the service foo leader?
  17. Where is service foo? Maybe here: 127.0.0.1 Maybe close: 10.0.1.35 Maybe there: foo.foohost.com
  18. Is service foo healthy/available? Yes: Great! No: Avoid or handle gracefully.
  19. What is service foo’s configuration? Access information, supported features, enabled/disabled.
  20. What is my configuration? Expect it to be modifiable.
  21. Where is the service foo leader or best choice? Locality, master/slave, versions.
  22. Meta: Is the thing answering these questions stable/available? Critical infrastructure component, you want “yes” as often as possible.
  23. Robust! Can find services, can avoid and handle unhealthy services, can be configured externally, and can trust that it can retrieve all of this information.
  24. • Start services in any order • Destroy services with confidence • Restart servers safely • Reconfigure services easily Practical Goals
  25. Consul
  26. Solution Attempts In a world… before Consul...
  27. Manual/Hardcoded • Doesn’t scale with services/nodes • Not resilient to failures • Localized visibility/auditability • Manual locality of services
  28. Config Mgmt Problem • Slow to react to changes • Not resilient to failures • Not really configurable by developers • Locality, monitoring, etc. manual
  29. LB Fronted Services • Introduces different SPOF • How does LB find service addresses/configure? • Solves some problems, though.
  30. ZooKeeper • Complicated • Heavy clients • Building block, very manual
  31. Consul
  32. Service Discovery Where is service foo?
  33. Service Discovery
      $ dig web-frontend.service.consul. +short
      10.0.3.89
      10.0.1.46
      $ curl http://localhost:8500/v1/catalog/service/web-frontend
      [{ "Node": "node-e818f1", "Address": "10.0.3.89", "ServiceID": "web-frontend", … }]
  34. Service Discovery • DNS is legacy-friendly. No application changes required. • HTTP returns rich metadata.
  35. Failure Detection Is service foo healthy/available?
  36. Failure Detection
  37. Failure Detection • DNS won’t return non-healthy services or nodes. • HTTP has endpoints to list health state of catalog.
  38. Key/Value Storage What is the config of service foo?
  39. Key/Value Storage
      $ curl -X PUT -d 'bar' http://localhost:8500/v1/kv/foo
      true
      $ curl 'http://localhost:8500/v1/kv/foo?raw'
      bar
  40. Key/Value Storage • Highly available storage of configuration. • Turn knobs without big configuration management process.
  41. Multi-Datacenter
  42. Multi-Datacenter
      $ dig web-frontend.service.singapore.consul. +short
      10.3.3.33
      10.3.1.18
      $ dig web-frontend.service.germany.consul. +short
      10.7.3.41
      10.7.1.76
  43. Multi-Datacenter
      $ curl 'http://localhost:8500/v1/kv/foo?raw&dc=asia'
      true
      $ curl 'http://localhost:8500/v1/kv/foo?raw&dc=eu'
      false
  44. Multi-Datacenter • Local by default • Can query other datacenters whenever you need to
  45. Web UI
  46. Web UI • Node, service, health check, and K/V management and visibility for every datacenter in a single UI.
  47. Operations Consul Availability / Scalability
  48. The Meta Question
  49. Architecture
  50. Server Cluster • 3, 5, 7 servers • Quorum of (n/2) + 1 servers (integer division) for availability • Replicated writes • Automatic leader election, leader forwarding.
  51. Lightweight Clients • Ephemeral state • Health checks • Optional (but recommended). Legacy machines don’t need them. • Automatic request forwarding to servers.
  52. Cheap Gossip • Health check and membership info. • Very cheap • No guaranteed reliability, but only used for data that can be lost • (See Serf)
  53. Multi-DC • Independent server clusters • Request forwarding • WAN gossip for membership
  54. General Points: Servers • (n+1)/2 servers (for odd n) must be up for write availability • More servers means higher write latency because of replication. Throughput marginally affected. • Servers can leave/join at will, keeping in mind the minimum node requirement.
  55. General Points: Clients • Clients can be removed/added at will without issue. • Clients don’t currently affect read/write throughput in a meaningful way. • Although technically optional, they’re highly recommended for delegated health checks.
  56. Throughput • On virtualized cloud systems with spinning disks: thousands of reads and writes per second • Practically won’t hit read/write limit
  57. Scalable and available. Consul’s architecture makes it incredibly scalable and highly unlikely to become unavailable.
  58. Robust Systems Consul configured, monitored, discovered
  59. • Consul KV for configuration. • Consul DNS for service coupling/discovery. • Consul Health Checks for monitoring.
  60. Consul KV: Configuration
  61. Consul KV: Configuration
      $ envconsul -reload myapp/config bin/myapp
      …
  62. Consul KV: Configuration • envconsul turns K/V into environment variables and restarts on change. • No application changes!
  63. Consul DNS: Service Discovery
      $ envconsul myapp/config env
      ELASTICSEARCH_HOST=elasticsearch.service.consul.
      POSTGRESQL_HOST=master.postgresql.service.consul.
      REDIS_HOST=redis.service.consul.
  64. Consul DNS: Service Discovery • Configuration to point to other services uses DNS. • No application changes!
  65. Consul Health Checks: Monitoring
      $ cat /etc/consul.d/web.json
      {
        "check": {
          "name": "http",
          "script": "curl localhost:80",
          "interval": "5s"
        }
      }
  66. Consul Health Checks: Monitoring
  67. Consul Health Checks: Monitoring • Simple shell scripts (UNIXy) • Logged output • Failing services won’t appear in service discovery query results.
  68. Robust! Add/remove services, reconfigure services, see global state of services without complicated logic. And without modifying application code.
  69. Thank You http://www.consul.io
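
The server-count arithmetic from the "Server Cluster" and "General Points: Servers" slides can be made concrete with a small sketch. It is not taken from the talk: the function names are hypothetical, and it assumes only the majority-quorum rule the slides state, namely that a cluster of n servers needs (n/2) + 1 of them (integer division) to stay available for writes.

```python
def quorum(n_servers):
    """Smallest majority of n_servers: floor(n / 2) + 1."""
    if n_servers < 1:
        raise ValueError("a cluster needs at least one server")
    return n_servers // 2 + 1


def failures_tolerated(n_servers):
    """How many servers can be lost while a majority still remains."""
    return n_servers - quorum(n_servers)


# The cluster sizes suggested on the slide.
for n in (3, 5, 7):
    print(n, "servers: quorum", quorum(n), "tolerates", failures_tolerated(n))
```

Under this rule an even-sized cluster tolerates no more failures than the next smaller odd size (4 servers still tolerate only 1 failure), which is one plausible reading of why the slide lists 3, 5, and 7.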

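
The HTTP responses shown on the "Service Discovery" and "Key/Value Storage" slides can be consumed from application code roughly as follows. This is a sketch under assumptions rather than the speaker's code: the response bodies are embedded samples instead of live calls to `localhost:8500`, the second node name is hypothetical, and the KV sample relies on the fact that Consul returns values base64-encoded when `?raw` is not given (only the fields used here are included).

```python
import base64
import json

# Catalog response shaped like the one on the Service Discovery slide.
# The second node name is hypothetical; both addresses come from the dig output.
CATALOG_BODY = """
[
  {"Node": "node-e818f1", "Address": "10.0.3.89", "ServiceID": "web-frontend"},
  {"Node": "node-hypothetical", "Address": "10.0.1.46", "ServiceID": "web-frontend"}
]
"""

# KV response without ?raw: the stored value "bar" arrives base64-encoded.
KV_BODY = '[{"Key": "foo", "Value": "YmFy"}]'


def service_addresses(catalog_body):
    """Return the address of every node listed in a catalog response."""
    return [entry["Address"] for entry in json.loads(catalog_body)]


def kv_value(kv_body):
    """Decode the value of the first entry in a KV response."""
    entry = json.loads(kv_body)[0]
    return base64.b64decode(entry["Value"]).decode("utf-8")


print(service_addresses(CATALOG_BODY))
print(kv_value(KV_BODY))
```

With `?raw`, as on the slide, the decoding step is unnecessary because the body is the stored value itself.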