Service Discovery like a Pro
Eran Harel
@eran_ha
Motivation
Consul Overview
Consul Architecture
API
Alternatives
Do we have a problem at all?
my.target.service.url=http://host:port/contextBase/...
memcached.cluster=mem1:11211,mem1:11311,mem2:11211,mem2:11311
db.host=mysql:3304
What do we do if the nodes are dynamically allocated?
What do we do if nodes are unhealthy?
How can we scale (up/down) our clusters?
What happens if the port is dynamic?
The Traditional Approaches
HTTP Load Balancers (e.g. HAProxy)
TCP Keepalive
VIP
etc...
What is Consul?
Service discovery and configuration made easy. Distributed, highly available, and datacenter-aware.
Developed @ HashiCorp
Open source - https://github.com/hashicorp/consul
Written in Go
Current version 0.6
https://consul.io/
Features
Service Discovery
Failure Detection (Health Checks)
K/V Store
Multi Datacenter
consul-template (external project)
Basic Architecture
DNS API
$ host memcached.service.consul
memcached.service.consul has address 10.xx.xx.01
memcached.service.consul has address 10.xx.xx.02
memcached.service.consul has address 10.xx.xx.03
$
$ host test.memcached.service.consul
test.memcached.service.consul has address 10.xx.xx.51
test.memcached.service.consul has address 10.xx.xx.52
$
$ host prod.memcached-legacy.service.dc2.consul
prod.memcached-legacy.service.dc2.consul has address 10.yy.xx.01
prod.memcached-legacy.service.dc2.consul has address 10.yy.xx.02
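The lookups above follow Consul's documented name pattern, [tag.]<service>.service[.<datacenter>].consul. A minimal Java sketch of building such names follows; the class and method names are hypothetical, and the default "consul" domain is assumed.

```java
/** Builds Consul DNS names of the form [tag.]<service>.service[.<datacenter>].consul. */
final class ConsulDnsName {
    private ConsulDnsName() {}

    /** tag and datacenter are optional and may be null. */
    static String of(String service, String tag, String datacenter) {
        StringBuilder name = new StringBuilder();
        if (tag != null) {
            name.append(tag).append('.');
        }
        name.append(service).append(".service");
        if (datacenter != null) {
            name.append('.').append(datacenter);
        }
        return name.append(".consul").toString();
    }

    public static void main(String[] args) {
        System.out.println(of("memcached", null, null));
        System.out.println(of("memcached", "test", null));
        System.out.println(of("memcached-legacy", "prod", "dc2"));
    }
}
```

If the host resolver forwards the .consul domain to a Consul agent, the resulting name can be passed to java.net.InetAddress.getAllByName to obtain the instance addresses.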
REST API
http://localhost:8500/v1/agent/service/register
http://localhost:8500/v1/agent/service/deregister/<MyService>
http://localhost:8500/v1/catalog/service/<MyService>
http://localhost:8500/v1/catalog/nodes
http://localhost:8500/v1/health/service/<MyService>
Certain endpoints support a feature called a "blocking query." A blocking query is
used to wait for a potential change using long polling.
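A blocking query is an ordinary GET with index and wait parameters; the index comes from the X-Consul-Index header of the previous response. The sketch below only builds the URL and tracks the index. The class and method names are hypothetical, and the reset-to-zero rule follows the guidance in the Consul API documentation for an index that goes backwards.

```java
import java.net.URI;
import java.time.Duration;

final class BlockingQuery {
    private BlockingQuery() {}

    /** URL for a long-polling health query; index 0 returns immediately. */
    static URI healthUri(String agentBase, String service, long index, Duration wait) {
        return URI.create(agentBase + "/v1/health/service/" + service
                + "?passing=true&index=" + index + "&wait=" + wait.getSeconds() + "s");
    }

    /** Index to send next, given the X-Consul-Index value just received. */
    static long nextIndex(long previous, long received) {
        // An index that moves backwards means the state was reset: start over from 0.
        return received < previous ? 0 : received;
    }

    public static void main(String[] args) {
        long index = nextIndex(0, 4245721L);
        System.out.println(healthUri("http://localhost:8500", "Hello", index, Duration.ofSeconds(30)));
    }
}
```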
Consul CLI
$ consul
usage: consul [--version] [--help] <command> [<args>]
Available commands are:
    agent          Runs a Consul agent
    configtest     Validate config file
    event          Fire a new event
    exec           Executes a command on Consul nodes
    force-leave    Forces a member of the cluster to enter the "left" state
    info           Provides debugging information for operators
    join           Tell Consul agent to join cluster
    keygen         Generates a new encryption key
    keyring        Manages gossip layer encryption keys
    leave          Gracefully leaves the Consul cluster and shuts down
    lock           Execute a command holding a lock
    maint          Controls node or service maintenance mode
    members        Lists the members of a Consul cluster
    monitor        Stream logs from a Consul agent
    reload         Triggers the agent to reload configuration files
    version        Prints the Consul version
    watch          Watch for changes in Consul
Consul CLI (cont)
$ consul maint -service Hello0 -enable
Service maintenance is now enabled for "Hello0"
On the server log:
2015/12/09 21:51:13 [INFO] agent: Service "Hello0" entered maintenance mode
2015/12/09 21:51:13 [INFO] agent: Synced check '_service_maintenance:Hello0'
What are the alternatives?
ZooKeeper, doozerd, etcd
Chef, Puppet, etc.
Nagios, Sensu
SkyDNS
SmartStack
Serf
Discovery and Client Side LB Demo
Demo Preview
[Diagram: instances Hello0, Hello1 and Hello2 register with the local Consul agent (/v1/agent/service/register); clients long-poll /v1/health/service/Hello for the healthy instances.]
How do we implement discovery and client side LB?
Each module registers itself with the local Consul agent on startup and provides enough metadata (tags) to allow filtering.
http://localhost:8500/v1/agent/service/register
{
  "ID": "Hello0",
  "Name": "Hello",
  "Port": 8080,
  "Tags": [
    "instance0",
    "production",
    "httpPort-8080",
    "contextPath-/api"
  ],
  "Check": {
    "HTTP": "http://localhost:8080/api/hello/instance",
    "Interval": "1s",
    "Timeout": "1s"
  }
}
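The registration step can be sketched in Java with the JDK 11+ HTTP client. RegisterService and its methods are hypothetical names, the agent address is assumed to be localhost:8500, and the JSON is built by hand on the assumption that no value needs escaping. The sketch builds the PUT request without sending it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class RegisterService {
    private RegisterService() {}

    /** Registration body with an HTTP health check; values are assumed to need no escaping. */
    static String payload(String id, String name, int port, List<String> tags,
                          String healthUrl, String interval) {
        String tagJson = tags.stream()
                .map(t -> "\"" + t + "\"")
                .collect(Collectors.joining(","));
        return String.format(Locale.ROOT,
                "{\"ID\":\"%s\",\"Name\":\"%s\",\"Port\":%d,\"Tags\":[%s],"
                        + "\"Check\":{\"HTTP\":\"%s\",\"Interval\":\"%s\"}}",
                id, name, port, tagJson, healthUrl, interval);
    }

    /** PUT request against the agent's service registration endpoint. */
    static HttpRequest request(String agentBase, String json) {
        return HttpRequest.newBuilder(URI.create(agentBase + "/v1/agent/service/register"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        String json = payload("Hello0", "Hello", 8080,
                List.of("instance0", "production", "httpPort-8080", "contextPath-/api"),
                "http://localhost:8080/api/hello/instance", "1s");
        HttpRequest req = request("http://localhost:8500", json);
        System.out.println(req.method() + " " + req.uri());
        System.out.println(json);
    }
}
```

Sending the request is then a single call to HttpClient.newHttpClient().send(req, HttpResponse.BodyHandlers.ofString()) at application startup.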
How do we implement discovery and client side LB?
The local Consul agent calls the provided health check(s) and verifies that the instances are healthy.
Don’t forget to add proper timeouts!
curl --fail --max-time 1 "http://localhost:8080/api/hello/instance"
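For comparison, here is a rough Java equivalent of the bounded probe, together with the status mapping that Consul documents for its built-in HTTP checks (2xx is passing, 429 is a warning, anything else is critical). HealthProbe and its members are hypothetical names, and the request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

final class HealthProbe {
    enum Status { PASSING, WARNING, CRITICAL }

    private HealthProbe() {}

    /** GET request that gives up after maxTime, like curl --max-time. */
    static HttpRequest probe(String url, Duration maxTime) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(maxTime)
                .GET()
                .build();
    }

    /** Maps an HTTP status code the way Consul's HTTP check does. */
    static Status classify(int httpStatus) {
        if (httpStatus >= 200 && httpStatus < 300) {
            return Status.PASSING;
        }
        if (httpStatus == 429) {
            return Status.WARNING;
        }
        return Status.CRITICAL;
    }

    public static void main(String[] args) {
        HttpRequest req = probe("http://localhost:8080/api/hello/instance", Duration.ofSeconds(1));
        System.out.println(req.uri() + " timeout=" + req.timeout().orElseThrow());
        System.out.println(classify(200) + " " + classify(429) + " " + classify(503));
    }
}
```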
How do we implement discovery and client side LB?
Clients perform long polling queries to the health API, maintain a list of healthy
instances, and build target URLs.
At Outbrain we use the ConsulBasedTargetProvider with HealthyTargetsList to achieve this.
http://localhost:8500/v1/health/service/Hello?passing=true&tag=production&stale=true&index={index}&wait=30s
X-Consul-Index: 4245721
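One way to build target URLs from a health API entry is to read the port and context path back out of the tags used at registration (httpPort-8080, contextPath-/api). The sketch below assumes that tag convention; TargetUrls is a hypothetical helper, not the ob1k implementation.

```java
import java.util.List;
import java.util.Optional;

final class TargetUrls {
    private TargetUrls() {}

    /** Returns the remainder of the first tag starting with prefix, if any. */
    static Optional<String> tagValue(List<String> tags, String prefix) {
        return tags.stream()
                .filter(t -> t.startsWith(prefix))
                .map(t -> t.substring(prefix.length()))
                .findFirst();
    }

    /** Builds http://<address>:<port><contextPath> from an instance's address and tags. */
    static String targetUrl(String address, List<String> tags) {
        String port = tagValue(tags, "httpPort-")
                .orElseThrow(() -> new IllegalArgumentException("missing httpPort- tag"));
        String contextPath = tagValue(tags, "contextPath-").orElse("");
        return "http://" + address + ":" + port + contextPath;
    }

    public static void main(String[] args) {
        List<String> tags = List.of("instance0", "production", "httpPort-8080", "contextPath-/api");
        System.out.println(targetUrl("10.0.0.1", tags));
    }
}
```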
How do we implement discovery and client side LB?
Upon client request, we select a target based on some strategy (e.g. round-robin).
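A round-robin strategy over the current list of healthy targets can be as small as the sketch below. RoundRobin is a hypothetical class; the list is passed in on each call so that updates from the long-polling loop are picked up immediately.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class RoundRobin {
    private final AtomicInteger counter = new AtomicInteger();

    /** Picks the next target in rotation; safe to call from multiple threads. */
    String next(List<String> targets) {
        if (targets.isEmpty()) {
            throw new IllegalStateException("no healthy targets");
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        int i = Math.floorMod(counter.getAndIncrement(), targets.size());
        return targets.get(i);
    }

    public static void main(String[] args) {
        RoundRobin rr = new RoundRobin();
        List<String> targets = List.of("http://10.0.0.1:8080/api", "http://10.0.0.2:8080/api");
        for (int n = 0; n < 4; n++) {
            System.out.println(rr.next(targets));
        }
    }
}
```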
How do we implement discovery and client side LB?
Clients need to implement resilience logic such as retries, timeouts, circuit-breakers, etc.
final HelloService helloService = new ClientBuilder<>(HelloService.class).
setProtocol(ContentType.JSON).
setConnectionTimeout(100).
setRequestTimeout(100).
setRetries(3).
setTargetProvider(new ConsulBasedTargetProvider(healthyTargetsList, "/hello", null)).
build();
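What setRetries does inside ob1k is not shown here; as a generic illustration of retrying against the next target, a minimal sketch could look like this. Retry and callWithRetries are hypothetical names, and the call is modelled as a plain function from target URL to result.

```java
import java.util.List;
import java.util.function.Function;

final class Retry {
    private Retry() {}

    /** Tries the call once plus up to `retries` more times, moving to the next target each time. */
    static <T> T callWithRetries(List<String> targets, int retries, Function<String, T> call) {
        if (targets.isEmpty()) {
            throw new IllegalStateException("no healthy targets");
        }
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= retries; attempt++) {
            String target = targets.get(attempt % targets.size());
            try {
                return call.apply(target);
            } catch (RuntimeException e) {
                last = e; // remember the failure and fall through to the next target
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        String result = callWithRetries(List.of("a", "b"), 3, target -> {
            if (target.equals("a")) {
                throw new IllegalStateException("a is down");
            }
            return "ok from " + target;
        });
        System.out.println(result);
    }
}
```

A production client would also bound each attempt with a timeout and stop calling targets that keep failing, which is the role of a circuit breaker.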
References & Links
Consul Docs - https://consul.io/docs/index.html
Example Source Code - https://github.com/outbrain/ob1k/tree/master/ob1k-example/src/main/java/com/outbrain/ob1k/example/hello
We are recruiting...
