Clusters
and where to find
them
Eugene Pirogov
gmile
Databases
Tools
Theory
Takeaways
Theory
What is
a cluster?
set of loosely or tightly
connected computers that work
together so that, in many
respects, they can be viewed as
a single system
When to use
a cluster?
1. Fail-over
clusters
2. Load balancing
clusters
What is a typical
Erlang cluster
built with?
1. A node
node()
2. An RPC call
:rpc.call(:nodex, M, :f, ["a"])
3. A send call
send({MyProcess, :mynode}, :msg)
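All three building blocks also accept the local node as their target, so they can be tried before a second node exists. A minimal sketch, assuming a plain `iex` or `elixir` session; the registered name `Terminal` is borrowed from Example 3, and pointing everything at `node()` is a simplification for illustration only.

```elixir
# 1. A node: every runtime has a name, even when distribution is off.
local = node()

# 2. An RPC call: run Enum.reverse/1 on the target node (here: ourselves).
reversed = :rpc.call(local, Enum, :reverse, [[1, 2, 3]])

# 3. A send call: address a registered process as {name, node}.
Process.register(self(), Terminal)
send({Terminal, local}, :msg)

received =
  receive do
    message -> message
  after
    1_000 -> :timeout
  end

IO.inspect({local, reversed, received})
```

In a real cluster only the node argument changes; the calls themselves stay the same.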
Example 1:
Starting a node
iex
~> iex
Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Interactive Elixir (1.3.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> node()
:nonode@nohost
iex(2)>
iex --name eugene
~> iex --name eugene
Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Interactive Elixir (1.3.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(eugene@Eugenes-MacBook-Pro-2.local)1>
iex --sname eugene
~> iex --sname eugene
Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Interactive Elixir (1.3.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(eugene@Eugenes-MacBook-Pro-2)1>
iex --name eugene@host
~> iex --name eugene@host
Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Interactive Elixir (1.3.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(eugene@host)1>
Example 2:
starting two nodes
iex --name node1@127.0.0.1
iex --name node2@127.0.0.1
~> iex --name node1@127.0.0.1
iex(node1@127.0.0.1)1>
~> iex --name node2@127.0.0.1
iex(node2@127.0.0.1)1>
# On node1
iex(1)> :net_adm.ping(:'node2@127.0.0.1')
:pong
Example 3:
sending a message
iex --name node1@127.0.0.1
iex --name node2@127.0.0.1
# On node2
iex(1)> Process.register(self(), Terminal)
# On node1
iex(1)> send({Terminal, :'node2@127.0.0.1'}, "Hello!")
# On node2
iex(2)> flush()
"Hello!"
Example 4:
calling remotely
# On node1
iex(node1@127.0.0.1)1> :rpc.call(:'node2@127.0.0.1', Enum, :reverse, [100..1])
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, …]
iex(node1@127.0.0.1)2>
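:rpc.call/4 does not raise when the target node cannot be reached; it returns `{:badrpc, reason}`, so callers usually pattern match on the result. A minimal sketch; the module `RemoteCall`, its `reverse_on/2` helper and the node name `:"missing@127.0.0.1"` are hypothetical and exist only for illustration.

```elixir
defmodule RemoteCall do
  # Hypothetical helper: reverse `enumerable` on `target` and normalise the
  # result into {:ok, list} | {:error, reason}.
  def reverse_on(target, enumerable) do
    case :rpc.call(target, Enum, :reverse, [enumerable]) do
      {:badrpc, reason} -> {:error, reason}
      list when is_list(list) -> {:ok, list}
    end
  end
end

# The local node always answers.
IO.inspect(RemoteCall.reverse_on(node(), 1..5))

# A node that is not running (or not reachable) yields an error tuple.
IO.inspect(RemoteCall.reverse_on(:"missing@127.0.0.1", 1..5))
```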
REST/JSON/XML
Binary protocol
Bonus track:
Magic cookie!
cat ~/.erlang.cookie
iex --name node1@127.0.0.1 --cookie abc
iex --name node2@127.0.0.1 --cookie xyz
# On node1
iex(1)> :erlang.get_cookie()
:abc
# On node2
iex(1)> :erlang.get_cookie()
:xyz
# On node1
iex(2)> :net_kernel.connect(:'node2@127.0.0.1')
false
# On node1
iex(3)> :erlang.set_cookie(:'node1@127.0.0.1', :xyz)
true
# On node1
iex(4)> :net_kernel.connect(:'node2@127.0.0.1')
true
Tools
epmd
Erlang Port
Mapper Daemon
started automatically by
the first distributed node
~> ps ax | grep epmd
25502 ?? S 0:11.53 /usr/local/Cellar/erlang/19.1/lib/erlang/erts-8.1/bin/epmd -daemon
maps node names
to ports
net_kernel
Example 5:
starting a
distributed node
~> iex
Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Interactive Elixir (1.3.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> node()
:nonode@nohost
iex(2)> Process.registered() |> Enum.find(&(&1 == :net_kernel))
nil
iex(3)> :net_kernel.start([:'mynode@127.0.0.1'])
{:ok, #PID<0.84.0>}
iex(mynode@127.0.0.1)4> Process.registered() |> Enum.find(&(&1 == :net_kernel))
:net_kernel
Example 6:
monitoring a
node
iex --name node1@127.0.0.1
iex --name node2@127.0.0.1
iex(node1@127.0.0.1)1>
iex(node1@127.0.0.1)2> :net_kernel.monitor_nodes(true)
:ok
iex(node1@127.0.0.1)3> :net_kernel.connect(:'node2@127.0.0.1')
true
iex(node1@127.0.0.1)4> flush()
{:nodeup, :"node2@127.0.0.1"}
:ok
# Ctrl+C on node2
iex(node1@127.0.0.1)5> flush()
{:nodedown, :"node2@127.0.0.1"}
:ok
iex(node1@127.0.0.1)5>
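In an application the same `{:nodeup, node}` and `{:nodedown, node}` messages are usually handled by a long-lived process instead of being flushed in the shell. A minimal sketch; the module `NodeWatcher` and its `up_nodes/1` function are hypothetical, and the events are simulated with `send/2` so the snippet also runs on a single, non-distributed node.

```elixir
defmodule NodeWatcher do
  use GenServer

  # Hypothetical module: keeps the set of nodes currently seen as up.
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  def up_nodes(server), do: GenServer.call(server, :up_nodes)

  @impl true
  def init(:ok) do
    # Subscribe this process to {:nodeup, node} / {:nodedown, node} messages.
    :net_kernel.monitor_nodes(true)
    {:ok, MapSet.new(Node.list())}
  end

  @impl true
  def handle_call(:up_nodes, _from, nodes), do: {:reply, Enum.sort(nodes), nodes}

  @impl true
  def handle_info({:nodeup, node}, nodes), do: {:noreply, MapSet.put(nodes, node)}
  def handle_info({:nodedown, node}, nodes), do: {:noreply, MapSet.delete(nodes, node)}
  def handle_info(_other, nodes), do: {:noreply, nodes}
end

{:ok, watcher} = NodeWatcher.start_link()

# Simulated events; in a cluster these come from the runtime itself.
send(watcher, {:nodeup, :"node2@127.0.0.1"})
IO.inspect(NodeWatcher.up_nodes(watcher))
send(watcher, {:nodedown, :"node2@127.0.0.1"})
IO.inspect(NodeWatcher.up_nodes(watcher))
```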
net_adm
Example 7:
ping
iex --name node1@127.0.0.1
iex --name node2@127.0.0.1
iex(node1@127.0.0.1)1>
iex(node1@127.0.0.1)2> :net_adm.ping(:'node3@127.0.0.1')
:pang
iex(node1@127.0.0.1)3> :net_adm.ping(:'node2@127.0.0.1')
:pong
Example 8:
names
iex --name node1@127.0.0.1
iex --name node2@127.0.0.1
iex(node1@127.0.0.1)1>
iex(node1@127.0.0.1)2> :net_adm.names()
{:ok, [{'rabbit', 25672}, {'node1', 51813}, {'node2', 51815}]}
iex(node1@127.0.0.1)3> Node.list()
[]
iex(node1@127.0.0.1)4>
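:net_adm.names() asks epmd which nodes are registered on a host, while Node.list() reports only the nodes this node is connected to, which is why the second result is empty. A small sketch of turning the epmd answer into full node names that could then be passed to Node.connect/1; the module `EpmdNames` and its `to_nodes/2` function are hypothetical, and the input repeats the tuple shown above (written with the `~c` sigil). Whether each node accepts the connection still depends on its cookie and name type.

```elixir
defmodule EpmdNames do
  # Hypothetical helper: turn the {:ok, [{name, port}]} answer of
  # :net_adm.names/0 into full node atoms for the given host.
  def to_nodes({:ok, names}, host) do
    for {name, _port} <- names, do: String.to_atom("#{name}@#{host}")
  end

  def to_nodes({:error, _reason}, _host), do: []
end

names = {:ok, [{~c"rabbit", 25672}, {~c"node1", 51813}, {~c"node2", 51815}]}
IO.inspect(EpmdNames.to_nodes(names, "127.0.0.1"))
```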
Example 9:
world
# /etc/hosts
127.0.0.1 host1.com
127.0.0.1 host2.com
127.0.0.1 host3.com
# /Users/gmile/.hosts.erlang
'host1.com'.
'host2.com'.
'host3.com'.
iex --name node1@host1.com
iex --name node2@host1.com
iex --name node3@host2.com
iex --name node4@host2.com
iex --name node5@host3.com
iex --name node6@host3.com
iex(node1@host1.com)1> :net_adm.world()
[:"node1@host1.com", :"node2@host1.com", :"node3@host2.com", :"node4@host2.com", :"node5@host3.com", :"node6@host3.com"]
iex(node1@host1.com)2>
Bonus track:
Connecting
to a node running
in production
iex --name foo@127.0.0.1 --cookie abc
$ iex --remsh foo@127.0.0.1 --cookie abc --name bar@localhost
Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]
Interactive Elixir (1.3.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(foo@127.0.0.1)1>
$ kubectl get pods -l app=matcher -o template \
    --template="{{range.items}}{{.metadata.name}}{{end}}" \
  | xargs -o -I my_pod kubectl exec my_pod -i -t -- \
    iex --name debugging@127.0.0.1 --remsh marketplace@127.0.0.1 --cookie marketplace
slave
1. Start slave
2. Add code path to slave nodes
3. Transfer configuration
to slave nodes
4. Ensure apps
are started on slave
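Steps 2 to 4 can be sketched with plain :rpc.call/4 against a node that is already up. This is an illustration, not the speaker's code: the module `SlaveSetup` and its `prepare/2` function are hypothetical, and step 1 (something like `:slave.start(host, name, args)`) is left out because it needs a named, distributed node. Recent OTP releases also mark `slave` as deprecated in favour of `peer`.

```elixir
defmodule SlaveSetup do
  # Hypothetical helper covering steps 2-4 for a node that is already running.
  def prepare(target, apps) do
    # 2. Add this node's code paths on the target node.
    :ok = :rpc.call(target, :code, :add_paths, [:code.get_path()])

    # 3. Copy the application environment of every loaded application.
    for {app, _description, _version} <- Application.loaded_applications(),
        {key, value} <- Application.get_all_env(app) do
      :ok = :rpc.call(target, Application, :put_env, [app, key, value])
    end

    # 4. Make sure the requested applications are started on the target.
    for app <- apps do
      {:ok, _started} = :rpc.call(target, Application, :ensure_all_started, [app])
    end

    :ok
  end
end

# Dry run against the local node, where every step is effectively a no-op.
IO.inspect(SlaveSetup.prepare(node(), [:logger]))
```

With a real slave node, `target` would be the node name returned by step 1.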
What else?
Node
bitwalker/swarm
Easy clustering, registration, and
distribution of worker processes for
Erlang/Elixir
…a simple case where workers are
dynamically created in response to
some events under a supervisor, and
we want them to be distributed across
the cluster and be discoverable by
name from anywhere in the cluster
bitwalker/libcluster
What next?
…I didn’t want to resort to
something like Docker,
because I wanted to see how
far I could push Elixir and its
tooling.
Databases
mnesia
Example 10:
initialize mnesia
iex --name node1@127.0.0.1
iex --name node2@127.0.0.1
iex --name node3@127.0.0.1
iex(node1@127.0.0.1)1> :mnesia.create_schema([:'node1@127.0.0.1'])
:ok
iex(node1@127.0.0.1)2> :mnesia.start()
:ok
iex(node1@127.0.0.1)3> :mnesia.info()
---> Processes holding locks <---
---> Processes waiting for locks <---
---> Participant transactions <---
---> Coordinator transactions <---
---> Uncertain transactions <---
---> Active tables <---
schema : with 1 records occupying 413 words of mem
===> System info in version "4.14.1", debug level = none <===
opt_disc. Directory "/Users/gmile/Mnesia.node1@127.0.0.1" is used.
use fallback at restart = false
running db nodes = ['node1@127.0.0.1']
stopped db nodes = []
master node tables = []
remote = []
ram_copies = []
disc_copies = [schema]
disc_only_copies = []
[{'node1@127.0.0.1',disc_copies}] = [schema]
2 transactions committed, 0 aborted, 0 restarted, 0 logged to disc
0 held locks, 0 in queue; 0 local transactions, 0 remote
0 transactions waits for other nodes: []
:ok
iex(node1@127.0.0.1)4>
“schema” table exists
as disc_copies (RAM + disk)
on node1@127.0.0.1
iex(node2@127.0.0.1)2> :mnesia.start()
:ok
iex(node3@127.0.0.1)2> :mnesia.start()
:ok
iex(node1@127.0.0.1)2> :mnesia.change_config(:extra_db_nodes, [:'node2@127.0.0.1'])
{:ok, [:"node2@127.0.0.1"]}
iex(node1@127.0.0.1)3> :mnesia.change_config(:extra_db_nodes, [:'node3@127.0.0.1'])
{:ok, [:"node3@127.0.0.1"]}
iex(node1@127.0.0.1)1> :mnesia.create_table(:books, [disc_copies: [:'node1@127.0.0.1'], attributes: [:id, :title, :year]])
{:atomic, :ok}
iex(node1@127.0.0.1)4> :mnesia.add_table_copy(:books, :'node2@127.0.0.1', :ram_copies)
{:atomic, :ok}
iex(node1@127.0.0.1)5> :mnesia.add_table_copy(:books, :'node3@127.0.0.1', :ram_copies)
{:atomic, :ok}
iex(node1@127.0.0.1)6> :mnesia.info()
---> Processes holding locks <---
---> Processes waiting for locks <---
---> Participant transactions <---
---> Coordinator transactions <---
---> Uncertain transactions <---
---> Active tables <---
books : with 0 records occupying 304 words of mem
schema : with 2 records occupying 566 words of mem
===> System info in version "4.14.1", debug level = none <===
opt_disc. Directory "/Users/gmile/Mnesia.node1@127.0.0.1" is used.
use fallback at restart = false
running db nodes = ['node3@127.0.0.1','node2@127.0.0.1','node1@127.0.0.1']
stopped db nodes = []
master node tables = []
remote = []
ram_copies = []
disc_copies = [books,schema]
disc_only_copies = []
[{'node1@127.0.0.1',disc_copies},
{'node2@127.0.0.1',ram_copies},
{'node3@127.0.0.1',ram_copies}] = [schema,books]
12 transactions committed, 0 aborted, 0 restarted, 10 logged to disc
0 held locks, 0 in queue; 0 local transactions, 0 remote
0 transactions waits for other nodes: []
:ok
iex(node1@127.0.0.1)32>
“schema” + “books” tables exist
on 3 different nodes
3 nodes are running
current node (node1)
keeps 2 tables as RAM + disk
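Once the table exists, reads and writes go through transactions. A minimal single-node sketch, assuming a fresh session where Mnesia has not been started yet: it skips `create_schema/1`, so the schema and the `books` table live in RAM only, and the record values are made-up sample data.

```elixir
# Start Mnesia with a RAM-only schema (no create_schema/1 call beforehand).
:ok = :mnesia.start()

# Same attributes as the books table above, kept in RAM on the local node.
{:atomic, :ok} =
  :mnesia.create_table(:books, ram_copies: [node()], attributes: [:id, :title, :year])

# Records are tuples: {table, id, title, year}.
{:atomic, :ok} =
  :mnesia.transaction(fn ->
    :mnesia.write({:books, 1, "Sample title", 2017})
  end)

# Read the record back by its key.
{:atomic, found} = :mnesia.transaction(fn -> :mnesia.read({:books, 1}) end)
IO.inspect(found)
```

On the three-node setup above the same calls would run unchanged, with Mnesia replicating the write to every node that holds a copy of the table.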
Before we
proceed…
CAP theorem!
@b0rk
Consistency
Every read receives the
most recent write or an error
Availability
Every request receives a response,
without guarantee that
it contains the most recent
version of the information
Partition tolerance
The system continues to operate despite
an arbitrary number of messages being
dropped by the network between nodes
Pick two!
AP or AC or CP
AC
is kind of
pointless
Mnesia chooses…
AC!
If in your cluster the network
connection between two nodes
goes bad, then each one
will think that the other node is down,
and continue to write data.
Recovery from this is complicated.
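Mnesia does not resolve such a split on its own, but it notifies subscribers of its system events with an `{:inconsistent_database, context, node}` event once it notices the partition. A minimal sketch of listening for that event; the module `PartitionWatcher` and its `suspects/1` function are hypothetical, the event is simulated with `send/2`, and what to do with the conflicting data is deliberately left open.

```elixir
defmodule PartitionWatcher do
  use GenServer

  # Hypothetical module: remembers which nodes Mnesia reported as
  # inconsistent with the local database.
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  def suspects(server), do: GenServer.call(server, :suspects)

  @impl true
  def init(:ok) do
    # Requires a running Mnesia; system events arrive as plain messages.
    {:ok, _node} = :mnesia.subscribe(:system)
    {:ok, []}
  end

  @impl true
  def handle_call(:suspects, _from, suspects) do
    {:reply, Enum.reverse(suspects), suspects}
  end

  @impl true
  def handle_info({:mnesia_system_event, {:inconsistent_database, context, node}}, suspects) do
    # A real system would decide here which side's data survives.
    {:noreply, [{node, context} | suspects]}
  end

  def handle_info(_other, suspects), do: {:noreply, suspects}
end

:ok = :mnesia.start()
{:ok, watcher} = PartitionWatcher.start_link()

# Simulate the event Mnesia emits when it detects a partitioned network.
send(
  watcher,
  {:mnesia_system_event,
   {:inconsistent_database, :running_partitioned_network, :"node2@127.0.0.1"}}
)

IO.inspect(PartitionWatcher.suspects(watcher))
```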
AXD 301
switch
“…measures are taken such
that network reliability is very high”
“…In such a highly specialized
environment, the reliability of the control
backplane essentially removes some of
the worries which the CAP theorem
introduces.”
Takeaways
1. Elixir lowers the
barrier to entry
for building clusters
…via productivity
batteries!
And yet it’s all
about Erlang
2. “Hello world” for
clusters is simple
3. Deciding what
matters is hard
Understand your
values when
building a cluster!
4. Releasing &
deploying stuff
may get tricky
5. Building stateful
clusters is really
challenging
6. An Erlang app
can be your little
universe
Magic Clusters and Where to Find Them 2.0 - Eugene Pirogov