Going Multi-Node
Eric Oestrich
SmartLogic - smartlogic.io
GitHub: oestrich
Twitter: ericoestrich
Going Multi-Node
● What is a MUD?
● Clustering
● Leader Selection
● Spanning the Cluster
● Problems Encountered
ExVenture
exventure.org
Gossip
gossip.haus
MidMUD
midmud.com
What is a M.U.D.?
What is a MUD?
● Multi-User Dungeon
● Text based multiplayer game
● Started in the late 70s
● Like World of Warcraft or EverQuest, but without graphics
Supervision Tree
● ExVenture
○ Cluster.Supervisor
○ Raft
○ Data.Repo
○ Web.Supervisor
○ Game.Supervisor
○ :ranch
Supervision Tree
● Game.Supervisor
○ Game.Session.Registry
○ Game.Config
○ Game.Caches
■ Game.Items
■ …
○ Game.Session.Supervisor
○ Game.World
Supervision Tree
● Game.World
○ Game.World.Master
○ Game.World.ZoneController
○ Game.ZoneSupervisor
○ Game.ZoneSupervisor
○ ...
Supervision Tree
● Game.ZoneSupervisor
○ Game.Zone
○ Game.Room.Supervisor
■ Game.Room
■ ...
○ Game.NPC.Supervisor
■ Game.NPC
■ ...
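The nesting above can be sketched with plain OTP supervisors. This is a minimal sketch, not ExVenture's actual code: the `Sketch.*` module names, the `Agent` standing in for the zone process, and the `:one_for_one` strategies are all assumptions made for illustration.

```elixir
defmodule Sketch.ZoneSupervisor do
  # Hypothetical stand-in for Game.ZoneSupervisor: one supervisor per zone,
  # holding the zone process plus a dynamic supervisor for its rooms.
  use Supervisor

  def start_link(zone_id) do
    Supervisor.start_link(__MODULE__, zone_id)
  end

  @impl true
  def init(zone_id) do
    children = [
      # An Agent stands in for the real zone process.
      %{id: :zone, start: {Agent, :start_link, [fn -> %{id: zone_id} end]}},
      # Rooms would be started on demand under their own supervisor.
      {DynamicSupervisor, strategy: :one_for_one}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end

defmodule Sketch.World do
  # Hypothetical stand-in for Game.World: one child supervisor per zone.
  use Supervisor

  def start_link(zone_ids) do
    Supervisor.start_link(__MODULE__, zone_ids)
  end

  @impl true
  def init(zone_ids) do
    children =
      Enum.map(zone_ids, fn zone_id ->
        # Each zone supervisor needs a unique child id.
        Supervisor.child_spec({Sketch.ZoneSupervisor, zone_id}, id: {:zone, zone_id})
      end)

    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

Giving each zone its own supervisor means a crash inside one zone restarts only that zone's processes.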
Starting Point
Everything Assumed One Node
● Heavily geared toward a single node
● Most processes were registered in a local Registry
● Data stored in local ETS tables
● Lots of in-process state (a virtual world)
Paths Forward
● Umbrella App
● Single Application
Single Application that is Cluster Aware
Clustering
libcluster
bitwalker/libcluster
libcluster
config :libcluster,
  topologies: [
    local: [
      strategy: Cluster.Strategy.Epmd,
      config: [
        hosts: [:"world1@host", :"world2@host"]
      ]
    ]
  ]
Start the nodes
iex --sname world1 -S mix
iex --sname world2 -S mix
Picking a leader
Raft Protocol
raft.github.io
Raft Basics
Picking a Leader
● Each node waits a random amount of time, then nominates itself as leader
● Every other node votes for the first candidate that asks for its vote
● Once a candidate has votes from a majority of the cluster, it becomes the leader and starts the world
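The election rules above can be sketched as a few pure functions. This is a simplified illustration, not Squabble's or ExVenture's implementation: the `Sketch.Election` module, the shape of the voter state, and the 150..300 ms timeout range are assumptions.

```elixir
defmodule Sketch.Election do
  # Votes needed to win: strictly more than half of the cluster.
  def majority(cluster_size), do: div(cluster_size, 2) + 1

  # Random wait before a node nominates itself, so nodes rarely
  # announce at the same moment. The range is an assumption.
  def election_timeout, do: Enum.random(150..300)

  # A voter grants its vote to the first candidate it hears from in a term
  # and rejects every other candidate for that term.
  def vote(%{voted_for: nil} = state, candidate) do
    {:granted, %{state | voted_for: candidate}}
  end

  def vote(%{voted_for: candidate} = state, candidate), do: {:granted, state}
  def vote(state, _candidate), do: {:rejected, state}

  # A candidate wins once its votes (including its own) reach a majority.
  def elected?(votes, cluster_size), do: votes >= majority(cluster_size)
end
```

Requiring a majority means two halves of a partitioned cluster cannot both elect a leader, because at most one side can hold more than half of the votes.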
What the leader does
● Pushes zones out across the cluster
● When a node dies
○ Looks at zones that should be online
○ Spins them up across the cluster
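One way the leader might spread zones over the cluster is simple round-robin assignment. The sketch below is an assumption about the approach, not ExVenture's code; `Sketch.ZoneBalancer` and both function names are hypothetical.

```elixir
defmodule Sketch.ZoneBalancer do
  # Assign each zone to a node in round-robin order.
  # Returns a map of node => list of zone ids.
  def distribute(zone_ids, nodes) when nodes != [] do
    empty = Map.new(nodes, fn node -> {node, []} end)

    zone_ids
    |> Enum.zip(Stream.cycle(nodes))
    |> Enum.reduce(empty, fn {zone_id, node}, acc ->
      Map.update!(acc, node, fn zones -> zones ++ [zone_id] end)
    end)
  end

  # After a node goes down, take the zones it was hosting and spread them
  # over the surviving nodes. Raises if no node survives.
  def rebalance(assignments, down_node) do
    {orphaned, remaining} = Map.pop(assignments, down_node, [])
    extra = distribute(orphaned, Map.keys(remaining))
    Map.merge(remaining, extra, fn _node, current, added -> current ++ added end)
  end
end
```

For example, `Sketch.ZoneBalancer.distribute([1, 2, 3], [:a, :b])` gives zone 1 and 3 to `:a` and zone 2 to `:b`; if `:a` then goes down, `rebalance/2` moves its zones onto `:b`.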
Squabble
● ExVenture’s leader election has been pulled out into its own library
● GitHub: oestrich/squabble
Squabble Callbacks
defmodule MyApp.Leader do
  @behaviour Squabble.Leader

  @impl true
  def leader_selected(term) do
  end

  @impl true
  def node_down() do
  end
end
Why a majority of the cluster?
Removing local Registry
Switch to :global registry
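As a rough illustration of the switch, a GenServer can be registered through `:global` by using a `{:global, term}` name, which makes it reachable by that name from any connected node. The `Sketch.Room` module and its `look/1` call are hypothetical.

```elixir
defmodule Sketch.Room do
  # Hypothetical room process, registered cluster-wide by its id.
  use GenServer

  def start_link(room_id) do
    GenServer.start_link(__MODULE__, room_id, name: via(room_id))
  end

  # {:global, term} asks :global to track the name across connected nodes.
  def via(room_id), do: {:global, {__MODULE__, room_id}}

  # Callers do not need to know which node hosts the room.
  def look(room_id), do: GenServer.call(via(room_id), :look)

  @impl true
  def init(room_id), do: {:ok, %{id: room_id}}

  @impl true
  def handle_call(:look, _from, state) do
    {:reply, {:room, state.id, node()}, state}
  end
end
```

Starting a second process under the same global name returns `{:error, {:already_started, pid}}`, so a room runs at most once across the cluster.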
Spanning the Cluster
● Cache updates need to be applied on every node
● Process groups (pg2) handle this
● There might be a nicer way to handle this
Join the pg2 group on cache start
defmodule Game.Items do
  use GenServer

  @ets_key :items

  def init(_) do
    # Create the group (does nothing if it already exists), then join it
    :ok = :pg2.create(@ets_key)
    :ok = :pg2.join(@ets_key, self())
    #...
  end
Client API
  def insert(item) do
    # Send the new item to the cache process on every node
    members = :pg2.get_members(@ets_key)
    Enum.map(members, fn member ->
      GenServer.call(member, {:insert, item})
    end)
  end
end
Problems Encountered
Calls timing out
● A network blip can make a previously reliable call time out
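A `GenServer.call/3` that times out exits the calling process by default. One possible way to contain that, sketched here with a hypothetical `Sketch.SafeCall` wrapper, is to catch the exit and return an error tuple instead.

```elixir
defmodule Sketch.SafeCall do
  # Wrap GenServer.call so that a timeout, a missing process, or a lost node
  # becomes a return value instead of crashing the caller.
  def call(server, message, timeout \\ 5_000) do
    {:ok, GenServer.call(server, message, timeout)}
  catch
    :exit, {:timeout, _} -> {:error, :timeout}
    :exit, {:noproc, _} -> {:error, :noproc}
    :exit, {{:nodedown, _node}, _} -> {:error, :nodedown}
  end
end
```

The caller can then decide whether to retry, skip the update, or report the failure, rather than being restarted by its supervisor.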
Circuit Breakers
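A circuit breaker stops calling a dependency after repeated failures and only tries again after a cool-down. The following is a minimal functional sketch of that state machine, not the implementation used in ExVenture: `Sketch.Breaker`, its thresholds, and the explicit `now` argument (milliseconds) are assumptions.

```elixir
defmodule Sketch.Breaker do
  # Functional core of a circuit breaker. The current time is passed in
  # (in ms) so the logic stays pure; the defaults are illustrative.
  defstruct state: :closed, failures: 0, opened_at: nil, max_failures: 3, reset_after: 5_000

  # Closed: calls go through.
  def allow?(%__MODULE__{state: :closed}, _now), do: true

  # Open: reject calls until the cool-down has passed, then allow a trial call.
  def allow?(%__MODULE__{state: :open} = b, now), do: now - b.opened_at >= b.reset_after

  # A success closes the breaker and clears the failure count.
  def success(%__MODULE__{} = b), do: %{b | state: :closed, failures: 0, opened_at: nil}

  # A failure while open (a failed trial call) restarts the cool-down.
  def failure(%__MODULE__{state: :open} = b, now), do: %{b | opened_at: now}

  # A failure while closed counts toward the threshold and may open the breaker.
  def failure(%__MODULE__{} = b, now) do
    failures = b.failures + 1

    if failures >= b.max_failures do
      %{b | state: :open, failures: failures, opened_at: now}
    else
      %{b | failures: failures}
    end
  end
end
```

A caller would check `allow?/2` before making a remote call and record the outcome with `success/1` or `failure/2`.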
Things that should happen at most once
● ExVenture is attached to Gossip, a chat service
● Each node connects to Gossip over its own websocket
● New messages should be posted at most once to the local channel
● Only the leader node should handle these actions
Performance Tweaks
The journey from 230 to 3500
Single process being overloaded by messages
● Room processes became a bottleneck
● Create a side process that handles notifications
● PR #72
● ~230 -> ~600
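The idea of a side process can be sketched as follows. This is not the code from the PR, only an assumption about its general shape: the room hands each event to a hypothetical `Sketch.Room.Notifier` with a single cast, and the notifier does the per-listener sends.

```elixir
defmodule Sketch.Room.Notifier do
  # Hypothetical side process: fans a notification out to listeners so the
  # room process itself only pays for one cast per event.
  use GenServer

  def start_link(listeners), do: GenServer.start_link(__MODULE__, listeners)

  # Called by the room; returns immediately.
  def notify(notifier, event), do: GenServer.cast(notifier, {:notify, event})

  @impl true
  def init(listeners), do: {:ok, listeners}

  @impl true
  def handle_cast({:notify, event}, listeners) do
    Enum.each(listeners, fn pid -> send(pid, {:notify, event}) end)
    {:noreply, listeners}
  end
end
```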
Single process being overloaded by data size
● Session registry was overloaded by the size of the user data
● It was pushing a large preloaded Ecto struct around
● Massively simplified what is stored
● PR #73
● ~600 -> ~1200
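Simplifying what is stored might look like the sketch below: copy only the fields other processes need into a small struct before registering or sending it. `Sketch.SessionUser` and the choice of `id` and `name` as the kept fields are assumptions, not the fields ExVenture actually keeps.

```elixir
defmodule Sketch.SessionUser do
  # Only the fields other processes need; which fields those are
  # is an assumption made for this sketch.
  defstruct [:id, :name]

  # Reduce a large preloaded user (struct or map) to the slim version.
  def from_user(user), do: %__MODULE__{id: user.id, name: user.name}
end
```

Because message passing between processes copies the data, a smaller term costs less on every send, and `:erlang.external_size/1` gives a quick way to compare the serialized size of the full and slim versions.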
Too large messages
● The data being passed around in messages was too large
● Same huge User structs
● Ran out of RAM at 50GB
● Used the same simplification in messages
● PR #74
● ~1200 -> ~3500
Specs
- Intel Core i7-6700K
- Quad Core, with Hyperthreading
- 64GB of RAM
Demo
ExVenture Discord
https://discord.gg/GPEa6dB
Image Credits
http://discworld.starturtle.net/lpc/
http://www.majormud.com/
Thanks!
● GitHub: oestrich/ex_venture
● Documentation: exventure.org