Salt Transport Modularity and
Concurrency for Performance and
Scale
Thomas Jackson
Staff Site Reliability Engineer
LinkedIn
3
Agenda
• for item in ('transport', 'concurrency'):
• History
• Problems
• Options
• Solution
Transport in Salt
4
Salt Transport: a history
• In the beginning Salt was primarily a remote execution engine
• Send jobs from Master to N minions (defined by some target)
• In the beginning there was
5
"ZeroMQ (also spelled ØMQ, 0MQ or ZMQ)
is a high-performance asynchronous
messaging library, aimed at use in
distributed or concurrent applications."
- Wikipedia (https://en.wikipedia.org/wiki/ZeroMQ)
6
We took a normal TCP socket, injected it with a mix of radioactive
isotopes stolen from a secret Soviet atomic research project,
bombarded it with 1950-era cosmic rays, and put it into the hands
of a drug-addled comic book author with a badly-disguised fetish
for bulging muscles clad in spandex. Yes, ZeroMQ sockets are the
world-saving superheroes of the networking world.
- http://zguide.zeromq.org/page:all#How-It-Began
7
Salt Transport: a history
How ZMQ PUB/SUB looks
Server
context = zmq.Context()
socket = context.socket(zmq.PUB)
socket.bind("tcp://*:12345")
socket.send("Message")
Client
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://localhost:12345")
socket.setsockopt(zmq.SUBSCRIBE, "")  # a SUB socket receives nothing until it subscribes
print socket.recv()
8
Salt Transport: a history
How ZMQ REQ/REP looks
Server
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:12345")
message = socket.recv()
socket.send("got message")
Client
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://localhost:12345")
socket.send("Hello")
message = socket.recv()
Request lifecycle
9
Salt Transport: a history
Master Minion
1. Job publish
2. Sign-in (optional – potentially reused or cached)
3. Pillar Fetch
4. SLS/file fetch (optional)
5. Return
Initial ZeroMQ implementation
10
Salt Transport: a history
• Master-initiated messages
• Using the pub/sub socket pair in zmq
• All broadcast messages from the master to the minion
• Minion-initiated messages
• Using the req/rep socket pair in zmq
• All messages initiated by the minion, such as:
• Sign-in
• Job return
• Module sync
• Pillar
• Etc.
Initial problems
11
Salt Transport: a history
• Message loss
• Broadcasts were filtered client-side
• Added zmq filtering: https://github.com/saltstack/salt/pull/13285
• Etc.
12
Larger problems
13
Salt Transport: a history
• Huge ZMQ publisher memory leak (https://github.com/zeromq/libzmq/issues/954)
• Workaround: Process manager in salt
• No concept of client state
• When messages arrive, there is no way to see if the client is still connected, which leads to auth storms
• Workaround: Exponential backoff on the minion side
• No sync "connect" (https://github.com/saltstack/salt/pull/21570)
• Workaround: fire event and wait for it to return (or timeout to expire)
• Some users have issues with the LGPL license
• Workaround: n/a
15
The Reliable Asynchronous Event Transport, or
RAET, is an alternative transport medium developed
specifically with Salt in mind. It has been developed to
allow queuing to happen up on the application layer
and comes with socket layer encryption. It also
abstracts a great deal of control over the socket layer
and makes it easy to bubble up errors and exceptions.
- docs.saltstack.com
Salt Transport: previous attempt
RAET
16
Salt Transport: previous attempt
• The good
• No ZMQ!
• The bad
• Effectively a re-implementation of the daemons (separate files, etc.)
• Unable to run zmq and RAET simultaneously (initially, hydra was added later – which just runs both daemons at once)
• The different
• Changed the model from "minions always connect" to "minions are listening", meaning minions have a socket to attack
17
What do we really need
18
Salt Transport: back to basics
• Salt is a platform, not a specific transport– we need transports to be modular
• Some requirements:
• Simple interface to implement (such that other modules can be written)
• Test coverage (including pre-canned tests for new modules)
• Support N transports simultaneously (for ramps, and complex infra)
• Clear contract of security/privacy requirements of various methods
• ReqChannel: minion to master messages
19
Salt Transport: Channels!
• Master
• pre_fork(self, process_manager)
• post_fork(self, payload_handler, io_loop)
• Minion
• send(self, load, tries=3, timeout=60)
• crypted_transfer_decode_dictentry(self, load, dictkey=None, tries=3, timeout=60)
• PubChannel: broadcasts to the appropriate minions
20
Salt Transport: Channels!
• Master
• pre_fork(self, process_manager)
• publish(self, load)
• Minion:
• on_recv(self, callback)
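As a hedged sketch of how the two halves of a pub channel fit together: only the method names (pre_fork, publish, on_recv) come from the interface above; the classes and in-process delivery are invented here for illustration and are nothing like Salt's real transports.

```python
# Hypothetical in-process pub channel pair, illustrating the interface shape.
class LocalPubServerChannel:
    def __init__(self):
        self._callbacks = []

    def pre_fork(self, process_manager):
        # Real transports bind their sockets here, before the master forks.
        pass

    def publish(self, load):
        # Broadcast the load to every registered "minion" callback.
        for callback in self._callbacks:
            callback(load)


class LocalPubClientChannel:
    def __init__(self, server):
        self._server = server

    def on_recv(self, callback):
        # Register a callback to run for each published load.
        self._server._callbacks.append(callback)


server = LocalPubServerChannel()
client = LocalPubClientChannel(server)
received = []
client.on_recv(received.append)
server.publish({'fun': 'test.ping'})
print(received)  # [{'fun': 'test.ping'}]
```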
Responsibilities
21
Salt Transport: Channels!
• Serialization
• Encryption
• Targeting (pub channel only)
TCP channel
22
Salt Transport: Channels!
• Wire protocol: msgpack({'head': SOMEHEADER, 'body': SOMEBODY})
• Main advantage over ZMQ? Better failure modes
• Faster failure detection (if minion isn’t connected to the master, you don’t have to wait for the timeouts)
• True link-status (no more auth storms!)
• Basically, we have sockets again!
• https://docs.saltstack.com/en/develop/topics/transports/tcp.html
TCP: How does it look?
23
Salt Transport: Channels!
async_channel = salt.transport.client.AsyncReqChannel.factory(minion_opts)
ret = yield async_channel.send(msg)
TCP: How accurate?
24
Salt Transport: Channels!
• ZeroMQ
• Total jobs: 1000
• Completed jobs: 171
• Hit rate: 17.1%
• TCP
• Total jobs: 1000
• Completed jobs: 1000
• Hit rate: 100%
TCP: How does it perform
25
Salt Transport: Channels!
• 15 byte message
• ZeroMQ*
• Average time: 0.00295809405715
• QPS: 2246.952241147
• TCP
• Average time: 0.0023341544863
• QPS: 2580.04452801
TCP: How does it perform
26
Salt Transport: Channels!
• 1053 byte message
• ZeroMQ*
• Average time: 0.00278297542184
• QPS: 2489.300394919
• TCP
• Average time: 0.00251070397869
• QPS: 2602.4855051
Awesome!
27
Salt Transport: Channels!
• Definitely awesome!
• But async? What was that about?
• Before we get into specifics, let's talk about concurrency
The General Problem
28
Concurrency
We have lots of things to do, some of which are blocking calls to remote things which
are “slow”. It is more efficient (and overall “faster”) to work on something else while we
wait for that “slow” call.
29
Current state of concurrency in Salt
30
Concurrency
• Master-side:
• N MWorkers to process N requests in parallel
• Interfaces with non-blocking as well, using `while True:` loops to do timeouts etc.
• Minion-side:
• Threads used in MultiMaster for managing the multiple master connections
Problems
31
Concurrency
• No unified approach (multiprocessing, threading, nonblocking “loops” -- all in use)
• Slow and/or blocking operations hold process/thread while waiting
• No consistent use of non-blocking libraries, so the code is a mix of loops and
blocking calls
• Limited scalability (each approach scales differently)
Common solutions in Python
32
Concurrency
• Threading
• Multiprocessing
• User-space “threads”: Coroutines / stackless threads
33
Concurrency
Threading
• Some isolation between threads
• Pre-emptive scheduling
import threading
import requests

def handle_request():
    ret = requests.get('http://slowthing/')
    # do something else

threads = []
for x in xrange(0, NUM_REQUESTS):
    t = threading.Thread(target=handle_request)
    t.start()
    threads.append(t)
for t in threads:
    t.join()
34
Concurrency
Multiprocessing
• Complete isolation
• Pre-emptive scheduling
import multiprocessing
import requests

def handle():
    ret = requests.get('http://slowthing/')
    # do something else

processes = []
for x in xrange(0, NUM_REQUESTS):
    p = multiprocessing.Process(target=handle)
    p.start()
    processes.append(p)
for p in processes:
    p.join()
• User-space “threads”: Coroutines / stackless threads
35
Concurrency
• Some libraries you may have heard of
• gevent
• Stackless python
• Greenlet
• Twisted
• Tornado
• How are these implemented?
• Green threads
• callbacks
• coroutines
Why Coroutines?
36
Concurrency
• Coroutines have been in use in Python for a while (Tornado)
• The new asyncio in python3 (tulip) is coroutines
(https://docs.python.org/3/library/asyncio.html)
37
Coroutines are computer program components
that generalize subroutines for nonpreemptive
multitasking, by allowing multiple entry points
for suspending and resuming execution at
certain locations.
- https://en.wikipedia.org/wiki/Coroutine
Concurrency
38
Concurrency
Coroutines– what is this magic?
def item_of_work():
    while True:
        input = yield
        yield do_something(input)
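These generator "coroutines" can be driven by hand. A minimal, self-contained sketch of the prime/send dance (do_something here is a stand-in helper, not from the slides):

```python
def do_something(value):
    return value * 2  # stand-in for real work

def item_of_work():
    while True:
        value = yield              # suspend until input arrives
        yield do_something(value)  # suspend again, handing back a result

coro = item_of_work()
next(coro)              # "prime" the generator up to the first yield
result = coro.send(21)  # resume with input; runs until the next yield
print(result)           # 42
```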
39
Concurrency
Coroutines– what is this magic?
def some_complex_handle():
    while True:
        input = yield
        out1 = do_something(input)
        yield None
        out2 = do_something2(out1)
        yield None
        return do_something3(out2)
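A sketch of driving the multi-step version by hand, with stand-in do_something* helpers. In Python 3 the final return value rides on StopIteration, which is exactly what coroutine frameworks like Tornado unwrap for you:

```python
def do_something(x):  return x + 1   # stand-ins for real work
def do_something2(x): return x * 2
def do_something3(x): return x - 3

def some_complex_handle():
    value = yield
    out1 = do_something(value)
    yield None
    out2 = do_something2(out1)
    yield None
    return do_something3(out2)  # Python 3: delivered via StopIteration

coro = some_complex_handle()
next(coro)        # prime to the first yield
coro.send(4)      # step 1: do_something runs
next(coro)        # step 2: do_something2 runs
try:
    next(coro)    # step 3: do_something3 runs, then the generator returns
except StopIteration as stop:
    print(stop.value)  # (4 + 1) * 2 - 3 = 7
```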
40
Concurrency
Tornado coroutines
• Some isolation between coroutines
• Explicit yield
• Light “threads”
import tornado.gen
import tornado.ioloop
from tornado.httpclient import AsyncHTTPClient

@tornado.gen.coroutine
def handle_request():
    # a blocking requests.get() can't be yielded; use the async client
    ret = yield AsyncHTTPClient().fetch('http://slow/')
    # do something else

loop = tornado.ioloop.IOLoop.current()
loop.spawn_callback(handle_request)
loop.start()
Coroutines– futures
41
Concurrency
• Futures are just objects that represent a thing that will complete in the future
• This allows methods to return immediately, but finish the task in the future
• This allows the callers to yield execution until the futures they depend on complete
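A hedged sketch of that contract using the stdlib concurrent.futures.Future as a stand-in (Tornado ships its own Future class with much the same interface; fetch_async and its inline fulfillment are invented for illustration):

```python
from concurrent.futures import Future

def fetch_async():
    # Return immediately with a Future; in Salt the IOLoop would
    # fulfill it later, when the reply actually arrives.
    fut = Future()
    fut.set_result({'ret': True})  # fulfilled inline for the sketch
    return fut

fut = fetch_async()
print(fut.result())  # {'ret': True}
```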
42
Concurrency
Coroutines– with futures
• Yield execution, and get returns
• Method looks fairly normal
• Stack traces in here have context
• Easy chaining of futures
@tornado.gen.coroutine
def some_complex_handle(request):
    a = yield is_authd(request)
    if not a:
        return False
    ret = yield [save1(ret), save2(ret)] if False else (yield do_request(request))
Tornado in Salt
43
Concurrency
• What is tornado?
• Python web framework and asynchronous networking library
• Why Tornado and not asyncio?
• Free python 2.x compatibility!
• A fairly comprehensive set of libraries for it (http, locks, queues, etc.)
Back to the transport interfaces
44
Concurrency
• AsyncReqChannel
• send: return a future
• crypted_transfer_decode_dictentry: return a future
ret = yield channel.send(load, timeout=timeout)
Now what?
45
Concurrency
• Now that we have a real concurrency model, what have we done with it?
• MultiMinion in a single process (coroutine per connection)
• Easily implement concurrent networking within Salt
• TCP transport
• IPC
46
Really? Problems?
47
Concurrency problems
• Most common pitfalls of concurrent programming
• race conditions and memory collisions
• deadlocks
Race conditions
48
Concurrency problems
• Weird data problems in the reactor: https://github.com/saltstack/salt/issues/23373
• The underlying problem: injected things in modules (__salt__ etc.) were just dicts, which aren't threadsafe (or coroutine-safe!)
• The solution? `ContextDict`
Copy-on-write thread/coroutine specific dict
49
ContextDict
• Works just like a dict
• Exposes a clone() method, which creates a `ChildContextDict` which is a
thread/coroutine local copy
• With tornado’s StackContext, we switch the backing dict of the parent with your
child using a context manager
cd = ContextDict(foo='bar')
print cd['foo']  # will be bar
with tornado.stack_context.StackContext(cd.clone):
    print cd['foo']  # will be bar
    cd['foo'] = 'baz'
    print cd['foo']  # will be baz
print cd['foo']  # will be bar
More examples: https://github.com/saltstack/salt/blob/develop/tests/unit/context_test.py
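Tornado's StackContext has since been deprecated; purely as an analogue (ContextDict itself is Salt-specific), the same copy-on-write idea can be sketched with the stdlib contextvars module (Python 3.7+):

```python
import contextvars

# A context variable plays the role of one ContextDict key.
foo = contextvars.ContextVar('foo', default='bar')

def child():
    foo.set('baz')   # only visible inside the copied context
    return foo.get()

ctx = contextvars.copy_context()  # analogous to cd.clone()
print(ctx.run(child))  # baz
print(foo.get())       # still bar in the parent context
```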
Deadlocks
50
Concurrency problems
• Haven't seen any yet (*knock on wood*); in general we avoid these since each
coroutine is more or less independent of the others
Layers!
51
Concurrency problems
• Don’t forget, concurrency at all layers– including your DC-wide state execution
• For example: automated highstate enforcement of your whole DC
• Does it matter if all DB hosts update at once?
• Does it matter if all web servers update at once?
• Does it matter if all edge boxes update at once?
concurrency controls for state execution
52
zk_concurrency
acquire_lock:
  zk_concurrency.lock:
    - name: /trafficserver
    - zk_hosts: 'zookeeper:2181'
    - max_concurrency: 4
    - prereq:
      - service: trafficserver

trafficserver:
  service.running: []

release_lock:
  zk_concurrency.unlock:
    - name: /trafficserver
    - require:
      - service: trafficserver
Things on my “list”
53
Future Awesomeness
• Transport
• failover groups
• even better HA (https://github.com/saltstack/salt/issues/25700 -- get involved in the conversation)
• Concurrency
• async ext_pillar
• Partially concurrent state execution (prefetch, etc.)?
• Coroutine-based:
• Reactor
• Engines
• Beacons
• Thorium
©2014 LinkedIn Corporation. All Rights Reserved.

Saltconf 2016: Salt stack transport and concurrency


Editor's Notes

  • #4 Transport != concurrency, although transport uses concurrency
  • #8 10K-foot view: Contexts have sockets. Sockets are message-passing things that are "like" sockets, but are not sockets (they are really a socket and a bunch of contexts). ZeroMQ attempts (and succeeds) at dramatically simplifying message passing. Go zmq!
  • #9 Notice, to switch message types we only had to change the socket type– simple!
  • #10 Basically, this means we can break down communications in salt into two categories
  • #11 Effectively two separate transport issues to solve, so two socket pairs– great done
  • #12 Initially zmq was awesome, as with anything we ran into a variety of weird little issues Message loss: retries, various new versions of zmq to fix cases that dropped messages Broadcasts: ran out of B/W for medium sized job publishes, fixed by implementing zmq’s filtering (zmq saved the day!) But at this point, these are just bugs– nothing that’s a deal breaker
  • #13 At this point, the problems it has had aren’t really zmq’s fault… so we are okay right?
  • #14 Memory leaks: connecting and disconnecting on TCP causes ~600 bytes to be leaked on the master! Still unfixed to this day! Client state: publishes have to wait timeout (even if the minion isn’t connected) AND auth storms! So at this point, we are running into a variety of issues which we are attempting to hack around that are either getting little response, or are contrary to the design. Basically at our scale the abstraction layer is costing us too much
  • #15 At our scale (and with our availability/perf requirements) we need another transport option
  • #16 SaltStack had been working on a replacement, which I’m sure you have all heard of-- RAET
  • #17 NOTE: RAET in salt was being used for both transport (RAET) and concurrency (ioflo) All that being said, RAET (the transport) isn’t bad– its just too specific (and not modular), salt isn’t this specific about anything else-- so why transport?
  • #19 New systems might require new transports (QUIC, serial ports, USB, over text message??, who knows!) There had already been some work to consolidate the transport into “channel” classes before, so might as well finish that– then make them pluggable So, 2 types of channels: req and pub
  • #20 Master: Prefork so that you can bind before forking– to split the FD across multiple processes (to work around python’s GIL limitations) Process_manager, in case you need to make additional processes of your own (instead of just coroutines on the ioloop) Post_fork called in each process after fork, this sets up the handlers etc Minion: send– send load crypted_transfer_decode_dictentry– send `load` encrypted only to the master (e.g. not with the shared symmetric key)
  • #22 This means that as far as “Salt” is concerned, there is a thing (channel) I can pass something to which will get it to wherever I asked. And of course, the system couldn’t be considered modular unless there were at least two modules
  • #23 Since msgpack is an iterably parsed serialization, we can simply write the serialized payload to the wire. Crypto: still using aes that the zeromq stuff uses
  • #24 People asked about performance, which TBH I didn’t really think about putting in this presentation– because I was more worried about …accuracy
  • #25 This is a simple benchmark of sending 1k {‘cmd’: ‘get_token’} as quickly as possible to a master. Note: ZMQ drops a LARGE number of messages; this is due to internal queues in ZMQ filling up. So even if TCP was slower (which it isn’t), we’d still want it.
  • #26 I am of course obligated to show some metrics. This is a simple benchmark of sending {‘cmd’: ‘get_token’} repeatedly from a master Note, quick benchmark– mostly to show that it is roughly equivalent. In practice * ZeroMQ ReqClient is apparently VERY CPU heavy (probably a bug)– it uses ~5 client processes to get this number– whereas TCP uses just one
  • #27 Same as previous benchmark, we just added ~1k additional bytes to the payload
  • #29 Especially important for large fast modern CPUs that have to talk to things that are slow/far-away. Concurrency in python is more fun— because of the GIL, but still helpful because stuff is slooooow
  • #30 Sorry, not a funny picture :/ but you’ve probably seen it before at some conference. Since stuff is so far away, there is no reason to leave the CPU just waiting; we can do something while we wait. Salt attempts to accommodate this…
  • #32 Basically-- doing only one thing at a time severely limits your performance and scalability. So lets go back to what our options are
  • #33 Lets do some examples
  • #34 Fairly clunky code, but it works. Linux pthreads– requires a decent amount of memory, and has some hard limits based on your OS Walk through how this runs: Creates a thread per request Waits for requests to finish Thread closes Note: still subject to the GIL
  • #35 Fairly clunky code, but it works. (Note: serialization (pickle)!!) Linux processes– requires a decent amount of memory, and has some hard limits based on your OS (pids) Walk through how this runs: Creates a process per request Waits for requests to finish Process closes Note: no GIL!
  • #36 Green threads: All the pre-emptive yields require some amount of monkey-patching, making it… difficult for a plugin based system (like Salt) Callbacks: mess! Coroutines-- yes But we don’t even have to make this decision, python already did!
  • #37 - quick aside RE ioflo-- basically a naive implementation of coroutines to achieve the required concurrency for the flo based model it has, serious scaling problems, limited usage, etc.-- details can be messy, talk after :) what exactly is a coroutine?
  • #38 Lets break that down Preemptive: implicit vs explicit yield Basically coroutines are explicitly yielded tasks lets talk a little about what coroutines are with some examples Great examples on www.dabeaz.com/coroutines/ -- I’ll try to explain it in a shorter way, but I highly recommend reading dabeaz’s page.
  • #39 To make it clearer, let's copy/paste an example of a naïve implementation in Python using generators
  • #40 So, something like this lets us “schedule” tasks, meaning we can interleave execution of these things, even if they are all blocking operations. What would be even better is if we could resume execution when whatever we are waiting on is completed. Note: the return within a generator is new in Python 3.x, so tornado (and trollius) use an exception of a specific type
  • #41 Cleaner code, easy isolation, lighter concurrency (effectively just a stack)
  • #42 https://docs.python.org/3/library/asyncio-task.html#future
  • #43 So, something like this lets us “schedule” tasks, meaning we can interleave execution of these things, even if they are all blocking operations. What would be even better is if we could resume execution when whatever we are waiting on is completed
  • #44 As of 2015.8– you get tornado!
  • #45 So, from the client– you say “send load with timeout” and we return a future that will fulfill that contract (either send or timeout). So from the client this is SUPER clean
  • #46 But, of course– we haven’t got this far without breaking anything ;)
  • #48 Like anything else, concurrency isn’t free
  • #50 The implementation here is a RequestContext (based on tornado's stack_context). This RequestContext will do all of the bookkeeping of which coroutine/thread is currently executing-- and will switch between the values for each one. With this I made the loader threadsafe (yay!) and it is easy to re-use if you need a concurrent copy-on-write structure
  • #52 What happens if you have automated highstate enforcement across your proxies??
  • #53 Limit concurrency of this particular part of your states– but not the rest
  • #55 Questions