Running Cassandra in a Docker environment gives you a flexible development environment that uses only a very small set of resources, both locally and with your favorite cloud provider. Lessons learned running Cassandra with a very small set of resources are applicable to both your local development environment and larger, less constrained production deployments.
From http://www.meetup.com/Docker-Santa-Clara/events/232789407/
3. Who am I and what do I do?
• Ben Bromhead
• Co-founder and CTO of Instaclustr -> www.instaclustr.com
• Instaclustr provides Cassandra-as-a-Service in the cloud.
• Currently support AWS, Azure, Heroku and Softlayer with more to come.
• 700+ nodes
5. Objectives
• A quick intro on docker (for the Cassandra folk).
• Our docker story
• Working with Cassandra and docker.
• Running C* in a constrained env w/ docker
• Listen to my astonishment at all the progress docker has made
since I last gave this talk
6. Why docker matters
• Finally Developers have a solution to build once and deploy
anywhere
• Finally Ops/Admin has a solution to configure anywhere
• Finally DevOps is easy
• Dev == Test == Staging == Production
• Move with speed
7. Docker, how it works.
• Runs anywhere (Linux kernel 2.6.32+)
• Uses lightweight VMs:
• Own process space (namespace)
• Process isolation and resource control (cgroups)
• Own network adapter
• Own filesystem (chroot)
• Linux Analog to Solaris Zones, *BSD jails
8. Docker, how it works.
• Difference between a container and a VM
[Figure: Virtual Machine vs. Container]
9. Docker, how it works.
• What about the packaging component?
• Uses Union filesystem to create a git like workflow around your deployed code:
[Figure: Docker container image workflow. An image (App A on top of Bins/Libs) is pushed to the registry. A host running A wants to upgrade to A''. It requests the update and gets only the diffs (App Δ). The host is now running A''.]
10. Why we started using Docker
• We are super duper big fans of the “Immutable server” concept
• Once it’s deployed you don’t touch it
• No config management, no chef, no puppet etc
• Seed at boot and be done with it
11. Why we started using Docker
• Before Docker, we built AMIs in Amazon
• A new AMI for every deploy, version etc
• This meant we cycled our entire fleet of instances constantly
• Which is fine for some, but we work with persistent data
• Sooo much time streaming from replicas/copying backups from
S3
12. Why we started using Docker
• Docker images solved this for us
• Treat the host as a sterile environment
• Everything in a few docker containers which we can simply
update
• Cycle the docker container instead of the AMI
• Yes… docker was primarily a package management tool for us
13. Docker at Instaclustr
• So how do we get on board the hype train (an established devops
practice) without killing performance or stability?
• Ran in dev to get comfortable with it, then non-critical systems.
• Talked to others who use it in production
• https://github.com/docker/docker/issues - https://docs.docker.com/
You will spend a lot of time here
16. Docker & Cassandra - Networking
• 1st trial, throughput dropped in half!
• Writes sucked, streaming sucked, what was going on?
• Quick check with iperf showed a 50% hit in throughput
17. Docker & Cassandra - Networking
• Docker uses Linux Ethernet Bridges for basic software defined
routing. This will hose your network throughput (2014).
• Use the host network stack instead (--net=host), 0% impact on
Cassandra throughput (iperf still showed minor overhead)
• Also solves NAT issues in an AWS like networking environment.
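The host-networking option above can be sketched as a single docker invocation. This is a minimal sketch, not the deck's actual deployment: the image tag `cassandra:2.1` and the container name `cassandra-dev` are assumptions chosen for illustration, and the command is printed rather than executed so it is safe to run on a machine without docker.

```shell
#!/bin/sh
# Assumed placeholders: the image tag and container name are not from the talk.
IMAGE="cassandra:2.1"
NAME="cassandra-dev"

# --net=host makes the container share the host's network stack,
# so Cassandra traffic does not cross the docker bridge or NAT.
CMD="docker run -d --name $NAME --net=host $IMAGE"

# Print the command instead of running it; paste it into a shell to use it.
echo "$CMD"
```

With host networking the container binds directly to the host's ports, so two Cassandra containers on the same host would need different ports or addresses.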
18. Docker & Cassandra + Filesystem
• The filesystems (AUFS, BTRFS etc) that bring great benefits to Docker's
workflow around building and snapshotting containers are not very good for
databases.
• You also need to keep your C* data, commitlogs & caches in a Docker volume
mount for persistence.
• UnionFS (AUFS) is terrible for writing lots of big files.
• BTRFS is a pain to use from an ops point of view. Terrible
• Hooray, volume mounts use the underlying filesystem. Put the cassandra data dir
on a volume mount with a decent fs (e.g. xfs)
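A volume mount of this kind might look like the sketch below. The host path `/mnt/xfs/cassandra` (assumed to sit on an xfs filesystem), the in-container path `/var/lib/cassandra`, and the image tag are all assumptions for illustration; check where your image actually keeps data, commitlogs and caches. The command is printed rather than executed.

```shell
#!/bin/sh
# Assumed placeholders, not taken from the talk.
HOST_DATA_DIR="/mnt/xfs/cassandra"       # directory on an xfs-formatted host disk
CONTAINER_DATA_DIR="/var/lib/cassandra"  # where the image is assumed to store data
IMAGE="cassandra:2.1"

# -v bind-mounts the host directory into the container, so writes bypass
# the union filesystem and land on the host's own filesystem.
CMD="docker run -d --name cassandra-dev -v $HOST_DATA_DIR:$CONTAINER_DATA_DIR $IMAGE"

# Print the command instead of running it.
echo "$CMD"
```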
20. Docker + Process Capabilities
• Docker by default drops all process capabilities except the
minimum needed to start.
• https://github.com/docker/docker/blob/master/oci/defaults_linux.go#L64-L79
21. Docker + Process Capabilities
• Cassandra needs to pin its memory in RAM using mlockall, otherwise things
get sloooow.
• mlockall is gated by a process capability.
• A process needs CAP_IPC_LOCK & RLIMIT_MEMLOCK in order to
perform this operation. By default docker doesn't assign these to a running
container…
• Can use --privileged and be done with it. Kind of lazy though
• Use --cap-add instead
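Granting only what mlockall needs, instead of `--privileged`, could look like the following sketch. The image tag and container name are assumptions; `--cap-add=IPC_LOCK` adds the capability and `--ulimit memlock=-1:-1` lifts the locked-memory limit (soft:hard, -1 meaning unlimited). The command is printed rather than executed.

```shell
#!/bin/sh
# Assumed placeholders, not taken from the talk.
IMAGE="cassandra:2.1"
NAME="cassandra-dev"

# Add just the IPC_LOCK capability and raise RLIMIT_MEMLOCK,
# rather than handing the container every privilege.
CMD="docker run -d --name $NAME --cap-add=IPC_LOCK --ulimit memlock=-1:-1 $IMAGE"

# Print the command instead of running it.
echo "$CMD"
```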
22. Docker + SIGTERM propagation
• When stopping the process docker will send a SIGTERM.
• Some interpreted languages behave differently as PID 1. E.g. Python/Bash do not
have default signal handlers when running as PID 1.
• Bad if you use a bash script to launch Cassandra
• Java to the rescue!
• Make sure you run the cassandra bash script with -f (foreground)
• exec causes the JVM to replace the bash process… making the world a
happier place
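The effect of exec can be seen without Cassandra at all. The sketch below uses plain `sh` as a stand-in for both the launch script and the JVM: each probe prints the wrapper's PID and then the PID of the command it launches. In a real launch script the equivalent last line would be something like `exec cassandra -f` (the exact path is an assumption that depends on your image).

```shell
#!/bin/sh
# Without exec the launched command is a child process with its own PID.
# (The trailing "true" stops shells from optimising the last command into an exec.)
no_exec=$(sh -c 'echo $$; sh -c "echo \$\$"; true' | tr '\n' ' ')

# With exec the launched command replaces the wrapper and keeps its PID,
# so a signal sent to the wrapper's PID reaches the real process.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"' | tr '\n' ' ')

echo "without exec: $no_exec"
echo "with exec:    $with_exec"
```

The second line of output shows the same PID twice, which is why the exec'd JVM receives docker's SIGTERM directly when the script was started as PID 1.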
23. Docker + SIGTERM propagation
• Tools like OpsCenter Server will have trouble with this.
• Can be fixed using a wacky combination of trap and wait stanzas in
your OpsCenter Server script (see http://veithen.github.io/2014/11/16/sigterm-propagation.html)
• But now you have a bash script that duplicates init/systemd/supervisord
• The debate rages on…
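A trap-and-wait wrapper of the kind mentioned above might be sketched as follows. It is a simplified illustration, not the script from the linked article: `sleep 30` stands in for the real service, and the wrapper signals itself after one second to simulate `docker stop`.

```shell
#!/bin/sh
# Stand-in for the real long-running service.
sleep 30 &
child=$!

# When the wrapper receives SIGTERM, pass it on to the child.
trap 'kill -TERM "$child" 2>/dev/null' TERM

# Simulate "docker stop": send SIGTERM to this wrapper after one second.
( sleep 1; kill -TERM $$ ) &

# The first wait is interrupted when the trap fires;
# the second wait collects the child's real exit status.
wait "$child"
wait "$child"
status=$?

echo "child exited with status $status"
```

The double wait is the awkward part: a trapped signal makes the first `wait` return early, so the script has to wait again to be sure the child has finished shutting down.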
24. Docker + CoreOS
• Docker + fav OS + CM?, CoreOS + etcd, Swarm + Machine, Deis
etc
• We chose CoreOS (Appeared to be sane, etcd is cool, systemd if
you are into that kind of thing)
• Docker (the company) now does their own thing… did you know
they now call Docker… Docker Engine… who’d have thunk.
25. Docker + CoreOS
• Disable automatic updates + restarts (seriously do this)
• Fix logging, otherwise you will log to 3 locations (/var/log/cassandra,
journalctl and docker's json based log)
• JVM will exit with error 143 (128 + 15 for SIGTERM). Need to ignore
that in your systemd service definition.
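The 128 + 15 convention is easy to check from a shell. The sketch below terminates a throwaway `sh` with SIGTERM and reads the resulting status, then prints a systemd drop-in that treats that status as success. The unit it would belong to (for example a `cassandra.service`) is an assumption; `SuccessExitStatus=` is the systemd directive for declaring extra exit codes as clean.

```shell
#!/bin/sh
# A process terminated by SIGTERM (signal 15) is reported as 128 + 15.
sh -c 'kill -TERM $$'
status=$?
echo "exit status after SIGTERM: $status"

# Drop-in for a hypothetical unit, telling systemd not to mark the
# service as failed when it stops with this status.
cat <<'EOF'
[Service]
SuccessExitStatus=143
EOF
```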
26. Docker + Dev Env
• Docker relies on Linux kernel capabilities… so no native docker in OS X
• We use OSX for dev, so we run vagrant and the CoreOS vagrant file
• Install Docker userland tools in OS X and forward ports to the vagrant box
running CoreOS
• Our env is a little strange, we run a single cassandra instance on a single CoreOS
vm.
• Docker for mac now uses a lighter weight virtualisation layer native to OSX.
• Look at https://github.com/tobert/cassandra-docker for full dockerisation!
27. Docker + C* + Dev Env
• How do I run lots of C* instances on a VM or my dev laptop without
it falling over?
• Backwards performance tuning!
• Make it run as slowly, but as stable as possible!
28. Docker + C* + Dev Env
• Set Memory to be super low (you can go higher than this), edit your
cassandra-env.sh:
MAX_HEAP_SIZE="128M"
HEAP_NEWSIZE="24M"
29. Docker + C* + Dev Env
• Tune compaction to have free rein and to smash the disk
concurrent_compactors: 1
in_memory_compaction_limit_in_mb: 2
compaction_throughput_mb_per_sec: 0
30. Docker + C* + Dev Env
• Let’s use HSHA thrift server as it reduces the memory per thread
used.
rpc_server_type: hsha
31. Docker + C* + Dev Env
• The HSHA server also lets us limit the number of threads serving in
flight requests, but still have a large number of clients connected.
concurrent_reads: 4
concurrent_writes: 4
rpc_min_threads: 2
rpc_max_threads: 2
• You can play with these to get the right numbers based on how your
clients connect, but keep them low.
32. Docker + C* + Dev Env
• This is Dev! Caches have no power here!
key_cache_size_in_mb: 0
reduce_cache_sizes_at: 0
reduce_cache_capacity_to: 0
33. Docker + C* + Dev Env
• How well does this work?!?!
• Will survive running the insane workload in the c* 2.1 new stresstest
tool.
• We run this on AWS t2.small instances
• Sign up at https://www.instaclustr.com and give our new Developer
nodes a spin!