In this talk we share the lessons learned while building out the Baqend Cloud platform on AWS and Docker. Baqend’s AWS-hosted architecture consists of a caching CDN-Layer, global and local load balancing, a group of REST and Node.js servers and a database cluster with Redis and MongoDB. As customers have their own set of containerized REST and Node servers, we needed a cluster that on the one hand is horizontally scalable and on the other hand easily manageable and fault-tolerant from an operational perspective. Today there are at least 4 popular systems that claim to support this:
- Apache Mesos
- Docker Swarm
- AWS Elastic Container Service (ECS)
Thinking that ECS would certainly be the easiest option on AWS, we started building our cluster on it. We quickly came to realize that while ECS was astoundingly stable and easy to use there were inherent limitations that could not be worked around. An old Docker version, missing network isolation, no means of parameterizing task and forced memory constraints are major limitations of ECS we will talk about. Seeing the daunting operational overhead of running Kubernetes or Mesos in practice we turned to Docker’s native clustering solution Swarm. We will present how Swarm works with both Docker and AWS and highlight the advantages and downsides compared to Amazon’s ECS.