Your Auto-Scaling Bot
Volkan Tufekci
DevOps Eng.,
eBay Turkey - GittiGidiyor
• DevOps Eng 2+ years
• Backend Dev. 6+ years
Who am I?
• 15M subscribers
• 38M visitors/month
• A product is sold
every 2 seconds
eBay TR -
GittiGidiyor
• When we hit the thresholds
• When there is a push notification
• When we employ more services
Problem
We need to scale
• Container scaling
• Host(Node) scaling*
Problem
What kind of scaling?
* We can’t use public cloud services
We can dedicate someone to monitor the load on our services and
run "docker service scale" by hand
or
We can use a bot to scale the services automatically for us
Solution
Ingredients
• Docker Swarm Mode
• Node Exporter
• cAdvisor
• Prometheus
• Alertmanager
• Slack + Python Bot
• Grafana (not mandatory but nice to have)
Recipe
Official Prometheus exporter
for node metrics
Node Exporter
Container resource usage and
performance metrics
cAdvisor
Time-series-based
monitoring and alerting
solution
Prometheus
Prometheus
• scrape_interval
• evaluation_interval
• states:
• inactive
• pending
• firing
Under the Hood
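The three alert states can be illustrated with a small simulation. This is only a sketch of the idea, not Prometheus code: it assumes a rule with a hold duration (called `for_seconds` here, a hypothetical name) that is checked once per evaluation. An alert whose condition is true stays pending until the condition has held for the whole duration, then becomes firing; as soon as the condition is false it goes back to inactive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    """Tracks one alert across rule evaluations (illustrative only)."""
    for_seconds: float                      # hypothetical hold duration of the rule
    active_since: Optional[float] = None    # time the condition first became true

    def evaluate(self, condition_true: bool, now: float) -> str:
        """Return the alert state after one evaluation at time `now` (seconds)."""
        if not condition_true:
            self.active_since = None
            return "inactive"
        if self.active_since is None:
            self.active_since = now
        if now - self.active_since >= self.for_seconds:
            return "firing"
        return "pending"


# Evaluate every 15 seconds with a 30 second hold duration.
alert = AlertState(for_seconds=30)
history = [
    alert.evaluate(True, 0),    # pending
    alert.evaluate(True, 15),   # pending
    alert.evaluate(True, 30),   # firing
    alert.evaluate(False, 45),  # inactive
]
```

A shorter evaluation interval detects the transition sooner, while a longer hold duration filters out short spikes before the bot is asked to scale.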
Deduplicates, groups,
and routes alerts to the
correct receiver integration,
such as Slack
Alertmanager
Under the Hood
Alertmanager
• group_by
• group_wait
• group_interval
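What these three settings do can be sketched in a few lines of Python. This is a simplified model, not Alertmanager's implementation: alerts are reduced to plain label dictionaries, and the function and parameter names are made up for illustration. `group_by` decides which alerts end up in the same notification, `group_wait` delays the first notification for a new group, and `group_interval` spaces out later notifications for a group that has already been notified.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Alert = Dict[str, str]  # an alert reduced to its label set (assumption)


def group_alerts(alerts: List[Alert],
                 group_by: List[str]) -> Dict[Tuple[str, ...], List[Alert]]:
    """Bucket alerts that share the same values for the group_by labels."""
    groups: Dict[Tuple[str, ...], List[Alert]] = defaultdict(list)
    for alert in alerts:
        key = tuple(alert.get(label, "") for label in group_by)
        groups[key].append(alert)
    return dict(groups)


def next_notification_time(group_created_at: float,
                           last_sent_at: Optional[float],
                           group_wait: float,
                           group_interval: float) -> float:
    """Earliest time (seconds) the next notification for a group may go out."""
    if last_sent_at is None:
        # Nothing sent yet: wait group_wait so related alerts can join the group.
        return group_created_at + group_wait
    # Group already notified: wait group_interval before sending updates.
    return last_sent_at + group_interval


alerts = [
    {"alertname": "HighCPU", "service": "web"},
    {"alertname": "HighCPU", "service": "web"},
    {"alertname": "HighCPU", "service": "search"},
]
groups = group_alerts(alerts, group_by=["alertname", "service"])
```

Grouping by service means the bot receives one message per overloaded service instead of one per container.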
Open source software for
time series analytics.
Graphical visualization
solution.
Grafana
Big Picture
PROMETHEUS
BOT
• Parse the notification
• If it is a "FIRING" alarm from a service
• Get current # of tasks
• Check if we are below upper
threshold
• If so, increase scale
• If not, notify
Algorithm
Scaling up the service
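The scale-up steps above can be sketched as follows. This is a minimal sketch, not the bot from the talk: the notification is assumed to be already parsed into a dictionary with `status` and `service` keys, the upper threshold of 10 is an arbitrary example value, and the three callables are hypothetical hooks that keep the decision logic separate from Docker and Slack.

```python
from typing import Callable, Dict

MAX_REPLICAS = 10  # hypothetical upper threshold


def handle_firing(notification: Dict[str, str],
                  get_replicas: Callable[[str], int],
                  set_replicas: Callable[[str, int], None],
                  notify: Callable[[str], None],
                  max_replicas: int = MAX_REPLICAS) -> None:
    """Scale a service up by one task if a FIRING alarm arrives for it."""
    if notification.get("status", "").upper() != "FIRING":
        return  # not an alarm we act on here
    service = notification["service"]
    current = get_replicas(service)
    if current < max_replicas:
        set_replicas(service, current + 1)
    else:
        notify(f"{service} is already at {current} tasks; cannot scale up")


# In-memory stand-ins for the Docker and Slack calls.
replicas = {"web": 2}
messages = []
handle_firing({"status": "FIRING", "service": "web"},
              get_replicas=replicas.__getitem__,
              set_replicas=replicas.__setitem__,
              notify=messages.append,
              max_replicas=3)
```

In a real deployment `set_replicas` could wrap a call such as `docker service scale web=3`, and `notify` could post to the Slack channel; both are left as injected functions here so the logic can be tested without a swarm.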
• Parse the notification
• If it is a "RESOLVED" notification
• For each task
• Get current # of tasks
• Check if we are above
lower threshold
• If so, decrease scale
Algorithm
Scaling down the services
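A matching sketch for the scale-down path is shown below. It rests on several assumptions: the parsed notification is a dictionary with a `status` key and a `services` list, the loop on the slide is read as iterating over the services named in the notification, and the lower threshold of 1 is only an example value. As before, the Docker calls are passed in as hypothetical callables.

```python
from typing import Callable, Dict, List

MIN_REPLICAS = 1  # hypothetical lower threshold


def handle_resolved(notification: Dict[str, object],
                    get_replicas: Callable[[str], int],
                    set_replicas: Callable[[str, int], None],
                    min_replicas: int = MIN_REPLICAS) -> List[str]:
    """Scale each listed service down by one task; return those that changed."""
    if str(notification.get("status", "")).upper() != "RESOLVED":
        return []
    scaled_down = []
    for service in notification.get("services", []):
        current = get_replicas(service)
        if current > min_replicas:
            set_replicas(service, current - 1)
            scaled_down.append(service)
    return scaled_down


# In-memory stand-in for the swarm: "web" can shrink, "search" is at the floor.
task_counts = {"web": 3, "search": 1}
changed = handle_resolved({"status": "RESOLVED", "services": ["web", "search"]},
                          get_replicas=task_counts.__getitem__,
                          set_replicas=task_counts.__setitem__)
```

Removing one task per resolved notification is a deliberately cautious choice in this sketch; it avoids dropping capacity faster than the load is confirmed to have fallen.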
Simplified Code
Thank You!
@volkantufekci
@docker
#dockercon
