2. A few words about logs
▪ Log contains formatted messages
▪ Some log messages may be dropped
▪ Log contains no business-critical data
3. Introduction
▪ We have 6 ventures in 3 DCs
▪ We have about 60 microservices
(Go) plus some legacy ones (PHP)
▪ Our log format is defined as part of
the service-oriented architecture (SOA)
convention
4. How to deal with it?
▪ How can a service log a message?
▪ How to collect logs from all services?
▪ How to deliver collected logs to a
central place?
▪ How to store logs for analysis?
▪ How to store logs for the long term?
5. Dark times 1/2
▪ dockerized services write logs to
std{out,err}
▪ std{out,err} is redirected to a file
residing in a mapped directory
▪ logrotate with the `copytruncate` option
6. Dark times 2/2
▪ td-agent (distribution of fluentd) on
host:
– continuously reads log file
– parses it to JSON
– sends to Kafka in SG DC(!)
– eats all RAM and dies
– ... or silently gets stuck
7. Best practices™
▪ write log messages to std{out,err}
▪ use docker log drivers for collection
and delivery
8. Lazada-way™
▪ replace td-agent with rsyslog for log
collection, parsing and delivery
▪ write log messages to a custom AF_UNIX
SOCK_DGRAM socket mapped from host
(rsyslog imuxsock input)
– why not /dev/log?
– what about blocking?
– what about max message size?
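One way a Go service could approach the socket write and the last two questions is sketched below. This is not the code from the talk: the `sockLogger` type, the 2048-byte `maxMsgSize`, and the timeout are assumptions, and the real limit has to match what the receiving side (here rsyslog) accepts. The sketch bounds blocking with a write deadline and truncates oversized messages, on the premise that some log messages may be dropped.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// maxMsgSize is an assumed datagram limit, not a value from the talk.
const maxMsgSize = 2048

// sockLogger sends each log message as one datagram to a unix socket.
type sockLogger struct {
	conn    *net.UnixConn
	timeout time.Duration
}

// dialLog connects to an AF_UNIX SOCK_DGRAM socket at path.
func dialLog(path string, timeout time.Duration) (*sockLogger, error) {
	conn, err := net.DialUnix("unixgram", nil, &net.UnixAddr{Name: path, Net: "unixgram"})
	if err != nil {
		return nil, err
	}
	return &sockLogger{conn: conn, timeout: timeout}, nil
}

// Send truncates over-long messages (byte-wise, so a multi-byte character may
// be cut) and gives up after the timeout instead of blocking the service.
func (l *sockLogger) Send(msg []byte) error {
	if len(msg) > maxMsgSize {
		msg = msg[:maxMsgSize]
	}
	if err := l.conn.SetWriteDeadline(time.Now().Add(l.timeout)); err != nil {
		return err
	}
	_, err := l.conn.Write(msg)
	return err
}

// roundTrip starts a throwaway receiver in a temp dir (standing in for the
// rsyslog input), sends msg through sockLogger, and returns what arrived.
func roundTrip(msg string) (string, error) {
	dir, err := os.MkdirTemp("", "logsock")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "log.sock")
	srv, err := net.ListenUnixgram("unixgram", &net.UnixAddr{Name: path, Net: "unixgram"})
	if err != nil {
		return "", err
	}
	defer srv.Close()
	l, err := dialLog(path, 100*time.Millisecond)
	if err != nil {
		return "", err
	}
	defer l.conn.Close()
	if err := l.Send([]byte(msg)); err != nil {
		return "", err
	}
	if err := srv.SetReadDeadline(time.Now().Add(time.Second)); err != nil {
		return "", err
	}
	buf := make([]byte, maxMsgSize)
	n, _, err := srv.ReadFromUnix(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	got, err := roundTrip("service started")
	fmt.Println(got, err)
}
```

In a real service the caller would typically count failed sends rather than retry them, so that a slow or missing collector never stalls request handling.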
13. Log delivery
▪ send messages to the in-DC log collector
for long-term storage
▪ send messages to the in-DC log relay for
delivery to the SG DC (compressed TCP
stream)
▪ inject messages into Kafka from the SG DC
relay
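In this setup the relaying is done by rsyslog itself; the Go sketch below only illustrates the idea of a compressed stream between two relays. The function names are hypothetical, zlib and newline framing are assumptions, and a `bytes.Buffer` stands in for the TCP connection.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// relay writes newline-delimited messages to dst through one zlib stream.
// A long-lived relay would call Flush periodically instead of closing.
func relay(dst io.Writer, msgs []string) error {
	zw := zlib.NewWriter(dst)
	for _, m := range msgs {
		if _, err := io.WriteString(zw, m+"\n"); err != nil {
			return err
		}
	}
	return zw.Close()
}

// receive decompresses the stream and splits it back into messages.
func receive(src io.Reader) ([]string, error) {
	zr, err := zlib.NewReader(src)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var out []string
	sc := bufio.NewScanner(zr)
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out, sc.Err()
}

// roundTrip sends msgs through an in-memory "wire" and returns the received
// messages together with the number of compressed bytes that crossed it.
func roundTrip(msgs []string) ([]string, int, error) {
	var wire bytes.Buffer // stands in for the inter-DC TCP connection
	if err := relay(&wire, msgs); err != nil {
		return nil, 0, err
	}
	sent := wire.Len()
	got, err := receive(&wire)
	return got, sent, err
}

func main() {
	msgs := make([]string, 100)
	raw := 0
	for i := range msgs {
		msgs[i] = `{"level":"info","service":"checkout","message":"order created"}`
		raw += len(msgs[i]) + 1
	}
	got, sent, err := roundTrip(msgs)
	fmt.Println(len(got), "messages,", raw, "raw bytes,", sent, "compressed bytes,", err)
}
```

Compressing the whole stream rather than individual messages lets the compressor exploit the repetition between similar log lines, which is what makes it attractive for cross-DC links.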
14. Log storage
▪ rsyslog-based log file storage per DC
(also usable with command-line tools
like tail/grep/awk/...)
▪ Graylog + ElasticSearch for analysis
15. Metrics 1/3
▪ home-grown rsyslog_exporter to export
rsyslog statistics (impstats) to
Prometheus
▪ Grafana dashboards
▪ Prometheus alerts
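The core of such an exporter can be sketched in Go as a translation from one impstats JSON line into Prometheus text-format samples. This is not the actual rsyslog_exporter: the metric naming scheme (`rsyslog_<origin>_<counter>`) and the sample line are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// toMetrics converts one impstats JSON object into Prometheus text-format
// samples, one per numeric counter. The naming scheme is hypothetical.
func toMetrics(line string) ([]string, error) {
	var stats map[string]interface{}
	if err := json.Unmarshal([]byte(line), &stats); err != nil {
		return nil, err
	}
	name, _ := stats["name"].(string)
	origin, _ := stats["origin"].(string)
	// Metric names may not contain dots, dashes, or spaces.
	sanitize := strings.NewReplacer(".", "_", "-", "_", " ", "_")
	var out []string
	for key, v := range stats {
		num, ok := v.(float64) // encoding/json decodes JSON numbers as float64
		if !ok {
			continue // skip non-numeric fields such as "name" and "origin"
		}
		out = append(out, fmt.Sprintf("rsyslog_%s_%s{name=%q} %s",
			sanitize.Replace(origin), sanitize.Replace(key), name,
			strconv.FormatFloat(num, 'f', -1, 64)))
	}
	sort.Strings(out) // map iteration order is random; sort for stable output
	return out, nil
}

func main() {
	// Illustrative input in the shape of an impstats JSON record.
	line := `{"name":"imuxsock","origin":"imuxsock","submitted":42,"ratelimit.discarded":0}`
	samples, err := toMetrics(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, s := range samples {
		fmt.Println(s)
	}
}
```

For the sample line this prints `rsyslog_imuxsock_ratelimit_discarded{name="imuxsock"} 0` followed by `rsyslog_imuxsock_submitted{name="imuxsock"} 42`, which Prometheus can scrape and Grafana dashboards or alert rules can then query.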