Logging in dockerized environment
Yury Bushmelev
Senior System Engineer
@jay7t
Few words about logs
▪ Logs contain formatted messages
▪ Some log messages may be dropped
▪ Logs contain no business-critical data
2
Introduction
▪ We have 6 ventures in 3 DCs
▪ We have about 60 microservices (Go) plus some legacy (PHP)
▪ We have a log format defined as part of our service-oriented architecture (SOA) convention
3
How to deal with it?
▪ How can a service log a message?
▪ How to collect logs from all services?
▪ How to deliver collected logs to a
central place?
▪ How to store logs for analysis?
▪ How to store logs for longer term?
4
Dark times 1/2
▪ dockerized services write logs to std{out,err}
▪ std{out,err} is redirected to a file in a directory mapped from the host
▪ logrotate with the `copytruncate` option
5
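The `copytruncate` setup above can be sketched as a logrotate config. This is illustrative only: the path and retention values are assumptions. It also explains one way messages get dropped: `copytruncate` copies the file and then truncates the original in place, so lines written during that window are lost.

```
# /etc/logrotate.d/myservice -- sketch; path and values are assumptions
/var/log/containers/myservice.log {
    daily
    rotate 7
    compress
    missingok
    # copy the file, then truncate the original in place;
    # the writer keeps its fd open, but lines written during
    # the copy/truncate window are silently lost
    copytruncate
}
```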
Dark times 2/2
▪ td-agent (a distribution of fluentd) on the host:
– continuously reads the log file
– parses it into JSON
– sends it to Kafka in the SG DC(!)
– eats all RAM and dies
– ... or gets stuck silently
6
Best practices™
▪ write log messages to std{out,err}
▪ use Docker log drivers for collection and delivery
7
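As a sketch of the log-driver approach, Docker's `daemon.json` can select a driver for all containers; the syslog address below is a placeholder.

```json
{
  "log-driver": "syslog",
  "log-opts": {
    "syslog-address": "udp://127.0.0.1:514",
    "tag": "{{.Name}}"
  }
}
```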
Lazada-way™
▪ replace td-agent with rsyslog for log collection, parsing, and delivery
▪ write log messages to a custom AF_UNIX SOCK_DGRAM socket mapped from the host (rsyslog imuxsock input)
– why not /dev/log?
– what about blocking?
– what about the maximum message size?
8
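The socket questions above can be illustrated with a minimal Python sketch: one side binds a SOCK_DGRAM unix socket (as rsyslog's imuxsock would on the host side), the other connects and sends a datagram. The socket path and message are made up; the non-blocking client socket shows one answer to the blocking question — a full buffer makes the send fail fast instead of stalling the service.

```python
import os
import socket
import tempfile

# Hypothetical socket path; the deck maps /run/syslog/<socket> from the host.
sock_dir = tempfile.mkdtemp()
sock_path = os.path.join(sock_dir, "app.sock")

# "rsyslog side": bind a datagram unix socket, like the imuxsock input does
server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
server.bind(sock_path)

# application side: connect and send one datagram per log line
client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
client.setblocking(False)  # don't block the service if the reader stalls
client.connect(sock_path)
client.send(b"<14>myservice: hello from the app")

# datagrams preserve message boundaries: each recv returns one whole message
msg = server.recv(65536)
print(msg)  # → b'<14>myservice: hello from the app'
```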
Lazada-way™
9
[Diagram: API services write to /run/syslog/&lt;socket&gt;, consumed by rsyslogd on the host]
Lazada-way™ for k8s
▪ rsyslog is installed and configured on
every k8s node
▪ socket dir (/run/syslog) is mapped to
every container
10
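Mapping the socket directory into every container can be sketched as a hostPath volume in a pod spec; the names and image below are placeholders.

```yaml
# Pod spec fragment (sketch): expose the host's /run/syslog inside the container
apiVersion: v1
kind: Pod
metadata:
  name: api-service
spec:
  containers:
  - name: app
    image: example/app:latest   # placeholder image
    volumeMounts:
    - name: syslog-socket
      mountPath: /run/syslog
  volumes:
  - name: syslog-socket
    hostPath:
      path: /run/syslog
      type: Directory
```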
Log parsing
▪ SOA format:

{instance_name} | {YYYY-mm-ddTHH:mm:ss.microseconds±hh:mm} | {TraceId} | {ParentSpanId} | {SpanId} | {rollout_type} | {service_name} | {level} | {component_name} | {file_name_and_line} | {message} | {additional_data} | {is_truncated}
▪ Parsing rule:

rule=v2:%instance_name:word% | %origintime:date-rfc5424% | %traceid:char-sep: % | %parentspanid:char-sep: % | %spanid:char-sep: % | %rollout_type:char-sep: % | %service_name:string-to: |% | %level:char-sep: % | %component_name:char-sep: % | %filename_and_line:char-sep: % | %short_message:string-to: |% | %additional_data:string-to: |% | %is_truncated:rest%
11
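Outside rsyslog, the same pipe-delimited SOA format can be parsed with a few lines of Python. This sketch mirrors the field list of the liblognorm rule above; the sample line is invented, and field values are assumed not to contain the ` | ` separator themselves.

```python
# Field names taken from the SOA format / liblognorm rule above
FIELDS = [
    "instance_name", "origintime", "traceid", "parentspanid", "spanid",
    "rollout_type", "service_name", "level", "component_name",
    "filename_and_line", "short_message", "additional_data", "is_truncated",
]

def parse_soa(line: str) -> dict:
    """Split one SOA-formatted log line into a field dict."""
    parts = line.split(" | ")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))

# Invented sample line with all 13 fields
sample = ("web-1 | 2017-06-01T12:00:00.000000+08:00 | abc123 | 0 | 1 | "
          "stable | cart | info | checkout | cart.go:42 | order created | "
          "{} | false")
print(parse_soa(sample)["level"])  # → info
```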
Log delivery
12
[Diagram: API servers in each DC feed a local relay and store; relays forward to the SG DC, which injects messages into Kafka for Graylog]
Log delivery
▪ send messages to the in-DC log collector for long-term storage
▪ send messages to the in-DC log relay for delivery to the SG DC (compressed TCP stream)
▪ inject messages into Kafka from the SG DC relay
13
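The compressed TCP forwarding step can be sketched with rsyslog's omfwd output. The target host is a placeholder, and the compression parameters should be checked against the rsyslog version in use.

```
# rsyslog forwarding action (sketch) -- host and values are assumptions
action(
    type="omfwd"
    target="relay.sg.example.com"      # placeholder for the SG DC relay
    port="514"
    protocol="tcp"
    ziplevel="6"                       # zlib compression level
    compression.mode="stream:always"   # compress the whole TCP stream
)
```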
Log storage
▪ rsyslog-based log file storage per DC (also usable with command-line tools like tail/grep/awk/...)
▪ Graylog + Elasticsearch for analysis
14
Metrics 1/3
▪ home-grown rsyslog_exporter to expose rsyslog statistics (impstats) to Prometheus
▪ Grafana dashboards
▪ Prometheus alerts
15
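The counters the exporter scrapes come from rsyslog's impstats module; a minimal sketch of enabling it (the interval and file path are assumptions):

```
# rsyslog.conf fragment (sketch): emit internal counters periodically
module(
    load="impstats"
    interval="60"            # seconds between stats emissions
    format="json"            # machine-readable, easy for an exporter to parse
    resetCounters="on"
    log.syslog="off"         # don't loop stats back through normal log flow
    log.file="/var/log/rsyslog-stats.log"
)
```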
Metrics 2/3
16
Metrics 3/3
17
Future plans
▪ Rate-limits
▪ More work on reliability
▪ SLAs with developers
18
It's feedback time!
Yury Bushmelev
@jay7t
sgtechhub@lazada.com
