ContainerDays NYC 2016: "Observability and Manageability in a Container Environment" (Tim Gross)

Slides from the workshop "Observability and Manageability in a Container Environment", led by Tim Gross, at ContainerDays NYC 2016: http://dynamicinfradays.org/events/2016-nyc/programme.html#observability

Published in: Technology

  1. OBSERVABILITY IN A CONTAINERIZED WORLD. Tim Gross, @0x74696d
  2. https://joyent.com/about/careers
  3. @0x74696d, github.com/tgross/observability-workshop. “We have built mind-bogglingly complicated systems that we cannot see, allowing glaring performance problems to hide in broad daylight in our systems.” Bryan Cantrill, Joyent CTO. ACM Queue Vol 4, Issue 1, 2006 Feb 23. http://queue.acm.org/detail.cfm?id=1117401
  4. “Get used to interacting with your observability tooling every day. As part of your release cycle, or just out of curiosity. Honestly, things are broken all the time — you don’t even know what normal looks like unless you’re also interacting with your observability tooling under “normal” circumstances.” Charity Majors, CEO, Honeycomb.io. “Building Badass Engineers and Badass Teams”, http://ow.ly/IDOs305uN7W
  5. OBSERVABILITY IN CONTAINERS = OBSERVABILITY OF APPLICATIONS
  6. $ ssh ubuntu@${IP}
     Welcome to Ubuntu 16.04.1 LTS (GNU/Linux 4.4.0-45-generic x86_64)
     Certified Ubuntu Cloud Image
     [ASCII-art logo] Instance (Ubuntu 16.04.1 LTS 20161020)
     https://docs.joyent.com/images/linux/ubuntu-certified
     http://www.ubuntu.com/cloud#joyent
      * Documentation: https://help.ubuntu.com
      * Management: https://landscape.canonical.com
      * Support: https://ubuntu.com/advantage
     Get cloud support with Ubuntu Advantage Cloud Guest:
     http://www.ubuntu.com/business/services/cloud
     Last login: Wed Nov 2 14:20:32 2016 from 98.114.23.154
     ubuntu@05ca8420-3b56-43c5-9e28-209bd2eab154:~$
  7. ubuntu@05ca8420-3b56-43c5-9e28-209bd2eab154:~$ cd workshop/
     ubuntu@05ca8420-3b56-43c5-9e28-209bd2eab154:~/workshop$ ls -lah
     total 68K
     drwxrwxr-x 8 ubuntu ubuntu 4.0K Nov  2 14:32 .
     drwxr-xr-x 7 ubuntu ubuntu 4.0K Oct 31 19:51 ..
     drwxrwxr-x 2 ubuntu ubuntu 4.0K Oct 31 18:55 consul
     -rw-rw-r-- 1 ubuntu ubuntu 2.9K Nov  2 14:32 docker-compose.yml
     drwxrwxr-x 3 ubuntu ubuntu 4.0K Nov  1 13:56 fortunes
     drwxrwxr-x 8 ubuntu ubuntu 4.0K Nov  1 19:46 .git
     -rw-rw-r-- 1 ubuntu ubuntu   28 Oct 31 18:55 .gitignore
     -rw-rw-r-- 1 ubuntu ubuntu  16K Oct 31 17:26 LICENSE
     -rw-rw-r-- 1 ubuntu ubuntu 3.1K Nov  1 19:46 local-compose.yml
     drwxrwxr-x 2 ubuntu ubuntu 4.0K Oct 31 18:55 mysql
     drwxrwxr-x 2 ubuntu ubuntu 4.0K Nov  1 15:26 nginx
     -rw-rw-r-- 1 ubuntu ubuntu 6.1K Oct 31 18:55 README.md
     drwxrwxr-x 3 ubuntu ubuntu 4.0K Oct 31 20:31 setup
  8. ubuntu@05ca8420-3b56-43c5-9e28-209bd2eab154:~/workshop$ docker images
     REPOSITORY                TAG         IMAGE ID       CREATED        SIZE
     workshop_nginx            latest      1e8c5dabd8d5   19 hours ago   249.6 MB
     workshop_fortunes         latest      c94501cbbf3b   25 hours ago   61.42 MB
     workshop_mysql            latest      504d8ce0ff06   44 hours ago   491.8 MB
     workshop_consul           latest      a52cbd2b8c03   44 hours ago   54.69 MB
     alpine                    3.4         baa5d63471ea   2 weeks ago    4.799 MB
     autopilotpattern/nginx    1-r6.1.0    50ff23913232   2 weeks ago    249.6 MB
     autopilotpattern/mysql    5.6r3.1.0   d9709015cd62   4 weeks ago    491.8 MB
     autopilotpattern/consul   0.7r0.7     224d9f7134fa   6 weeks ago    54.69 MB
  9. ubuntu@05ca8420-3b56-43c5-9e28-209bd2eab154:~/workshop$ docker-compose up -d
     Creating workshop_mysql_1
     Creating workshop_consul_1
     Creating workshop_nginx_1
     Creating workshop_fortunes_1
     ubuntu@05ca8420-3b56-43c5-9e28-209bd2eab154:~/workshop$ docker-compose ps
     Name                  Command                          State   Ports
     --------------------------------------------------------------------
     workshop_consul_1     /usr/local/bin/containerpi ...   Up
     workshop_fortunes_1   /bin/containerpilot -confi ...   Up
     workshop_mysql_1      containerpilot mysqld --co ...   Up
     workshop_nginx_1      /usr/local/bin/containerpi ...   Up
  10. [Architecture diagram] Nginx, Consul, MySQL Primary, MySQL Replica, ES Master, ES Data, Kibana, Prometheus, Logstash, Service A, Service B
  11. [Architecture diagram] Nginx, Consul, MySQL, Elasticsearch, Kibana, Prometheus, Fortunes, Logstash, Consul Agent
  17. Your KVM (virtual machine):
      ▸ Docker Engine
      ▸ Composed containers
      ▸ Host networking (shared IP)
      [Diagram] Docker running the Nginx container, Fortunes container (node.js), Consul Agent container, and MySQL container
  18. Triton compute node (bare-metal compute):
      ▸ SmartOS
      ▸ Many customer containers
      ▸ VXLAN: 1 container = 1+ IPs
      [Diagram] SmartOS container hypervisor hosting many customer containers and customer KVMs
  19. Triton Cloud (bare-metal compute):
      ▸ Many compute nodes
      ▸ Containers distributed transparently across DC
      [Diagram] SmartOS container hypervisor hosting customer containers and customer KVMs; Elasticsearch, Prometheus, Logstash, Kibana
  20. Docker Log drivers
      ‣ Capture stdout/stderr; easy for 12-Factor apps
      ‣ Mangle multi-line logs (stack traces)
      ‣ Wrap each log line in the log driver's own structure, which makes parsing structured logs really messy
  21. Logging to File
      ‣ Mounting a volume on the host for logging to file isn’t portable across platforms (e.g. PaaS)
      ‣ Log shippers as a co-process in the container can be arbitrarily smart
  22. Structured Logging
      ‣ Requires more storage, less ingest processing
      ‣ More metadata to search on
      ‣ Docker log drivers blow away structured logs =(
      ‣ Consider logging directly from app to collector
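The structured-logging bullets above can be illustrated with a small sketch. It assumes a Python application (the workshop's Fortunes service is node.js, so this is an analogue, not the workshop code). `JsonFormatter`, `make_logger`, and `demo` are hypothetical names; the `time` key mirrors the field the workshop's Logstash `json` filter reads, and the other keys are illustrative. Each record is written as one JSON object per line, so a stack trace stays on a single line instead of being split by a line-oriented log driver.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: renders each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "time": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        if record.exc_info:
            # json.dumps escapes the newlines, so the trace stays on one line
            payload["stack"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def make_logger(stream, name="fortunes"):
    """Build a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def demo():
    """Log one exception and return the raw text that was written."""
    buf = io.StringIO()
    log = make_logger(buf)
    try:
        1 / 0
    except ZeroDivisionError:
        log.exception("failed to pick a fortune")
    return buf.getvalue()
```

Pointing the same handler at a socket to the collector instead of stdout would be one way to act on the last bullet; that transport is left out here.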
  23. <20>Nov 11 2016 14:52:01 nginx/edaac4e19616 [4996]: [2016-11-04T14:52:01+00:00] 5c18b677d628d9511818baab9f33ceb3 "GET / HTTP/1.1" 200 204 "-" 127.0.0.1 "curl/7.47.0"
  24. (Same log line, annotated) syslog wrapper: added by log driver
  25. (Same log line, annotated) container identifier via Compose hostname flag
  26. (Same log line, annotated) Nginx access log timestamp
  27. (Same log line, annotated) Nginx $request_id field
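The anatomy of that log line can be made concrete with a short sketch. It uses plain Python regular expressions rather than the workshop's Logstash grok patterns, so the expressions are approximations written for this one sample, and `parse_line` is a hypothetical helper. The field names follow the grok captures used in the workshop config. The syslog priority is split using the standard rule PRI = facility * 8 + severity. The outer pattern tolerates the space before the PID as it appears on the slide.

```python
import re

SAMPLE = ('<20>Nov 11 2016 14:52:01 nginx/edaac4e19616 [4996]: '
          '[2016-11-04T14:52:01+00:00] 5c18b677d628d9511818baab9f33ceb3 '
          '"GET / HTTP/1.1" 200 204 "-" 127.0.0.1 "curl/7.47.0"')

# Outer layer: the syslog wrapper added by the Docker log driver.
SYSLOG_RE = re.compile(
    r'<(?P<pri>\d+)>'
    r'(?P<syslog_timestamp>\w{3} +\d+ \d{4} [\d:]+) '
    r'(?P<serviceid>\w+)/(?P<containerid>[\w.-]+)\s*'
    r'\[(?P<pid>\d+)\]: '
    r'(?P<msg>.*)'
)

# Inner layer: the Nginx access log format shown on the slide.
NGINX_RE = re.compile(
    r'\[(?P<log_timestamp>[^\]]+)\] '
    r'(?P<req_id>\w+) '
    r'"(?P<http_method>\w+) (?P<http_request>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<http_code>\d+) (?P<http_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" '
    r'(?P<client>\S+) '
    r'"(?P<http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Peel off the syslog wrapper, then try the Nginx access log format."""
    outer = SYSLOG_RE.match(line)
    if outer is None:
        return {"tags": ["parse_error"], "message": line}
    event = outer.groupdict()
    pri = int(event.pop("pri"))
    event["facility"], event["severity"] = divmod(pri, 8)
    inner = NGINX_RE.match(event["msg"])
    if inner is not None:
        event.update(inner.groupdict())
    return event
```

Calling `parse_line(SAMPLE)` yields the service name, container ID, PID, and the Nginx fields as separate keys; a line that does not match the wrapper is tagged `parse_error`, in the spirit of the Logstash config.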
  28. $ vi ~/workshop/setup/supporting/logstash/logstash.conf
      ...
      filter {
        # first parse out the body from the syslog format
        # we've added tags via the Docker log drivers to identify the
        # specific container and the service identifier. See:
        # https://docs.docker.com/engine/admin/logging/log_tags/
        grok {
          match => { "message" => '%{SYSLOG5424PRI:syslog5424_pri}+(?:%{SYSLOGTIMESTAMP:syslog_timestamp}|-) %{WORD:serviceid}/+(?:%{HOSTNAME:containerid}|-)\[+(?:%{POSINT:pid}|-)\]: %{GREEDYDATA:msg}' }
        }
        syslog_pri { }
      ...
  29. ...
        # failed to match, so parse as error
        if "_grokparsefailure" in [tags] {
          mutate { add_tag => "parse_error" }
        } else {
          mutate {
            # the raw message is redundant data at this point
            remove_field => [ "message", "@source_host" ]
          }
          mutate {
            # once we've got a valid syslog parse we can discard all this rubbish
            # because the Docker log driver stomped all over anything useful
            remove_field => [ "syslog_hostname", "syslog_message", "syslog_timestamp",
                              "syslog_severity", "syslog_facility_code", "syslog_severity_code",
                              "syslog_facility", "syslog5424_pri" ]
          }
        }
      ...
  30. ...
        # filter to get the application-specific log format. lots of these have
        # their own timestamps, which we'll capture and overwrite the "outermost"
        # timestamp with
        grok {
          # nginx access log
          match => { "msg" => '\[%{TIMESTAMP_ISO8601:log_timestamp}\] %{WORD:req_id} "%{WORD:http_method} %{URIPATHPARAM:http_request} HTTP/%{NUMBER:http_version}" %{NUMBER:http_code} %{NUMBER:http_bytes_sent} (?:%{QUOTEDSTRING:http_referer}|-) %{IP:client} (?:%{QUOTEDSTRING:http_user_agent}|-)' }
          # nginx error msg
          match => { "msg" => '(?<log_timestamp>%{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{TIME}) (?<http_timestamp>%{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{TIME})? ?%{GREEDYDATA:msg}' }
      ...
  31.     # mysql log messages
          match => { "msg" => '%{TIMESTAMP_ISO8601:log_timestamp} %{NUMBER} \[%{LOGLEVEL:level}\] ?%{GREEDYDATA:msg}' }
          # mysql manage.py log messages
          match => { "msg" => '(?<log_timestamp>%{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{TIME})? %{LOGLEVEL:level} manage %{GREEDYDATA:msg}' }
          # ContainerPilot log format
          match => { "msg" => '(?<log_timestamp>%{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{TIME})? ?%{GREEDYDATA:msg}' }
          # catchall
          match => { "msg" => "%{GREEDYDATA:msg}" }
          overwrite => [ "log_timestamp" ]
          overwrite => [ "msg" ]
        } # end grok
  32.   # fortunes application
        json {
          source => "msg"
          add_field => [ "log_timestamp", "%{time}" ] # overwrites w/ timestamp from app
          remove_field => [ "time" ]
        }
        # anything other than our fortunes application will not be JSON
        # so we'll ignore this error
        if "_jsonparsefailure" in [tags] {
          mutate { remove_tag => "_jsonparsefailure" }
        }
      }
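The last step of that pipeline (try to parse `msg` as JSON, promote its fields, and quietly pass everything else through) can be approximated in a few lines. This is a rough Python analogue for illustration, not how Logstash implements its `json` filter; `apply_json_filter` is a hypothetical name and the event is modelled as a plain dict.

```python
import json


def apply_json_filter(event):
    """Promote JSON fields found in event["msg"]; pass non-JSON events through."""
    try:
        fields = json.loads(event["msg"])
    except (KeyError, TypeError, ValueError):
        # Not JSON (or no msg at all): equivalent to dropping the failure tag
        return event
    if not isinstance(fields, dict):
        # A bare number or string parses as JSON but carries no fields
        return event
    merged = {**event, **fields}
    if "time" in merged:
        # The application's own timestamp replaces the outer one
        merged["log_timestamp"] = merged.pop("time")
    return merged
```

An Nginx access log line comes back unchanged, while a JSON line from the application gains one key per JSON field.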
  33. Metrics
      ‣ Ephemeral containers == higher dimensionality
      ‣ Use service discovery to find collector / targets
      ‣ Pulling metrics is safer / more scalable in multi-tenant environments (tenants can’t DDoS the collector)
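A pull-based metrics endpoint can be sketched with the Python standard library alone. This is a minimal illustration that assumes a hand-rolled counter store instead of a Prometheus client library; `inc`, `render`, `serve`, the metric name, and the default port are all assumptions for the example. The output follows the Prometheus text exposition format. Labelling each series with a container ID shows where the higher dimensionality comes from: every new container creates a new series.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

COUNTERS = {}  # (metric name, sorted label pairs) -> count


def inc(name, **labels):
    """Increment a counter series identified by its name and labels."""
    key = (name, tuple(sorted(labels.items())))
    COUNTERS[key] = COUNTERS.get(key, 0) + 1


def render():
    """Render all counters in the Prometheus text exposition format."""
    lines = []
    seen = set()
    for (name, labels), value in sorted(COUNTERS.items()):
        if name not in seen:
            lines.append(f"# TYPE {name} counter")
            seen.add(name)
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The collector pulls; the application only answers scrapes
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9090):
    """Block forever, serving /metrics on the given (assumed) port."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Because the application never opens a connection to the collector, a misbehaving tenant cannot flood it; the collector sets the scrape rate.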
  34. [Diagram] Prometheus; Consul; application container (ContainerPilot, Application, Sensor). ContainerPilot registers with Consul: I’m “application” at 192.168.1.100:3000; I’m “telemetry” at 192.168.1.100:9090
  35. [Diagram] Prometheus; Consul; application container (ContainerPilot, Application). Prometheus asks Consul: Where is telemetry? Answer: 192.168.1.100:9090
  36. [Diagram] Prometheus; Consul; application container (ContainerPilot, Application). Prometheus scrapes /metrics from the application container
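The register, discover, and scrape flow in these diagrams can be sketched against Consul's HTTP API. This is an illustration only, not ContainerPilot's implementation: `registration_payload`, `register`, and `scrape_targets` are hypothetical helpers, the TTL check and the ID scheme are assumptions, and the agent address assumes a local Consul agent on its default HTTP port. The registration uses the agent's `/v1/agent/service/register` endpoint, and `scrape_targets` expects entries shaped like the response of `/v1/health/service/<name>`.

```python
import json
import urllib.request

CONSUL = "http://localhost:8500"  # assumption: local agent, default HTTP port


def registration_payload(name, address, port, ttl="10s"):
    """Build the body announcing 'I am <name> at <address>:<port>'."""
    return {
        "ID": f"{name}-{address}-{port}",  # hypothetical ID scheme
        "Name": name,
        "Address": address,
        "Port": port,
        "Check": {"TTL": ttl},
    }


def register(payload, consul=CONSUL):
    """Send the registration to the Consul agent (needs a running agent)."""
    req = urllib.request.Request(
        f"{consul}/v1/agent/service/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def scrape_targets(health_entries):
    """Turn health-endpoint entries into host:port scrape targets."""
    targets = []
    for entry in health_entries:
        svc = entry["Service"]
        # Fall back to the node address when the service has none of its own
        address = svc.get("Address") or entry["Node"]["Address"]
        targets.append(f"{address}:{svc['Port']}")
    return targets
```

With the values from the slides, `registration_payload("telemetry", "192.168.1.100", 9090)` is the announcement, and the collector's lookup resolves to `192.168.1.100:9090`.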
  37. Container Design
      ‣ Large containers == slower deploys
      ‣ Mount common tooling read-only from the host
  38. Distributed Request Tracing
      ‣ Zipkin / OpenTracing look promising; little support outside app code (load balancers, DB)
      ‣ Nginx can inject a request ID field
      ‣ Carry the request ID in your logging
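Carrying the request ID through application logs can be sketched as follows, assuming a Python service behind Nginx. The `X-Request-Id` header name is an assumption (Nginx would need to be configured to forward its `$request_id` variable under that name), and `RequestIdFilter`, `build_logger`, and `handle_request` are hypothetical names. A context variable holds the ID for the duration of one request, and a logging filter stamps it onto every record.

```python
import contextvars
import io
import logging

# Holds the current request's ID; "-" when no request is in flight
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current request ID to every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True


def build_logger(stream):
    """Logger whose lines start with the request ID."""
    logger = logging.getLogger("fortunes.requests")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(request_id)s %(levelname)s %(message)s"))
    handler.addFilter(RequestIdFilter())
    logger.addHandler(handler)
    return logger


def handle_request(headers, logger):
    """Pretend request handler: bind the ID, log, then unbind."""
    token = request_id_var.set(headers.get("X-Request-Id", "-"))
    try:
        logger.info("serving fortune")
    finally:
        request_id_var.reset(token)
```

Searching the log store for one ID then returns the Nginx access log entry and every application line written while serving that request.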
  39. OBSERVABILITY IN A CONTAINERIZED WORLD. Tim Gross, @0x74696d
