Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

The Patterns of Distributed Logging and Containers

10,106 views

Published on

The presentation at CloudNativeCon Europe 2017, about how to design logging platform in container and microservices based computing platforms.

Published in: Software
  • Hello! Get Your Professional Job-Winning Resume Here - Check our website! https://vk.cc/818RFv
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

The Patterns of Distributed Logging and Containers

  1. 1. THE PATTERNS OF DISTRIBUTED LOGGING AND CONTAINERS CloudNativeCon Europe 2017 March 30, 2017 Satoshi Tagomori (@tagomoris) Treasure Data, Inc.
  2. 2. SATOSHI TAGOMORI (@tagomoris) Fluentd, MessagePack-Ruby, Norikra, ... Treasure Data, Inc.
  3. 3. 1. Microservices, Containers and Logging 2. Scaling Logging Platform 3. Patterns: Source/Destination -side Aggregation 4. Patterns: Scaling Up/Out Destination 5. Practices
  4. 4. MICROSERVICES, CONTAINERS AND LOGGING
  5. 5. Logging in Industries • Service Logs • Web access logs • Ad logs • Commercial transaction logs for analytics (EC, Game, ...) • System Logs • Syslog and other OS logs • Audit logs • Performance metrics Logs for Business Growth Logs for Service Stability
  6. 6. Microservices and Logging Users LAMP/Rails/MEAN/... Apps Logs Users Search Logs Recommendation Shopping cart Reviews Ads ... Monolithic service Microservices
  7. 7. Microservices and Containers • Microservices • Isolated dependencies • Agile deployment • Containers • Isolated environments & resources • Simple pull&restart deployment • Less overhead, high density
  8. 8. Logging Challenges with Microservices/Containers • Containerization changes everything: • No permanent storages • No fixed physical/network addresses • No fixed mapping between servers and roles
  9. 9. Logging Challenges with Microservices/Containers • Containerization changes everything: • No permanent storages • No fixed physical/network addresses • No fixed mapping between servers and roles Transfer Logs to Anywhere ASAP
  10. 10. Logging Challenges with Microservices/Containers • Containerization changes everything: • No permanent storages • No fixed physical/network addresses • No fixed mapping between servers and roles Push Logs From Containers
  11. 11. Logging Challenges with Microservices/Containers • Containerization changes everything: • No permanent storages • No fixed physical/network addresses • No fixed mapping between servers and roles Label Logs With Service Names/Tags
  12. 12. Logging Challenges with Microservices/Containers • Containerization changes everything: • No permanent storages • No fixed physical/network addresses • No fixed mapping between servers and roles Label Logs With Service Names/Tags Parse Logs & Label Values At Source Structured Logs
  13. 13. Structured Logs: tag, time, key-value pairs Original log: the customer put an item to cart: item_id=101, items=10, client=web Structured log: ec_service.shopping_cart 2017-03-30 16:35:37 +0100 { "container_id": "bfdd5b9....", "container_name": "/infallible_mayer", "source": "stdout", "event": "put an item to cart", "item_id": 101, "items": 10, "client": "web" } tag timestamp record
  14. 14. How to Ship Logs from Docker Containers nginx, mysql, .... log files agents read files, parse plain texts apps, middleware json log files agents read files, parse json lines applications agents just receive transferred logs apps, middleware agents just receive transferred logs Using mounted volume Using container json logs Sending logs to agents directly Using logging drivers + disk I/O penalty + mount points + disk I/O penalty + logger code + agent config 😃
  15. 15. SCALING LOGGING PLATFORM
  16. 16. Core Architecture: Distributed Logging Source (Container + Agent) Transferring/Aggregation layer Destination (Storage, Database, Service)
  17. 17. Distributed Logging Workflow • Retrieve raw logs: file system / network • Parse log content Collector Aggregator Destination • Get data from multiple sources • Split/merge incoming data into streams • Retrieve structured logs from Aggregator • Store formatted logs
  18. 18. Scaling Logging • Network Traffic • Split heavy log traffic into traffics to nodes • CPU Load • Distribute processing to nodes about parsing/formatting logs • High Availability • Switch traffic from a node to another for failures • Agility • Reconfigure whole logging layer to modify destinations
  19. 19. PATTERNS: SOURCE/DESTINATION -SIDE AGGREGATION
  20. 20. Source Side Aggregation Destination Side Aggregation NO YES YESNO
  21. 21. Now I'm Talking About: Source Transferring Aggregation Destination Source Side Destination Side
  22. 22. Source-side Aggregation Patterns Without Source-side Aggregation With Source-side Aggregation Collector Destination-side Aggregate Container
  23. 23. Aggregation Pattern without Source-side Aggregation • Pros: • Simple configuration • Cons: • Fixed aggregator (destination endpoint) address
 configured in containers • Many network connections • High load in aggregator / destination
  24. 24. Aggregation Pattern with Source-side Aggregation • Pros: • Less connections • Lower load in aggregator / destination • Less configurations in containers • More agility
 (aggregate containers can be reconfigured) • Cons: • Need more resources (+1 container per host) Aggregate Container
  25. 25. Destination-side Aggregation Patterns Without Destination-side Aggregation With Destination-side Aggregation Aggregator Node Source-side Destination
  26. 26. Aggregation Pattern without Destination-side Aggregation • Pros: • Less nodes • Simpler configuration • Cons: • Destination changes affects all source nodes • Worse performance:
 many small write requests on destination(storage)
  27. 27. • Pros: • Destination changes does NOT affect source nodes • Better performance:
 destination aggregator can merge write operations • Cons: • More nodes • More complex configuration Aggregation Pattern with Destination-side Aggregation Aggregator Node
  28. 28. PATTERNS: SCALING UP/OUT DESTINATION
  29. 29. HOW TO SCALE HERE Source Transferring Aggregation Destination Now I'm Talking About:
  30. 30. Scaling Destination Patterns Scaling Up Aggregator/Destination Endpoints Scaling Out Aggregator/Destination Endpoints Destination-side Aggregator or Destination Load balancer or Huge queue Backend nodes Collector nodes Using HTTP Load Balancer or Huge Queues Using Round Robin Clients
  31. 31. • Pros: • Simple configuration:
 specifying load balancer only
 in collector nodes • Cons: • Upper limits about scaling up
 on Load balancer (or queue) Scaling Up Destination Backend nodes Load balancer or Huge queue
  32. 32. Scaling Out Destination • Pros: • Unlimited scaling by adding nodes • Cons: • Complex configuration in collector nodes • Client feature required for round-robin • Unavailable for traffic over Internet
  33. 33. Destination-side Aggregation and Destination Scaling Destination Side Aggregation Scaling Up Destination Endpoints YESNO Scaling Out Destination Endpoints Early Stage Systems Collect Logs over Internet or Using Queues Collect Logs in Data Center All Collector Nodes Must Know All Destination Nodes ↓ Uncontrollable
  34. 34. PRACTICES
  35. 35. Practices: Docker + Fluentd • Docker Fluentd Logging Driver • Docker containers can send these logs to Fluentd directly,
 with less overhead • Fluentd's Pluggable Architecture • Various destination systems (storage/database/service) are available
 by changing configuration • Small Memory Footprint • Source aggregation requires +1 container per hosts:
 less additional resource usage is fine!
  36. 36. Practice 1: Source-side Aggregation + Scaling Up • Kubernetes: Fluentd + Elasticsearch • a.k.a EFK stack (inspired by ELK stack) • Elasticsearch - Fluentd - Kibana https://kubernetes.io/docs/tasks/debug-application-cluster/logging-elasticsearch-kibana/ apps, middleware json log files
  37. 37. Practice 2: Source-side Aggregation + Scaling Up • Containerized Applications • w/ Google Stackdriver for Monitoring • w/ Treasure Data for Analytics Google Stackdriver Logging apps, middleware
  38. 38. Practice 3: Source/Destination-side Aggregation + Scaling Out • Containerized Application • w/ Log processing on Hadoop • writing files on HDFS via WebHDFS • Hadoop HDFS prefers large files on HDFS: • Destination-side aggregation works well apps, middleware
  39. 39. Practice 4: Source/Destination-side Aggregation + Scaling Out • Containerized Application • w/ Log processing on Google BigQuery • putting logs via HTTPS • BigQuery has quota about write requests: • Destination-side aggregation works well apps, middleware
  40. 40. Best practices? • Source aggregation: do it • it makes app containers free from logging problems (buffering, HA, ...)
 • Destination aggregation: it depends • no need for cloud logging services/storages • may need for self-hosted distributed filesystems/databases • may need for cloud services which charges per requests
 • Destination scaling: it depends on destinations
  41. 41. Make Logging Scalable, Service Stable & Business Growing. Happy Logging! @tagomoris

×