
Elasticsearch for Logs & Metrics - a deep dive




  1. Elasticsearch for logs and metrics (a deep dive). Rafał Kuć and Radu Gheorghe, Sematext Group, Inc.
  2. About us: Logsene, SPM, ES API, metrics, ... (products and services).
  3. Agenda: index layout; cluster layout; per-index tuning of settings and mappings; hardware + OS options; pipeline patterns.
  4. Daily indices are a good start: indexing is faster in smaller indices; deletes are cheap; searches hit only the needed indices; “static” indices can be cached.
  5. The Black Friday problem*. (* For logs; metrics usually don’t suffer from this.)
  6. Typical indexing performance graph for one shard*: past a certain shard size (typically 5-10GB, YMMV) it’s better to index into a new shard. (* Throttled so search performance remains decent.)
  8. INDEX Y U NO AS FAST? Mostly because of more merges and more expensive (+uncached) searches.
  9. Rotate by size*. (* Use Field Stats for queries, or rely on the query cache.)
  10. Aliases; Rollover Index API*. (* 5.0 feature.)
  11. Slicing data by time: for spiky ingestion, use size-based indices. Make sure you rotate before the performance drop (test on one node to find that limit).
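The alias + rollover idea from the slides above can be sketched as a small request builder. This is a minimal illustration, not the speakers' code: the alias name `logs_write` and the thresholds are made up, and it assumes only the `max_docs` and `max_age` conditions (the ones the 5.0 Rollover API is understood to accept), so `max_docs` stands in as a rough proxy for shard size.

```python
import json


def rollover_request(alias, max_docs, max_age):
    """Return (path, body) for a POST to the Rollover Index API.

    Assumption: only max_docs and max_age are used as conditions;
    max_docs acts as a rough stand-in for "rotate by size".
    """
    path = "/{}/_rollover".format(alias)
    body = {"conditions": {"max_docs": max_docs, "max_age": max_age}}
    return path, json.dumps(body, sort_keys=True)


# Hypothetical alias and limits; find the real limit by testing on one node.
path, body = rollover_request("logs_write", 50000000, "1d")
print("POST", path)
print(body)
```

Calling this on a schedule (for example from cron) would create a new index behind the alias once either condition is met, which is what keeps spiky ingestion from overfilling a single index.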
  12. Multi-tier architecture (aka hot/cold). Node types: client, ingest, data, master (several of each). We can optimize the data nodes layer.
  13. Multi-tier architecture (aka hot/cold). Diagram: index logs_2016.11.07 (indexing); nodes es_hot_1, es_cold_1, es_cold_2.
  14. Multi-tier architecture (aka hot/cold). Diagram: logs_2016.11.08 (indexing), logs_2016.11.07 (move); nodes es_hot_1, es_cold_1, es_cold_2. Command: curl -XPUT localhost:9200/logs_2016.11.07/_settings -d '{ "index.routing.allocation.exclude.tag" : "hot", "index.routing.allocation.include.tag": "cold" }'
  15. Multi-tier architecture (aka hot/cold). Diagram: logs_2016.11.08 (indexing), logs_2016.11.07; nodes es_hot_1, es_cold_1, es_cold_2.
  16. Multi-tier architecture (aka hot/cold). Indices logs_2016.11.07 through logs_2016.11.11. Hot tier (es_hot_1): indexing and most searches; needs good CPU and the best possible IO (SSD, or RAID0 for spinning disks). Cold tier (es_cold_1, es_cold_2): long-running searches; needs heap, plus IO for backup/replication and stats.
  17. Hot-cold architecture summary. Cost optimization: different hardware for different tiers. Performance: the above, plus fewer shards and less overhead. Isolation: long-running searches don't affect indexing.
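The hot-to-cold move shown on the slides can be scripted. The sketch below only builds the request (path and JSON body) for the index that has aged out of the hot tier; the `logs_` prefix and the `tag` attribute values come from the slides, while the function name and the one-day hot window are assumptions for illustration.

```python
import json
from datetime import date, timedelta


def cold_move_request(today, hot_days=1, prefix="logs_"):
    """Return (path, body) for a PUT that re-tags an aged-out daily index.

    Assumption: indices are named <prefix>YYYY.MM.DD and nodes carry a
    custom "tag" attribute set to "hot" or "cold", as on the slides.
    """
    day = today - timedelta(days=hot_days)
    index = prefix + day.strftime("%Y.%m.%d")
    body = {
        "index.routing.allocation.exclude.tag": "hot",
        "index.routing.allocation.include.tag": "cold",
    }
    return "/{}/_settings".format(index), json.dumps(body, sort_keys=True)


move_path, move_body = cold_move_request(date(2016, 11, 8))
print("PUT", move_path)
print(move_body)
```

Sending this once a day, after the new daily index has taken over indexing, reproduces the curl call from the slide without hard-coding the date.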
  18. Elasticsearch high availability & fault tolerance. Dedicated masters are a must; set discovery.zen.minimum_master_nodes = N/2 + 1. Keep your indices balanced: an unbalanced cluster can lead to instability. Balanced primaries are also good: they help with backups, moving to the cold tier, etc. total_shards_per_node is your friend.
  19. Elasticsearch high availability & fault tolerance. When in AWS, spread nodes between availability zones: bin/elasticsearch cluster.routing.allocation.awareness.attributes: zone. We need headroom for spikes: leave at least 20-30% for indexing & search spikes. Large machines with many shards? Look out for GC (many clusters died because of that) and consider running more, smaller ES instances.
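The N/2 + 1 rule for discovery.zen.minimum_master_nodes uses integer division, which is easy to get wrong by hand. A minimal sketch, where N is taken to mean the number of master-eligible nodes and the function name is made up for illustration:

```python
def minimum_master_nodes(master_eligible):
    """Quorum size for N master-eligible nodes: floor(N / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (1, 2, 3, 4, 5):
    print(n, "master-eligible nodes -> quorum of", minimum_master_nodes(n))
```

Note that two and three master-eligible nodes both need a quorum of 2, so only the three-node setup can lose a master and keep going, which is one reason three dedicated masters is a common choice.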
  20. Which settings to tune. Merges → most indexing time. Refreshes → check refresh_interval. Flushes → normally OK with ES defaults.
  21. Relaxing the merge policy. Fewer merges ⇒ faster indexing/lower CPU while indexing. Searches get slower, but there’s more spare CPU, and aggregations aren’t as affected (and they are typically the bottleneck, especially for metrics). More open files (keep an eye on them!). Increase index.merge.policy.segments_per_tier ⇒ more segments, fewer merges. Increase max_merge_at_once, too, but not as much ⇒ reduced spikes. Reduce max_merged_segment ⇒ no more huge merges, but more small ones.
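A relaxed merge policy could be expressed as an index settings body along these lines. The numbers are illustrative assumptions, not recommendations from the talk; the only rule encoded is the one the slide states, namely that max_merge_at_once is raised less than segments_per_tier.

```python
def relaxed_merge_settings(segments_per_tier=50, max_merge_at_once=20,
                           max_merged_segment="1gb"):
    """Build index settings for a relaxed merge policy.

    Assumption: the default values here are placeholders to be tuned by
    testing; they are not figures from the slides.
    """
    if max_merge_at_once > segments_per_tier:
        raise ValueError("max_merge_at_once should not exceed segments_per_tier")
    return {
        "index.merge.policy.segments_per_tier": segments_per_tier,
        "index.merge.policy.max_merge_at_once": max_merge_at_once,
        "index.merge.policy.max_merged_segment": max_merged_segment,
    }


merge_settings = relaxed_merge_settings()
print(merge_settings)
```

Since more segments per tier means more open files, it is worth watching file descriptor usage after applying settings like these.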
  22. And even more settings. Refresh interval (index.refresh_interval): 1s → baseline indexing throughput; 5s → +25% over baseline; 30s → +75% over baseline. Higher indices.memory.index_buffer_size ⇒ higher throughput. Lower indices.queries.cache.size for high-velocity data to free up heap. Omit norms (frequencies and positions, too?). Don't store fields if _source is used. Don't store the catch-all (i.e. _all) field: its data is copied from other fields.
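To make the refresh interval figures concrete, the sketch below applies the slide's gains (+25% at 5s, +75% at 30s) to a baseline throughput. The gains are the speakers' measurements and will differ per workload; the baseline of 10,000 docs/s and the function name are assumptions for illustration.

```python
# Gains over the 1s baseline, in percent, as quoted on the slide.
REFRESH_GAIN_PERCENT = {"1s": 0, "5s": 25, "30s": 75}


def estimated_throughput(baseline_docs_per_sec, refresh_interval):
    """Scale a measured 1s-refresh baseline by the slide's quoted gain."""
    gain = REFRESH_GAIN_PERCENT[refresh_interval]
    return baseline_docs_per_sec * (100 + gain) // 100


for interval in ("1s", "5s", "30s"):
    settings = {"index.refresh_interval": interval}
    print(settings, "->", estimated_throughput(10000, interval), "docs/s")
```

The trade-off is visibility latency: with a 30s interval, newly indexed documents can take up to 30 seconds to show up in searches.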
  23. Let’s dive deeper into storage. No searches on a field, just aggregations ⇒ index=false. No sorting/aggregating on a field ⇒ doc_values=false. Doc values can be used for retrieving (see docvalue_fields), so: ● Logs: use doc values for retrieving, and exclude those fields from _source*. ● Metrics: fields are normally short ⇒ disable _source, rely on doc values. Long retention for logs? For “old” indices: ● set index.codec=best_compression ● force merge to a few segments. (* Though you’ll lose highlighting, the update API, the reindex API...)
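The two storage rules above (index=false, doc_values=false) can be turned into a small helper that derives a field mapping from how the field is used. This is a sketch under assumptions: the field names are invented, and it targets field types that support doc values (keyword, numeric, date), not analyzed text.

```python
def field_mapping(field_type, searched, sorted_or_aggregated):
    """Build a field mapping that drops the structures the field never uses."""
    mapping = {"type": field_type}
    if not searched:
        # Only aggregated on: no inverted index needed.
        mapping["index"] = False
    if not sorted_or_aggregated:
        # Only searched on: no doc values needed.
        mapping["doc_values"] = False
    return mapping


# Hypothetical fields for a metrics index.
properties = {
    "cpu_percent": field_mapping("float", searched=False, sorted_or_aggregated=True),
    "request_id": field_mapping("keyword", searched=True, sorted_or_aggregated=False),
    "host": field_mapping("keyword", searched=True, sorted_or_aggregated=True),
}
print(properties)
```

A field that is both searched and aggregated keeps both structures, so the savings come only from fields with a single, known usage.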
  24. Metrics: working around sparse data. Ideally, you’d have one index per metric type (what you can fetch with one call). Combining them into one (sparse) index will impact performance (see LUCENE-7253). One doc per metric: you’ll pay with space. Nested documents: you’ll pay with heap (bitset used for joins) and query latency.
  25. What about the OS? Say no to swap. Disk scheduler: CFQ for HDD, deadline for SSD. Mount options: noatime, nodiratime, data=writeback, nobarrier (because strict ordering is for the weak).
  26. And hardware? Hot tier, typical bottlenecks: CPU and IO throughput (indexing is CPU-intensive; flushes and merges write, and read, lots of data). Cold tier: memory (heap) and IO latency (more data here ⇒ more indices & shards ⇒ more heap; searches hit more files; many stats calls are per shard ⇒ potentially choke IO when the cluster is idle). Generally: network storage needs to be really good (esp. for the cold tier); the network needs to be low latency (pings, cluster state replication); network throughput is needed for replication/backup.
  27. AWS specifics. c3 instances work, but there’s not enough local SSD ⇒ EBS gp2 SSD*. c4 + EBS give similar performance, but cheaper. i2s are good, but expensive. d2s are better value, but can’t deal with many shards (spinning disk latency). m4 + gp2 EBS are a good balance. Why gp2: PIOPS is expensive, spinning is slow. gp2 gives 3 IOPS/GB, but caps at 160MB/s or 10K IOPS (of up to 256KB) per drive. Performance isn’t guaranteed (for gp2) ⇒ one slow drive slows RAID0. Enhanced Networking (and EBS Optimized, if applicable) are a must. (* And used local SSD as cache, with --cachemode writeback for async writing: ume_Manager_Administration/lvm_cache_volume_creation.html. Block size?)
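The gp2 numbers on the AWS slide (3 IOPS/GB, capped at 10K IOPS per drive) make for a quick sizing calculation. The sketch uses only those two figures and deliberately ignores burst credits and any minimum baseline, so treat it as a rough estimate; the function names are made up.

```python
GP2_IOPS_PER_GB = 3
GP2_MAX_IOPS = 10000  # per-drive cap quoted on the slide


def gp2_baseline_iops(size_gb):
    """Baseline IOPS for one gp2 volume, using only the slide's figures."""
    return min(GP2_IOPS_PER_GB * size_gb, GP2_MAX_IOPS)


def smallest_size_at_cap():
    """Smallest volume size (in whole GB) that reaches the IOPS cap."""
    return -(-GP2_MAX_IOPS // GP2_IOPS_PER_GB)  # ceiling division


for size in (500, 1000, 5000):
    print(size, "GB ->", gp2_baseline_iops(size), "IOPS")
print("cap reached at", smallest_size_at_cap(), "GB")
```

Beyond the size where the cap is reached, a single larger volume adds capacity but no IOPS, which is where striping several volumes (with the RAID0 caveat from the slide) comes in.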
  28. The pipeline: read, buffer, deliver.
  29. The pipeline: read, buffer, deliver. Log shipper reason #1.
  30. The pipeline: read, buffer, deliver. Log shipper reason #1. Files? Sockets? Network? What if the buffer fills up? Processing before/after the buffer? How? Others besides Elasticsearch? How to buffer if $destination is down? Overview of 6 log shippers:
  31. Types of buffers. The log file (application.log) can act as a buffer. Memory and/or disk of the log shipper. Or a dedicated tool for buffering.
  32. Where to do processing. Diagram: Logstash (or Filebeat or…), buffer (Kafka/Redis), Logstash, Elasticsearch; marker: here.
  33. Where to do processing. Diagram: Logstash, buffer (Kafka/Redis), Logstash, Elasticsearch, something else; marker: here.
  34. Where to do processing. Diagram: Logstash, buffer (Kafka/Redis), Logstash, Elasticsearch, something else; marker: here. Outputs need to be in sync.
  35. Where to do processing. Diagram: Logstash, Kafka, Logstash, Elasticsearch, something else; a second Logstash + Elasticsearch path; offset, other offset; markers: here; here, too.
  36. Where to do processing (syslog-ng, fluentd…). Diagram: input, Elasticsearch, something else; markers: here, here.
  37. Where to do processing (rsyslogd…). Diagram: input; markers: here, here, here.
  38. Zoom into processing. Ideally, log in JSON, for performance and maintenance (i.e. no need to update parsing rules). Otherwise, parse. Regex-based (e.g. grok): easy to build rules; rules are flexible; slow & O(n) on # of rules. Tricks: move matching patterns to the top of the list; move broad patterns to the bottom; skip patterns including others that didn’t match. Grammar-based (e.g. liblognorm, PatternDB): faster, O(1) on # of rules. References: Logagent, Logstash, rsyslog, syslog-ng.
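The ordering tricks for regex-based parsing can be shown with plain Python regular expressions standing in for grok. The three patterns and the sample lines are invented for illustration; the point is that patterns are tried in list order, so the number of attempts grows with how far down the list the match sits.

```python
import re

# Ordered from most specific / most frequently matching to broadest.
PATTERNS = [
    ("access", re.compile(
        r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
        r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)$')),
    ("leveled", re.compile(r'^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<message>.*)$')),
    ("catch_all", re.compile(r'^(?P<message>.*)$')),  # broad pattern goes last
]


def parse(line):
    """Try each pattern in order; return (name, fields, attempts)."""
    for attempts, (name, regex) in enumerate(PATTERNS, 1):
        match = regex.match(line)
        if match:
            return name, match.groupdict(), attempts
    return None, {}, len(PATTERNS)


samples = [
    '10.0.0.1 - - [07/Nov/2016:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
    '2016-11-07T10:00:00Z ERROR disk full',
    'hello world',
]
for sample in samples:
    name, fields, attempts = parse(sample)
    print(name, attempts, fields)
```

If the catch-all pattern were first, every line would match it after one attempt and the specific patterns would never run, which is why broad patterns belong at the bottom.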
  39. Back to buffers: check what happens when they fill up. Local files: when are they rotated/archived/deleted? TCP: what happens when the connection breaks/times out? UNIX sockets: what happens when the socket blocks writes? UDP: network buffers should handle spiky load; check/increase net.core.rmem_max and net.core.rmem_default. Unlike UDP & TCP, both DGRAM and STREAM local sockets are reliable/blocking.
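On the receiving side, a UDP listener can ask for a larger socket buffer, but the kernel decides what it actually grants (on Linux the request is limited by net.core.rmem_max). A minimal sketch that requests a size and reads back the granted value; the 8 MB figure and the function name are arbitrary choices for illustration.

```python
import socket


def udp_socket_with_buffer(requested_bytes):
    """Create a UDP socket, request a receive buffer, return what was granted."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted


sock, granted = udp_socket_with_buffer(8 * 1024 * 1024)
print("requested", 8 * 1024 * 1024, "bytes, kernel granted", granted)
sock.close()
```

If the granted value is much smaller than the request, that is the sign to raise net.core.rmem_max on the destination host before relying on UDP for spiky traffic.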
  40. Let’s talk protocols now. UDP: cool for the app (no failure/backpressure handling needed), but not reliable. TCP: more reliable, but not completely (the app gets an ACK when the OS buffer gets the data ⇒ no retransmit if the buffer is lost). Application-level ACKs may be needed (diagram: sender, receiver, ACKs). Protocols and example shippers: HTTP (Logstash, rsyslog, syslog-ng, Fluentd, Logagent); RELP (rsyslog, Logstash); Beats (Filebeat, Logstash); Kafka (Fluentd, Filebeat, rsyslog, syslog-ng, Logstash).
  41. Wrapping up: where to log? Is the data critical? No: UDP; increase network buffers on the destination, so it can handle spiky traffic. Yes: are you paying with RAM or IO? RAM: UNIX socket, with a local shipper with memory buffers that can drop data if needed. IO: local files; make sure rotation is in place or you’ll run out of disk!
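A memory buffer "that can drop data if needed" can be as simple as a bounded queue that evicts the oldest event when full. The sketch below is one possible policy (drop oldest, count drops), not how any particular log shipper implements it; the class and method names are made up.

```python
from collections import deque


class DroppingBuffer:
    """Bounded in-memory buffer that drops the oldest event when full."""

    def __init__(self, capacity):
        self._events = deque(maxlen=capacity)
        self.dropped = 0

    def put(self, event):
        # deque(maxlen=...) silently evicts from the left; count it first.
        if len(self._events) == self._events.maxlen:
            self.dropped += 1
        self._events.append(event)

    def drain(self, batch_size):
        """Remove and return up to batch_size events, oldest first."""
        batch = []
        while self._events and len(batch) < batch_size:
            batch.append(self._events.popleft())
        return batch


buf = DroppingBuffer(capacity=3)
for i in range(5):
    buf.put("event-{}".format(i))
print("dropped:", buf.dropped)
print("batch:", buf.drain(10))
```

Exposing the drop counter matters in practice: it is the only way to tell afterwards that the destination was down long enough to lose data.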
  42. Flow patterns (1 of 5). application.log files, each read by a local Logstash, sending to Elasticsearch. Easy & flexible, but adds overhead.
  43. Flow patterns (2 of 5). application.log files, each read by a local Filebeat, sending to Elasticsearch (with Ingest). Light & simple, but harder to scale processing.
  44. Flow patterns (3 of 5). Files, sockets (syslog?), localhost TCP/UDP, read by Logagent, Fluentd, rsyslog or syslog-ng, sending to Elasticsearch. Light, scales, but no central control.
  45. Flow patterns (4 of 5). Filebeat, Logagent, Fluentd, rsyslog or syslog-ng, sending to Kafka; Logstash or a custom consumer reads from Kafka and writes to Elasticsearch and something else. Good for multiple destinations, but more complex.
  46. Flow patterns (5 of 5)
  47. Thank you! Rafał Kuć (@kucrafal), Radu Gheorghe (@radu0gheorghe), Sematext (@sematext). Join us! We are hiring!
  48. Pictures