Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Tuning Solr & Pipeline for Logs


Published on

An updated talk about how to use Solr for logs and other time-series data, like metrics and social media. In 2016, Solr, its ecosystem, and the operating systems it runs on have evolved quite a lot, so we can now show new techniques to scale and new knobs to tune.

We'll start by looking at how to scale SolrCloud through a hybrid approach using a combination of time- and size-based indices, and also how to divide the cluster in tiers in order to handle the potentially spiky load in real-time. Then, we'll look at tuning individual nodes. We'll cover everything from commits, buffers, merge policies and doc values to OS settings like disk scheduler, SSD caching, and huge pages.

Finally, we'll take a look at the pipeline of getting the logs to Solr and how to make it fast and reliable: where should buffers live, which protocols to use, where should the heavy processing be done (like parsing unstructured data), and which tools from the ecosystem can help.

Published in: Technology
  • Dating direct: ❶❶❶ ❶❶❶
    Are you sure you want to  Yes  No
    Your message goes here

Tuning Solr & Pipeline for Logs

  1. 1. OCTOBER 11-14, 2016 • BOSTON, MA
  2. 2. Tuning Solr and its Pipeline for Logs Rafał Kuć and Radu Gheorghe Software Engineers, Sematext Group, Inc.
  3. 3. 3 01 Agenda Designing a Solr(Cloud) cluster for time-series data Solr and operating system knobs to tune Pipeline patterns and shipping options
  4. 4. 4 01 Time-based collections, the single best improvement 14.10 indexing
  5. 5. 5 01 Time-based collections, the single best improvement 14.10 indexing 15.10
  6. 6. 6 01 Time-based collections, the single best improvement 14.10 indexing 15.10
  7. 7. 7 01 Time-based collections, the single best improvement 14.10 15.10 ... 21.10 indexing
  8. 8. 8 01 Time-based collections, the single best improvement 14.10 15.10 ... 21.10 indexing Less merging ⇒ faster indexing Quick deletes (of whole collections) Search only some collections Better use of caches
  9. 9. 9 01 Load is uneven Black Friday Saturday Sunday
  10. 10. 1 0 01 Load is uneven Need to “rotate” collections fast enough to work with this (otherwise indexing and queries will be slow) These will be tiny Black Friday Saturday Sunday
  11. 11. 1 1 01 If load is uneven, daily/monthly/etc indices are suboptimal: you either have poor performance or too many collections Octi* is worried: * this is Octi →
  12. 12. 1 2 01 Solution: rotate by size indexing logs01 Size limit
  13. 13. 1 3 01 Solution: rotate by size indexing logs01 Size limit
  14. 14. 1 4 01 Solution: rotate by size indexing logs01 logs02 Size limit
  15. 15. 1 5 01 Solution: rotate by size indexing logs01 logs02 Size limit
  16. 16. 1 6 01 Solution: rotate by size indexing logs01 logs08... logs02
  17. 17. 1 7 01 Solution: rotate by size indexingPredictable indexing and search performance Fewer shards logs01 logs08... logs02
  18. 18. 1 8 01 Dealing with size-based collections logs01 logs08... logs02 app (caches results) stats 2016-10-11 2016-10-13 2016-10-12 2016-10-14 2016-10-18 doesn’t matter, it’s the latest collection
  19. 19. 1 9 01 Size-based collections handle spiky load better Octi concludes:
  20. 20. 2 0 01 Tiered cluster (a.k.a. hot-cold) 14 Oct 11 Oct 10 Oct 12 Oct 13 Oct hot01 cold01 cold02 indexing, most searches longer-running (+cached) searches
  21. 21. 2 1 01 Tiered cluster (a.k.a. hot-cold) 14 Oct 11 Oct 10 Oct 12 Oct 13 Oct hot01 cold01 cold02 indexing, most searches longer-running (+cached) searches ⇒ Good CPU and IO* ⇒ Heap. Decent IO for replication&backup * Ideally local SSD; avoid network storage unless it’s really good
  22. 22. 2 2 01 Octi likes tiered clusters Costs: you can use different hardware for different workloads Performance (see costs): fewer shards, less overhead Isolation: long-running searches don’t slow down indexing
  23. 23. 2 3 01 AWS specifics Hot tier: c3 (compute optimized) + EBS and use local SSD as cache* c4 (EBS only) Cold tier: d2 (big local HDDs + lots of RAM) m4 (general purpose) + EBS i2 (big local SSDs + lots of RAM) General stuff: EBS optimized Enhanced Networking VPC (to get access to c4&m4 instances) * Use --cachemode writeback for async writing: ml/Logical_Volume_Manager_Administration/lvm_cache_volume_creation.html
  24. 24. PIOPS is best but expensive HDD - too slow (unless cold=icy) ⇒ General purpose SSDs 2 4 01 EBS Stay under 3TB. More smaller (<1TB) drives in RAID0 give better, but shorter IOPS bursts Performance isn’t guaranteed ⇒ RAID0 will wait for the slowest disk Check limits (e.g. 160MB/s per drive, instance-dependent IOPS and network) 3 IOPS/GB up to 10K (~3TB), up to 256kb/IO, merges up to 4 consecutive IOs
  25. 25. 2 5 01 Octi’s AWS top picks c4s for the hot tier: cheaper than c3s with similar performance m4s for the cold tier: well balanced, scale up to 10xl, flexible storage via EBS EBS drives < 3TB, otherwise avoids RAID0: higher chances of performance drop
  26. 26. 2 6 01 Scratching the surface of OS options Say no to swap Disk scheduler: CFQ for HDD, deadline for SSD Mount options: noatime, nodiratime, data=writeback, nobarrier For bare metal: check CPU governor and THP* because strict ordering is for the weak * often it’s enabled, but /proc/sys/vm/nr_hugepages is 0
  27. 27. 2 7 01 Schema and solrconfig Auto soft commit (5s?) Auto commit (few minutes?) RAM buffer size + Max buffered docs Doc Values for faceting + retrieving those fields (stored=false) Omit norms, frequencies and positions Don’t store catch-all field(s)
  28. 28. 2 8 01 Relaxing the merge policy* Merges are the heaviest part of indexing Facets are the heaviest part of searching Facets (except method=enum) depend on data size more than # of segments Knobs: segmentsPerTier: more segments ⇒ less merging maxMergeAtOnce < segmentsPerTier, to smooth out those IO spikes maxMergedSegmentMB: lower to merge more small segments, less bigger ones * unless you only do “grep”. YMMV, of course. Keep an eye on open files, though ⇒ fewer open files
  29. 29. 2 9 01 Some numbers: more segments, more throughput (10-15% here) 10 segmentsPerTier 10 maxMergeAtOnce 50 segmentsPerTier 50 maxMergeAtOnce need to rotate before this drop
  30. 30. 3 0 01 Lower max segment (500MB from 5GB default) less CPU fewer segments
  31. 31. 3 1 01 There’s more... SPM screenshots from all tests + JMeter test plan here: We’d love to hear about your own results! correct spelling:
  32. 32. 3 2 01 Increasing segments per tier while decreasing max merged segment (by an order of magnitude) makes indexing better and search latency less spiky Octi’s conclusions so far
  33. 33. 3 3 01 Optimize I/O and CPU by not optimizing Unless you have spare CPU & IO (why would you?) And unless you run out of open files Only do this on “old” indices!
  34. 34. 3 4 01 Optimizing the pipeline* logs log shipper(s) Ship using which protocol(s)? Buffer Route to other destinations? Parse * for availability and performance/costs Or log to Solr directly from app (i.e. implement a new, embedded log shipper)
  35. 35. 3 5 01 A case for buffers performance & availability allows batches and threads when Solr is down or can’t keep up
  36. 36. 3 6 01 Types of buffers Disk*, memory or a combination On the logging host or centralized * doesn’t mean it fsync()s for every message file or local log shipper Easy scaling; fewer moving parts Often requires a lightweight shipper Kafka/Redis/etc or central log shipper Extra features (e.g. TTL) One place for changes bufferapp bufferapp
  37. 37. 3 7 01 Multiple destinations * or flat files, or S3 or... input buffer processing Outputs need to be in sync input Processing may cause backpressure processing *
  38. 38. 3 8 01 Multiple destinations input Solr offset HDFS offset input processing processing
  39. 39. 3 9 01 Just Solr and maybe flat files? Go simple with a local shipper Custom, fast-changing processing & multiple destinations? Kafka as a central buffer Octi’s pipeline preferences
  40. 40. 4 0 01 Parsing unstructured data Ideally, log in JSON* Otherwise, parse * or another serialization format For performance and maintenance (i.e. no need to update parsing rules) Regex-based (e.g. grok) Easy to build rules Rules are flexible Typically slow & O(n) on # of rules, but: Move matching patterns to the top of the list Move broad patterns to the bottom Skip patterns including others that didn’t match Grammar-based (e.g. liblognorm, PatternDB) Faster. O(1) on # of rules Numbers in our 2015 session:
  41. 41. 4 1 01 Decide what gives when buffers fill up input shipper Can drop data here, but better buffer when shipper buffer is full, app can block or drop data Check: Local files: what happens when files are rotated/archived/deleted? UDP: network buffers should handle spiky load* TCP: what happens when connection breaks/times out? UNIX sockets: what happens when socket blocks writes**? * you’d normally increase net.core.rmem_max and rmem_default ** both DGRAM and STREAM local sockets are reliable (vs Internet ones, UDP and TCP)
  42. 42. 4 2 01 Octi’s flow chart of where to log critical? UDP. Increase network buffers on destination, so it can handle spiky traffic Paying with RAM or IO? UNIX socket. Local shipper with memory buffers, that can drop data if needed Local files. Make sure rotation is in place or you’ll run out of disk! no IO RAM yes
  43. 43. 4 3 01 Protocols UDP: cool for the app, but not reliable TCP: more reliable, but not completely Application-level ACKs may be needed: No failure/backpressure handling needed App gets ACK when OS buffer gets it ⇒ no retransmit if buffer is lost* * more at sender receiver ACKs Protocol Example shippers HTTP Logstash, rsyslog, Fluentd RELP rsyslog, Logstash Beats Filebeat, Logstash Kafka Fluentd, Filebeat, rsyslog, Logstash
  44. 44. 4 4 01 Octi’s top pipeline+shipper combos rsyslog app UNIX socket HTTP memory+disk buffer can drop app fileKafka consumer Filebeat HTTP simple & reliable
  45. 45. 4 5 01 Conclusions, questions, we’re hiring, thank you The whole talk was pretty much only conclusions :) Feels like there’s much more to discover. Please test & share your own nuggets Scary word, ha? Poke us: @kucrafal @radu0gheorghe @sematext ...or @ our booth here