SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance

"Benchmarking Solr Performance" - Tim Potter, Lucidworks

Published in: Technology

  • Yes, it does, but we need to start somewhere
  • more shards == better indexing throughput (if your servers can handle it)
  • replication is as fast as your slowest replica

    replicas have to re-analyze documents

    Future: would like to send pre-analyzed docs between leader and replica, especially if text analysis is complex
  • Not much capacity available for running queries on this node
  • First couple of passes, I either had really bad performance (too random) or really fast (not random enough)
  • Make it very easy to launch and manage SolrCloud clusters in Amazon, from 1 node to hundreds of nodes

    User has basic control over instance type, # of instances, # of nodes per instance, ZooKeeper ensemble

    Doesn’t have to care about how to start each Solr, how to connect it with ZooKeeper, host names / IPs, etc.



  • Custom AMI that we just need to “start” stuff on is preferred to configuring a barebones instance each time
  • SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance

    1. Benchmarking Solr Performance
        Timothy Potter, June 18, 2014
        Search | Discover | Analyze
        Confidential and Proprietary © Copyright 2013
    2. My SolrCloud Experience
        • At LucidWorks, mostly focused on hardening SolrCloud; Lucene/Solr committer
        • Operated 36 node cluster in AWS for Dachis Group (1.5 years ago, 18 shards ~900M docs)
        • Built a Fabric/boto framework for deploying and managing a cluster in EC2
        • Co-author of Solr In Action
    3. Agenda
        • Indexing performance tests
        • Solr Scale Toolkit
        • Next steps
    4. Cluster sizing
        • How many servers do I need to index X docs?
        • ... shards ... ? ... replicas ... ?
        • I need N queries per second over M docs, how many servers do I need?
        • It depends?!?
    5. Methodology
        • Transparent repeatable results
            – Ideally hoping for something owned by the community
        • Synthetic docs ~ 1K each on disk, mix of field types
            – Data set created using code borrowed from PigMix
            – English text fields generated using a Zipfian distribution
        • Java 1.7u55, Amazon Linux, r3.2xlarge nodes
            – enhanced networking enabled, placement group, same AZ
        • Stock Solr (cloud) 4.8.1
            – Using Shawn Heisey’s GC tuning parameters
        • Use Elastic MapReduce to generate load
            – As many nodes as I need to drive Solr!
    6. Indexing Results

        Cluster Size   # of Shards   # of Replicas   Reducers   Time (secs)   Docs / sec
        10             10            1               48         1762          73,780
        10             10            2               34         3727          34,881
        10             20            1               48         1282          101,404
        10             20            2               34         3207          40,536
        10             30            1               72         1070          121,495
        10             30            2               60         3159          41,152
        15             15            1               60         1106          117,541
        15             15            2               42         2465          52,738
        15             30            1               60         827           157,195
        15             30            2               42         2129          61,062
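The table above can be turned into two quick ratios: how much slower indexing gets with 2 replicas, and how throughput scales when going from 10 to 15 nodes. The sketch below only re-derives numbers from the slide; the function names are made up for illustration, and the comparison is not perfectly controlled because the reducer count differs between runs.

```python
# Rows copied from the "Indexing Results" slide:
# (cluster size, shards, replicas, reducers, time in secs, docs per sec)
RESULTS = [
    (10, 10, 1, 48, 1762, 73780),
    (10, 10, 2, 34, 3727, 34881),
    (10, 20, 1, 48, 1282, 101404),
    (10, 20, 2, 34, 3207, 40536),
    (10, 30, 1, 72, 1070, 121495),
    (10, 30, 2, 60, 3159, 41152),
    (15, 15, 1, 60, 1106, 117541),
    (15, 15, 2, 42, 2465, 52738),
    (15, 30, 1, 60, 827, 157195),
    (15, 30, 2, 42, 2129, 61062),
]


def replication_slowdown(results):
    """Map (cluster size, shards) to docs/sec with 1 replica divided by docs/sec with 2."""
    by_config = {}
    for nodes, shards, replicas, _reducers, _secs, docs_per_sec in results:
        by_config.setdefault((nodes, shards), {})[replicas] = docs_per_sec
    return {config: rates[1] / rates[2] for config, rates in by_config.items()}


def scaling_efficiency(results, small, large, replicas=1):
    """Throughput gain divided by node-count gain between two (nodes, shards) configs.

    A value near 1.0 means throughput grew in proportion to the number of nodes.
    """
    rate = {(n, s, r): d for n, s, r, _reducers, _secs, d in results}
    throughput_gain = rate[large + (replicas,)] / rate[small + (replicas,)]
    node_gain = large[0] / small[0]
    return throughput_gain / node_gain
```

On the slide's numbers, `replication_slowdown(RESULTS)` ranges from roughly 2.1x to roughly 3.0x, and `scaling_efficiency(RESULTS, (10, 10), (15, 15))` comes out at about 1.06, which is consistent with the "near-linear scalability" remark later in the deck.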
    7. Direct Updates (diagram)
        • Indexing Client 1 uses CloudSolrServer (SolrJ), which watches /clusterstate.json in ZooKeeper
        • Shard assignment is computed on the client, and docs are sent in batches
        • Each <doc> goes directly to the leader of its shard: Shard 1 (leader), Shard 2 (leader), Shard 3 (leader)
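The idea of computing the shard assignment on the client can be sketched as follows. This is a simplified illustration, not SolrJ's implementation: Solr's default router hashes the document id with MurmurHash3 and reads the hash ranges from the cluster state, whereas this sketch uses `zlib.crc32` as a stand-in hash and equal-width ranges that it computes itself. All function names are hypothetical.

```python
import zlib
from collections import defaultdict


def shard_ranges(num_shards, hash_bits=32):
    """Split the hash space into contiguous, equal-width ranges, one per shard."""
    space = 1 << hash_bits
    step = space // num_shards
    ranges = []
    for i in range(num_shards):
        low = i * step
        # The last shard absorbs any remainder so the whole space is covered.
        high = space - 1 if i == num_shards - 1 else (i + 1) * step - 1
        ranges.append((low, high))
    return ranges


def assign_shard(doc_id, ranges):
    """Return the index of the shard whose range contains the hash of doc_id."""
    # Stand-in hash for illustration only; Solr itself uses MurmurHash3.
    hashed = zlib.crc32(doc_id.encode("utf-8")) & 0xFFFFFFFF
    for index, (low, high) in enumerate(ranges):
        if low <= hashed <= high:
            return index
    raise ValueError("hash %d is not covered by any range" % hashed)


def batch_by_shard(docs, num_shards):
    """Group docs by target shard so each batch can go straight to that shard's leader."""
    ranges = shard_ranges(num_shards)
    batches = defaultdict(list)
    for doc in docs:
        batches[assign_shard(doc["id"], ranges)].append(doc)
    return dict(batches)
```

Grouping on the client means each leader receives only the documents it owns, so no leader has to forward documents to another shard.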
    8. Replication (diagram)
        • CloudSolrServer (SolrJ) watches /clusterstate.json in ZooKeeper and sends each <doc> to its shard leader: Shard 1 (leader), Shard 2 (leader), Shard 3 (leader)
        • Each leader forwards to its replica: Shard 1 (replica), Shard 2 (replica), Shard 3 (replica)
        • Blocks for response from replica(s)
    9. Don’t swamp your servers!
    10. Lessons Learned
        • Know what throughput your client side is capable of generating
            – If in MapReduce, index from reducers with speculative execution disabled
        • Don’t change Solr config without good reasons for doing so
        • Overshard (but not too much)
        • Near-linear scalability as I added nodes!
    11. Query Performance Tests
        • All nodes in SolrCloud perform indexing and execute queries
        • Using the TermsComponent to build queries based on the terms in each field.
        • Harder to accurately simulate user queries over synthetic data
            – Need mix of faceting, paging, sorting, grouping, boolean clauses, range queries, boosting, filters (some cached, some not), etc ...
        • Does the randomness in your test queries model (expected) user behavior?
        • Start with one server (1 shard) to determine baseline query performance.
            – Look for inefficiencies in your schema and other config settings
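Building test queries from TermsComponent output might look like the sketch below. It assumes the terms have already been fetched as JSON in the flat `[term, count, term, count, ...]` layout, and it weights each term by its count so that common terms are queried more often. That weighting is an assumption made for illustration, not the method used in the talk, and the field names in the usage note are hypothetical.

```python
import random


def parse_terms(flat):
    """Turn a flat [term, count, term, count, ...] list into (term, count) pairs."""
    return list(zip(flat[0::2], flat[1::2]))


def build_queries(terms_by_field, n, rng=None):
    """Sample n single-clause queries of the form field:term.

    Each term is weighted by its count, which is one (assumed) way to avoid
    queries that are either uniformly random or always the same few terms.
    Terms containing query-syntax characters would need escaping, which this
    sketch does not do.
    """
    rng = rng or random.Random()
    candidates, weights = [], []
    for field, flat in terms_by_field.items():
        for term, count in parse_terms(flat):
            candidates.append("%s:%s" % (field, term))
            weights.append(count)
    return rng.choices(candidates, weights=weights, k=n)
```

For example, `build_queries({"text_en": ["solr", 50, "lucene", 30]}, 100, random.Random(42))` returns 100 query strings drawn from `text_en:solr` and `text_en:lucene`. Passing a seeded `random.Random` keeps a test run repeatable.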
    12. Solr Scale Toolkit
        • Fabric / Python based toolset for deploying and managing SolrCloud clusters
        • SolrJ-based client application useful for building tools that need access to cluster state information in ZooKeeper
        • Code to support benchmarks for Solr
    13. Python-based Tools
        • boto – Python API for AWS (EC2, S3, etc)
        • Fabric – Python-based tool for automating system admin tasks over SSH
        • pysolr – Python library for Solr (sending commits, queries, ...)
        • kazoo – Python client tools for ZooKeeper
        Supporting Cast:
        • JMeter – run tests, generate reports
        • collectd – system monitoring
        • Logstash4Solr – log aggregation
        • JConsole/VisualVM – monitor JVM during indexing / queries
    14. Solr Scale Toolkit: Demo
        • Launch a meta node
            – Log agg / basic monitoring using SiLK
        • Launch ZooKeeper Ensemble
            – 3 nodes to establish quorum
            – Set up cron job to clean up snapshots
        • Launch SolrCloud cluster
        • Create new collection and index some docs
            – Attach JConsole while indexing
        • Run a healthcheck on the collection
        • Check out the Banana dashboard
        • Backup / Restore
            – Requires patch for SOLR-5956
            – Use fab patch_jars to update jars and do a rolling restart
    15. Provisioning machines
        fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge
        • Custom built AMI?
        • Block device mapping
            – dedicated disk per Solr node
        • Launch and then poll status until they are live
            – verify SSH connectivity
        • Tag each instance with a cluster ID and username
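The "launch and then poll status until they are live" step can be sketched as a small polling loop. This is not the toolkit's code: `wait_until_live` and its parameters are hypothetical, and the state lookup is passed in as a function so the sketch runs without AWS credentials. In practice `get_state` would wrap an EC2 API call that returns the instance state name, such as "pending" or "running".

```python
import time


def wait_until_live(instance_ids, get_state, timeout_secs=300, poll_secs=5,
                    sleep=time.sleep):
    """Poll until every instance reports the state "running".

    get_state is a caller-supplied function mapping an instance id to its
    current state name. Raises TimeoutError if any instance is still not
    running after timeout_secs of waiting.
    """
    pending = set(instance_ids)
    waited = 0
    while pending:
        # Keep only the instances that are not yet running.
        pending = {i for i in pending if get_state(i) != "running"}
        if not pending:
            break
        if waited >= timeout_secs:
            raise TimeoutError("still waiting on: %s" % sorted(pending))
        sleep(poll_secs)
        waited += poll_secs
    return True
```

A follow-up check for SSH connectivity, as the slide mentions, would slot in after this loop returns, since an instance can report "running" before its SSH daemon accepts connections.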
    16. ZooKeeper
        fab new_zk_ensemble:zk1,n=3
        • Two options:
            – provision 1 to N nodes when you launch Solr cluster
            – use existing named ensemble
        • Fabric command simply creates the myid files and zoo.cfg file for the ensemble
            – and some cron scripts for managing snapshots
        • Basic health checking of ZooKeeper status:
            – echo srvr | nc localhost 2181
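What "creates the myid files and zoo.cfg file" amounts to can be sketched in a few lines. The host names, data directory, and timing values below are illustrative assumptions, not the toolkit's defaults; the `server.N=host:2888:3888` lines and the `Mode:` line in the `srvr` response follow ZooKeeper's standard formats.

```python
def zoo_cfg(hosts, data_dir="/var/lib/zookeeper", client_port=2181):
    """Render a zoo.cfg for an ensemble; server ids start at 1 in host order."""
    lines = [
        "tickTime=2000",   # assumed values, tune for your environment
        "initLimit=10",
        "syncLimit=5",
        "dataDir=%s" % data_dir,
        "clientPort=%d" % client_port,
    ]
    for server_id, host in enumerate(hosts, start=1):
        # 2888 is the peer port and 3888 the leader-election port.
        lines.append("server.%d=%s:2888:3888" % (server_id, host))
    return "\n".join(lines) + "\n"


def myid_files(hosts):
    """Map each host to the contents of its dataDir/myid file."""
    return {host: "%d\n" % server_id for server_id, host in enumerate(hosts, start=1)}


def parse_srvr_mode(srvr_output):
    """Extract the Mode value (e.g. leader or follower) from `srvr` output."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None
```

The id written to each `myid` file has to match that host's `server.N` line, which is why both functions enumerate the hosts in the same order.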
    17. SolrCloud
        fab new_solrcloud:test1,zk=zk1,nodesPerHost=2
        • Upload a BASH script that starts/stops Solr
        • Set system props: jetty.port, host, zkHost, JVM opts
        • One or more Solr nodes per machine
        • JVM mem opts dependent on instance type and # of Solr nodes per instance
        • Optionally configure log4j.properties to append messages to RabbitMQ for Logstash4Solr integration
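How the system properties and memory options might be combined per node can be sketched as below. The sizing rule (half of the machine's memory, split evenly across the Solr nodes) and the consecutive port numbering are assumptions made for illustration, not the toolkit's actual policy. The `-Djetty.port`, `-Dhost`, and `-DzkHost` properties are the ones named on the slide, and `start.jar` is the Jetty launcher shipped with Solr 4.x.

```python
def solr_start_commands(host, zk_host, nodes_per_host, total_mem_gb,
                        base_port=8983, heap_fraction=0.5):
    """Build one java command line (as an argument list) per Solr node on a host."""
    # Assumed heuristic: give Solr a fraction of RAM and leave the rest to the
    # OS page cache, dividing the heap budget evenly between nodes.
    heap_gb = max(1, int(total_mem_gb * heap_fraction / nodes_per_host))
    commands = []
    for node in range(nodes_per_host):
        commands.append([
            "java",
            "-Xms%dg" % heap_gb,
            "-Xmx%dg" % heap_gb,
            "-Djetty.port=%d" % (base_port + node),  # one port per node (assumed scheme)
            "-Dhost=%s" % host,
            "-DzkHost=%s" % zk_host,
            "-jar", "start.jar",
        ])
    return commands
```

For instance, `solr_start_commands("10.0.0.5", "zk1a:2181,zk1b:2181,zk1c:2181", 2, 16)` yields two commands with a 4 GB heap each, listening on ports 8983 and 8984.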
    18. solr-ctl.sh
        • BASH script that implements:
            – start/stop Solr nodes on each EC2 instance
            – sets JVM memory options, system properties (jetty.port), enable remote JMX, etc
            – backup log files before restarting nodes
            – ensure JVM is killed correctly before restarting
        • Environment variables in: solr-ctl-env.sh
    19. Miscellaneous Utility Tasks
        • Deploy a configuration directory to ZooKeeper
        • Create a new collection
        • Attach a local JConsole/VisualVM to a remote JVM
        • Rolling restart (with Overseer awareness)
        • Build Solr locally and patch remote
            – Use a relay server to scp the JARs to Amazon network once and then scp them to other nodes from within the network
        • Put/get files
        • Grep over all log files (across the cluster)
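The slide does not say what "Overseer awareness" involves. One plausible reading, sketched below purely as an assumption, is to restart the node hosting the Overseer last and to wait for each node to come back before touching the next one. The function and its callbacks (`restart`, `is_healthy`, `wait`) are hypothetical placeholders for whatever SSH and status commands a real tool would use.

```python
def rolling_restart(nodes, overseer, restart, is_healthy, max_checks=60,
                    wait=lambda: None):
    """Restart nodes one at a time, leaving the Overseer node until the end.

    restart(node) triggers the restart, is_healthy(node) reports whether the
    node is serving again, and wait() pauses between health checks.
    Returns the order in which nodes were restarted.
    """
    order = [n for n in nodes if n != overseer]
    if overseer in nodes:
        order.append(overseer)
    for node in order:
        restart(node)
        for _ in range(max_checks):
            if is_healthy(node):
                break
            wait()
        else:
            # Stop the rollout rather than take down a second node.
            raise RuntimeError("%s did not recover after restart" % node)
    return order
```

Stopping at the first node that fails to recover keeps at most one node out of service at a time, which is the point of restarting in a rolling fashion.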
    20. Other useful stuff ...
        • fab mine: See clusters I’m running (or for other users too)
        • fab kill_mine: Terminate all instances I’m running
            – Use termination protection in production
        • fab ssh_to: Quick way to SSH to one of the nodes in a cluster
        • fab stop/recover/kill: Basic commands for controlling specific Solr nodes in the cluster
        • fab jmeter: Execute a JMeter test plan against your cluster
            – Example test plan and Java sampler is included with the source
    21. SolrCloud Tools (SolrJ client app)
        ./tools.sh –tool healthcheck
        • Java-based command-line application that uses SolrJ’s CloudSolrServer to perform advanced cluster management operations:
            – healthcheck: collect metadata and health information from all replicas for a collection from ZooKeeper
            – backup: create a snapshot of each shard in a collection for backing up to remote storage (S3)
        • Framework for building complex tools that benefit from having access to cluster state information in ZooKeeper
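The kind of information a health check can read from cluster state is sketched below in Python (the toolkit's own tool is Java and SolrJ based, so this is an illustration, not its implementation). It assumes the `/clusterstate.json` content has already been loaded into a dict with the Solr 4.x layout of collection, then shards, then replicas, where each replica carries a `state` and the leader is flagged with the string `"true"`. The report format is made up for this example.

```python
def summarize_collection(cluster_state, collection):
    """Summarize leader and replica state for every shard of a collection."""
    report = {}
    shards = cluster_state[collection]["shards"]
    for shard_name, shard in shards.items():
        replicas = shard.get("replicas", {})
        leader = None
        not_active = []
        for replica_name, replica in replicas.items():
            # In the assumed layout the leader flag is the string "true".
            if replica.get("leader") == "true":
                leader = replica_name
            if replica.get("state") != "active":
                not_active.append(replica_name)
        report[shard_name] = {
            "leader": leader,
            "replicas": len(replicas),
            "not_active": sorted(not_active),
            # A shard counts as healthy if it has a leader and no lagging replicas.
            "healthy": leader is not None and not not_active,
        }
    return report
```

Cluster state only reflects what each node last published to ZooKeeper, so a fuller check would also query each replica directly, which matches the slide's mention of collecting information "from all replicas".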
    22. SiLK Integration
        • SiLK: Solr integrated with Logstash and Kibana
            – Index time-series data, such as log data (collectd, Solr logs, ...)
            – Build cool dashboards with Banana (fork of Kibana)
        • Easily aggregate all WARN and more severe log messages from all Solr servers into logstash4solr
        • Send collectd metrics to logstash4solr
    23. SiLK Integration
    24. What’s Next?
        • Migrate to using Apache libcloud instead of using boto directly
        • Benchmark mixed work-loads (queries and indexing)
        • SiLK is improving rapidly!
        • Chaos monkey tests – integrate jepsen?
        • Open source so please kick the tires!
    25. Wrap-up
        • Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk
        • LucidWorks: http://www.lucidworks.com
        • SiLK: http://www.lucidworks.com/lucidworks-silk/
        • Solr In Action: http://www.manning.com/grainger/
        • Connect: @thelabdude / tim.potter@lucidworks.com
        Questions?
