Administering and Monitoring SolrCloud Clusters
 


Presented by Rafał Kuć, Consultant and Software Engineer, Sematext Group, Inc.

Even though Solr can run for long periods without trouble, it is important to monitor your cluster and understand what is happening in it. In this session you will learn how to use various tools to monitor how Solr is behaving at a high level, as well as at the Lucene, JVM, and operating system levels. You'll see how to react to what you see and how to change the configuration, index structure, and shard layout using the Solr API. We will also discuss the performance metrics that deserve extra attention. Finally, you'll learn what to do when things go awry: we will share a few troubleshooting examples and dissect what was wrong and what had to be done to make things work again.

    Presentation Transcript

    • Administering and Monitoring SolrCloud. Rafał Kuć, Sematext Group, Inc. @kucrafal @sematext sematext.com
    • About me: Sematext consultant & engineer; Solr.pl co-founder; father and husband
    • SolrCloud Concepts [diagram: an application and four Solr servers hosting Shard1, Shard2, and their replicas]
    • Local SolrCloud Cluster: java -Dbootstrap_confdir=./solr/revolution/conf -Dcollection.configName=revolution -DzkRun -DnumShards=1 -jar start.jar (runs embedded ZooKeeper, bootstraps the collection with 1 shard, starts Solr)
    • Starting Solr Cluster: four Solr servers, each started with no collection and pointed at the same three-node ZooKeeper ensemble via -DzkHost (the slide lists the hosts in a different order for each server), e.g. -DzkHost=192.168.1.1:2181,192.168.1.2:2181,192.168.1.3:2181
    • Uploading Collection Configuration: ./zkcli.sh -cmd upconfig -zkhost 192.168.1.1:2181 -confdir ./conf/ -confname revolution [diagram: collection configuration stored in the three-node ZooKeeper ensemble used by Solr]
    • Collections API: Create; Delete; Reload; Split; Create Alias; Delete Alias; Shard Creation/Deletion. See http://wiki.apache.org/solr/SolrCloud
    • Collection Creation: curl 'http://solrhost:8983/solr/admin/collections?action=CREATE&name=revolution&numShards=3&replicationFactor=4' Parameters: name, numShards, replicationFactor, maxShardsPerNode, createNodeSet, collection.configName
    • Collection Split Example (create the collection): $ curl 'http://solr1:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=1'
    • Collection Split Example (split shard1): $ curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1'
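The CREATE and SPLITSHARD requests above can also be assembled programmatically, which avoids stray spaces creeping into a hand-typed query string. This is a minimal Python sketch that reuses the hosts and parameter values from the slides; the helper name `collections_api_url` is hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL; base_url is e.g. http://solrhost:8983/solr."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


# Values taken from the slides; the helper itself is only an illustration.
create_url = collections_api_url(
    "http://solr1:8983/solr", "CREATE",
    name="collection1", numShards=2, replicationFactor=1,
)
split_url = collections_api_url(
    "http://localhost:8983/solr", "SPLITSHARD",
    collection="collection1", shard="shard1",
)

print(create_url)
print(split_url)
```

The resulting strings can be passed to curl or to any HTTP client.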
    • Getting Deeper – CoreAdmin API: curl 'http://solrhost:8983/solr/admin/cores?action=CREATE&name=newcore&collection=revolution&shard=shard2' Parameters: collection, shard, numShards, collection.configName
    • Schema – the API. Reading (Solr 4.2): fields, dynamic fields, types, copy fields, name (4.3), version (4.3), unique key (4.3), similarity (4.3). Writing (Solr 4.4): adding new fields, adding copy fields
    • Reading Your Schema: curl -XGET 'http://solrhost:8983/solr/rev/schema/fields/name' Response: { "responseHeader" : { "status" : 0, "QTime" : 5 }, "field" : { "name" : "name", "type" : "text_general", "indexed" : true, "stored" : true } } Full reference: http://wiki.apache.org/solr/SchemaRESTAPI
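A script that consumes this response usually checks the status first and then reads the field definition. The sketch below parses the response body shown on the slide, which is embedded as a string so the example runs without a Solr instance; the function name `describe_field` is hypothetical.

```python
import json

# Response body copied from the slide above.
response_body = """
{
  "responseHeader": {"status": 0, "QTime": 5},
  "field": {"name": "name", "type": "text_general", "indexed": true, "stored": true}
}
"""


def describe_field(body):
    """Return (name, type, indexed, stored) from a schema field response."""
    doc = json.loads(body)
    if doc["responseHeader"]["status"] != 0:
        raise RuntimeError("Solr reported a non-zero status")
    field = doc["field"]
    return field["name"], field["type"], field["indexed"], field["stored"]


print(describe_field(response_body))
```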
    • Dynamic Schema Modifications. Enable the managed schema in solrconfig.xml: <schemaFactory class="ManagedIndexSchemaFactory"> <bool name="mutable">true</bool> <str name="managedSchemaResourceName">managed-schema</str> </schemaFactory> Add a field: curl -XPUT 'http://solrhost:8983/solr/rev/schema/fields/content' -d '{ "type" : "text", "stored" : "false", "copyFields" : ["catchAll"] }' Add a copy field: curl -XPOST 'http://solrhost:8983/solr/rev/schema/copyFields' -d '[ { "source" : "name", "dest" : [ "text", "personal" ] } ]'
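Hand-written JSON inside a shell-quoted string is easy to get wrong, so it can help to generate the request body from a data structure. The sketch below prepares, but does not send, the PUT request from the slide. The helper name `add_field_request` is hypothetical, and the `Content-Type` header is an assumption that is not part of the slide's curl command.

```python
import json
from urllib.request import Request


def add_field_request(base_url, collection, field_name, definition):
    """Prepare (but do not send) a PUT that adds a field to a managed schema."""
    url = f"{base_url}/{collection}/schema/fields/{field_name}"
    body = json.dumps(definition).encode("utf-8")
    # Assumption: the body is declared as JSON; the slide's curl call sets no header.
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


# Field definition copied from the slide.
req = add_field_request(
    "http://solrhost:8983/solr", "rev", "content",
    {"type": "text", "stored": "false", "copyFields": ["catchAll"]},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```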
    • The Right Directory: StandardDirectory, SimpleFSDirectory, NIOFSDirectory, MMapDirectory, NRTCachingDirectory, RAMDirectory [diagram: index segment files _0.fdt, _0.fdx, _0.fnm, _0.nvd, _1.fdt, _1.fdx, _1.fnm, _1.nvd] Configuration: <directoryFactory name="DirectoryFactory" class="solr.NRTCachingDirectoryFactory" />
    • Segment Merging [diagram: segments labeled a through g arranged on Level 0 and Level 1]
    • Segment Merge Under Control: merge policy; merge scheduler; merge factor; merge policy configuration. See https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig
    • Autocommit or Not? Automatic data flush (hard commit): <autoCommit> <maxTime>15000</maxTime> <maxDocs>1000</maxDocs> <openSearcher>false</openSearcher> </autoCommit> Automatic index view refresh (soft commit): <autoSoftCommit> <maxTime>1000</maxTime> </autoSoftCommit>
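To make the thresholds concrete: with the settings above, a hard commit happens once 1000 documents are pending or 15 seconds have passed since the first uncommitted document, whichever comes first, while the soft commit refreshes the index view every second. The sketch below reads the slide's settings and applies that "whichever comes first" rule. The `<updateHandler>` wrapper is added only so the two fragments parse as one document, and `hard_commit_due` is a hypothetical helper, not a Solr API.

```python
import xml.etree.ElementTree as ET

# Settings copied from the slide, wrapped in a root element so they parse together.
CONFIG = """
<updateHandler>
  <autoCommit>
    <maxTime>15000</maxTime>
    <maxDocs>1000</maxDocs>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
"""

root = ET.fromstring(CONFIG)
hard_max_time_ms = int(root.findtext("autoCommit/maxTime"))
hard_max_docs = int(root.findtext("autoCommit/maxDocs"))
soft_max_time_ms = int(root.findtext("autoSoftCommit/maxTime"))


def hard_commit_due(pending_docs, ms_since_first_pending_doc):
    """True once either hard-commit threshold has been reached."""
    return (pending_docs >= hard_max_docs
            or ms_since_first_pending_doc >= hard_max_time_ms)


print(hard_commit_due(200, 3000))    # neither threshold reached
print(hard_commit_due(1000, 3000))   # document threshold reached
print(hard_commit_due(200, 15000))   # time threshold reached
```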
    • Caches: refreshed with IndexSearcher; configurable; different purposes; different implementations [diagram: Solr Cache]
    • Monitoring Importance
    • What to Pay Attention to?
    • Cluster State Health: shards and replica status; shard placement; failing nodes
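A replica status check can be as simple as walking the cluster state and reporting every replica that is not active. The sketch below runs against a hypothetical, trimmed-down structure modeled on the cluster state that SolrCloud keeps in ZooKeeper; the exact layout, the node names, and the state values are assumptions made for illustration.

```python
# Hypothetical, trimmed-down cluster state; real data would come from ZooKeeper.
cluster_state = {
    "revolution": {
        "shards": {
            "shard1": {
                "replicas": {
                    "core_node1": {"state": "active", "node_name": "192.168.1.1:8983_solr"},
                    "core_node2": {"state": "down", "node_name": "192.168.1.2:8983_solr"},
                }
            },
            "shard2": {
                "replicas": {
                    "core_node3": {"state": "recovering", "node_name": "192.168.1.3:8983_solr"},
                    "core_node4": {"state": "active", "node_name": "192.168.1.4:8983_solr"},
                }
            },
        }
    }
}


def unhealthy_replicas(state):
    """List (collection, shard, replica, state, node) for replicas that are not active."""
    problems = []
    for collection, cdata in state.items():
        for shard, sdata in cdata["shards"].items():
            for replica, rdata in sdata["replicas"].items():
                if rdata["state"] != "active":
                    problems.append((collection, shard, replica,
                                     rdata["state"], rdata["node_name"]))
    return problems


for problem in unhealthy_replicas(cluster_state):
    print(problem)
```

Grouping the result by node name would also show whether the problems come from a single failing node.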
    • Indexing-related Metrics: index throughput; document distribution; I/O subsystem metrics; merging
    • Search-related Metrics: count; latency; distribution among nodes; anomalies and spikes
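Averages hide latency spikes, so percentiles and a simple outlier rule are often more informative for search latency. The sketch below computes the request count, nearest-rank percentiles, and spikes over a hypothetical list of query times. The sample values and the "three times the median" rule are assumptions for illustration, not a recommendation from the talk.

```python
import math
import statistics

# Hypothetical query times in milliseconds (for example, QTime values from responses).
qtimes_ms = [5, 7, 6, 8, 5, 250, 6, 7, 9, 6]


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def spikes(values, factor=3):
    """Samples more than `factor` times the median, a crude anomaly rule."""
    median = statistics.median(values)
    return [v for v in values if v > factor * median]


print("count:", len(qtimes_ms))
print("p50:", percentile(qtimes_ms, 50))
print("p95:", percentile(qtimes_ms, 95))
print("spikes:", spikes(qtimes_ms))
```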
    • Monitoring Memory and GC: heap details; pool size; pool utilization; garbage collection count; garbage collection time
    • Monitoring OS-related Metrics: CPU details; load; I/O activity; network usage
    • Solr Administration Panel
    • Solr & JMX: <jmx /> java -Dcom.sun.management.jmxremote -jar start.jar
    • Solr & JMX
    • SPM: index statistics; request count and latency; caches and warmup; CPU; JVM memory and OS memory; garbage collector; OS-related statistics
    • SPM Dashboard
    • Other Monitoring Tools: Ganglia (http://ganglia.sourceforge.net/); New Relic (http://www.newrelic.com/); Opsview (http://www.opsview.com)
    • Too much is too much
    • Too hot
    • Caches
    • We Are Hiring! Dig Search? Dig Analytics? Dig Big Data? Dig Performance? Dig working with and in open-source? We're hiring world-wide! http://sematext.com/about/jobs.html
    • Thank You! Rafał Kuć, @kucrafal, rafal.kuc@sematext.com. Sematext, @sematext, http://sematext.com, http://blog.sematext.com. SPM discount code: LR2013SPM20 @ Sematext booth ;)