Administering and Monitoring SolrCloud

Rafał  Kuć  – Sematext Group, Inc.
@kucrafal @sematext sematext.com
Ta  me…
Sematext consultant & engineer
Solr.pl co-founder
Father and husband 
SolrCloud Concepts
Shard1
Replica

Shard2
Replica

Solr Server

Solr Server

Shard2

Shard1

Solr Server

Solr Server

App...
Local SolrCloud Cluster
java -Dbootstrap_confdir=./solr/revolution/conf
-Dcollection.configName=revolution -DzkRun -DnumSh...
Starting Solr Cluster
No Collection

No Collection

-DzkHost=192.168.1.1:2181,
192.168.1.2:2181,192.168.1.3:2181

Solr Ser...
Uploading Collection Configuration
./zkcli.sh -cmd upconfig -zkhost 192.168.1.1:2181
-confdir ./conf/ -confname revolution...
Collections API
Create
Delete

Reload
Split

Create Alias
Delete Alias
Shard Creation/Deletion

http://wiki.apache.org/sol...
Collection Creation
curl 'http://solrhost:8983/solr/admin/collections?action=CREATE
&name=revolution&numShards=3&replicati...
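The CREATE call on this slide can also be assembled programmatically. The sketch below is a minimal illustration, not part of the talk: it assumes Python 3, uses the hypothetical helper names `create_collection_url` and `total_cores`, and only builds the URL from the parameters listed on the slide without contacting Solr. The core count assumes that replicationFactor counts every copy of a shard, the leader included.

```python
from urllib.parse import urlencode


def create_collection_url(host, name, num_shards, replication_factor, **extra):
    """Build a Collections API CREATE URL (the request is not sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    # Optional parameters from the slide, e.g. maxShardsPerNode or createNodeSet.
    params.update(extra)
    return f"http://{host}/solr/admin/collections?{urlencode(params)}"


def total_cores(num_shards, replication_factor):
    """Cores the cluster must host, assuming replicationFactor includes the leader."""
    return num_shards * replication_factor


if __name__ == "__main__":
    print(create_collection_url("solrhost:8983", "revolution", 3, 4))
    print(total_cores(3, 4))
```

With numShards=3 and replicationFactor=4 the cluster has to place 12 cores, which is worth checking against the number of nodes and maxShardsPerNode before issuing the call.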
Collection Split Example

$ curl
'http://solr1:8983/solr/admin/collections?action=CREATE&
name=collection1&numShards=2&rep...
Collection Split Example

$ curl 'http://localhost:8983/solr/admin/collections?
action=SPLITSHARD&collection=collection1&s...
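To make the effect of SPLITSHARD more concrete, the following sketch halves a shard's hash range. It is only an illustration under stated assumptions: that documents are routed by a 32-bit signed hash, that shard1 of a two-shard collection covers the range 80000000-ffffffff, and that a split divides the range into two equal parts. The function names are hypothetical and this is not Solr's own code.

```python
def split_range(lo, hi):
    """Split the inclusive range [lo, hi] into two contiguous halves."""
    mid = lo + (hi - lo) // 2
    return (lo, mid), (mid + 1, hi)


def to_hex(value):
    """Render a signed 32-bit integer as unsigned hexadecimal."""
    return format(value & 0xFFFFFFFF, "x")


def describe(hash_range):
    """Format a (lo, hi) pair as 'lo-hi' in hexadecimal."""
    lo, hi = hash_range
    return f"{to_hex(lo)}-{to_hex(hi)}"


if __name__ == "__main__":
    # Assumed range of shard1 in a two-shard collection: 80000000-ffffffff.
    first, second = split_range(-2**31, -1)
    print(describe(first))
    print(describe(second))
```

Each half would be served by its own sub-shard after the split, so every document that belonged to the parent shard lands in exactly one of the two new ranges.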
Getting Deeper – CoreAdmin API
curl 'http://solrhost:8983/solr/admin/cores?action=CREATE
&name=newcore&collection=revoluti...
Schema – the API
Reading (Solr 4.2)
Fields
Dynamic fields
Types
Copy fields
Name (4.3)
Version (4.3)
Unique Key (4.3)
Simi...
Reading Your Schema
curl -XGET 'http://solrhost:8983/solr/rev/schema/fields/name'
{
"responseHeader" : {
"status" : 0,
"QT...
Dynamic Schema Modifications
<schemaFactory class="ManagedIndexSchemaFactory">
<bool name="mutable">true</bool>
<str name=...
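The two schema-modifying calls on this slide can be prepared from code as well. The sketch below assumes Python 3 and the host and core name used on the slide; the helper names are hypothetical, and the Content-Type header is an assumption that the slide does not show. The requests are only constructed, not sent.

```python
import json
from urllib.request import Request

BASE = "http://solrhost:8983/solr/rev/schema"  # host and core name as on the slide


def add_field_request(name, definition):
    """Prepare (but do not send) a PUT that adds one field."""
    return Request(
        f"{BASE}/fields/{name}",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption, not on the slide
        method="PUT",
    )


def add_copy_fields_request(rules):
    """Prepare (but do not send) a POST that adds copy-field rules."""
    return Request(
        f"{BASE}/copyFields",
        data=json.dumps(rules).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption, not on the slide
        method="POST",
    )


field_req = add_field_request(
    "content", {"type": "text", "stored": "false", "copyFields": ["catchAll"]}
)
copy_req = add_copy_fields_request([{"source": "name", "dest": ["text", "personal"]}])
print(field_req.get_method(), field_req.full_url)
print(copy_req.get_method(), copy_req.full_url)
```

Passing either object to `urllib.request.urlopen` would send it; that only succeeds against a core whose schema factory is configured as mutable, as in the configuration on the slide.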
The Right Directory
StandardDirectory
SimpleFSDirectory
NIOFSDirectory
MMapDirectory

_0.fdt

_0.fdx _0.fnm _0.nvd

_1.fdt...
Segment Merging
Level 0

a

b

f

Level 1

c

c

d

e

g
Segment Merge Under Control
Merge policy
Merge scheduler
Merge factor

Merge policy configuration

https://cwiki.apache.or...
Autocommit or Not?
Automatic data flush (hard commit)
Automatic index view refresh

<autoCommit>
<maxTime>15000</maxTime>
...
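The interplay of the two thresholds in this configuration can be sketched in a few lines. The example assumes the values from the slide (hard commit after 15000 ms or 1000 documents, soft commit after 1000 ms) and that whichever limit is reached first triggers the commit; `commit_due` is a hypothetical helper, not a Solr API.

```python
def commit_due(pending_docs, ms_since_oldest_pending, max_docs=None, max_time_ms=None):
    """Return True when either configured threshold has been reached."""
    if pending_docs == 0:
        return False  # nothing to commit
    if max_docs is not None and pending_docs >= max_docs:
        return True
    if max_time_ms is not None and ms_since_oldest_pending >= max_time_ms:
        return True
    return False


# Settings taken from the slide.
HARD = {"max_docs": 1000, "max_time_ms": 15000}
SOFT = {"max_time_ms": 1000}

if __name__ == "__main__":
    # 200 documents waiting for 1.2 seconds: visible soon, not yet flushed.
    print(commit_due(200, 1200, **SOFT))
    print(commit_due(200, 1200, **HARD))
    # A bulk load reaches the document limit long before the time limit.
    print(commit_due(1000, 300, **HARD))
```

Because openSearcher is false, the hard commit only flushes data to disk; it is the soft commit that refreshes the index view, so new documents become searchable about a second after they arrive.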
Caches
Refreshed with IndexSearcher
Configurable
Different purposes

Different implementations

Solr Cache
Monitoring Importance
What to Pay Attention to?
Cluster State
Health

Shards and replica status
Shard placement

Failing nodes
Indexing Related Metrics
Index throughput
Document distribution
I/O subsystem metrics
Merging
Search - related Metrics
Count
Latency
Distribution among nodes
Anomalies and spikes
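One reason to watch latency distribution rather than a single average is that a mean hides spikes. The sketch below uses made-up query times in milliseconds and a nearest-rank percentile; both the data and the helper name `percentile` are illustrative assumptions.

```python
def percentile(values, p):
    """Nearest-rank percentile of a non-empty list, with p between 1 and 100."""
    ordered = sorted(values)
    # Ceiling of p * n / 100 using integer arithmetic.
    rank = max((p * len(ordered) + 99) // 100, 1)
    return ordered[rank - 1]


# Hypothetical query times in milliseconds; one request is an outlier.
qtimes = [5, 7, 6, 8, 250, 6, 5, 9, 7, 6]

if __name__ == "__main__":
    print(sum(qtimes) / len(qtimes))  # mean
    print(percentile(qtimes, 50))     # median
    print(percentile(qtimes, 95))     # tail latency
```

Here the median stays at 6 ms while the 95th percentile jumps to 250 ms, which is the kind of anomaly a dashboard showing only the mean (30.9 ms) would blur.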
Monitoring Memory and GC
Heap details

Pool size
Pool utilization

Garbage collection count
Garbage collection time
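Garbage collection count and time are usually exposed as ever-growing counters, so what matters is the change between two readings. The following sketch assumes two samples taken some time apart (for example from the JVM's garbage collector MBeans); the `GcSample` type, the `gc_delta` helper, and the numbers are hypothetical.

```python
from collections import namedtuple

# One reading of the cumulative GC counters at a point in time.
GcSample = namedtuple("GcSample", "timestamp_ms collection_count collection_time_ms")


def gc_delta(prev, curr):
    """Summarize GC activity between two samples of cumulative counters."""
    elapsed = curr.timestamp_ms - prev.timestamp_ms
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    gc_ms = curr.collection_time_ms - prev.collection_time_ms
    return {
        "collections": curr.collection_count - prev.collection_count,
        "gc_ms": gc_ms,
        # Share of wall-clock time spent in garbage collection.
        "overhead_pct": 100.0 * gc_ms / elapsed,
    }


if __name__ == "__main__":
    before = GcSample(timestamp_ms=0, collection_count=10, collection_time_ms=200)
    after = GcSample(timestamp_ms=60000, collection_count=16, collection_time_ms=800)
    print(gc_delta(before, after))
```

A steadily rising overhead percentage, or a few collections that account for most of the time, is a hint to look at heap and pool utilization next.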
Monitoring OS Related Metrics
CPU details
Load
I/O activity
Network usage
Solr Administration Panel
Solr & JMX
<jmx />
java -Dcom.sun.management.jmxremote -jar start.jar
Solr & JMX
SPM
Index statistics

Request # and latency
Caches and warmup

CPU
JVM Memory and OS Memory
Garbage collector
OS related s...
SPM Dashboard
Other Monitoring Tools
Ganglia
http://ganglia.sourceforge.net/

New Relic
http://www.newrelic.com/

Opsview
http://www.ops...
Too much is too much
Too hot
Caches
We Are Hiring !
Dig Search ?
Dig Analytics ?
Dig Big Data ?
Dig Performance ?
Dig working with and in open – source ?
We’r...
Thank You !
Rafał  Kuć  
@kucrafal
rafal.kuc@sematext.com

Sematext
@sematext
http://sematext.com
http://blog.sematext.com...
Administering and Monitoring SolrCloud Clusters


Presented by Rafał Kuć, consultant and software engineer, Sematext Group, Inc.

Even though Solr can run for long periods without trouble, it is important to monitor your cluster and understand what is happening in it. In this session you will learn how to use various tools to monitor Solr's behavior at a high level as well as at the Lucene, JVM, and operating system levels. You'll see how to react to what you observe and how to change the configuration, index structure, and shard layout using the Solr API. We will also discuss the performance metrics that deserve extra attention. Finally, you'll learn what to do when things go awry: we will share a few troubleshooting examples, then dissect what was wrong and what had to be done to make things work again.

Published in: Technology

Administering and Monitoring SolrCloud Clusters

  1. Administering and Monitoring SolrCloud Rafał Kuć – Sematext Group, Inc. @kucrafal @sematext sematext.com
  2. Ta  me… Sematext consultant & engineer Solr.pl co-founder Father and husband
  3. SolrCloud Concepts Shard1 Replica Shard2 Replica Solr Server Solr Server Shard2 Shard1 Solr Server Solr Server Application
  4. Local SolrCloud Cluster java -Dbootstrap_confdir=./solr/revolution/conf -Dcollection.configName=revolution -DzkRun -DnumShards=1 -jar start.jar Runs embedded ZooKeeper Bootstraps collection with 1 shard Starts Solr
  5. Starting Solr Cluster No Collection No Collection -DzkHost=192.168.1.1:2181,192.168.1.2:2181,192.168.1.3:2181 Solr Server -DzkHost=192.168.1.3:2181,192.168.1.1:2181,192.168.1.2:2181 Solr Server No Collection No Collection -DzkHost=192.168.1.2:2181,192.168.1.1:2181,192.168.1.3:2181 -DzkHost=192.168.1.3:2181,192.168.1.1:2181,192.168.1.2:2181 Solr Server ZooKeeper ZooKeeper ZooKeeper Solr Server
  6. Uploading Collection Configuration ./zkcli.sh -cmd upconfig -zkhost 192.168.1.1:2181 -confdir ./conf/ -confname revolution ZooKeeper Collection configuration ZooKeeper ZooKeeper Solr
  7. Collections API Create Delete Reload Split Create Alias Delete Alias Shard Creation/Deletion http://wiki.apache.org/solr/SolrCloud
  8. Collection Creation curl 'http://solrhost:8983/solr/admin/collections?action=CREATE&name=revolution&numShards=3&replicationFactor=4' name numShards replicationFactor maxShardsPerNode createNodeSet collection.configName
  9. Collection Split Example $ curl 'http://solr1:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=1'
  10. Collection Split Example $ curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1'
  11. Getting Deeper – CoreAdmin API curl 'http://solrhost:8983/solr/admin/cores?action=CREATE&name=newcore&collection=revolution&shard=shard2' collection shard numShards collection.configName
  12. Schema – the API Reading (Solr 4.2) Fields Dynamic fields Types Copy fields Name (4.3) Version (4.3) Unique Key (4.3) Similarity (4.3) Writing (Solr 4.4) Adding new fields Adding copy fields
  13. Reading Your Schema curl -XGET 'http://solrhost:8983/solr/rev/schema/fields/name' { "responseHeader" : { "status" : 0, "QTime" : 5 }, "field" : { "name" : "name", "type" : "text_general", "indexed" : true, "stored" : true } } Full reference: http://wiki.apache.org/solr/SchemaRESTAPI
  14. Dynamic Schema Modifications <schemaFactory class="ManagedIndexSchemaFactory"> <bool name="mutable">true</bool> <str name="managedSchemaResourceName">managed-schema</str> </schemaFactory> curl -XPUT 'http://solrhost:8983/solr/rev/schema/fields/content' -d '{ "type" : "text", "stored" : "false", "copyFields" : ["catchAll"] }' curl -XPOST 'http://solrhost:8983/solr/rev/schema/copyFields' -d '[ { "source" : "name", "dest" : [ "text", "personal" ] } ]'
  15. The Right Directory StandardDirectory SimpleFSDirectory NIOFSDirectory MMapDirectory _0.fdt _0.fdx _0.fnm _0.nvd _1.fdt _1.fdx _1.fnm _1.nvd NRTCachingDirectory RAMDirectory <directoryFactory name="DirectoryFactory" class="solr.NRTCachingDirectoryFactory" />
  16. Segment Merging Level 0 a b f Level 1 c c d e g
  17. Segment Merge Under Control Merge policy Merge scheduler Merge factor Merge policy configuration https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig
  18. Autocommit or Not? Automatic data flush (hard commit) Automatic index view refresh <autoCommit> <maxTime>15000</maxTime> <maxDocs>1000</maxDocs> <openSearcher>false</openSearcher> </autoCommit> <autoSoftCommit> <maxTime>1000</maxTime> </autoSoftCommit>
  19. Caches Refreshed with IndexSearcher Configurable Different purposes Different implementations Solr Cache
  20. Monitoring Importance
  21. What to Pay Attention to?
  22. Cluster State Health Shards and replica status Shard placement Failing nodes
  23. Indexing Related Metrics Index throughput Document distribution I/O subsystem metrics Merging
  24. Search-related Metrics Count Latency Distribution among nodes Anomalies and spikes
  25. Monitoring Memory and GC Heap details Pool size Pool utilization Garbage collection count Garbage collection time
  26. Monitoring OS Related Metrics CPU details Load I/O activity Network usage
  27. Solr Administration Panel
  28. Solr & JMX <jmx /> java -Dcom.sun.management.jmxremote -jar start.jar
  29. Solr & JMX
  30. SPM Index statistics Request # and latency Caches and warmup CPU JVM Memory and OS Memory Garbage collector OS related statistics
  31. SPM Dashboard
  32. Other Monitoring Tools Ganglia http://ganglia.sourceforge.net/ New Relic http://www.newrelic.com/ Opsview http://www.opsview.com
  33. Too much is too much
  34. Too hot
  35. Caches
  36. We Are Hiring ! Dig Search ? Dig Analytics ? Dig Big Data ? Dig Performance ? Dig working with and in open – source ? We’re hiring world – wide ! http://sematext.com/about/jobs.html
  37. Thank You ! Rafał Kuć @kucrafal rafal.kuc@sematext.com Sematext @sematext http://sematext.com http://blog.sematext.com SPM discount code: LR2013SPM20 @ Sematext booth ;)