This document discusses SolrCloud autoscaling, including:
1. An autoscaling policy determines core placement on nodes based on preferences and rules.
2. A simulation framework allows testing autoscaling configurations without impacting a live cluster.
3. The bin/solr autoscaling tool loads snapshots of cluster state, simulates autoscaling suggestions, and tunes policies without risk.
11. Challenges
• It’s not always easy to figure out the effective policy
• Actual replica placements may differ from expected
• How should I design the rules to get the desired layout?
• Will the suggested replica movements eventually lead to a stable
and balanced cluster?
– Can’t recklessly test this on a live cluster!
13. Autoscaling Simulation Framework
• Uses actual autoscaling code
• Supports testing large virtual clusters, using accelerated time
• Provides API for unit testing
• Provides API for saving / loading snapshots of autoscaling data
– Anonymized snapshots in 8.3
• Available as a command-line tool since Solr 8.1
15. Tool functionality
• Does NOT require a Solr instance to run simulations
– Takes a snapshot of a live cluster’s state to initialize, save and load later
• Uses existing or provided autoscaling config
• Autoscaling data can be redacted before sharing
• “What if” exploration – iteratively applies autoscaling suggestions
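The "what if" loop can be pictured with a toy model. The sketch below is not Solr's algorithm: it assumes a single rule (at most `max_cores` cores per node), and the function names `suggest_move` and `simulate` are hypothetical. It only illustrates applying one suggestion at a time until no violations remain or the iteration limit is reached.

```python
def suggest_move(cores_per_node, max_cores):
    """Return (source, target) for one move that relieves a violation, or None."""
    over = [n for n, c in cores_per_node.items() if c > max_cores]
    if not over:
        return None  # no violations left
    source = max(over, key=lambda n: cores_per_node[n])
    target = min(cores_per_node, key=lambda n: cores_per_node[n])
    if cores_per_node[target] + 1 > max_cores:
        return None  # no node can take a core without violating the rule
    return source, target


def simulate(cores_per_node, max_cores, max_iterations):
    """Apply suggestions one by one; return (layout, moves, resolved)."""
    layout = dict(cores_per_node)  # work on a copy, like a snapshot
    moves = []
    for _ in range(max_iterations):
        move = suggest_move(layout, max_cores)
        if move is None:
            break
        source, target = move
        layout[source] -= 1
        layout[target] += 1
        moves.append(move)
    resolved = all(c <= max_cores for c in layout.values())
    return layout, moves, resolved


layout, moves, resolved = simulate(
    {"N_0": 5, "N_1": 1, "N_2": 0}, max_cores=2, max_iterations=10
)
print(layout, len(moves), resolved)
```

The `resolved` flag answers the question from the Challenges slide in miniature: do the suggested movements end in a layout with no violations, or does the loop stop with violations still open?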
17. Tuning workflow
1. Save a snapshot of a live cluster
2. Modify the Policy in autoscaling.json
3. Load a snapshot and apply the modified autoscaling.json
– Optionally, run several iterations to see if all suggestions / violations are eventually
resolved
4. Check the resulting layout of the simulated cluster
5. Repeat steps 2-4 as needed
…
6. Profit! 😀
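The workflow above can be sketched as two command lines. The helper below only builds the argument lists and does not run anything. The flags (`-zkHost`, `-save`, `-load`, `-a`, `-simulate`, `-i`, `-stats`, `-r`) are taken from the tool's help output; combining them in exactly this way, and the paths `snapshots/prod` and `autoscaling.json`, are assumptions for illustration.

```python
import shlex


def save_snapshot_cmd(snapshot_dir, zk_host=None, redact=False):
    """Step 1: save a snapshot of the live cluster."""
    cmd = ["bin/solr", "autoscaling"]
    if zk_host:
        cmd += ["-zkHost", zk_host]
    cmd += ["-save", snapshot_dir]
    if redact:
        cmd.append("-r")  # redact node and collection names
    return cmd


def simulate_cmd(snapshot_dir, config_path, iterations=None):
    """Steps 3-4: load the snapshot, apply the modified config, simulate."""
    cmd = ["bin/solr", "autoscaling",
           "-load", snapshot_dir,
           "-a", config_path,
           "-simulate"]
    if iterations is not None:
        cmd += ["-i", str(iterations)]
    cmd.append("-stats")  # summarize the resulting layout
    return cmd


print(shlex.join(save_snapshot_cmd("snapshots/prod")))
print(shlex.join(simulate_cmd("snapshots/prod", "autoscaling.json", iterations=5)))
```

Step 2 (editing the policy) happens between the two commands; steps 2-4 are then repeated with the same saved snapshot.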
22. What’s in /autoscaling.json Config?
• Autoscaling Policy:
– Cluster preferences – determine the priority of candidate nodes for replica placement
– Cluster-wide policy
– Named policies (for use in collection configs)
• Trigger configurations
– Events, actions and listeners
23. Autoscaling Policy
• Global rules put constraints on cores (regardless of collection)
{ cores : <10, node : #ANY }
• Collection rules put constraints on replicas
{ replica: 33%, shard: #EACH, nodeset: { sysprop.zone: east }}
{ replica: 66%, shard: #EACH, nodeset: { sysprop.zone: west }}
• Named policies define additional rules to be used for specific
collections
– May not contain global rules
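For illustration, the rules above could be assembled into an autoscaling.json document as follows. This is a sketch, not a recommended configuration: the top-level keys follow the Solr 8 autoscaling format as I understand it, while the preference values and the policy name `zone_split` are made up. The small check encodes the "may not contain global rules" constraint, assuming a global rule is one that uses the `cores` attribute.

```python
import json

config = {
    # Cluster preferences: how candidate nodes are ranked (illustrative values)
    "cluster-preferences": [{"minimize": "cores"}, {"maximize": "freedisk"}],
    # Cluster-wide policy: the global rule from the slide
    "cluster-policy": [{"cores": "<10", "node": "#ANY"}],
    # Named policies: collection rules from the slide, under a hypothetical name
    "policies": {
        "zone_split": [
            {"replica": "33%", "shard": "#EACH", "nodeset": {"sysprop.zone": "east"}},
            {"replica": "66%", "shard": "#EACH", "nodeset": {"sysprop.zone": "west"}},
        ]
    },
}


def named_policies_with_global_rules(cfg):
    """Names of policies that (incorrectly) contain a 'cores' rule."""
    return sorted(
        name
        for name, rules in cfg.get("policies", {}).items()
        if any("cores" in rule for rule in rules)
    )


text = json.dumps(config, indent=2)
print(text)
print(named_policies_with_global_rules(config))
```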
24. Effective collection policy
• Applied to collections that use the named policy
• Combination of the global cluster policy and a named policy
– Effective policy holds ONLY unique rules with regard to the node selector:
node, nodeset, nodeRole, heapUsage, sysprop.* …
• Global rules using a different node selector are appended
• Global rules using the same node selector are ignored
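The merge described above can be sketched in a few lines. This is a simplified model, not Solr's implementation: it treats the first selector key found in a rule as that rule's node selector, keeps every rule of the named policy, and appends a global rule only when the named policy does not already use the same selector. The selector list is the one on the slide (which ends in "…", so it is incomplete), and the second global rule in the example is invented.

```python
NODE_SELECTORS = ("node", "nodeset", "nodeRole", "heapUsage")


def node_selector(rule):
    """Return the node-selector key used by a rule, or None if none is found."""
    for key in rule:
        if key in NODE_SELECTORS or key.startswith("sysprop."):
            return key
    return None


def effective_policy(cluster_policy, named_policy):
    """Named rules first, then global rules whose selector is not yet used."""
    effective = list(named_policy)
    used = {node_selector(rule) for rule in named_policy}
    for rule in cluster_policy:
        if node_selector(rule) not in used:
            effective.append(rule)  # different selector: appended
        # same selector: ignored, the named policy takes precedence
    return effective


cluster_policy = [
    {"cores": "<10", "node": "#ANY"},
    {"replica": "<2", "shard": "#EACH", "nodeset": {"sysprop.zone": "north"}},  # invented
]
named_policy = [
    {"replica": "33%", "shard": "#EACH", "nodeset": {"sysprop.zone": "east"}},
    {"replica": "66%", "shard": "#EACH", "nodeset": {"sysprop.zone": "west"}},
]

for rule in effective_policy(cluster_policy, named_policy):
    print(rule)
```

Here the global `node` rule is appended, while the global `nodeset` rule is dropped because the named policy already constrains `nodeset`. This is one reason the effective policy can differ from what a reader of the cluster-wide policy alone would expect.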
25. Autoscaling snapshot
• All data (JSON) to perform autoscaling calculations and simulate actions
– Cluster information (live nodes) and collection information (ClusterState)
– All ZooKeeper data
– Node information and metrics (relevant to the current Policy)
– Replica information and metrics (relevant to the current Policy)
– Additional diagnostics and statistics
• Optional consistent redaction
– http://my.secret.cluster:8000/solr/mySecretCollection → http://N_0/solr/COLL_0
– … “node_name” : “my.secret.cluster_8000_solr” → “N_0_solr”
– … “core” : “mySecretCollection_shard1_replica_n2”
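"Consistent" redaction means the same original name always maps to the same placeholder, so a redacted snapshot keeps its internal references intact. The sketch below is a toy model of that idea, not the tool's code: the `Redactor` class is hypothetical, it numbers aliases sequentially, and it assumes node names appear either as `host:port` or as `host_port`.

```python
class Redactor:
    """Maps node and collection names to stable placeholders (N_0, COLL_0, ...)."""

    def __init__(self):
        self._aliases = {}   # (prefix, original name) -> alias
        self._counters = {}  # prefix -> next free index

    def alias(self, prefix, name):
        key = (prefix, name)
        if key not in self._aliases:
            index = self._counters.get(prefix, 0)
            self._counters[prefix] = index + 1
            self._aliases[key] = f"{prefix}_{index}"
        return self._aliases[key]

    def redact(self, text, nodes, collections):
        replacements = {}
        for node in nodes:
            node_alias = self.alias("N", node)
            replacements[node] = node_alias                    # host:port form
            replacements[node.replace(":", "_")] = node_alias  # host_port form
        for coll in collections:
            replacements[coll] = self.alias("COLL", coll)
        # Replace longer names first so a short name cannot clobber a longer one
        for name in sorted(replacements, key=len, reverse=True):
            text = text.replace(name, replacements[name])
        return text


redactor = Redactor()
nodes = ["my.secret.cluster:8000"]
collections = ["mySecretCollection"]
print(redactor.redact("http://my.secret.cluster:8000/solr/mySecretCollection", nodes, collections))
print(redactor.redact("my.secret.cluster_8000_solr", nodes, collections))
```

Because the alias table lives in the `Redactor` instance, every string passed through the same instance is rewritten with the same placeholders.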
26. $ bin/solr autoscaling -help
-zkHost <HOST>              Address of the Zookeeper ensemble; defaults to: localhost:9983
-a,--config <CONFIG>        Autoscaling config file, defaults to the one deployed in the cluster.
-c,--clusterState           Show ClusterState (collections layout)
-d,--diagnostics            Show calculated diagnostics
-s,--suggestions            Show calculated suggestions
-stats                      Show summarized collection & node statistics.
-save <DIR>                 Store autoscaling snapshot of the current cluster.
-load <DIR>                 Load autoscaling snapshot of the cluster instead of using the real one.
-r,--redact                 Redact node and collection names (original names will be consistently randomized)
-simulate                   Simulate execution of all suggestions.
-ss,--saveSimulated <DIR>   Save autoscaling snapshots at each step of simulated execution.
-i,--iterations <NUMBER>    Max number of simulation iterations.
Editor's Notes
Initial placement of replicas on nodes when a collection is created (or replicas added)
Movements of replicas in response to events (node lost, added, or other trigger events)
Suggested movements of replicas based on the current cluster state and the policy
Unresolved violations – replicas that violate current policy given the current cluster state
Are you confused yet?
Did I manage to confuse you? That was my goal.