Demystifying Solr Cloud Autoscaling: Simulations and Testing

1. Demystifying SolrCloud Autoscaling: Simulations and Testing. Andrzej Białecki, Senior Solr Engineer, Lucidworks, ab@lucidworks.com

2. Agenda • Autoscaling Policy Overview • Autoscaling Simulation Framework • Using the bin/solr autoscaling tool • Simulation-based Policy Tuning

3. Autoscaling Policy Overview

4. Autoscaling Policy • Determines Solr core placement on SolrCloud nodes • Preferences determine the sorting order of candidate nodes { minimize : cores }, { maximize : freedisk, precision : 10 } • Rules determine core placement limits on selected nodes { replica : <2, shard : #EACH, node : #ANY }
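
The way preferences order candidate nodes can be sketched in a few lines of Python. This is a simplified illustration, not Solr's actual algorithm: it assumes that `precision` means "values closer than this are treated as equal" and approximates that by bucketing each value with integer division. The node names and metric values are invented, and a missing `precision` is assumed to default to 1.

```python
from typing import Dict, List, Tuple

# Hypothetical node metrics; names and numbers are invented for illustration.
NODES: List[Dict] = [
    {"node": "A", "cores": 3, "freedisk": 105},
    {"node": "B", "cores": 2, "freedisk": 50},
    {"node": "C", "cores": 2, "freedisk": 58},
    {"node": "D", "cores": 2, "freedisk": 71},
]

# The two preferences from the slide, in priority order.
PREFERENCES: List[Dict] = [
    {"minimize": "cores"},
    {"maximize": "freedisk", "precision": 10},
]


def sort_key(node: Dict, preferences: List[Dict]) -> Tuple[int, ...]:
    """Build a sort key with one bucketed value per preference."""
    key = []
    for pref in preferences:
        precision = pref.get("precision", 1)  # assumed default
        if "minimize" in pref:
            key.append(node[pref["minimize"]] // precision)
        else:
            # Negate so that larger values sort first.
            key.append(-(node[pref["maximize"]] // precision))
    return tuple(key)


def rank_nodes(nodes: List[Dict], preferences: List[Dict]) -> List[Dict]:
    """Return nodes ordered from most to least preferred (stable sort)."""
    return sorted(nodes, key=lambda n: sort_key(n, preferences))


ranked = [n["node"] for n in rank_nodes(NODES, PREFERENCES)]
print(ranked)
```

In this sketch, nodes B and C fall into the same free-disk bucket (50 and 58 with precision 10), so they tie on both preferences and keep their input order.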

5. Example cluster policy cluster-preferences : [ { minimize : cores, precision : 1 }, { maximize : freedisk, precision : 10 } ], cluster-policy : [ { replica : 0, nodeset : { nodeRole : overseer }}, { replica : <2, shard : #EACH, node : #ANY } ]

6. Example cluster + named policies cluster-preferences : [ { minimize : cores, precision : 1 }, { maximize : freedisk, precision : 20 } ], cluster-policy : [ { replica : 0, nodeset : { nodeRole : overseer }}, { replica : <2, shard : #EACH, node : #ANY } ], policies : { policy1 : [ { replica : <3, shard : #EACH, node : #ANY } ], policy2 : [ { replica : <5, shard : #EACH, nodeset : { diskType : ssd }} ] }

7. 🤔

8. Effective policy cluster-policy : [ { replica : 0, nodeset : { nodeRole : overseer }}, { replica : <2, shard : #EACH, node : #ANY } ], policies : { policy1 : [ { replica : <3, shard : #EACH, node : #ANY } ] } Effective policy1: { replica : <3, shard : #EACH, node : #ANY }, { replica : 0, nodeset : { nodeRole : overseer }}

9. Effective policy cluster-policy : [ { replica : 0, nodeset : { nodeRole : overseer }}, { replica : <2, shard : #EACH, node : #ANY } ], policies : { policy2 : [ { replica : <5, shard : #EACH, nodeset : { diskType : ssd }} ] } Effective policy2: { replica : <5, shard : #EACH, nodeset : { diskType : ssd }}, { replica : 0, nodeset : { nodeRole : overseer }}, { replica : <2, shard : #EACH, node : #ANY }

11. Challenges • It’s not always easy to figure out the effective policy • Actual replica placements may differ from expected • How should I design the rules to get the desired layout? • Will the suggested replica movements eventually lead to a stable and balanced cluster? – Can’t recklessly test this on a live cluster!

12. Autoscaling Simulation Framework

13. Autoscaling Simulation Framework • Uses actual autoscaling code • Supports testing large virtual clusters, using accelerated time • Provides API for unit testing • Provides API for saving / loading snapshots of autoscaling data – Anonymized snapshots in 8.3 • Available as a command-line tool since Solr 8.1

14. bin/solr autoscaling command-line tool

15. Tool functionality • Does NOT require a Solr instance to run simulations – Takes a snapshot of a live cluster’s state to initialize, save and load later • Uses existing or provided autoscaling config • Autoscaling data can be redacted before sharing • “What if” exploration – iteratively applies autoscaling suggestions

16. Simulation-based Policy Tuning

17. Tuning workflow 1. Save a snapshot of a live cluster 2. Modify the Policy in autoscaling.json 3. Load a snapshot and apply the modified autoscaling.json – Optionally, run several iterations to see if all suggestions / violations are eventually resolved 4. Check the resulting layout of the simulated cluster 5. Repeat steps 2-4 as needed … 6. Profit! 😀
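
The tuning loop above can be sketched as a small Python helper that assembles the command lines for steps 1 and 3. It only uses flags listed in the tool's help output (`-save`, `-load`, `-a`, `-simulate`, `-i`, `-stats`, `-r`); the directory and file names are hypothetical, and it is an assumption that these flags can be combined in a single invocation as shown. The commands are printed rather than executed.

```python
from typing import List

SOLR = "bin/solr"  # path to the Solr control script, relative to the install dir


def save_snapshot_cmd(snapshot_dir: str, redact: bool = False) -> List[str]:
    """Step 1: save a snapshot of the live cluster, optionally redacted."""
    cmd = [SOLR, "autoscaling", "-save", snapshot_dir]
    if redact:
        cmd.append("-r")
    return cmd


def simulate_cmd(snapshot_dir: str, config_file: str, iterations: int) -> List[str]:
    """Step 3: load the snapshot, apply a modified config, simulate suggestions."""
    return [
        SOLR, "autoscaling",
        "-load", snapshot_dir,
        "-a", config_file,
        "-simulate",
        "-i", str(iterations),
        "-stats",
    ]


# Hypothetical paths, for illustration only.
commands = [
    save_snapshot_cmd("/tmp/snapshot"),
    simulate_cmd("/tmp/snapshot", "autoscaling.json", 5),
]
for cmd in commands:
    print(" ".join(cmd))
```

Steps 2 and 4 stay manual: edit `autoscaling.json`, rerun the second command, and inspect the reported statistics. Replacing `-stats` with `-c` should show the collections layout instead, according to the help text.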

18. Demo

19. Summary • Autoscaling Policy Overview • Autoscaling Simulation Framework • Using the bin/solr autoscaling tool • Simulation-based Policy Tuning

20. THANK YOU Andrzej Białecki Senior Solr Engineer, Lucidworks ab@lucidworks.com

21. Bonus slides

22. What’s in /autoscaling.json Config? • Autoscaling Policy: – Cluster preferences – determine the priority of candidate nodes for replica placement – Cluster-wide policy – Named policies (for use in collection configs) • Trigger configurations – Events, actions and listener

23. Autoscaling Policy • Global rules put constraints on cores (regardless of collection) { cores : <10, node : #ANY } • Collection rules put constraints on replicas { replica: 33%, shard: #EACH, nodeset: { sysprop.zone: east }} { replica: 66%, shard: #EACH, nodeset: { sysprop.zone: west }} • Named policies define additional rules to be used for specific collections – May not contain global rules

24. Effective collection policy • Applied to collections that use the named policy • Combination of the global cluster policy and a named policy – Effective policy holds ONLY unique rules with regard to the node selector: node, nodeset, nodeRole, heapUsage, sysprop.* … • Global rules using a different node selector are appended • Global rules using the same node selector are ignored
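
The merge described on this slide can be sketched in Python and checked against the two "Effective policy" examples from slides 8 and 9. This is an illustration of the stated rules, not Solr's implementation. It assumes that a rule's node selector is either `node` or the attribute named inside `nodeset`; other selectors (`heapUsage`, `sysprop.*`, and so on) are not handled, and the function names are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Rule = Dict[str, Any]


def node_selector(rule: Rule) -> Tuple[str, ...]:
    """Identify which node selector a rule uses (assumed: node or nodeset keys)."""
    if "nodeset" in rule:
        return ("nodeset",) + tuple(sorted(rule["nodeset"]))
    if "node" in rule:
        return ("node",)
    return ()


def effective_policy(cluster_policy: List[Rule], named_policy: List[Rule]) -> List[Rule]:
    """Named rules come first; cluster rules with a new selector are appended."""
    used = {node_selector(rule) for rule in named_policy}
    effective = list(named_policy)
    for rule in cluster_policy:
        if node_selector(rule) not in used:
            effective.append(rule)  # different selector: appended
        # same selector as a named rule: ignored
    return effective


CLUSTER_POLICY = [
    {"replica": 0, "nodeset": {"nodeRole": "overseer"}},
    {"replica": "<2", "shard": "#EACH", "node": "#ANY"},
]
POLICY1 = [{"replica": "<3", "shard": "#EACH", "node": "#ANY"}]
POLICY2 = [{"replica": "<5", "shard": "#EACH", "nodeset": {"diskType": "ssd"}}]

effective1 = effective_policy(CLUSTER_POLICY, POLICY1)
effective2 = effective_policy(CLUSTER_POLICY, POLICY2)
for rule in effective1:
    print("policy1:", rule)
for rule in effective2:
    print("policy2:", rule)
```

For policy1 the cluster-wide `node : #ANY` rule is dropped because the named policy already has a rule on that selector. For policy2 both cluster rules are appended because neither uses the `diskType` selector.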

25. Autoscaling snapshot • All data (JSON) to perform autoscaling calculations and simulate actions – Cluster information (live nodes) and collection information (ClusterState) – All ZooKeeper data – Node information and metrics (relevant to the current Policy) – Replica information and metrics (relevant to the current Policy) – Additional diagnostics and statistics • Optional consistent redaction – http://my.secret.cluster:8000/solr/mySecretCollection → http://N_0/solr/COLL_0 – … “node_name” : “my.secret.cluster_8000_solr” → “N_0_solr” – … “core” : “mySecretCollection_shard1_replica_n2” → …
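
The idea of consistent redaction can be sketched in Python: each distinct node or collection name is mapped to one stable placeholder, so the redacted snapshot stays internally consistent. The `Redactor` class is hypothetical and is not the tool's implementation. It assigns sequential placeholders instead of randomized ones and only covers the URL and `node_name` forms shown on the slide.

```python
from typing import Dict


class Redactor:
    """Maps each distinct name to one stable placeholder (illustrative only)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, str] = {}
        self._collections: Dict[str, str] = {}

    def add_node(self, host_port: str) -> str:
        """Register a node given as host:port; returns its placeholder."""
        return self._nodes.setdefault(host_port, f"N_{len(self._nodes)}")

    def add_collection(self, name: str) -> str:
        """Register a collection name; returns its placeholder."""
        return self._collections.setdefault(name, f"COLL_{len(self._collections)}")

    def redact(self, text: str) -> str:
        """Replace every registered name, in both host:port and host_port forms."""
        for host_port, alias in self._nodes.items():
            text = text.replace(host_port, alias)
            text = text.replace(host_port.replace(":", "_"), alias)
        for name, alias in self._collections.items():
            text = text.replace(name, alias)
        return text


redactor = Redactor()
redactor.add_node("my.secret.cluster:8000")
redactor.add_collection("mySecretCollection")

redacted_url = redactor.redact("http://my.secret.cluster:8000/solr/mySecretCollection")
redacted_name = redactor.redact('"node_name" : "my.secret.cluster_8000_solr"')
print(redacted_url)
print(redacted_name)
```

Registering the same name twice returns the same placeholder, which is what keeps references between nodes, collections and replicas intact after redaction.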

26. $ bin/solr autoscaling -help
  -zkHost <HOST>              Address of the Zookeeper ensemble; defaults to: localhost:9983
  -a,--config <CONFIG>        Autoscaling config file, defaults to the one deployed in the cluster.
  -c,--clusterState           Show ClusterState (collections layout)
  -d,--diagnostics            Show calculated diagnostics
  -s,--suggestions            Show calculated suggestions
  -stats                      Show summarized collection & node statistics.
  -save <DIR>                 Store autoscaling snapshot of the current cluster.
  -load <DIR>                 Load autoscaling snapshot of the cluster instead of using the real one.
  -r,--redact                 Redact node and collection names (original names will be consistently randomized)
  -simulate                   Simulate execution of all suggestions.
  -ss,--saveSimulated <DIR>   Save autoscaling snapshots at each step of simulated execution.
  -i,--iterations <NUMBER>    Max number of simulation iterations.
