
SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, Cloudera



Presented at Lucene/Solr Revolution 2016



  1. October 11-14, 2016 • Boston, MA
  2. SolrCloud: High Availability and Fault Tolerance. Mark Miller, Software Engineer, Cloudera
  3. Who am I? I'm Mark Miller. I'm a Lucene junkie (2006), a Lucene committer (2008), a Solr committer (2009), a member of the ASF (2011), and a former Lucene PMC Chair (2014-2015). I've done a lot of core Solr work and co-created SolrCloud.
  4. This talk is about how SolrCloud tries to protect your data. And about some things that should change.
  5. SolrCloud Diagram
  6. Failure Cases (shards of an index can be treated independently):
     • A leader dies (loses its ZK connection).
     • A replica dies, or an update from the leader to a replica fails.
     • A replica is partitioned (e.g. it can talk to ZK, but not to the shard leader).
  7. Replica Recovery:
     • A replica will recover from the leader on startup.
     • A replica will recover if an update from the leader to the replica fails.
     • A replica may recover from the leader during the leader election sync-up dance.
  8. Replica Recovery Dance (RecoveryStrategy):
     • Start buffering updates from the leader.
     • Publish RECOVERING to ZK.
     • Wait for the leader to see the RECOVERING state.
     • On the first recovery try, PeerSync.
     • Otherwise, full index replication:
     • Commit on the leader.
     • Replicate the index.
     • Replay the buffered documents.
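The dance above is, at heart, an ordered sequence of steps with one branch (cheap PeerSync on the first try, full replication otherwise). A minimal sketch of that flow, as an illustrative simulation only: the class and method names here (`RecoverySim`, `peerSyncSucceeds`) are hypothetical and not Solr APIs; the real logic lives in Solr's `RecoveryStrategy`.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the replica recovery dance, NOT Solr's actual code.
public class RecoverySim {

    static List<String> recover(boolean firstTry, boolean peerSyncSucceeds) {
        List<String> steps = new ArrayList<>();
        steps.add("buffer updates from leader");
        steps.add("publish RECOVERING to ZK");
        steps.add("wait for leader to see RECOVERING");
        if (firstTry && peerSyncSucceeds) {
            // Cheap path: fetch only the updates this replica missed.
            steps.add("PeerSync from leader");
        } else {
            // Expensive path: copy the whole index from the leader.
            steps.add("commit on leader");
            steps.add("replicate full index");
        }
        // Updates kept arriving while we recovered; apply them last.
        steps.add("replay buffered documents");
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(recover(true, true));
        System.out.println(recover(false, false));
    }
}
```

The branch matters in practice: PeerSync only works when the replica is a small number of updates behind, so later retries fall back to full replication.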
  9. A Replica is Partitioned:
     • In the early days we half punted on this.
     • Now, when a leader cannot reach a replica, it will put it into LIR (leader-initiated recovery) in ZK.
     • A replica in LIR will realize that it must recover before clearing its LIR status.
     • We worked through some bugs, but this is very solid now.
  10. Leader Recovery (SyncStrategy / ElectionContext): the 'best effort' leader recovery dance.
     • If it's after startup and the last published state is not ACTIVE, the replica can't be leader.
     • Otherwise, try to peer sync with the shard.
     • On success, try to peer sync from the replicas to the leader.
     • If any of those syncs fail, ask the replicas to recover from the leader.
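The leader-recovery decision above boils down to one gate (was my last published state ACTIVE?) followed by a best-effort sync. A hedged sketch of that decision, assuming hypothetical names (`LeaderSyncSketch`, `tryToBeLeader`); the real implementation is Solr's SyncStrategy and ElectionContext:

```java
// Illustrative decision table for the 'best effort' leader recovery dance.
// NOT Solr code: class, method, and enum names are invented for this sketch.
public class LeaderSyncSketch {

    enum Outcome {
        CANNOT_BE_LEADER,            // gate failed: last published state not ACTIVE
        LEAD_AFTER_SYNC,             // synced with shard, then synced replicas up
        LEAD_AND_ASK_REPLICAS_TO_RECOVER // a sync failed: lead anyway, force recovery
    }

    static Outcome tryToBeLeader(boolean afterStartup,
                                 boolean lastPublishedActive,
                                 boolean peerSyncWithShardOk,
                                 boolean syncReplicasToLeaderOk) {
        // After startup, only a replica whose last published state was ACTIVE may lead.
        if (afterStartup && !lastPublishedActive) {
            return Outcome.CANNOT_BE_LEADER;
        }
        // Try to peer sync with the shard, then bring the replicas up to the new leader.
        if (peerSyncWithShardOk && syncReplicasToLeaderOk) {
            return Outcome.LEAD_AFTER_SYNC;
        }
        // If any sync fails, become leader anyway and ask replicas to recover from us.
        return Outcome.LEAD_AND_ASK_REPLICAS_TO_RECOVER;
    }

    public static void main(String[] args) {
        // Stale replica after startup: refuses leadership.
        System.out.println(tryToBeLeader(true, false, true, true)); // CANNOT_BE_LEADER
    }
}
```

Note how the ACTIVE gate is exactly what makes the stall on the next slide possible: if nobody passes it, nobody leads.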
  11. Leader Election Forward Progress Stall…
     • Each replica decides for itself whether it thinks it should be leader.
     • Everyone may think they are unfit.
     • Only replicas whose last published state was ACTIVE will attempt to become leader after the first election.
  12. Leader Election Forward Progress Stall…
     • While rare, if all replicas in a shard lose their connection to ZK at the same time, no replica will become leader without intervention.
     • There is a manual API to intervene, but this should happen automatically.
     • In practice, this tends to happen for reasons that can be 'tuned' out.
     • Still needs to be improved.
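The manual escape hatch mentioned above is the Collections API's FORCELEADER action. A hedged example invocation, assuming a Solr node on localhost:8983 and a collection named `mycollection` (both placeholders); only run this against a shard that really is stuck without a leader:

```shell
# Force a leader election for a stalled shard (last-resort manual intervention).
# "mycollection" and "shard1" are placeholder names for this example.
curl "http://localhost:8983/solr/admin/collections?action=FORCELEADER&collection=mycollection&shard=shard1"
```

After it returns, check cluster state (e.g. via the Admin UI or CLUSTERSTATUS) to confirm a leader was elected before resuming indexing.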
  13. User chooses durability requirements:
     • You can specify how many replicas you want to see success from to consider an update successful: the minRf param.
     • The update won't fail based on that criterion, though; the achieved factor is simply flagged in the response.
     • If your replication factor is not achieved, that also does not mean the update is rolled back.
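Concretely, the parameter is passed on the update request as `min_rf`, and Solr reports the achieved replication factor back as `rf` in the response. A hedged example, assuming a collection named `mycollection` on localhost:8983 (placeholder names):

```shell
# Index a document and ask Solr to report how many replicas acknowledged it.
# min_rf=2 does NOT make the update fail or roll back if only 1 replica
# succeeds -- it only makes the achieved factor ("rf") visible in the response.
curl "http://localhost:8983/solr/mycollection/update?min_rf=2&commit=true" \
  -H 'Content-Type: application/json' \
  -d '[{"id": "doc1"}]'
```

The client is responsible for comparing the returned `rf` against its required `min_rf` and deciding whether to retry or alert; as the slide says, Solr will not undo the update for you.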
  14. User chooses durability requirements:
     • If we improve some of this…
     • We can stop trying so hard.
     • And put it on the user to specify a replication factor that controls how 'safe' updates are.
  15. JIRA
  16. Handling Cluster Shutdown / Startup:
     • What if an old replica returns?
     • How do we ensure every replica participates in the election?
     • What if no replica thinks it should be leader?
     • Staggered shutdowns?
     • Explicit cluster commands might help.
  17. Thank You! Mark Miller (@heismark), Software Engineer, Cloudera