SolrCloud Leader Election

Questions we want to answer
• What is the purpose of a leader in SolrCloud?
• How is a leader selected?
• What happens when a leader dies?

Purpose of Leader
• Shards: for scaling. A collection of documents
can be divided into multiple shards.
• Shard replicas: for failover (high availability)
and load balancing. Each shard can be
replicated into multiple shard replicas.

Purpose of Leader
• Collection – multiple shards – multiple replicas
• How is a request served?
– Types of request:
• Read – search query; no consistency issue between
replicas
• Write – index a document; replicas must stay consistent,
so writes need a single source, hence the leader
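The read/write split above can be sketched as a small routing function. This is only an illustrative model, not Solr's request-handling code: the `Shard` class, the `route` method, and the node names are hypothetical.

```python
import random


class Shard:
    """Toy model of one shard: a leader plus its replicas (hypothetical)."""

    def __init__(self, leader, replicas):
        self.leader = leader
        # The leader is itself a replica, so it is part of this list.
        self.replicas = list(replicas)

    def route(self, request_type, rng=random):
        """Return the node that should handle the request."""
        if request_type == "write":
            # Writes go through a single source: the leader.
            return self.leader
        if request_type == "read":
            # Reads have no consistency issue, so any replica can serve them.
            return rng.choice(self.replicas)
        raise ValueError(f"unknown request type: {request_type}")


shard = Shard("node1", ["node1", "node2", "node3"])
write_target = shard.route("write")
read_target = shard.route("read")
```

Here every write lands on `node1`, while a read may land on any of the three nodes, which is also what gives load balancing for searches.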

Purpose of Leader
Image Source: Ref.-2

Leader selection
• ZooKeeper: SolrCloud uses ZooKeeper to track
which nodes are active and which are not, to
manage config files, etc.
• ZooKeeper helps in leader selection
• Solr ships with an embedded ZooKeeper, but
ZooKeeper can also be run externally

Leader selection
• SolrCloud += new node
– The new node registers itself with ZooKeeper
– It opens a session and creates znodes:
• session – has a timeout; kept alive by the client node
regularly
• ephemeral node – removed when the session ends
• sequence node – when created, gets a unique sequence
no. assigned and appended to its name
– The clusterstate.json file gets updated (by the
overseer)
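The registration step can be illustrated with an in-memory stand-in for ZooKeeper. This is a sketch under assumptions, not the real ZooKeeper client API: `ToyZooKeeper`, its method names, the `/election/n_` prefix, and the session ids are all hypothetical. The one detail taken from ZooKeeper itself is that a sequential znode gets a zero-padded 10-digit counter appended to its name.

```python
import itertools


class ToyZooKeeper:
    """In-memory stand-in for ZooKeeper (hypothetical, for illustration)."""

    def __init__(self):
        # Real ZooKeeper keeps one counter per parent znode; a single
        # counter is enough for this one-parent sketch.
        self._counter = itertools.count()
        self.znodes = {}  # path -> id of the session that owns it

    def create_ephemeral_sequential(self, prefix, session_id):
        """Create a znode whose name is prefix + a unique sequence no."""
        path = f"{prefix}{next(self._counter):010d}"
        self.znodes[path] = session_id
        return path

    def expire_session(self, session_id):
        """Ephemeral znodes vanish when the owning session ends."""
        owned = [p for p, s in self.znodes.items() if s == session_id]
        for path in owned:
            del self.znodes[path]


zk = ToyZooKeeper()
node_a = zk.create_ephemeral_sequential("/election/n_", "session-A")
node_b = zk.create_ephemeral_sequential("/election/n_", "session-B")
```

Because the znodes are both ephemeral and sequential, the cluster gets a total order of candidates that automatically shrinks when a node's session times out.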

Leader selection
• The leader is selected based on the sequence no.
• The node whose znode has the lowest sequence no.
becomes the leader
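A minimal sketch of the lowest-sequence-number rule, assuming znode names that end in `_` followed by the sequence number. The helper names and the example znode names are hypothetical.

```python
def sequence_number(znode_name):
    """Extract the numeric suffix that follows the last underscore."""
    return int(znode_name.rsplit("_", 1)[1])


def elect_leader(znode_names):
    """The candidate with the lowest sequence number wins the election."""
    return min(znode_names, key=sequence_number)


candidates = ["n_0000000002", "n_0000000000", "n_0000000001"]
leader = elect_leader(candidates)
```

Comparing the parsed integers rather than the raw strings keeps the rule correct even if the names carry different prefixes.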

Leader dies
• When the leader dies, its ephemeral znode (the one
having the lowest sequence no.) is removed
• All znodes are being watched through ZooKeeper
• The znode having the next sequence no. is elected
as the leader
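The failover step can be sketched as a handler that runs when a watched znode disappears. This is a simplified model: the function names and znode names are hypothetical, and real watch notifications from ZooKeeper are not modelled.

```python
def current_leader(live_znodes):
    """Leader is the live znode with the lowest sequence no., if any."""
    if not live_znodes:
        return None
    return min(live_znodes, key=lambda name: int(name.rsplit("_", 1)[1]))


def on_znode_deleted(live_znodes, deleted):
    """Drop the deleted znode and work out who leads afterwards."""
    remaining = [name for name in live_znodes if name != deleted]
    return remaining, current_leader(remaining)


live = ["n_0000000000", "n_0000000001", "n_0000000002"]
old_leader = current_leader(live)
live, new_leader = on_znode_deleted(live, old_leader)
```

If a non-leader znode is deleted instead, the same handler leaves the leader unchanged, since the lowest sequence number is still present.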

Leader dies
• The new leader candidate starts a sync process
with each replica. If everyone has the same
version, it registers as the active leader
• The old leader might have sent docs to some
replicas but not all
• And if a replica is too far behind, it tries to
replay the log or asks for full replication
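One way to picture the sync decision is as a comparison of versions. The sketch below rests on loud assumptions: versions are treated as simple update counters, and the `max_replay_gap` threshold and the three action names are hypothetical, not values or terms taken from Solr.

```python
def recovery_action(leader_version, replica_version, max_replay_gap=100):
    """Decide how a replica catches up (illustrative rule, not Solr's)."""
    gap = leader_version - replica_version
    if gap <= 0:
        # Same version: nothing to do.
        return "in_sync"
    if gap <= max_replay_gap:
        # Slightly behind: replaying the missed updates is enough.
        return "replay_log"
    # Too far behind: copy the whole index instead.
    return "full_replication"


def can_register_as_leader(candidate_version, replica_versions):
    """The candidate becomes active leader once every replica matches it."""
    return all(v == candidate_version for v in replica_versions)


actions = [recovery_action(500, v) for v in (500, 480, 10)]
```

In this model, a replica that missed the old leader's last few docs only replays them, while one that was down for a long time is rebuilt from scratch.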

Ref:
• http://techblog.outbrain.com/2011/07/leader-election-with-zookeeper/
• http://events.linuxfoundation.org/sites/events/files/slides/ApacheCon_IntroSolrCloud.pdf
• http://zookeeper.apache.org/doc/current/recipes.html#sc_leaderElection
• https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance
• http://youtu.be/K6EC8iFDEuA
• http://youtu.be/eVK0wLkLw9w
• https://wiki.apache.org/solr/SolrCloud
• https://cwiki.apache.org/confluence/display/solr/SolrCloud
• http://grokbase.com/t/lucene/solr-user/12bd9kst9t/role-purpose-of-overseer

Code Flow of write requests
Rough sketch:
org.apache.solr.handler.UpdateRequestHandler -> multiple
org.apache.solr.handler.loader.ContentStreamLoader implementations: csv, xml, json
For each write request, the loader is identified and its load method is
called.
Within the loader, an org.apache.solr.update.UpdateCommand is created
for each type of write request and passed to
org.apache.solr.update.processor.UpdateRequestProcessor.process<Add/Commit/...>
For SolrCloud, DistributedUpdateProcessor is used.
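The shape of this flow can be mimicked in a few lines. This is a Python analogy only, not Solr's Java code: `AddCommand`, `CommitCommand`, `RecordingProcessor`, `json_loader`, `LOADERS`, and `handle_update` are hypothetical names standing in for the update command, processor, loader, and handler roles.

```python
from dataclasses import dataclass


@dataclass
class AddCommand:
    doc: dict


@dataclass
class CommitCommand:
    pass


class RecordingProcessor:
    """Stand-in for an update processor; records the calls it receives."""

    def __init__(self):
        self.calls = []

    def process_add(self, cmd):
        self.calls.append(("add", cmd.doc))

    def process_commit(self, cmd):
        self.calls.append(("commit", None))


def json_loader(body, processor):
    """Toy loader: turns an already-parsed body into update commands."""
    for doc in body.get("add", []):
        processor.process_add(AddCommand(doc))
    if body.get("commit"):
        processor.process_commit(CommitCommand())


# The handler picks a loader by content type, then calls its load step.
LOADERS = {"application/json": json_loader}


def handle_update(content_type, body, processor):
    loader = LOADERS.get(content_type)
    if loader is None:
        raise ValueError(f"no loader for {content_type}")
    loader(body, processor)


processor = RecordingProcessor()
handle_update("application/json", {"add": [{"id": "1"}], "commit": True}, processor)
```

Swapping `RecordingProcessor` for a processor that forwards commands to other nodes is the point where, in this analogy, a distributed processor would plug in.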
