2. Intro
Mano Kovacs
• Cloudera Search engineer
• Working on “Where did my Solr go?” mysteries.
Mike Drob (co-author)
• Committer on Apache Solr, HBase, etc.
• Distributed Systems Junkie
3. Agenda
• Consistency basics (leaders/followers)
• Leader election
• When to recover
• General recovery (PeerSync, replication)
• Recovery in detail
• Leader-Initiated Recovery
• Auto Add Replica
4. 01
Basics
• Collections are split into shards
- Documents are routed by ID or another shard key
• Shards can have replicas
• One leader per shard
• Reads are distributed across shards
• Writes
- Go to the leader first (for consistency)
- Then replicate to the other replicas
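A minimal SolrJ sketch of these basics; the ZK address, collection name, configset and field names are made up for illustration, and the Builder constructor varies between SolrJ versions:

```java
import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.common.SolrInputDocument;

public class BasicsExample {
  public static void main(String[] args) throws Exception {
    // Connect through ZooKeeper so SolrJ can route requests to the shard leaders.
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("localhost:2181"), Optional.empty()).build()) {

      // A collection with 2 shards and 2 replicas per shard (1 leader + 1 follower each).
      CollectionAdminRequest.createCollection("books", "_default", 2, 2)
          .process(client);

      // compositeId routing: docs sharing the "store1!" prefix land on the same shard.
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "store1!978-0135957059");
      doc.addField("title_s", "The Pragmatic Programmer");
      client.add("books", doc);   // sent to the shard leader first, then replicated
      client.commit("books");
    }
  }
}
```

The “store1!” prefix is the shard-key routing mentioned above: the leader for that shard accepts the write, then its replicas pull the update.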
5. 01
Leader Election
• ZooKeeper leader election recipe
- Sequential, ephemeral nodes
- The sequence order dictates the queue of leader candidates
- The first (lowest sequence number) becomes the leader candidate
• Replicas* watch the previous candidate
• If the leader fails, the next in line becomes the candidate
• The candidate runs the leader preparation process
*(Solr 7.0+: applies to realtime replicas)
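A stripped-down sketch of the ZooKeeper recipe itself (not Solr's actual ZkController code); it assumes an /election parent znode already exists and a ZooKeeper reachable at localhost:2181:

```java
import java.util.Collections;
import java.util.List;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ElectionSketch {
  public static void main(String[] args) throws Exception {
    ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> { });

    // Each candidate registers a sequential, ephemeral node under the election path.
    String me = zk.create("/election/n_", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    String myName = me.substring("/election/".length());

    // The sequence numbers dictate the order of leader candidates.
    List<String> candidates = zk.getChildren("/election", false);
    Collections.sort(candidates);

    int myIndex = candidates.indexOf(myName);
    if (myIndex == 0) {
      System.out.println("lowest sequence number: run the leader preparation process");
    } else {
      // Watch only the previous candidate; when it disappears, re-check the order.
      String predecessor = "/election/" + candidates.get(myIndex - 1);
      zk.exists(predecessor, event ->
          System.out.println("previous candidate gone, re-run the election check"));
    }
  }
}
```

Because each candidate watches only its predecessor, a failure wakes up exactly one node instead of the whole herd.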
6. 01
Leader Election - Preparation Process
• On restart: waits for all replicas to participate (default: 3 minutes)
- Replicas are asked to replay any missing updates
• Verifies the last published state was ACTIVE (unless this is a fresh startup)
- If all replicas were DOWN, the shard hangs (SOLR-7065)
• Verifies no error was reported (LIR, covered later)
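A toy illustration of the "wait for replicas" step; both helpers are hypothetical stand-ins for Solr's cluster-state checks, and the 3-minute deadline mirrors the default leaderVoteWait setting:

```java
import java.util.concurrent.TimeUnit;

public class LeaderVoteWaitSketch {
  // Hypothetical stand-ins: real Solr reads these counts from the ZK cluster state.
  static int expectedReplicas() { return 3; }
  static int liveReplicas()     { return 2; }

  public static void main(String[] args) throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MINUTES.toNanos(3);

    // The would-be leader holds the election open, giving restarted replicas a
    // chance to participate (and to replay any updates they are missing).
    while (liveReplicas() < expectedReplicas() && System.nanoTime() < deadline) {
      Thread.sleep(1000);
    }
    System.out.println("proceeding with leader preparation; replicas seen: " + liveReplicas());
  }
}
```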
7. 01
What causes Recovery?
Routine Events
• Add or Move Replica: the new replica does not have the data yet
• Restart (upgrade/tuning): might have missed updates
Not Routine Events
• Server crash
- Leader
- Replica
• Network failure (lost ZK connection)
• Replica partitioned: can access ZK, but not the leader
8. 01
Recovery (from 30,000 ft.)
• Replay unfinished updates from the tlog
• Check whether we are in sync
• If not: “how far behind am I?”
- If N (default 100) docs or fewer:
retrieve the delta (PeerSync)
- Else:
replication: pulling the full index
• Go ACTIVE
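The same decision, sketched with hypothetical helper methods standing in for Solr's tlog replay, PeerSync and replication code:

```java
public class RecoveryDecisionSketch {
  static final int PEER_SYNC_LIMIT = 100;  // the "N" above; configurable in real Solr

  // Hypothetical helpers, only here to show the shape of the decision.
  static void replayLocalTlog()     { System.out.println("replaying unfinished tlog updates"); }
  static long updatesBehindLeader() { return 42; }
  static void peerSyncFromLeader()  { System.out.println("PeerSync: fetching the missing delta"); }
  static void replicateFullIndex()  { System.out.println("replication: pulling the full index"); }
  static void publishActive()       { System.out.println("publishing state=ACTIVE"); }

  public static void main(String[] args) {
    replayLocalTlog();
    long behind = updatesBehindLeader();
    if (behind == 0) {
      // already in sync, nothing to fetch
    } else if (behind <= PEER_SYNC_LIMIT) {
      peerSyncFromLeader();
    } else {
      replicateFullIndex();
    }
    publishActive();
  }
}
```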
9. Recovery (from 1,000 ft.)
• Buffer new updates
- So we don’t fall behind over and over again
• Wait for the leader to notice us
- Otherwise we won’t receive updates
• Replay the buffered updates
- Hopefully the replay catches up with the incoming updates
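A toy model of the buffer-then-replay idea; the queue and the update strings are stand-ins, not Solr's UpdateLog API:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BufferingSketch {
  public static void main(String[] args) {
    // Stand-in for the update buffer: new updates land here while the replica
    // is copying the index, and are replayed afterwards.
    BlockingQueue<String> buffered = new LinkedBlockingQueue<>();

    buffered.add("add doc 101");   // arrives while replication is still running
    buffered.add("add doc 102");

    // ... PeerSync or full replication finishes here ...

    // Replay everything that arrived in the meantime; if the replay keeps up
    // with incoming traffic, the replica can go ACTIVE.
    String update;
    while ((update = buffered.poll()) != null) {
      System.out.println("replaying buffered update: " + update);
    }
  }
}
```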
10. Problem with PeerSync
If a document is missing from before the PeerSync window, we won’t know!
[Diagram: Leader vs. Replica. The last 100 docs all match, but some older docs are missing on the replica.]
11. Recovery (from 100 ft.)
• Updates are versioned
- Timestamp + counter
• The index has a fingerprint (a checksum computed over the documents’ versions)
• If any other updates are missing, the fingerprint check will fail
- A consistency safety net when the other checks miss something
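A toy fingerprint comparison; the hashing is made up (Solr's real index fingerprint is more involved), but it shows why a doc missing outside the last-100 window still gets caught:

```java
import java.util.Arrays;
import java.util.List;

public class FingerprintSketch {
  // Toy fingerprint: fold every doc's version into one number, so that any
  // missing version changes the result.
  static long fingerprint(List<Long> versions) {
    long hash = 0;
    for (long v : versions) {
      hash += v * 0x9E3779B97F4A7C15L;  // arbitrary mixing constant
    }
    return hash;
  }

  public static void main(String[] args) {
    List<Long> leader  = Arrays.asList(1001L, 1002L, 1003L, 1004L);
    List<Long> replica = Arrays.asList(1001L, 1003L, 1004L);  // 1002 is missing, and it
                                                              // falls outside the PeerSync window
    if (fingerprint(leader) != fingerprint(replica)) {
      System.out.println("fingerprint mismatch: fall back to full replication");
    }
  }
}
```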
12. 01
Leader-Initiated Recovery
• The leader is partitioned from a replica, but both can still reach ZK
• The leader sends recovery requests to the replica (with retries)
• If the replica went down, it will go through the normal recovery process anyway
• If the replica is partitioned but up, it will still serve stale reads :(
• (this can happen during the update-forwarding phase of recovery)
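A sketch of the leader's side of LIR; the replica URL and the requestRecovery helper are hypothetical, only the retry-with-backoff shape is the point:

```java
public class LirSketch {
  // Hypothetical stand-in for the "please recover" request the leader sends
  // when it cannot forward an update to a replica.
  static boolean requestRecovery(String replicaUrl) {
    System.out.println("asking " + replicaUrl + " to recover");
    return false;  // pretend the replica is unreachable
  }

  public static void main(String[] args) throws InterruptedException {
    String replicaUrl = "http://replica-host:8983/solr/books_shard1_replica2";

    // The leader records the problem and keeps retrying; a partitioned-but-alive
    // replica never sees the request and keeps serving stale reads.
    for (int attempt = 1; attempt <= 5; attempt++) {
      if (requestRecovery(replicaUrl)) {
        return;
      }
      Thread.sleep(2000L * attempt);  // back off between retries
    }
    System.out.println("giving up; the replica will recover when it reconnects to ZK");
  }
}
```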
13. 01
LIR problems - SOLR-9555
• Race condition between LIR and standard recovery
• The leader overwrites the RECOVERING state
• The follower waits for the leader to notice it, until it times out
• Mike Drob’s patch is almost done
- It also solves the partitioned-replica problem, using ZK watches
14. 01
AutoAddReplica
• Uses a shared file system (e.g. HDFS)
- Provides durability
- Instances share index folders
• Moves cores to live nodes on failure
• Reuses the same index folder
• Benefits
- Durability with replication factor 1
- Handles permanent node loss
• Lots of fixes from Mark Miller lately
• Being rewritten in SOLR-10397