Queue Based Solr Indexing with Collection Management: Presented by Devansh Dhutia, Gannett Co.

  1. October 13-16, 2016 • Austin, TX
  2. Queue Based Indexing & Collection Management • Devansh Dhutia, Platform Architect
  3. National & local newspaper/media company • 92+ markets in 33 states
  4. Current/Future Architecture
  5. Agenda: Solr @ Gannett • Current State • Collection Management • Queuing Solution • Future Work • Questions
  6. Solr @ Gannett: Site Search • CMS Search • Analytics • Personalization • 40+ applications (integral pillar of Gannett’s Digital Platform) • 20M+ total documents, 800,000+ per month (growing rapidly) • 100,000+ requests per minute (highly available) • ~100ms average response time (extremely fast) • 8 nodes, 256 GB memory per availability zone
  7. Current State
  8. Current State: Synchronous operations • Near realtime • Time-consuming schema changes • Visible outage impact
  9. Collection Management: Create collection • Deploy batch indexer • Index new collection • Update alias to new collection • Run catch-up • Deploy search/index apps
  10. Realtime Changes / Queries
  11. Prep Alternate Collection
  12. Deploy
  13. Outage Problems: Spinning wheel • Duplicate content • Unable to find new content • Frustrated editors • UX & other presentation layers
  14. Enter Queues: Asynchronous write operations • Near realtime • Faster schema changes • Auto-scale indexing workers • Low authoring outage impact • Eventually consistent
  15. Queue Based Indexing
  16. RabbitMQ: Clustered & highly available • FIFO • Pub/sub model • Consistent hash / multiple queues
  17. RabbitMQ
  18. Components: Realtime queue • Batch queue • Prep queue • Deadletter queue • Indexing service • Prep mode • Batch push service
  19. Future Work: Continuous delivery of schema • Build payload in one zone only • Automated deadletter handling • Earlier notification of potential failure
  20. Thank you. Interested in joining our team at Gannett? http://www.gannett.com/careers
  21. Questions?
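
The collection management steps on slide 9 (create a collection, index it, then update the alias) map onto two Solr Collections API actions, CREATE and CREATEALIAS. The following is only a minimal sketch of how those calls might be assembled; it builds the URLs and sends nothing. The base URL, the collection name `articles_v2`, the alias `articles`, and the config name `articles_conf` are hypothetical and do not come from the slides.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Solr Collections API URL. No request is sent."""
    query = urlencode({"action": action, **params, "wt": "json"})
    return f"{base_url}/admin/collections?{query}"


def alias_swap_plan(base_url, alias, new_collection, config_name, num_shards=1):
    """Return the ordered API calls for a create-then-swap rollout.

    All argument values are supplied by the caller; nothing here is
    taken from the presenter's actual deployment.
    """
    create = collections_api_url(
        base_url,
        "CREATE",
        name=new_collection,
        numShards=num_shards,
        **{"collection.configName": config_name},
    )
    # Batch indexing into new_collection would run between these two calls.
    # CREATEALIAS on an existing alias name repoints it to the new collection.
    swap = collections_api_url(
        base_url, "CREATEALIAS", name=alias, collections=new_collection
    )
    return [create, swap]


if __name__ == "__main__":
    for url in alias_swap_plan(
        "http://localhost:8983/solr", "articles", "articles_v2", "articles_conf"
    ):
        print(url)
```

Because search traffic addresses the alias rather than a concrete collection, repointing the alias is what makes the new collection live; the catch-up run listed on the slide would then replay any changes made while the batch index was building.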

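The queue components on slides 14 to 18 (realtime queue, deadletter queue, indexing service) can be illustrated with a small in-memory simulation. This sketch uses Python's standard `queue.Queue` as a stand-in for RabbitMQ; it is not the RabbitMQ client API and not the presenter's implementation. The retry limit, the message shape, and the `flaky_index` function are assumptions introduced purely for illustration; the slides do not describe how retries or dead-lettering were configured.

```python
import queue

MAX_ATTEMPTS = 3  # hypothetical retry limit, not stated in the slides


def run_indexing_worker(realtime_q, deadletter_q, index_fn):
    """Drain realtime_q in FIFO order and return the ids that were indexed.

    A message whose indexing fails is requeued at the back of the queue;
    after MAX_ATTEMPTS failures it is parked on deadletter_q instead.
    """
    indexed = []
    while True:
        try:
            message = realtime_q.get_nowait()
        except queue.Empty:
            return indexed
        try:
            index_fn(message["doc"])
            indexed.append(message["doc"]["id"])
        except Exception:
            message["attempts"] += 1
            if message["attempts"] >= MAX_ATTEMPTS:
                deadletter_q.put(message)
            else:
                realtime_q.put(message)


def flaky_index(doc):
    """Stand-in for a Solr update call; rejects one hypothetical bad document."""
    if doc["id"] == "bad":
        raise ValueError("simulated indexing failure")


def enqueue(q, doc_id):
    """Publish a document to the queue with a fresh attempt counter."""
    q.put({"doc": {"id": doc_id}, "attempts": 0})
```

The author-facing write returns as soon as the message is enqueued, which is why the slides describe the approach as asynchronous and eventually consistent: a Solr outage delays indexing but does not block editors.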