Scaling search with Solr Cloud

Enterprise search can grow big, really big, and it keeps growing. Tens or even hundreds of servers may be involved, locally or in the cloud. Managing this has been complex and time-consuming - until now :)

SolrCloud to the rescue
Using the world's most popular open source search engine, Apache Solr™, we will show you how the upcoming version 4.0 makes scaling search in the cloud simple and robust. A new feature called SolrCloud adds centralized configuration, distributed indexing and searching, automatic failover, recovery and leader election. Scaling is now as simple as adding a new server to your cluster: it finds the role where it is most needed and starts serving searches.


Transcript

  • 1. Scaling search with SolrCloud. Jan Høydahl, Cominvent AS
  • 2. Jan Høydahl. 1995: Developer telecom; 1998: Java developer; 2000: Search - FAST; 2006: Lucene; 2007: new Cominvent(); 2009: Lucene/Solr; 2011: Lucene committer; 2012: Lucene PMC; > 100 projects
  • 3. (no text)
  • 4. About Cominvent. Business critical search. Domain knowledge & best practices: Consulting, Training, Support
  • 5. SolrTraining.com. Next course in Oslo: September 2012 (calendar from www.calendar-of-2012.com)
  • 6. http://www.meetup.com/Oslo-Solr-Community CommunityZone talk: «Solr 101», Thursday 14:20
  • 7. http://www.meetup.com/Oslo-Solr-Community Next MeetUp:
  • 8. ApacheCon Europe 2012
      • Sinsheim, Germany
      • Lucene/Solr track
      • November 5-8
      • www.apachecon.eu
  • 9. Agenda
      • Intro to Solr
      • Scaling search - before
      • Introduction to SolrCloud
      • Demo with Wikipedia data
      • Plans for Solr going forward
      • Q&A
  • 10. Intro to Solr
  • 11. Apache Solr Search Server
  • 12. Completely HTTP based
  • 13. (no text)
  • 14. Areas of use
  • 15. Example: e-commerce, www.libris.no
  • 16. Boosting by function. Boosting on review popularity and sales numbers: log(sum(popularity,numsold))
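To make the boost function on this slide concrete, here is a minimal sketch of the arithmetic behind `log(sum(popularity,numsold))`. It assumes Solr's `log` function query is base 10; the sample products and their numbers are made up for illustration. How the function is attached to the query is not shown on the slide.

```python
import math


def boost(popularity: float, numsold: float) -> float:
    """Mirror log(sum(popularity,numsold)), assuming Solr's log() is base 10."""
    return math.log10(popularity + numsold)


# Hypothetical products: (title, popularity, numsold). Not from the slides.
products = [("A", 10, 90), ("B", 500, 500), ("C", 1, 0)]

for title, popularity, numsold in products:
    print(title, boost(popularity, numsold))
```

The logarithm dampens large values: product B has ten times the combined popularity and sales of product A, yet its boost is only one unit higher. A sum of zero has no logarithm, so in this sketch such a document would need a guard or a default value.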
  • 17. Auto suggest & phonetic normalization
  • 18. Example: classifieds/auctions, www.finn.no
  • 20. Who uses Apache Lucene/Solr™? ...and many more: http://wiki.apache.org/solr/PublicServers
  • 21. Versions
      • Current stable = 3.6.1
      • Latest release = 4.0-beta
      • Next release = 4.0-FINAL, «soon» :-)
      • Timeline: v1.1 (01/2007), v1.3 (09/2008), v1.4 (11/2009), v3.1 (03/2011), v3.3 (06/2011), v3.6 (04/2012), v4.0-alpha (06/2012), v3.6.1 (07/2012), v4.0-beta (08/2012), v4.0 (??/2012)
  • 22. Scaling search
  • 23. Why scale?
      • One single Solr server handles...
          – millions of documents (per shard)
          – hundreds of queries per second (per replica)
      • We need to scale if...
          – data volume increases
          – query volume increases
          – we need high availability / fault tolerance
  • 24. Scaling search - before
      • Solr shard 1: config, schema, synonyms
  • 25. Scaling search - before
      • Solr shard 1: config, schema, synonyms
      • Solr shard 2: config, schema, synonyms
      • Add shard node
          – Manually copy config
          – Manually index to right shard
          – Manually set the shards query parameter
  • 26. Scaling search - before
      • Solr shard 1: config, schema, synonyms
      • Solr shard 2: config, schema, synonyms
      • Add shard node
          – Manually copy config
          – Manually index to right shard
          – Manually set the shards query parameter
      • Solr 1 replica: config, schema, synonyms
      • Solr 2 replica: config, schema, synonyms
      • Add replica node
          – Copy config
          – Set up poll based replication
          – No indexing failover
          – Monitor every node
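The two manual steps "index to right shard" and "shards query parameter" can be sketched as follows. This is only an illustration of the bookkeeping the client had to do itself: the host names, the CRC32 hash and the `/select` path are assumptions, not taken from the slides. The comma-separated `shards` request parameter is the standard way to ask one Solr node to query several shards.

```python
import zlib
from urllib.parse import urlencode

# Hypothetical shard addresses; the slides do not name hosts or ports.
SHARDS = ["solr1.example.com:8983/solr", "solr2.example.com:8983/solr"]


def pick_shard(doc_id: str, shards: list[str]) -> str:
    """Choose the shard for a document using a stable hash of its id.

    The hashing scheme is an assumption; any stable mapping would do.
    """
    return shards[zlib.crc32(doc_id.encode("utf-8")) % len(shards)]


def distributed_query_url(query: str, shards: list[str]) -> str:
    """Build a select URL that lists every shard in the `shards` parameter."""
    params = urlencode({"q": query, "shards": ",".join(shards)})
    return f"http://{shards[0]}/select?{params}"


print(pick_shard("doc-42", SHARDS))
print(distributed_query_url("solr", SHARDS))
```

Every client that indexes or queries has to carry the shard list, and adding a shard changes both the routing and every query URL. This is the manual work the "before" slides describe.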
  • 27. Solr Cloud
  • 28. What is SolrCloud?
      • New in Solr 4.0
      • Easier scaling
      • Centralized config
      • Fault tolerant indexing and querying
      • Using Apache ZooKeeper as «registry»
      • ZooKeeper: «Because coordinating distributed systems is a Zoo»
  • 29. What is SolrCloud
  • 32. What is SolrCloud: Logical collection
  • 33. What is SolrCloud: Logical collection, Soft commit, Transaction log
  • 34. Scaling search - with SolrCloud
      • Nodes: Solr master 1 (ZK aware); Apache ZooKeeper
  • 35. Scaling search - with SolrCloud
      • Nodes: Solr master 1 (ZK aware); Solr master 2 (ZK aware); Apache ZooKeeper
      • Add shard node, point it to ZK
          – It assumes the role of shard 2
          – Automatic document distribution
          – Automatic querying across cluster
          – Centralized config & monitoring
  • 36. Scaling search - with SolrCloud
      • Nodes: Solr master 1 (ZK aware); Solr master 2 (ZK aware); Apache ZooKeeper; Solr replica 1 (ZK aware); Solr replica 2 (ZK aware)
      • Add shard node, point it to ZK
          – It assumes the role of shard 2
          – Automatic document distribution
          – Automatic querying across cluster
          – Centralized config & monitoring
      • Add replica node(s)
          – Auto role assignment
          – Push based replication
          – Indexing failover
          – Leader election through ZK
  • 37. Scaling search - with SolrCloud
      • Nodes: Solr master 1 (ZK aware); Solr master 2 (ZK aware); Apache ZooKeeper; Solr replica 1 (ZK aware); Solr replica 2 (ZK aware)
  • 38. Scaling search - with SolrCloud
      • Nodes: Solr master 1 (ZK aware); Solr master 2 (ZK aware); Apache ZooKeeper; Solr replica 1 (ZK aware); Solr replica 2 (ZK aware), now labeled «master»
  • 39. Scaling search - with SolrCloud
      • Nodes: Solr master 1 (ZK aware); Solr master 2 (ZK aware), now labeled «replica»; Apache ZooKeeper; Solr replica 1 (ZK aware); Solr replica 2 (ZK aware), now labeled «master»
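The role swap in the last three slides (replica 2 becomes master, master 2 returns as replica) can be imitated with a toy model. This is a pure in-memory simulation and not SolrCloud or ZooKeeper code: it assumes, in the spirit of the usual ZooKeeper election recipe, that the longest-registered live node of a shard is its leader. The `Shard` class and the node names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Shard:
    """Toy stand-in for the election state a coordinator keeps for one shard."""

    queue: list[str] = field(default_factory=list)  # live nodes in join order

    def join(self, node: str) -> None:
        """A node registers itself, as if creating an ephemeral entry."""
        self.queue.append(node)

    def leave(self, node: str) -> None:
        """A node dies or disconnects and its entry disappears."""
        self.queue.remove(node)

    @property
    def leader(self) -> Optional[str]:
        """The earliest-registered live node leads; None if no node is live."""
        return self.queue[0] if self.queue else None


shard2 = Shard()
shard2.join("master2")
shard2.join("replica2")
print(shard2.leader)     # master2

shard2.leave("master2")  # the current leader goes down
print(shard2.leader)     # replica2 takes over

shard2.join("master2")   # the old leader returns at the back of the queue
print(shard2.leader)     # still replica2
```

The point of the sketch is that no operator intervenes: leadership follows from which nodes are currently registered, which is what "Leader election through ZK" and "Indexing failover" promise on the earlier slide.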
  • 40. Configuration
      • Nodes: Solr master 1 (ZK aware); Solr master 2 (ZK aware); ZK; Solr replica 1 (ZK aware); Solr replica 2 (ZK aware)
  • 41. Configuration
      • Solr master 1 (ZK aware): -DzkRun -Dcollection.configName=jz -DnumShards=2 -Dbootstrap_confdir=./solr/coll/conf
      • Other nodes: Solr master 2 (ZK aware); ZK; Solr replica 1 (ZK aware); Solr replica 2 (ZK aware)
  • 42. Configuration
      • Solr master 1 (ZK aware): -DzkRun -Dcollection.configName=jz -DnumShards=2 -Dbootstrap_confdir=./solr/coll/conf
      • Solr master 2 (ZK aware): -DzkHost=localhost:xxxx
      • Solr replica 1 (ZK aware): -DzkHost=localhost:xxxx
      • Solr replica 2 (ZK aware): -DzkHost=localhost:xxxx
      • ZK
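As a rough sketch of how these flags might be assembled into start commands: the `-D` properties below are the ones on the slide, while `java -jar start.jar` is an assumption based on the Jetty launcher shipped with the Solr 4.0 example. The ZooKeeper port stays the slide's `xxxx` placeholder and must be replaced with the real one. Per-node settings such as distinct HTTP ports for nodes on the same machine are not on the slide and are left out.

```python
import shlex


def first_node_args(config_name: str, num_shards: int, conf_dir: str) -> list[str]:
    """Command for the node that runs embedded ZooKeeper and uploads the config."""
    return [
        "java",
        "-DzkRun",                                # start embedded ZooKeeper
        f"-Dcollection.configName={config_name}",  # name of the config set in ZK
        f"-DnumShards={num_shards}",              # number of shards in the cluster
        f"-Dbootstrap_confdir={conf_dir}",        # local config directory to upload
        "-jar", "start.jar",                      # assumed launcher, not on the slide
    ]


def other_node_args(zk_host: str) -> list[str]:
    """Command for every further node: it only needs to know where ZK is."""
    return ["java", f"-DzkHost={zk_host}", "-jar", "start.jar"]


print(shlex.join(first_node_args("jz", 2, "./solr/coll/conf")))
# Placeholder address copied from the slide; substitute the real ZK port.
print(shlex.join(other_node_args("localhost:xxxx")))
```

Only the first node carries the collection settings. Every later node, master or replica, is started with the same single `-DzkHost` property and receives its role and configuration from ZooKeeper.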
  • 43. Demo: indexing & querying
  • 44. Solr 4.0 and beyond
      • Other news in v4.0 FINAL (expected later this autumn)
          – NRT
          – Real-time GET
          – Smaller index & memory footprint
          – New «modern» Admin GUI
          – Incremental updates
          – Pseudo-join
      • Future plans
          – More shard distribution mechanisms
          – Re-balancing cluster (split shards)
          – ...
  • 45. Recap
      • Apache Solr open source enterprise search
      • Scaling Solr was hard
      • Solr 4.0 with SolrCloud makes it easy :)
          – Centralized config
          – Effortless scaling of cluster
          – Fault tolerant indexing & querying
      • Download the 4.0-beta today, 4.0-FINAL soon
  • 46. Remember
      • Next Solr course in Oslo: September 2012 (calendar from www.calendar-of-2012.com), www.solrkurs.no
      • CommunityZone talk: «Solr 101», Thursday 14:20
  • 47. Questions? Jan Høydahl, Cominvent AS, @cominvent
