First Oslo Solr Community MeetUp lightning talk (janhoy)

Lightning talk by Jan Høydahl on SolrCloud


Transcript of "First Oslo Solr Community MeetUp lightning talk (janhoy)"

  1. The program is starting... Sponsors:
  2. 1st MeetUp, May 8 2011
     – Welcome; background for the MeetUp
     – (Commercial break)
     – Round of introductions
     – Wishes for the MeetUp group (discussion)
     – Lightning talks, 10 min each (approx. 18:30-19:00)
       • Sture Svensson: "Querying Solr in various ways"
       • Jan Høydahl: "What can I do with SolrCloud today"
       • NN?
     – Formal close (approx. 19:15)
     – Mingling...
  3. Scaling & HA (redundancy)
     – Index up to 25-100 million documents on a single server*
       • Scale linearly by adding servers (shards)
     – Query up to 50-1000 QPS on a single server
       • Scale linearly by adding servers (replicas)
     – Add redundancy or backup through extra replicas
     – Built-in software load balancer, auto failover
     – Indexing redundancy not out of the box
       • But possible to have every row do index+search
     – High Availability for config/admin using Apache ZooKeeper (TRUNK)
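The per-server figures above are the talk's rough estimates. A back-of-the-envelope sizing calculation might look like this (the default capacities below are assumptions picked from within the quoted 25-100M docs and 50-1000 QPS ranges, not numbers from the slides):

```python
import math

# Rough cluster sizing from the slide's per-server estimates.
# docs_per_server and qps_per_server are illustrative assumptions.
def cluster_size(total_docs, target_qps,
                 docs_per_server=50_000_000, qps_per_server=200):
    shards = math.ceil(total_docs / docs_per_server)  # columns: split the index
    rows = math.ceil(target_qps / qps_per_server)     # rows: replicate for QPS
    return shards, rows, shards * rows                # servers = shards x rows

print(cluster_size(200_000_000, 500))  # -> (4, 3, 12)
```

So 200 million documents at 500 QPS would, under these assumed capacities, need 4 shards replicated in 3 rows, i.e. 12 servers.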
  4. Solr scaling example (diagram slide)
  5. Replication
     – Goals:
       • Increase QPS capacity
       • High availability of search
     – Replication adds another "search row"
     – Done as a PULL from the slave
     – ReplicationHandler is configured in solrconfig.xml
     http://wiki.apache.org/solr/SolrReplication
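A minimal sketch of the ReplicationHandler configuration mentioned above, in solrconfig.xml (the master host name, poll interval and confFiles list are illustrative; see the SolrReplication wiki page for the full set of options):

```xml
<!-- On the master: publish a new index version after each commit -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- On the slave: pull from the master every 60 seconds -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/yp/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

Because replication is a pull, adding another search row is just a matter of starting a new slave pointing its masterUrl at the master.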
  6. Sharding
     – Goals:
       • Split an index too large for one box into smaller chunks
       • Lower HW footprint by smart partitioning of data
         – News search: one shard for the last month, one shard per year
       • Lower latency by having a smaller index per node
     – A shard is a core which participates in a collection
       • Shards A and B may thus be on different or the same host
       • Shards A and B should, but do not need to, share a schema
     – Shard distribution must be done by the client application, adding documents to the correct shard based on some policy
       • The most common policy is hash-based distribution
       • May also be date-based or whatever the client chooses
     – Work under way to add shard distribution natively to Solr, see SOLR-2358
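A sketch of the client-side, hash-based routing policy described above (the shard URLs are illustrative; "yp" matches the example core name used later in the talk):

```python
import hashlib

# Hypothetical list of shard update endpoints for the YP example.
SHARDS = ["http://localhost:8983/solr/yp",
          "http://localhost:7973/solr/yp"]

def shard_for(doc_id, num_shards):
    # md5 rather than Python's built-in hash(): stable across
    # processes and runs, which routing must be
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Every indexing client must apply the same policy, so that a given
# id always lands on (and is later updated/deleted on) the same shard.
target_url = SHARDS[shard_for("listing-4711", len(SHARDS))]
```

Date-based routing would replace `shard_for` with a lookup on the document's timestamp, e.g. "last month" versus "per year" for the news-search layout above.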
  7. Solr Cloud
     – Solr Cloud is the popular name for an initiative to make Solr more easily scalable and manageable in a distributed world
     – Enables centralized configuration and cluster status monitoring
     – Solr TRUNK contains the first features
       • Apache ZooKeeper support, including built-in ZK
       • Support for easy distrib=true queries (by means of ZK)
       • NOTE: Still experimental, work in progress
     – Expected features to come
       • Auto index shard distribution using ZK
       • Tools to manage the config in ZK
       • Easy addition of a row/shard through an API
     – NOTE: We do not know when SolrCloud will be included in a released version of Solr. If you need it, use TRUNK
     http://wiki.apache.org/solr/SolrCloud
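To make the distrib=true convenience concrete, here is a hypothetical helper contrasting the two styles of distributed query: SolrCloud's distrib=true (shards resolved via ZooKeeper) versus classic distributed search, where the caller must enumerate every shard in a shards= parameter:

```python
from urllib.parse import urlencode

def query_url(core_url, q, shards=None):
    """Build a Solr query URL (sketch; core_url is illustrative)."""
    params = {"q": q}
    if shards:                      # classic: caller lists every shard
        params["shards"] = ",".join(shards)
    else:                           # SolrCloud: let ZK resolve the shards
        params["distrib"] = "true"
    return core_url + "/select?" + urlencode(params)

print(query_url("http://localhost:8983/solr/yp", "pizza"))
# -> http://localhost:8983/solr/yp/select?q=pizza&distrib=true
```

The point of the ZK support is exactly that the second form goes away: the client no longer has to know the cluster topology.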
  8. Solr Cloud...
     – Setting up SolrCloud for our YP example
       • We'll set up a 4-node cluster on our laptops using four instances of Jetty, on different ports
       • We'll have 2 shards, each with one replica
       • We'll index 5000 listings to each shard
       • And finally do distributed queries
       • For convenience, we'll use the ZK shipping with Solr
     – Bootstrapping ZooKeeper to create a config "yp-conf"
       • java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=yp-conf -DzkRun -jar start.jar
     – Starting the other Jetty nodes
       • java -Djetty.port=<port> -DhostPort=<port> -DzkHost=localhost:9983 -jar start.jar
     – ZooKeeper admin
       • http://localhost:8983/solr/yp/admin/zookeeper.jsp
     http://wiki.apache.org/solr/SolrCloud
  9. Solr Cloud...
     – Solr Cloud will resolve all shards and replicas in a collection based on what is configured in solr.xml
     – Querying /solr/yp/select?q=foo&distrib=true on this core will cause SolrCloud to resolve the core name to "yp-cloud" and then distribute the request to each of the shards which are members of the same collection
     – Often, the core name and collection name will be the same
     – SolrCloud will load balance between replicas within the same shard
     http://wiki.apache.org/solr/SolrCloud
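On each node, the core's shard and collection membership could be declared in solr.xml roughly like this (attribute names follow the trunk-era SolrCloud wiki; treat this as a sketch, not a verified config, and adjust the shard value per node as in the 2x2 table on the next slide):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- This node hosts shard A of the yp-collection (illustrative) -->
    <core name="yp" instanceDir="." shard="A" collection="yp-collection"/>
  </cores>
</solr>
```

This is the information SolrCloud reads to map a core name to its collection and to find the sibling shards it must fan queries out to.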
  10. Solr Cloud, 2x2 setup
      – localhost:8983: Run ZK: localhost:9983; Core: yp; Shard: A (master); Collection: yp-collection
      – localhost:7973: Run ZK: no, -DzkHost=localhost:9983; Core: yp; Shard: B (master); Collection: yp-collection
      – localhost:6963: Run ZK: no, -DzkHost=localhost:9983; Core: yp; Shard: A (replica); Collection: yp-collection
      – localhost:5953: Run ZK: N/A, -DzkHost=localhost:9983; Core: yp; Shard: B (replica); Collection: yp-collection