Lucandra

Lucandra presentation by Jake Luciani

  1. Lucandra: Lucene + Cassandra. Jake Luciani, http://github/tjake/Lucandra, http://twitter.com/tjake
  2. What we'll cover today:
     • Search use-cases
     • Problems scaling and maintaining Lucene/Solr
     • Cassandra
     • Lucandra
     • Lucandra in Action
     • Q&A
  3. Types of search apps
  4. Types of search apps
  5. Lucene/Solr Scaling Problems
     • Writes are expensive on a live system
         • Merge, Reopen, Optimize, Sorting
     • "Too many open files"
     • Solr replication has too many moving parts
     • Scaling writes requires client-side sharding
     • Lots of grid management -> ZooKeeper?
     • Backups? Monitoring? Failures? Ops Team? Oh my!
     This sounds a lot like MySQL, doesn't it?
  6. Cassandra - Love Child of BigTable and Dynamo
     • Peer to peer (easy to add new nodes)
     • CAP Configurable
     • Multi-level TreeMap (sorta)
     • Pluggable replication/sorting
     • Writes are very fast!
     • Low latency
     • Integrates with Hadoop
     • Major adoption and development
  7. Cassandra's Data Model
       { "bloghost.com" : {                                    // Keyspace
           "Posts" : {                                         // ColumnFamily
             "tjake.bloghost.com" : {                          // Key
               "20100426-Lucandra" : "lucandra talk today!"    // Columns
             }
           },
           "Comments" : {                                      // SuperColumnFamily
             "tjake.bloghost.com" : {                          // Key
               "20100426-Lucandra-1" : {                       // SuperColumn
                 "From" : "Otis", "Comment" : "Don't Suck!"    // Columns
               },
               "20100426-Lucandra-2" : {                       // SuperColumn
                 "From" : "Jake", "Comment" : "O.K."           // Columns
               }
             }
           }
       } }
  8. Cassandra - Partitioning
  9. Cassandra - Scale Up / Scale Down
  10. Cassandra - Replication
  11. Solr/Lucene Components
  12. Lucandra Components
  13. How is an index stored?
       { "Lucandra" : {
           "Docs" : {
             "Index1/Doc1" : { "Field1" : "T1 T2 T1", ... },
             "Index1/Doc2" : { "Field1" : "T3 T1", ... }
           },
           "TermVectors" : {
             "Index1/Field1/T1" : { "Doc1" : [0, 2], "Doc2" : [1] },
             "Index1/Field1/T2" : { "Doc1" : [1] },
             "Index1/Field1/T3" : { "Doc2" : [0] }
           }
       } }
  14. Lucandra Deployed
  15. Lucandra In Action: Sparse.ly and Wikassandra
  16. Sparse.ly - Twitter search for friends only
     • ~4k indexes on 2 boxes
  17. Wikassandra - Search Wikipedia
     • 4 node cluster
     • 3k writes per sec (over Thrift from a single node)
     • Solr interface
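
The storage layout on slide 13 can be sketched with plain Python dictionaries standing in for Cassandra column families. This is a minimal illustration, not Lucandra's actual code: `build_index` and `search` are hypothetical helper names, tokenization is a simple whitespace split, and term positions are assumed to be 0-based token offsets (so T3 in "T3 T1" lands at position 0).

```python
from collections import defaultdict


def build_index(index_name, docs):
    """Build "Docs" and "TermVectors" row maps for one index.

    docs maps a document id to {field name: field text}.
    Row keys follow the slide's layout:
      Docs:        "<index>/<doc id>"        -> {field: text}
      TermVectors: "<index>/<field>/<term>"  -> {doc id: [positions]}
    (Hypothetical helper; plain dicts stand in for column families.)
    """
    docs_cf = {}
    term_vectors_cf = defaultdict(dict)
    for doc_id, fields in docs.items():
        docs_cf[f"{index_name}/{doc_id}"] = dict(fields)
        for field, text in fields.items():
            # Assumption: whitespace tokenization, 0-based positions.
            for position, term in enumerate(text.split()):
                row_key = f"{index_name}/{field}/{term}"
                term_vectors_cf[row_key].setdefault(doc_id, []).append(position)
    return docs_cf, dict(term_vectors_cf)


def search(term_vectors_cf, index_name, field, term):
    """Return the sorted ids of documents containing the term in the field."""
    row = term_vectors_cf.get(f"{index_name}/{field}/{term}", {})
    return sorted(row)


docs_cf, term_vectors_cf = build_index(
    "Index1",
    {
        "Doc1": {"Field1": "T1 T2 T1"},
        "Doc2": {"Field1": "T3 T1"},
    },
)
print(term_vectors_cf["Index1/Field1/T1"])
print(search(term_vectors_cf, "Index1", "Field1", "T1"))
```

Because every term gets its own row key, a term lookup is a single key read, and adding a document only appends columns to the affected term rows, which fits the fast-write behaviour described on slide 6.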