Lucandra

Lucandra presentation by Jake Luciani



  1. Lucandra: Lucene + Cassandra
     Jake Luciani
     http://github.com/tjake/Lucandra
     http://twitter.com/tjake
  2. What we'll cover today:
       • Search use-cases
       • Problems scaling and maintaining Lucene/Solr
       • Cassandra
       • Lucandra
       • Lucandra in Action
       • Q&A
  3. Types of search apps
  4. Types of search apps
  5. Lucene/Solr Scaling Problems
       • Writes are expensive on a live system
           • Merge, Reopen, Optimize, Sorting
       • "Too many open files"
       • Solr replication has too many moving parts
       • Scaling writes requires client-side sharding
       • Lots of grid management -> ZooKeeper?
       • Backups? Monitoring? Failures? Ops Team? Oh my!
     This sounds a lot like MySQL, doesn't it?
  6. Cassandra - Love Child of BigTable and Dynamo
       • Peer to peer (easy to add new nodes)
       • CAP configurable
       • Multi-level TreeMap (sorta)
       • Pluggable replication/sorting
       • Writes are very fast!
       • Low latency
       • Integrates with Hadoop
       • Major adoption and development
  7. Cassandra's Data Model
       { "bloghost.com" :                                          // Keyspace
         { "Posts" :                                               // ColumnFamily
             { "tjake.bloghost.com" :                              // Key
                 { "20100426-Lucandra" : "lucandra talk today!" }  // Columns
             },
           "Comments" :                                            // SuperColumnFamily
             { "tjake.bloghost.com" :                              // Key
                 { "20100426-Lucandra-1" :                         // SuperColumn
                     { "From" : "Otis", "Comment" : "Don't Suck!" },  // Columns
                   "20100426-Lucandra-2" :                         // SuperColumn
                     { "From" : "Jake", "Comment" : "O.K." }       // Columns
                 }
             }
         }
       }
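The data model on slide 7 can be read as maps nested inside maps. The following is a minimal sketch of that reading using plain Python dictionaries. It is an in-memory stand-in, not the Cassandra client API, and the helper `columns_with_prefix` is a hypothetical name introduced only for illustration. Sorting column names by plain string order is also an assumption here; the slides note that sorting is pluggable.

```python
# Hypothetical in-memory stand-in for the layout on slide 7; not the Cassandra API.
keyspace = {
    "Posts": {  # ColumnFamily: row key -> {column name: value}
        "tjake.bloghost.com": {
            "20100426-Lucandra": "lucandra talk today!",
        },
    },
    "Comments": {  # SuperColumnFamily: row key -> {super column -> {column: value}}
        "tjake.bloghost.com": {
            "20100426-Lucandra-1": {"From": "Otis", "Comment": "Don't Suck!"},
            "20100426-Lucandra-2": {"From": "Jake", "Comment": "O.K."},
        },
    },
}


def columns_with_prefix(row, prefix):
    """Return (name, value) pairs whose name starts with prefix, in sorted name order."""
    return [(name, row[name]) for name in sorted(row) if name.startswith(prefix)]


# Look up one post, then every comment attached to it via the shared name prefix.
post = keyspace["Posts"]["tjake.bloghost.com"]["20100426-Lucandra"]
comments = columns_with_prefix(
    keyspace["Comments"]["tjake.bloghost.com"], "20100426-Lucandra-"
)
authors = [columns["From"] for _, columns in comments]
```

Because related entries share a name prefix and names are kept in sorted order, fetching all comments for a post amounts to reading one contiguous slice of a single row.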
  8. Cassandra - Partitioning
  9. Cassandra - Scale Up / Scale Down
  10. Cassandra - Replication
  11. Solr/Lucene Components
  12. Lucandra Components
  13. How is an index stored?
       { "Lucandra" :
         { "Docs" :
             { "Index1/Doc1" : { "Field1" : "T1 T2 T1", ... },
               "Index1/Doc2" : { "Field1" : "T3 T1", ... }
             },
           "TermVectors" :
             { "Index1/Field1/T1" : { "Doc1" : [0, 2], "Doc2" : [1] },
               "Index1/Field1/T2" : { "Doc1" : [1] },
               "Index1/Field1/T3" : { "Doc2" : [0] }
             }
         }
       }
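To make the storage layout on slide 13 concrete, here is a minimal sketch that derives the "Docs" and "TermVectors" rows from raw documents. It assumes whitespace tokenization and zero-based token positions, so T3 in "T3 T1" lands at position 0. The functions `build_index` and `docs_matching` are hypothetical names for illustration, not part of Lucandra or Lucene.

```python
from collections import defaultdict


def build_index(index_name, docs):
    """Build the two row sets from docs shaped as {doc_id: {field: text}}.

    Returns (docs_cf, term_vectors_cf), keyed the way slide 13 shows:
    "Index/Doc" for stored documents and "Index/Field/Term" for postings.
    """
    docs_cf = {}
    term_vectors_cf = defaultdict(dict)
    for doc_id, fields in docs.items():
        docs_cf[f"{index_name}/{doc_id}"] = dict(fields)
        for field, text in fields.items():
            # Assumption: split on whitespace; a real analyzer would do more.
            for position, term in enumerate(text.split()):
                row_key = f"{index_name}/{field}/{term}"
                term_vectors_cf[row_key].setdefault(doc_id, []).append(position)
    return docs_cf, dict(term_vectors_cf)


def docs_matching(term_vectors_cf, index_name, field, term):
    """A term lookup is a single row read: the column names are the matching doc ids."""
    return sorted(term_vectors_cf.get(f"{index_name}/{field}/{term}", {}))


docs_cf, term_vectors = build_index(
    "Index1",
    {"Doc1": {"Field1": "T1 T2 T1"}, "Doc2": {"Field1": "T3 T1"}},
)
hits = docs_matching(term_vectors, "Index1", "Field1", "T1")
```

Keying each posting list by index, field, and term means a single-term query touches exactly one row, which fits the key/value access pattern described in the Cassandra slides.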
  14. Lucandra Deployed
  15. Lucandra In Action: Sparse.ly and Wikassandra
  16. sparse.ly - Twitter search for friends only
       • ~4k indexes on 2 boxes
  17. Wikassandra - Search Wikipedia
       • 4-node cluster
       • 3k writes per sec (over Thrift from a single node)
       • Solr interface
