Lucandra

Lucandra presentation by Jake Luciani


    1. Lucandra: Lucene + Cassandra
       Jake Luciani
       http://github.com/tjake/Lucandra
       http://twitter.com/tjake
    2. What we'll cover today:
       • Search use-cases
       • Problems scaling and maintaining Lucene/Solr
       • Cassandra
       • Lucandra
       • Lucandra in Action
       • Q&A
    3. Types of search apps
    4. Types of search apps
    5. Lucene/Solr Scaling Problems
       • Writes are expensive on a live system
           • Merge, Reopen, Optimize, Sorting
       • "Too many open files"
       • Solr replication: too many moving parts
       • Scaling writes requires client-side sharding
       • Lots of grid management -> ZooKeeper?
       • Backups? Monitoring? Failures? Ops Team? Oh my!
       This sounds a lot like MySQL, doesn't it?
    6. Cassandra - Love Child of BigTable and Dynamo
       • Peer to peer (easy to add new nodes)
       • CAP Configurable
       • Multi-level TreeMap (sorta)
       • Pluggable replication/sorting
       • Writes are very fast!
       • Low latency
       • Integrates with Hadoop
       • Major adoption and development
    7. Cassandra's Data Model
       { "bloghost.com" : {                                  // Keyspace
           "Posts" : {                                       // ColumnFamily
             "tjake.bloghost.com" : {                        // Key
               "20100426-Lucandra" : "lucandra talk today!"  // Columns
             }
           },
           "Comments" : {                                    // SuperColumnFamily
             "tjake.bloghost.com" : {                        // Key
               "20100426-Lucandra-1" : {                     // SuperColumn
                 "From" : "Otis", "Comment" : "Don't Suck!"  // Columns
               },
               "20100426-Lucandra-2" : {                     // SuperColumn
                 "From" : "Jake", "Comment" : "O.K."         // Columns
               }
             }
           }
       }}
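
The layout on slide 7 can be mimicked with plain nested dictionaries, which may help make the "multi-level TreeMap (sorta)" description concrete. This is only an in-memory sketch and not the Cassandra client API: the `keyspace` variable and the `get_columns` and `column_slice` helpers are hypothetical names introduced here, and sorting by column name stands in for Cassandra's pluggable column ordering.

```python
# Hypothetical in-memory model of the slide's layout; this is NOT the Cassandra API.
keyspace = {
    "Posts": {  # ColumnFamily: row key -> {column name: value}
        "tjake.bloghost.com": {
            "20100426-Lucandra": "lucandra talk today!",
        },
    },
    "Comments": {  # SuperColumnFamily: row key -> {super column -> {column: value}}
        "tjake.bloghost.com": {
            "20100426-Lucandra-2": {"From": "Jake", "Comment": "O.K."},
            "20100426-Lucandra-1": {"From": "Otis", "Comment": "Don't Suck!"},
        },
    },
}


def get_columns(column_family, row_key):
    """Return a row's columns (or super columns) as (name, value) pairs sorted by name."""
    row = keyspace[column_family].get(row_key, {})
    return sorted(row.items())


def column_slice(column_family, row_key, start, end):
    """Return the columns whose names fall in the inclusive range [start, end]."""
    return [
        (name, value)
        for name, value in get_columns(column_family, row_key)
        if start <= name <= end
    ]


# All comments on the 2010-04-26 post, in column-name order.
comments = column_slice(
    "Comments", "tjake.bloghost.com", "20100426-Lucandra-1", "20100426-Lucandra-2"
)
for name, columns in comments:
    print(name, columns["From"], columns["Comment"])
```

Because column names within a row are kept in sorted order, a date-prefixed name such as `20100426-Lucandra-1` lets a range read pull back related entries together, which is the property the index layout later in the deck relies on.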
    8. Cassandra - Partitioning
    9. Cassandra - Scale Up / Scale Down
    10. Cassandra - Replication
    11. Solr/Lucene Components
    12. Lucandra Components
    13. How is an index stored?
       { "Lucandra" : {
           "Docs" : {
             "Index1/Doc1" : { "Field1" : "T1 T2 T1", ... },
             "Index1/Doc2" : { "Field1" : "T3 T1", ... }
           },
           "TermVectors" : {
             "Index1/Field1/T1" : { "Doc1" : [0, 2], "Doc2" : [1] },
             "Index1/Field1/T2" : { "Doc1" : [1] },
             "Index1/Field1/T3" : { "Doc2" : [0] }
           }
       }}
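
As a rough illustration of the layout on slide 13, the sketch below builds the same two structures in memory: a `Docs` map keyed by `index/doc` and a `TermVectors` map keyed by `index/field/term` whose columns are document ids holding zero-based token positions. It is not Lucandra's actual code; the `index_document` and `search_term` functions, the `store` dictionary, and the whitespace tokenizer are assumptions made for this example.

```python
from collections import defaultdict


def index_document(store, index, doc_id, fields):
    """Store a document and add each of its terms to the term-vector rows."""
    store["Docs"][f"{index}/{doc_id}"] = dict(fields)
    for field, text in fields.items():
        # Assumption: whitespace tokenization stands in for a real Lucene analyzer.
        for position, term in enumerate(text.split()):
            row_key = f"{index}/{field}/{term}"
            store["TermVectors"][row_key].setdefault(doc_id, []).append(position)


def search_term(store, index, field, term):
    """Return the ids of documents containing the term, read from a single row."""
    row = store["TermVectors"].get(f"{index}/{field}/{term}", {})
    return sorted(row)


store = {"Docs": {}, "TermVectors": defaultdict(dict)}
index_document(store, "Index1", "Doc1", {"Field1": "T1 T2 T1"})
index_document(store, "Index1", "Doc2", {"Field1": "T3 T1"})

print(store["TermVectors"]["Index1/Field1/T1"])
print(search_term(store, "Index1", "Field1", "T2"))
```

A term lookup touches exactly one row key, so in this sketch a single-term query costs one row read regardless of how many documents the index holds.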
    14. Lucandra Deployed
    15. Lucandra In Action: Sparse.ly and Wikassandra
    16. sparse.ly - twitter search for friends only
       • ~4k Indexes on 2 boxes
    17. Wikassandra - Search wikipedia
       • 4 node cluster
       • 3k writes per sec (over thrift from single node)
       • Solr interface
