Lucandra
Lucandra presentation by Jake Luciani

  • Transcript

    • 1. Lucandra: Lucene + Cassandra. Jake Luciani, http://github.com/tjake/Lucandra, http://twitter.com/tjake
    • 2. What we'll cover today:
        • Search use-cases
        • Problems scaling and maintaining Lucene/Solr
        • Cassandra
        • Lucandra
        • Lucandra in Action 
        • Q&A
    • 3. Types of search apps:
    • 4. Types of search apps:
    • 5. Lucene/Solr Scaling Problems
        • Writes are expensive on a live system
          • Merge, Reopen, Optimize, Sorting
        • "Too many open files"
        • Solr replication has too many moving parts
        • Scaling writes requires client-side sharding
        • Lots of grid management -> ZooKeeper?
        • Backups? Monitoring? Failures? Ops Team? Oh my!
      • This sounds a lot like MySQL, doesn't it?
    • 6. Cassandra - Love Child of BigTable and Dynamo
        • Peer to peer (easy to add new nodes)
        • CAP Configurable
        • Multi-level TreeMap (sorta)
        • Pluggable replication/sorting
        • Writes are very fast!
        • Low latency 
        • Integrates with Hadoop 
        • Major adoption and development
    • 7. Cassandra's Data Model
      • { "bloghost.com" : {                                   // Keyspace
      •     "Posts" : {                                        // ColumnFamily
      •         "tjake.bloghost.com" : {                       // Key
      •             "20100426-Lucandra" : "lucandra talk today!"   // Column
      •         }
      •     },
      •     "Comments" : {                                     // SuperColumnFamily
      •         "tjake.bloghost.com" : {                       // Key
      •             "20100426-Lucandra-1" :                    // SuperColumn
      •                 { "From" : "Otis", "Comment" : "Don't Suck!" },   // Columns
      •             "20100426-Lucandra-2" :                    // SuperColumn
      •                 { "From" : "Jake", "Comment" : "O.K." }           // Columns
      •         }
      •     }
      • }}
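
The nesting on this slide can be sketched as plain nested maps. The snippet below is only an illustration of the "multi-level TreeMap (sorta)" idea in Python, not Cassandra's client API: `keyspace` mirrors the slide's example data, and `slice_columns` is a hypothetical helper that imitates reading a sorted range of column names from one row (here with a half-open range, which is an assumption of this sketch).

```python
from bisect import bisect_left

# ColumnFamily -> row key -> {column name: value}, as on the slide.
# In the "Comments" super column family each column holds one more map.
keyspace = {
    "Posts": {
        "tjake.bloghost.com": {
            "20100426-Lucandra": "lucandra talk today!",
        },
    },
    "Comments": {
        "tjake.bloghost.com": {
            "20100426-Lucandra-2": {"From": "Jake", "Comment": "O.K."},
            "20100426-Lucandra-1": {"From": "Otis", "Comment": "Don't Suck!"},
        },
    },
}


def slice_columns(row, start, finish):
    """Hypothetical helper: (name, value) pairs with start <= name < finish,
    returned in sorted column-name order."""
    names = sorted(row)
    lo = bisect_left(names, start)
    hi = bisect_left(names, finish)
    return [(name, row[name]) for name in names[lo:hi]]


# All comments on the 2010-04-26 post: every super column sharing its prefix.
comments = slice_columns(
    keyspace["Comments"]["tjake.bloghost.com"],
    "20100426-Lucandra-",
    "20100426-Lucandra-~",  # "~" sorts after the digits used as suffixes
)
```

Because column names are kept in sorted order, a prefix such as the post date plus title turns "all comments for this post" into a single contiguous range read.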
    • 8. Cassandra - Partitioning
    • 9. Cassandra - Scale Up / Scale Down
    • 10. Cassandra - Replication
    • 11. Solr/Lucene Components
    • 12. Lucandra Components
    • 13. How is an index stored?
      • { "Lucandra" : {
      •     "Docs" : {
      •         "Index1/Doc1" : { "Field1" : "T1 T2 T1", ... },
      •         "Index1/Doc2" : { "Field1" : "T3 T1", ... }
      •     },
      •     "TermVectors" : {
      •         "Index1/Field1/T1" : { "Doc1" : [0, 2], "Doc2" : [1] },
      •         "Index1/Field1/T2" : { "Doc1" : [1] },
      •         "Index1/Field1/T3" : { "Doc2" : [0] }
      •     }
      • }}
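
To make the layout on this slide concrete, here is a minimal sketch that builds the same two column families from raw documents. It assumes whitespace tokenization and zero-based token positions (so T3 in "T3 T1" gets position 0); `build_index` and `docs_matching` are hypothetical names for illustration and are not Lucandra's actual classes or methods.

```python
from collections import defaultdict


def build_index(index_name, field, docs):
    """Build the "Docs" and "TermVectors" rows for one field.

    docs maps a document id to the raw text of the field.
    """
    doc_rows = {}
    term_rows = defaultdict(dict)
    for doc_id, text in docs.items():
        # One row per document, keyed by "<index>/<doc id>".
        doc_rows[f"{index_name}/{doc_id}"] = {field: text}
        for position, term in enumerate(text.split()):
            # One row per term, keyed by "<index>/<field>/<term>";
            # each column is a doc id holding that term's positions.
            row_key = f"{index_name}/{field}/{term}"
            term_rows[row_key].setdefault(doc_id, []).append(position)
    return {"Docs": doc_rows, "TermVectors": dict(term_rows)}


def docs_matching(index, index_name, field, term):
    """A term query is a single row read: the column names are the hits."""
    row = index["TermVectors"].get(f"{index_name}/{field}/{term}", {})
    return sorted(row)


index = build_index("Index1", "Field1", {"Doc1": "T1 T2 T1", "Doc2": "T3 T1"})
hits = docs_matching(index, "Index1", "Field1", "T1")
```

Keying term rows by index, field, and term means that a lookup for one term touches exactly one row, which is the property that lets the index be spread across nodes by row key.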
    • 14. Lucandra Deployed
    • 15. Lucandra in Action: sparse.ly and Wikassandra
    • 16. sparse.ly - Twitter search for friends only
        • ~4k Indexes on 2 boxes
    • 17. Wikassandra - Search Wikipedia
        • 4 node cluster
        • 3k writes per sec (over Thrift from a single node)
        • Solr interface