Lucandra

Lucandra presentation by Jake Luciani


© All Rights Reserved


Lucandra Presentation Transcript

  • Lucandra: Lucene + Cassandra. Jake Luciani. http://github.com/tjake/Lucandra http://twitter.com/tjake
  • What we'll cover today:
      • Search use-cases
      • Problems scaling and maintaining Lucene/Solr
      • Cassandra
      • Lucandra
      • Lucandra in Action 
      • Q&A
  • Types of search apps: [figure]
  • Lucene/Solr Scaling Problems
      • Writes are expensive on a live system
        • Merge, Reopen, Optimize, Sorting
      • "Too many open files"
      • Solr replication has too many moving parts
      • Scaling writes requires client side sharding
      • Lots of grid management -> ZooKeeper?
      • Backups? Monitoring? Failures? Ops Team? Oh my!
    • This sounds a lot like MySQL, doesn't it?
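To make the "client side sharding" bullet concrete, here is a minimal sketch of hash-modulo routing done by the client. It is an illustration only: the shard URLs and the `shard_for` helper are hypothetical and not taken from the talk or from Solr.

```python
import hashlib

def shard_for(doc_id, shards):
    """Pick a shard by hashing the document id (hypothetical routing scheme)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]

# Hypothetical shard endpoints; the client has to know all of them.
shards = [
    "http://solr1:8983/solr",
    "http://solr2:8983/solr",
    "http://solr3:8983/solr",
]

doc_ids = [f"doc-{i}" for i in range(1000)]
before = {d: shard_for(d, shards) for d in doc_ids}

# Adding a fourth shard changes the modulus, so most ids map to a new home.
grown = shards + ["http://solr4:8983/solr"]
moved = sum(1 for d in doc_ids if shard_for(d, grown) != before[d])
```

With plain modulo routing, growing the cluster forces the client to re-route (and re-index) a large share of documents, which is one reason the grid management burden listed above grows with the cluster.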
  • Cassandra - Love Child of BigTable and Dynamo
      • Peer to peer (easy to add new nodes)
      • CAP Configurable
      • Multi-level TreeMap (sorta)
      • Pluggable replication/sorting
      • Writes are very fast!
      • Low latency 
      • Integrates with Hadoop 
      • Major adoption and development
  • Cassandra's Data Model
    • { "bloghost.com" : {                                        // Keyspace
    •     "Posts" : {                                             // ColumnFamily
    •         "tjake.bloghost.com" : {                            // Key
    •             "20100426-Lucandra" : "lucandra talk today!"    // Column
    •         }
    •     },
    •     "Comments" : {                                          // SuperColumnFamily
    •         "tjake.bloghost.com" : {                            // Key
    •             "20100426-Lucandra-1" : {                       // SuperColumn
    •                 "From" : "Otis", "Comment" : "Don't Suck!"  // Columns
    •             },
    •             "20100426-Lucandra-2" : {                       // SuperColumn
    •                 "From" : "Jake", "Comment" : "O.K."         // Columns
    •             }
    •         }
    •     }
    • }}
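The nesting on this slide can be mimicked with plain dictionaries. The sketch below is a hypothetical in-memory model, not Cassandra's client API; `get_slice` is an illustrative helper that returns one row's columns in sorted name order, which is the "multi-level TreeMap (sorta)" idea from the earlier slide.

```python
# Hypothetical in-memory model of the slide's example; not Cassandra's API.
keyspace = {
    "Posts": {  # ColumnFamily: row key -> {column name: value}
        "tjake.bloghost.com": {
            "20100426-Lucandra": "lucandra talk today!",
        },
    },
    "Comments": {  # SuperColumnFamily: row key -> {super column: {column: value}}
        "tjake.bloghost.com": {
            "20100426-Lucandra-2": {"From": "Jake", "Comment": "O.K."},
            "20100426-Lucandra-1": {"From": "Otis", "Comment": "Don't Suck!"},
        },
    },
}

def get_slice(column_family, row_key, start, finish):
    """Return (name, value) pairs of one row with start <= name <= finish, sorted by name."""
    row = column_family.get(row_key, {})
    return [(name, row[name]) for name in sorted(row) if start <= name <= finish]

comments = get_slice(
    keyspace["Comments"],
    "tjake.bloghost.com",
    "20100426-Lucandra-1",
    "20100426-Lucandra-2",
)
```

Because column names sort lexically, prefixing them with a date and a sequence number lets a single range read return all comments for one post in order.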
  • Cassandra - Partitioning
  • Cassandra - Scale Up / Scale Down
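The partitioning and scaling slides are figures, so the transcript carries no detail. As a rough illustration of the general idea of hash partitioning on a token ring (an assumption about what the figures show, and not Cassandra's actual code), each node owns a token and a row key goes to the first node whose token is at or after the key's hash. The `TokenRing` class and the node names are hypothetical.

```python
import bisect
import hashlib

class TokenRing:
    """Toy token ring: one token per node, derived here from the node name."""

    def __init__(self, nodes):
        self._ring = sorted((self._token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    @staticmethod
    def _token(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, row_key):
        # First node whose token is >= the key's token, wrapping past the end.
        i = bisect.bisect_left(self._tokens, self._token(row_key))
        return self._ring[i % len(self._ring)][1]

keys = [f"Index1/Doc{i}" for i in range(500)]
small = TokenRing(["node-a", "node-b", "node-c"])
large = TokenRing(["node-a", "node-b", "node-c", "node-d"])

# Keys either stay where they were or move to the newly added node.
moved = [k for k in keys if small.node_for(k) != large.node_for(k)]
```

Unlike modulo routing, adding a node to the ring only takes over one contiguous token range, so the rest of the data stays put.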
  • Cassandra - Replication
  • Solr/Lucene Components
  • Lucandra Components
  • How is an index stored?
    • { "Lucandra" : {
    •     "Docs" : {
    •         "Index1/Doc1" : { "Field1" : "T1 T2 T1", ... },
    •         "Index1/Doc2" : { "Field1" : "T3 T1", ... }
    •     },
    •     "TermVectors" : {
    •         "Index1/Field1/T1" : { "Doc1" : [0, 2], "Doc2" : [1] },
    •         "Index1/Field1/T2" : { "Doc1" : [1] },
    •         "Index1/Field1/T3" : { "Doc2" : [0] }
    •     }
    • }}
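The layout on this slide can be reproduced with a small sketch: one map for stored documents and one for term vectors, keyed by "index/field/term" with per-document position lists. This is a hypothetical in-memory stand-in, not Lucandra's code; `index_document` and `term_query` are illustrative names, and whitespace splitting stands in for a real Lucene analyzer.

```python
from collections import defaultdict

def index_document(store, index, doc_id, fields):
    """Write one document into the hypothetical 'Docs' and 'TermVectors' maps."""
    store["Docs"][f"{index}/{doc_id}"] = dict(fields)
    for field, text in fields.items():
        # Whitespace tokenization; positions are zero-based token offsets.
        for position, term in enumerate(text.split()):
            row_key = f"{index}/{field}/{term}"
            store["TermVectors"][row_key].setdefault(doc_id, []).append(position)

def term_query(store, index, field, term):
    """Look up one term row: {doc id: [positions]} (empty if the term is unknown)."""
    return dict(store["TermVectors"].get(f"{index}/{field}/{term}", {}))

store = {"Docs": {}, "TermVectors": defaultdict(dict)}
index_document(store, "Index1", "Doc1", {"Field1": "T1 T2 T1"})
index_document(store, "Index1", "Doc2", {"Field1": "T3 T1"})

hits = term_query(store, "Index1", "Field1", "T1")
```

A term lookup is a single row read, and the position lists are what a phrase query would need. With zero-based positions, T3 is the first token of Doc2, so its position is 0.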
  • Lucandra Deployed
  • Lucandra in Action: sparse.ly and Wikassandra
  • sparse.ly: Twitter search for friends only
      • ~4k indexes on 2 boxes
  • Wikassandra: search Wikipedia
      • 4 node cluster
      • 3k writes per sec (over Thrift from a single node)
      • Solr interface