NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10
My Cassandra talk at the STS 3-10-10 Meeting

NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10 Presentation Transcript

  • 1. A highly scalable, eventually consistent, distributed, structured key-value store. How we use it @ Frugal Mechanic Eric Peters (@ericpeters) STS 3-10-10
  • 2. About
    • Search 2.5M unique Auto Parts, fitting 250M car configurations for over 100 retailers
    • Data Data Data, We Need To:
    • Quickly Process More than 50 Feeds & Data Sources
    • Support 10M source-specific SKUs
    • Handle 300M SKU Part Fitments
    • Be flexible with new columns
    • Store and persist raw data before we cherry pick which data to use
  • 3. Who Uses Cassandra?
  • 4. Cassandra Design Goals
    • High availability
    • Eventual consistency
      • trade off strong consistency in favor of high availability
    • Incremental scalability
    • Optimistic Replication
    • “Knobs” to tune tradeoffs between consistency, durability and latency
    • Low total cost of ownership
    • Minimal administration
    Slide “borrowed” from Avinash Lakshman
  • 5. Cassandra write properties
    • No reads
    • No seeks
    • Fast
    • Atomic within ColumnFamily
    • Always writable
    Slide “borrowed” from Avinash Lakshman
  • 6. Cassandra read properties
    • Read multiple SSTables
    • Slower than writes (but still fast)
    • Seeks can be mitigated with more RAM
    • Scales to billions of rows
    Slide “borrowed” from Avinash Lakshman
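
The write and read properties on slides 5 and 6 both come from a log-structured layout: a write only touches an in-memory table that is later flushed as an immutable SSTable, while a read may have to look in several of those tables and keep the newest version. The toy below is a sketch of that idea only, not Cassandra's actual code; `ToyStore` and `flush_threshold` are hypothetical names, and it leaves out the commit log, compaction and replication.

```python
class ToyStore:
    """Toy log-structured store: one in-memory memtable plus immutable flushed tables."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # hypothetical knob for this sketch
        self.memtable = {}   # key -> (timestamp, value)
        self.sstables = []   # flushed snapshots, never modified after the flush

    def write(self, key, value, timestamp):
        # A write never looks at the flushed tables: it only updates the memtable.
        self.memtable[key] = (timestamp, value)
        if len(self.memtable) >= self.flush_threshold:
            self.sstables.append(dict(self.memtable))  # "flush" to an immutable table
            self.memtable = {}

    def read(self, key):
        # A read may consult the memtable and every flushed table,
        # then keeps the version with the highest timestamp.
        tables = [self.memtable, *self.sstables]
        candidates = [table[key] for table in tables if key in table]
        if not candidates:
            return None
        return max(candidates, key=lambda versioned: versioned[0])[1]


store = ToyStore(flush_threshold=2)
store.write("a", "v1", timestamp=1)
store.write("b", "x", timestamp=2)   # second key triggers a flush
store.write("a", "v2", timestamp=3)  # newer version lives only in the memtable
latest = store.read("a")             # merges memtable and flushed table
```

The more flushed tables a key is spread across, the more places a read has to check, which is why reads cost more than writes in this design.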
  • 7. MySQL Comparison
    • MySQL, > 50 GB of data
      • Writes average: ~300 ms
      • Reads average: ~350 ms
    • Cassandra, > 50 GB of data
      • Writes average: 0.12 ms
      • Reads average: 15 ms
    Slide “borrowed” from Avinash Lakshman
  • 8. ColumnFamilies
    • { // this is a column
    • name: "emailAddress",
    • value: "eric@example.com",
    • timestamp: 123456789
    • }
    Examples from http://www.slideshare.net/jbellis/cassandra-open-source-bigtable-dynamo
  • 9. Super ColumnFamilies
    • { // this is a SuperColumn
    • name: "homeAddress",
    • // with an infinite list of Columns
    • value: {
    • // note the key is the name of the Column
    • street: {name: "street", value: "1234 x street", timestamp: 123456789},
    • city: {name: "city", value: "san francisco", timestamp: 123456789},
    • zip: {name: "zip", value: "94107", timestamp: 123456789},
    • }
    • }
    Examples from http://www.slideshare.net/jbellis/cassandra-open-source-bigtable-dynamo
  • 10. JSON Column Example
    • UserProfile = { // this is a ColumnFamily
    • phatduckk: { // this is the key to this Row inside the CF
    • // now we have an infinite # of columns in this row
    • username: "phatduckk",
    • email: "phatduckk@example.com",
    • phone: "(900) 976-6666"
    • }, // end row
    • ieure: { // this is the key to another row in the CF
    • // now we have another infinite # of columns in this row
    • username: "ieure",
    • email: "ieure@example.com",
    • phone: "(888) 555-1212",
    • age: "66",
    • gender: "undecided"
    • },
    • }
    Examples from: http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf
  • 11. JSON Super Column Example
    • AddressBook = { // this is a ColumnFamily of type Super
    • phatduckk: { // this is the key to this row inside the Super CF
    • // the key here is the name of the owner of the address book
    • // now we have an infinite # of super columns in this row
    • // the keys inside the row are the names for the SuperColumns
    • // each of these SuperColumns is an address book entry
    • friend1: {street: "8th street", zip: "90210", city: "Beverly Hills", state: "CA"},
    • // this is the address book entry for John in phatduckk's address book
    • John: {street: "Howard street", zip: "94404", city: "FC", state: "CA"},
    • Kim: {street: "X street", zip: "87876", city: "Balls", state: "VA"},
    • // we can have an infinite # of SuperColumns (aka address book entries)
    • }, // end row
    • ieure: { // this is the key to another row in the Super CF
    • // all the address book entries for ieure
    • joey: {street: "A ave", zip: "55485", city: "Hell", state: "NV"},
    • William: {street: "Armpit Dr", zip: "93301", city: "Bakersfield", state: "CA"},
    • },
    • }
    Examples from: http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf
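
One way to keep the two shapes apart is to count the levels of nesting: a ColumnFamily is row key → column name → value, and a Super ColumnFamily adds one more level for the SuperColumn name. The sketch below mirrors the two JSON examples with plain Python dicts; the helper names `get_column` and `get_subcolumn` are made up for illustration and are not part of any Cassandra client, and timestamps are left out just as in the slides.

```python
# Column family: row key -> {column name -> value}
user_profile = {
    "phatduckk": {"username": "phatduckk", "email": "phatduckk@example.com"},
    "ieure": {"username": "ieure", "email": "ieure@example.com", "age": "66"},
}

# Super column family: row key -> {super column name -> {column name -> value}}
address_book = {
    "phatduckk": {
        "John": {"street": "Howard street", "zip": "94404", "city": "FC", "state": "CA"},
        "Kim": {"street": "X street", "zip": "87876", "city": "Balls", "state": "VA"},
    },
}


def get_column(cf, row_key, column):
    """Two-level lookup in a plain column family; None if anything is missing."""
    return cf.get(row_key, {}).get(column)


def get_subcolumn(super_cf, row_key, super_column, column):
    """Three-level lookup in a super column family; None if anything is missing."""
    return super_cf.get(row_key, {}).get(super_column, {}).get(column)


ieure_age = get_column(user_profile, "ieure", "age")
john_zip = get_subcolumn(address_book, "phatduckk", "John", "zip")
```

Rows do not have to share the same columns: `phatduckk` has no `age` column, so that lookup simply returns `None`.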
  • 12. Why Cassandra?
    • It offers column-oriented data storage, so you have a bit more structure than plain key/value stores.
    • Fast! Writes average 0.12 ms, so 300M writes take about 10 hours
    • Written in Java, and an Apache Software Foundation project
    • People smarter than us are using it to solve even bigger problems than ours; if they can scale it, so can we
  • 13. Frugal Mechanic’s Data
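
The 10-hour figure is the 0.12 ms average write latency multiplied by the 300M fitments from slide 2. A quick check, assuming the writes run one after another with no batching or parallelism (an assumption for this sketch, not something the slides state); `total_hours` is a made-up helper name:

```python
def total_hours(operations, avg_latency_ms):
    """Hours needed to run `operations` sequentially at a given average latency."""
    return operations * avg_latency_ms / 1000 / 3600


# Latencies are the averages quoted on the MySQL comparison slide.
cassandra_hours = total_hours(300_000_000, 0.12)  # roughly 10 hours
mysql_hours = total_hours(300_000_000, 300)       # roughly 25,000 hours
```

At the ~300 ms MySQL write average quoted on slide 7, the same load would take about 25,000 hours, which is close to three years.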
  • 14. Modeling Part Information
    • cassandra> get FrugalMechanic.RawParts['amazon/b0002jmuwk']
    • => (column=thumb_image_url, value=http://ecx.images-amazon.com/images/I/31JQJNSAV6L._SL75_.jpg, timestamp=1264701499346)
    • => (column=sku/FA1632, value=manufacturer, timestamp=1264701499346)
    • => (column=sku/B0002JMUWK, value=asin, timestamp=1266339317829)
    • => (column=name, value=Motorcraft FA1632 Air Filter, timestamp=1264701499346)
    • => (column=large_image_url, value=http://ecx.images-amazon.com/images/I/31JQJNSAV6L.jpg, timestamp=1264701499346)
    • => (column=description, value=Motorcraft Air Filter is designed to filter outside air that enters the vehicle. It is manufactured from leak proof polyurethane seals. This air filter chemically treats dry type cleaner elements to withstand damage from oil and moisture. It is resistant to temperature extremes and has a 98.5% efficiency standard. This air filter is treated to enhance capacity and efficiency as well as facilitates hassle free installation. It is backed by a 12 month warranty.<br/><br/>
    • Features:<br />
    • <ul>
    • <li>Efficiently filters outside air</li>
    • <li>Withstands damage from oil and moisture</li>
    • <li>Easy to install</li>
    • <li>12 months warranty</li>
    • <li>Leak proof</li>
    • </ul>
    • , timestamp=1264701499346)
    • => (column=category/name, value=Automotive / Categories / Replacement Parts / Filters / Air Filters & Accessories / Air Filters, timestamp=1264701499346)
    • => (column=category/browsenodeid, value=15727081,15727321, timestamp=1264701499346)
    • => (column=cassandra_write_date, value=2010-02-16T16:55:17.809Z, timestamp=1266339317813)
    • => (column=brand/name, value=Motorcraft, timestamp=1264701499346)
    • => (column=brand/manufacturer, value=Motorcraft, timestamp=1264701499346)
    • Returned 11 results.
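
The column names in this row look namespaced with a slash (`sku/...`, `brand/...`, `category/...`), and for the `sku/` columns the SKU itself sits in the column name while the value holds its type. That convention is inferred from the listing above, not documented in the talk. Assuming the row has already been fetched into a plain dict of column name to value, a sketch of regrouping it by prefix could look like this (`group_by_prefix` is a hypothetical helper):

```python
from collections import defaultdict


def group_by_prefix(columns):
    """Split 'prefix/rest' column names into {prefix: {rest: value}}.

    Names without a slash are collected under the empty-string prefix.
    """
    grouped = defaultdict(dict)
    for name, value in columns.items():
        prefix, separator, rest = name.partition("/")
        if separator:
            grouped[prefix][rest] = value
        else:
            grouped[""][name] = value
    return dict(grouped)


# A few columns copied from the listing above.
row = {
    "name": "Motorcraft FA1632 Air Filter",
    "sku/FA1632": "manufacturer",
    "sku/B0002JMUWK": "asin",
    "brand/name": "Motorcraft",
    "brand/manufacturer": "Motorcraft",
}
grouped = group_by_prefix(row)
```

Because new prefixes need no schema change, a new feed can add columns without touching existing rows, which matches the "be flexible with new columns" requirement from slide 2.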
  • 15. Modeling Part Prices
    • cassandra> get FrugalMechanic.RawPartPrices['amazon/b0002jmuwk']
    • => (column=ATVPDKIKX0DER, value={"site":"ATVPDKIKX0DER","price":"13.45","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"ATVPDKIKX0DER","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
    • => (column=AOMQHH38LHK76, value={"site":"AOMQHH38LHK76","price":"8.64","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"AOMQHH38LHK76","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
    • => (column=ADG953YR6NRBF, value={"site":"ADG953YR6NRBF","price":"16.5","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"ADG953YR6NRBF","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
    • => (column=A8F3HAQ1FDLH8, value={"site":"A8F3HAQ1FDLH8","price":"12.94","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A8F3HAQ1FDLH8","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
    • => (column=A3TW1WCPSO49LP, value={"site":"A3TW1WCPSO49LP","price":"19.72","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A3TW1WCPSO49LP","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
    • => (column=A3NMYM0J8WG63N, value={"price":"$18.14","buyUrlVar2":"A3NMYM0J8WG63N","buyUrlVar1":"B0002JMUWK"}, timestamp=1260473908931)
    • => (column=A1DPIC5NQU31S0, value={"site":"A1DPIC5NQU31S0","price":"16.13","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A1DPIC5NQU31S0","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
    • => (column=A1ATZ3MAARQNEF, value={"site":"A1ATZ3MAARQNEF","price":"16.0","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A1ATZ3MAARQNEF","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
    • Returned 8 results.
    • cassandra>
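
Each price column is keyed by a site ID and carries a JSON string as its value, so any consumer has to decode the value before comparing prices. The raw data is not uniform: the older `A3NMYM0J8WG63N` entry has a `$` prefix on the price and no `site` or `updatedOn` field. A sketch of picking the lowest price from such a row, assuming the columns are already in a dict and using abbreviated values; `parse_price` and `cheapest` are hypothetical names:

```python
import json
from decimal import Decimal


def parse_price(raw_value):
    """Decode one price column value and return the price as a Decimal."""
    record = json.loads(raw_value)
    # Older records in the listing carry a leading "$"; newer ones do not.
    return Decimal(record["price"].lstrip("$"))


def cheapest(price_columns):
    """Return (site_id, price) for the lowest price in a row."""
    prices = {site: parse_price(value) for site, value in price_columns.items()}
    site = min(prices, key=prices.get)
    return site, prices[site]


# Values shortened from the listing above to the fields this sketch needs.
row = {
    "ATVPDKIKX0DER": '{"site":"ATVPDKIKX0DER","price":"13.45"}',
    "AOMQHH38LHK76": '{"site":"AOMQHH38LHK76","price":"8.64"}',
    "A3NMYM0J8WG63N": '{"price":"$18.14"}',
}
best_site, best_price = cheapest(row)
```

`Decimal` is used instead of `float` so that prices such as `16.5` and `16.0` compare exactly as written.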
  • 16. Modeling Part Fitments
    • cassandra> get FrugalMechanic.RawPartFitments['amazon/b0002jmuwk']
    • => (column={"year":"2009","make":"Ford","model":"F-150","engine":"5.4L V8","notes":"TYPE: 269 - HEIGHT: 7.81 - OUTSIDE: 4.26B - INSIDE: 6.10T"}, value=1, timestamp=1266339317864)
    • => (column={"year":"2009","make":"Ford","model":"F-150","engine":"4.6L V8","notes":"TYPE: 269 - HEIGHT: 7.81 - OUTSIDE: 4.26B - INSIDE: 6.10T"}, value=1, timestamp=1266339317862)
    • ...
    • => (column={"year":"1997","make":"Ford","model":"E-250 Econoline","engine":"5.4L V8 CNG","notes":"GAS ENG"}, value=1, timestamp=1266339317860)
    • => (column={"year":"1997","make":"Ford","model":"E-150 Econoline","engine":"5.4L V8","notes":"All"}, value=1, timestamp=1266339317860)
    • => (column={"year":"1997","make":"Ford","model":"E-150 Econoline Club Wagon","engine":"5.4L V8","notes":"All"}, value=1, timestamp=1266339317860)
    • => (column={"year":"1996-1999","make":"Ford","model":"All","engine":"4.6L V8 DOHC","notes":"All"}, value=1, timestamp=1266339317860)
    • Returned 200 results.
    • cassandra>
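
In this listing the whole fitment is serialized as JSON into the column name and the value is just `1`, so the row appears to behave like a set of fitments: writing the same fitment twice lands on the same column. Some entries cover a year range such as `1996-1999`. A sketch of decoding a column name and expanding it to one record per model year, under the assumption that the `year` field is either a single year or a `start-end` range as seen above; the function names are hypothetical:

```python
import json


def expand_years(year_field):
    """Turn '2009' into [2009] and '1996-1999' into [1996, 1997, 1998, 1999]."""
    start, _, end = year_field.partition("-")
    return list(range(int(start), int(end or start) + 1))


def parse_fitment(column_name):
    """Decode a fitment column name into one dict per model year."""
    fitment = json.loads(column_name)
    return [{**fitment, "year": year} for year in expand_years(fitment["year"])]


# Column name copied from the last entry in the listing above.
column = '{"year":"1996-1999","make":"Ford","model":"All","engine":"4.6L V8 DOHC","notes":"All"}'
fitments = parse_fitment(column)
```

Only the `year` field is split on the hyphen, so model names such as `F-150` pass through unchanged.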
  • 17. Great Resources
    • NoSQL West Intro: http://cloudera-todd.s3.amazonaws.com/nosql.pdf (Video: http://www.vimeo.com/5145059)
    • Cassandra Talk (Rackspace): Vid+PPT: http://www.parleys.com/#st=5&id=1866
    • Cassandra Talk (Facebook): PPT: http://static.last.fm/johan/nosql-20090611/cassandra_nosql.ppt Video: http://vimeo.com/5185526
    • Cassandra Talk (Digg): http://nosql.mypopescu.com/post/334198583/presentation-cassandra-in-production-digg-arin
    • WTF is a Super Column: http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf
    • Get Up and Running w/Cassandra: http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/
  • 18. Questions?