NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10

My Cassandra talk at the STS 3-10-10 Meeting

Published in: Technology, Business

  1. A highly scalable, eventually consistent, distributed, structured key-value store. How we use it @ Frugal Mechanic. Eric Peters (@ericpeters), STS 3-10-10
  2. About
     • Search 2.5M unique Auto Parts, fitting 250M car configurations for over 100 retailers
     • Data Data Data, We Need To:
       • Quickly Process More than 50 Feeds & Data Sources
       • Support 10M source-specific SKUs
       • Handle 300M SKU Part Fitments
       • Be flexible with new columns
       • Store and persist raw data before we cherry pick which data to use
  3. Who Uses Cassandra?
  4. Cassandra Design Goals
     • High availability
     • Eventual consistency
       • trade-off strong consistency in favor of high availability
     • Incremental scalability
     • Optimistic Replication
     • “Knobs” to tune tradeoffs between consistency, durability and latency
     • Low total cost of ownership
     • Minimal administration
     Slide “borrowed” from Avinash Lakshman
  5. Cassandra write properties
     • No reads
     • No seeks
     • Fast
     • Atomic within ColumnFamily
     • Always writable
     Slide “borrowed” from Avinash Lakshman
  6. Cassandra read properties
     • Read multiple SSTables
     • Slower than writes (but still fast)
     • Seeks can be mitigated with more RAM
     • Scales to billions of rows
     Slide “borrowed” from Avinash Lakshman
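The write and read properties on the two slides above follow from a log-structured design: a write appends to a commit log and updates an in-memory table, and a read merges that table with every flushed SSTable, keeping the newest timestamp. The following is a toy sketch of that idea only. The `ToyStore` class, its method names, and the `flush_threshold` parameter are hypothetical and are not Cassandra's API or implementation.

```python
class ToyStore:
    """Toy log-structured store (illustrative only, not Cassandra code)."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []   # append-only log of every write
        self.memtable = {}     # in-memory table: key -> (timestamp, value)
        self.sstables = []     # immutable flushed tables, oldest first
        self.flush_threshold = flush_threshold  # hypothetical tuning knob

    def write(self, key, value, timestamp):
        # Append to the log and update memory; no on-disk data is read or sought.
        self.commit_log.append((key, value, timestamp))
        existing = self.memtable.get(key)
        if existing is None or timestamp >= existing[0]:
            self.memtable[key] = (timestamp, value)
        # Once the memtable is "full", freeze it as a new immutable SSTable.
        if len(self.memtable) >= self.flush_threshold:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def read(self, key):
        # A read may have to consult several SSTables plus the memtable.
        versions = [table[key] for table in self.sstables if key in table]
        if key in self.memtable:
            versions.append(self.memtable[key])
        if not versions:
            return None
        # The version with the newest timestamp wins.
        return max(versions, key=lambda version: version[0])[1]


store = ToyStore(flush_threshold=2)
store.write("a", "v1", timestamp=1)
store.write("b", "x", timestamp=2)   # second key triggers a flush
store.write("a", "v2", timestamp=3)  # newer version lives in the memtable
print(store.read("a"))
```

In this sketch the read path touches more structures than the write path, which is one way to see why reads are slower than writes.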
  7. MySQL Comparison
     • MySQL, > 50 GB data: writes average ~300 ms, reads average ~350 ms
     • Cassandra, > 50 GB data: writes average 0.12 ms, reads average 15 ms
     Slide “borrowed” from Avinash Lakshman
  8. ColumnFamilies
     { // this is a column
       name: "emailAddress",
       value: "eric@example.com",
       timestamp: 123456789
     }
     Examples from http://www.slideshare.net/jbellis/cassandra-open-source-bigtable-dynamo
  9. Super ColumnFamilies
     { // this is a SuperColumn
       name: "homeAddress",
       // with an infinite list of Columns
       value: {
         // note the key is the name of the Column
         street: {name: "street", value: "1234 x street", timestamp: 123456789},
         city: {name: "city", value: "san francisco", timestamp: 123456789},
         zip: {name: "zip", value: "94107", timestamp: 123456789},
       }
     }
     Examples from http://www.slideshare.net/jbellis/cassandra-open-source-bigtable-dynamo
  10. JSON Column Example
      UserProfile = { // this is a ColumnFamily
        phatduckk: { // this is the key to this Row inside the CF
          // now we have an infinite # of columns in this row
          username: "phatduckk",
          email: "phatduckk@example.com",
          phone: "(900) 976-6666"
        }, // end row
        ieure: { // this is the key to another row in the CF
          // now we have another infinite # of columns in this row
          username: "ieure",
          email: "ieure@example.com",
          phone: "(888) 555-1212",
          age: "66",
          gender: "undecided"
        },
      }
      Examples from: http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf
  11. JSON Super Column Example
      AddressBook = { // this is a ColumnFamily of type Super
        phatduckk: { // this is the key to this row inside the Super CF
          // the key here is the name of the owner of the address book
          // now we have an infinite # of super columns in this row
          // the keys inside the row are the names for the SuperColumns
          // each of these SuperColumns is an address book entry
          friend1: {street: "8th street", zip: "90210", city: "Beverley Hills", state: "CA"},
          // this is the address book entry for John in phatduckk's address book
          John: {street: "Howard street", zip: "94404", city: "FC", state: "CA"},
          Kim: {street: "X street", zip: "87876", city: "Balls", state: "VA"},
          …
          // we can have an infinite # of SuperColumns (aka address book entries)
        }, // end row
        ieure: { // this is the key to another row in the Super CF
          // all the address book entries for ieure
          joey: {street: "A ave", zip: "55485", city: "Hell", state: "NV"},
          William: {street: "Armpit Dr", zip: "93301", city: "Bakersfield", state: "CA"},
        },
      }
      Examples from: http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf
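One way to keep the two shapes apart is to count nesting levels: a standard ColumnFamily maps row key to column name to value, while a Super ColumnFamily adds one more level for the SuperColumn name. The sketch below models both as plain Python dictionaries using a few entries from the examples above. The helper names `get_column` and `get_subcolumn` are hypothetical, and the timestamp is a placeholder reused from the single-column slide, since the JSON examples omit timestamps.

```python
PLACEHOLDER_TS = 123456789  # assumed timestamp; the JSON examples do not show one

# Standard ColumnFamily: row key -> column name -> (value, timestamp)
user_profile = {
    "phatduckk": {
        "username": ("phatduckk", PLACEHOLDER_TS),
        "email": ("phatduckk@example.com", PLACEHOLDER_TS),
    },
    "ieure": {
        "username": ("ieure", PLACEHOLDER_TS),
        "age": ("66", PLACEHOLDER_TS),  # rows need not share the same columns
    },
}

# Super ColumnFamily: row key -> super column name -> column name -> (value, timestamp)
address_book = {
    "phatduckk": {
        "John": {
            "street": ("Howard street", PLACEHOLDER_TS),
            "zip": ("94404", PLACEHOLDER_TS),
        },
    },
}


def get_column(column_family, row_key, column):
    """Look up one column value in a standard ColumnFamily."""
    value, _timestamp = column_family[row_key][column]
    return value


def get_subcolumn(super_column_family, row_key, super_column, column):
    """Look up one column value inside a SuperColumn."""
    value, _timestamp = super_column_family[row_key][super_column][column]
    return value


print(get_column(user_profile, "phatduckk", "email"))
print(get_subcolumn(address_book, "phatduckk", "John", "zip"))
```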
  12. Why Cassandra?
      • It offers column-oriented data storage, so you have a bit more structure than plain key/value stores.
      • Fast! Writes average 0.12 ms, so 300M writes take about 10 hours.
      • Written in Java, and an Apache Foundation project
      • People smarter than us are using it to solve even bigger problems than ours; if they can scale it, we will be able to as well.
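The 10-hour figure can be checked with a few lines of arithmetic. This sketch assumes the writes run one after another at the quoted 0.12 ms average, which is a simplification; the constant names are illustrative.

```python
WRITE_LATENCY_MS = 0.12      # average write latency quoted on the slide
NUM_WRITES = 300_000_000     # 300M SKU part fitments

# Total time if every write runs sequentially at the average latency.
total_seconds = NUM_WRITES * WRITE_LATENCY_MS / 1000
total_hours = total_seconds / 3600

print(f"{total_hours:.1f} hours")  # prints "10.0 hours"
```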
  13. Frugal Mechanic’s Data
  14. Modeling Part Information
      cassandra> get FrugalMechanic.RawParts['amazon/b0002jmuwk']
      => (column=thumb_image_url, value=http://ecx.images-amazon.com/images/I/31JQJNSAV6L._SL75_.jpg, timestamp=1264701499346)
      => (column=sku/FA1632, value=manufacturer, timestamp=1264701499346)
      => (column=sku/B0002JMUWK, value=asin, timestamp=1266339317829)
      => (column=name, value=Motorcraft FA1632 Air Filter, timestamp=1264701499346)
      => (column=large_image_url, value=http://ecx.images-amazon.com/images/I/31JQJNSAV6L.jpg, timestamp=1264701499346)
      => (column=description, value=Motorcraft Air Filter is designed to filter outside air that enters the vehicle. It is manufactured from leak proof polyurethane seals. This air filter chemically treats dry type cleaner elements to withstand damage from oil and moisture. It is resistant to temperature extremes and has a 98.5% efficiency standard. This air filter is treated to enhance capacity and efficiency as well as facilitates hassle free installation.
      It is backed by a 12 month warranty.<br/><br/>
      Features:<br />
      <ul>
      <li>Efficiently filters outside air</li>
      <li>Withstands damage from oil and moisture</li>
      <li>Easy to install</li>
      <li>12 months warranty</li>
      <li>Leak proof</li>
      </ul>
      , timestamp=1264701499346)
      => (column=category/name, value=Automotive / Categories / Replacement Parts / Filters / Air Filters & Accessories / Air Filters, timestamp=1264701499346)
      => (column=category/browsenodeid, value=15727081,15727321, timestamp=1264701499346)
      => (column=cassandra_write_date, value=2010-02-16T16:55:17.809Z, timestamp=1266339317813)
      => (column=brand/name, value=Motorcraft, timestamp=1264701499346)
      => (column=brand/manufacturer, value=Motorcraft, timestamp=1264701499346)
      Returned 11 results.
  15. Modeling Part Prices
      cassandra> get FrugalMechanic.RawPartPrices['amazon/b0002jmuwk']
      => (column=ATVPDKIKX0DER, value={"site":"ATVPDKIKX0DER","price":"13.45","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"ATVPDKIKX0DER","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
      => (column=AOMQHH38LHK76, value={"site":"AOMQHH38LHK76","price":"8.64","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"AOMQHH38LHK76","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
      => (column=ADG953YR6NRBF, value={"site":"ADG953YR6NRBF","price":"16.5","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"ADG953YR6NRBF","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
      => (column=A8F3HAQ1FDLH8, value={"site":"A8F3HAQ1FDLH8","price":"12.94","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A8F3HAQ1FDLH8","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
      => (column=A3TW1WCPSO49LP, value={"site":"A3TW1WCPSO49LP","price":"19.72","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A3TW1WCPSO49LP","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
      => (column=A3NMYM0J8WG63N, value={"price":"$18.14","buyUrlVar2":"A3NMYM0J8WG63N","buyUrlVar1":"B0002JMUWK"}, timestamp=1260473908931)
      => (column=A1DPIC5NQU31S0, value={"site":"A1DPIC5NQU31S0","price":"16.13","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A1DPIC5NQU31S0","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
      => (column=A1ATZ3MAARQNEF, value={"site":"A1ATZ3MAARQNEF","price":"16.0","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"A1ATZ3MAARQNEF","updatedOn":"2010-01-28T17:58:19.346Z"}, timestamp=1264701499346)
      Returned 8 results.
      cassandra>
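Each price column above stores a small JSON document as its value, so a client has to decode the value after fetching the row. The sketch below shows one way that could look, using three of the column values copied from the output. The functions `parse_price` and `cheapest` are hypothetical helpers, not part of any Cassandra client, and stripping a leading `$` is an assumption about how to handle the older record that stores `"$18.14"`.

```python
import json
from decimal import Decimal

# Column name (seller id) -> raw JSON value, copied from the output above.
price_columns = {
    "ATVPDKIKX0DER": '{"site":"ATVPDKIKX0DER","price":"13.45","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"ATVPDKIKX0DER","updatedOn":"2010-01-28T17:58:19.346Z"}',
    "AOMQHH38LHK76": '{"site":"AOMQHH38LHK76","price":"8.64","buyUrlVar1":"B0002JMUWK","buyUrlVar2":"AOMQHH38LHK76","updatedOn":"2010-01-28T17:58:19.346Z"}',
    "A3NMYM0J8WG63N": '{"price":"$18.14","buyUrlVar2":"A3NMYM0J8WG63N","buyUrlVar1":"B0002JMUWK"}',
}


def parse_price(raw_json):
    """Decode one column value and return its price as a Decimal."""
    record = json.loads(raw_json)
    # Older records include a currency symbol; drop it before converting.
    return Decimal(record["price"].lstrip("$"))


def cheapest(columns):
    """Return (price, column name) for the lowest-priced column."""
    return min((parse_price(value), name) for name, value in columns.items())


price, seller = cheapest(price_columns)
print(seller, price)
```

Storing the raw record as-is fits the stated goal of persisting raw data first and cherry-picking later; the cost is that inconsistencies such as the `$` prefix must be handled at read time.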
  16. Modeling Part Fitments
      cassandra> get FrugalMechanic.RawPartFitments['amazon/b0002jmuwk']
      => (column={"year":"2009","make":"Ford","model":"F-150","engine":"5.4L V8","notes":"TYPE: 269 - HEIGHT: 7.81 - OUTSIDE: 4.26B - INSIDE: 6.10T"}, value=1, timestamp=1266339317864)
      => (column={"year":"2009","make":"Ford","model":"F-150","engine":"4.6L V8","notes":"TYPE: 269 - HEIGHT: 7.81 - OUTSIDE: 4.26B - INSIDE: 6.10T"}, value=1, timestamp=1266339317862)
      ...
      => (column={"year":"1997","make":"Ford","model":"E-250 Econoline","engine":"5.4L V8 CNG","notes":"GAS ENG"}, value=1, timestamp=1266339317860)
      => (column={"year":"1997","make":"Ford","model":"E-150 Econoline","engine":"5.4L V8","notes":"All"}, value=1, timestamp=1266339317860)
      => (column={"year":"1997","make":"Ford","model":"E-150 Econoline Club Wagon","engine":"5.4L V8","notes":"All"}, value=1, timestamp=1266339317860)
      => (column={"year":"1996-1999","make":"Ford","model":"All","engine":"4.6L V8 DOHC","notes":"All"}, value=1, timestamp=1266339317860)
      Returned 200 results.
      cassandra>
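In the fitment rows above the data lives in the column name (a JSON document) and the value is just the marker `1`, so a row behaves like a set of fitments. A client can decode the column names and filter them, as in this sketch built from three of the column names shown. The helpers `year_matches` and `fitments_for_year` are hypothetical, and treating `"1996-1999"` as an inclusive year range is an assumption about the feed's format.

```python
import json

# Column names copied from the output above; the column values (all 1) are omitted.
fitment_columns = [
    '{"year":"2009","make":"Ford","model":"F-150","engine":"5.4L V8","notes":"TYPE: 269 - HEIGHT: 7.81 - OUTSIDE: 4.26B - INSIDE: 6.10T"}',
    '{"year":"1997","make":"Ford","model":"E-150 Econoline","engine":"5.4L V8","notes":"All"}',
    '{"year":"1996-1999","make":"Ford","model":"All","engine":"4.6L V8 DOHC","notes":"All"}',
]


def year_matches(year_field, year):
    """Check a year against a single year or an assumed inclusive range."""
    if "-" in year_field:
        first, last = year_field.split("-")
        return int(first) <= year <= int(last)
    return int(year_field) == year


def fitments_for_year(column_names, year):
    """Decode each column name and keep the fitments covering the given year."""
    fitments = (json.loads(name) for name in column_names)
    return [fitment for fitment in fitments if year_matches(fitment["year"], year)]


for fitment in fitments_for_year(fitment_columns, 1997):
    print(fitment["make"], fitment["model"], fitment["engine"])
```

Because column names are unique within a row, writing the same fitment twice overwrites the existing column instead of creating a duplicate.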
  17. Great Resources
      • NoSQL West Intro: http://cloudera-todd.s3.amazonaws.com/nosql.pdf (Video: http://www.vimeo.com/5145059)
      • Cassandra Talk (Rackspace): Vid+PPT: http://www.parleys.com/#st=5&id=1866
      • Cassandra Talk (Facebook): PPT: http://static.last.fm/johan/nosql-20090611/cassandra_nosql.ppt Video: http://vimeo.com/5185526
      • Cassandra Talk (Digg): http://nosql.mypopescu.com/post/334198583/presentation-cassandra-in-production-digg-arin
      • WTF is a Super Column: http://arin.s3.amazonaws.com/pub/docs/WTF-is-a-SuperColumn.pdf
      • Get Up and Running w/Cassandra: http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/
  18. Questions?