C*ollege Credit: Is My App a Good Fit for Cassandra?

3,580 views


  1. Is My App A Good Fit For Cassandra? Eric Lubow, @elubow, elubow@simplereach.com
  2. Overview
       • Planning
       • Data Stores
       • Comparisons
       • Use Cases
       • Final Thoughts
       • Questions
  3. Where am I
       • Planning Stages
       • MVP (Minimum Viable Product)
       • Iteration
       • Final Decision
  4. What Am I Building
       • User App
       • Hobby Project
       • Learning Project
       • Big Data System
  5. What is Big Data
       • Depends on the user
       • Bigger Than Excel
       • Bigger Than One Server
       • Bigger Than One Rack
  6. Big Data Truth Bomb
       • Even with the right tools, 80% of the work of building a big data system is acquiring and refining the raw data into usable data.
  7. Planning Questions
       Data
       • What are my query patterns?
       • Are my display requirements for realtime data?
       • Is my data ingestion high volume/high velocity?
       • Do I need to aggregate data on the fly?
       • Am I batch loading data?
       • Is my data structured or unstructured?
       • Am I write heavy or read heavy?
       • Are data relationships important?
       • Does my data lend itself to a specific design pattern?
       • Will the data need to be distributed?
       • Does my data need to be immediately available everywhere?
       Tech
       • Is the encryption/authentication/authorization support sufficient for my needs?
       • How fault tolerant is the system?
       • What supporting tools do I need?
       • Are there monitoring architectures already built?
       • Is there support for my language?
       • Are there best practices guides already?
       Financial
       • Am I cloud based?
       • Am I hardware based?
       • Am I a cloud/iron hybrid?
       • How much am I willing to spend?
       • How much am I willing to spend if something goes wrong?
       Other
       • Do I have legal requirements (HIPAA/FIPS/Sarbanes Oxley/PII)?
       • What kind of enterprise support is available?
       • What is the community like?
       • Does the product roadmap pertain to my roadmap?
  8. Tools C*
  9. Languages
  10. Right Tool For The Job
  11. Cassandra (C*)
       • Large data volume ingestion at high velocity
       • Really fast writes to many locations (eventual consistency)
       • Query by column groups within rows (slicing)
       • OpsCenter
       • Data toolkit: more than a data storage layer
       • TTLs for small group aggregation
  12. Cassandra Data
       RowKey: 1345161600000:b198fa61-833a-6e78-fb83-233ec50b356e
       => (column=facebook:1345162260136000, value={"like_count":17,"url":"http://mysite.com/586352/celebrities-with-children/"}, timestamp=1345162260136000)
       => (column=facebook:1345162260167000, value={"like_count":18,"url":"http://mysite.com/586352/celebrities-with-children/"}, timestamp=1345162260167000)
       => (column=facebook:1345162260261564, value={"like_count":21,"url":"http://mysite.com/586352/celebrities-with-children/"}, timestamp=1345162260261564)
       => (column=pageviews:1345162259307830, value={"user-agent":"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; OfficeLiveConnector.1.3; OfficeLivePatch.0.0; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727)","languages":"es-ve","user-id":"aede5694-3eb3-4cd0-810d-99d6bc2e0cb5","ip":"186.24.6.80"}, timestamp=1345162259307830)
       => (column=pageviews:1345162259302140, value={"user-agent":"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)","languages":"en-US","user-id":"a85679ab-9fd7-4aeb-93ab-2b66eddcf66a","ip":"192.168.255.182"}, timestamp=1345162259302140)
       => (column=pageviews:1345162259302000, value={"referrer":"http://www.tv-links.eu/_gate_way.html?data=VfMjMzOTE2Nw==","user-agent":"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)","languages":"en-NZ","user-id":"ba0c6320-c4ca-4cb8-b5d4-e6ef21dbdc3c","ip":"219.89.75.163"}, timestamp=1345162259302000)
       => (column=pageviews:1345162259402000, value={"referrer":"http://foo.com/pop-culture/2012/09/40-Most-Weird-Comics-Ever","user-agent":"Mozilla/5.0 (Windows NT 6.0; WOW64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11","languages":"en-US,en;q=0.8","user-id":"899f51ab-3e08-475a-9392-7eee5446edc3","ip":"24.118.178.215"}, timestamp=1345162259402000)
       => (column=twitter:1345162260246000, value={"count":17,"url":"http://mysite.com/586352/celebrities-with-children/"}, timestamp=1345162260246000)
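The row on the Cassandra Data slide suggests a time-bucketed wide-row layout: the row key prefix 1345161600000 is midnight UTC in milliseconds, and each column name is a source name followed by a microsecond timestamp. The sketch below shows one way such keys could be built and sliced. It assumes the UUID in the row key identifies a piece of content and that buckets are one day wide; neither is stated on the slide. The helper names `row_key`, `column_name`, and `slice_columns` are hypothetical and not part of any Cassandra client.

```python
MICROS_PER_SECOND = 1_000_000
SECONDS_PER_DAY = 86_400


def row_key(ts_micros: int, content_id: str) -> str:
    """Build '<start of UTC day in ms>:<content id>' (assumed one-day buckets)."""
    seconds = ts_micros // MICROS_PER_SECOND
    day_start_s = seconds // SECONDS_PER_DAY * SECONDS_PER_DAY
    return f"{day_start_s * 1000}:{content_id}"


def column_name(source: str, ts_micros: int) -> str:
    """Build '<source>:<timestamp in microseconds>' as seen on the slide."""
    return f"{source}:{ts_micros}"


def slice_columns(columns: dict, source: str) -> list:
    """Return the column names for one source, in sorted order.

    This imitates a column slice in memory. String sorting matches time
    order here only because all timestamps have the same number of digits.
    """
    prefix = f"{source}:"
    return sorted(name for name in columns if name.startswith(prefix))


if __name__ == "__main__":
    key = row_key(1345162260136000, "b198fa61-833a-6e78-fb83-233ec50b356e")
    print(key)
    print(column_name("facebook", 1345162260136000))
```

Grouping columns under a shared prefix is what makes the "query by column groups within rows (slicing)" point from the Cassandra slide cheap: all facebook columns for one item and one day sit next to each other in the row.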
  13. MongoDB
       • Fast atomic increments (Node.js is native JSON)
       • Sharding
       • Solid ORM for Rails (Mongoid)
       • Fast access for pub/sub of durable/persisted documents
       • B-Tree Indexes
       • Document based via JSON
       • TTLs for ephemeral data
  14. MongoDB Data
       { "_id" : ObjectId("505a089275885cc53cd66520"),
         "account_id" : ObjectId("4e87f81ca782f3404200000a"),
         "day" : ISODate("2012-01-01T00:00:00Z"),
         "md5" : "54f762d1025aadd6e2687005db657dac",
         "stats" : {
           "sum" : { "fb" : 108, "fba" : 108, "fbc" : 1, "fbl" : 71, "fbr" : 326, "fbs" : 36, "gp" : 2, "gpa" : 2, "li" : 3, "lia" : 3, "p" : 1840, "pi" : 1, "pia" : 1, "pspv" : 859.384, "soca" : 173, "socr" : 493, "srchr" : 27, "srt" : 86.48748533542772, "su" : 1, "sua" : 1, "tw" : 58, "twa" : 58, "twflc" : 4025418, "twfrc" : 139758, "twp" : 50, "twpa" : 50, "twr" : 167 },
           "18" : { "sum" : { "fb" : 4, "fba" : 4, "fbl" : 2, "fbr" : 2, "fbs" : 2, "p" : 179, "pspv" : 105.1336, "soca" : 18, "socr" : 8, "srchr" : 10, "srt" : 60.337503923146954, "srtv" : 89.4550357667952, "tw" : 14, "twa" : 14, "twflc" : 107842, "twfrc" : 108111, "twg" : 8, "twp" : 7, "twpa" : 7, "twr" : 6 } },
           "19" : { "sum" : { "fb" : 63, "fba" : 63, "fbl" : 40, "fbr" : 179, "fbs" : 23, "gp" : 2, "gpa" : 2, "p" : 498, "pi" : 1, "pia" : 1, "pspv" : 278.6148999999999, "soca" : 74, "socr" : 200, "srchr" : 5, "srt" : 74.27775496525277, "srtv" : 89.71819309386892, "tw" : 8, "twa" : 8, "twflc" : 9941, "twfrc" : 4228, "twg" : 7, "twp" : 7, "twpa" : 7, "twr" : 21 } } } }
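The document on the MongoDB Data slide pairs a daily "sum" with nested per-key sums ("18", "19"), which fits the "fast atomic increments" point from the MongoDB slide. The sketch below shows how a single `$inc` update using dot notation could bump both counters at once. It assumes the keys "18" and "19" are hour-of-day buckets, which the slide does not say. `build_inc_update` and `apply_inc` are hypothetical helpers; `apply_inc` only mimics the effect of `$inc` on a plain dict so the example runs without a server.

```python
def build_inc_update(metric: str, hour: int, amount: int = 1) -> dict:
    """Build a MongoDB-style $inc update for the daily and hourly counters.

    Assumes hour-of-day buckets under "stats.<hour>.sum" (an assumption).
    """
    return {
        "$inc": {
            f"stats.sum.{metric}": amount,
            f"stats.{hour}.sum.{metric}": amount,
        }
    }


def apply_inc(doc: dict, update: dict) -> dict:
    """In-memory stand-in for $inc: create missing paths, then add."""
    for path, amount in update["$inc"].items():
        *parents, leaf = path.split(".")
        node = doc
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = node.get(leaf, 0) + amount
    return doc


if __name__ == "__main__":
    day_doc = {"stats": {"sum": {"fb": 108}}}
    apply_inc(day_doc, build_inc_update("fb", 18))
    print(day_doc)
```

With a driver such as PyMongo, the same update dict could be passed to `update_one` with `upsert=True`, so the first event of the day creates the document and later events only increment it.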
  15. Redis
       • Supports hundreds of thousands of transactions per second
       • Great caching engine
       • Supports useful variable types like sets, sorted sets, lists
       • Everything is guaranteed to be memory mapped (mmap)
       • Transactional and supports bulk operations
       • Centralized queueing and locking system
  16. Cons
       • Redis
            • Can only utilize a single core
            • Data must be smaller than memory
            • No clustering
       • Cassandra
            • No B-tree indexes
       • Mongo
            • Non-hashed shard keys
            • Indexes must fit in memory
            • Forced replica ping times
  17. Use Cases
       • Time Series
       • Counters
       • Feed Based Activity
       • Large Amounts of Data
  18. The Cloud
       • Open source libraries into the API
       • Auto-scaling for magical scalability
       • Quickly test assumptions
       • Spot Instances
  19. Support and Expertise
       • What happens when you need help?
       • How do you become experts?
       • What happens when you need more experts?
  20. Summary
       • Have answers to the important questions
       • Know your data read/write patterns
       • Know the tools available to you
       • Know your compromises
  21. Questions are guaranteed in life. Answers aren’t. Eric Lubow, @elubow, elubow@simplereach.com. Thank you.
