Is My App A Good Fit For Cassandra? Eric Lubow @elubow elubow@simplereach.com

C*ollege Credit: Is My App a Good Fit for Cassandra?

  1. Is My App A Good Fit For Cassandra? Eric Lubow @elubow elubow@simplereach.com
  2. Overview
     • Planning
     • Data Stores
     • Comparisons
     • Use Cases
     • Final Thoughts
     • Questions
  3. Where am I
     • Planning Stages
     • MVP (Minimum Viable Product)
     • Iteration
     • Final Decision
  4. What Am I Building
     • User App
     • Hobby Project
     • Learning Project
     • Big Data System
  5. What is Big Data
     • Depends on the user
     • Bigger Than Excel
     • Bigger Than One Server
     • Bigger Than One Rack
  6. Big Data Truth Bomb
     • Even with the right tools, 80% of the work of building a big data system is acquiring and refining the raw data into usable data.
  7. Planning Questions
     Data
     • What are my query patterns?
     • Are my display requirements for realtime data?
     • Is my data ingestion high volume/high velocity?
     • Do I need to aggregate data on the fly?
     • Am I batch loading data?
     • Am I write heavy or read heavy?
     • Is my data structured or unstructured?
     • Are data relationships important?
     • Does my data lend itself to a specific design pattern?
     • Will the data need to be distributed?
     • Does my data need to be immediately available everywhere?
     Tech
     • Is the encryption/authentication/authorization support sufficient for my needs?
     • How fault tolerant is the system?
     • What supporting tools do I need?
     • Are there monitoring architectures already built?
     • Is there support for my language?
     • Are there best practices guides already
     Financial
     • Am I cloud based?
     • Am I hardware based?
     • Am I a cloud/iron hybrid?
     • How much am I willing to spend?
     • How much am I willing to spend if something goes wrong?
     Other
     • Do I have legal requirements (HIPAA/FIPS/Sarbanes-Oxley/PII)?
     • What kind of enterprise support is available?
     • What is the community like?
     • Does the product roadmap pertain to my roadmap?
  8. Tools
  9. Languages
  10. Right Tool For The Job
  11. Cassandra
     • Large data volume ingestion at high velocity
     • Really fast writes to many locations (eventual consistency)
     • Query by column groups within rows (slicing)
     • OpsCenter
     • Data toolkit: more than a data storage layer
     • TTLs for small group aggregation
  12. Cassandra Data
     RowKey: 1345161600000:b198fa61-833a-6e78-fb83-233ec50b356e
     => (column=facebook:1345162260136000, value={"like_count":17,"url":"http://mysite.com/586352/celebrities-with-children/"}, timestamp=1345162260136000)
     => (column=facebook:1345162260167000, value={"like_count":18,"url":"http://mysite.com/586352/celebrities-with-children/"}, timestamp=1345162260167000)
     => (column=facebook:1345162260261564, value={"like_count":21,"url":"http://mysite.com/586352/celebrities-with-children/"}, timestamp=1345162260261564)
     => (column=pageviews:1345162259307830, value={"user-agent":"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; OfficeLiveConnector.1.3; OfficeLivePatch.0.0; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET CLR 2.0.50727)","languages":"es-ve","user-id":"aede5694-3eb3-4cd0-810d-99d6bc2e0cb5","ip":"186.24.6.80"}, timestamp=1345162259307830)
     => (column=pageviews:1345162259302140, value={"user-agent":"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)","languages":"en-US","user-id":"a85679ab-9fd7-4aeb-93ab-2b66eddcf66a","ip":"192.168.255.182"}, timestamp=1345162259302140)
     => (column=pageviews:1345162259302000, value={"referrer":"http://www.tv-links.eu/_gate_way.html?data=VfMjMzOTE2Nw==","user-agent":"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)","languages":"en-NZ","user-id":"ba0c6320-c4ca-4cb8-b5d4-e6ef21dbdc3c","ip":"219.89.75.163"}, timestamp=1345162259302000)
     => (column=pageviews:1345162259402000, value={"referrer":"http://foo.com/pop-culture/2012/09/40-Most-Weird-Comics-Ever","user-agent":"Mozilla/5.0 (Windows NT 6.0; WOW64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11","languages":"en-US,en;q=0.8","user-id":"899f51ab-3e08-475a-9392-7eee5446edc3","ip":"24.118.178.215"}, timestamp=1345162259402000)
     => (column=twitter:1345162260246000, value={"count":17,"url":"http://mysite.com/586352/celebrities-with-children/"}, timestamp=1345162260246000)
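The sample row on slide 12 suggests a time-series layout: the row key looks like a time bucket in milliseconds joined to a content UUID, and each column name is a source prefix joined to a microsecond timestamp. The Python sketch below only models that naming scheme and the "slicing" idea from slide 11 using plain strings; it does not talk to Cassandra. The one-day bucket size and the function names `row_key`, `column_name`, and `slice_by_source` are assumptions made for illustration and do not come from the slides.

```python
from bisect import bisect_left

# Assumption: one row per content item per day. The slides do not state
# the bucket size; 1345161600000 ms is also a valid hourly boundary.
BUCKET_SECONDS = 86_400


def row_key(ts_us, content_id, bucket_seconds=BUCKET_SECONDS):
    """Build '<bucket start in ms>:<content id>' from a microsecond timestamp."""
    bucket_start_s = (ts_us // 1_000_000) // bucket_seconds * bucket_seconds
    return f"{bucket_start_s * 1000}:{content_id}"


def column_name(source, ts_us):
    """Build '<source>:<microsecond timestamp>', e.g. 'facebook:1345162260136000'."""
    return f"{source}:{ts_us}"


def slice_by_source(sorted_columns, source):
    """Return the contiguous run of columns for one source.

    Works on a lexicographically sorted list of column names, which is
    how a string-compared wide row would keep them. ';' is the character
    right after ':', so it marks the end of the '<source>:' range.
    """
    lo = bisect_left(sorted_columns, source + ":")
    hi = bisect_left(sorted_columns, source + ";")
    return sorted_columns[lo:hi]


# Column names taken from the sample row; sorting as strings keeps each
# source's events together and in time order because the timestamps all
# have the same number of digits.
columns = sorted([
    column_name("twitter", 1345162260246000),
    column_name("facebook", 1345162260167000),
    column_name("pageviews", 1345162259307830),
    column_name("facebook", 1345162260136000),
])

key = row_key(1345162260136000, "b198fa61-833a-6e78-fb83-233ec50b356e")
facebook_events = slice_by_source(columns, "facebook")
```

Because every event for one piece of content in one bucket lands in the same row, reading "all Facebook activity for this URL in this bucket" becomes a single contiguous column range rather than a scan.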
  13. MongoDB
     • Fast atomic increments (Node.js is native JSON)
     • Sharding
     • Solid ORM for Rails (Mongoid)
     • Fast access for pub/sub of durable/persisted documents
     • B-Tree Indexes
     • Document based via JSON
     • TTLs for ephemeral data
  14. MongoDB Data
     { "_id" : ObjectId("505a089275885cc53cd66520"),
       "account_id" : ObjectId("4e87f81ca782f3404200000a"),
       "day" : ISODate("2012-01-01T00:00:00Z"),
       "md5" : "54f762d1025aadd6e2687005db657dac",
       "stats" : {
         "sum" : { "fb" : 108, "fba" : 108, "fbc" : 1, "fbl" : 71, "fbr" : 326, "fbs" : 36, "gp" : 2, "gpa" : 2, "li" : 3, "lia" : 3, "p" : 1840, "pi" : 1, "pia" : 1, "pspv" : 859.384, "soca" : 173, "socr" : 493, "srchr" : 27, "srt" : 86.48748533542772, "su" : 1, "sua" : 1, "tw" : 58, "twa" : 58, "twflc" : 4025418, "twfrc" : 139758, "twp" : 50, "twpa" : 50, "twr" : 167 },
         "18" : { "sum" : { "fb" : 4, "fba" : 4, "fbl" : 2, "fbr" : 2, "fbs" : 2, "p" : 179, "pspv" : 105.1336, "soca" : 18, "socr" : 8, "srchr" : 10, "srt" : 60.337503923146954, "srtv" : 89.4550357667952, "tw" : 14, "twa" : 14, "twflc" : 107842, "twfrc" : 108111, "twg" : 8, "twp" : 7, "twpa" : 7, "twr" : 6 } },
         "19" : { "sum" : { "fb" : 63, "fba" : 63, "fbl" : 40, "fbr" : 179, "fbs" : 23, "gp" : 2, "gpa" : 2, "p" : 498, "pi" : 1, "pia" : 1, "pspv" : 278.6148999999999, "soca" : 74, "socr" : 200, "srchr" : 5, "srt" : 74.27775496525277, "srtv" : 89.71819309386892, "tw" : 8, "twa" : 8, "twflc" : 9941, "twfrc" : 4228, "twg" : 7, "twp" : 7, "twpa" : 7, "twr" : 21 } }
       }
     }
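The sample document on slide 14 keeps a per-day `stats.sum` block next to per-hour blocks such as `stats.18.sum`, which pairs naturally with the "fast atomic increments" point on slide 13. The Python sketch below builds the kind of `$inc` update document (MongoDB's increment operator, addressed with dot notation) that could maintain both counters in one write. The slides do not show the actual update statements, so the field paths are inferred from the sample, and `build_inc_update` and `apply_inc` are hypothetical helper names. `apply_inc` is only an in-memory stand-in that mimics what the server would do, so the example runs without a database.

```python
from datetime import datetime, timezone


def build_inc_update(event_time, metric, amount=1):
    """Build a MongoDB-style $inc update for one event.

    Increments the day-level total and the matching hour bucket,
    following the layout inferred from the sample document.
    """
    hour = str(event_time.hour)
    return {
        "$inc": {
            f"stats.sum.{metric}": amount,
            f"stats.{hour}.sum.{metric}": amount,
        }
    }


def apply_inc(doc, update):
    """Apply a $inc update to a plain dict (local stand-in for the server).

    Missing intermediate documents and counters are created on the fly.
    """
    for path, amount in update["$inc"].items():
        *parents, leaf = path.split(".")
        node = doc
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = node.get(leaf, 0) + amount
    return doc


# One Facebook event at 18:30 UTC on the sample document's day.
event_time = datetime(2012, 1, 1, 18, 30, tzinfo=timezone.utc)
update = build_inc_update(event_time, "fb")
day_doc = apply_inc({}, update)
day_doc = apply_inc(day_doc, build_inc_update(event_time, "fb", amount=3))
```

With a real driver, the same `update` dictionary would typically be passed to an upsert keyed on the account, day, and content hash, so the first event of the day creates the document and later events only bump counters.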
  15. Redis
     • Supports hundreds of thousands of transactions per second
     • Great caching engine
     • Supports useful variable types like sets, sorted sets, and lists
     • Everything is guaranteed to be memory mapped (mmap)
     • Transactional and supports bulk operations
     • Centralized queueing and locking system
  16. Cons
     • Redis
       • Can only utilize a single core
       • Data must be smaller than memory
       • No clustering
     • Cassandra
       • No B-tree indexes
     • Mongo
       • Non-hashed shard keys
       • Indexes must fit in memory
       • Forced replica ping times
  17. Use Cases
     • Time Series
     • Counters
     • Feed Based Activity
     • Large Amounts of Data
  18. The Cloud
     • Open source libraries into the API
     • Auto-scaling for magical scalability
     • Quickly test assumptions
     • Spot Instances
  19. Support and Expertise
     • What happens when you need help?
     • How do you become experts?
     • What happens when you need more experts?
  20. Summary
     • Have answers to the important questions
     • Know your data read/write patterns
     • Know the tools available to you
     • Know your compromises
  21. Questions are guaranteed in life. Answers aren’t. Eric Lubow @elubow elubow@simplereach.com Thank you.