C* Summit 2013: Dude, Where's My Tweet? Taming the Twitter Firehose by Andrew Noonan

Gnip ingests and must serve out hundreds of millions of social activities every day, and social platforms are only growing. This makes the scalability of applications essential for Gnip. Enter Cassandra. Problem solved, right? Not exactly. Gnip's relationship with Cassandra was not all rainbows and unicorns. In this session we will walk you through why we began looking at Cassandra as a data store in the first place and the valuable lessons we learned with Cassandra that have made it an invaluable part of our infrastructure.

Published in: Technology

Transcript

  • 1. #Cassandra2013 | Dude, Where’s My Tweet? Taming the Twitter Firehose | Andrew Noonan, Software Engineer at Gnip | @noonanisms
  • 2. Gnip | Cassandra | Rainbows | Unicorns | ???
  • 3. (no text)
  • 4. Social Data
  • 5. 90% of Fortune 500 | 120 Billion Activities Delivered Per Month
  • 6. Lots-O-Data | Redundancy & Reliability | Availability
  • 7. (no text)
  • 8. High Write Throughput ✔ | Scalable ✔ | Highly Available ✔ | Persistent ✔
  • 9. Right?
  • 10. Not Exactly…
  • 11. No Maintenance? Bad Idea | Begin Maintenance -> 2X Data Growth | Scalable, Right? | Bootstrap Failures Due To Cluster Load
  • 12. Reconsider (Life) Choices?
  • 13. Size Tiered Compaction vs Leveled Compaction | How Much Data To Store Per Node | Your Write Pattern Matters Too
  • 14. compaction_throughput_mb_per_sec | 16-32X write rate? | Lots-o-options – explore them
  • 15. Lookup by Tweet ID | Read Rate < Write Rate | Dynamic ColumnFamilies
  • 16. For realz this time!?
  • 17. (no text)
  • 18. Bloom Filter False Positive Rate | Index Intervals | Only Change Schema On One Node! (For Now)
  • 19. You Won’t Always Fit The Mold and That’s Okay | Explore Your Options No Matter What | Understand The Consequences Of Your Choices | Staging Environment Identical To Production
  • 20. www.gnip.com | @noonanisms
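
Slide 14 pairs compaction_throughput_mb_per_sec with "16-32X write rate?". That matches the rule of thumb given in the comments of cassandra.yaml from that era, which suggest setting the throttle to roughly 16 to 32 times the rate at which data is inserted. The slide poses it as a question, so treat the result as a starting range to test rather than a recommendation from the talk. A minimal sketch of the arithmetic follows; the function name and the 2 MB/s write rate are hypothetical, chosen only for illustration.

```python
from typing import Tuple


def compaction_throughput_range(write_rate_mb_per_sec: float) -> Tuple[float, float]:
    """Return a (low, high) candidate range for compaction_throughput_mb_per_sec.

    Applies the 16x-32x multiplier from the slide to a sustained write rate
    measured in MB/s. The helper itself is hypothetical, not part of Cassandra.
    """
    if write_rate_mb_per_sec < 0:
        raise ValueError("write rate must be non-negative")
    return 16 * write_rate_mb_per_sec, 32 * write_rate_mb_per_sec


# Hypothetical example: a node sustaining 2 MB/s of writes.
low, high = compaction_throughput_range(2.0)
print(f"compaction_throughput_mb_per_sec: try values between {low:g} and {high:g}")
```

The range is only a first guess. As slide 13 notes, the compaction strategy and the write pattern also affect how much compaction work a node has to absorb.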

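Slide 18 lists the bloom filter false positive rate as a tunable. The trade-off behind that knob can be sketched with the textbook formula for an optimally sized bloom filter, which needs about -ln(p) / (ln 2)^2 bits per key for a false positive rate p. This is the theoretical figure, not a statement about how Cassandra sizes its filters internally, and the function name is hypothetical.

```python
import math


def bloom_bits_per_key(false_positive_rate: float) -> float:
    """Bits per key for an optimally sized bloom filter (textbook formula)."""
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false positive rate must be strictly between 0 and 1")
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


# Compare memory cost per key across a few candidate false positive rates.
for p in (0.1, 0.01, 0.001):
    print(f"p={p}: {bloom_bits_per_key(p):.2f} bits per key")
```

Under this formula a 1% false positive rate costs about 9.6 bits per key and a 10% rate about 4.8, so accepting more false positives roughly halves filter memory at the price of extra disk reads that find nothing.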