Introduction to Cassandra (June 2010)

Presented to the Silicon Valley Cloud Computing Group. 17 June 2010.

  • Data volumes keep growing.
  • Historical industry leaders
  • 32-core machines are expensive. Costs go way up when you try to scale these databases. Also: instability.
  • Terabytes of data. ~1,000,000 ops/second. Schema changes are difficult (impossible). Manual sharding takes a lot of effort. Automated sharding + replication is difficult.
  • 100 M users, 25 TB data
  • Horizontal – commodity hardware, not specialized boxes
  • The cluster is a logical storage ring. Node placement divides the ring into ranges that represent start/stop points for keys. Token assignment is automatic or manual (use another slide for that). Nodes placed closer together on the ring own smaller ranges, so they carry less responsibility and less data.
  • Token
  • Bootstrapping
  • Hinting is not designed for long failures.
  • RDBMSs focus on consistency, which limits scale.
  • No multi-key transactions
  • SSTable proliferation degrades performance.
  • Distributed, scalable, schema-free, sparse table, eventually consistent, tunable (throughput and fault-tolerance).
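
The ring note above (node tokens divide the ring into key ranges, and nodes that sit closer together own less) can be illustrated with a small sketch. This is not Cassandra's code: the `Ring` class, the `token_for` helper, the MD5 hash, and the 2**127 token space are all assumptions chosen for illustration. Each node is taken to own the range from the previous node's token (exclusive) up to its own token (inclusive), wrapping around at the end of the ring.

```python
import bisect
import hashlib
from typing import Dict

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(key: str) -> int:
    """Hash a key onto the ring (MD5 is an illustrative choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


class Ring:
    """Hypothetical ring: each node owns (previous token, own token]."""

    def __init__(self, node_tokens: Dict[str, int]):
        # Keep (token, node) pairs sorted by token so lookups can bisect.
        self._entries = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner_of_token(self, token: int) -> str:
        # First node whose token is >= the given token; wrap to the
        # lowest-token node if the token is past the last one.
        idx = bisect.bisect_left(self._tokens, token)
        if idx == len(self._tokens):
            idx = 0
        return self._entries[idx][1]

    def owner(self, key: str) -> str:
        return self.owner_of_token(token_for(key))

    def range_sizes(self) -> Dict[str, int]:
        # Number of tokens each node is responsible for.
        sizes = {}
        for i, (tok, node) in enumerate(self._entries):
            prev = self._entries[i - 1][0]  # i == 0 wraps to the last node
            sizes[node] = (tok - prev) % RING_SIZE or RING_SIZE
        return sizes


ring = Ring({"A": 0, "B": 10, "C": 50})
print(ring.range_sizes())
print(ring.owner("user:42"))
```

In this toy ring, node B sits only 10 tokens after node A, so it owns a far smaller range than A, which covers everything from C's token around the wrap point back to its own.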