Data Modeling with Cassandra Column Families
Slide notes I used for my presentation at ICOODB 2010.

Published in: Technology
  • It’s not about one data model vs. another. It’s not about one storage engine vs. another. Cassandra excels at replicating data and achieving high sustained write throughput.
  • The right tool for the right job
  • Shaped by distribution model
  • Sparse: columns do not have to exist in every row.
  • Flexible column naming. You define the sort order. A row is not required to have a specific column just because another row does.
  • Look familiar?
  • These arise because of the distribution model, not the column family (CF) model.
  • Atomic at the CF row level, but not isolated. Large transactional apps push down to the node (shared nothing). Guaranteeing ACID constraints across nodes is a hard problem.
  • On the other hand, you do get a lot of things: data redundancy, very fast writes, and fast reads.
  • Relational: formally defined, so it feels correct. Query first: not formally defined, so it feels somehow incorrect. You get some things in exchange: scalability, availability, and replication.
  • Focus on query and analysis. B+trees. Update once. Cassandra typically becomes IO bound before becoming CPU bound.
  • Not set in stone. Your application may require a different approach.
  • Recognize non-starters: Is my dataset going to become very large? Will I need to sustain high write throughput? Also, what are the common operations? Optimize CFs for those operations.
  • Columns are sorted. Choose keys and columns. You need to think about how you plan to slice your data. Related data is kept close together to reduce IO.
  • Denormalize. Use the disk. Don’t be afraid to create another CF that duplicates some data.
  • Composite column names. Painful updates of denormalized parts. Fast reads and insertions.
  • Key
  • Normal attributes
  • Composite column names. Pulling in relationships. Painful updates. Denormalization is best when data doesn’t change.
  • Commit log (on a separate disk), memtable, SSTable.
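
The notes above on sparse rows, sorted columns, slices, and composite column names can be made concrete with a small sketch. This is a toy in-memory model, not Cassandra's API or storage engine: the `ColumnFamily` class, its `insert` and `get_slice` methods, and the user and post data are all hypothetical names chosen for illustration. It assumes column names within a row are mutually comparable, and uses Python tuples to stand in for composite column names.

```python
from bisect import bisect_left, bisect_right


class ColumnFamily:
    """Toy model (hypothetical): row key -> columns kept sorted by name."""

    def __init__(self):
        # row key -> (sorted list of column names, dict of name -> value)
        self._rows = {}

    def insert(self, row_key, column_name, value):
        names, values = self._rows.setdefault(row_key, ([], {}))
        if column_name not in values:
            # Keep column names in sorted order on every insert.
            names.insert(bisect_left(names, column_name), column_name)
        values[column_name] = value

    def get_slice(self, row_key, start, finish):
        """Return (name, value) pairs with start <= name <= finish, in order."""
        names, values = self._rows.get(row_key, ([], {}))
        lo = bisect_left(names, start)
        hi = bisect_right(names, finish)
        return [(name, values[name]) for name in names[lo:hi]]


# Sparse rows: "bob" has no "email" column just because "alice" does.
users = ColumnFamily()
users.insert("alice", "name", "Alice")
users.insert("alice", "email", "alice@example.com")
users.insert("bob", "name", "Bob")

# Denormalized, query-first CF: one row per user, one column per post.
# The composite column name (date, post id) defines the sort order, and the
# post title is duplicated into the value so one row read answers the query.
timeline = ColumnFamily()
timeline.insert("alice", ("2010-09-28", "post-2"), "Denormalize")
timeline.insert("alice", ("2010-09-27", "post-1"), "Queries first")
timeline.insert("alice", ("2010-10-02", "post-3"), "Use the disk")

# Slice: all of alice's posts dated in September 2010, already in date order.
september = timeline.get_slice(
    "alice", ("2010-09-01",), ("2010-09-30", "\uffff")
)
print(september)
```

Because related columns sit next to each other in sorted order, the slice is a contiguous range rather than a scan. The cost is the one the notes name: if a post title changes, every denormalized copy has to be rewritten.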
  • Transcript

    • 1. Needle Meet Haystack: Adapting your data models for Cassandra
      Gary Dusbabek • Rackspace • ICOODB 2010
    • 2. Outline
      First Things First
      Column Families
      Trade Offs
      Procedures & Best Practices
      Internals
    • 3. It’s all about scalability
    • 4. We can all be friends
    • 5. Column Families
    • 6.
    • 7.
    • 8.
    • 9.
    • 10.
    • 11.
    • 12.
    • 13.
    • 14. 2. Trade Offs
    • 15. No Transactions
    • 16. No Ad Hoc Queries
    • 17. No Joins
    • 18. No Flexible Indexes
    • 19. Don’t
      Panic!
    • 20. Scalability
      Availability
      Replication & Backup
    • 21. 3. Procedures & Practices
    • 22. Relational Way
      Define entities
      Normalize
      Identify Many-to-many
      Query any way you want
    • 23. How Come?
      Scarcity
      Efficiency
    • 24. Cassandra Way
      Know your app
      Queries first
      Denormalize
    • 25. Know Your App
    • 26. Queries First
    • 27. Nobody is Normal
    • 28. Relational Example
    • 29. Column Family Example
    • 30. Column Family Example
    • 31. Column Family Example
    • 32. Column Family Example
    • 33. Does it feel strange?
    • 34. 4. Internals
    • 35. Sequential Writes
      Always
    • 36. Consistency Level
    • 37. Partitioning
    • 38. Slices
      Data Locality
    • 39. Summary
      • The goal is to scale
      • ColumnFamilies != Relational tables
      • Trade-offs: you win some, you lose some
      • Know your application
      • Queries first
      • Denormalization is OK
      • Cassandra was built for this
    • Links
      http://cassandra.apache.org
      http://wiki.apache.org/cassandra
      irc: #cassandra on freenode
      gdusbabek@gmail.com
      @gdusbabek
    • 46. Image Credits
      haystack http://www.flickr.com/photos/james_lumb/3921968993
      pyramids http://www.flickr.com/photos/gracewong/93631410
      scales http://www.flickr.com/photos/eflon/3465042138
      friends http://www.flickr.com/photos/ngmmemuda/4166182931
      television http://www.flickr.com/photos/angelrravelor/314306023
      columns http://www.flickr.com/photos/nostri-imago/3564300653
      devil http://www.flickr.com/photos/52890443@N02/4887855756
      angel http://www.flickr.com/photos/75001512@N00/4938623021
      transaction http://www.flickr.com/photos/neubie/2273635564
      queries http://www.flickr.com/photos/-bast-/349497988
      rings http://www.flickr.com/photos/baldur/4395738741
      indexes http://www.flickr.com/photos/waferboard/4137041591
      panic http://www.flickr.com/photos/pasukaru76/3998981988
      procedures "The Anatomy Lesson of Dr. Nicolaes Tulp" by Rembrandt
      relational http://www.flickr.com/photos/35536700@N07/3292544674
      desert http://www.flickr.com/photos/waldenpond/4252575735
      jet http://www.flickr.com/photos/rmahle/709685
      queries http://www.flickr.com/photos/andreanna/2812118063
      blackboard http://www.flickr.com/photos/shonk/418180402
      normal http://www.flickr.com/photos/infrogmation/3180606117
      phonograph http://www.flickr.com/photos/shiyazuni/4770244591
      dodo http://www.flickr.com/photos/wheatfields/2071347416
      Internals http://www.flickr.com/photos/37hz/4057856826
      writing http://www.flickr.com/photos/stevendepolo/3877225152
      consistency http://www.flickr.com/photos/betsyweber/4962297050
      partitioning http://www.flickr.com/photos/featheredtar/3137028766
      slices http://www.flickr.com/photos/free-stock/4899674517
      summary http://www.flickr.com/photos/jkdsphotography/4061838798
      links http://www.flickr.com/photos/creative_stock/3397559016
