Apache Cassandra, part 2 – data model example, machinery

V. Data model example - Twissandra

Twissandra Use Cases
- Get the friends of a username
- Get the followers of a username
- Get a timeline of a specific user's tweets
- Create a tweet
- Create a user
- Add friends to a user

Twissandra – DB User
User: id, user_name, password

Twissandra – DB Followers
User: id, user_name, password
Followers: user_id, follower_id

Twissandra – DB Following
User: id, user_name, password
Following: user_id, following_id

Twissandra – DB Tweets
User: id, user_name, password
Tweet: id, user_id, body, timestamp

Twissandra column families
- User
- Username
- Friends, Followers
- Tweet
- Userline
- Timeline

Twissandra – Users CF
<<CF>> User
  <<RowKey>> userid
  + username
  + password
<<CF>> Username
  <<RowKey>> username
  + userid

Twissandra – Friends and Followers CFs
<<CF>> Friends
  <<RowKey>> userid
  column name: friendid, column value: timestamp
<<CF>> Followers
  <<RowKey>> userid
  column name: followerid, column value: timestamp

Twissandra – Tweet CF
<<CF>> Tweet
  <<RowKey>> tweetid
  + userid
  + body
  + timestamp

Twissandra – Userline and Timeline CFs
<<CF>> Userline
  <<RowKey>> userid
  column name: timestamp, column value: tweetid
<<CF>> Timeline
  <<RowKey>> userid
  column name: timestamp, column value: tweetid

Cassandra QL – User creation (batch)
BEGIN BATCH
  INSERT INTO User (KEY, username, password) VALUES ('id', 'konstantin', '******')
  INSERT INTO Username (KEY, userid) VALUES ('konstantin', 'id')
APPLY BATCH

Cassandra QL – Following a friend (batch)
BEGIN BATCH
  INSERT INTO Friends (KEY, friendid) VALUES ('userid', 'friendid')
  INSERT INTO Followers (KEY, userid) VALUES ('friendid', 'userid')
APPLY BATCH

Cassandra QL – Tweet creation (batch)
BEGIN BATCH
  INSERT INTO Tweet (KEY, userid, body, timestamp) VALUES ('tweetid', 'userid', '@ericflo thanks for Twissandra, it helps!', 123656459847)
  INSERT INTO Userline (KEY, 123656459847) VALUES ('userid', 'tweetid')
  INSERT INTO Timeline (KEY, 123656459847) VALUES ('userid', 'tweetid')
  ……..
  INSERT INTO Timeline (KEY, 123656459847) VALUES ('followerid', 'tweetid')
  ……
APPLY BATCH
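The tweet-creation batch writes the tweet once and then copies its id into the author's Userline, the author's Timeline and the Timeline of every follower. A minimal in-memory sketch of that fan-out follows; plain dictionaries stand in for the column families, and the function name `post_tweet` and the sample user ids are assumptions made for illustration, not part of Twissandra.

```python
from collections import defaultdict

# In-memory stand-ins for the column families: row key -> {column name: value}.
tweet = {}
userline = defaultdict(dict)
timeline = defaultdict(dict)
followers = defaultdict(dict)  # userid -> {followerid: timestamp}


def post_tweet(tweet_id, user_id, body, timestamp):
    """Store the tweet once, then write its id into every row that lists it."""
    tweet[tweet_id] = {"userid": user_id, "body": body, "timestamp": timestamp}
    # The author's own tweets, keyed by timestamp.
    userline[user_id][timestamp] = tweet_id
    # The author also sees the tweet in their own timeline.
    timeline[user_id][timestamp] = tweet_id
    # Fan-out: one Timeline write per follower.
    for follower_id in followers[user_id]:
        timeline[follower_id][timestamp] = tweet_id


# Hypothetical sample data: bob follows alice.
followers["alice"]["bob"] = 1
post_tweet("t1", "alice", "hello", 123656459847)
```

The cost of a write grows with the number of followers, which is the price paid for making the timeline read a single-row lookup.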

Cassandra QL – Getting user tweets
SELECT * FROM Userline WHERE KEY = 'userid'
SELECT * FROM Tweet WHERE KEY IN ('tweetid1', 'tweetid2', 'tweetid3', …., 'tweetidn')

Cassandra QL – Getting user timeline
SELECT * FROM Timeline WHERE KEY = 'userid'
SELECT * FROM Tweet WHERE KEY IN ('tweetid1', 'tweetid2', 'tweetid3', …., 'tweetidn')
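Both reads follow the same two-step shape: fetch one row of tweet ids, then fetch the tweets by key. The sketch below shows that shape over plain dictionaries. The function name `get_timeline`, the `limit` parameter and the newest-first ordering are assumptions added for illustration.

```python
def get_timeline(timeline_cf, tweet_cf, user_id, limit=20):
    """Step 1: read the user's row of (timestamp -> tweetid).
    Step 2: look up each tweet by its key."""
    row = timeline_cf.get(user_id, {})
    # Assumption: callers want the newest entries first.
    newest_first = sorted(row.items(), reverse=True)[:limit]
    return [tweet_cf[tweet_id] for _, tweet_id in newest_first
            if tweet_id in tweet_cf]


# Hypothetical sample data.
sample_tweets = {
    "t1": {"userid": "alice", "body": "first", "timestamp": 100},
    "t2": {"userid": "alice", "body": "second", "timestamp": 200},
}
sample_timeline = {"bob": {100: "t1", 200: "t2"}}
bodies = [t["body"] for t in get_timeline(sample_timeline, sample_tweets, "bob")]
```

Passing the Userline dictionary instead of the Timeline one gives the "user tweets" query with no other change.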

Design patterns
- Materialized View: create a second column family to represent additional queries
- Valueless Column: use column names for values
- Aggregate Key: if you need to find a sub-item, use a composite key

Indexes
<<CF>> Item_Properties
  <<RowKey>> item_id
  column name: property_name, column value: property_value
<<CF>> Container_Items
  <<RowKey>> container_id
  column name: item_id, column value: insertion_timestamp

Indexes
<<CF>> Container_Items_Property_Index
  <<RowKey>> container_id + property_name
  column name: composite(property_value, item_id, entry_timestamp), column value: item_id
  Comparator: compositecomparer.CompositeType

Problem with eventual consistency
When a value is updated, the new value must be added to the index and the old value removed. However, eventual consistency and the lack of transactions make it impossible to do this safely.

Solution
<<CF>> Container_Item_Property_Index_Entries
  <<RowKey>> container_id + item_id + property_name
  column name: entry_timestamp, column value: property_value
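One way to read the three column families together: every index column that is written is also recorded in the entries row, so a later update can delete exactly the index columns that earlier writes created. The sketch below is a single-process illustration of that bookkeeping under this reading. It does not model concurrency or batching, and the function names `set_property` and `find_items` are hypothetical.

```python
item_properties = {}  # item_id -> {property_name: property_value}
index = {}            # (container_id, property_name) -> {(value, item_id, ts): item_id}
index_entries = {}    # (container_id, item_id, property_name) -> {ts: value}


def set_property(container_id, item_id, name, value, ts):
    """Update a property and keep the index row in step with it."""
    index_row = index.setdefault((container_id, name), {})
    previous = index_entries.setdefault((container_id, item_id, name), {})
    # Remove every index column recorded for earlier writes of this property.
    for old_ts, old_value in list(previous.items()):
        index_row.pop((old_value, item_id, old_ts), None)
        del previous[old_ts]
    # Write the new value, its index column and its entry record.
    item_properties.setdefault(item_id, {})[name] = value
    index_row[(value, item_id, ts)] = item_id
    previous[ts] = value


def find_items(container_id, name, value):
    """Return the ids of items in the container whose property equals value."""
    row = index.get((container_id, name), {})
    return sorted(item_id for (v, item_id, _ts) in row if v == value)


set_property("c1", "i1", "color", "red", ts=1)
set_property("c1", "i1", "color", "blue", ts=2)
```

After the second call the "red" index column is gone and only the "blue" one remains, without the writer having to read the current property value first.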

Partitioners
Partitioners decide where a key maps onto the ring.
[Figure: Key 1, Key 2, Key 3 and Key 4 mapped onto the ring]

Partitioners
- RandomPartitioner
- OrderPreservingPartitioner
- ByteOrderedPartitioner
- CollatingOrderPreservingPartitioner
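To make "maps onto the ring" concrete, here is a sketch in the spirit of RandomPartitioner: the key is hashed with MD5 to a token, and the key belongs to the first node whose token is greater than or equal to it, wrapping around at the end. The reduction modulo 2**127 is a simplification, and the four-node ring with evenly spaced tokens is an assumption for the example.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127  # token space assumed for this sketch


def token_for(key):
    """Hash a key to a token on the ring (MD5, reduced to the token space)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner(ring, key):
    """ring is a list of (token, node_name) sorted by token."""
    tokens = [token for token, _ in ring]
    position = bisect_left(tokens, token_for(key))
    # Wrap around: keys past the last token belong to the first node.
    return ring[position % len(ring)][1]


# Hypothetical four-node ring with evenly spaced tokens.
ring = [(i * RING_SIZE // 4, f"node{i}") for i in range(4)]
placements = {key: owner(ring, key) for key in ("Key 1", "Key 2", "Key 3", "Key 4")}
```

An order-preserving partitioner would use the key itself as the token instead of a hash, which keeps range scans cheap but can distribute load unevenly.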

Replication
- Replication is controlled by the replication_factor setting in the keyspace definition.
- The actual placement of replicas in the cluster is determined by the replica placement strategy.

Placement Strategies
SimpleStrategy - places replicas on nodes that are next to each other on the ring.

Placement Strategies
OldNetworkTopologyStrategy - places one replica in a different data center and the others on different racks in the current data center.

Placement Strategies
NetworkTopologyStrategy - lets you configure the number of replicas per data center, as specified in strategy_options.
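The simplest of the three strategies can be sketched in a few lines: starting from the node that owns the key, take the next nodes clockwise until replication_factor replicas are chosen. The function name `simple_strategy` and the four-node ring are assumptions for illustration, and the sketch ignores racks and data centers entirely.

```python
def simple_strategy(ring_nodes, primary_index, replication_factor):
    """Return the nodes holding replicas: the owner plus its successors on the ring."""
    node_count = len(ring_nodes)
    # Never place more replicas than there are nodes.
    replica_count = min(replication_factor, node_count)
    return [ring_nodes[(primary_index + step) % node_count]
            for step in range(replica_count)]


# Hypothetical ring of four nodes in token order.
nodes = ["A", "B", "C", "D"]
replicas = simple_strategy(nodes, primary_index=3, replication_factor=3)
```

With the owner at the last position, the selection wraps around to the start of the ring.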

Snitches
Give Cassandra information about the network topology of the cluster.
- Endpoint snitch: gives information about network topology
- Dynamic snitch: monitors read latencies

Endpoint Snitch Implementations
SimpleSnitch (default) - can be efficient for locating nodes in clusters limited to a single data center.

Endpoint Snitch Implementations
RackInferringSnitch - extrapolates the topology of the network by analyzing IP addresses.
- 192.168.191.71 and 192.168.191.21: in the same rack
- 192.168.191.71 and 192.168.171.21: in the same datacenter
- 192.78.19.71 and 192.18.11.21: in different datacenters
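The three address pairs follow from a simple rule: the second octet of the IP address identifies the data center and the third octet identifies the rack. A small sketch of that rule, with hypothetical function names:

```python
def infer_location(ip):
    """Return (datacenter, rack) taken from the second and third octets."""
    octets = ip.split(".")
    return octets[1], octets[2]


def relation(ip_a, ip_b):
    """Describe how two nodes relate, based only on their IP addresses."""
    datacenter_a, rack_a = infer_location(ip_a)
    datacenter_b, rack_b = infer_location(ip_b)
    if datacenter_a != datacenter_b:
        return "different datacenters"
    if rack_a != rack_b:
        return "same datacenter"
    return "same rack"
```

The rule only works when the addressing plan of the network actually mirrors its physical layout.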

Endpoint Snitch Implementations
PropertyFileSnitch - determines the location of nodes by referring to a user-defined description of the network details located in the property file cassandra-topology.properties.

Sequential writes only
- Memtable
- SSTable

Memtables, SSTables, Commit Logs
- Indexes

Write properties
- No reads
- No seeks
- Fast
- Atomic within ColumnFamily
- Always writable

Read properties
- Read multiple SSTables
- Slower than writes (but still fast)
- Seeks can be mitigated with more RAM
- Scales to billions of rows
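These properties follow from the storage layout: a write only appends to the commit log and updates the in-memory memtable, while a read may have to consult the memtable and several SSTables. The toy store below sketches that asymmetry. The class name `Store`, the `memtable_limit` threshold and the "newest SSTable wins" rule are simplifying assumptions; the real read path merges columns by timestamp.

```python
class Store:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of unflushed writes
        self.memtable = {}     # in-memory, mutable
        self.sstables = []     # immutable snapshots, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        """No reads, no seeks: append to the log, update memory."""
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as a sorted, immutable SSTable."""
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        """Check memory first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None


store = Store()
store.write("a", 1)
store.write("b", 2)  # reaches the limit and triggers a flush
store.write("a", 3)  # newer value lives only in the memtable
```

The more SSTables accumulate, the more places a read may have to look, which is the motivation for compaction and Bloom filters.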

Commit Log durability
- Durability settings mirror those of PostgreSQL.
- Periodic sync of the commit log: the log is synced at intervals, so there is a window in which data can be lost.
- Batch sync of the commit log: a write is acknowledged only after the commit log has been flushed to disk. A separate device for the commit log is strongly recommended in this mode.

Gossip protocol
- Intra-ring communication
- Runs periodically
- Used for failure detection, hinted handoff and exchanging node state

Gossip protocol
- Implemented in org.apache.cassandra.gms.Gossiper
- Keeps the list of nodes that are alive and dead
- Chooses a random node and starts a "chat" with it
- One gossip round requires three messages
- Failure detection uses a suspicion level to decide whether a node is alive or dead

Hinted handoff
[Figure: a write and the hint stored for it]
- Cassandra is always available for writes
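The idea behind hinted handoff is that a write aimed at an unreachable replica is kept as a hint on another node and delivered once the replica is back. The sketch below illustrates only that idea. The `Node` class, the coordinator holding the hints, and the explicit `replay_hints` call are assumptions made for the example.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (target_node, key, value) waiting for delivery


def write(coordinator, replicas, key, value):
    """Write to live replicas; keep a hint for each replica that is down."""
    for replica in replicas:
        if replica.up:
            replica.data[key] = value
        else:
            coordinator.hints.append((replica, key, value))


def replay_hints(coordinator):
    """Deliver stored hints to replicas that have come back."""
    remaining = []
    for replica, key, value in coordinator.hints:
        if replica.up:
            replica.data[key] = value
        else:
            remaining.append((replica, key, value))
    coordinator.hints = remaining


a, b, c = Node("a"), Node("b"), Node("c")
c.up = False
write(a, [b, c], "k", "v")  # accepted even though c is down
```

Calling `replay_hints(a)` after setting `c.up = True` delivers the missed write to c.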

Tombstones
- Data is not deleted immediately
- Deleted values are marked with a tombstone
- Tombstones are removed during a later compaction
- GCGraceSeconds: the number of seconds the server waits before garbage-collecting a tombstone

Compaction
Merging SSTables into one:
- merging keys
- combining columns
- creating a new index
Main aims:
- Free up space
- Reduce the number of required seeks

Compaction
Minor:
- Triggered when at least N SSTables have been flushed to disk (N is tunable, 4 by default)
- Merges SSTables of similar size
Major:
- Merges all SSTables
- Run manually through nodetool compact
- Discards tombstones
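Compaction and tombstones interact: merging keeps the newest version of each key, and a tombstone is dropped only once it is older than the grace period. The sketch below shows that merge over dictionaries. The `TOMBSTONE` marker, the function name `compact` and the use of plain seconds for timestamps are assumptions for illustration.

```python
TOMBSTONE = object()  # marker standing in for a deleted value


def compact(sstables, now, gc_grace_seconds):
    """Merge SSTables of the form {key: (value, timestamp)} into one."""
    merged = {}
    for sstable in sstables:
        for key, (value, timestamp) in sstable.items():
            # Keep only the newest version of each key.
            if key not in merged or timestamp > merged[key][1]:
                merged[key] = (value, timestamp)
    # Drop tombstones whose grace period has expired; keep keys sorted.
    return {
        key: (value, timestamp)
        for key, (value, timestamp) in sorted(merged.items())
        if not (value is TOMBSTONE and now - timestamp >= gc_grace_seconds)
    }


older = {"a": ("1", 10), "b": ("2", 10)}
newer = {"a": ("3", 20), "b": (TOMBSTONE, 15)}
after_grace = compact([older, newer], now=100, gc_grace_seconds=50)
within_grace = compact([older, newer], now=100, gc_grace_seconds=1000)
```

In the second call the tombstone survives, so a replica that missed the delete can still learn about it.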

Replica synchronization
- Anti-entropy
- Read repair

Anti-entropy
- During major compaction the node exchanges Merkle trees (hashes of its data) with other nodes
- If the trees don't match, the data is repaired
- Nodes maintain a timestamp index and exchange only the most recent updates

Read repair
- During a read operation, replicas with stale values are brought up to date
- Weak consistency level (ONE): after the data is returned
- Strong consistency level (QUORUM, ALL): before the data is returned
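The repair step itself can be sketched as: collect the versions held by the replicas, pick the one with the newest timestamp, and overwrite anything older. The sketch below does this synchronously and does not model the timing difference between consistency levels. The function name `read_with_repair` and the non-negative integer timestamps are assumptions.

```python
def read_with_repair(replicas, key):
    """replicas is a list of dicts mapping key -> (value, timestamp)."""
    answers = [replica[key] for replica in replicas if key in replica]
    if not answers:
        return None
    # The version with the highest timestamp wins.
    newest = max(answers, key=lambda versioned: versioned[1])
    for replica in replicas:
        # A missing key counts as older than any real timestamp.
        if replica.get(key, (None, -1))[1] < newest[1]:
            replica[key] = newest
    return newest[0]


r1 = {"k": ("old", 1)}
r2 = {"k": ("new", 2)}
r3 = {}
result = read_with_repair([r1, r2, r3], "k")
```

After the call all three replicas hold the newest version, including the one that had no value at all.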

Bloom filters
- A bit array
- Tests whether a value is a member of a set
- Reduces disk access (improves performance)

Bloom filters
On write:
- several hashes are generated per key
- the bit for each hash is set
On read:
- hashes are generated for the key
- if all bits for these hashes are set, the key may exist in the SSTable
- if at least one bit is not set, the key has never been written to the SSTable

Bloom filters
[Figure: Key1 and Key2 are each run through Hash1, Hash2 and Hash3; the resulting positions in the bit array are set on write and checked on read before the SSTable is accessed]
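The write and read steps translate almost directly into code. The sketch below is a minimal Bloom filter. The array size, the number of hashes and the use of seeded SHA-256 as the hash family are assumptions chosen for readability rather than speed.

```python
import hashlib


class BloomFilter:
    def __init__(self, size=64, hash_count=3):
        self.size = size
        self.hash_count = hash_count
        self.bits = [0] * size

    def _positions(self, key):
        """Yield one bit position per hash function for the key."""
        for seed in range(self.hash_count):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        """On write: set the bit for each hash of the key."""
        for position in self._positions(key):
            self.bits[position] = 1

    def might_contain(self, key):
        """On read: False means never written; True means possibly present."""
        return all(self.bits[position] for position in self._positions(key))


bloom = BloomFilter()
bloom.add("Key1")
```

A True answer can be a false positive, so the SSTable still has to be read; a False answer is definitive and lets the read skip that SSTable.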
