3. Entity-based approach
• SELECT * FROM orders WHERE state = 'Dirty'
OrderID   State            Others
x         ‘Dirty’          ..
y         ‘Synchronized’   ..
z         ‘Completed’      ..
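The entity-based query above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the schema, the index, and the helper names `dirty_orders` and `set_state` are assumptions, since the slide names only the OrderID and State columns.

```python
import sqlite3

# Hypothetical schema: only OrderID and State appear on the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
)
# An index on state lets the lookup touch only the matching rows.
conn.execute("CREATE INDEX idx_orders_state ON orders (state)")
conn.executemany(
    "INSERT INTO orders (order_id, state) VALUES (?, ?)",
    [("x", "Dirty"), ("y", "Synchronized"), ("z", "Completed")],
)


def dirty_orders(conn):
    """Return the ids of all orders whose current state is 'Dirty'."""
    rows = conn.execute(
        "SELECT order_id FROM orders WHERE state = ? ORDER BY order_id",
        ("Dirty",),
    )
    return [row[0] for row in rows]


def set_state(conn, order_id, state):
    """Change an order's state in place: an update, not a delete plus insert."""
    conn.execute(
        "UPDATE orders SET state = ? WHERE order_id = ?", (state, order_id)
    )
```

Because the state is updated in place, the work done by `dirty_orders` depends on how many orders are dirty now, not on how often they changed state in the past.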
4. What we tried
• Partition key: ‘Status’, clustering key: ‘Order Id’
• A status change was a delete followed by an insert
• Every delete left a tombstone
5. tombstones
• t0(‘dirty’), t1(‘synch’), t2(‘dirty’), …, tn
• n tombstones for the same record.
• They stay for 10 days (Cassandra’s default gc_grace_seconds)
• Queries still read them, even when the current state is not ‘dirty’
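The effect can be illustrated with a toy in-memory model. This is not Cassandra and uses no Cassandra API: the class `StatusPartitionedTable` and its methods are hypothetical, and purging after the grace period is deliberately left out. It only mimics the two facts from the slides: a status change is a delete plus an insert, and every delete leaves a tombstone that a later read has to step over.

```python
class StatusPartitionedTable:
    """Toy model of a table partitioned by status, clustered by order id.

    Assumption for illustration only: each partition is a list of cells in
    write order, and tombstones are never purged (no grace period modeled).
    """

    def __init__(self):
        # status -> list of (order_id, is_tombstone) in write order
        self.partitions = {}

    def _write(self, status, order_id, is_tombstone):
        self.partitions.setdefault(status, []).append((order_id, is_tombstone))

    def change_status(self, order_id, old_status, new_status):
        """Move an order between partitions: delete from old, insert into new."""
        if old_status is not None:
            self._write(old_status, order_id, True)   # the delete: a tombstone
        self._write(new_status, order_id, False)      # the insert: a live cell

    def read(self, status):
        """Return (live order ids, tombstones scanned) for one partition."""
        alive = {}
        tombstones = 0
        for order_id, is_tombstone in self.partitions.get(status, []):
            if is_tombstone:
                tombstones += 1
            alive[order_id] = not is_tombstone  # the latest write wins
        live = sorted(oid for oid, is_live in alive.items() if is_live)
        return live, tombstones


table = StatusPartitionedTable()
table.change_status("x", None, "dirty")
for _ in range(3):
    table.change_status("x", "dirty", "synch")
    table.change_status("x", "synch", "dirty")
table.change_status("x", "dirty", "synch")  # x ends up synchronized
```

In this run the ‘dirty’ partition holds no live order, yet reading it still walks over one tombstone for every time x left the dirty state.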
9. Log
• Append only
• Totally-ordered
• Append at end (order of occurrence = order of insert)
• e.g. change log, application log, Kafka
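The three properties can be sketched in a few lines of Python. The class `AppendOnlyLog` and the method names `append` and `get_next` are made up for illustration and do not correspond to any particular product's API.

```python
class AppendOnlyLog:
    """Minimal in-memory log: records are only ever added at the end."""

    def __init__(self):
        self._entries = []

    def append(self, record):
        """Add a record at the end and return its offset."""
        self._entries.append(record)
        return len(self._entries) - 1

    def get_next(self, offset):
        """Return (record, next_offset), or None when offset is at the end."""
        if 0 <= offset < len(self._entries):
            return self._entries[offset], offset + 1
        return None


log = AppendOnlyLog()
first = log.append("order x dirtied")
second = log.append("order y dirtied")

# A consumer remembers only its own offset and asks for the next record.
consumed = []
position = 0
while (item := log.get_next(position)) is not None:
    record, position = item
    consumed.append(record)
```

Offsets are assigned in arrival order, so the order of occurrence is the order of insertion, and fetching the next record is a single indexed read.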
10. Comparison
Approach                Complexity
SQL-based query         O(# of dirty orders)
Cassandra-based query   O(# of times dirtied in the last 10 days)
Cassandra-based log     O(1)
11. Log vs. Log*
                             Log               Log*
Inserts                      No intermediate   Intermediate
Lookup                       offset            key
Query                        getNext           key, getNext*, range query
Key vs. offset distinction   yes               no
getNext implementation       native            simulated
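One possible reading of the Log* column, sketched in Python: entries are addressed by a sortable key rather than by an offset, a key may be inserted between two existing ones, and getNext* is simulated by finding the smallest key greater than the current one. The class `KeyedLog` and all of its method names are assumptions made for illustration; the slides do not show how Log* was implemented.

```python
import bisect


class KeyedLog:
    """Toy Log*: a sorted key-value store standing in for a log."""

    def __init__(self):
        self._keys = []    # sorted list of keys
        self._values = {}  # key -> value

    def insert(self, key, value):
        """Insert anywhere in key order, including between existing keys."""
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def lookup(self, key):
        """Look an entry up by key; there is no separate offset."""
        return self._values[key]

    def range_query(self, start_key, end_key):
        """Return all (key, value) pairs with start_key <= key <= end_key."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_right(self._keys, end_key)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]

    def get_next(self, key):
        """Simulated getNext*: the entry with the smallest key above `key`."""
        i = bisect.bisect_right(self._keys, key)
        if i < len(self._keys):
            next_key = self._keys[i]
            return next_key, self._values[next_key]
        return None


keyed = KeyedLog()
keyed.insert(10, "a")
keyed.insert(30, "c")
keyed.insert(20, "b")  # an intermediate insert, not possible in a plain log
```

Because a later insert can land before a position a reader has already passed, a consumer that relies on `get_next` alone may miss it. That is the trade-off the "Intermediate" inserts row points at.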