Cassandra - Beyond Read-Modify-Write by Al Tobey

YouTube Video: TBA
As we move into the world of Big Data and the Internet of Things, the systems architectures and data models we've relied on for decades are becoming a hindrance. At the core of the problem is the read-modify-write cycle. In this session, Al will talk about how to build systems that don't rely on RMW, with a focus on Cassandra. Finally, for those times when RMW is unavoidable, he will cover how and when to use Cassandra's lightweight transactions and collections.

Published in: Technology

Transcript

  • 1. Beyond Read-Modify-Write. @AlTobey, Open Source Mechanic | DataStax. Open source evangelist for Apache Cassandra. ©2014 DataStax. Obsessed with infrastructure my whole life. See distributed systems everywhere.
  • 2. The Problem: Skype video call with grandma at 3 hours. Using Netflix solo at 18 months. Plays many games with online metrics at 4 years old.
  • 3. The Problem: Users expect their infrastructure to Just Work. Explain to my 4-year-old why he can’t watch cartoons on Netflix.
  • 4. The Problem: Lag kills.
  • 5. The Problem: Even when everything is working, or even especially then, usage can explode, often unexpectedly.
  • 6. Evolution: A high-level overview of internet architectures: client/server, classic 3-tier, 3-tier + read-scaled DB + cache.
  • 7. Client-server: Client - server database architecture. Obsolete. The original “funnel-shaped” architecture.
  • 8. 3-tier: Client - client - server database architecture. Still suitable for small applications, e.g. LAMP, RoR.
  • 9. 3-tier master/slave: Client - client - server database architecture. Still suitable for small applications. [Diagram labels: master, slave, slave]
  • 10. 3-tier + caching: More complex. Cache coherency is a hard problem. Cascading failures are common. Next: out, the funnel; in, the ring. [Diagram labels: cache, master, slave, slave]
  • 11. Webscale: Outer ring: clients (cell phones, etc.). Middle ring: application servers. Inner ring: Cassandra servers. Serving millions of clients with mere hundreds or thousands of nodes requires a different approach to applications!
  • 12. When it Rains: scale out … but at a cost
  • 13. Beyond Read-Modify-Write • Practical Safety • Eventual Consistency • Overwrites • Key / Value • Journal / Logging / Time-series • Content-addressable storage • Cassandra Collection Types • Cassandra Lightweight Transactions
  • 14. Theory & Practice: “In theory there is no difference between theory and practice. In practice there is.” -Yogi Berra. I could talk about the CAP theorem, but there’s plenty of that going around. And then there’s this quote.
  • 15. Safety: Fun happens when you turn the safety off. But you can’t sell it.
  • 16. Safety: But you don’t have to throw it out entirely: roll cages, kill switches, rev limiters, protective clothing.
  • 17. Safety: Max speed is 320 km/h (200 mph). Tested to 443 km/h (275 mph) & 581 km/h (361 mph) (world record). Safety is ultimately a property of the system. But it can be expensive: lots of maintenance, inspections, reputation.
  • 18. Read-Modify-Write
    UPDATE Employees SET Rank=4, Promoted='2014-01-24' WHERE EmployeeID=1337;
    Before: EmployeeID 1337, Name アルトビー (Al Tobey), StartDate 2013-10-01, Rank 3, Promoted null
    After: EmployeeID 1337, Name アルトビー (Al Tobey), StartDate 2013-10-01, Rank 4, Promoted 2014-01-24
    This might be what it looks like from SQL / CQL, but …
  • 19. Read-Modify-Write (RDBMS)
    Same UPDATE statement and before/after rows as slide 18.
    TANSTAAFL: there ain’t no such thing as a free lunch … If you’re lucky, the cell is in cache. Otherwise, it’s a disk access to read, another to write.
  • 20. Eventual Consistency
    Same UPDATE statement and before/after rows as slide 18. [Diagram label: Coordinator]
    Explain distributed RMW. More complicated. Will talk about how it’s abstracted in CQL later.
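One way to see why uncoordinated read-modify-write gets complicated is the classic lost update. The Ruby sketch below is an illustration added to this transcript, not code from the talk: `interleaved_promotions` and the in-memory `row` hash are hypothetical stand-ins for two clients and a database record.

```ruby
# Simulates two clients doing read-modify-write on one row with no coordination.
# `row` is a hypothetical in-memory stand-in for a database record.
def interleaved_promotions(row)
  seen_by_a = row[:rank]       # client A reads
  seen_by_b = row[:rank]       # client B reads before A has written
  row[:rank] = seen_by_a + 1   # client A writes its modified copy
  row[:rank] = seen_by_b + 1   # client B overwrites A's write
  row[:rank]
end

puts interleaved_promotions({ rank: 3 })  # prints 4, not 5: one promotion was lost
```

Both clients issued a promotion, yet the row only moved by one step. Avoiding the read entirely, as the following slides do, sidesteps this class of bug.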
  • 21. Eventual Consistency
    Same UPDATE statement and before/after rows as slide 18. [Diagram labels: Coordinator, read, write]
    Memory replication on write, depending on the replication factor (RF), usually RF=3. Reads AND writes remain available through network partitions. Hinted handoff.
  • 22. Overwriting
    CREATE TABLE host_lookup ( name varchar, id uuid, PRIMARY KEY(name) );
    INSERT INTO host_lookup (name, id) VALUES ('www.tobert.org', 463b03ec-fcc1-4428-bac8-80ccee1c2f77);
    INSERT INTO host_lookup (name, id) VALUES ('tobert.org', 463b03ec-fcc1-4428-bac8-80ccee1c2f77);
    INSERT INTO host_lookup (name, id) VALUES ('www.tobert.org', 463b03ec-fcc1-4428-bac8-80ccee1c2f77);
    SELECT id FROM host_lookup WHERE name='tobert.org';
    Best for: small indexes, lookup tables. Beware of expensive compaction. Compaction handles RMW at the storage level in the background. Under heavy writes, clock synchronization is very important to avoid timestamp collisions. In practice, this isn’t a problem very often, and even when it goes wrong, not much harm is done.
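The overwrite model relies on write timestamps to decide which value survives once duplicates are merged. The sketch below shows last-write-wins reconciliation in plain Ruby, assuming every cell carries a timestamp. `Cell`, `reconcile`, and `compact` are hypothetical names, and the tie-break rule is an assumption of this sketch rather than a description of Cassandra's storage engine.

```ruby
# A cell is a value plus the timestamp attached to the write.
Cell = Struct.new(:value, :timestamp)

# Keep the cell with the newer timestamp. On a tie, pick the larger value
# so the outcome is deterministic (an assumption made for this sketch).
def reconcile(a, b)
  if a.timestamp == b.timestamp
    a.value >= b.value ? a : b
  else
    a.timestamp > b.timestamp ? a : b
  end
end

# Fold every stored version of one cell into the single survivor.
def compact(versions)
  versions.reduce { |winner, cell| reconcile(winner, cell) }
end

versions = [
  Cell.new('old-id', 100),
  Cell.new('new-id', 200),
  Cell.new('stale-id', 150)
]
puts compact(versions).value  # prints new-id
```

Because the merge depends only on timestamps, writers never need to read first. A timestamp collision simply means one of the two writes is silently dropped, which is why clock synchronization matters under heavy writes.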
  • 23. Key/Value
    CREATE TABLE keyval ( key varchar, value blob, PRIMARY KEY(key) );
    INSERT INTO keyval (key, value) VALUES (?, ?);
    SELECT value FROM keyval WHERE key=?;
    e.g. memcached. Don’t do this. But it works when you really need it.
  • 24. Journaling / Logging / Time-series
    CREATE TABLE tsdb ( time_bucket timestamp, time timestamp, value blob, PRIMARY KEY(time_bucket, time) );
    INSERT INTO tsdb (time_bucket, time, value) VALUES (
      '2014-10-24',                  -- 1-day bucket (UTC)
      '2014-10-24T12:12:12Z',        -- ALWAYS USE UTC
      textAsBlob('{"foo": "bar"}')   -- value is a blob, so convert the text
    );
    Oversimplified, use normalization over blobs whenever possible. ALWAYS USE UTC :)
  • 25. Journaling / Logging / Time-series
    [Diagram: one wide row per day bucket, one column per timestamp]
    2014-01-24: 2014-01-24T12:12:12Z => {"key": "value"}, 2014-01-24T21:21:21Z => {"key": "value"}
    2014-01-25: 2014-01-25T13:13:13Z => {"key": "value"}
    { "2014-01-24" => { "2014-01-24T12:12:12Z" => { '{"foo": "bar"}' } } }
    Oversimplified, use normalization over blobs whenever possible. ALWAYS USE UTC :)
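The bucketing scheme on these two slides can be sketched in a few lines of Ruby. This is a minimal illustration assuming 1-day buckets as in the slide. `day_bucket` and `bucketed_key` are hypothetical helper names, not part of any driver.

```ruby
require 'time'

# Truncate a timestamp to the start of its UTC day; this is the partition key.
def day_bucket(time)
  utc = time.getutc
  Time.utc(utc.year, utc.month, utc.day)
end

# Build the (time_bucket, time) pair for one event as ISO 8601 UTC strings.
def bucketed_key(time)
  [day_bucket(time).iso8601, time.getutc.iso8601]
end

# 23:30 on Jan 24 in UTC-5 is already Jan 25 in UTC.
event = Time.parse('2014-01-24T23:30:00-05:00')
p bucketed_key(event)  # prints ["2014-01-25T00:00:00Z", "2014-01-25T04:30:00Z"]
```

Converting to UTC before truncating keeps every writer in agreement about which bucket an event belongs to, regardless of the local time zone of the machine that recorded it.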
  • 26. Content Addressable Storage
    CREATE TABLE objects ( cid varchar, content blob, PRIMARY KEY(cid) );
    INSERT INTO objects (cid, content) VALUES (?, ?);
    SELECT content FROM objects WHERE cid=?;
    The address of the data can be created by using the data itself, e.g. SHA1 (160 bits), MD5, Whirlpool, etc.
  • 27. Content Addressable Storage
    require 'cql'
    require 'digest/sha1'
    dbh = Cql::Client.connect(hosts: ['127.0.0.1'])
    dbh.use('cas')
    data = { :timestamp => 1390436043, :value => 1234 }
    # hex-encoded SHA1 of the serialized data is the content ID
    cid = Digest::SHA1.hexdigest(data.to_s)
    sth = dbh.prepare('SELECT content FROM objects WHERE cid=?')
    sth.execute(cid).first['content']
    Oversimplified! e.g. data.to_s is a BAD idea. ALWAYS USE UTC :)
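Since the slide warns that `data.to_s` is a bad basis for a content ID, here is one possible alternative, sketched without a database. It assumes flat hashes and serializes them as JSON with sorted keys so equal data always produces the same bytes. `canonical_bytes`, `content_id`, and `ObjectStore` are hypothetical names, and the in-memory hash merely stands in for the `objects` table.

```ruby
require 'digest/sha1'
require 'json'

# Serialize with keys sorted so equal data always yields identical bytes.
# Handles flat hashes only; nested structures would need recursive sorting.
def canonical_bytes(data)
  JSON.generate(data.sort_by { |key, _| key.to_s }.to_h)
end

# The content ID is the hex SHA-1 of the canonical bytes.
def content_id(data)
  Digest::SHA1.hexdigest(canonical_bytes(data))
end

# Hypothetical in-memory stand-in for the objects table.
class ObjectStore
  def initialize
    @objects = {}
  end

  # Storing is idempotent: the same content always lands on the same key.
  def put(data)
    cid = content_id(data)
    @objects[cid] = canonical_bytes(data)
    cid
  end

  def get(cid)
    @objects[cid]
  end
end

store = ObjectStore.new
cid = store.put({ timestamp: 1390436043, value: 1234 })
puts store.get(cid)  # prints {"timestamp":1390436043,"value":1234}
puts cid == content_id({ value: 1234, timestamp: 1390436043 })  # prints true
```

Because the key is derived from the content, writing the same object twice is a harmless overwrite with identical bytes, so no read is needed before the write.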
  • 28. In Practice • In practice, RMW is sometimes unavoidable • Recent versions of Cassandra support RMW operations • Use them only when necessary • Or when the performance hit is mitigated elsewhere or irrelevant
  • 29. Cassandra Collections
    CREATE TABLE posts ( id uuid, body varchar, created timestamp, authors set<varchar>, tags set<varchar>, PRIMARY KEY(id) );
    INSERT INTO posts (id, body, created, authors, tags) VALUES (
      ea4aba7d-9344-4d08-8ca5-873aa1214068,
      'アルトビーの犬はばかね',        -- "Al Tobey's dog is silly"
      dateof(now()),
      {'アルトビー', 'ィオートビー'},   -- set literals use braces
      {'dog', 'silly', '犬', 'ばか'}
    );
    Quick story about 犬ばかね (“silly dog”). Sets & maps are CRDTs, safe to modify.
  • 30. Cassandra Collections
    CREATE TABLE metrics ( bucket timestamp, time timestamp, value blob, labels map<varchar,varchar>, PRIMARY KEY(bucket) );
    Sets & maps are CRDTs, safe to modify.
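The claim that sets are safe to modify without reading first can be illustrated with a very small model: if replicas merge by union, the order in which additions arrive does not matter. This Ruby sketch covers additions only and is a simplification, not Cassandra's implementation. `merge_replicas` is a hypothetical name.

```ruby
require 'set'

# Merge two replicas' views of a set column by taking their union.
# This models additions only; removals are left out of the sketch.
def merge_replicas(left, right)
  left | right
end

replica_a = Set['dog', 'silly']
replica_b = Set['silly', 'puppy']

merged_ab = merge_replicas(replica_a, replica_b)
merged_ba = merge_replicas(replica_b, replica_a)
puts merged_ab == merged_ba  # prints true
```

Union is commutative and idempotent, so replicas that see the same additions in different orders, or see some of them twice, still converge on the same set.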
  • 31. Lightweight Transactions • Cassandra 2.0 and later support LWT based on Paxos • Paxos is a distributed consensus protocol • Given a constraint, Cassandra ensures correct ordering
  • 32. Lightweight Transactions
    UPDATE users SET username='tobert' WHERE id=68021e8a-9eb0-436c-8cdd-aac629788383 IF username='renice';
    INSERT INTO users (id, username) VALUES (68021e8a-9eb0-436c-8cdd-aac629788383, 'renice') IF NOT EXISTS;
    Client error on conflict.
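The conditional statements on this slide behave like compare-and-set: the write happens only if the condition still holds. The Ruby sketch below mimics that contract in a single process with a mutex. It is an analogy for the semantics only and says nothing about how Paxos achieves them across nodes. `UserTable` and its methods are hypothetical names.

```ruby
# Hypothetical single-process stand-in for the users table.
class UserTable
  def initialize
    @rows = {}
    @lock = Mutex.new
  end

  # Like INSERT ... IF NOT EXISTS: true only if the row was created.
  def insert_if_not_exists(id, username)
    @lock.synchronize do
      return false if @rows.key?(id)
      @rows[id] = username
      true
    end
  end

  # Like UPDATE ... IF username = expected: true only if the write applied.
  def update_if(id, expected, new_username)
    @lock.synchronize do
      return false unless @rows.key?(id) && @rows[id] == expected
      @rows[id] = new_username
      true
    end
  end

  def username(id)
    @lock.synchronize { @rows[id] }
  end
end

users = UserTable.new
id = '68021e8a-9eb0-436c-8cdd-aac629788383'
puts users.insert_if_not_exists(id, 'renice')    # prints true
puts users.insert_if_not_exists(id, 'someone')   # prints false, row already exists
puts users.update_if(id, 'renice', 'tobert')     # prints true
puts users.username(id)                          # prints tobert
```

The check and the write happen as one step, which is what removes the race in a plain read-modify-write. The price is coordination on every such operation, so it is best kept for the cases that truly need it.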
  • 33. Conclusion • Businesses are scaling further and faster than ever • Assume you have to provide utility-grade service • Data models and application architectures need to change to keep up • Avoiding Read/Modify/Write makes high performance easier • Cassandra provides tools for safe RMW when you need it • Questions?
