Consistency in Distributed Systems

A stable data model provides numerous advantages in developing Big Data and NoSQL systems, especially when sharing data over the cloud. Adopting an immutable model eases much of the pain of achieving consistency, especially at great scale. There are trade-offs, however, that you need to be aware of.

This webinar will present details and examples of immutable data models as applied to various NoSQL systems, including MongoDB, Cloudant, Riak and Cassandra. The emphasis will be on the impact on application designers and architects, as well as the technical trade-offs and advantages. The discussion will be well grounded in real-world examples from within and beyond the enterprise.

Published in: Technology, Business

Consistency in Distributed Systems

  1. Consistency in Distributed Systems. Mike Miller, Co-Founder, Chief Scientist (@mlmilleratmit). 2014-06-12
  2. Want to learn more? P. Bailis: “Coordination and the Art of Scaling”
  3. {Introductions: ‘Me’} Background -- Big Systems
  4. Mobile, Big Data => Stress models for consistency, transactional reasoning
  5. This is your problem when…
      … data doesn’t fit on one server.
      … data replicated between servers (e.g. read slaves).
      … data spread between data centers.
      … state spread across more than one device (mobile!)
      … mixed workloads with concurrency.
      … state spread across more than one process.
  6. This is now everyone’s problem
  7. Good news — market response: NewSQL, NoSQL, Cloud, …
  8. Let’s view this from the developer’s perspective
  9. ships with a mobile strategy
  10. {Install: ‘Cloudant’} Step 1: You do this: Sign Up. Step 2: We give you: https://<username>.cloudant.com. Step 3: Done!
  11. {Cloudant: ‘API’} JSON Documents, Primary Index, Secondary Indexes, Search & Geospatial
  12. {Write: ‘Local’, Sync: ‘Later’} Embedded, Edge, Satellites; Desktop, Browser; Cloud
  13. {Grow: ‘More’} Multitenant or Dedicated. 30+ Locations: Softlayer, Rackspace, Azure, AWS, …
  14. So… How do you code for that? How does that compare to <X>? What about transactions?
  15. You do need to understand your datastore.
  16. http://www.wired.com/wiredenterprise/2012/08/google-as-xerox-parc/
  17. Google File System (2003) http://research.google.com/archive/gfs.html
      Google MapReduce (2004) http://research.google.com/archive/mapreduce.html
      Google BigTable (2006) http://research.google.com/archive/bigtable.html
      Amazon’s Dynamo (2007) http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
  19. {Sacrificed: ‘SPOFs’} Replaced with self-healing systems
  20. {Sacrificed: ‘Manual Sharding’} http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/spanner-osdi2012.pdf
  21. {Sacrificed: ‘Locks’} Block (optionally) on reads, not writes
  22. {Sacrificed: ‘Schemas’} “Schema on read”
  23. {Sacrificed: ‘Tables’} SQL and JSON poorly matched
  24. MySQL, MongoDB, CouchDB, SOLR, … Dynamo, Cloudant, Cassandra, Riak, …
  25. {Sacrificed: ‘Transactions’} Fundamental reason: CAP Theorem. http://www.bailis.org/papers/ramp-sigmod2014.pdf
  26. {Consistency: ‘Eventual’} Excellent high-level overview: https://amplab.cs.berkeley.edu/wp-content/uploads/2013/04/p20-bailis.pdf
  27. {Consistency: ‘Eventual’} https://amplab.cs.berkeley.edu/wp-content/uploads/2012/06/p776_peterbailis_vldb2012.pdf
  28. {Consistency: ‘Eventual’} “AP” “C”
  29. {Consistency: ‘Eventual’} FoM = Benefit - Cost*Rate. https://amplab.cs.berkeley.edu/wp-content/uploads/2013/04/p20-bailis.pdf
  30. {Consistency: ‘Eventual’} 3 minutes, 100 points (Dow Jones)
  31. {Consistency: ‘Eventual’} What is the penalty? Hedge strategy?
  32. {Strategy: ‘Immutability’} Write-only state machine
  33. {Spokesperson: ‘Rich Hickey’} http://www.infoq.com/presentations/Value-Values
  34. Immutability isn’t new
      ‣ “Accountants don’t use erasers”
      ‣ Functional, concurrent, distributed languages (e.g. Erlang)
      ‣ File systems (e.g. ZFS)
      ‣ Storage engines (e.g. LevelDB) & Databases (CouchDB, Datomic, …)
      ‣ Data model
  35. {Strategy 1: ‘Write Only’}
      ‣ Don’t update in place
      ‣ Keep old versions
      ‣ Query for newest version
      ‣ Even works for deletions (write a “tombstone”)
  36. {Strategy 2: ‘Minimize Contention’}
      ‣ Break out one-to-many, many-to-many relationships using foreign keys and links.
      ‣ Normalize! Learn your indexing options!
  37. {Strategy 3: ‘Think Commutative’}
      ‣ Store “deltas”, just like your checkbook
      ‣ Account Value via Materialized View
  38. {Strategy 3: ‘Think Commutative’} Commutative Replicated Data Types (2010) http://pagesperso-systeme.lip6.fr/Marc.Shapiro/papers/RR-6956.pdf
  39. Future Work
      ‣ Additional explicit data modeling examples
      ‣ Advanced reasoning for “AP” systems
      ‣ CRDTs
      ‣ Secondary index consistency, maintenance (RAMP)
      ‣ “New” transactional systems (HAT, Google Spanner)
      ‣ “Call me maybe”: http://aphyr.com/posts/281-call-me-maybe-carly-rae-jepsen-and-the-perils-of-network-partitions
  40. Keep Learning
  41. AMP on Consistency https://amplab.cs.berkeley.edu/tag/consistency/
  42. Thanks! cloudant.com, mike@cloudant.com, @mlmilleratmit, #Cloudant (IRC)
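
The ‘Write Only’ and ‘Think Commutative’ strategies in the slides above can be illustrated with a small in-memory sketch. This is not Cloudant's API or any real datastore's implementation: the names `WriteOnlyStore`, `Version`, `TOMBSTONE`, and `balances` are hypothetical, chosen only for illustration, and a single Python list stands in for the append-only log. The sketch assumes deltas are plain integers, so that summing them gives the same account value in any order.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional, Tuple

TOMBSTONE = object()  # hypothetical marker meaning "this key was deleted"


@dataclass(frozen=True)
class Version:
    """One immutable record; it is never modified after being appended."""
    key: str
    seq: int    # position in the append-only log
    value: Any


class WriteOnlyStore:
    """Write-only sketch: no in-place updates, old versions kept, newest read."""

    def __init__(self) -> None:
        self._log: List[Version] = []

    def put(self, key: str, value: Any) -> Version:
        # Every write appends a new version instead of overwriting.
        version = Version(key, len(self._log), value)
        self._log.append(version)
        return version

    def delete(self, key: str) -> Version:
        # A deletion is just another write: append a tombstone.
        return self.put(key, TOMBSTONE)

    def history(self, key: str) -> List[Version]:
        # All versions of a key, oldest first.
        return [v for v in self._log if v.key == key]

    def get(self, key: str) -> Optional[Any]:
        # Query for the newest version; a tombstone reads as "not found".
        versions = self.history(key)
        if not versions or versions[-1].value is TOMBSTONE:
            return None
        return versions[-1].value


def balances(deltas: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Derive account values by summing stored deltas, checkbook style.

    Integer addition is commutative and associative, so the result does not
    depend on the order in which the deltas arrive.
    """
    totals: Dict[str, int] = defaultdict(int)
    for account, amount in deltas:
        totals[account] += amount
    return dict(totals)
```

In this sketch `balances` plays the role of the materialized view: the account value is never stored directly, only recomputed from the deltas. The same property is what lets replicas that see the same set of deltas in different orders agree on the result.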