Dissolving the Problem: Making an ACID-Compliant Database Out of Apache Kafka

Speaker: Tim Berglund, Senior Director of Developer Experience, Confluent

It has become a truism in the past decade that building systems at scale, using non-relational databases, requires giving up on the transactional guarantees afforded by the relational databases of yore. ACID transactional semantics are fine, but we all know you can’t have them all in a distributed system. Or can we?

In this talk, I will argue that by designing our systems around a distributed log like Apache Kafka®, we can in fact achieve ACID semantics at scale. We can ensure that distributed write operations are applied atomically, consistently, in isolation between services, and of course with durability. This seemingly counterintuitive conclusion turns out to be straightforward to reach with existing technologies: an elusive set of properties becomes relatively easy to achieve once the right architectural paradigm underlies the application.
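
As a rough illustration of the log-centric idea (not code from the talk), the sketch below records a money transfer as one event on an append-only log instead of two separate row updates. Both sides of the transfer are applied together or not at all, and an event that would break an invariant is rejected before it reaches the log. Everything here is an assumption made for illustration: the names `TransferLog`, `Transfer`, and `submit` are hypothetical, the in-memory list merely stands in for a single Kafka topic partition, the "no negative balances" rule is an example invariant, and no Kafka client API is used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an in-memory, single-writer stand-in for one log partition.
class TransferLog {
    /** One immutable event that describes both sides of a transfer. */
    static final class Transfer {
        final String from;
        final String to;
        final long amount;

        Transfer(String from, String to, long amount) {
            this.from = from;
            this.to = to;
            this.amount = amount;
        }
    }

    private final List<Transfer> log = new ArrayList<>();       // append-only, ordered
    private final Map<String, Long> balances = new HashMap<>(); // state derived from the log

    TransferLog(Map<String, Long> openingBalances) {
        balances.putAll(openingBalances);
    }

    /**
     * Accepts the event only if the sender can cover it (assumed invariant:
     * balances never go negative). An accepted event debits and credits in
     * one step; a rejected event changes nothing.
     */
    boolean submit(Transfer t) {
        long fromBalance = balances.getOrDefault(t.from, 0L);
        if (t.amount <= 0 || fromBalance < t.amount) {
            return false; // invariant would be violated: nothing is appended
        }
        log.add(t);
        balances.put(t.from, fromBalance - t.amount);
        balances.merge(t.to, t.amount, Long::sum);
        return true;
    }

    long balanceOf(String user) {
        return balances.getOrDefault(user, 0L);
    }

    int logSize() {
        return log.size();
    }

    public static void main(String[] args) {
        Map<String, Long> opening = new HashMap<>();
        opening.put("gwenshap", 150L);
        TransferLog ledger = new TransferLog(opening);

        System.out.println(ledger.submit(new Transfer("gwenshap", "tlberglund", 100))); // true
        System.out.println(ledger.submit(new Transfer("gwenshap", "tlberglund", 100))); // false
        System.out.println(ledger.balanceOf("gwenshap") + " " + ledger.balanceOf("tlberglund")); // 50 100
    }
}
```

Because the log has a single writer and every reader sees events in the same order, the derived balances can always be rebuilt by replaying the accepted events from the opening state.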

Published in: Technology

  1. 1. Dissolving the Problem @tlberglund (making an ACID-compliant database out of Kafka)
  2. 2. https://www.amazon.com/dp/1449373321
  3. 3. What is a Database?
  4. 4. A program that remembers things.
  5. 5. A program that remembers things and has a data model.
  6. 6. A program that remembers things and has a data model and ACID transactional properties.
  7. 7. What is ACID?
  8. 8. Atomicity Consistency Isolation Durability
  9. 9. Durability
  10. 10. picture of tape https://www.flickr.com/photos/phrenologist/3252001011/
  11. 11. picture of disks https://www.flickr.com/photos/philipus/29711988683
  12. 12. Broker 4 Broker 3 Broker 2 Broker 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 3 Topic 1 Partition 4 Topic 1 Partition 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 2 Topic 1 Partition 3 Topic 1 Partition 3 Topic 1 Partition 4 Topic 1 Partition 4 Topic 1 Partition 4
  13. 13. Atomicity
  14. 14. Database Transaction BEGIN; UPDATE account SET balance = balance + 100 WHERE username = 'tlberglund'; UPDATE account SET balance = balance - 100 WHERE username = 'gwenshap'; COMMIT;
  15. 15. Application Events
  16. 16. Isolation
  17. 17. process 1 process 2 Is there a Tim? nope Is there a Tim? Awesome, make a Tim nope No prob! Okay, hoss Cool, make a Tim
  18. 18. process 1 process 2 Is there a Tim? nope Is there a Tim? Awesome, make a Tim nope No prob! Okay, hoss Cool, make a Tim
  19. 19. Consistency
  20. 20. Consistency • Invariants/constraints • Unique usernames • Account balances greater than zero
  21. 21. What is Kafka?
  22. 22. Topics
  23. 23. K V
  24. 24. K V
  25. 25. K V
  26. 26. K V
  27. 27. K V
  28. 28. K V
  29. 29. K V
  30. 30. K V • Log of events • Strict ordering guarantee • Constant-time reads and writes • Persistent on disk
  31. 31. Partitioning
  32. 32. ………
  33. 33. … … … Partition 0 Partition 1 Partition 2 K V Run the key through a hash function
  34. 34. … … … Partition 0 Partition 1 Partition 2 K V
  35. 35. … … … Partition 0 Partition 1 Partition 2 K V
  36. 36. … … … Partition 0 Partition 1 Partition 2 K V
  37. 37. … … … Partition 0 Partition 1 Partition 2 K V
  38. 38. • Provides scalable writes, storage, and consumption • Ordering is within partition only • Key selection becomes a data modeling concern … … …
  39. 39. Replication
  40. 40. Broker 4 Broker 3 Broker 2 Broker 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 3 Topic 1 Partition 4
  41. 41. Broker 4 Broker 3 Broker 2 Broker 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 3 Topic 1 Partition 4 Topic 1 Partition 1 Topic 1 Partition 1
  42. 42. Broker 4 Broker 3 Broker 2 Broker 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 3 Topic 1 Partition 4 Topic 1 Partition 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 2
  43. 43. Broker 4 Broker 3 Broker 2 Broker 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 3 Topic 1 Partition 4 Topic 1 Partition 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 2 Topic 1 Partition 3 Topic 1 Partition 3
  44. 44. Broker 4 Broker 3 Broker 2 Broker 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 3 Topic 1 Partition 4 Topic 1 Partition 1 Topic 1 Partition 1 Topic 1 Partition 2 Topic 1 Partition 2 Topic 1 Partition 3 Topic 1 Partition 3 Topic 1 Partition 4 Topic 1 Partition 4 Topic 1 Partition 4
  45. 45. Producers
  46. 46. … … … partition 0 partition 1 partition 2 Partitioned Topic producer
  47. 47. … … … partition 0 partition 1 partition 2 Partitioned Topic producer • A client application • Puts messages into topics • Handles partitioning, network protocol • Java, Go, .NET, C/C++, Python • Also every other language
  48. 48. Consumers
  49. 49. consumer A … … … partition 0 partition 1 partition 2 Partitioned Topic
  50. 50. consumer A consumer B … … … partition 0 partition 1 partition 2 Partitioned Topic
  51. 51. consumer A consumer B … … … partition 0 partition 1 partition 2 Partitioned Topic
  52. 52. consumer A consumer B … … … partition 0 partition 1 partition 2 Partitioned Topic consumer A
  53. 53. consumer A consumer B … … … partition 0 partition 1 partition 2 Partitioned Topic consumer A consumer A
  54. 54. consumer A consumer B … … … partition 0 partition 1 partition 2 Partitioned Topic consumer A consumer A • A client application • Reads messages from topics • Horizontally, elastically scalable (if stateless) • Java, Go, .NET, C/C++, Python, everything else
  55. 55. Kafka Streams
  56. 56. consumer A consumer B … … … partition 0 partition 1 partition 2 Partitioned Topic consumer A consumer A
  57. 57. consumer A consumer A consumer A
  58. 58. Turn streams into tables Enrich a stream with a table Aggregate streams Join one stream with another Scale stateful applications
  59. 59. Functional Java API Abstractions for streams and tables Scalable, fault-tolerant state
  60. 60. consumer A consumer A consumer A
  61. 61. Streams Application Streams Application Streams Application
  62. 62. • Java API • Filter, join, aggregate, etc. • Locates stream processing with your application • Scales like a Consumer Group (but better!) KTable<Long, Movie> movies = builder.table("movies", Materialized.<Long, Movie, KeyValueStore<Bytes, byte[]>>as("movies-store").withValueSerde(movieSerde).withKeySerde(Serdes.Long()));
  63. 63. KSQL
  64. 64. CREATE TABLE movie_ratings AS SELECT title, SUM(rating)/COUNT(rating) AS avg_rating, COUNT(rating) AS num_ratings FROM ratings LEFT OUTER JOIN movies ON ratings.movie_id = movies.movie_id GROUP BY title;
  65. 65. producer consumer KSQL Cluster KSQL Server KSQL Server
  66. 66. • Declarative stream processing language • Provides stream and table abstractions • Filter, join, aggregate • Run on horizontally scalable KSQL cluster CREATE TABLE movie_ratings AS SELECT title, SUM(rating)/COUNT(rating) AS avg_rating, COUNT(rating) AS num_ratings FROM ratings LEFT OUTER JOIN movies ON ratings.movie_id = movies.movie_id GROUP BY title;
  67. 67. Cool, but can I ACID with Kafka?
  68. 68. Durability
  69. 69. consumer A consumer B … … … partition 0 partition 1 partition 2 Partitioned Topic consumer A consumer A
  70. 70. Atomicity
  71. 71. Application Events
  72. 72. Application Events
  73. 73. Application Events
  74. 74. Application Events
  75. 75. Application Events
  76. 76. Application Events
  77. 77. Application Events
  78. 78. Application Events
  79. 79. Application Events
  80. 80. Application Events
  81. 81. Isolation
  82. 82. process 1 process 2 Is there a Tim? nope Is there a Tim? Awesome, make a Tim nope No prob! Okay, hoss Cool, make a Tim
  83. 83. process 1 process 2 I would like a Tim I would like a Tim Tim-1 Tim-2
  84. 84. process 1 process 2 Tim-1 Tim-2 Users Ale Yeva Vik
  85. 85. process 1 process 2 Tim-1 Tim-2 Users Ale Yeva Vik Tim
  86. 86. process 1 process 2 Users Ale Yeva Vik Tim Is there a Tim? yes Is there a Tim? yes
  87. 87. Consistency
  88. 88. What’s a database anyway?
  89. 89. SQL
  90. 90. Tabular Model
  91. 91. Storage Engine
  92. 92. Commit Log
  93. 93. You are not just writing microservices.
  94. 94. You are building an inside-out database
  95. 95. with ACID semantics
  96. 96. and that is a good thing.
  97. 97. Thank You! @tlberglund http://slackpass.io/confluentcommunity http://confluent.io/ksql
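
The partitioning slides (31 to 38) and the isolation slides (82 to 86) fit together in a way that a small sketch may make clearer. Under one plausible reading of those slides, both processes send an "I would like a Tim" request keyed by the username; hashing the key puts both requests in the same partition, and the strict ordering within that partition decides which one wins. The code below is only an in-memory approximation of that reading: `UsernameClaims`, `Claim`, `send`, and `resolve` are hypothetical names, the lists stand in for topic partitions, and `String.hashCode` stands in for Kafka's default partitioner (which hashes the serialized key bytes with murmur2).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: lists stand in for the partitions of one topic.
class UsernameClaims {
    /** A request for a username, tagged with the id of the request that made it. */
    static final class Claim {
        final String username;
        final String requestId;

        Claim(String username, String requestId) {
            this.username = username;
            this.requestId = requestId;
        }
    }

    private final List<List<Claim>> partitions = new ArrayList<>();

    UsernameClaims(int numPartitions) {
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Stand-in partitioner: the same key always maps to the same partition. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    /** Appends the claim to the partition chosen by its key (the username). */
    void send(Claim claim) {
        partitions.get(partitionFor(claim.username)).add(claim);
    }

    /**
     * Reads each partition in offset order. All claims for one username share
     * a partition, so the first claim seen for a name is the earliest one.
     */
    Map<String, String> resolve() {
        Map<String, String> owners = new HashMap<>();
        for (List<Claim> partition : partitions) {
            for (Claim claim : partition) {
                owners.putIfAbsent(claim.username, claim.requestId);
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        UsernameClaims topic = new UsernameClaims(3);
        topic.send(new Claim("Tim", "Tim-1")); // process 1
        topic.send(new Claim("Tim", "Tim-2")); // process 2
        System.out.println(topic.resolve().get("Tim")); // Tim-1
    }
}
```

Neither process has to ask "Is there a Tim?" before writing, which is the check-then-act race shown on slide 82; the order of the log settles the outcome. This also shows why key selection is a data modeling concern: the ordering guarantee only helps if competing requests share a key.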
