
Transaction Support in Pulsar 2.5.0

At the Apache Pulsar Beijing Meetup, Sijie Guo and Yong Zhang gave a preview of transaction support in Pulsar 2.5.0. Sijie Guo started with the current state of messaging semantics in Pulsar and covered the implementation of message deduplication introduced by PIP-6. He then explained why transactions are needed and how they are implemented in Pulsar. Finally, Yong walked through the whole transaction execution flow.


  1. 1. Transaction Support in Pulsar. Sijie Guo (Apache Pulsar / BookKeeper PMC Member), Yong Zhang (Apache Pulsar Contributor)
  2. 2. • At-most once • At-least once • Exactly once Messaging Semantics
  3. 3. • At-most once • At-least once • Exactly once Messaging Semantics Before 1.20.0-incubating
  4. 4. • At-most once • At-least once • Exactly once Messaging Semantics PIP-6: Guaranteed Message Deduplication
  5. 5. Revisit Existing Semantics
  6. 6. Pulsar’s Existing Semantics Log Broker Producer send(m1)
  7. 7. Pulsar’s Existing Semantics Log Broker Producer append(m1)
  8. 8. Pulsar’s Existing Semantics Log Broker Producer m1
  9. 9. Pulsar’s Existing Semantics Log Broker Producer ack(m1) m1
  10. 10. Pulsar’s Existing Semantics Log Broker Producer ack(m1) m1
  11. 11. Pulsar’s Existing Semantics Log Broker Producer send(m2) m1
  12. 12. Pulsar’s Existing Semantics Log Broker Producer append(m2) m1 m2
  13. 13. Pulsar’s Existing Semantics Log Broker Producer m1 m2 ack(m2)
  14. 14. Pulsar’s Existing Semantics Log Broker Producer m1 m2 ack(m2) What do we do now?
  15. 15. At Least Once Log Broker Producer m1 m2 send(m2)
  16. 16. At Least Once Log Broker Producer m1 m2 append(m2) m2
  17. 17. At Least Once Log Broker Producer m1 m2 append(m2) m2 Duplicates !!
  18. 18. • Broker can fail • The request from Producer to Broker can fail • Producer or Consumer can fail Why are duplicates introduced?
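The duplicate on the At Least Once slides can be reproduced with a small in-memory model. This is a minimal sketch, not Pulsar code: `AtLeastOnceLog`, `send`, and `sendWithRetry` are hypothetical names, and the lost ack is simulated with a boolean flag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a broker log; not Pulsar code.
class AtLeastOnceLog {
    final List<String> entries = new ArrayList<>();

    // The broker always appends; the ack may be lost on the way back.
    boolean send(String msg, boolean ackLost) {
        entries.add(msg);
        return !ackLost;
    }

    // The producer cannot tell "not appended" from "ack lost", so it resends.
    void sendWithRetry(String msg, boolean loseFirstAck) {
        boolean acked = send(msg, loseFirstAck);
        while (!acked) {
            acked = send(msg, false);
        }
    }

    public static void main(String[] args) {
        AtLeastOnceLog log = new AtLeastOnceLog();
        log.sendWithRetry("m1", false);
        log.sendWithRetry("m2", true); // the ack for m2 is lost once
        System.out.println(log.entries); // prints [m1, m2, m2]
    }
}
```

The retry is what gives at-least-once delivery; without some identity on the message, the broker has no way to recognise the second `m2`.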
  19. 19. I want exactly-once
  20. 20. • Producer: Idempotent Producer • Broker: Guaranteed Message Deduplication (PIP-6) • Consumer: Reader + Checkpoints (Flink / Spark) Message Deduplication
  21. 21. • Producer Name - Identify who is producing the messages • Sequence ID - Identify the message • Producer Name + Sequence ID: The unique identifier for a message Idempotent Producer
  22. 22. • Broker maintains a map between Producer Name and Last-Produced-Sequence-ID • Broker accepts a message if its sequence ID is larger than the last produced sequence ID for that producer • Broker treats a message whose sequence ID is smaller than or equal to the last produced sequence ID as a duplicate • Broker keeps the map in a de-duplication cursor (stored in BookKeeper) Guaranteed Message Deduplication
  23. 23. Exactly Once Log Broker Producer send(1, m1)
  24. 24. Exactly Once Log Broker Producer append(1, m1) 1,m1
  25. 25. Exactly Once Log Broker Producer append(2, m2) 1,m1 2,m2
  26. 26. Exactly Once Log Broker Producer 1,m1 2,m2 ack(2, m2) What do we do now?
  27. 27. Exactly Once Log Broker Producer 1,m1 2,m2 send(2, m2)
  28. 28. Exactly Once Log Broker Producer 1,m1 2,m2 append(2, m2)
  29. 29. Exactly Once Log Broker Producer 1,m1 2,m2 append(2, m2) Duplicate detected
  30. 30. Exactly Once Log Broker Producer 1,m1 2,m2 ack(2, m2)
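The exchange on the Exactly Once slides (send, lost ack, resend, duplicate detected) can be sketched with the broker-side map described under Guaranteed Message Deduplication. This is an illustrative model only: `DedupBroker` and its fields are hypothetical names, and the real broker persists the map in a de-duplication cursor rather than a `HashMap`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of broker-side deduplication; not the Pulsar implementation.
class DedupBroker {
    final List<String> log = new ArrayList<>();
    // Producer name -> last produced sequence ID.
    final Map<String, Long> lastSequenceId = new HashMap<>();

    // Returns true if the message was appended, false if it was a duplicate.
    // In both cases the producer would receive an ack.
    boolean append(String producerName, long sequenceId, String payload) {
        long last = lastSequenceId.getOrDefault(producerName, -1L);
        if (sequenceId <= last) {
            return false; // duplicate detected, nothing is written
        }
        log.add(sequenceId + "," + payload);
        lastSequenceId.put(producerName, sequenceId);
        return true;
    }

    public static void main(String[] args) {
        DedupBroker broker = new DedupBroker();
        broker.append("producer-a", 1, "m1");
        broker.append("producer-a", 2, "m2");
        // The ack for (2, m2) was lost, so the producer resends it.
        boolean accepted = broker.append("producer-a", 2, "m2");
        System.out.println(accepted + " " + broker.log); // prints false [1,m1, 2,m2]
    }
}
```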
  31. 31. • `bin/pulsar-admin namespaces set-deduplication -e tenant/namespace` • Set producer name when creating a Producer • Specify increasing sequence id when producing messages Enable Exactly Once
  32. 32. • It only works when producing messages to one partition • It only works for producing one message • There is no atomicity when producing multiple messages to one partition or many partitions • Consumers are required to store the MessageId along with its state and seek back to the MessageId when restoring the state Limitations
  33. 33. Introducing Transactions
  34. 34. PulsarCash
  35. 35. PulsarCash Transfer $10 Alice Bob
  36. 36. • Transfer Topic : record the transfer requests • Cash Transfer Function: perform the cash transfer action • BalanceUpdate Topic: record the balance-update requests PulsarCash, powered by Apache Pulsar
  37. 37. PulsarCash Cash Transfer Function Balance user:alice, debit($10) balance update balance update user:bob, credit($10) (100,0,0): transfer($10, alice -> bob) Transfer Topic
  38. 38. Ack Transfer Cash Transfer function Balance user:alice, debit($10) balance update balance update user:bob, credit($10) (100,0,0): transfer($10, alice -> bob) Ack: (100,0,0)
  39. 39. Reprocessed Transfer! Cash Transfer function Balance user:alice, debit($10) balance update balance update user:bob, credit($10) (100,0,0): transfer($10, alice -> bob) Ack: (100,0,0)
  40. 40. Lost Money! Cash Transfer function Balance user:alice, debit($10) balance update balance update user:bob, credit($10) (100,0,0): transfer($10, alice -> bob) Ack: (100,0,0)
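The two failure cases on the PulsarCash slides follow from the debit, the credit, and the ack being three independent operations. The sketch below is a toy model under that assumption: `CashTransferSim` is a hypothetical name, each operation is given its own success flag, and a missing ack is modelled as one full redelivery.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the cash transfer function without transactions.
class CashTransferSim {
    final List<String> balanceUpdates = new ArrayList<>();

    // One processing attempt; each step can fail independently.
    // Returns true if the input transfer was acknowledged.
    boolean attempt(boolean debitOk, boolean creditOk, boolean ackOk) {
        if (debitOk) balanceUpdates.add("alice:debit(10)");
        if (creditOk) balanceUpdates.add("bob:credit(10)");
        return ackOk;
    }

    // An unacknowledged transfer is redelivered and processed again.
    void handle(boolean debitOk, boolean creditOk, boolean ackOk) {
        if (!attempt(debitOk, creditOk, ackOk)) {
            attempt(true, true, true);
        }
    }

    public static void main(String[] args) {
        CashTransferSim reprocessed = new CashTransferSim();
        reprocessed.handle(true, true, false); // ack lost: transfer applied twice
        System.out.println(reprocessed.balanceUpdates.size()); // prints 4

        CashTransferSim lostMoney = new CashTransferSim();
        lostMoney.handle(true, false, true); // credit lost but input acked
        System.out.println(lostMoney.balanceUpdates); // prints [alice:debit(10)]
    }
}
```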
  41. 41. Pulsar Transaction Explained
  42. 42. • Atomic writes across multiple partitions • Atomic acknowledgements across multiple subscriptions • All the actions made within one transaction either all succeed or all fail • Consumers are *ONLY* allowed to read committed messages Transaction Semantics
  43. 43. Message<String> message = inputConsumer.receive(); CompletableFuture<MessageId> sendFuture1 = producer1.newMessage().value("output-message-1").sendAsync(); CompletableFuture<MessageId> sendFuture2 = producer2.newMessage().value("output-message-2").sendAsync(); inputConsumer.acknowledgeAsync(message.getMessageId()); Without Transaction API
  44. 44. Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Data Log Data Log Pulsar Client Input Consumer Producer 1 Producer 2 0) Receive Message 1) Produce Messages 2) Ack Messages
  45. 45. Message<String> message = inputConsumer.receive(); Transaction txn = client.newTransaction().withTransactionTimeout(…).build().get(); CompletableFuture<MessageId> sendFuture1 = producer1.newMessage(txn).value("output-message-1").sendAsync(); CompletableFuture<MessageId> sendFuture2 = producer2.newMessage(txn).value("output-message-2").sendAsync(); inputConsumer.acknowledgeAsync(message.getMessageId(), txn); txn.commit().get(); MessageId msgId1 = sendFuture1.get(); MessageId msgId2 = sendFuture2.get(); Transaction API
  46. 46. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2
  47. 47. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2
  48. 48. • TC: transaction manager, coordinating committing and aborting transactions • In-Memory + Transaction Log • Transaction Log is powered by a partitioned Pulsar topic • `pulsar/system/__transaction_coordinator_log` • Locating a TC is locating a partition of the transaction log topic Transaction Coordinator (TC)
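The "In-Memory + Transaction Log" design on the Transaction Coordinator slide can be sketched as a state map that is always written to a log first and rebuilt from that log on recovery. This is a simplified model, not the Pulsar implementation: `CoordinatorSketch` is a hypothetical name, a `List<String>` stands in for the transaction log topic, and the `OPEN`, `ABORTING`, and `ABORTED` states are assumptions (the slides only show Committing and Committed).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a transaction coordinator; not Pulsar code.
class CoordinatorSketch {
    enum Status { OPEN, COMMITTING, COMMITTED, ABORTING, ABORTED }

    final List<String> txnLog;                 // stand-in for the log topic
    final Map<Long, Status> inMemory = new HashMap<>();
    long nextTxnId = 0;

    // Recovery: replay every "txnId:STATUS" record to rebuild the state.
    CoordinatorSketch(List<String> existingLog) {
        this.txnLog = existingLog;
        for (String record : existingLog) {
            String[] parts = record.split(":");
            long id = Long.parseLong(parts[0]);
            inMemory.put(id, Status.valueOf(parts[1]));
            nextTxnId = Math.max(nextTxnId, id + 1);
        }
    }

    long newTxn() {
        long id = nextTxnId++;
        transition(id, Status.OPEN);
        return id;
    }

    // Write to the log before updating memory so a restart loses nothing.
    void transition(long id, Status status) {
        txnLog.add(id + ":" + status);
        inMemory.put(id, status);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        CoordinatorSketch tc = new CoordinatorSketch(log);
        long id = tc.newTxn();
        tc.transition(id, Status.COMMITTING);
        // Simulate a coordinator restart on the same log.
        CoordinatorSketch recovered = new CoordinatorSketch(log);
        System.out.println(recovered.inMemory); // prints {0=COMMITTING}
    }
}
```

A coordinator that restarts in the Committing state knows it must finish the commit on the participating topics and subscriptions.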
  49. 49. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2
  50. 50. • TB: store and index transaction data per topic partition • TB is implemented using another ML (managed-ledger) as TB log • Messages are appended to the TB log • Transaction Index is maintained in memory and snapshotted to ledgers • Transaction Index can be replayed from TB log Transaction Buffer (TB)
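The Transaction Buffer bullets describe an append-only log plus an in-memory index that can be rebuilt from that log. The following is a minimal sketch of that idea, assuming a `"txnId|payload"` record format; `TxnBufferSketch` and its methods are hypothetical names, and index snapshots are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a per-partition transaction buffer; not Pulsar code.
class TxnBufferSketch {
    final List<String> tbLog = new ArrayList<>();            // "txnId|payload"
    final Map<Long, List<Integer>> index = new HashMap<>();  // txnId -> log positions
    final Set<Long> committed = new HashSet<>();

    void append(long txnId, String payload) {
        tbLog.add(txnId + "|" + payload);
        index.computeIfAbsent(txnId, k -> new ArrayList<>()).add(tbLog.size() - 1);
    }

    void commit(long txnId) {
        committed.add(txnId);
    }

    // Consumers only see messages of committed transactions.
    List<String> readCommitted(long txnId) {
        List<String> out = new ArrayList<>();
        if (!committed.contains(txnId)) return out;
        for (int pos : index.getOrDefault(txnId, List.of())) {
            out.add(tbLog.get(pos).split("\\|", 2)[1]);
        }
        return out;
    }

    // The index holds no information that is not in the log.
    static Map<Long, List<Integer>> replayIndex(List<String> tbLog) {
        Map<Long, List<Integer>> rebuilt = new HashMap<>();
        for (int pos = 0; pos < tbLog.size(); pos++) {
            long txnId = Long.parseLong(tbLog.get(pos).split("\\|", 2)[0]);
            rebuilt.computeIfAbsent(txnId, k -> new ArrayList<>()).add(pos);
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        TxnBufferSketch tb = new TxnBufferSketch();
        tb.append(1, "M1");
        tb.append(2, "other");
        tb.append(1, "M2");
        System.out.println(tb.readCommitted(1)); // prints []
        tb.commit(1);
        System.out.println(tb.readCommitted(1)); // prints [M1, M2]
    }
}
```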
  51. 51. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2
  52. 52. • Introduce ACK_PENDING state • Add response for acknowledgement, aka ack-on-ack • Ack state is updated to cursor ledger • Ack state can be replayed from cursor ledger Transactional Subscription State
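The ACK_PENDING state can be pictured as a third state between unacknowledged and acknowledged, resolved when the transaction ends. The sketch below is an assumption-laden model: `TxnCursorSketch` is a hypothetical name, the boolean return value stands in for the ack-on-ack response, and rejecting a second transaction's ack on a pending message is an illustrative choice, not something the slides specify.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of transactional subscription state; not Pulsar code.
class TxnCursorSketch {
    enum AckState { UNACKED, ACK_PENDING, ACKED }

    final Map<String, AckState> state = new HashMap<>();
    final Map<String, Long> pendingOwner = new HashMap<>(); // msgId -> txnId

    void deliver(String msgId) {
        state.put(msgId, AckState.UNACKED);
    }

    // Returns the response to the ack: true if it is now pending in this txn.
    boolean ackWithTxn(String msgId, long txnId) {
        if (state.get(msgId) != AckState.UNACKED) {
            return false; // already acked or pending in another transaction
        }
        state.put(msgId, AckState.ACK_PENDING);
        pendingOwner.put(msgId, txnId);
        return true;
    }

    // Commit turns pending acks into acks; abort makes the messages available again.
    void endTxn(long txnId, boolean commit) {
        pendingOwner.entrySet().removeIf(e -> {
            if (e.getValue() != txnId) return false;
            state.put(e.getKey(), commit ? AckState.ACKED : AckState.UNACKED);
            return true;
        });
    }

    public static void main(String[] args) {
        TxnCursorSketch cursor = new TxnCursorSketch();
        cursor.deliver("M0");
        cursor.ackWithTxn("M0", 1);
        System.out.println(cursor.state.get("M0")); // prints ACK_PENDING
        cursor.endTxn(1, true);
        System.out.println(cursor.state.get("M0")); // prints ACKED
    }
}
```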
  53. 53. Transaction Execution Flow
  54. 54. Message<String> message = inputConsumer.receive(); Transaction txn = client.newTransaction().withTransactionTimeout(…).build().get(); CompletableFuture<MessageId> sendFuture1 = producer1.newMessage(txn).value("output-message-1").sendAsync(); CompletableFuture<MessageId> sendFuture2 = producer2.newMessage(txn).value("output-message-2").sendAsync(); inputConsumer.acknowledgeAsync(message.getMessageId(), txn); txn.commit().get(); MessageId msgId1 = sendFuture1.get(); MessageId msgId2 = sendFuture2.get(); Transaction API - New Transaction
  55. 55. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2 Txn New Txn 1. New Txn Tx1
  56. 56. Message<String> message = inputConsumer.receive(); Transaction txn = client.newTransaction().withTransactionTimeout(…).build().get(); CompletableFuture<MessageId> sendFuture1 = producer1.newMessage(txn).value("output-message-1").sendAsync(); CompletableFuture<MessageId> sendFuture2 = producer2.newMessage(txn).value("output-message-2").sendAsync(); inputConsumer.acknowledgeAsync(message.getMessageId(), txn); txn.commit().get(); MessageId msgId1 = sendFuture1.get(); MessageId msgId2 = sendFuture2.get(); Transaction API - Produce Messages
  57. 57. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2 Txn New Txn 2.0 Add Produced Topics To Txn Tx1 Tx1: add [T1, T2] Tx1: M1 Tx1: M2 2.1 Produced Messages To Topics with Txn
  58. 58. Message<String> message = inputConsumer.receive(); Transaction txn = client.newTransaction().withTransactionTimeout(…).build().get(); CompletableFuture<MessageId> sendFuture1 = producer1.newMessage(txn).value("output-message-1").sendAsync(); CompletableFuture<MessageId> sendFuture2 = producer2.newMessage(txn).value("output-message-2").sendAsync(); inputConsumer.acknowledgeAsync(message.getMessageId(), txn); txn.commit().get(); MessageId msgId1 = sendFuture1.get(); MessageId msgId2 = sendFuture2.get(); Transaction API - Acknowledges
  59. 59. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2 Txn New Txn 3.0 Add Acked Subscriptions To Txn Tx1 Tx1: add [T1, T2] Tx1: M1 Tx1: M2 3.0 Ack messages with Txn Tx1: ACK (M0) Tx1: add [S0]
  60. 60. Message<String> message = inputConsumer.receive(); Transaction txn = client.newTransaction().withTransactionTimeout(…).build().get(); CompletableFuture<MessageId> sendFuture1 = producer1.newMessage(txn).value("output-message-1").sendAsync(); CompletableFuture<MessageId> sendFuture2 = producer2.newMessage(txn).value("output-message-2").sendAsync(); inputConsumer.acknowledgeAsync(message.getMessageId(), txn); txn.commit().get(); MessageId msgId1 = sendFuture1.get(); MessageId msgId2 = sendFuture2.get(); Transaction API - Commit
  61. 61. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2 Txn New Txn 4.0 Commit Txn Tx1 Tx1: add [T1, T2] Tx1: M1 Tx1: M2 Tx1: ACK (M0) Tx1: add [S0] 4.0 Committing Txn Tx1: Committing
  62. 62. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2 Txn New Txn Tx1 Tx1: add [T1, T2] Tx1: M1 Tx1: M2 Tx1: ACK (M0) Tx1: add [S0] 4.1.0 Commit Txn On Topics 4.1.1 Commit Txn On Subscriptions Tx1 (c) Tx1 (c) Tx1: Committing Tx1: Committed Tx1: Committed
  63. 63. Coordinator Broker-0 Broker-1 InputTopic OutputTopic-1 OutputTopic-2 Cursor Transaction Log Data Log Txn Buffer Data Log Txn Buffer Pulsar Client Input Consumer Producer 1 Producer 2 Txn New Txn Tx1 Tx1: add [T1, T2] Tx1: M1 Tx1: M2 Tx1: ACK (M0) Tx1: add [S0] Tx1: Committing Tx1 (c) Tx1 (c) Tx1: Committed Tx1: Committed 4.2 Committed Txn
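The numbered steps in the execution flow diagrams (1 New Txn, 2.x produce, 3.x ack, 4.x commit) can be traced with a single-process model. This is only a sketch of the ordering: `TxnFlowSketch` is a hypothetical name, plain lists stand in for the transaction log, the transaction buffers, and the cursor, and the `Tx1: New` record text is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-process trace of the transaction flow; not Pulsar code.
class TxnFlowSketch {
    final List<String> txnLog = new ArrayList<>();      // coordinator's transaction log
    final List<String> txnBuffer = new ArrayList<>();   // written but not yet readable
    final List<String> visible = new ArrayList<>();     // readable by consumers
    final List<String> pendingAcks = new ArrayList<>(); // acks in ACK_PENDING
    final List<String> acked = new ArrayList<>();

    // Runs steps 1 to 3, then steps 4.x only if commit is true.
    void run(boolean commit) {
        txnLog.add("Tx1: New");               // 1. new transaction
        txnLog.add("Tx1: add [T1, T2]");      // 2.0 register produced topics
        txnBuffer.add("T1/M1");               // 2.1 produce with the txn
        txnBuffer.add("T2/M2");
        txnLog.add("Tx1: add [S0]");          // 3.0 register acked subscription
        pendingAcks.add("S0/M0");             // 3.0 ack with the txn
        if (!commit) return;                  // e.g. the client stops here
        txnLog.add("Tx1: Committing");        // 4.0 decision is durable
        visible.addAll(txnBuffer);            // 4.1.0 commit on topics
        txnBuffer.clear();
        acked.addAll(pendingAcks);            // 4.1.1 commit on subscriptions
        pendingAcks.clear();
        txnLog.add("Tx1: Committed");         // 4.2 transaction is done
    }

    public static void main(String[] args) {
        TxnFlowSketch open = new TxnFlowSketch();
        open.run(false);
        System.out.println(open.visible); // prints []

        TxnFlowSketch done = new TxnFlowSketch();
        done.run(true);
        System.out.println(done.visible); // prints [T1/M1, T2/M2]
    }
}
```

Until the Committing record is written, neither output message is readable and the input message is not acknowledged, which is the all-or-nothing behaviour listed under Transaction Semantics.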
  64. 64. inputConsumer.receiveAsync().thenCompose(message -> { return client.newTransaction().withTransactionTimeout(…).build().thenCompose(txn -> { producer1.newMessage(txn).value("output-message-1").sendAsync(); producer2.newMessage(txn).value("output-message-2").sendAsync(); inputConsumer.acknowledgeAsync(message.getMessageId(), txn); return txn.commit(); }); }); Transaction API - Async Example
  65. 65. PulsarCash Cash Transfer function Balance user:alice, debit($10) balance update balance update user:bob, credit($10) (100,0,0): transfer($10, alice -> bob) Ack: (100,0,0)
  66. 66. PulsarCash Cash Transfer function Balance user:alice, debit($10) balance update balance update user:bob, credit($10) (100,0,0): transfer($10, alice -> bob) Ack: (100,0,0) Transaction
  67. 67. Make Event Streaming easy, simple, and reliable for everyone Pulsar Transaction
  68. 68. Available to use in Pulsar 2.5.0 When is it available?
  69. 69. • Transaction support in other languages (e.g. C++, Go) • Transaction in Pulsar Functions & Pulsar IO • Transaction in Kafka-on-Pulsar (KOP) • Transaction for Flink / Spark job • Transaction for State storage in Pulsar Functions • … Roadmap
  70. 70. • Ivan Kelly • Matteo Merli • Jia Zhai • Penghui Li • Marvin Cai • Yong Zhang • … and many other Pulsar users & contributors Credits
  71. 71. Thanks! Sijie Guo (@sijieg) Yong Zhang
