
Preview of Apache Pulsar 2.5.0


Apache Pulsar Meetup 1116

Published in: Data & Analytics


  1. Preview of Apache Pulsar 2.5.0. Agenda: transactional streaming, sticky consumer, batch receiving, namespace change events.
  2. Messaging semantics - 1
     1. At least once:
        Message msg = consumer.receive();
        try {
            // processing
            consumer.acknowledge(msg);
        } catch (Exception e) {
            consumer.negativeAcknowledge(msg);
        }
     2. At most once:
        Message msg = consumer.receive();
        try {
            // processing
        } catch (Exception e) {
            log.error("processing error", e);
        } finally {
            consumer.acknowledge(msg);
        }
     3. Exactly once?
  3. Messaging semantics - 2: in practice, idempotent produce and idempotent consume are the more widely used approaches.
  4. Messaging semantics - 3: effectively once. ledgerId + messageId -> sequenceId, combined with broker deduplication.
  5. Messaging semantics - 4: limitations of effectively once:
     1. Only works when producing to one partition.
     2. Only works when producing one message at a time.
     3. Only works when consuming from one partition.
     4. Consumers are required to store the message id and state in order to restore.
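The effectively-once scheme above can be sketched in plain Java: the broker remembers the highest sequenceId it has accepted per producer and drops any retried send at or below it. This is only a toy model of the idea, not Pulsar's actual broker code; the class and method names are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of broker-side deduplication behind "sequenceId +
// broker deduplication". Not Pulsar's real implementation.
public class DedupSketch {
    // highest sequenceId accepted so far, per producer name
    private final Map<String, Long> lastSeq = new HashMap<>();

    /** Returns true if the message is new and should be persisted. */
    public boolean accept(String producerName, long sequenceId) {
        long last = lastSeq.getOrDefault(producerName, -1L);
        if (sequenceId <= last) {
            return false; // duplicate of a retried send: drop it
        }
        lastSeq.put(producerName, sequenceId);
        return true;
    }

    public static void main(String[] args) {
        DedupSketch broker = new DedupSketch();
        System.out.println(broker.accept("p1", 0)); // true
        System.out.println(broker.accept("p1", 1)); // true
        System.out.println(broker.accept("p1", 1)); // false: deduplicated
    }
}
```

Because the state is keyed by producer name, this scheme only deduplicates a single producer's retries, which is exactly why the limitations on slide 5 apply.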
  6. Streaming processing - 1 [diagram: Topic-1 -> f(A) -> Topic-2]
     1. Receive message A from Topic-1 and do some processing.
  7. Streaming processing - 2
     2. Write the result message B to Topic-2.
  8. Streaming processing - 3
     3. Get the send response from Topic-2.
     How to handle a send-response timeout or a consumer/function crash?
     Ack message A = at most once; nack message A = at least once.
  9. Streaming processing - 4
     4. Ack message A.
     How to handle a failed ack or a consumer/function crash?
  10. Transactional streaming semantics
      1. Atomic multi-topic publish and acknowledge.
      2. A message is dispatched to only one consumer until its transaction aborts.
      3. Only committed messages can be read by consumers (READ_COMMITTED).
      https://github.com/apache/pulsar/wiki/PIP-31%3A-Transaction-Support
  11. Transactional streaming demo
      Message<String> message = inputConsumer.receive();
      Transaction txn = client.newTransaction().withTransactionTimeout(…).build().get();
      CompletableFuture<MessageId> sendFuture1 =
          producer1.newMessage(txn).value("output-message-1").sendAsync();
      CompletableFuture<MessageId> sendFuture2 =
          producer2.newMessage(txn).value("output-message-2").sendAsync();
      inputConsumer.acknowledgeAsync(message.getMessageId(), txn);
      txn.commit().get();
      MessageId msgId1 = sendFuture1.get();
      MessageId msgId2 = sendFuture2.get();
  12. Sticky consumer
  13. Sticky consumer
      https://github.com/apache/pulsar/wiki/PIP-34%3A-Add-new-subscribe-type-Key_shared
      Consumer consumer1 = client.newConsumer()
          .topic("my-topic")
          .subscriptionName("my-subscription")
          .subscriptionType(SubscriptionType.Key_Shared)
          .keySharedPolicy(KeySharedPolicy.sticky()
              .ranges(Range.of(0, 32767)))
          .subscribe();
      Consumer consumer2 = client.newConsumer()
          .topic("my-topic")
          .subscriptionName("my-subscription")
          .subscriptionType(SubscriptionType.Key_Shared)
          .keySharedPolicy(KeySharedPolicy.sticky()
              .ranges(Range.of(32768, 65535)))
          .subscribe();
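The sticky ranges above split a 16-bit hash space (0..65535) between the two consumers: each key hashes to a slot, and the consumer owning that slot gets every message with that key. A minimal routing sketch, using String.hashCode as a stand-in for Pulsar's real key hashing (which is Murmur3) and with made-up names:

```java
// Sketch of Key_Shared sticky routing: a key hashes into [0, 65535]
// and is delivered to whichever consumer owns that slot.
// String.hashCode stands in for Pulsar's actual key hash.
public class KeySharedSketch {
    static final int HASH_RANGE_SIZE = 65536;

    static int slot(String key) {
        return Math.floorMod(key.hashCode(), HASH_RANGE_SIZE);
    }

    /** consumer1 owns [0, 32767], consumer2 owns [32768, 65535]. */
    static String route(String key) {
        return slot(key) <= 32767 ? "consumer1" : "consumer2";
    }

    public static void main(String[] args) {
        // The same key always lands on the same consumer,
        // which is what makes the assignment "sticky".
        System.out.println(route("order-42").equals(route("order-42"))); // true
    }
}
```

The per-key stickiness is the point: ordering is preserved per key while the subscription as a whole is shared across consumers.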
  14. Batch receiving messages
      Consumer consumer = client.newConsumer()
          .topic("my-topic")
          .subscriptionName("my-subscription")
          .batchReceivePolicy(BatchReceivePolicy.builder()
              .maxNumMessages(100)
              .maxNumBytes(2 * 1024 * 1024)
              .timeout(1, TimeUnit.SECONDS)
              .build())
          .subscribe();
      Messages msgs = consumer.batchReceive();
      // do some batch operation
      https://github.com/apache/pulsar/wiki/PIP-38%3A-Batch-Receiving-Messages
  15. Namespace change events
      https://github.com/apache/pulsar/wiki/PIP-39%3A-Namespace-Change-Events
      persistent://tenant/ns/__change_events
      class PulsarEvent {
          EventType eventType;
          ActionType actionType;
          TopicEvent topicEvent;
      }
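As the slide shows, every namespace gets a system topic named `__change_events` that carries PulsarEvent records. A small sketch of how that topic name is composed per namespace; the helper below is illustrative, not a Pulsar API:

```java
// Sketch: composing the per-namespace system topic that carries
// PulsarEvent records. changeEventsTopic is a made-up helper.
public class ChangeEventsSketch {
    static String changeEventsTopic(String tenant, String namespace) {
        return "persistent://" + tenant + "/" + namespace + "/__change_events";
    }

    public static void main(String[] args) {
        // prints persistent://public/default/__change_events
        System.out.println(changeEventsTopic("public", "default"));
    }
}
```

A regular consumer subscribed to this topic would then observe namespace-level events (the eventType/actionType/topicEvent fields from the slide) as ordinary messages.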
  16. Thanks - Penghui Li
  17. Bo Cong / 丛搏: Pulsar Schema. Messaging-system engineer at Zhaopin (智联招聘); core contributor to Pulsar Schema and HDFS Offload.
  18. Schema evolution: data management can't escape the evolution of schemas.
  19. Single-version schema: messages 1, 2, and 3 all carry schema version 1.
  20. Multiple-version schemas: message 1 carries version 1, message 2 carries version 2, message 3 carries version 3.
  21. Schema compatibility: "can read" = can deserialize.
  22. Compatibility strategy evolution: with BACKWARD, version 2 can read version 1 and version 1 can read version 0, but version 2 may not be able to read version 0; with BACKWARD_TRANSITIVE, version 2 must be able to read both version 1 and version 0.
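The difference between the two strategies can be sketched as a check over prior schema versions: BACKWARD only verifies the new schema against the latest existing version, while BACKWARD_TRANSITIVE verifies it against every earlier version. This is a toy model; `canRead` stands in for a real Avro compatibility check.

```java
import java.util.List;
import java.util.function.BiPredicate;

// Toy model of BACKWARD vs BACKWARD_TRANSITIVE checking.
// canRead(reader, writer) stands in for a real Avro compatibility test.
public class CompatSketch {
    static <S> boolean backward(S newSchema, List<S> history, BiPredicate<S, S> canRead) {
        // Only the latest existing version must be readable.
        return history.isEmpty() || canRead.test(newSchema, history.get(history.size() - 1));
    }

    static <S> boolean backwardTransitive(S newSchema, List<S> history, BiPredicate<S, S> canRead) {
        // Every existing version must be readable.
        return history.stream().allMatch(old -> canRead.test(newSchema, old));
    }

    public static void main(String[] args) {
        // Pretend a reader can only read writers at most one version older.
        BiPredicate<Integer, Integer> canRead = (r, w) -> r - w <= 1;
        System.out.println(backward(2, List.of(0, 1), canRead));           // true
        System.out.println(backwardTransitive(2, List.of(0, 1), canRead)); // false
    }
}
```

The main method reproduces the slide's point: version 2 passes the plain BACKWARD check (it can read version 1) yet fails the transitive one because it may not read version 0.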
  23. Evolution of the situation
      Version 1: class Person { @Nullable String name; }
      Version 2: class Person { String name; }
      Version 3: class Person { @Nullable @AvroDefault("\"Zhang San\"") String name; }
      (Labels between versions on the slide: Can read, Can read, Can't read.)
  24. Compatibility check: separate schema-compatibility checkers for producer and consumer. With isAllowAutoUpdateSchema = false, the producer only checks whether the schema already exists instead of registering a new one.
  25. Upgrade order: different strategies require different upgrade orders. BACKWARD / BACKWARD_TRANSITIVE: upgrade consumers first. FORWARD / FORWARD_TRANSITIVE: upgrade producers first. FULL / FULL_TRANSITIVE: any order.
  26. Produce different message versions
      Producer<V1Data> p = pulsarClient.newProducer(Schema.AVRO(V1Data.class))
          .topic(topic).create();
      Consumer<V2Data> c = pulsarClient.newConsumer(Schema.AVRO(V2Data.class))
          .topic(topic)
          .subscriptionName("sub1").subscribe();
      p.newMessage().value(data1).send();
      p.newMessage(Schema.AVRO(V2Data.class)).value(data2).send();
      p.newMessage(Schema.AVRO(V1Data.class)).value(data3).send();
      Message<V2Data> msg1 = c.receive();
      V2Data msg1Value = msg1.getValue();
      Message<V2Data> msg2 = c.receive();
      Message<V2Data> msg3 = c.receive();
      V2Data msg3Value = msg3.getValue();
  27. Thanks - Bo Cong
  28. 翟佳 (Jia Zhai): Kafka on Pulsar (KoP)
  29. What is Apache Pulsar? Flexible pub/sub messaging backed by durable log storage.
  30. Barrier for users? A unified messaging protocol; apps built on old systems.
  31. How Pulsar handles it: Pulsar Kafka wrapper on the Kafka Java API (https://pulsar.apache.org/docs/en/adaptors-kafka/); Pulsar IO connectors (https://pulsar.apache.org/docs/en/io-overview/).
  32. Kafka on Pulsar (KoP)
  33. KoP feasibility - log topic
  34. KoP feasibility - log topic [diagram: producer, consumer]
  35. KoP feasibility - log topic [diagram: producer, consumer - Kafka]
  36. KoP feasibility - log topic [diagram: producer, consumer - Pulsar]
  37. KoP feasibility - others: topic lookup, produce, consume, offset, consumption state.
  38. KoP overview [architecture diagram: Kafka and Pulsar producers/consumers (Kafka lib / Pulsar lib) -> broker with load balancer, Pulsar protocol handler, and Kafka protocol handler -> Pulsar topic, managed ledger, BK client, geo-replicator -> ZooKeeper and bookies]
  39. KoP implementation
      - Topic flat map: broker setting `kafkaNamespace`
      - Message ID and offset: ledgerId + entryId
      - Message: convert key/value/timestamp/headers (properties)
      - Topic lookup: Pulsar admin topic lookup -> owner broker
      - Produce: convert, then call PulsarTopic.publishMessage
      - Consume: convert, then call non-durable-cursor.readEntries
      - Group coordinator: kept in topic `public/__kafka/__offsets`
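The "ledgerId + entryId" mapping above has to look like a single monotonically increasing Kafka offset to Kafka clients. One way to sketch that is bit-packing the two ids into one 64-bit value; the exact bit split below is an assumption for illustration, not necessarily what KoP itself uses.

```java
// Sketch of mapping a Pulsar (ledgerId, entryId) pair to a single
// 64-bit Kafka-style offset that still orders correctly.
// The 28-bit entry field is an illustrative choice.
public class KopOffsetSketch {
    static final int ENTRY_BITS = 28;
    static final long ENTRY_MASK = (1L << ENTRY_BITS) - 1;

    static long toOffset(long ledgerId, long entryId) {
        return (ledgerId << ENTRY_BITS) | (entryId & ENTRY_MASK);
    }

    static long ledgerId(long offset) { return offset >>> ENTRY_BITS; }
    static long entryId(long offset)  { return offset & ENTRY_MASK; }

    public static void main(String[] args) {
        long off = toOffset(7, 42);
        System.out.println(ledgerId(off) + " / " + entryId(off)); // 7 / 42
    }
}
```

Because ledgers are allocated in increasing order and entries increase within a ledger, packing them this way keeps offsets comparable, which is what Kafka consumers rely on for seek and commit.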
  40. KoP now: layered architecture, independent scaling, instant recovery, balance-free expansion.
  41. KoP now (inherited from Pulsar):
      - Ordering: guaranteed ordering
      - Multi-tenancy: a single cluster can support many tenants and use cases
      - High throughput: can reach 1.8 M messages/s in a single partition
      - Durability: data replicated and synced to disk
      - Geo-replication: out-of-the-box support for geographically distributed applications
      - Unified messaging model: supports both streaming and queuing
      - Delivery guarantees: at least once, at most once, and effectively once
      - Low latency: publish latency of 5 ms
      - Highly scalable and available (HA): can support millions of topics
  42. Demo (https://kafka.apache.org/quickstart)
      Demo 1: Kafka producer / consumer
      Demo 2: Kafka Connect
      Kafka release: https://archive.apache.org/dist/kafka/2.0.0/kafka_2.12-2.0.0.tgz
      Demo video: https://www.bilibili.com/video/av75540685
  43. Demo [architecture diagram repeated from slide 38]
  44. Demo 1: Kafka producer -> Kafka consumer
      bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
      bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
  45. Demo 1: Pulsar producer -> Kafka consumer
      bin/pulsar-client produce test -n 1 -m "Hello from Pulsar Producer, Message 1"
      bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
  46. Demo 1: Kafka producer -> Pulsar consumer
      bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
      bin/pulsar-client consume -s sub-name test -n 0
  47. Demo 2: Kafka Connect
  48. Demo 2: Kafka Connect (file source -> Pulsar topic -> file sink)
      bin/connect-standalone.sh \
        config/connect-standalone.properties \
        config/connect-file-source.properties \
        config/connect-file-sink.properties
  49. Demo 2: Pulsar Functions (https://pulsar.apache.org/docs/en/functions-overview/)
  50. Demo 2: Pulsar Functions (Kafka file source -> Pulsar Function -> output topic)
      bin/pulsar-admin functions localrun --name pulsarExclamation \
        --jar pulsar-functions-api-examples.jar \
        --classname org…ExclamationFunction \
        --inputs connect-test-partition-0 --output out-hello
  51. Apache Pulsar & Apache Kafka
  52. Thanks! StreamNative - we are hiring.
  53. Thanks
