Pulsar Virtual Summit North America 2021
Apache Pulsar: Why Unified Messaging and Streaming Is the Future
Matteo Merli, Sijie Guo
@ Pulsar PMC
Who are we?
● Sijie Guo (@sijieg)
● CEO, StreamNative
● PMC Member of Pulsar/BookKeeper
● Ex Co-Founder, Streamlio
● Ex Twitter
● Matteo Merli (@merlimat)
● CTO, StreamNative
● Co-creator and PMC chair of Pulsar
● Ex Co-Founder, Streamlio
● Ex Yahoo!
StreamNative
Founded by the creators of Apache Pulsar, StreamNative provides a cloud-native, unified messaging and streaming platform powered by Apache Pulsar to support multi-cloud and hybrid-cloud strategies.
Announcing StreamNative Platform 1.0
✓ Pulsar Transactions
✓ Kafka-on-Pulsar
✓ Function Mesh for serverless streaming
✓ Enterprise-ready security
✓ Pulsar Operators
✓ Seamless StreamNative Cloud experience
Pulsar Trends
● Kafka -> Pulsar
● Scale
● Cloud-Native
● Pulsar + Flink
Pulsar at Scale
More companies in Production
Pulsar at Scale
Hit Trillion Messages Per Day
Cloud-Native
Kubernetes Drives Adoption of Pulsar
✓ 80% of Pulsar users deploy Pulsar in a cloud environment
✓ 62% of Pulsar users deploy Pulsar on Kubernetes
✓ 49% noted Pulsar’s Cloud-Native capabilities as one of the top reasons they chose to adopt Pulsar
Cloud-Native
Built for Kubernetes
VM / Early Cloud Era
● Single Cloud Provider
● Monolithic Architectures
● Single Tenant Systems
● No Geo-replication
Containers / Modern Cloud Era
● Containers
● Cloud Native
● Hybrid & MultiCloud
● Microservices
Pulsar + Flink
Unified Stream and Batch
Kafka to Pulsar
More and More Kafka Users Adopt Pulsar
✓ 68% of respondents use Kafka in addition to Pulsar
✓ 34% of respondents use or plan to use Kafka-on-Pulsar
✓ Kafka and Pulsar serve different use cases
✓ Once adopted, Pulsar usage expands across organizations
Pulsar Adoption Use Cases
Adopted Pulsar to replace Kafka in their DSP (Data Streaming Platform).
● 1.5-2x lower in capex cost
● 5-50x improvement in latency
● 2-3x lower in opex due
● 10 PB / day
Adopted Pulsar to power their billing platform, Midas, which processes hundreds of billions of financial transactions daily. Adoption then expanded to Tencent’s Federated Learning Platform and Tencent Gaming.
Use cases require a scalable message queue to replace RabbitMQ for serving mission-critical business applications. Now in the process of expanding use cases to build data streaming services.
Modern Data Needs
Messaging + Streaming
Messaging
● Queueing systems are ideal for work queues that do not require tasks to be performed in a particular order, for example, sending one email message to many recipients.
● RabbitMQ and Amazon SQS are examples of popular queue-based message systems.
Streaming
● Streaming works best in situations where the order of messages is important, for example, data ingestion.
● Kafka and Amazon Kinesis are examples of messaging systems that use streaming semantics for consuming messages.
Data in motion
Typical Architecture
E-Commerce w/o Pulsar
✓ Separate storage
✓ Tiering outside toolset
✓ Separate application and data domains
✓ Different tech stacks
Why not a system that is able to support messaging and streaming?
E-Commerce with Pulsar
✓ Unified storage for in-motion data
✓ Native tiered storage
✓ Single system to exchange data
✓ Teams share toolset
Build Apache Pulsar for unified messaging and streaming
Step 1: A scalable storage for streams of data
Step 2: Separate serving from storage
[Diagram: Producer and Consumer clients connect to the Apache Pulsar brokers (Broker 0, Broker 1, Broker 2), which are backed by the Apache BookKeeper bookies (Bookie 0 through Bookie 4).]
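As a rough illustration of what a separate storage layer makes possible, the sketch below models how the entries of one log segment could be striped across a set of bookies. It is a toy model under stated assumptions: the round-robin placement rule and the names `write_set`, `placement`, `ensemble_size`, and `write_quorum` are chosen here for illustration and are not taken from the slides or from BookKeeper's source.

```python
def write_set(entry_id, ensemble_size, write_quorum):
    """Return the bookie indexes that hold a copy of one entry.

    Assumption: copies are placed round-robin, starting at
    entry_id modulo the number of bookies in the ensemble.
    """
    start = entry_id % ensemble_size
    return [(start + i) % ensemble_size for i in range(write_quorum)]


def placement(num_entries, ensemble_size, write_quorum):
    """Map each bookie index to the list of entry ids it stores."""
    bookies = {b: [] for b in range(ensemble_size)}
    for entry_id in range(num_entries):
        for b in write_set(entry_id, ensemble_size, write_quorum):
            bookies[b].append(entry_id)
    return bookies


if __name__ == "__main__":
    # Five bookies, two copies of every entry, ten entries.
    for bookie, entries in placement(10, 5, 2).items():
        print(f"bookie {bookie}: entries {entries}")
```

Because no single storage node owns the whole log in this model, the serving layer keeps no data of its own to move when a topic is reassigned, which is the point of Step 2.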
Step 3: Unified API
Streaming and Messaging
[Diagram: Producer 1 and Producer 2 publish messages m0 through m4 to a Pulsar topic/partition, which is read through four subscriptions.]
● Exclusive: Subscription A, with Consumer A-0 and Consumer A-1 (Consumer A-1 is marked with an X)
● Failover: Subscription B, with Consumer B-0 and Consumer B-1 (Consumer B-1 is used in case of failure in Consumer B-0)
● Shared: Subscription C, with Consumer C-1, Consumer C-2, and Consumer C-3
● Key-Shared: Subscription D, with Consumer D-1, Consumer D-2, and Consumer D-3
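The four subscription types on this slide (Exclusive, Failover, Shared, Key-Shared) differ only in how messages from the same topic are handed to consumers. The sketch below is a toy dispatcher that makes the difference concrete. The `dispatch` function, its parameters, and the modulo-based key routing are assumptions made for illustration; a real broker uses its own dispatch logic and client API.

```python
import zlib


def dispatch(messages, consumers, mode, failed=()):
    """Assign (key, payload) messages to consumers for one subscription.

    messages:  list of (key, payload) tuples in publish order
    consumers: consumer names in connection order
    mode:      "exclusive", "failover", "shared", or "key_shared"
    failed:    consumers that are currently down
    """
    if mode == "exclusive" and len(consumers) > 1:
        raise ValueError("an exclusive subscription allows a single consumer")
    alive = [c for c in consumers if c not in failed]
    if not alive:
        raise RuntimeError("no live consumer")
    out = {c: [] for c in alive}
    for i, (key, payload) in enumerate(messages):
        if mode in ("exclusive", "failover"):
            # One active consumer receives everything, in order.
            target = alive[0]
        elif mode == "shared":
            # Round-robin: work is spread out, ordering is not preserved.
            target = alive[i % len(alive)]
        elif mode == "key_shared":
            # Same key always goes to the same consumer (toy hash routing).
            target = alive[zlib.crc32(key.encode()) % len(alive)]
        else:
            raise ValueError(f"unknown mode: {mode}")
        out[target].append(payload)
    return out


if __name__ == "__main__":
    msgs = [(f"k{i % 2}", f"m{i}") for i in range(5)]
    print(dispatch(msgs, ["B-0", "B-1"], "failover"))
    print(dispatch(msgs, ["B-0", "B-1"], "failover", failed=("B-0",)))
    print(dispatch(msgs, ["C-1", "C-2", "C-3"], "shared"))
    print(dispatch(msgs, ["D-1", "D-2", "D-3"], "key_shared"))
```

In this model the first two modes give the ordered, single-reader behavior associated with streaming, while the last two spread work across consumers in the way a message queue does.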
Step 3: Unified API
[Diagram labels: Pub/Sub API (Publisher, Subscriber), Reader and Batch API, Stream Processor Applications, Microservices or Event-Driven Architecture]
Step 4: Schema API
[Diagram: the unified API diagram (Pub/Sub API with Publisher and Subscriber, Reader and Batch API, Stream Processor Applications, Microservices or Event-Driven Architecture) with the Schema API added.]
Step 5: Functions and IO API
[Diagram: the unified API diagram (Pub/Sub API, Reader and Batch API, Schema API) with the Functions API and Pulsar IO/Connectors (Prebuilt Connectors, Custom Connectors) added.]
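To give a sense of how small a function can be, here is a minimal sketch assuming the language-native Python style, in which a module exposes a plain `process(input)` function and its return value becomes the output message. The normalizing logic is hypothetical, and deployment details (input and output topics, packaging) are omitted.

```python
def process(input):
    """Handle one incoming message and return the message to publish.

    Hypothetical logic: trim whitespace and lowercase the payload.
    """
    return input.strip().lower()


if __name__ == "__main__":
    # Local check only; no broker is involved.
    print(process("  Hello Pulsar  "))
```

Keeping the function free of client-library imports means it can be unit-tested locally before it is ever deployed.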
Step 6: Tiered Storage
[Diagram: the unified API diagram (Pub/Sub API, Reader and Batch API, Schema API, Functions API, Pulsar IO/Connectors) with Tiered Storage added.]
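The idea behind tiered storage can be sketched as follows: older segments of a topic move to cheaper storage once the fast tier grows past a limit, while readers keep addressing segments the same way. This is a toy model under stated assumptions. The class `TieredTopic`, the byte-size threshold, and the two in-memory dictionaries standing in for the fast tier and the long-term tier are all hypothetical.

```python
class TieredTopic:
    """Toy model of a topic whose old segments are offloaded to a cold tier."""

    def __init__(self, hot_limit_bytes):
        self.hot_limit_bytes = hot_limit_bytes
        self.hot = {}    # segment id -> bytes, stands in for the fast tier
        self.cold = {}   # segment id -> bytes, stands in for long-term storage
        self.next_id = 0

    def append_segment(self, data):
        """Store a new segment in the hot tier, then offload if needed."""
        seg_id = self.next_id
        self.next_id += 1
        self.hot[seg_id] = data
        self._offload()
        return seg_id

    def _offload(self):
        # Move the oldest segments to the cold tier until the hot tier
        # fits its limit, always keeping the newest segment hot.
        while (sum(len(d) for d in self.hot.values()) > self.hot_limit_bytes
               and len(self.hot) > 1):
            oldest = min(self.hot)
            self.cold[oldest] = self.hot.pop(oldest)

    def read(self, seg_id):
        """Readers do not need to know which tier holds the segment."""
        if seg_id in self.hot:
            return self.hot[seg_id]
        return self.cold[seg_id]


if __name__ == "__main__":
    topic = TieredTopic(hot_limit_bytes=10)
    for _ in range(4):
        topic.append_segment(b"x" * 4)
    print("hot:", sorted(topic.hot), "cold:", sorted(topic.cold))
    print(topic.read(0))
```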
Step 7: Protocol Handlers
Apache Pulsar protocol handlers and their clients:
● Pulsar Protocol Handler: Pulsar Clients (queue + stream)
● Kafka Protocol Handler: Kafka Clients
● AMQP Protocol Handler: AMQP Clients
● MQTT Protocol Handler: MQTT Clients
Step 8: Transaction API
[Diagram: the unified API diagram (Pub/Sub API, Reader and Batch API, Schema API, Functions API, Pulsar IO/Connectors, Tiered Storage) with the Transaction API added.]
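What a transaction API adds can be shown with a small model: messages sent and acknowledgments made inside a transaction become visible together on commit, or not at all on abort. The sketch below is a toy under stated assumptions. `ToyBroker` and `ToyTransaction` are hypothetical in-memory classes and do not reflect the Pulsar client API.

```python
class ToyBroker:
    """In-memory stand-in for a broker: committed messages and acks."""

    def __init__(self):
        self.topics = {}    # topic name -> list of committed messages
        self.acked = set()  # (topic name, message index) pairs


class ToyTransaction:
    """Buffers sends and acks, then applies them all at once on commit."""

    def __init__(self, broker):
        self.broker = broker
        self.sends = []
        self.acks = []
        self.state = "open"

    def _require_open(self):
        if self.state != "open":
            raise RuntimeError(f"transaction is {self.state}")

    def send(self, topic, message):
        self._require_open()
        self.sends.append((topic, message))

    def ack(self, topic, index):
        self._require_open()
        self.acks.append((topic, index))

    def commit(self):
        # Apply every buffered send and ack together.
        self._require_open()
        for topic, message in self.sends:
            self.broker.topics.setdefault(topic, []).append(message)
        self.broker.acked.update(self.acks)
        self.state = "committed"

    def abort(self):
        # Discard everything; the broker never sees the buffered work.
        self._require_open()
        self.sends.clear()
        self.acks.clear()
        self.state = "aborted"


if __name__ == "__main__":
    broker = ToyBroker()
    broker.topics["in"] = ["order-1"]
    txn = ToyTransaction(broker)
    txn.ack("in", 0)                 # consume
    txn.send("out", "invoice-1")     # produce
    txn.commit()                     # both take effect together
    print(broker.topics, broker.acked)
```

This consume-process-produce pattern is the usual motivation for transactions in a messaging and streaming system: the input is acknowledged only if the output is published.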
Pulsar 2.8: towards a complete vision of unified messaging and streaming
The future of Pulsar
Towards a self-adjusting data platform
✓ Tuning data platforms to run at scale is hard
✓ Lots of configurations
✓ Requires in-depth knowledge of internals
✓ Workloads are constantly changing
Topic auto-partitioning
✓ Partitions are an artifact of implementation
✓ They are not a natural property of the data
✓ Abstract the partitioning away from users
✓ Partitions are automatically split / merged based
✓ Rethink what an API should look like
Self-Adjusting Storage
✓ Ensure optimal utilization of hardware
✓ No configuration
✓ Automatically adjust strategies based on changing conditions:
✓ Disk access
✓ Cache management
✓ Queue sizes
Pulsar Functions
✓ The foundation is now mature — UX is still poor
✓ Simpler tooling to create & manage functions
✓ CI/CD integration — Versioning — A/B testing
✓ Observability & Debuggability
✓ Improve support for Go and Python functions
✓ DSL — Provide higher level constructs to process data
Stream Storage
✓ Evolve the current state of Tiered Storage
✓ Integrate with data lake technologies
Working with the data community

More Related Content

What's hot

Introduction to Nginx
Introduction to NginxIntroduction to Nginx
Introduction to NginxKnoldus Inc.
 
Function Mesh for Apache Pulsar, the Way for Simple Streaming Solutions
Function Mesh for Apache Pulsar, the Way for Simple Streaming SolutionsFunction Mesh for Apache Pulsar, the Way for Simple Streaming Solutions
Function Mesh for Apache Pulsar, the Way for Simple Streaming SolutionsStreamNative
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Araf Karsh Hamid
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안SANG WON PARK
 
Distributed Lock Manager
Distributed Lock ManagerDistributed Lock Manager
Distributed Lock ManagerHao Chen
 
Kafka Streams at Scale (Deepak Goyal, Walmart Labs) Kafka Summit London 2019
Kafka Streams at Scale (Deepak Goyal, Walmart Labs) Kafka Summit London 2019Kafka Streams at Scale (Deepak Goyal, Walmart Labs) Kafka Summit London 2019
Kafka Streams at Scale (Deepak Goyal, Walmart Labs) Kafka Summit London 2019confluent
 
Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®confluent
 
Replicated Subscriptions: Taking Geo-Replication to the Next Level - Pulsar S...
Replicated Subscriptions: Taking Geo-Replication to the Next Level - Pulsar S...Replicated Subscriptions: Taking Geo-Replication to the Next Level - Pulsar S...
Replicated Subscriptions: Taking Geo-Replication to the Next Level - Pulsar S...StreamNative
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022Kai Wähner
 
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on KubernetesStateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetesconfluent
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using KafkaKnoldus Inc.
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersJean-Paul Azar
 
Confluent Tech Talk Korea
Confluent Tech Talk KoreaConfluent Tech Talk Korea
Confluent Tech Talk Koreaconfluent
 
Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?Kai Wähner
 
Kafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around KafkaKafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around KafkaGuido Schmutz
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaKai Wähner
 
Architecture Patterns for Event Streaming (Nick Dearden, Confluent) London 20...
Architecture Patterns for Event Streaming (Nick Dearden, Confluent) London 20...Architecture Patterns for Event Streaming (Nick Dearden, Confluent) London 20...
Architecture Patterns for Event Streaming (Nick Dearden, Confluent) London 20...confluent
 
Open shift 4 infra deep dive
Open shift 4    infra deep diveOpen shift 4    infra deep dive
Open shift 4 infra deep diveWinton Winton
 

What's hot (20)

Introduction to Nginx
Introduction to NginxIntroduction to Nginx
Introduction to Nginx
 
Function Mesh for Apache Pulsar, the Way for Simple Streaming Solutions
Function Mesh for Apache Pulsar, the Way for Simple Streaming SolutionsFunction Mesh for Apache Pulsar, the Way for Simple Streaming Solutions
Function Mesh for Apache Pulsar, the Way for Simple Streaming Solutions
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics
 
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
 
Distributed Lock Manager
Distributed Lock ManagerDistributed Lock Manager
Distributed Lock Manager
 
Kafka Streams at Scale (Deepak Goyal, Walmart Labs) Kafka Summit London 2019
Kafka Streams at Scale (Deepak Goyal, Walmart Labs) Kafka Summit London 2019Kafka Streams at Scale (Deepak Goyal, Walmart Labs) Kafka Summit London 2019
Kafka Streams at Scale (Deepak Goyal, Walmart Labs) Kafka Summit London 2019
 
Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®Introduction to KSQL: Streaming SQL for Apache Kafka®
Introduction to KSQL: Streaming SQL for Apache Kafka®
 
RabbitMQ & Kafka
RabbitMQ & KafkaRabbitMQ & Kafka
RabbitMQ & Kafka
 
Replicated Subscriptions: Taking Geo-Replication to the Next Level - Pulsar S...
Replicated Subscriptions: Taking Geo-Replication to the Next Level - Pulsar S...Replicated Subscriptions: Taking Geo-Replication to the Next Level - Pulsar S...
Replicated Subscriptions: Taking Geo-Replication to the Next Level - Pulsar S...
 
The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022The Top 5 Apache Kafka Use Cases and Architectures in 2022
The Top 5 Apache Kafka Use Cases and Architectures in 2022
 
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on KubernetesStateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
Stateful, Stateless and Serverless - Running Apache Kafka® on Kubernetes
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Kafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer ConsumersKafka Intro With Simple Java Producer Consumers
Kafka Intro With Simple Java Producer Consumers
 
Confluent Tech Talk Korea
Confluent Tech Talk KoreaConfluent Tech Talk Korea
Confluent Tech Talk Korea
 
Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?Can Apache Kafka Replace a Database?
Can Apache Kafka Replace a Database?
 
Kafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around KafkaKafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around Kafka
 
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache KafkaReal-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
 
Architecture Patterns for Event Streaming (Nick Dearden, Confluent) London 20...
Architecture Patterns for Event Streaming (Nick Dearden, Confluent) London 20...Architecture Patterns for Event Streaming (Nick Dearden, Confluent) London 20...
Architecture Patterns for Event Streaming (Nick Dearden, Confluent) London 20...
 
Open shift 4 infra deep dive
Open shift 4    infra deep diveOpen shift 4    infra deep dive
Open shift 4 infra deep dive
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 

Similar to Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Summit NA 2021 Keynote

Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...Timothy Spann
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureTimothy Spann
 
Automation + dev ops summit hail hydrate! from stream to lake
Automation + dev ops summit   hail hydrate! from stream to lakeAutomation + dev ops summit   hail hydrate! from stream to lake
Automation + dev ops summit hail hydrate! from stream to lakeTimothy Spann
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsTimothy Spann
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeTimothy Spann
 
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Timothy Spann
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Timothy Spann
 
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...Timothy Spann
 
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...Timothy Spann
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computingTimothy Spann
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Timothy Spann
 
What We Learned From Building a Modern Messaging and Streaming System for Cloud
What We Learned From Building a Modern Messaging and Streaming System for CloudWhat We Learned From Building a Modern Messaging and Streaming System for Cloud
What We Learned From Building a Modern Messaging and Streaming System for CloudStreamNative
 
Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022Timothy Spann
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...confluent
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarScyllaDB
 
Open keynote_carolyn&matteo&sijie
Open keynote_carolyn&matteo&sijieOpen keynote_carolyn&matteo&sijie
Open keynote_carolyn&matteo&sijieStreamNative
 
ITPC Building Modern Data Streaming Apps
ITPC Building Modern Data Streaming AppsITPC Building Modern Data Streaming Apps
ITPC Building Modern Data Streaming AppsTimothy Spann
 
Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...Timothy Spann
 
Hail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceHail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceTimothy Spann
 

Similar to Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Summit NA 2021 Keynote (20)

Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
Automation + dev ops summit hail hydrate! from stream to lake
Automation + dev ops summit   hail hydrate! from stream to lakeAutomation + dev ops summit   hail hydrate! from stream to lake
Automation + dev ops summit hail hydrate! from stream to lake
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
 
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022
 
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
 
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
 
What We Learned From Building a Modern Messaging and Streaming System for Cloud
What We Learned From Building a Modern Messaging and Streaming System for CloudWhat We Learned From Building a Modern Messaging and Streaming System for Cloud
What We Learned From Building a Modern Messaging and Streaming System for Cloud
 
Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar
 
Open keynote_carolyn&matteo&sijie
Open keynote_carolyn&matteo&sijieOpen keynote_carolyn&matteo&sijie
Open keynote_carolyn&matteo&sijie
 
ITPC Building Modern Data Streaming Apps
ITPC Building Modern Data Streaming AppsITPC Building Modern Data Streaming Apps
ITPC Building Modern Data Streaming Apps
 
Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...
 
Hail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceHail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open source
 

More from StreamNative

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...StreamNative
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...StreamNative
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...StreamNative
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022StreamNative
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022StreamNative
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...StreamNative
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...StreamNative
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022StreamNative
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...StreamNative
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...StreamNative
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022StreamNative
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022StreamNative
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022StreamNative
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022StreamNative
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022StreamNative
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022StreamNative
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...StreamNative
 
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021Improvements Made in KoP 2.9.0  - Pulsar Summit Asia 2021
Improvements Made in KoP 2.9.0 - Pulsar Summit Asia 2021StreamNative
 

Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Summit NA 2021 Keynote

  • 1. Pulsar Virtual Summit North America 2021 Apache Pulsar: Why Unified Messaging and Streaming Is the Future Matteo Merli, Sijie Guo @ Pulsar PMC
  • 2. Who are we? ● Sijie Guo (@sijieg) ● CEO, StreamNative ● PMC Member of Pulsar/BookKeeper ● Ex Co-Founder, Streamlio ● Ex Twitter ● Matteo Merli (@merlimat) ● CTO, StreamNative ● Co-creator and PMC chair of Pulsar ● Ex Co-Founder, Streamlio ● Ex Yahoo!
  • 3. StreamNative Founded by the creators of Apache Pulsar, StreamNative provides a cloud-native, unified messaging and streaming platform powered by Apache Pulsar to support multi-cloud and hybrid-cloud strategies
  • 4. Announcing StreamNative Platform 1.0 ✓ Pulsar Transactions ✓ Kafka-on-Pulsar ✓ Function Mesh for serverless streaming ✓ Enterprise-ready security ✓ Pulsar Operators ✓ Seamless StreamNative Cloud experience
  • 5. Pulsar Trends Kafka -> Pulsar Scale Cloud-Native Pulsar + Flink
  • 6. Pulsar at Scale More companies in Production
  • 7. Pulsar at Scale Hit Trillion Messages Per Day
  • 8. Cloud-Native Kubernetes Drive Adoption of Pulsar ✓ 80% of Pulsar users deploy Pulsar in a cloud environment ✓ 62% of Pulsar users deploy Pulsar on Kubernetes ✓ 49% noted Pulsar’s Cloud-Native capabilities as one of the top reasons they chose to adopt Pulsar
  • 9. Cloud-Native Built for Kubernetes Containers Cloud Native Hybrid & MultiCloud ● Single Cloud Provider ● Monolithic Architectures ● Single Tenant Systems ● No Geo-replication VM / Early Cloud Era Containers / Modern Cloud Era Microservices
  • 10. Pulsar + Flink Unified Stream and Batch
  • 11. Kafka to Pulsar More and More Kafka Users Adopt Pulsar ✓ 68% of respondents use Kafka in addition to Pulsar ✓ 34% of respondents use or plan to use Kafka-on-Pulsar ✓ Kafka and Pulsar serve different use cases ✓ Once adopted, Pulsar usage expands across organizations
  • 12. Pulsar Adoption Use Cases Adopted Pulsar to replace Kafka in their DSP (Data Streaming Platform). ● 1.5-2x lower in capex cost ● 5-50x improvement in latency ● 2-3x lower in opex due ● 10 PB / day Adopted Pulsar to power their billing platform, Midas, which processes hundreds of billions of financial transactions daily. Adoption then expanded to Tencent’s Federated Learning Platform and Tencent Gaming. Use cases require a scalable message queue for serving mission-critical business applications to replace RabbitMQ. In the process of expanding use cases to build data streaming services
  • 15. Messaging ● Queueing systems are ideal for work queues that do not require tasks to be performed in a particular order— for example, sending one email message to many recipients. ● RabbitMQ and Amazon SQS are examples of popular queue-based message systems. Streaming ● Streaming works best in situations where the order of messages is important—for example, data ingestion. ● Kafka and Amazon Kinesis are examples of messaging systems that use streaming semantics for consuming messages. Data in motion
  • 17. E-Commerce w/o Pulsar ✓ Separate storage ✓ Tiering outside toolset ✓ Separate application and data domains ✓ Different tech stacks
  • 18. Why not a system that is able to support messaging and streaming?
  • 19. E-Commerce with Pulsar ✓ Unified storage for in-motion data ✓ Native tiered storage ✓ Single system to exchange data ✓ Teams share toolset
  • 20. Build Apache Pulsar for unified messaging and streaming
  • 21. Step 1: A scalable storage for streams of data
  • 22. Step 2: Separate serving from storage Apache Pulsar Apache BookKeeper Broker 0 Producer Consumer Broker 1 Broker 2 Bookie 0 Bookie 1 Bookie 2 Bookie 3 Bookie 4
  • 23. Step 3: Unified API Streaming Messaging Producer 1 Producer 2 Pulsar Topic/Partition m0 m1 m2 m3 m4 Consumer D-1 Consumer D-2 Consumer D-3 Subscription D Key-Shared Consumer C-1 Consumer C-2 Consumer C-3 Subscription C m1 m2 m3 m4 m0 Shared Failover Consumer B-1 Consumer B-0 Subscription B m1 m2 m3 m4 m0 In case of failure in Consumer B-0 Consumer A-1 Consumer A-0 Subscription A m1 m2 m3 m4 m0 Exclusive X
  • 24. Reader and Batch API Pub/Sub API Publisher Subscriber Step 3: Unified API Stream Processor Applications Microservices or Event-Driven Architecture
  • 25. Step 4: Schema API Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API
  • 26. Step 5: Functions and IO API Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API Functions API Pulsar IO/Connectors Prebuilt Connectors Custom Connectors
  • 27. Step 6: Tiered Storage Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API Functions API Pulsar IO/Connectors Prebuilt Connectors Custom Connectors Tiered Storage
  • 28. Step 7: Protocol Handlers Apache Pulsar Pulsar Protocol Handler Pulsar Clients (queue + stream) Kafka Protocol Handler AMQP Protocol Handler MQTT Protocol Handler Kafka Clients AMQP Clients MQTT Clients
  • 29. Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API Functions API Pulsar IO/Connectors Prebuilt Connectors Custom Connectors Tiered Storage Step 8: Transaction API Transaction API
  • 30. Pulsar 2.8 towards a complete vision of unified messaging and streaming
  • 31. The future of Pulsar
  • 32. Towards a self-adjusting data platform ✓ Tuning data platforms to run at scale is hard ✓ Lots of configurations ✓ Requires in-depth knowledge of internals ✓ Workloads are constantly changing
  • 33. Topic auto-partitioning ✓ Partitions are an artifact of implementation ✓ It’s not a natural property of the data ✓ Abstract the partitioning away from users ✓ Partitions are automatically split / merged based ✓ Rethink what an API should look like
  • 34. Self-Adjusting Storage ✓ Ensure most optimal utilization of hardware ✓ No configuration ✓ Automatically adjust strategies based on changing condition: ✓ Disk access ✓ Cache management ✓ Queue sizes
  • 35. Pulsar Functions ✓ The foundation is now mature — UX is still poor ✓ Simpler tooling to create & manage functions ✓ CI/CD integration — Versioning — A/B testing ✓ Observability & Debuggability ✓ Improve support for Go and Python functions ✓ DSL — Provide higher level constructs to process data
  • 36. Stream Storage ✓ Evolve the current state of Tiered Storage ✓ Integrate with data lake technologies
  • 37. Working with the data community
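
The four subscription types shown on slide 23 (Exclusive, Failover, Shared, Key-Shared) can be illustrated with a small toy model. This is only a sketch in plain Python and not the Pulsar client API: the `dispatch` function, the consumer names, the "first consumer is active" rule for failover and the CRC32-based key routing are all assumptions made for illustration. A real broker uses its own consumer selection and hashing.

```python
from collections import defaultdict
from itertools import cycle
from zlib import crc32


def dispatch(messages, consumers, sub_type):
    """Toy model: return {consumer: [payloads]} for one subscription.

    messages  -- list of (key, payload) tuples in publish order
    consumers -- consumer names in connection order (hypothetical names)
    sub_type  -- "exclusive", "failover", "shared" or "key_shared"
    """
    delivered = defaultdict(list)
    if sub_type == "exclusive":
        # Only one consumer may attach to the subscription.
        if len(consumers) != 1:
            raise ValueError("exclusive subscription allows a single consumer")
        delivered[consumers[0]] = [payload for _, payload in messages]
    elif sub_type == "failover":
        # Assumption: the first connected consumer is active, the rest stand by.
        delivered[consumers[0]] = [payload for _, payload in messages]
    elif sub_type == "shared":
        # Round-robin across consumers; ordering is only kept per consumer.
        ring = cycle(consumers)
        for _, payload in messages:
            delivered[next(ring)].append(payload)
    elif sub_type == "key_shared":
        # Assumption: CRC32 modulo consumer count stands in for key hashing.
        for key, payload in messages:
            owner = consumers[crc32(key.encode()) % len(consumers)]
            delivered[owner].append(payload)
    else:
        raise ValueError(f"unknown subscription type: {sub_type}")
    return dict(delivered)


# The five messages m0..m4 from the slide, with two made-up keys.
msgs = [("k0", "m0"), ("k1", "m1"), ("k0", "m2"), ("k1", "m3"), ("k0", "m4")]
shared = dispatch(msgs, ["C-1", "C-2", "C-3"], "shared")
key_shared = dispatch(msgs, ["D-1", "D-2", "D-3"], "key_shared")
```

In this model, a failure of Consumer B-0 in the failover case corresponds to calling `dispatch` again with B-0 removed from the list, so that B-1 becomes the first (active) consumer. With Key-Shared, all messages that carry the same key end up at the same consumer, which is what keeps per-key ordering while still spreading load.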

Editor's Notes

  1. Before diving into unified messaging and streaming, let’s take a look at the trends in the Pulsar community.
  2. To understand what is happening behind the scenes, we need to rewind to the early days of Pulsar. Back in 2012, when we first set out to build Pulsar, we thought there should be a global geo-replicated infrastructure for all the messaging data. We didn’t start with the idea of making our own software; we started by observing the gaps in the technologies available at the time and realized how insufficient they were to serve the needs of a data-driven organization.
  3. Talking about these two different worlds. Messaging (read slide): messages are like commands that represent changes that need to be made to the system. For example, we send a message that says “Process this order” or “change user to be deleted”, but we don’t actually perform that change; we just notify. Messaging systems are selected when synchronous communication breaks down. In contrast, streaming systems deal with events, the state changes themselves. So instead of sending a message saying this user wants to update their email, we actually perform the update. Events are interlinked and may be persisted, replayed or aggregated.
  5. Instructor notes: What we have here is an example of what we might see in a modern organization that has run into both of these issues. We basically have two different regimes, or two different worlds, with different teams. Historically, these worlds often seemed very different, with entirely different tech stacks and entirely different teams. However, as data becomes more critical in informing applications, applications need to make more use of what data teams and data services are producing. Likewise, getting the data out of applications and into the data realm has forced organizations to get better at doing both of these things really well. This can be a real challenge. On the left we have the application side: applications that interact via messages, deal with the aspects of running your systems, and provide capabilities focused on business concerns. On the right side we have services that deal with the data, in bulk and at large volume. Sometimes the right side includes real-time or batch processes such as sending large amounts of data, putting it into data lakes, computing answers from it, sending data to other services, or providing that data to other orgs that need it. These two worlds generally use different technologies, different tools and different processes, all leading to more complexity and cost.
  6. Read slide. Separate storage/transport systems for messaging, streaming, and big data, with a focus on separate ETL processes. Messaging helps decouple apps and provides reliable async communication and work queues in core applications. Streaming allows for “medium-term” storage of streams (~30 days), aggregating streams of data, and real-time processing for near real-time analytics. Batch processing and long-term object storage (S3, HDFS, etc.) allow for processing historical data to learn from the past. “Tiering” of data from messaging -> streaming -> object storage is outside of the core toolset and is maintained explicitly. Application and data domains are separated: data is replicated into the data domain, and results from the data domain are loaded (ETL) back into the application domain. Multiple teams work with very different technology stacks. To show how Pulsar can be transformative here, take a common example of an e-commerce system stack that contains both a streaming set of services and data processing. On the application side we have the order service, inventory service and fulfillment (talk to each service; think Amazon). On the data side we have Spark for some batch processing and Flink for real-time inventory analysis. Another use case may be long-term storage needs versus short-term (30 days), then a data warehouse layer. Imagine a person orders something, we check inventory, and the item isn’t there. Do you delete the order or put it on backorder? Once the inventory gets replenished, how do we notify customers that their order is now coming? So we need to join both sides together.
  7. It is very natural to merge both. Talk about how the technologies have evolved to be able to support both. Read slide and add more context: “unified” storage/transport of messages and streams with access to the underlying data. Messaging: decoupled applications with pub/sub, shared subscriptions for work queues, exclusive subscriptions for fanout and point-to-point messaging, with flexible large numbers of non-partitioned topics. Streaming: ordered, scalable partitioned topics with failover and key-shared subscriptions; pub/sub (broker controlled) or reader API (client controlled) for advanced stream processing, replay, etc. Big-data batch access: underlying segments of topics can be read directly, allowing for scale-out parallelism. Tiered storage is core to Pulsar, with no need for external tools. Application and data domains use a single system to exchange data, with converged “messaging” and “streaming”. One or many teams, with a shared toolset. Talk to the diagram: on the left side, say how Pulsar can process real-time streams; on the right, it can do batch processing, offload to tiered storage, read back in parallel batch fashion, and even provide a stream back to other systems for consumption. The order service, inventory service and fulfillment still work from the messaging domain (the use cases are not too different), but they can now support processing at much higher scale. Any messages they have are kept in Pulsar as a single source of truth, and these messages can be offloaded via Pulsar to long-term storage. Pulsar also provides the power to enable a unified batch and streaming job that does batch processing by reading from the underlying storage and combines that with real-time streams, all with a single technology.
  8. Let's take a retrospective look at how Pulsar has evolved through the years. When we started designing Pulsar as a new platform, we always had this idea of supporting both the Pub-Sub semantics as well as the data streaming pipelines, which at the time were a new and emerging thing. But it would be a lie to say we had everything pre-planned since the beginning. Instead, we spent a lot of time observing how people used these platforms and we tried to fill all the gaps we were seeing, evolving Pulsar with the changing needs of data applications.
  9. At the very core of Pulsar there has always been the concept of the "log": a distributed, replicated and immutable ledger where all the events are appended. BookKeeper has proved, throughout the years, to be the best storage solution for streams of data. It scales to a very large number of logs, it offers consistency, durability, low latency and high throughput and, more importantly, very convenient operational tooling. To summarize: using the log as a building block does a lot of the heavy lifting required to build a truly scalable system.
  10. Another architectural choice that came naturally from using BookKeeper has been the separation of the storage from the data serving layer. This comes from BookKeeper because BookKeeper requires a single writer for each log. In our case the broker acts as that single writer. This multi-layer architecture was exactly what we needed because it allows Pulsar to have: 1. Stateless brokers - Topics can be easily moved across brokers without copying any data, for example when expanding the cluster or adjusting the topic assignments after conditions change. 2. Data locality - Because of this broker layer, the data for a single topic or partition does not have to be stored in one single storage node. Instead we can fully utilize the resources of the entire cluster.
  11. We just said that the log is the building block of Pulsar... but the log on its own is a very low-level construct. Applications very often need much more sophisticated ways of interacting with the data than just reading through the log of events. Instead, we wanted to capture the right level of semantics needed to support a wide range of pub-sub and streaming use cases. The core idea was to leave the flexibility to consume data from topics in multiple different ways, depending on what the application needs. We ended up having 4 subscription types with different semantics and different properties, each one with its own merits.
  12. After the Pub/Sub API, the next addition was the Reader API. You can think of it as the "unmanaged" way to consume data from a topic. While there are many reasons for using a reader, the main users are typically Stream Processing frameworks because they tend to have their own checkpointing mechanisms or, similarly, batch systems that want to do a scan of the historical data.
  13. The common theme in the API exposed by Pulsar is the support for Schema. Having direct support for Schema inside Pulsar means that brokers can validate the schema of the data being published and that the expectation of consumers is matched as well. But it also means that it becomes very easy to "discover" the schema of the data. The discoverability of the schema means that you can write fully type safe generic consumers that don't need to be aware of one specific schema.
  14. Next we looked at what people were trying to do with messaging platforms, and the realization was that there was always some portion of computation involved. Applications very often need to do simple data transformations, enrichment and similar things. Functions were designed to provide the simplicity of the "Serverless" model with a very tight integration in the Pulsar platform. One example of how powerful Pulsar Functions are is that we have created a connector framework, Pulsar IO, entirely based on Pulsar Functions. With Pulsar IO, you can choose between a large set of pre-built connectors, both sources and sinks, or build your own custom connectors.
  15. After that, the next trend we saw was that more and more users wanted to use the "stream" concept not just as a temporary buffer, as a way to isolate the data ingestion and the processing. Instead, they increasingly want to keep the stream as a permanent, or at least long-term, "storage of record". Tiered storage was the missing link to enable this. By offloading cold data to cloud storage providers, we can have large-scale data retention at a very effective cost, all while maintaining the stream view of the data and the same APIs.
  16. Another realization was that, because of its nature, messaging is always the integration point for different applications and components. This makes migration from other platforms a bit harder. You often have to coordinate that migration across different teams or organizations. To make it easier, we extended the Pulsar brokers to be able to speak several protocols, in addition to the Pulsar native protocol. With Protocol Handlers, there is a pluggable way to add more ways to interact with the Pulsar service and the same topic data. We started with KoP, Kafka on Pulsar, then followed up with AMQP and MQTT. It is a very powerful mechanism for a few reasons: 1. Applications can use existing client libraries with no code or dependency changes 2. You can mix all sorts of different protocols to interact with the same topic 3. It's exposed directly in Pulsar brokers, data is stored only once and there is no "proxy overhead"
  17. To really complete the full picture, in Pulsar 2.8 we introduced support for transactions. It's now possible to do very complex interactions and take advantage of the transactional properties, for example publishing messages atomically across multiple topics, or consuming and producing atomically.
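To make "publishing messages atomically across multiple topics" concrete, here is a toy model in plain Python. It is a sketch of the all-or-nothing behaviour only and makes no claim about how Pulsar implements transactions; `ToyTransaction` and the topic names are hypothetical, and topics are assumed to be simple in-memory lists.

```python
class ToyTransaction:
    """Buffers writes and applies them to all topics only on commit."""

    def __init__(self, topics):
        self._topics = topics   # topic name -> list of visible messages
        self._pending = []      # (topic, payload) pairs not yet visible
        self._open = True

    def publish(self, topic, payload):
        if not self._open:
            raise RuntimeError("transaction already finished")
        if topic not in self._topics:
            raise KeyError(topic)
        self._pending.append((topic, payload))

    def commit(self):
        # Every buffered message becomes visible together.
        for topic, payload in self._pending:
            self._topics[topic].append(payload)
        self._pending.clear()
        self._open = False

    def abort(self):
        # Nothing buffered ever becomes visible.
        self._pending.clear()
        self._open = False


topics = {"orders": [], "invoices": []}

# Aborted transaction: neither topic sees any message.
txn = ToyTransaction(topics)
txn.publish("orders", "order-1")
txn.publish("invoices", "invoice-1")
txn.abort()
after_abort = {name: list(msgs) for name, msgs in topics.items()}

# Committed transaction: both topics see their message.
txn = ToyTransaction(topics)
txn.publish("orders", "order-2")
txn.publish("invoices", "invoice-2")
txn.commit()
after_commit = {name: list(msgs) for name, msgs in topics.items()}
```

A consumer of `orders` in this model can never observe `order-2` without `invoice-2` also being visible on `invoices`, which is the property the note describes.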
  18. We can say that Pulsar 2.8 is a big milestone in the journey toward completing this vision of a unified messaging and streaming platform. We are very excited and very proud of this release. It is the culmination of months and months of work by a "larger than ever" group of committers and contributors. And while transaction support is the biggest new feature, it is certainly not the only one. We have features like exclusive producer support, which I will be talking about tomorrow in a dedicated session; a new API for package management, to improve the way we manage the code artifacts of functions and connectors; and finally a simplified way to configure the memory limit in Pulsar clients.
  19. After looking at the past, let's now take a look at some of the items that we want to focus on in the very near future.
  20. A problem that we're seeing overall in the data ecosystem is that these platforms can be very difficult to tune and operate when running at a large scale. This is not a problem specific to Pulsar, but it is something that we believe should be addressed. Typically, there are a lot of configuration options and each of them requires in-depth knowledge of the internals of the system. Worse, when integrating multiple systems, like a compute framework, it might be very hard to predict how a change in the configuration will affect the overall stability and performance. Finally, the workloads are increasingly dynamic and constantly changing. It's not possible to have a static configuration that will have "optimal" performance in every condition.
  21. The first item I want to discuss is partitioning. People are used to seeing partitioning and sharding, but these are really artifacts of how systems are implemented. Partitions are usually not a natural property of the data. Because of that, we want to abstract the partition concept away from the user's sight. Application developers should not have to worry about partitions, and operators should not have to think about how many partitions are needed for a certain use case. Instead, the system should be able to figure it out on its own, internally splitting and merging partitions, while maintaining the fundamental ordering guarantees.
  22. Tuning a storage system can also be a very complex task. In particular, it can be very hard to predict the impact of configuration on the overall performance when we're crossing multiple layers: there is the operating system, the disk device and the disk controller. In a similar way, the idea we have is to make it work with no configuration, so that the storage system is able to automatically adjust its strategies based on the changing conditions of the traffic. This covers all aspects of the disk access pattern, the cache eviction strategy and so on.
  23. When we introduced Pulsar Functions, we had the idea of making it a frictionless platform for developers to do data processing. Over the past few years, the foundation of the Pulsar Functions runtime has really matured into a solid platform, although the user experience is still not great. While it is very easy for developers to write functions, we should strive to make it much easier to actually deploy and manage functions. For example, the functions tooling should be well integrated with CI/CD platforms, supporting versioning and out-of-the-box support for A/B testing. Another aspect is observability and debuggability. The tooling and the platform need to make it super easy for users to discover issues in their own code or to detect performance issues. Finally, we are thinking about a higher-level DSL that can support higher-level constructs to further simplify writing data processing functions.
  24. We talked before about Tiered Storage and how it has enabled completely new use cases to be supported by Pulsar. The next step here is to make sure we can integrate with existing data lake technologies, like Delta Lake and Apache Hudi. The vision is to use the Data Lake as the tiered storage backend, so that the same data can be consumed as a stream or with the data lake tooling.
  25. As a final note, given the very nature of Pulsar, which sits between different systems and platforms and links all of them together, we want to reaffirm our commitment to work with the larger data community to ensure that Pulsar is supported everywhere, out of the box, as a first-class citizen. We have been partnering with many open source communities like Trino, Druid, Pinot, Spark and Flink. We will continue to do so, with more to come in the future. We believe that this will benefit Pulsar, its users and the overall data ecosystem.