SlideShare a Scribd company logo
Pulsar Virtual Summit North America 2021
Apache Pulsar:
Why Unified Messaging
and Streaming Is the
Future
Matteo Merli, Sijie Guo
@ Pulsar PMC
Who are we?
● Sijie Guo (@sijieg)
● CEO, StreamNative
● PMC Member of Pulsar/BookKeeper
● Ex Co-Founder, Streamlio
● Ex Twitter
● Matteo Merli (@merlimat)
● CTO, StreamNative
● Co-creator and PMC chair of Pulsar
● Ex Co-Founder, Streamlio
● Ex Yahoo!
StreamNative
Founded by the creators of Apache Pulsar, StreamNative provides a
cloud-native, unified messaging and streaming platform powered by
Apache Pulsar to support multi-cloud and hybrid-cloud strategies
Announcing StreamNative Platform 1.0
✓ Pulsar Transactions
✓ Kafka-on-Pulsar
✓ Function Mesh for serverless streaming
✓ Enterprise-ready security
✓ Pulsar Operators
✓ Seamless StreamNative Cloud experience
Pulsar Trends
Kafka -> Pulsar
Scale Cloud-Native
Pulsar + Flink
Pulsar at Scale
More companies in Production
Pulsar at Scale
Hit Trillion Messages Per Day
Cloud-Native
Kubernetes Drive Adoption of Pulsar
✓ 80% of Pulsar users deploy Pulsar in a cloud environment
✓ 62% of Pulsar users deploy Pulsar on Kubernetes
✓ 49% noted Pulsar’s Cloud-Native capabilities as one of the
top reasons they chose to adopt Pulsar
Cloud-Native
Built for Kubernetes
Containers
Cloud Native
Hybrid & MultiCloud
● Single Cloud Provider
● Monolithic
Architectures
● Single Tenant Systems
● No Geo-replication
VM / Early Cloud Era Containers / Modern Cloud Era
Microservices
Pulsar + Flink
Unified Stream and Batch
Kafka to Pulsar
More and More Kafka Users Adopt Pulsar
✓ 68% of respondents use Kafka in addition to Pulsar
✓ 34% of respondents use or plan to use Kafka-on-Pulsar
✓ Kafka and Pulsar serve different use cases
✓ Once adopted, Pulsar usage expands across organizations
Pulsar Adoption Use Cases
Adopted Pulsar to replace Kafka
in their DSP (Data Streaming
Platform).
● 1.5-2x lower in capex cost
● 5-50x improvement in
latency
● 2-3x lower in opex due
● 10 PB / day
Adopted Pulsar to power their
billing platform, Midas, which
processing hundreds of billions
of financial transactions daily.
Adoption then expanded to
Tencent’s Federated Learning
Platform and Tencent Gaming.
Use cases require a scalable
message queue for serving
mission-critical business
applications to replace
RabbitMQ.
In the process of expanding use
cases to build data streaming
services
Modern Data Needs
Messaging + Streaming
Messaging
● Queueing systems are ideal for work
queues that do not require tasks to
be performed in a particular order—
for example, sending one email
message to many recipients.
● RabbitMQ and Amazon SQS are
examples of popular queue-based
message systems.
Streaming
● Streaming works best in situations
where the order of messages is
important—for example, data
ingestion.
● Kafka and Amazon Kinesis are
examples of messaging systems that
use streaming semantics for
consuming messages.
Data in motion
Typical Architecture
E-Commerce w/o Pulsar
✓ Separate storage
✓ Tiering outside toolset
✓ Separate application and
data domains
✓ Different tech stacks
Why not a system that is
able to support messaging
and streaming?
E-Commerce with Pulsar
✓ Unified storage for in-
motion data
✓ Native tiered storage
✓ Single system to
exchange data
✓ Teams share toolset
Build Apache Pulsar for
unified messaging and
streaming
Step 1: A scalable storage for streams of data
Step 2: Separate serving from storage
Apache Pulsar
Apache BookKeeper
Broker 0
Producer Consumer
Broker 1 Broker 2
Bookie
0
Bookie
1
Bookie
2
Bookie
3
Bookie
4
Step 3: Unified API
Streaming
Messaging
Producer 1
Producer 2
Pulsar
Topic/Partition
m0
m1
m2
m3
m4
Consumer D-1
Consumer D-2
Consumer D-3
Subscription D
Key-Shared
Consumer C-1
Consumer C-2
Consumer C-3
Subscription C
m1
m2
m3
m4
m0
Shared
Failover
Consumer B-1
Consumer B-0
Subscription B
m1
m2
m3
m4
m0
In case of failure
in Consumer B-0
Consumer A-1
Consumer A-0
Subscription A
m1
m2
m3
m4
m0
Exclusive
X
Reader and
Batch API
Pub/Sub
API
Publisher
Subscriber
Step 3:
Unified API
Stream Processor
Applications
Microservices or
Event-Driven Architecture
Step 4:
Schema
API
Reader and
Batch API
Pub/Sub
API
Publisher
Subscriber
Stream Processor
Applications
Microservices or
Event-Driven Architecture
Schema
API
Schema API
Step 5:
Functions
and IO API
Reader and
Batch API
Pub/Sub
API
Publisher
Subscriber
Stream Processor
Applications
Microservices or
Event-Driven Architecture
Schema
API
Schema API
Functions
API
Pulsar
IO/Connectors
Prebuilt Connectors
Custom Connectors
Step 6:
Tiered
Storage
Reader and
Batch API
Pub/Sub
API
Publisher
Subscriber
Stream Processor
Applications
Microservices or
Event-Driven Architecture
Schema
API
Schema API
Functions
API
Pulsar
IO/Connectors
Prebuilt Connectors
Custom Connectors
Tiered Storage
Step 7: Protocol Handlers
Apache Pulsar
Pulsar Protocol
Handler
Pulsar Clients
(queue + stream)
Kafka Protocol
Handler
AMQP Protocol
Handler
MQTT Protocol
Handler
Kafka Clients AMQP Clients MQTT Clients
Reader and
Batch API
Pub/Sub
API
Publisher
Subscriber
Stream Processor
Applications
Microservices or
Event-Driven Architecture
Schema
API
Schema API
Functions
API
Pulsar
IO/Connectors
Prebuilt Connectors
Custom Connectors
Tiered Storage
Step 8:
Transaction
API
Transaction
API
Pulsar 2.8 towards a
complete vision of unified
messaging and streaming
The future of Pulsar
Towards a self-adjusting
data platform
✓ Tuning data platforms to run at scale is hard
✓ Lots of configurations
✓ Requires in-depth knowledge of internals
✓ Workloads are constantly changing
Topic auto-partitioning
✓ Partitions are an artifact of implementation
✓ It’s not a natural property of the data
✓ Abstract the partitioning away from users
✓ Partitions are automatically split / merged based
✓ Rethink how an API should look like
Self-Adjusting Storage
✓ Ensure most optimal utilization of hardware
✓ No configuration
✓ Automatically adjust strategies based on changing
condition:
✓ Disk access
✓ Cache management
✓ Queue sizes
Pulsar Functions
✓ The foundation is now mature — UX is still poor
✓ Simpler tooling to create & manage functions
✓ CI/CD integration — Versioning — A/B testing
✓ Observability & Debuggability
✓ Improve support for Go and Python functions
✓ DSL — Provide higher level constructs to process data
Stream Storage
✓ Evolve the current state of Tiered Storage
✓ Integrate with data lake technologies
Working with the data
community

More Related Content

What's hot

Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak Performance
Todd Palino
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Ververica
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
Cloudera, Inc.
 
How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...
StreamNative
 
How Narvar Uses Pulsar to Power the Post-Purchase Experience - Pulsar Summit ...
How Narvar Uses Pulsar to Power the Post-Purchase Experience - Pulsar Summit ...How Narvar Uses Pulsar to Power the Post-Purchase Experience - Pulsar Summit ...
How Narvar Uses Pulsar to Power the Post-Purchase Experience - Pulsar Summit ...
StreamNative
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
Flink Forward
 
Spark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon WhitearSpark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon Whitear
Spark Summit
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
Jonas Bonér
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
Knoldus Inc.
 
Elasticsearch in Netflix
Elasticsearch in NetflixElasticsearch in Netflix
Elasticsearch in Netflix
Danny Yuan
 
Kafka replication apachecon_2013
Kafka replication apachecon_2013Kafka replication apachecon_2013
Kafka replication apachecon_2013
Jun Rao
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Flink Forward
 
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Amy W. Tang
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka Streams
Ketan Gote
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
Aljoscha Krettek
 
Inside open metadata—the deep dive
Inside open metadata—the deep diveInside open metadata—the deep dive
Inside open metadata—the deep dive
DataWorks Summit
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
Flink Forward
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
Matei Zaharia
 
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Databricks
 

What's hot (20)

Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak Performance
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
 
How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...
 
How Narvar Uses Pulsar to Power the Post-Purchase Experience - Pulsar Summit ...
How Narvar Uses Pulsar to Power the Post-Purchase Experience - Pulsar Summit ...How Narvar Uses Pulsar to Power the Post-Purchase Experience - Pulsar Summit ...
How Narvar Uses Pulsar to Power the Post-Purchase Experience - Pulsar Summit ...
 
Autoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive Mode
 
Spark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon WhitearSpark Summit EU talk by Simon Whitear
Spark Summit EU talk by Simon Whitear
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
Elasticsearch in Netflix
Elasticsearch in NetflixElasticsearch in Netflix
Elasticsearch in Netflix
 
Kafka replication apachecon_2013
Kafka replication apachecon_2013Kafka replication apachecon_2013
Kafka replication apachecon_2013
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
 
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka Streams
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 
Inside open metadata—the deep dive
Inside open metadata—the deep diveInside open metadata—the deep dive
Inside open metadata—the deep dive
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Making Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse TechnologyMaking Data Timelier and More Reliable with Lakehouse Technology
Making Data Timelier and More Reliable with Lakehouse Technology
 
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...Building a Streaming Microservice Architecture: with Apache Spark Structured ...
Building a Streaming Microservice Architecture: with Apache Spark Structured ...
 

Similar to Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Summit NA 2021 Keynote

Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Timothy Spann
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
Timothy Spann
 
Automation + dev ops summit hail hydrate! from stream to lake
Automation + dev ops summit   hail hydrate! from stream to lakeAutomation + dev ops summit   hail hydrate! from stream to lake
Automation + dev ops summit hail hydrate! from stream to lake
Timothy Spann
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
Timothy Spann
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
Timothy Spann
 
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Timothy Spann
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022
Timothy Spann
 
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Timothy Spann
 
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Timothy Spann
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
Guido Schmutz
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Timothy Spann
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Timothy Spann
 
What We Learned From Building a Modern Messaging and Streaming System for Cloud
What We Learned From Building a Modern Messaging and Streaming System for CloudWhat We Learned From Building a Modern Messaging and Streaming System for Cloud
What We Learned From Building a Modern Messaging and Streaming System for Cloud
StreamNative
 
Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022
Timothy Spann
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
confluent
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar
ScyllaDB
 
Open keynote_carolyn&matteo&sijie
Open keynote_carolyn&matteo&sijieOpen keynote_carolyn&matteo&sijie
Open keynote_carolyn&matteo&sijie
StreamNative
 
ITPC Building Modern Data Streaming Apps
ITPC Building Modern Data Streaming AppsITPC Building Modern Data Streaming Apps
ITPC Building Modern Data Streaming Apps
Timothy Spann
 
Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...
Timothy Spann
 
Hail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceHail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open source
Timothy Spann
 

Similar to Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Summit NA 2021 Keynote (20)

Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
Automation + dev ops summit hail hydrate! from stream to lake
Automation + dev ops summit   hail hydrate! from stream to lakeAutomation + dev ops summit   hail hydrate! from stream to lake
Automation + dev ops summit hail hydrate! from stream to lake
 
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and FriendsPortoTechHub  - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
PortoTechHub - Hail Hydrate! From Stream to Lake with Apache Pulsar and Friends
 
Music city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lakeMusic city data Hail Hydrate! from stream to lake
Music city data Hail Hydrate! from stream to lake
 
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
Using FLiP with InfluxDB for EdgeAI IoT at Scale 2022
 
Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022Using FLiP with influxdb for edgeai iot at scale 2022
Using FLiP with influxdb for edgeai iot at scale 2022
 
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...Devfest uk & ireland  using apache nifi with apache pulsar for fast data on-r...
Devfest uk & ireland using apache nifi with apache pulsar for fast data on-r...
 
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
 
What We Learned From Building a Modern Messaging and Streaming System for Cloud
What We Learned From Building a Modern Messaging and Streaming System for CloudWhat We Learned From Building a Modern Messaging and Streaming System for Cloud
What We Learned From Building a Modern Messaging and Streaming System for Cloud
 
Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022Open Source Bristol 30 March 2022
Open Source Bristol 30 March 2022
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
 
Building an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache PulsarBuilding an Event Streaming Architecture with Apache Pulsar
Building an Event Streaming Architecture with Apache Pulsar
 
Open keynote_carolyn&matteo&sijie
Open keynote_carolyn&matteo&sijieOpen keynote_carolyn&matteo&sijie
Open keynote_carolyn&matteo&sijie
 
ITPC Building Modern Data Streaming Apps
ITPC Building Modern Data Streaming AppsITPC Building Modern Data Streaming Apps
ITPC Building Modern Data Streaming Apps
 
Big data conference europe real-time streaming in any and all clouds, hybri...
Big data conference europe   real-time streaming in any and all clouds, hybri...Big data conference europe   real-time streaming in any and all clouds, hybri...
Big data conference europe real-time streaming in any and all clouds, hybri...
 
Hail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open sourceHail hydrate! from stream to lake using open source
Hail hydrate! from stream to lake using open source
 

More from StreamNative

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
StreamNative
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
StreamNative
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
StreamNative
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
StreamNative
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
StreamNative
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
StreamNative
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
StreamNative
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
StreamNative
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
StreamNative
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
StreamNative
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
StreamNative
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
StreamNative
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
StreamNative
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
StreamNative
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
StreamNative
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
StreamNative
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
StreamNative
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
StreamNative
 

More from StreamNative (20)

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
 

Recently uploaded

Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
FODUU
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
SitimaJohn
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
CAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on BlockchainCAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on Blockchain
Claudio Di Ciccio
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
Techgropse Pvt.Ltd.
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 

Recently uploaded (20)

Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Things to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUUThings to Consider When Choosing a Website Developer for your Website | FODUU
Things to Consider When Choosing a Website Developer for your Website | FODUU
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
CAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on BlockchainCAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on Blockchain
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 

Apache Pulsar: Why Unified Messaging and Streaming Is the Future - Pulsar Summit NA 2021 Keynote

  • 1. Pulsar Virtual Summit North America 2021 Apache Pulsar: Why Unified Messaging and Streaming Is the Future Matteo Merli, Sijie Guo @ Pulsar PMC
  • 2. Who are we? ● Sijie Guo (@sijieg) ● CEO, StreamNative ● PMC Member of Pulsar/BookKeeper ● Ex Co-Founder, Streamlio ● Ex Twitter ● Matteo Merli (@merlimat) ● CTO, StreamNative ● Co-creator and PMC chair of Pulsar ● Ex Co-Founder, Streamlio ● Ex Yahoo!
  • 3. StreamNative Founded by the creators of Apache Pulsar, StreamNative provides a cloud-native, unified messaging and streaming platform powered by Apache Pulsar to support multi-cloud and hybrid-cloud strategies
  • 4. Announcing StreamNative Platform 1.0 ✓ Pulsar Transactions ✓ Kafka-on-Pulsar ✓ Function Mesh for serverless streaming ✓ Enterprise-ready security ✓ Pulsar Operators ✓ Seamless StreamNative Cloud experience
  • 5. Pulsar Trends Kafka -> Pulsar Scale Cloud-Native Pulsar + Flink
  • 6. Pulsar at Scale More companies in Production
  • 7. Pulsar at Scale Hit Trillion Messages Per Day
  • 8. Cloud-Native Kubernetes Drive Adoption of Pulsar ✓ 80% of Pulsar users deploy Pulsar in a cloud environment ✓ 62% of Pulsar users deploy Pulsar on Kubernetes ✓ 49% noted Pulsar’s Cloud-Native capabilities as one of the top reasons they chose to adopt Pulsar
  • 9. Cloud-Native Built for Kubernetes Containers Cloud Native Hybrid & MultiCloud ● Single Cloud Provider ● Monolithic Architectures ● Single Tenant Systems ● No Geo-replication VM / Early Cloud Era Containers / Modern Cloud Era Microservices
  • 10. Pulsar + Flink Unified Stream and Batch
  • 11. Kafka to Pulsar More and More Kafka Users Adopt Pulsar ✓ 68% of respondents use Kafka in addition to Pulsar ✓ 34% of respondents use or plan to use Kafka-on-Pulsar ✓ Kafka and Pulsar serve different use cases ✓ Once adopted, Pulsar usage expands across organizations
  • 12. Pulsar Adoption Use Cases Adopted Pulsar to replace Kafka in their DSP (Data Streaming Platform). ● 1.5-2x lower in capex cost ● 5-50x improvement in latency ● 2-3x lower in opex due ● 10 PB / day Adopted Pulsar to power their billing platform, Midas, which processing hundreds of billions of financial transactions daily. Adoption then expanded to Tencent’s Federated Learning Platform and Tencent Gaming. Use cases require a scalable message queue for serving mission-critical business applications to replace RabbitMQ. In the process of expanding use cases to build data streaming services
  • 15. Messaging ● Queueing systems are ideal for work queues that do not require tasks to be performed in a particular order— for example, sending one email message to many recipients. ● RabbitMQ and Amazon SQS are examples of popular queue-based message systems. Streaming ● Streaming works best in situations where the order of messages is important—for example, data ingestion. ● Kafka and Amazon Kinesis are examples of messaging systems that use streaming semantics for consuming messages. Data in motion
  • 17. E-Commerce w/o Pulsar ✓ Separate storage ✓ Tiering outside toolset ✓ Separate application and data domains ✓ Different tech stacks
  • 18. Why not a system that is able to support messaging and streaming?
  • 19. E-Commerce with Pulsar ✓ Unified storage for in- motion data ✓ Native tiered storage ✓ Single system to exchange data ✓ Teams share toolset
  • 20. Build Apache Pulsar for unified messaging and streaming
  • 21. Step 1: A scalable storage for streams of data
  • 22. Step 2: Separate serving from storage Apache Pulsar Apache BookKeeper Broker 0 Producer Consumer Broker 1 Broker 2 Bookie 0 Bookie 1 Bookie 2 Bookie 3 Bookie 4
  • 23. Step 3: Unified API Streaming Messaging Producer 1 Producer 2 Pulsar Topic/Partition m0 m1 m2 m3 m4 Consumer D-1 Consumer D-2 Consumer D-3 Subscription D Key-Shared Consumer C-1 Consumer C-2 Consumer C-3 Subscription C m1 m2 m3 m4 m0 Shared Failover Consumer B-1 Consumer B-0 Subscription B m1 m2 m3 m4 m0 In case of failure in Consumer B-0 Consumer A-1 Consumer A-0 Subscription A m1 m2 m3 m4 m0 Exclusive X
  • 24. Reader and Batch API Pub/Sub API Publisher Subscriber Step 3: Unified API Stream Processor Applications Microservices or Event-Driven Architecture
  • 25. Step 4: Schema API Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API
  • 26. Step 5: Functions and IO API Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API Functions API Pulsar IO/Connectors Prebuilt Connectors Custom Connectors
  • 27. Step 6: Tiered Storage Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API Functions API Pulsar IO/Connectors Prebuilt Connectors Custom Connectors Tiered Storage
  • 28. Step 7: Protocol Handlers Apache Pulsar Pulsar Protocol Handler Pulsar Clients (queue + stream) Kafka Protocol Handler AMQP Protocol Handler MQTT Protocol Handler Kafka Clients AMQP Clients MQTT Clients
  • 29. Reader and Batch API Pub/Sub API Publisher Subscriber Stream Processor Applications Microservices or Event-Driven Architecture Schema API Schema API Functions API Pulsar IO/Connectors Prebuilt Connectors Custom Connectors Tiered Storage Step 8: Transaction API Transaction API
  • 30. Pulsar 2.8 towards a complete vision of unified messaging and streaming
  • 31. The future of Pulsar
  • 32. Towards a self-adjusting data platform ✓ Tuning data platforms to run at scale is hard ✓ Lots of configurations ✓ Requires in-depth knowledge of internals ✓ Workloads are constantly changing
  • 33. Topic auto-partitioning ✓ Partitions are an artifact of implementation ✓ It’s not a natural property of the data ✓ Abstract the partitioning away from users ✓ Partitions are automatically split / merged based ✓ Rethink how an API should look like
  • 34. Self-Adjusting Storage ✓ Ensure most optimal utilization of hardware ✓ No configuration ✓ Automatically adjust strategies based on changing condition: ✓ Disk access ✓ Cache management ✓ Queue sizes
  • 35. Pulsar Functions ✓ The foundation is now mature — UX is still poor ✓ Simpler tooling to create & manage functions ✓ CI/CD integration — Versioning — A/B testing ✓ Observability & Debuggability ✓ Improve support for Go and Python functions ✓ DSL — Provide higher level constructs to process data
  • 36. Stream Storage ✓ Evolve the current state of Tiered Storage ✓ Integrate with data lake technologies
  • 37. Working with the data community

Editor's Notes

  1. Before diving into the “Unified Messaging and Streaming”, let’s take a look at the trends in Pulsar community.
  2. To understand what is happening behind the scene, we need to rewind back to the early days of Pulsar. Back to 2012, when we first set out to build Pulsar, we thought there should be a global geo-replicated infrastructure for all the messaging data. We didn’t start with the idea of making our own software, but started by observing the gaps in the existing technologies available at the time and realized how they were insufficient to serve the needs of an data-driven organization.
  3. Talking about these 2 different worlds Messaging - read slide These are like commands that represent changes that need to be made to the system An example : we send message that says “Process this order” or “change user to be deleted” but we don’t actually perform that change just notify Messaging systems are selected when synchronous communications breaks down In contrast - streaming systems deal with events. The state changes themselves, so instead of sending a message saying this user wants to update their email, we instead actually perform the update Events interlinked together that may be persisted, replayed or aggregated
  4. Talking about these 2 different worlds Messaging - read slide These are like commands that represent changes that need to be made to the system An example : we send message that says “Process this order” or “change user to be deleted” but we don’t actually perform that change just notify Messaging systems are selected when synchronous communications breaks down In contrast - streaming systems deal with events. The state changes themselves, so instead of sending a message saying this user wants to update their email, we instead actually perform the update Events interlinked together that may be persisted, replayed or aggregated
  5. Instructor Notes What we have here is a little bit of an example of what we might see in a modern organization that has run into both these issues We have basically 2 different regimes or 2 different worlds - different teams. Historically, these worlds often seem very different with entirely different tech stacks and entirely different teams. However, as data becomes more critical in informing applications, the need to have applications make more use of what data teams and data services are producing. Likewise getting the data out of applications and into the data realm has forced organizations to get better at being able to do both of these things really well. This can be a real challenge. So on the left we have the application side and these are applications that are interacting via messages and dealing with the aspects of running your systems and providing capabilities focused on business concerns On the right side we have services that deal with the data. Data bulk and large Sometimes the right side includes real time or batch processes such as sending large amounts of data, putting it into data lakes, making computing answers about it, sending data for another services or providing that data to other orgs that need it These 2 worlds generally are using different technologies and different tools and different processes - all leading to more complexity and cost
  6. Read slide Separate storage/transport systems for messaging, streaming, and big-data. Focus on ETL separate processes Messaging helps decouple apps, provides for reliable async communication, work queues, in core applications. Streaming allows for “medium-term” storage of streams (~30 days), aggregating streams of data and real-time processing for near real-time analytics. Batch processing and long-term object storage (S3, HDFS, etc) allows for processing historical data to learn from the past. “Tiering” of data from messaging -> streaming -> object storage is outside of core toolset and is maintained explicitly. Application and Data domains are separated, data is replicated into data domain. Results from data domain are loaded (ETL) back into application domain. Multiple teams with very different technology stacks. ==== To show how Pulsar provides that ability to be transformative here is a common example of an e-commerce system stack that contains both a streaming set of services and also data processing On the application side we have order services, inventory service and fulfillment Talk to each service (think Amazon) On the data side we have Spark - some batch processing using spark Flink - Real time inventory analysis using flink Another use case maybe some long term storage needs versus short term (30 days) then data warehouse layer Imagine a person ordering something and then check inventory and it isn’t there. Do you delete the order or put on backorder? Once the inventory gets replenished then how do we notify the customers that their order is now coming So need to join both sides together
  7. It is very nature to merge both. Talk about the technologies are evolved to a way to that is able to support both. Read slide and add more context: “Unified” storage/transport of message and streams with access to underlying data: Messaging - Decoupled applications with pub/sub, shared subscriptions for work queues, exclusive subscriptions for fanout and point-to-point messaging with flexible large numbers of non-partitioned topics. Streaming - Ordered, scalable partitioned topics with failover and key shared subscriptions. Pub/sub (broker controlled) or reader API (client controlled) for advanced stream processing, replay, etc. Big-data batch Access - Underlying segments of topics can be read directly, allow for scale-out parallelism. Tiered storage is core to Pulsar, no need for external tools. Application and data domains use single system to exchange data, with converged “messaging” and “streaming”. One or many teams, with shared toolset. Talk to diagram Talk to the slide and on the left side say how Pulsar can process real time streams and on the right can do batch processing, offload to tiered storage and read back in parallel batch fashion and even provide a stream back to other systems for consumption order services, inventory service and fulfillment - they still work from the messaging domain (use cases not too different) But now can support processing at much higher scale, any messages they have are kept in Pulsar as a single source of truth and these messages can be offloaded via Pulsar to long term storage Pulsar also provides the power to enable a unified batch and streaming job that can do both batch processing by reading from underlying storage and combine that with real time streams all with a single technology
  8. Let's take a retrospective look at how Pulsar has evolved through the years. When we started designing Pulsar as a new platform, we always had this idea of supporting both the Pub-Sub semantics as well as the data streaming pipelines, which at the time were a new and emerging thing. But it would be a lie to say we had everything pre-planned since the beginning. Instead, we spent a lot of time observing how people used these platforms and we tried to fill all the gaps we were seeing, evolving Pulsar with the changing needs of data applications.
  9. At the very core of Pulsar there has always been the concept of the "log". A distributed, replicated and immutable ledger where all the events are appended. BookKeeper has proved, throughout the years, to be the best storage solution for streams of data. It scales to very large number of logs, it offers consistency, durability, low latency and high-throughput and, more importantly, very convenient operational tooling. To summarize: using the log as a building block does a lot of the heavy lifting required to build a truly scalable system.
  10. Another architectural choice that came naturally from using BookKeeper has been the separation of the storage from the data serving layer. This comes from BookKeeper because BookKeeper requires to have a single writer for a each log. In our case the Broker acts as that single writer. This multi-layer architecture was exactly what we needed because it allows Pulsar to have: 1. Stateless brokers - Means topics can be easily moved across brokers without copying any data. For example, expanding cluster or adjusting the topics assignments after changing conditions. 2. Data locality - Because of this broker layer, the data for a single topic or partition does not have to be stored in one single storage node. Instead we can fully utilize the resources of the entire cluster.
  11. We just said that the log is the building block of Pulsa... but the log on its own is a very low level construct. Applications very often need much more sophisticated ways of interacting with the data than just reading through the log of events. Instead, we wanted to capture the right level of semantics needed to support a wide range of pub-sub and streaming use cases. The core idea was to leave the flexibility to consume data from topics in multiple different ways, depending on what the application needs. We ended up having 4 subscription types with different semantics and different properties, each one with its own merits.
  12. After the Pub/Sub API, the next addition was the Reader API. You can think of it as the "unmanaged" way to consume data from a topic. While there are many reasons for using a reader, the main users are typically Stream Processing frameworks because they tend to have their own checkpointing mechanisms or, similarly, batch systems that want to do a scan of the historical data.
  13. The common theme in the API exposed by Pulsar is the support for Schema. Having direct support for Schema inside Pulsar means that brokers can validate the schema of the data being published and that the expectation of consumers is matched as well. But it also means that it becomes very easy to "discover" the schema of the data. The discoverability of the schema means that you can write fully type safe generic consumers that don't need to be aware of one specific schema.
  14. Next we looked at what people were trying to do with messaging platforms and the realization was that there was always some portion of computation involved. Application very often need to do simple data transformations, enrichment and similar things. Functions were designed to provide the simplicity of the "Serverless" model with a very tight integration in the Pulsar platform. One example of how powerful Pulsar functions are is that we have created a connector framework, Pulsar IO, entirely based on Pulsar Functions. With Pulsar IO, you can choose between a large set of pre-built connectors, both sources or sinks, or build your own custom connectors.
  15. After that, the next trend saw is that more and more users wanted to use the "stream" concept not just as a temporary buffer, as a way to isolate the data ingestion and the processing. Instead, they increasingly want to keep the stream as a permanent, or at least long term "storage of record". Tiered storage was the missing link to enable this. By offloading cold data to cloud storage providers, we can have large scale data retention at a very effective cost, all while maintaining the stream view of the data and the same APIs.
  16. Another realization was that, because of its nature, messaging is always the integration point for different applications and components. This makes migration from other platforms a bit harder. You often have to coordinate that migration across different teams or organizations. To make it easier, we extended the Pulsar brokers to be able to speak several protocols, in addition the Pulsar native protocols. With Protocol Handlers, there is a pluggable way to add more ways to interact with the Pulsar service and the same topic data. We started with KoP, Kafka On Pulsar, then followed up by AMQP and MQTT. It is very powerful mechanism for a few reasons: 1. Applications can use existing client libraries with no code or dependencies changes 2. You can mix all sort of different protocols to interact with the same topic 3. It's exposed directly in Pulsar brokers, data is stored only once and there is no "proxy overhead"
  17. To really complete the full picture, in Pulsar 2.8 we introduced support for transactions. It's now possible to do very complex interactions and take advantage of the transactional properties, for example publishing messages atomically across multiple topics, or consuming and producing atomically.
  18. We can say that Pulsar 2.8 is a big milestone in the journey completing this vision of unified messaging and streaming platform. We are very excited and very proud of this release. This is culminating months and months of work by a “larger than ever” group of committers and contributors. And while transactions support is the biggest new feature, it is certainly not the only one. We have feature like Exclusive producer support, about which I will Be talking about tomorrow in an ad-hoc session, a new API for package management, to improve the way we manage the functions and connectors code artifacts, or finally simplified way to configure memory limit in Pulsar clients.
  19. After looking at the past, let's now take a look at some of the items that we want to focus on in the very near future.
  20. A problem that we're seeing overall in the data ecosystem is that these platforms can be very difficult to tune and operate when running at a large scale. This is not a problem specific to Pulsar, but it is something that we believe it should be addressed. Typically, there are a lot of configuration options and each of them requires in-depth knowledge of the internal of the system. Worse, when integrating multiple systems, like a comput framework, it might be very hard to predict how a change in the configuration will affect the overall stability and performance. Finally, the workloads are increasingly dynamic and constantly changing. It's not possible to have a static configuration that will have "optimal" performance in every condition.
  21. The first item I want to discuss is partitioning. People are used to see partitioning and sharding, but these are really artifacts of how systems are implemented. Partitions are usually not a natural property of the data. Because of that, we want to abstract the partition concept away from the user sight. Application developers should not be worried about partitions, operators should not be thinking at how many partitions are needed for a certain use case. Instead, the system should be able to figure it out on its own, internally splitting and merging partitions, while maintaining the fundamentals ordering guarantees.
  22. Tuning storage system can also be a very complex task. In particular, it can be very hard to predict the impact of configuration on the overall performance when we're crossing multiple layers: there is the Operating System, the disk device and the disk controller. In a similar way, the idea we have is to make it working with no configuration, in a way that the storage system is able to automatically adjust the strategies based on the changing conditions of the traffic. All aspects regarding the access pattern to the disk, what kind of cache eviction strategy and so on.
  23. When we introduced Pulsar Functions, we had the idea of making it a frictionless platform for developers to do data processing. Over few years, the foundation of Pulsar Functions runtime has really matured into a solid platform, although the user experience is still not great. While it is very easy for developers to write functions, we should strive to make it much easier to actually deploy and manage functions. For example, having functions tooling to be well integrated with CI/CD platforms, supporting versioning and out of the box support for A/B testings. Another aspect is observability and debuggability. The tooling and the platform needs to make it super-easy for users to discover issues in their own code or to detect performance issues. Finally, we are thinking on a more higher level DSL, that can support higher level constructs to further simplify writing data processing functions.
  24. We talked before about Tiered Storage and how it has enabled completely new use cases to be supported by Pulsar. The next step here is to make sure we can integrate with existing data lake technologies, like Delta Lake and Apache Hudi. The vision is to use the Data Lake as the tiered storage backend, so that the same data can be consumed as a stream or with the data lake tooling.
  25. As a final note, given the very nature of Pulsar, that sits between different systems and platforms and links all of them together, we want to reaffirm our commitment to work with the larger data community to ensure that Pulsar is supported everywhere, out of the box, as a first class citizen. We have been partnering with many Open Source communities like Trino, Druid, Pinot, Spark and Flink. We will continue to do so, and more in the future. We believe that this will benefits Pulsar, its users and the overall data ecosystem.