Apache Pulsar
Unifies
Streaming and
Messaging for
Real-Time Data
● Apache Pulsar Committer | Author of Pulsar In Action
● Former Principal Software Engineer on Splunk’s messaging
team that is responsible for Splunk’s internal
Pulsar-as-a-Service platform.
● Former Director of Solution Architecture at Streamlio.
David
Kjerrumgaard
Developer Advocate
Tim Spann
Developer Advocate
Tim Spann, Developer Advocate at StreamNative
● FLiP(N) Stack = Flink, Pulsar and NiFI Stack
● Streaming Systems & Data Architecture Expert
● Experience:
○ 15+ years of experience with streaming technologies including Pulsar,
Flink, Spark, NiFi, Big Data, Cloud, MXNet, IoT, Python and more.
○ Today, he helps to grow the Pulsar community sharing rich technical
knowledge and experience at both global conferences and through
individual conversations.
Sijie Guo
ASF Member
Pulsar/BookKeeper PMC
Founder and CEO
Jia Zhai
Pulsar/BookKeeper PMC
Co-Founder
✓ Data veterans with
extensive industry
experience
✓ Original creators of Apache
Pulsar & BookKeeper
✓ Operated the largest
Pulsar/BookKeeper cluster
Matteo Merli
Pulsar PMC Chair,
BookKeeper PMC
CTO
StreamNative Executive Team
Apache Pulsar is a Cloud-Native
Messaging and Event-Streaming Platform.
CREATED
Originally
developed inside
Yahoo! as Cloud
Messaging
Service
GROWTH
10x Contributors
10MM+ Downloads
Ecosystem Expands
Kafka on Pulsar
AMQ on Pulsar
Functions
. . .
2012 2016 2018 TODAY
APACHE TLP
Pulsar
becomes
Apache top
level project.
OPEN SOURCE
Pulsar
committed
to open source.
Apache Pulsar Timeline
Evolution of Pulsar Growth
Pulsar Has a Built-in Super Set of OSS
Features
Durability
Scalability Geo-Replication
Multi-Tenancy
Unified Messaging
Model
Reduced Vendor Dependency
Functions
Open-Source Features
Multi-Tenancy Model
Tenants
(Data Services)
Namespace
(Microservices)
Pulsar Cluster
Tenants
(Marketing)
Tenants
(Compliance)
Namespace
(ETL)
Namespace
(Campaigns)
Namespace
(ETL)
Namespace
(Risk Assessment)
Topic-1
(Cust Auth)
Topic-1
(Location Resolution)
Topic-2
(Demographics)
Topic-1
(Budgeted Spend)
Topic-1
(Acct History)
Topic-2
(Risk Detection)
Fintech Powered by Pulsar
10
Low latency
Geo-replication
Data integrity
High availability
Durability
Multi-tenancy
Multiple data
consumers:
Transactions,
payment
processing, alerts,
analytics, fraud
detection with ML
Large data
volumes, high
scalability
Financial
event
messaging
Many topics,
producers,
consumers
11
Designed for
teams, with
built in
multi-tenancy
Power and
flexibility,
w/ support for
simultaneous
streaming and
messaging use
cases
Ideal for
high-scale,
mission
critical
microservices
Easy to use,
with a simple
pub/sub API
Asynchronous APIs Empower All
Ideal for app and data tiers
Less sprawl and better
utilization
Cloud-native scalability
Build globally without
the complexity
Cost effective long-term
storage
Pulsar across the
organization
Joining Streams in SQL
Perform in Real-Time
Ordering and Arrival
Concurrent Consumers
Change Data Capture
Data Streaming
Streaming
Consumer
Consumer
Consumer
Subscription
Shared
Failover
Consumer
Consumer
Subscription
In case of failure in
Consumer B-0
Consumer
Consumer
Subscription
Exclusive
X
Consumer
Consumer
Key-Shared
Subscription
Pulsar
Topic/Partition
Messaging
Background
● The third-largest payment
provider in China behind
Alipay and WeChat
Payment
● 500 million registered users
and 41.9 million active users
● Need to improve the
efficiency of fraud detection
for mobile payments
● Current lambda architecture
of Kafka + Hive is complex
and difficult to maintain
Benefits
● Reduce complexity by 33%
(clusters reduced from six to
four)
● Improve production
efficiency by 11 times
● Higher stability due to the
unified architecture
Why Pulsar
● Cloud-native architecture
and segment-centric
storage
● Pulsar is able to do both
streaming and batch
processing
● Able to build a unified
data processing stack
with Pulsar and Spark,
streamlining messy
operations problems
Apps
Building Real-Time Requires a Team
DEMO
Scan the QR code
to learn more about
Apache Pulsar and
StreamNative.
Scan the QR code
to build your own
apps today.
Apache Pulsar
Apache BookKeeper
Broker 0
Producer
Consumer - Kafka
Broker 1 Broker 2
Bookie 0 Bookie 1 Bookie 2 Bookie 3 Bookie 4
T
1
T
2
T
3
T
4
T
0
Consumer - Pulsar

[AerospikeRoadshow] Apache Pulsar Unifies Streaming and Messaging for Real-Time Data

  • 1.
  • 2.
    ● Apache PulsarCommitter | Author of Pulsar In Action ● Former Principal Software Engineer on Splunk’s messaging team that is responsible for Splunk’s internal Pulsar-as-a-Service platform. ● Former Director of Solution Architecture at Streamlio. David Kjerrumgaard Developer Advocate
  • 3.
    Tim Spann Developer Advocate TimSpann, Developer Advocate at StreamNative ● FLiP(N) Stack = Flink, Pulsar and NiFI Stack ● Streaming Systems & Data Architecture Expert ● Experience: ○ 15+ years of experience with streaming technologies including Pulsar, Flink, Spark, NiFi, Big Data, Cloud, MXNet, IoT, Python and more. ○ Today, he helps to grow the Pulsar community sharing rich technical knowledge and experience at both global conferences and through individual conversations.
  • 4.
    Sijie Guo ASF Member Pulsar/BookKeeperPMC Founder and CEO Jia Zhai Pulsar/BookKeeper PMC Co-Founder ✓ Data veterans with extensive industry experience ✓ Original creators of Apache Pulsar & BookKeeper ✓ Operated the largest Pulsar/BookKeeper cluster Matteo Merli Pulsar PMC Chair, BookKeeper PMC CTO StreamNative Executive Team
  • 5.
    Apache Pulsar isa Cloud-Native Messaging and Event-Streaming Platform.
  • 6.
    CREATED Originally developed inside Yahoo! asCloud Messaging Service GROWTH 10x Contributors 10MM+ Downloads Ecosystem Expands Kafka on Pulsar AMQ on Pulsar Functions . . . 2012 2016 2018 TODAY APACHE TLP Pulsar becomes Apache top level project. OPEN SOURCE Pulsar committed to open source. Apache Pulsar Timeline
  • 7.
  • 8.
    Pulsar Has aBuilt-in Super Set of OSS Features Durability Scalability Geo-Replication Multi-Tenancy Unified Messaging Model Reduced Vendor Dependency Functions Open-Source Features
  • 9.
    Multi-Tenancy Model Tenants (Data Services) Namespace (Microservices) PulsarCluster Tenants (Marketing) Tenants (Compliance) Namespace (ETL) Namespace (Campaigns) Namespace (ETL) Namespace (Risk Assessment) Topic-1 (Cust Auth) Topic-1 (Location Resolution) Topic-2 (Demographics) Topic-1 (Budgeted Spend) Topic-1 (Acct History) Topic-2 (Risk Detection)
  • 10.
    Fintech Powered byPulsar 10 Low latency Geo-replication Data integrity High availability Durability Multi-tenancy Multiple data consumers: Transactions, payment processing, alerts, analytics, fraud detection with ML Large data volumes, high scalability Financial event messaging Many topics, producers, consumers
  • 11.
    11 Designed for teams, with builtin multi-tenancy Power and flexibility, w/ support for simultaneous streaming and messaging use cases Ideal for high-scale, mission critical microservices Easy to use, with a simple pub/sub API Asynchronous APIs Empower All
  • 12.
    Ideal for appand data tiers Less sprawl and better utilization Cloud-native scalability Build globally without the complexity Cost effective long-term storage Pulsar across the organization
  • 13.
    Joining Streams inSQL Perform in Real-Time Ordering and Arrival Concurrent Consumers Change Data Capture Data Streaming
  • 14.
    Streaming Consumer Consumer Consumer Subscription Shared Failover Consumer Consumer Subscription In case offailure in Consumer B-0 Consumer Consumer Subscription Exclusive X Consumer Consumer Key-Shared Subscription Pulsar Topic/Partition Messaging
  • 17.
    Background ● The third-largestpayment provider in China behind Alipay and WeChat Payment ● 500 million registered users and 41.9 million active users ● Need to improve the efficiency of fraud detection for mobile payments ● Current lambda architecture of Kafka + Hive is complex and difficult to maintain Benefits ● Reduce complexity by 33% (clusters reduced from six to four) ● Improve production efficiency by 11 times ● Higher stability due to the unified architecture Why Pulsar ● Cloud-native architecture and segment-centric storage ● Pulsar is able to do both streaming and batch processing ● Able to build a unified data processing stack with Pulsar and Spark, streamlining messy operations problems
  • 18.
  • 19.
  • 20.
    Scan the QRcode to learn more about Apache Pulsar and StreamNative.
  • 21.
    Scan the QRcode to build your own apps today.
  • 22.
    Apache Pulsar Apache BookKeeper Broker0 Producer Consumer - Kafka Broker 1 Broker 2 Bookie 0 Bookie 1 Bookie 2 Bookie 3 Bookie 4 T 1 T 2 T 3 T 4 T 0 Consumer - Pulsar