Large scale log pipeline using
Apache Pulsar
Yahoo Japan Corporation
Nozomi Kurihara
June 18, 2020
Copyright (C) 2020 Yahoo Japan Corporation. All Rights Reserved.
Who am I?
Nozomi Kurihara
• Software engineer at Yahoo! JAPAN (April 2012 ~)
• Working on internal messaging platform using Apache Pulsar
• Committer of Apache Pulsar
• (Hobby: Board / video games!)
Agenda
1. Apache Pulsar at Yahoo! JAPAN
- About Yahoo! JAPAN
- Why Pulsar was chosen
- Architecture and performance
- Use cases
2. Large scale log pipeline
Apache Pulsar at Yahoo! JAPAN
Yahoo! JAPAN
https://www.yahoo.co.jp/
Yahoo! JAPAN – 3 numbers
• 100+ services
• 150,000+ servers (real)
• 49,010,000+ login users per month (2019/06)
Pulsar at Yahoo! JAPAN
• We have used Apache Pulsar as a centralized messaging platform for 3.5 years
• One Pulsar team maintains the platform, and many teams (services) use Pulsar as “tenants”
[Diagram: Services A, B, and C each run producers and consumers against their own topics (Topic A, Topic B, Topic C) on a single Pulsar operated by the Pulsar team]
Pulsar at Yahoo! JAPAN - Users
More and more services are starting to use Pulsar!
• 270+ tenants
• 4400+ topics
• ~50K publishes/s
• ~150K consumes/s
Typical use cases:
• Notification
• Job queueing
• Log pipeline
Pulsar community in Japan
TechBlog
- https://techblog.yahoo.co.jp/entry/20200312818173/
- https://techblog.yahoo.co.jp/entry/20200413827977/
- https://techblog.yahoo.co.jp/entry/2020060330002394/
Apache Pulsar Meetup Japan (in Tokyo)
- https://japan-pulsar-user-group.connpass.com/
Why Pulsar was chosen
Why did Yahoo! JAPAN choose Pulsar?
• Large number of customers → High performance & scalability
• Large number of services → Multi-tenancy
• Sensitive/mission-critical messages → Security & Durability
• Multiple data centers → Geo-replication
Pulsar meets all requirements!
Multi-tenancy
Share 1 Pulsar with all YJ services → low hardware and labor costs
[Diagram: instead of Services A, B, and C each running their own MQ between producer and consumer, each service uses its own topic on a single Pulsar run by the Pulsar team]
Multi-tenancy – self-service
Users can create/configure/delete their topics by themselves
→ management of topics is delegated to users
Internal Web UI tool to manage topics (will be replaced with pulsar-manager):
[Screenshots: create tenant, create namespace, see topic stats]
Architecture and performance
Clusters in Yahoo! JAPAN
Two clusters, East and West, connected by geo-replication.
For each cluster:
• 20 WebSocket (WS) proxies
• 15 Brokers
• 10 Bookies
• 5 ZooKeeper (ZK) servers
Performance – experimental settings
• Pulsar version: 2.3.2 (Broker) / 2.4.1 (Client)
• Tool: openmessaging-benchmark
• Message size: 1 KB
• Partitions: 1, 16, 32
• Rate (attempted): 100,000 and 500,000 msg/s
• Server spec:
         CPU             Memory  Disk                                  NIC
  Broker 2.00GHz / 2CPU  768GB   SATA SSD 240GB x2 (RAID1)             10GBaseT
  Bookie 2.00GHz / 2CPU  768GB   Journal: SATA SSD 240GB x2 (RAID1)    10GBaseT
                                 Ledger: SATA HDD 10TB x12 (RAID1+0)
Performance – experimental results
- With 16 and 32 partitions the publish rate reaches 500,000 msg/s, whereas with 1 partition it does not
- The maximum publish rate with 1 partition appears to be about 200,000 msg/s
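To put these rates next to the 10GBaseT NICs in the server spec, the payload bandwidth can be estimated with simple arithmetic. This is a rough sketch, not a measurement: it assumes "1 KB" means 1,000 bytes and ignores protocol overhead and replication traffic.

```python
MESSAGE_SIZE_BYTES = 1_000  # assumption: "1 KB" taken as 1,000 bytes


def payload_gbps(msgs_per_sec: int, msg_bytes: int = MESSAGE_SIZE_BYTES) -> float:
    """Payload bandwidth in Gbit/s for a given publish rate (no overhead counted)."""
    return msgs_per_sec * msg_bytes * 8 / 1e9


for rate in (200_000, 500_000):
    print(f"{rate:>7} msg/s -> {payload_gbps(rate):.1f} Gbps of payload")
```

Under these assumptions, 500,000 msg/s corresponds to roughly 4 Gbps of payload per direction.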
Tuning example (Bookie)
Problem:
• As the number of users grows, writes to the SSDs increase
• This shortens SSD lifespan (we actually saw frequent SSD failures)
Solution:
Increase journalMaxGroupWaitMSec from 1 to 2
→ Writes decreased by 30% at the cost of a slight increase in latency
Use cases
Case 1 – Notification of contents update
Various content files are pushed from partner companies to Yahoo! JAPAN
A notification is sent to a topic when content is updated
Once services receive the notification, they fetch the content from the file server
[Diagram: partner companies (weather, map, news, etc.) push files to the FTP server (ftpd); a producer publishes to a Pulsar topic; consumers in Services A, B, and C subscribe to it]
① send notification
② receive notification
③ fetch content files
Case 2 – Job queuing in mail service
Asynchronously execute heavy jobs like indexing of mail
Producers register jobs to Pulsar
Consumers take jobs from Pulsar at their own pace
[Diagram: Mail BE servers (producers) receive a request and register a job to a Pulsar topic; the handler for indexing (consumer) takes and processes a job; the job is re-registered if it fails]
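The register / take / re-register cycle can be sketched as follows. This is an illustration only: Python's in-process `queue.Queue` stands in for the Pulsar topic, and `process_jobs`, `index_mail`, and the `max_attempts` limit are hypothetical names and values that do not come from the slides.

```python
import queue


def process_jobs(jobs, handler, max_attempts=3):
    """Drain `jobs`; re-register a job when `handler` raises, up to max_attempts."""
    done, dropped = [], []
    while True:
        try:
            job_id, attempt = jobs.get_nowait()  # take a job
        except queue.Empty:
            break
        try:
            handler(job_id)  # process the job
            done.append(job_id)
        except Exception:
            if attempt + 1 < max_attempts:
                jobs.put((job_id, attempt + 1))  # re-register if it fails
            else:
                dropped.append(job_id)  # give up (assumed policy)
    return done, dropped


# Producers register jobs.
jobs = queue.Queue()
for mail_id in ("mail-1", "mail-2", "mail-3"):
    jobs.put((mail_id, 0))

failed_once = set()


def index_mail(mail_id):
    """Hypothetical handler that fails once for mail-2, then succeeds."""
    if mail_id == "mail-2" and mail_id not in failed_once:
        failed_once.add(mail_id)
        raise RuntimeError("transient failure")


done, dropped = process_jobs(jobs, index_mail)
print(done, dropped)
```

Because the failed job goes back to the end of the queue, it is retried after the other pending jobs, so the consumer keeps working at its own pace.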
Case 3 – Kafka alternative
We have an internal FaaS system using Apache OpenWhisk
Problem: FaaS team had to maintain Apache Kafka
Solution: migrate from Kafka to our internal Pulsar
Pulsar Kafka Wrapper needs only a few configuration changes (pom.xml, topic name, etc.)
<dependency>
-  <groupId>org.apache.kafka</groupId>
-  <artifactId>kafka-clients</artifactId>
-  <version>0.10.2.1</version>
+  <groupId>org.apache.pulsar</groupId>
+  <artifactId>pulsar-client-kafka</artifactId>
+  <version>2.4.0</version>
</dependency>
Large scale log pipeline
Situation
[Diagram: service developers deploy to computing platforms (PaaS, CaaS, FaaS, …) and monitor the logs/metrics]
Yamas
• Metrics monitoring / alerting platform (SaaS)
• Originally developed at Verizon Media
• Will be open-sourced soon!
Scale
• Total log volume: 1.4~3.8 TB/h
• Peak traffic: 10+ Gbps
• The number of PFs (platforms) will keep increasing
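For comparison with the peak figure, the hourly volume can be converted into an average bit rate. A small worked calculation, assuming decimal terabytes (1 TB = 10^12 bytes) and an even spread over the hour:

```python
def tb_per_hour_to_gbps(tb_per_hour: float) -> float:
    """Average bit rate in Gbit/s for a volume given in decimal TB per hour."""
    return tb_per_hour * 1e12 * 8 / 3600 / 1e9


low = tb_per_hour_to_gbps(1.4)
high = tb_per_hour_to_gbps(3.8)
print(f"average: {low:.1f} to {high:.1f} Gbps")
```

Under those assumptions the average is roughly 3 to 8 Gbps, which is consistent with peaks above 10 Gbps.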
Legacy architecture
[Diagram: on each computing PF (PaaS, CaaS, …), a Yamas agent and a Splunk agent run alongside the apps and send to the monitoring PFs (Splunk, Yamas)]
☹ A dedicated “agent” must be installed for each monitoring PF
☹ Difficult to scale out
☹ Traffic spikes directly affect the monitoring PFs
Motivation
Remove the dedicated agent for each monitoring PF:
- No need for specific knowledge or extra components
- Easier troubleshooting
Decouple sender/receiver PFs by introducing a message queueing layer:
- Scalability
- Resiliency
New architecture
[Diagram: apps on each computing PF (PaaS, CaaS, …) publish through a Pulsar producer to the Splunk topic and the Yamas topic on Pulsar; Pulsar consumers deliver the messages to the monitoring PFs (Splunk, Yamas)]
☺ Single library
☺ Easy to scale out
☺ Traffic spikes are mitigated by the queueing layer
Topic design – 3 patterns
① Producer-centric
[Diagram: PaaS and CaaS publish to PaaS and CaaS topics on Pulsar; Splunk and Yamas consume from them]
Messages are filtered/transformed on the consumer side:
☺ Producers don't care about consumers
☹ Consumers care about producers
② Consumer-centric
[Diagram: PaaS and CaaS publish to Splunk and Yamas topics on Pulsar; Splunk and Yamas consume from them]
Messages are filtered/transformed on the producer side:
☺ Consumers don't care about producers
☹ Producers care about consumers
③ Function
[Diagram: PaaS and CaaS publish to PaaS and CaaS topics; a function (func) on Pulsar forwards messages to Splunk and Yamas topics; Splunk and Yamas consume from them]
Messages are filtered/transformed in the function:
☺ Neither producers nor consumers care about each other
☹ Extra load: traffic, computing, storage, etc.
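The filter/transform step of pattern ③ can be sketched in plain Python. This is not the Pulsar Functions API: the `ROUTES` table, the `route` helper, the "drop messages without a body" rule, and the "compact the JSON" transform are all hypothetical choices made for illustration.

```python
import json

# Hypothetical routing table: which consumer topics want which message types.
ROUTES = {
    "log": ["splunk/west/log-0"],
    "metric": ["splunk/west/metric-0", "yamas/west/metric-0"],
}


def route(raw: bytes, message_type: str):
    """Function-side filter/transform: return (topic, payload) pairs to publish."""
    msg = json.loads(raw)
    if not msg.get("body"):
        return []  # filter: drop messages that carry no body
    payload = json.dumps(msg, separators=(",", ":")).encode()  # transform: compact JSON
    return [(topic, payload) for topic in ROUTES.get(message_type, [])]


raw = b'{"domain": "paas", "body": {"message": "hello"}}'
for topic, payload in route(raw, "metric"):
    print(topic, payload)
```

Each input message is republished once per destination, which is where the extra traffic and storage of this pattern come from.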
Topic format and message format
Topic format:
{consumer_pf}/{region}/{message_type}-{num}
• consumer_pf: splunk, yamas, …
• region: west, east
• message_type: log, metric, …
Example: splunk/west/log-0
Topics on Pulsar (west): splunk/west/log-0, splunk/west/log-1, splunk/west/metric-0, yamas/west/metric-0, …
Topics on Pulsar (east): splunk/east/log-0, splunk/east/log-1, splunk/east/metric-0, yamas/east/metric-0, …
Message format (sent by the Pulsar producer):
{
  "time": "2018-10-25T08:36:47.000Z",
  "producer": "paas-producer.example.com",
  "origin": "app.space.org.cluster.dc.nwseg",
  "domain": "paas",
  "body": {
    "message": "hello splunk",
    …
  }
}
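A producer-side helper for this naming scheme and envelope might look like the sketch below. The function names `topic_name` and `envelope` are hypothetical, and only the fields visible on the slide are filled in; the elided part of "body" is left out.

```python
import json
from datetime import datetime, timezone


def topic_name(consumer_pf: str, region: str, message_type: str, num: int) -> str:
    """Build a name in the {consumer_pf}/{region}/{message_type}-{num} format."""
    return f"{consumer_pf}/{region}/{message_type}-{num}"


def envelope(producer: str, origin: str, domain: str, body: dict, when: datetime) -> str:
    """Serialize a message with the top-level fields shown on the slide."""
    utc = when.astimezone(timezone.utc)
    # ISO 8601 in UTC with millisecond precision, e.g. 2018-10-25T08:36:47.000Z
    ts = utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"
    return json.dumps(
        {"time": ts, "producer": producer, "origin": origin, "domain": domain, "body": body}
    )


when = datetime(2018, 10, 25, 8, 36, 47, tzinfo=timezone.utc)
topic = topic_name("splunk", "west", "log", 0)
message = envelope(
    "paas-producer.example.com",
    "app.space.org.cluster.dc.nwseg",
    "paas",
    {"message": "hello splunk"},
    when,
)
print(topic)
print(message)
```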
Use case: Pulsar stats on Yamas
[Diagram: a Pulsar producer takes broker stats from /admin/v2/broker-stats/topics and publishes them to the Yamas topic on Pulsar, from which Yamas receives them]
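One way to turn a nested stats document into flat metrics before publishing is a generic walk over the JSON. The sketch below makes no assumption about the exact response layout of the endpoint: it just emits every numeric leaf under a dotted name. The `sample` dictionary is made up for illustration and is not real broker output.

```python
def flatten_metrics(stats, prefix=()):
    """Yield (dotted_name, value) for every numeric leaf in a nested dict."""
    for key, value in stats.items():
        path = prefix + (str(key),)
        if isinstance(value, dict):
            yield from flatten_metrics(value, path)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            yield ".".join(path), value


# Made-up sample for illustration only.
sample = {"tenant/ns": {"topic-a": {"msgRateIn": 12.5, "msgRateOut": 30.0, "name": "a"}}}
metrics = dict(flatten_metrics(sample))
print(metrics)
```

Non-numeric leaves such as names are skipped, so the result can be wrapped directly in metric messages.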
Conclusion
Conclusion:
• Yahoo! JAPAN uses Pulsar as a centralized platform for various services
• Recently we started using Pulsar as a large scale log pipeline, where
computing PFs publish their logs/metrics and monitoring PFs consume them
• Pulsar plays an important role in connecting various PFs and making the whole
system scalable and resilient
Future plan:
• More Producer PFs and Consumer PFs
• Visualize SLIs (message delivery rate, latency, etc.)
Large scale log pipeline using Apache Pulsar_Nozomi

More Related Content

What's hot

Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre
Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&PierreKafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre
Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre
StreamNative
 
Scaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsarScaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsar
StreamNative
 
Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)
StreamNative
 
Kafka on Pulsar
Kafka on Pulsar Kafka on Pulsar
Kafka on Pulsar
StreamNative
 
Query Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache FlinkQuery Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache Flink
StreamNative
 
Serverless Event Streaming with Pulsar Functions
Serverless Event Streaming with Pulsar FunctionsServerless Event Streaming with Pulsar Functions
Serverless Event Streaming with Pulsar Functions
StreamNative
 
Getting Pulsar Spinning_Addison Higham
Getting Pulsar Spinning_Addison HighamGetting Pulsar Spinning_Addison Higham
Getting Pulsar Spinning_Addison Higham
StreamNative
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
Discover Pinterest
 
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
StreamNative
 
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
StreamNative
 
When apache pulsar meets apache flink
When apache pulsar meets apache flinkWhen apache pulsar meets apache flink
When apache pulsar meets apache flink
StreamNative
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
Joe Stein
 
Apache Pulsar and Github
Apache Pulsar and GithubApache Pulsar and Github
Apache Pulsar and Github
StreamNative
 
Building a FaaS with pulsar
Building a FaaS with pulsarBuilding a FaaS with pulsar
Building a FaaS with pulsar
StreamNative
 
How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Effi...
How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Effi...How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Effi...
How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Effi...
StreamNative
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
confluent
 
Integrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemIntegrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data Ecosystem
StreamNative
 
Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014
Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014
Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014
Chen-en Lu
 
Architecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructureArchitecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructure
mattlieber
 
I Heart Log: Real-time Data and Apache Kafka
I Heart Log: Real-time Data and Apache KafkaI Heart Log: Real-time Data and Apache Kafka
I Heart Log: Real-time Data and Apache Kafka
Jay Kreps
 

What's hot (20)

Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre
Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&PierreKafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre
Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre
 
Scaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsarScaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsar
 
Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)Lessons from managing a Pulsar cluster (Nutanix)
Lessons from managing a Pulsar cluster (Nutanix)
 
Kafka on Pulsar
Kafka on Pulsar Kafka on Pulsar
Kafka on Pulsar
 
Query Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache FlinkQuery Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache Flink
 
Serverless Event Streaming with Pulsar Functions
Serverless Event Streaming with Pulsar FunctionsServerless Event Streaming with Pulsar Functions
Serverless Event Streaming with Pulsar Functions
 
Getting Pulsar Spinning_Addison Higham
Getting Pulsar Spinning_Addison HighamGetting Pulsar Spinning_Addison Higham
Getting Pulsar Spinning_Addison Higham
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
 
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
 
When apache pulsar meets apache flink
When apache pulsar meets apache flinkWhen apache pulsar meets apache flink
When apache pulsar meets apache flink
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 
Apache Pulsar and Github
Apache Pulsar and GithubApache Pulsar and Github
Apache Pulsar and Github
 
Building a FaaS with pulsar
Building a FaaS with pulsarBuilding a FaaS with pulsar
Building a FaaS with pulsar
 
How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Effi...
How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Effi...How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Effi...
How Apache Pulsar Helps Tencent Process Tens of Billions of Transactions Effi...
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
 
Integrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemIntegrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data Ecosystem
 
Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014
Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014
Apache Kafka: A high-throughput distributed messaging system @ JCConf 2014
 
Architecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructureArchitecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructure
 
I Heart Log: Real-time Data and Apache Kafka
I Heart Log: Real-time Data and Apache KafkaI Heart Log: Real-time Data and Apache Kafka
I Heart Log: Real-time Data and Apache Kafka
 

Similar to Large scale log pipeline using Apache Pulsar_Nozomi

Preparations for koha implementation
Preparations for koha implementationPreparations for koha implementation
Preparations for koha implementation
Mahatma Gandhi University Library
 
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DMUpgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Yahoo!デベロッパーネットワーク
 
Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?
DataCore Software
 
Implementing AI: High Performace Architectures
Implementing AI: High Performace ArchitecturesImplementing AI: High Performace Architectures
Implementing AI: High Performace Architectures
KTN
 
08 Supercomputer Fugaku
08 Supercomputer Fugaku08 Supercomputer Fugaku
08 Supercomputer Fugaku
RCCSRENKEI
 
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
HostedbyConfluent
 
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Timothy Spann
 
Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak Performance
Todd Palino
 
Apache Pulsar Development 101 with Python
Apache Pulsar Development 101 with PythonApache Pulsar Development 101 with Python
Apache Pulsar Development 101 with Python
Timothy Spann
 
“Quantum” Performance Effects: beyond the Core
“Quantum” Performance Effects: beyond the Core“Quantum” Performance Effects: beyond the Core
“Quantum” Performance Effects: beyond the Core
C4Media
 
Lagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
Lagopus presentation on 14th Annual ON*VECTOR International Photonics WorkshopLagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
Lagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
Lagopus SDN/OpenFlow switch
 
[RakutenTechConf2013] [C-1] Rakuten new infrastructure
[RakutenTechConf2013] [C-1] Rakuten new infrastructure[RakutenTechConf2013] [C-1] Rakuten new infrastructure
[RakutenTechConf2013] [C-1] Rakuten new infrastructure
Rakuten Group, Inc.
 
Apache Kafka - Strakin Technologies Pvt Ltd
Apache Kafka - Strakin Technologies Pvt LtdApache Kafka - Strakin Technologies Pvt Ltd
Apache Kafka - Strakin Technologies Pvt Ltd
Strakin Technologies Pvt Ltd
 
Hive Now Sparks
Hive Now SparksHive Now Sparks
Hive Now Sparks
DataWorks Summit
 
Updates on webSpoon and other innovations from Hitachi R&D
Updates on webSpoon and other innovations from Hitachi R&DUpdates on webSpoon and other innovations from Hitachi R&D
Updates on webSpoon and other innovations from Hitachi R&D
Hiromu Hota
 
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
Filipe Miranda
 
DataCore Technology Overview
DataCore Technology OverviewDataCore Technology Overview
DataCore Technology Overview
Jeff Slapp
 
What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017
Databricks
 
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANNetwork for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
DataWorks Summit/Hadoop Summit
 
Apache StreamPipes – Flexible Industrial IoT Management
Apache StreamPipes – Flexible Industrial IoT ManagementApache StreamPipes – Flexible Industrial IoT Management
Apache StreamPipes – Flexible Industrial IoT Management
Apache StreamPipes
 

Similar to Large scale log pipeline using Apache Pulsar_Nozomi (20)

Preparations for koha implementation
Preparations for koha implementationPreparations for koha implementation
Preparations for koha implementation
 
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DMUpgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
Upgrading HDFS to 3.3.0 and deploying RBF in production #LINE_DM
 
Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?Can $0.08 Change your View of Storage?
Can $0.08 Change your View of Storage?
 
Implementing AI: High Performace Architectures
Implementing AI: High Performace ArchitecturesImplementing AI: High Performace Architectures
Implementing AI: High Performace Architectures
 
08 Supercomputer Fugaku
08 Supercomputer Fugaku08 Supercomputer Fugaku
08 Supercomputer Fugaku
 
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
 
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
Why Spring Belongs In Your Data Stream (From Edge to Multi-Cloud)
 
Kafka at Peak Performance
Kafka at Peak PerformanceKafka at Peak Performance
Kafka at Peak Performance
 
Apache Pulsar Development 101 with Python
Apache Pulsar Development 101 with PythonApache Pulsar Development 101 with Python
Apache Pulsar Development 101 with Python
 
“Quantum” Performance Effects: beyond the Core
“Quantum” Performance Effects: beyond the Core“Quantum” Performance Effects: beyond the Core
“Quantum” Performance Effects: beyond the Core
 
Lagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
Lagopus presentation on 14th Annual ON*VECTOR International Photonics WorkshopLagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
Lagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
 
[RakutenTechConf2013] [C-1] Rakuten new infrastructure
[RakutenTechConf2013] [C-1] Rakuten new infrastructure[RakutenTechConf2013] [C-1] Rakuten new infrastructure
[RakutenTechConf2013] [C-1] Rakuten new infrastructure
 
Apache Kafka - Strakin Technologies Pvt Ltd
Apache Kafka - Strakin Technologies Pvt LtdApache Kafka - Strakin Technologies Pvt Ltd
Apache Kafka - Strakin Technologies Pvt Ltd
 
Hive Now Sparks
Hive Now SparksHive Now Sparks
Hive Now Sparks
 
Updates on webSpoon and other innovations from Hitachi R&D
Updates on webSpoon and other innovations from Hitachi R&DUpdates on webSpoon and other innovations from Hitachi R&D
Updates on webSpoon and other innovations from Hitachi R&D
 
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
 
DataCore Technology Overview
DataCore Technology OverviewDataCore Technology Overview
DataCore Technology Overview
 
What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017 What to Expect for Big Data and Apache Spark in 2017
What to Expect for Big Data and Apache Spark in 2017
 
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANNetwork for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
 
Apache StreamPipes – Flexible Industrial IoT Management
Apache StreamPipes – Flexible Industrial IoT ManagementApache StreamPipes – Flexible Industrial IoT Management
Apache StreamPipes – Flexible Industrial IoT Management
 

More from StreamNative

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
StreamNative
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
StreamNative
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
StreamNative
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
StreamNative
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
StreamNative
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
StreamNative
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
StreamNative
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
StreamNative
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
StreamNative
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Large scale log pipeline using Apache Pulsar_Nozomi

  • 1. Large scale log pipeline using Apache Pulsar. Yahoo Japan Corporation, Nozomi Kurihara. June 18, 2020
  • 2. Copyright (C) 2020 Yahoo Japan Corporation. All Rights Reserved. Who am I? Nozomi Kurihara
      • Software engineer at Yahoo! JAPAN (April 2012 ~)
      • Working on internal messaging platform using Apache Pulsar
      • Committer of Apache Pulsar
      • (Hobby: Board / video games!)
  • 3. Agenda
      1. Apache Pulsar at Yahoo! JAPAN
         - About Yahoo! JAPAN
         - Why Pulsar was chosen
         - Architecture and performance
         - Use cases
      2. Large scale log pipeline
  • 4. Apache Pulsar at Yahoo! JAPAN
  • 5. Yahoo! JAPAN https://www.yahoo.co.jp/
  • 6. Yahoo! JAPAN – 3 numbers
      • 100+ services
      • 150,000+ servers (real)
      • 49,010,000+ login users per month (2019/06)
  • 7. Pulsar at Yahoo! JAPAN
      • We have used Apache Pulsar as a centralized messaging platform for 3.5 years
      • One Pulsar maintainer team operates the platform, and many teams (services) use Pulsar as a “tenant”
      (Diagram: Services A, B and C run producers and consumers against Topics A, B and C on a single Pulsar operated by the Pulsar team.)
  • 8. Pulsar at Yahoo! JAPAN - Users
      More and more services are starting to use Pulsar!
      • 270+ tenants
      • 4400+ topics
      • ~50K publishes/s
      • ~150K consumes/s
      Typical use cases:
      • Notification
      • Job queueing
      • Log pipeline
  • 9. Pulsar community in Japan
      TechBlog
      - https://techblog.yahoo.co.jp/entry/20200312818173/
      - https://techblog.yahoo.co.jp/entry/20200413827977/
      - https://techblog.yahoo.co.jp/entry/2020060330002394/
      Apache Pulsar Meetup Japan (in Tokyo)
      - https://japan-pulsar-user-group.connpass.com/
  • 10. Why Pulsar was chosen
  • 11. Why did Yahoo! JAPAN choose Pulsar?
      • Large number of customers → High performance & scalability
      • Large number of services → Multi-tenancy
      • Sensitive/mission-critical messages → Security & durability
      • Multiple data centers → Geo-replication
      Pulsar meets all requirements!
  • 12. Multi-tenancy
      Share 1 Pulsar with all YJ services → low hardware and labor costs
      (Diagram: instead of Services A, B and C each running their own MQ between producer and consumer, each service gets its own topic on one shared Pulsar operated by the Pulsar team.)
  • 13. Multi-tenancy – self-service
      Users can create/configure/delete their topics by themselves → management of topics is delegated to users
      Internal Web UI tool to manage topics (will be replaced with pulsar-manager):
      • Create tenant
      • Create namespace
      • See topic stats
  • 14. Architecture and performance
  • 15. Clusters in Yahoo! JAPAN
      Two clusters, East and West, connected by geo-replication. Each consists of WebSocket proxies, brokers, bookies and ZooKeeper (ZK).
      For each cluster:
      • 20 WS proxies
      • 15 Brokers
      • 10 Bookies
      • 5 ZKs
  • 16. Performance – experimental settings
      • Pulsar version: 2.3.2 (Broker) / 2.4.1 (Client)
      • Tool: openmessaging-benchmark
      • Message size: 1 KB
      • Partitions: 1, 16, 32
      • Rate (attempted): 100000, 500000 msg/s
      • Server spec:
        - Broker: CPU 2.00GHz / 2CPU; Memory 768GB; Disk SATA SSD 240GB x2 (RAID1); NIC 10GBaseT
        - Bookie: CPU 2.00GHz / 2CPU; Memory 768GB; Journal disk SATA SSD 240GB x2 (RAID1); Ledger disk SATA HDD 10TB x12 (RAID1+0); NIC 10GBaseT
  • 17. Performance – experimental results
      - With 16 and 32 partitions, 500,000 msg/s is achieved, whereas with 1 partition it is not
      - The maximum publish rate with 1 partition appears to be about 200,000 msg/s
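To put the rates on slide 17 next to the 10GBaseT NICs on slide 16, the payload bandwidth can be estimated from message rate and message size. This is only a back-of-the-envelope sketch: it assumes "1 KB" means 1024 bytes and counts payload only, ignoring protocol overhead and replication traffic. The function name is hypothetical.

```python
def throughput_gbps(msgs_per_sec: int, msg_bytes: int) -> float:
    """Payload bandwidth in gigabits per second (1 Gbit = 10**9 bits)."""
    return msgs_per_sec * msg_bytes * 8 / 1e9


MSG_BYTES = 1024  # assumption: "1 KB" on the slide means 1024 bytes

for rate in (100_000, 200_000, 500_000):
    print(f"{rate:>7} msg/s -> {throughput_gbps(rate, MSG_BYTES):.2f} Gbps")
```

Under these assumptions the 500,000 msg/s target corresponds to roughly 4 Gbps of payload.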
  • 18. Tuning example (Bookie)
      Problem:
      • As the number of users increases, so do writes to the SSD
      • That reduces the lifespan of the SSD (we actually saw frequent SSD failures)
      Solution: Increase journalMaxGroupWaitMSec from 1 to 2
      → Writes decreased by 30% at the cost of a minimal increase in latency
      Server spec:
      - Broker: CPU 2.00GHz / 2CPU; Memory 768GB; Disk SATA SSD 240GB x2 (RAID1); NIC 10GBaseT
      - Bookie: CPU 2.00GHz / 2CPU; Memory 768GB; Journal disk SATA SSD 240GB x2 (RAID1); Ledger disk SATA HDD 10TB x12 (RAID1+0); NIC 10GBaseT
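The idea behind the tuning on slide 18 can be illustrated with a toy model of group commit: entries that arrive while a group is open are written together, so a longer wait means fewer, larger flushes. The sketch below is a simplified model, not BookKeeper's journal code. It assumes a group is flushed only when its oldest entry has waited `max_wait_ms`, and the arrival pattern is hypothetical.

```python
def count_flushes(arrival_ms, max_wait_ms):
    """Count flushes when a group is flushed once its oldest entry
    has waited max_wait_ms (simplified model of group commit)."""
    flushes = 0
    group_start = None
    for t in arrival_ms:
        if group_start is not None and t - group_start >= max_wait_ms:
            flushes += 1        # the open group was flushed by time t
            group_start = None
        if group_start is None:
            group_start = t     # this entry opens a new group
    if group_start is not None:
        flushes += 1            # flush the final group
    return flushes


# Hypothetical load: one journal entry every 0.5 ms for one second.
arrivals = [i * 0.5 for i in range(2000)]
print(count_flushes(arrivals, 1), count_flushes(arrivals, 2))
```

The toy model halves the flush count for this synthetic load. It is not meant to reproduce the 30% reduction measured in production, which depends on the real traffic pattern and on the journal's other flush conditions.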
  • 19. Use cases
  • 20. Case 1 – Notification of contents update
      • Various content files (weather, map, news, etc.) are pushed from partner companies to Yahoo! JAPAN
      • A notification is sent to a topic when contents are updated
      • Once services receive the notification, they fetch the contents from the file server
      Flow: ① the producer on the FTP server (ftpd) sends a notification to the Pulsar topic; ② the consumers of Services A, B and C receive the notification; ③ the services fetch the content files.
  • 21. Case 2 – Job queuing in mail service
      • Asynchronously execute heavy jobs such as mail indexing
      • Producers register jobs to Pulsar
      • Consumers take jobs from Pulsar at their own pace
      Flow: on a request, the mail BE server (producer) registers a job to the topic; the indexing handler (consumer) takes and processes a job, and re-registers it if it fails.
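The "take a job, re-register it if it fails" loop on slide 21 can be sketched without a broker. In the minimal sketch below an in-memory `deque` stands in for the Pulsar topic, and the `max_attempts` limit and the flaky handler are assumptions for illustration; the slides do not say how retries are bounded.

```python
from collections import deque


def run_jobs(jobs, handler, max_attempts=3):
    """Process jobs in order, re-registering a failed job at the back
    of the queue until max_attempts is reached (hypothetical limit)."""
    queue = deque((job, 1) for job in jobs)
    done, dropped = [], []
    while queue:
        job, attempt = queue.popleft()
        try:
            handler(job)
            done.append(job)
        except Exception:
            if attempt < max_attempts:
                queue.append((job, attempt + 1))  # re-register the job
            else:
                dropped.append(job)
    return done, dropped


_failed_once = set()


def flaky_index(job):
    """Hypothetical indexing handler that fails once for 'mail-2'."""
    if job == "mail-2" and job not in _failed_once:
        _failed_once.add(job)
        raise RuntimeError("index backend busy")


done, dropped = run_jobs(["mail-1", "mail-2", "mail-3"], flaky_index)
print(done, dropped)
```

Because the failed job goes to the back of the queue, later jobs are not blocked behind it, which matches the idea of consumers working at their own pace.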
  • 22. Case 3 – Kafka alternative
      We have an internal FaaS system using Apache OpenWhisk
      Problem: the FaaS team had to maintain Apache Kafka
      Solution: migrate from Kafka to our internal Pulsar
      The Pulsar Kafka Wrapper needs only a few configuration changes (pom.xml, topic name, etc.):
        <dependency>
      -   <groupId>org.apache.kafka</groupId>
      -   <artifactId>kafka-clients</artifactId>
      -   <version>0.10.2.1</version>
      +   <groupId>org.apache.pulsar</groupId>
      +   <artifactId>pulsar-client-kafka</artifactId>
      +   <version>2.4.0</version>
        </dependency>
  • 23. Large scale log pipeline
  • 24. Situation
      (Diagram: service developers deploy their apps to computing platforms (PaaS, CaaS, FaaS, …) and monitor their logs/metrics.)
  • 25. Yamas
      • Metrics monitoring / alerting platform (SaaS)
      • Originally developed at Verizon Media
      • Will be open-sourced soon!
  • 26. Scale
      • Total log volume: 1.4~3.8 TB/h
      • Peak traffic: 10+ Gbps
      • The number of PFs (platforms) will keep increasing
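The hourly volume on slide 26 can be converted to an average bit rate to compare it with the stated peak. The conversion below is a rough sketch that assumes decimal units (1 TB = 10**12 bytes); the function name is hypothetical.

```python
def tb_per_hour_to_gbps(tb_per_hour: float) -> float:
    """Average bit rate in Gbps, assuming 1 TB = 10**12 bytes."""
    bits_per_hour = tb_per_hour * 1e12 * 8
    return bits_per_hour / 3600 / 1e9


low = tb_per_hour_to_gbps(1.4)
high = tb_per_hour_to_gbps(3.8)
print(f"{low:.1f} to {high:.1f} Gbps on average")
```

Under that assumption the average is roughly 3 to 8 Gbps, so the 10+ Gbps peak sits above the hourly average, which is the kind of spike a queueing layer is meant to absorb.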
  • 27. Legacy architecture
      (Diagram: apps on the computing PFs (PaaS, CaaS, …) send to the monitoring PFs (Splunk, Yamas) through a dedicated Splunk agent and Yamas agent on each computing PF.)
      Drawbacks:
      • Need to install a dedicated “agent” for each monitoring PF
      • Difficult to scale out
      • Traffic spikes directly affect the monitoring PFs
  • 28. Motivation
      Remove the dedicated agent for each monitoring PF:
      - No need for specific knowledge or extra components
      - Easier troubleshooting
      Decouple sender/receiver PFs by introducing a message queueing layer:
      - Scalability
      - Resiliency
  • 29. New architecture
      (Diagram: apps on the computing PFs (PaaS, CaaS, …) publish through a Pulsar producer to the Splunk topic and the Yamas topic on Pulsar; Pulsar consumers deliver to the monitoring PFs (Splunk, Yamas).)
      Benefits:
      • Single library
      • Easy to scale out
      • Traffic spikes are mitigated by the queueing layer
  • 30. Topic design – 3 patterns
      ① Producer-centric (one topic per producer PF: PaaS, CaaS). Messages are filtered/transformed on the consumer side:
        (+) Producers don't care about consumers
        (-) Consumers care about producers
      ② Consumer-centric (one topic per consumer PF: Splunk, Yamas). Messages are filtered/transformed on the producer side:
        (+) Consumers don't care about producers
        (-) Producers care about consumers
      ③ Function (producer topics PaaS, CaaS → func → consumer topics Splunk, Yamas). Messages are filtered/transformed in a function:
        (+) Neither producers nor consumers care about each other
        (-) Extra load: traffic, computing, storage, etc.
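The filter/transform step that moves between producer, consumer and function in the three patterns on slide 30 can be pictured as a small routing function. The sketch below is purely illustrative: the `kind` field and the routing rules are hypothetical and do not appear in the slides, and the plain function stands in for whatever runs the logic (producer code, consumer code, or a Pulsar Function).

```python
from typing import Dict, List


def route(message: Dict) -> List[str]:
    """Return the consumer PFs that should receive this message.

    Hypothetical rules: metrics go to Yamas and Splunk, logs go to
    Splunk only, anything else is filtered out.
    """
    kind = message.get("kind")  # hypothetical field, not in the slides
    if kind == "metric":
        return ["yamas", "splunk"]
    if kind == "log":
        return ["splunk"]
    return []


print(route({"kind": "metric", "body": {"cpu": 0.7}}))
print(route({"kind": "log", "body": {"message": "hello splunk"}}))
```

Where this function runs is what distinguishes the patterns: inside each consumer for ①, inside each producer for ②, and in a separate stage between topics for ③.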
  • 31. Topic format and message format
      Topic format: {consumer_pf}/{region}/{message_type}-{num}
      • consumer_pf: splunk, yamas, …
      • region: west, east
      • message_type: log, metric, …
      Topics on Pulsar (west): splunk/west/log-0, splunk/west/log-1, splunk/west/metric-0, yamas/west/metric-0, …
      Topics on Pulsar (east): splunk/east/log-0, splunk/east/log-1, splunk/east/metric-0, yamas/east/metric-0, …
      Message format (sent by the Pulsar producer):
      {
        "time": "2018-10-25T08:36:47.000Z",
        "producer": "paas-producer.example.com",
        "origin": "app.space.org.cluster.dc.nwseg",
        "domain": "paas",
        "body": { "message": "hello splunk", … }
      }
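The naming scheme and message envelope on slide 31 can be expressed as two small helpers. This is a minimal sketch: the helper names are hypothetical, the `num` suffix is taken as a parameter because the slides do not say how it is chosen, and the timestamp formatting is inferred from the single example on the slide.

```python
import json
from datetime import datetime, timezone


def topic_name(consumer_pf: str, region: str, message_type: str, num: int) -> str:
    """Build a name of the form {consumer_pf}/{region}/{message_type}-{num}."""
    return f"{consumer_pf}/{region}/{message_type}-{num}"


def build_message(producer: str, origin: str, domain: str, body: dict,
                  when: datetime) -> str:
    """Serialize the envelope from slide 31 as JSON with a UTC timestamp."""
    utc = when.astimezone(timezone.utc)
    stamp = utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"
    return json.dumps({
        "time": stamp,
        "producer": producer,
        "origin": origin,
        "domain": domain,
        "body": body,
    })


topic = topic_name("splunk", "west", "log", 0)
payload = build_message(
    "paas-producer.example.com",
    "app.space.org.cluster.dc.nwseg",
    "paas",
    {"message": "hello splunk"},
    datetime(2018, 10, 25, 8, 36, 47, tzinfo=timezone.utc),
)
print(topic)
print(payload)
```

Keeping the consumer PF and region in the topic name means a consumer can subscribe only to the topics addressed to it in its own region.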
  • 32. Use case: Pulsar stats on Yamas
      (Diagram: a Pulsar producer takes the stats from /admin/v2/broker-stats/topics and publishes them to the Yamas topic on Pulsar, from which Yamas consumes them.)
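Before stats from an endpoint such as /admin/v2/broker-stats/topics can be published as metrics, the nested JSON has to be turned into individual values. The sketch below shows one generic way to do that, by walking a nested dictionary and yielding every numeric leaf with its path. The sample payload is hypothetical and much simpler than a real response, and the slides do not describe how the actual producer does this.

```python
def numeric_leaves(node, path=()):
    """Yield (path, value) for every numeric leaf in a nested dict."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from numeric_leaves(child, path + (key,))
    elif isinstance(node, (int, float)) and not isinstance(node, bool):
        yield path, node


# Hypothetical payload; a real response may be shaped differently.
sample = {
    "tenant/ns": {
        "persistent://tenant/ns/topic-a": {
            "msgRateIn": 12.5,
            "msgRateOut": 30.0,
            "name": "topic-a",
        }
    }
}

metrics = dict(numeric_leaves(sample))
for path, value in metrics.items():
    print(" / ".join(path), "=", value)
```

Paths are kept as tuples rather than joined with dots or slashes, because topic names themselves contain those characters.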
  • 33. Conclusion
  • 34. Conclusion:
      • Yahoo! JAPAN uses Pulsar as a centralized platform for various services
      • Recently we started using Pulsar as a large scale log pipeline where computing PFs publish their logs/metrics and monitoring PFs consume them
      • Pulsar plays an important role in connecting various PFs and making the whole system scalable and resilient
      Future plan:
      • More Producer PFs and Consumer PFs
      • Visualize SLIs (message delivery rate, latency, etc.)