SlideShare a Scribd company logo
1 of 69
Dive into Streams with Brooklin
Celia Kung
LinkedIn
Outline
Background
Scenarios
Application Use Cases
Architecture
Current and Future
Background
Nearline Applications
• Require near real-time response
• Thousands of applications at LinkedIn
○ E.g. Live search indices, Notifications
Nearline Applications
• Require continuous, low-latency access to data
○ Data could be spread across multiple database systems
Heterogeneous Data Systems
Espresso
(LinkedIn’s
document store)
Microsoft
EventHubs
Nearline Applications
• Need an easy way to move data to applications
○ App devs should focus on event processing and not on
data access
Building the Right Infrastructure
• Build separate, specialized
solutions to stream data from
and to each different system?
○ Slows down development
○ Hard to manage!
Microsoft
EventHubs
...
Streaming
System C
Streaming
System D
Streaming
System A
Streaming
System B
...
Nearline
Applications
Need a centralized, managed, and extensible service
to continuously deliver data in near real-time
Brooklin
Brooklin
• Streams are dynamically
provisioned and individually
configured
• Extensible: Plug-in support for
additional sources/destinations
• Streaming data pipeline service
• Propagates data from many source
types to many destination types
• Multitenant: Can run several
thousand streams simultaneously
Kinesis
EventHubs
Kafka
Destinations
Applications
Kinesis
EventHubs
Kafka
Databases
Messaging
Systems
Source
s
Espresso
Pluggable Sources & Destinations
Scenarios
Change Data Capture
Scenario 1:
Capturing Live Updates
1.Member updates her profile to
reflect her recent job change
Capturing Live Updates
2.LinkedIn wants to inform her
colleagues of this change
Capturing Live Updates
Member DB
Updates
News Feed
Service
Capturing Live Updates
Member DB
Updates
News Feed
Service
Search Indices
Service
Capturing Live Updates
Member DB
Updates
News Feed
Service
Notifications
Service
Standardization
Service
...
Search Indices
Service
Capturing Live Updates
Member DB
Updates
News Feed
Service
Notifications
Service
Standardization
Service
...
Search Indices
Service
Capturing Live Updates
Member DB
Updates
News Feed
Service
Notifications
Service
Standardization
Service
...
Search Indices
Service
• Isolation: Applications are decoupled
from the sources and don’t compete
for resources with online queries
• Applications can be at different
points in change timelines
• Brooklin can stream database
updates to a change stream
• Data processing applications
consume from change streams
Change Data Capture (CDC)
Change Data Capture (CDC)
Messaging System
News Feed
Service
Notifications
Service
Standardization
Service
Search Indices
Service
Member DB
Updates
Streaming Bridge
Scenario 2:
● Across…
○ cloud services
○ clusters
○ data centers
Stream Data from X to Y
Streaming Bridge
• Data pipe to move data between
different environments
• Enforce policy: Encryption,
Obfuscation, Data formats
● Aggregating data from all data centers into a centralized place
● Moving data between LinkedIn and external cloud services (e.g. Azure)
● Brooklin has replaced Kafka MirrorMaker (KMM) at LinkedIn
○ Issues with KMM: didn’t scale well, difficult to operate and manage, poor
failure isolation
Mirroring Kafka Data
Use Brooklin to Mirror Kafka Data
DestinationsSources
Messaging systems
Microsoft
EventHubs
Messaging systems
Microsoft
EventHubs
Databases Databases
Kafka
MirrorMaker
Topology
Datacenter B
aggregate
tracking
tracking
Datacenter A
aggregate
tracking
tracking
KMM
aggregate
metrics
metrics
aggregate
metrics
metrics
Datacenter C
aggregate
tracking
tracking
aggregate
metrics
metrics
...
KMM KMM
KMM KMM KMM
KMM KMM KMM KMM KMM KMM
KMM KMM KMM KMM KMM KMM
... ... ...
Brooklin Kafka Mirroring Topology
Datacenter A
aggregate
tracking
tracking
Brooklin
metrics
aggregate
metrics
Datacenter B
aggregate
tracking
tracking metrics
aggregate
metrics
Datacenter C
aggregate
tracking
tracking metrics
aggregate
metrics
...Brooklin Brooklin
Brooklin Kafka Mirroring
● Optimized for stability and operability
● Manually pause and resume mirroring at every level
○ Entire pipeline, topic, topic-partition
● Can auto-pause partitions facing mirroring issues
○ Auto-resumes the partitions after a configurable duration
● Flow of messages from other partitions is unaffected
Application Use Cases
Security
Cache
Application Use Cases
Security
Cache Search Indices
Application Use Cases
Security
Cache Search Indices ETL or Data
warehouse
Application Use Cases
Security
Cache Search Indices ETL or Data
warehouse
Materialized Views or
Replication
Application Use Cases
Security
Cache Search Indices ETL or Data
warehouse
Materialized Views or
Replication
Repartitioning
Application Use Cases
Adjunct Data
Application Use Cases
BridgeAdjunct Data
Application Use Cases
Bridge Serde, Encryption,
Policy
Adjunct Data
Application Use Cases
Bridge Serde, Encryption,
Policy
Standardization,
Notifications …
Adjunct Data
Application Use Cases
Architecture
Stream updates made to Member Profile
Example:
Capturing Live Updates
Member DB
Updates
News Feed
Service
● Scenario: Stream Espresso Member Profile updates into Kafka
○ Source Database: Espresso (Member DB, Profile table)
○ Destination: Kafka
○ Application: News Feed service
Example
• Describes the data pipeline
• Mapping between source and
destination
• Holds the configuration for the pipeline
Datastream
Name: MemberProfileChangeStream
Source: MemberDB/ProfileTable
Type: Espresso
Partitions: 8
Destination: ProfileTopic
Type: Kafka
Partitions: 8
Metadata:
Application: News Feed service
Owner: newsfeed@linkedin.com
1. Client makes REST call to create datastream
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
create POST /datastream
News Feed
service
2. Create request goes to any Brooklin instance
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
3. Datastream is written to ZooKeeper
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
4. Leader coordinator is notified of new datastream
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
5. Leader coordinator calculates work distribution
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
6. Leader coordinator writes the assignments to ZK
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
7. ZooKeeper is used to communicate the assignments
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
8. Coordinators hand task assignments to consumers
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
9. Consumers start streaming data from the source
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
10. Consumers propagate data to producers
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
11. Producers write data to the destination
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
12. App consumes from Kafka
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
13. Destinations can be shared by apps
Brooklin Client
Load Balancer
ZooKeeper
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
(Leader)
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Brooklin Instance
Datastream
Management
Service (DMS) Espresso Consumer
Coordinator
Kafka Producer
Member DB
News Feed
service
News Feed
service
News Feed
service
Brooklin Architecture
ZooKeeper
Brooklin Instance
Brooklin Instance
Coordinator
Datastream
Management
Service
(DMS)
ConsumerA
ConsumerB
ProducerX
ProducerY
ProducerZ
Brooklin Instance
Datastream
Management
Service
(DMS)
ConsumerA
ProducerX
ConsumerB
ProducerY
ProducerZ
Coordinator
(Leader)
Brooklin Instance
Coordinator
Datastream
Management
Service
(DMS)
ConsumerA
ConsumerB
ProducerX
ProducerY
ProducerZ
Brooklin Instance
Current & Future
Current
• Consumers:
○ Espresso
○ Oracle
○ Kafka
○ EventHubs
○ Kinesis
• Producers:
○ Kafka
○ EventHubs
• APIs are standardized to support additional
sources and destinations
• Multitenant: Can power thousands of
datastreams across several source and
destination types
• Guarantees: At-least-once delivery, order is
maintained at partition level
• Kafka mirroring improvements: finer control
of pipelines (pause/auto-pause partitions),
improved latency with flushless-produce
mode
Sources & Destinations Features
38B
messages/day
2K+
datastreams
1K+
unique sources
200+
applications
Brooklin in Production
Brooklin streams with Espresso or Oracle as the source
3T+
messages/day
200+
datastreams
10K+
topics
Brooklin in Production
Brooklin streams mirroring Kafka data
Brooklin is now open-source!
Future
• Consumers:
○ MySQL
○ Cosmos DB
○ Azure SQL
• Producers:
○ Azure Blob storage
○ Kinesis
○ Cosmos DB
○ Azure SQL
○ Couchbase
• Brooklin auto-scaling
• Passthrough compression
• Read optimizations: Read once, Write multiple
Sources & Destinations Optimizations
Thank
you
Questions?
: ckung@linkedin.com
: linkedin.com/in/celiakkung/
: gitter.im/linkedin/brooklin
: github.com/linkedin/brooklin
Dive into Streams with Brooklin

More Related Content

What's hot

Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)KafkaZone
 
IBM Message Hub service in Bluemix - Apache Kafka in a public cloud
IBM Message Hub service in Bluemix - Apache Kafka in a public cloudIBM Message Hub service in Bluemix - Apache Kafka in a public cloud
IBM Message Hub service in Bluemix - Apache Kafka in a public cloudAndrew Schofield
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIconfluent
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Precisely
 
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.ioKickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.ioHostedbyConfluent
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache KafkaBen Stopford
 
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...Timo Walther
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...confluent
 
Building an Enterprise Eventing Framework (Bryan Zelle, Centene; Neil Buesing...
Building an Enterprise Eventing Framework (Bryan Zelle, Centene; Neil Buesing...Building an Enterprise Eventing Framework (Bryan Zelle, Centene; Neil Buesing...
Building an Enterprise Eventing Framework (Bryan Zelle, Centene; Neil Buesing...confluent
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedIn Data Infrastructure at LinkedIn
Data Infrastructure at LinkedIn Amy W. Tang
 
Schemas, streams, and grocery stores
Schemas, streams, and grocery storesSchemas, streams, and grocery stores
Schemas, streams, and grocery storesconfluent
 
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...HostedbyConfluent
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...HostedbyConfluent
 
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...confluent
 
Building an Enterprise Eventing Framework
Building an Enterprise Eventing FrameworkBuilding an Enterprise Eventing Framework
Building an Enterprise Eventing Frameworkconfluent
 
Confluent & Attunity: Mainframe Data Modern Analytics
Confluent & Attunity: Mainframe Data Modern AnalyticsConfluent & Attunity: Mainframe Data Modern Analytics
Confluent & Attunity: Mainframe Data Modern Analyticsconfluent
 
Confluent Messaging Modernization Forum
Confluent Messaging Modernization ForumConfluent Messaging Modernization Forum
Confluent Messaging Modernization Forumconfluent
 
Etl, esb, mq? no! es Apache Kafka®
Etl, esb, mq?  no! es Apache Kafka®Etl, esb, mq?  no! es Apache Kafka®
Etl, esb, mq? no! es Apache Kafka®confluent
 
Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020Timothy Spann
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafkaconfluent
 

What's hot (20)

Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)Introduction to ksqlDB and stream processing (Vish Srinivasan  - Confluent)
Introduction to ksqlDB and stream processing (Vish Srinivasan - Confluent)
 
IBM Message Hub service in Bluemix - Apache Kafka in a public cloud
IBM Message Hub service in Bluemix - Apache Kafka in a public cloudIBM Message Hub service in Bluemix - Apache Kafka in a public cloud
IBM Message Hub service in Bluemix - Apache Kafka in a public cloud
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
 
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.ioKickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
 
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2:  Building Event-Driven Services with Apache KafkaEvent Driven Services Part 2:  Building Event-Driven Services with Apache Kafka
Event Driven Services Part 2: Building Event-Driven Services with Apache Kafka
 
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
 
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
A Marriage of Lambda and Kappa: Supporting Iterative Development of an Event ...
 
Building an Enterprise Eventing Framework (Bryan Zelle, Centene; Neil Buesing...
Building an Enterprise Eventing Framework (Bryan Zelle, Centene; Neil Buesing...Building an Enterprise Eventing Framework (Bryan Zelle, Centene; Neil Buesing...
Building an Enterprise Eventing Framework (Bryan Zelle, Centene; Neil Buesing...
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedIn Data Infrastructure at LinkedIn
Data Infrastructure at LinkedIn
 
Schemas, streams, and grocery stores
Schemas, streams, and grocery storesSchemas, streams, and grocery stores
Schemas, streams, and grocery stores
 
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
 
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
SingleStore & Kafka: Better Together to Power Modern Real-Time Data Architect...
 
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
 
Building an Enterprise Eventing Framework
Building an Enterprise Eventing FrameworkBuilding an Enterprise Eventing Framework
Building an Enterprise Eventing Framework
 
Confluent & Attunity: Mainframe Data Modern Analytics
Confluent & Attunity: Mainframe Data Modern AnalyticsConfluent & Attunity: Mainframe Data Modern Analytics
Confluent & Attunity: Mainframe Data Modern Analytics
 
Confluent Messaging Modernization Forum
Confluent Messaging Modernization ForumConfluent Messaging Modernization Forum
Confluent Messaging Modernization Forum
 
Etl, esb, mq? no! es Apache Kafka®
Etl, esb, mq?  no! es Apache Kafka®Etl, esb, mq?  no! es Apache Kafka®
Etl, esb, mq? no! es Apache Kafka®
 
Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020
 
Data integration with Apache Kafka
Data integration with Apache KafkaData integration with Apache Kafka
Data integration with Apache Kafka
 

Similar to Dive into Streams with Brooklin

A Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinA Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinC4Media
 
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn confluent
 
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...HostedbyConfluent
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsVMware Tanzu
 
Brooklin Mirror Maker - How and why we moved away from Kafka Mirror Maker
Brooklin Mirror Maker - How and why we moved away from Kafka Mirror MakerBrooklin Mirror Maker - How and why we moved away from Kafka Mirror Maker
Brooklin Mirror Maker - How and why we moved away from Kafka Mirror MakerShun-ping Chiu
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...HostedbyConfluent
 
Stream Processing with Flink and Stream Sharing
Stream Processing with Flink and Stream SharingStream Processing with Flink and Stream Sharing
Stream Processing with Flink and Stream Sharingconfluent
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Nitin Kumar
 
Databus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture PipelineDatabus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture PipelineSunil Nagaraj
 
Data Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBData Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBconfluent
 
Streaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaStreaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaAttunity
 
Introducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building SocietyIntroducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building Societyconfluent
 
Easily Build a Smart Pulsar Stream Processor_Simon Crosby
Easily Build a Smart Pulsar Stream Processor_Simon CrosbyEasily Build a Smart Pulsar Stream Processor_Simon Crosby
Easily Build a Smart Pulsar Stream Processor_Simon CrosbyStreamNative
 
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...HostedbyConfluent
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...confluent
 
Strategies For Migrating From SQL to NoSQL — The Apache Kafka Way
Strategies For Migrating From SQL to NoSQL — The Apache Kafka WayStrategies For Migrating From SQL to NoSQL — The Apache Kafka Way
Strategies For Migrating From SQL to NoSQL — The Apache Kafka WayScyllaDB
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...confluent
 
Data Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAData Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAAndrew Morgan
 
Webinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDBWebinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDBMongoDB
 
James Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 KeynoteJames Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 KeynoteJames Watters
 

Similar to Dive into Streams with Brooklin (20)

A Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with BrooklinA Dive into Streams @LinkedIn with Brooklin
A Dive into Streams @LinkedIn with Brooklin
 
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
More Data, More Problems: Scaling Kafka-Mirroring Pipelines at LinkedIn
 
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
From Monoliths to Microservices - A Journey With Confluent With Gayathri Veal...
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive Applications
 
Brooklin Mirror Maker - How and why we moved away from Kafka Mirror Maker
Brooklin Mirror Maker - How and why we moved away from Kafka Mirror MakerBrooklin Mirror Maker - How and why we moved away from Kafka Mirror Maker
Brooklin Mirror Maker - How and why we moved away from Kafka Mirror Maker
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
 
Stream Processing with Flink and Stream Sharing
Stream Processing with Flink and Stream SharingStream Processing with Flink and Stream Sharing
Stream Processing with Flink and Stream Sharing
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017
 
Databus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture PipelineDatabus - LinkedIn's Change Data Capture Pipeline
Databus - LinkedIn's Change Data Capture Pipeline
 
Data Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDBData Streaming with Apache Kafka & MongoDB
Data Streaming with Apache Kafka & MongoDB
 
Streaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache KafkaStreaming Data Ingest and Processing with Apache Kafka
Streaming Data Ingest and Processing with Apache Kafka
 
Introducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building SocietyIntroducing Events and Stream Processing into Nationwide Building Society
Introducing Events and Stream Processing into Nationwide Building Society
 
Easily Build a Smart Pulsar Stream Processor_Simon Crosby
Easily Build a Smart Pulsar Stream Processor_Simon CrosbyEasily Build a Smart Pulsar Stream Processor_Simon Crosby
Easily Build a Smart Pulsar Stream Processor_Simon Crosby
 
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
Transform Your Mainframe and IBM i Data for the Cloud with Precisely and Apac...
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
 
Strategies For Migrating From SQL to NoSQL — The Apache Kafka Way
Strategies For Migrating From SQL to NoSQL — The Apache Kafka WayStrategies For Migrating From SQL to NoSQL — The Apache Kafka Way
Strategies For Migrating From SQL to NoSQL — The Apache Kafka Way
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
 
Data Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEAData Streaming with Apache Kafka & MongoDB - EMEA
Data Streaming with Apache Kafka & MongoDB - EMEA
 
Webinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDBWebinar: Data Streaming with Apache Kafka & MongoDB
Webinar: Data Streaming with Apache Kafka & MongoDB
 
James Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 KeynoteJames Watters Kafka Summit NYC 2019 Keynote
James Watters Kafka Summit NYC 2019 Keynote
 

Recently uploaded

SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...srsj9000
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 

Recently uploaded (20)

SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 

Dive into Streams with Brooklin

  • 1. Dive into Streams with Brooklin Celia Kung LinkedIn
  • 4. Nearline Applications • Require near real-time response • Thousands of applications at LinkedIn ○ E.g. Live search indices, Notifications
  • 5. Nearline Applications • Require continuous, low-latency access to data ○ Data could be spread across multiple database systems
  • 7. Nearline Applications • Need an easy way to move data to applications ○ App devs should focus on event processing and not on data access
  • 8. Building the Right Infrastructure • Build separate, specialized solutions to stream data from and to each different system? ○ Slows down development ○ Hard to manage! Microsoft EventHubs ... Streaming System C Streaming System D Streaming System A Streaming System B ... Nearline Applications
  • 9. Need a centralized, managed, and extensible service to continuously deliver data in near real-time
  • 11. Brooklin • Streams are dynamically provisioned and individually configured • Extensible: Plug-in support for additional sources/destinations • Streaming data pipeline service • Propagates data from many source types to many destination types • Multitenant: Can run several thousand streams simultaneously
  • 15. Capturing Live Updates 1.Member updates her profile to reflect her recent job change
  • 16. Capturing Live Updates 2.LinkedIn wants to inform her colleagues of this change
  • 17. Capturing Live Updates Member DB Updates News Feed Service
  • 18. Capturing Live Updates Member DB Updates News Feed Service Search Indices Service
  • 19. Capturing Live Updates Member DB Updates News Feed Service Notifications Service Standardization Service ... Search Indices Service
  • 20. Capturing Live Updates Member DB Updates News Feed Service Notifications Service Standardization Service ... Search Indices Service
  • 21. Capturing Live Updates Member DB Updates News Feed Service Notifications Service Standardization Service ... Search Indices Service
  • 22. • Isolation: Applications are decoupled from the sources and don’t compete for resources with online queries • Applications can be at different points in change timelines • Brooklin can stream database updates to a change stream • Data processing applications consume from change streams Change Data Capture (CDC)
  • 23. Change Data Capture (CDC) Messaging System News Feed Service Notifications Service Standardization Service Search Indices Service Member DB Updates
  • 25. ● Across… ○ cloud services ○ clusters ○ data centers Stream Data from X to Y
  • 26. Streaming Bridge • Data pipe to move data between different environments • Enforce policy: Encryption, Obfuscation, Data formats
  • 27. ● Aggregating data from all data centers into a centralized place ● Moving data between LinkedIn and external cloud services (e.g. Azure) ● Brooklin has replaced Kafka MirrorMaker (KMM) at LinkedIn ○ Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data
  • 28. Use Brooklin to Mirror Kafka Data DestinationsSources Messaging systems Microsoft EventHubs Messaging systems Microsoft EventHubs Databases Databases
  • 29. Kafka MirrorMaker Topology Datacenter B aggregate tracking tracking Datacenter A aggregate tracking tracking KMM aggregate metrics metrics aggregate metrics metrics Datacenter C aggregate tracking tracking aggregate metrics metrics ... KMM KMM KMM KMM KMM KMM KMM KMM KMM KMM KMM KMM KMM KMM KMM KMM KMM ... ... ...
  • 30. Brooklin Kafka Mirroring Topology Datacenter A aggregate tracking tracking Brooklin metrics aggregate metrics Datacenter B aggregate tracking tracking metrics aggregate metrics Datacenter C aggregate tracking tracking metrics aggregate metrics ...Brooklin Brooklin
  • 31. Brooklin Kafka Mirroring ● Optimized for stability and operability ● Manually pause and resume mirroring at every level ○ Entire pipeline, topic, topic-partition ● Can auto-pause partitions facing mirroring issues ○ Auto-resumes the partitions after a configurable duration ● Flow of messages from other partitions is unaffected
  • 35. Security Cache Search Indices ETL or Data warehouse Application Use Cases
  • 36. Security Cache Search Indices ETL or Data warehouse Materialized Views or Replication Application Use Cases
  • 37. Security Cache Search Indices ETL or Data warehouse Materialized Views or Replication Repartitioning Application Use Cases
  • 40. Bridge Serde, Encryption, Policy Adjunct Data Application Use Cases
  • 41. Bridge Serde, Encryption, Policy Standardization, Notifications … Adjunct Data Application Use Cases
  • 43. Stream updates made to Member Profile Example:
  • 44. Capturing Live Updates Member DB Updates News Feed Service
  • 45. ● Scenario: Stream Espresso Member Profile updates into Kafka ○ Source Database: Espresso (Member DB, Profile table) ○ Destination: Kafka ○ Application: News Feed service Example
  • 46. • Describes the data pipeline • Mapping between source and destination • Holds the configuration for the pipeline Datastream Name: MemberProfileChangeStream Source: MemberDB/ProfileTable Type: Espresso Partitions: 8 Destination: ProfileTopic Type: Kafka Partitions: 8 Metadata: Application: News Feed service Owner: newsfeed@linkedin.com
  • 47. 1. Client makes REST call to create datastream Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB create POST /datastream News Feed service
  • 48. 2. Create request goes to any Brooklin instance Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 49. 3. Datastream is written to ZooKeeper Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 50. 4. Leader coordinator is notified of new datastream Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 51. 5. Leader coordinator calculates work distribution Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 52. 6. Leader coordinator writes the assignments to ZK Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 53. 7. ZooKeeper is used to communicate the assignments Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 54. 8. Coordinators hand task assignments to consumers Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 55. 9. Consumers start streaming data from the source Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 56. 10. Consumers propagate data to producers Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 57. 11. Producers write data to the destination Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 58. 12. App consumes from Kafka Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service
  • 59. 13. Destinations can be shared by apps Brooklin Client Load Balancer ZooKeeper Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator (Leader) Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Brooklin Instance Datastream Management Service (DMS) Espresso Consumer Coordinator Kafka Producer Member DB News Feed service News Feed service News Feed service
  • 60. Brooklin Architecture ZooKeeper Brooklin Instance Brooklin Instance Coordinator Datastream Management Service (DMS) ConsumerA ConsumerB ProducerX ProducerY ProducerZ Brooklin Instance Datastream Management Service (DMS) ConsumerA ProducerX ConsumerB ProducerY ProducerZ Coordinator (Leader) Brooklin Instance Coordinator Datastream Management Service (DMS) ConsumerA ConsumerB ProducerX ProducerY ProducerZ Brooklin Instance
  • 62. Current • Consumers: ○ Espresso ○ Oracle ○ Kafka ○ EventHubs ○ Kinesis • Producers: ○ Kafka ○ EventHubs • APIs are standardized to support additional sources and destinations • Multitenant: Can power thousands of datastreams across several source and destination types • Guarantees: At-least-once delivery, order is maintained at partition level • Kafka mirroring improvements: finer control of pipelines (pause/auto-pause partitions), improved latency with flushless-produce mode Sources & Destinations Features
  • 63. 38B messages/day 2K+ datastreams 1K+ unique sources 200+ applications Brooklin in Production Brooklin streams with Espresso or Oracle as the source
  • 65. Brooklin is now open-source!
  • 66. Future • Consumers: ○ MySQL ○ Cosmos DB ○ Azure SQL • Producers: ○ Azure Blob storage ○ Kinesis ○ Cosmos DB ○ Azure SQL ○ Couchbase • Brooklin auto-scaling • Passthrough compression • Read optimizations: Read once, Write multiple Sources & Destinations Optimizations
  • 68. Questions? : ckung@linkedin.com : linkedin.com/in/celiakkung/ : gitter.im/linkedin/brooklin : github.com/linkedin/brooklin