Lessons Learned from 2000 Event-driven Microservices
natansil.com twitter@NSilnitsky linkedin/natansilnitsky github.com/natansil
Natan Silnitsky Backend Infra TL, Wix
February 2023
Lessons Learned from 2000 Event-driven Microservices @NSilnitsky
~1B unique visitors use the Wix platform every month
~500B daily HTTP transactions
~70B Kafka messages a day
> 600 GAs every day
2500 microservices in production
* scale, resilience. issues
Challenges of event-driven architecture that we’ve bumped into
1 Producing message failures
2 Processing out-of-order & duplicates
3 Sending large payloads
4 Troubleshooting production
* success, tools, faster
How Event-driven Architecture Works
Service-to-Service Communication
Cart Service, User Service, Inventory Service, Catalog Service
Request-Reply Communication
Cart Service calls User Service, Inventory Service and Catalog Service over HTTP RPC
* issue scale
slow
Cart Service: 3 × HTTP RPC
* slow, bottleneck, cache
unreliable
Cart Service: 3 × HTTP RPC
* unreliable, cascade, retr
Event-driven Communication
Catalog Service (Producer) → Event → Kafka Broker, Product Updated Topic
* improve, broker, scale
more robust
Catalog Service (Producer) → Kafka Broker, Product Updated Topic → Cart Service (Consumer)
* DB, decoupling, no impact
Event processing is guaranteed
Catalog Service (Producer) → Kafka Broker, Product Updated Topic → Cart Service (Consumer)
The following is based on a true story
*Dates and products were changed for clarity :)
* ecom simple linear
2016: Wix starts using event-driven
We can work event-driven!!
It all began when Ecom experienced data issues
Cart DB: data does NOT reflect actual catalog
Risk: show wrong prices in cart
After investigating
Catalog Service → Broker → Cart Service → Cart DB
1. Update status
2. Produce “Product Updated” event
3. Update product price
4. Show updated prices in cart
Challenge #1
Producing message failure
Kafka
Make DB Update & Event Producing Atomic
Catalog Service → Broker → Cart Service
Catch Unsent Events
Catalog Service (Resilient Producer) → Broker
Produce event to S3
Fallback to S3 and Heal
Catalog Service (Resilient Producer): produce event to S3
Healer Service: poll S3, produce to Kafka Broker
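The fallback-and-heal flow can be sketched roughly as follows. This is a minimal in-memory illustration, not Greyhound's actual implementation: `FallbackStore` stands in for the S3 bucket, `FlakyBroker` for Kafka, and all class and function names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Event = Tuple[str, bytes]  # (topic, payload)


@dataclass
class FallbackStore:
    """Hypothetical stand-in for an S3 bucket holding unsent events."""
    pending: List[Event] = field(default_factory=list)

    def put(self, event: Event) -> None:
        self.pending.append(event)

    def drain(self) -> List[Event]:
        events, self.pending = self.pending, []
        return events


class ResilientProducer:
    """Tries the broker first; on failure, writes the event to the fallback store."""

    def __init__(self, send: Callable[[str, bytes], None], store: FallbackStore):
        self._send = send
        self._store = store

    def produce(self, topic: str, payload: bytes) -> bool:
        try:
            self._send(topic, payload)
            return True
        except Exception:
            self._store.put((topic, payload))
            return False


def heal(store: FallbackStore, send: Callable[[str, bytes], None]) -> int:
    """One healer pass: re-produce stored events, re-store those that fail again."""
    healed = 0
    for topic, payload in store.drain():
        try:
            send(topic, payload)
            healed += 1
        except Exception:
            store.put((topic, payload))
    return healed


class FlakyBroker:
    """Hypothetical broker that rejects sends while it is down."""

    def __init__(self) -> None:
        self.up = False
        self.log: List[Event] = []

    def send(self, topic: str, payload: bytes) -> None:
        if not self.up:
            raise ConnectionError("broker unavailable")
        self.log.append((topic, payload))


broker = FlakyBroker()
store = FallbackStore()
producer = ResilientProducer(broker.send, store)

sent_directly = producer.produce("product-updated", b"price=10")  # broker is down
broker.up = True                                                  # broker recovers
healed_count = heal(store, broker.send)                           # healer polls and re-produces
```

Note that a healed event reaches the broker later than events produced after it, so this design trades ordering for delivery.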
Wrap Kafka with Greyhound*
Service A: Greyhound Producer wraps Kafka Producer
Service B: Greyhound Consumer wraps Kafka Consumer
Kafka Broker
* Open source: https://github.com/wix/greyhound
Wrap Kafka with Greyhound*
Developer Self-Service:
➔ Resilient Producer
➔ Parallel Consumption
➔ Batch Consumer
➔ Consumer Retry Strategies
➔ Context Propagation
➔ Metrics reporting
* Open source: https://github.com/wix/greyhound
2016: Wix starts using event-driven
2018: Greyhound, resilient producer & consumer retries
Then ‘out-of-order’ happened
Catalog Service, Healer Service → Broker → Cart Service
Events: Remove Discount, Introduce Discount
Challenge #2
Out-of-order & duplicates processing
Kafka
Mitigating out-of-order with revision ID
Catalog Service, Healer Service → Broker → Cart Service
Introduce Discount
# 10
# 9
Mitigating out-of-order with revision ID
Catalog Service, Healer Service → Broker → Cart Service
Remove Discount, Introduce Discount
# 11
# 10
# 9
* item itself
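A minimal sketch of the revision ID check on the consumer side, assuming each event carries a per-item, monotonically increasing revision number. The `ProductUpdated` fields, the `CartView` class and the prices are hypothetical; the slides only state that revision IDs are used.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ProductUpdated:
    product_id: str
    revision: int   # assumed to increase with every change to the item
    price: float


class CartView:
    """Keeps the latest known state per product and ignores stale revisions."""

    def __init__(self) -> None:
        self._latest: Dict[str, ProductUpdated] = {}

    def apply(self, event: ProductUpdated) -> bool:
        current = self._latest.get(event.product_id)
        if current is not None and event.revision <= current.revision:
            return False  # stale or duplicate revision: skip it
        self._latest[event.product_id] = event
        return True

    def price(self, product_id: str) -> float:
        return self._latest[product_id].price


cart = CartView()
arrivals = [
    ProductUpdated("p1", 9, 100.0),   # base price
    ProductUpdated("p1", 11, 100.0),  # Remove Discount arrives first
    ProductUpdated("p1", 10, 80.0),   # Introduce Discount arrives late (healed)
]
applied = [cart.apply(e) for e in arrivals]
final_price = cart.price("p1")
```

Revision 10 is dropped because revision 11 was already applied, so the cart keeps the price without the discount.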
Mitigating out-of-order with Debezium connector
Catalog Service → Broker → Cart Service
Scan the binlog. For each entry produce a ‘status update’ event
More Ecom data issues
Inventory DB: data does NOT reflect actual inventory
Risk: lose potential customers
Investigation leads to duplicate processing
Payments Service → Broker → Inventory Service
Payment for: Item 1, Item 2
Retry
Item 1 9 → 7
Item 2 5 → 3
* not idempotent
Mitigating duplicates with Transaction ID
Payments Service → Broker → Inventory Service
Payment for: Item 1, Item 2
txnId - a7g45
Item 1 9 → 8
Item 2 5 → 4
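The deduplication above can be sketched as an idempotent consumer that remembers which transaction IDs it has already applied. The `Inventory` class is hypothetical and keeps everything in memory; in a real service the seen-ID record and the stock update would presumably need to be committed in the same database transaction.

```python
from typing import Dict, Iterable, Set


class Inventory:
    """Decrements stock once per transaction ID, however often the event is delivered."""

    def __init__(self, stock: Dict[str, int]) -> None:
        self.stock = dict(stock)
        self._seen: Set[str] = set()

    def on_payment(self, txn_id: str, items: Iterable[str]) -> bool:
        if txn_id in self._seen:
            return False  # duplicate delivery (for example a retry): ignore
        for item in items:
            self.stock[item] -= 1
        self._seen.add(txn_id)
        return True


inventory = Inventory({"Item 1": 9, "Item 2": 5})
first = inventory.on_payment("a7g45", ["Item 1", "Item 2"])
retry = inventory.on_payment("a7g45", ["Item 1", "Item 2"])  # same event, delivered again
```

With the check in place the retry leaves stock at 8 and 4, instead of the 7 and 3 seen in the non-idempotent case.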
2016: Wix starts using event-driven
2018: Greyhound, resilient producer & consumer retries
2019: Revisions & Transaction IDs
“Dude, I can’t produce large payloads”
Product Catalog Service → Broker → Cart Service
Product Update event:
...
"description": "An apple mobile which is nothing like apple",
...
* 1MB
Challenge #3
Failure to send large payloads
Broker
...
"description": "An apple mobile which is nothing like apple",
...
Large Payloads Remedy I: Compression
→ Try several compression types (lz4, snappy, etc.)
→ Compression on Kafka level is usually better than application level, as payloads can be compressed in batches
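The batching effect can be illustrated with a small experiment. This sketch uses `zlib` from the Python standard library as a stand-in for lz4 or snappy, and the sample messages are made up. On the Kafka side, batch compression is enabled through the producer setting `compression.type`.

```python
import json
import zlib

# 200 similar product events; the field values are illustrative only
messages = [
    json.dumps({
        "productId": i,
        "description": "An apple mobile which is nothing like apple",
        "price": 100 + i,
    }).encode()
    for i in range(200)
]

raw_size = sum(len(m) for m in messages)

# Application level: every message is compressed on its own
per_message_size = sum(len(zlib.compress(m)) for m in messages)

# Broker-client level: the whole batch is compressed together,
# so repetition across messages is exploited as well
batch_size = len(zlib.compress(b"\n".join(messages)))

print(raw_size, per_message_size, batch_size)
```

Small, similar messages gain little when compressed one by one, while the batch shares one dictionary across all of them.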
Large Payloads Remedy II: Chunking
Product Catalog Service → Broker → Cart Service
1. Split to chunks & produce
2. Consume & reassemble
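A rough sketch of splitting and reassembling, assuming each chunk carries a message ID, its index and the total chunk count. The `Chunk` layout, `split` and `Reassembler` are hypothetical names, not a Greyhound or Kafka API.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Chunk:
    message_id: str  # ties chunks of one payload together
    index: int
    total: int
    data: bytes


def split(payload: bytes, max_chunk: int) -> List[Chunk]:
    """Producer side: cut the payload into pieces of at most max_chunk bytes."""
    message_id = uuid.uuid4().hex
    parts = [payload[i:i + max_chunk] for i in range(0, len(payload), max_chunk)] or [b""]
    return [Chunk(message_id, i, len(parts), part) for i, part in enumerate(parts)]


class Reassembler:
    """Consumer side: buffers chunks and returns the payload once all have arrived."""

    def __init__(self) -> None:
        self._partial: Dict[str, Dict[int, bytes]] = {}

    def accept(self, chunk: Chunk) -> Optional[bytes]:
        parts = self._partial.setdefault(chunk.message_id, {})
        parts[chunk.index] = chunk.data  # a duplicate chunk just overwrites itself
        if len(parts) < chunk.total:
            return None
        del self._partial[chunk.message_id]
        return b"".join(parts[i] for i in range(chunk.total))


payload = bytes(range(256)) * 10          # 2560 bytes
chunks = split(payload, max_chunk=1000)   # 1000 + 1000 + 560 bytes
reassembler = Reassembler()
outputs = [reassembler.accept(c) for c in chunks]
```

Producing all chunks with the same message key would keep them in one partition and in order, although the reassembler above also tolerates reordering.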
Large Payloads Remedy III: Reference to Object Store
Product Catalog Service → Broker → Cart Service
1. Upload to S3
2. Produce with S3 URL
3. Consume & download from S3
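The three steps can be sketched with an in-memory object store. `ObjectStore` is a hypothetical stand-in for S3, and the `payloadUrl` field name and bucket name are assumptions made for illustration.

```python
import json
import uuid
from typing import Dict


class ObjectStore:
    """Hypothetical in-memory stand-in for S3."""

    def __init__(self, bucket: str) -> None:
        self._bucket = bucket
        self._objects: Dict[str, bytes] = {}

    def upload(self, data: bytes) -> str:
        url = f"s3://{self._bucket}/{uuid.uuid4().hex}"
        self._objects[url] = data
        return url

    def download(self, url: str) -> bytes:
        return self._objects[url]


def to_reference_message(store: ObjectStore, payload: bytes) -> bytes:
    """Steps 1 and 2: upload the payload, produce only a small message with its URL."""
    url = store.upload(payload)
    return json.dumps({"payloadUrl": url}).encode()


def from_reference_message(store: ObjectStore, message: bytes) -> bytes:
    """Step 3: read the URL from the consumed message and download the payload."""
    url = json.loads(message)["payloadUrl"]
    return store.download(url)


store = ObjectStore("product-events")
large_payload = b"x" * 2_000_000                    # well above a 1MB message limit
message = to_reference_message(store, large_payload)
restored = from_reference_message(store, message)
```

The message on the broker stays tiny regardless of payload size, at the cost of an extra round trip to the object store for every consumer.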
2016: Wix starts using event-driven
2018: Greyhound, resilient producer & consumer retries
2019: We use IDs for ooo & duplicates
2020: Added compression by default
* bottlenecks
Challenge #4
It’s hard for developers to debug and maintain event-driven microservices at scale in production
Our team
Stream events with various filters
Our team: “How do I investigate this lag?”
Investigate consumer lag per partition
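For reference, consumer lag per partition is commonly computed as the partition's log end offset minus the consumer group's committed offset. A minimal sketch with made-up offsets; treating a partition with no committed offset as offset 0 is an assumption of this example.

```python
from typing import Dict


def lag_per_partition(end_offsets: Dict[int, int],
                      committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Lag = log end offset - committed offset, per partition."""
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in end_offsets.items()
    }


# Illustrative numbers only
end_offsets = {0: 1200, 1: 980, 2: 4500}
committed_offsets = {0: 1200, 1: 975, 2: 1500}

lag = lag_per_partition(end_offsets, committed_offsets)
worst_partition = max(lag, key=lag.get)
```

A single partition with much higher lag than its siblings, like partition 2 here, usually points at one slow or stuck event rather than a general throughput problem.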
View a “stuck” event in some partition
Our team: “How come this side-effect didn’t happen?”
Propagate the Context
Orders Service
1. Greyhound produce
Event Header: requestId, userId
Broker: Payments Topic, Orders Topic, Inventory Topic
* monitoring infra
Propagate the Context
Payments Service
2. Greyhound consume
3. produce
Broker: Payments Topic, Orders Topic, Inventory Topic
Propagate the Context
Inventory Service
4. Greyhound consume
Broker: Payments Topic, Orders Topic, Inventory Topic
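The propagation steps can be sketched as copying context headers from each consumed event onto every event produced while handling it. This is an in-memory illustration, not Greyhound's API: the `Event` type, the helper functions and the topic routing between the three services are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List

CONTEXT_KEYS = ("requestId", "userId")  # header names shown on the slide


@dataclass
class Event:
    topic: str
    payload: str
    headers: Dict[str, str] = field(default_factory=dict)


def produce(log: List[Event], topic: str, payload: str, context: Dict[str, str]) -> Event:
    """Attach the current context to the outgoing event's headers."""
    headers = {k: context[k] for k in CONTEXT_KEYS if k in context}
    event = Event(topic, payload, headers)
    log.append(event)
    return event


def context_of(event: Event) -> Dict[str, str]:
    """Restore the context from a consumed event's headers."""
    return {k: event.headers[k] for k in CONTEXT_KEYS if k in event.headers}


broker_log: List[Event] = []

# 1. Orders Service produces with the context of the incoming request
order_event = produce(broker_log, "orders", "order created",
                      {"requestId": "req-1", "userId": "user-42"})

# 2 + 3. Payments Service consumes, then produces with the same context
payment_event = produce(broker_log, "payments", "payment accepted",
                        context_of(order_event))

# 4. Inventory Service consumes and sees the original context
inventory_context = context_of(payment_event)
```

Because every hop carries the same `requestId`, events can later be grouped by it to reconstruct the route a request took.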
So developers can track events’ route
View event details
2016: Wix starts using event-driven
2018: We open source Greyhound
2019: We use IDs for ooo & duplicates
2020: Added compression by default
2021-22: Tools in Production
Wix developers have embraced event-driven architecture.
Meeting these challenges made our microservices more decoupled, resilient and scalable, while keeping complexity low and data consistent.
The Blog Post
https://medium.com/wix-engineering/event-driven-architecture-5-pitfalls-to-avoid-b3ebf885bdb1
The Next Step
How to migrate 2000 microservices to Multi Cluster Managed Kafka with 0 Downtime
https://www.youtube.com/watch?v=XKbG8a-9NRE
Greyhound
github.com/wix/greyhound
Thank You!
natansil.com twitter@NSilnitsky linkedin/natansilnitsky github.com/natansil
👉 slideshare.net/NatanSilnitsky
Any questions?
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 

Recently uploaded (20)

办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 

Wix+Confluent Meetup - Lessons Learned from 2000 Event Driven Microservices

  • 1. Lessons Learned from 2000 event-driven microservices. Natan Silnitsky, Backend Infra TL, Wix. February 2023. natansil.com, twitter@NSilnitsky, linkedin/natansilnitsky, github.com/natansil
  • 2. Lessons Learned from 2000 Event-driven Microservices @NSilnitsky
  • 3. ~1B unique visitors use the Wix platform every month
  • 4. ~1B unique visitors use the Wix platform every month; ~500B daily HTTP transactions; ~70B Kafka messages a day
  • 5. ~1B unique visitors use the Wix platform every month; ~500B daily HTTP transactions; ~70B Kafka messages a day; > 600 GAs every day; 2500 microservices in production (notes: scale, resilience, issues)
  • 6. Challenges of event-driven architecture that we’ve bumped into: 1. Producing message failures; 2. Processing out-of-order & duplicates; 3. Sending large payloads; 4. Troubleshooting production (notes: success, tools, faster)
  • 7. How Event-driven Architecture Works
  • 8. Service-to-Service Communication: Cart Service, User Service, Inventory Service, Catalog Service
  • 9. Request-Reply Communication: Cart Service, User Service, Inventory Service and Catalog Service connected by HTTP RPC calls (notes: issue scale)
  • 10. Request-reply is slow: Cart Service with three HTTP RPC calls (notes: slow, bottleneck, cache)
  • 11. Request-reply is unreliable: Cart Service with three HTTP RPC calls (notes: unreliable, cascade, retr)
  • 12. Event-driven Communication: Catalog Service (Producer) → Event → Kafka Broker, Product Updated Topic (notes: improve, broker, scale)
  • 13. More robust: Catalog Service (Producer) → Kafka Broker, Product Updated Topic → Cart Service (Consumer) (notes: DB, decoupling, no impact)
  • 14. Event processing is guaranteed: Catalog Service (Producer) → Kafka Broker, Product Updated Topic → Cart Service (Consumer)
  • 15. The following is based on a true story. *Dates and products were changed for clarity :) (notes: ecom simple linear)
  • 16. 2016: Wix starts using event-driven. “We can work event-driven!!”
  • 17. It all began when Ecom experienced data issues: Cart DB data does NOT reflect actual catalog. Risk: show wrong prices in cart
  • 18. After investigating (Catalog Service → Broker → Cart Service → Cart DB): 1. Update status; 2. Produce “Product Updated” Event; 3. Update Product Price; 4. Show updated prices in cart
  • 19. Challenge #1: Producing message failure (Kafka)
  • 20. Make DB Update & Event Producing Atomic (Catalog Service → Broker → Cart Service)
  • 21. Resilient Producer: Catch Unsent Events. Catalog Service → Broker; on failure, produce event to S3
  • 22. Resilient Producer: Fallback to S3 and Heal. Catalog Service: produce event to S3. Healer Service: poll, then produce to Kafka (Broker)
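The fallback-and-heal flow on this slide could look roughly like the sketch below. It is not Greyhound's implementation: `ResilientProducer`, `heal`, the `send` callable and the in-memory `fallback` list (standing in for an S3 bucket) are all hypothetical names chosen for illustration.

```python
from typing import Callable, List, Tuple

Event = Tuple[str, bytes]  # (topic, payload)


class ResilientProducer:
    """Tries the broker first; on failure, parks the event in a fallback store."""

    def __init__(self, send: Callable[[str, bytes], None], fallback: List[Event]):
        self._send = send          # hypothetical: wraps the real Kafka produce call
        self._fallback = fallback  # hypothetical: stands in for an S3 bucket

    def produce(self, topic: str, payload: bytes) -> bool:
        try:
            self._send(topic, payload)
            return True
        except Exception:
            # Broker unreachable: persist the event so it is not lost.
            self._fallback.append((topic, payload))
            return False


def heal(send: Callable[[str, bytes], None], fallback: List[Event]) -> int:
    """One healer pass: poll the fallback store and re-produce parked events."""
    healed = 0
    remaining: List[Event] = []
    for topic, payload in fallback:
        try:
            send(topic, payload)
            healed += 1
        except Exception:
            remaining.append((topic, payload))  # keep it for the next poll
    fallback[:] = remaining
    return healed
```

A healed event reaches the topic later than events produced directly after the broker recovers, so this design trades ordering for durability.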
  • 23. Wrap Kafka with Greyhound*: Service A → Greyhound Producer → Kafka Producer → Kafka Broker → Kafka Consumer → Greyhound Consumer → Service B. * Open source: https://github.com/wix/greyhound
  • 24. Developer Self-Service: Wrap Kafka with Greyhound*. ➔ Resilient Producer ➔ Parallel Consumption ➔ Batch Consumer ➔ Consumer Retry Strategies ➔ Context Propagation ➔ Metrics reporting. * Open source: https://github.com/wix/greyhound
  • 25. 2016: Wix starts using event-driven. 2018: Greyhound, resilient producer & consumer retries
  • 26. Resilient Producer: Fallback to S3 and Heal. Catalog Service: produce event to S3. Healer Service: poll, then produce to Kafka (Broker)
  • 27. Then ‘out-of-order’ happened: Introduce Discount, Remove Discount (Catalog Service, Healer Service, Broker, Cart Service)
  • 28. Challenge #2: Out-of-order & duplicates processing (Kafka)
  • 29. Mitigating out-of-order with revision ID: Introduce Discount; revisions #9, #10 (Catalog Service, Healer Service, Broker, Cart Service)
  • 30. Mitigating out-of-order with revision ID: Introduce Discount, Remove Discount; revisions #9, #10, #11 (Catalog Service, Healer Service, Broker, Cart Service) (notes: item itself)
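One way a consumer might use the revision ID is to remember the last revision it applied per product and ignore anything older. The sketch below assumes each event carries a monotonically increasing `revision`; `CartProjection` and `ProductView` are hypothetical names, not part of the talk.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ProductView:
    revision: int = 0       # last revision applied for this product
    discount: bool = False


@dataclass
class CartProjection:
    products: Dict[str, ProductView] = field(default_factory=dict)

    def apply(self, product_id: str, revision: int, discount: bool) -> bool:
        """Apply the event only if it is newer than what is already stored."""
        view = self.products.setdefault(product_id, ProductView())
        if revision <= view.revision:
            return False  # stale or duplicate event: ignore it
        view.revision = revision
        view.discount = discount
        return True
```

With this check, a late "Introduce Discount" (#10) arriving after "Remove Discount" (#11) is dropped instead of re-enabling the discount.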
  • 31. Mitigating out-of-order with Debezium connector: Scan the binlog. For each entry produce a ‘status update’ event (Catalog Service → Broker → Cart Service)
  • 32. More Ecom data issues: Inventory DB data does NOT reflect actual inventory. Risk: lose potential customers
  • 33. Investigation leads to duplicate processing: Payments Service → Broker → Inventory Service. Payment for Item 1 and Item 2; after a retry, Item 1: 9 → 7, Item 2: 5 → 3 (notes: not idempotent)
  • 34. Mitigating duplicates with Transaction ID: Payments Service → Broker → Inventory Service. Payment for Item 1 and Item 2 carries txnId a7g45; Item 1: 9 → 8, Item 2: 5 → 4
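The transaction-ID idea can be sketched as an idempotent consumer that records which IDs it has already processed. `Inventory` and `on_payment` are hypothetical names, and the in-memory set is an assumption: a real service would need to store the processed IDs together with the stock update in one database transaction.

```python
from typing import Dict, Iterable, Set


class Inventory:
    """Decrements stock once per transaction ID, even if the event is redelivered."""

    def __init__(self, stock: Dict[str, int]):
        self.stock = dict(stock)
        self._seen: Set[str] = set()  # assumption: in-memory record of processed txn IDs

    def on_payment(self, txn_id: str, items: Iterable[str]) -> bool:
        if txn_id in self._seen:
            return False  # duplicate delivery: already processed
        for item in items:
            self.stock[item] -= 1
        self._seen.add(txn_id)
        return True
```

Replaying the slide's example, a retried payment with txnId `a7g45` leaves Item 1 at 8 and Item 2 at 4 rather than 7 and 3.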
  • 35. 2016: Wix starts using event-driven. 2018: Greyhound, resilient producer & consumer retries. 2019: Revisions & Transaction IDs
  • 36. “Dude, I can’t produce large payloads”: Product Catalog Service → Product Update event → Broker → Cart Service. Payload: ... "description": "An apple mobile which is nothing like apple", ...
  • 37. Challenge #3: Failure to send large payloads (Broker). Payload: ... "description": "An apple mobile which is nothing like apple", ... (notes: 1MB)
  • 38. Large Payloads Remedy I: Compression. → Try several compression types (lz4, snappy, etc.) → Compression on Kafka level is usually better than application level, as payloads can be compressed in batches
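The batching argument can be checked with a small experiment. The sketch below uses `zlib` from the Python standard library purely as a stand-in for lz4 or snappy, and the sample messages are made up; on a real Kafka producer the equivalent switch is the `compression.type` setting.

```python
import json
import zlib
from typing import List


def per_message_size(messages: List[bytes]) -> int:
    """Total bytes when every message is compressed on its own (application level)."""
    return sum(len(zlib.compress(m)) for m in messages)


def batch_size(messages: List[bytes]) -> int:
    """Total bytes when the whole batch is compressed together (producer level)."""
    return len(zlib.compress(b"".join(messages)))


# Hypothetical sample data: 50 similar product-update events.
messages = [
    json.dumps(
        {"productId": i, "description": "An apple mobile which is nothing like apple"}
    ).encode("utf-8")
    for i in range(50)
]

print("per message:", per_message_size(messages))
print("as a batch: ", batch_size(messages))
```

Similar events share most of their bytes, so compressing them together lets the codec exploit redundancy across messages that per-message compression cannot see.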
  • 39. Large Payloads Remedy II: Chunking. 1. Split to chunks & produce (Product Catalog Service → Broker); 2. Consume & reassemble (Cart Service)
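A minimal sketch of split-and-reassemble, assuming each chunk carries a message ID, its index and the total chunk count. The tuple layout and the names `split` and `Reassembler` are illustrative assumptions, not the format used at Wix.

```python
import uuid
from typing import Dict, List, Optional, Tuple

Chunk = Tuple[str, int, int, bytes]  # (message_id, index, total, data)


def split(payload: bytes, chunk_size: int) -> List[Chunk]:
    """Producer side: cut the payload into chunks that fit under the size limit."""
    message_id = uuid.uuid4().hex
    parts = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)] or [b""]
    return [(message_id, i, len(parts), part) for i, part in enumerate(parts)]


class Reassembler:
    """Consumer side: buffer chunks per message until all of them have arrived."""

    def __init__(self) -> None:
        self._pending: Dict[str, Dict[int, bytes]] = {}

    def accept(self, chunk: Chunk) -> Optional[bytes]:
        message_id, index, total, data = chunk
        parts = self._pending.setdefault(message_id, {})
        parts[index] = data  # a redelivered chunk simply overwrites itself
        if len(parts) < total:
            return None  # still waiting for more chunks
        del self._pending[message_id]
        return b"".join(parts[i] for i in range(total))
```

Producing all chunks of one message with the same key would keep them in one partition, though the reassembler above also tolerates chunks arriving in any order.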
  • 40. Large Payloads Remedy III: Reference to Object Store. 1. Upload to S3 (Product Catalog Service); 2. Produce with S3 URL (Broker); 3. Consume & download from S3 (Cart Service)
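The reference pattern might be sketched as follows. A plain `dict` stands in for S3, the generated key stands in for the S3 URL, and the JSON envelope with `inline` and `ref` fields is a hypothetical format; the sketch also assumes UTF-8 text payloads and a 1MB threshold.

```python
import json
import uuid
from typing import Dict

MAX_INLINE_BYTES = 1_000_000  # assumed threshold, matching a 1MB message limit


def to_event(payload: bytes, store: Dict[str, bytes],
             max_inline: int = MAX_INLINE_BYTES) -> bytes:
    """Producer side: inline small payloads, upload large ones and send a reference."""
    if len(payload) <= max_inline:
        return json.dumps({"inline": payload.decode("utf-8")}).encode("utf-8")
    key = f"payloads/{uuid.uuid4().hex}"  # stands in for the S3 URL
    store[key] = payload                  # 1. upload to the object store
    return json.dumps({"ref": key}).encode("utf-8")  # 2. produce with the reference


def from_event(event: bytes, store: Dict[str, bytes]) -> bytes:
    """Consumer side: 3. follow the reference and download if there is one."""
    body = json.loads(event)
    if "ref" in body:
        return store[body["ref"]]
    return body["inline"].encode("utf-8")
```

The event on the topic stays small regardless of payload size, at the cost of an extra round trip to the object store for large payloads.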
  • 41. 2016: Wix starts using event-driven. 2018: Greyhound, resilient producer & consumer retries. 2019: We use IDs for out-of-order (ooo) & duplicates. 2020: Added compression by default
  • 42. Challenge #4: It’s hard for developers to debug and maintain event-driven microservices at scale in production (notes: bottlenecks)
  • 44. Stream events with various filters
  • 45. Our team: “How do I investigate this lag?”
  • 46. Investigate consumer lag per partition
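Consumer lag for a partition is the gap between the partition's latest (end) offset and the consumer group's committed offset. The sketch below works on plain dictionaries of offsets; how those offsets are fetched is left out, and treating a missing committed offset as 0 is an assumption.

```python
from typing import Dict


def lag_per_partition(end_offsets: Dict[int, int],
                      committed: Dict[int, int]) -> Dict[int, int]:
    """Lag = end offset minus committed offset, for every partition."""
    # Assumption: a partition with no committed offset is treated as committed at 0.
    return {p: end - committed.get(p, 0) for p, end in end_offsets.items()}


def worst_partition(lag: Dict[int, int]) -> int:
    """Partition with the highest lag, a natural place to start investigating."""
    return max(lag, key=lambda p: lag[p])
```

A single partition with much higher lag than its siblings often points to one slow or failing event rather than a generally slow consumer.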
  • 47. View a “stuck” event in some partition
  • 48. Our team: “How come this side-effect didn’t happen?”
  • 49. Propagate the Context: 1. Greyhound produce. Orders Service → Broker (Orders Topic, Payments Topic, Inventory Topic). Event header: requestId, userId (notes: monitoring infra)
  • 50. Propagate the Context: 2. Greyhound consume (Payments Service); 3. produce. Broker (Orders Topic, Payments Topic, Inventory Topic)
  • 51. Propagate the Context: 4. Greyhound consume (Inventory Service). Broker (Orders Topic, Payments Topic, Inventory Topic)
  • 52. So developers can track events’ route
  • 53. View event details
  • 54. 2016: Wix starts using event-driven. 2018: We open source Greyhound. 2019: We use IDs for out-of-order (ooo) & duplicates. 2020: Added compression by default. 2021-22: Tools in Production
  • 55. Wix developers have embraced event-driven architecture.
  • 56. Meeting these challenges made our microservices more decoupled, resilient and scalable, while keeping complexity low and data consistent.
  • 57. The Blog Post: https://medium.com/wix-engineering/event-driven-architecture-5-pitfalls-to-avoid-b3ebf885bdb1
  • 58. The Next Step: How to migrate 2000 microservices to Multi Cluster Managed Kafka with 0 Downtime. https://www.youtube.com/watch?v=XKbG8a-9NRE
  • 59. Greyhound: github.com/wix/greyhound
  • 60. Thank You! natansil.com twitter@NSilnitsky linkedin/natansilnitsky github.com/natansil 👉 slideshare.net/NatanSilnitsky Any questions?