Discover how Wix transitioned from complex event sourcing and CQRS to streamlined CRUD services, optimizing their vast platform for better scalability, performance, and resiliency.
Wix's platform, designed to accommodate diverse business needs, boasts:
* 3.5 billion daily HTTP transactions
* 70 billion Kafka messages per day
* Roughly 4,000 microservices in production
This session will highlight the simplification of Wix's architecture through domain events, resilient Kafka messaging, and advanced techniques like materialization and caching. By standardizing APIs and employing tools like protobuf and gRPC, Wix has enhanced the developer experience, both internally and externally, and fostered an open, integrative platform.
Attendees will gain insights into Wix's strategies for microservice coordination, ensuring system resilience and data consistency, as well as query performance optimization through innovative 2-level caching solutions.
8. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS
WriteProduct appends to an append-only event log for Product 123:
* Event 1: Product Created
* Event 2: Product Changed (price)
* Event 3: Product Changed (description)
* Event 4: Product Changed (stock-level)
ReadProduct reads a snapshot with Name, Price, Description, and Stock-level.
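To make the write and read sides concrete, here is a minimal in-memory sketch of an append-only event log and a replay function that folds events into a read model. The names `EventStore`, `Event`, and `replay`, and the sample product fields, are assumptions for illustration only, not Wix's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    version: int   # position in the product's stream, starting at 1
    kind: str      # "created" or "changed"
    data: dict     # the fields this event sets


@dataclass
class EventStore:
    """Append-only log of events, keyed by product id (hypothetical)."""
    streams: dict = field(default_factory=dict)

    def append(self, product_id: str, kind: str, data: dict) -> Event:
        stream = self.streams.setdefault(product_id, [])
        event = Event(version=len(stream) + 1, kind=kind, data=data)
        stream.append(event)  # events are only appended, never edited
        return event

    def read(self, product_id: str, after_version: int = 0) -> list:
        return [e for e in self.streams.get(product_id, [])
                if e.version > after_version]


def replay(events, state=None) -> dict:
    """Fold events, in order, into the current read model."""
    state = dict(state or {})
    for e in events:
        state.update(e.data)
        state["version"] = e.version
    return state


store = EventStore()
store.append("123", "created",
             {"name": "Mug", "price": 4, "description": "", "stock_level": 0})
store.append("123", "changed", {"price": 6})
store.append("123", "changed", {"description": "Ceramic mug"})
store.append("123", "changed", {"stock_level": 10})

product = replay(store.read("123"))
```

The write side never overwrites anything; the read side is whatever `replay` produces from the events seen so far.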
9. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS
CreateProduct: the Catalog Write API appends a Product Created event to the Event Store.
10. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS
CreateProduct: the Catalog Write API appends Product Created to the Event Store.
The Event Store stream for Product 123:
* Event 1: Product Created
* Event 2: Product Changed (price)
* Event 3: Product Changed (description)
* Event 4: Product Changed (stock-level)
11. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS
CreateProduct: the Catalog Write API appends Product Created to the Event Store.
GetProduct: the Catalog Read API replays the events stored for Product 123:
* Event 1: Product Created
* Event 2: Product Changed (price)
* Event 3: Product Changed (description)
* Event 4: Product Changed (stock-level)
12. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS - Advantages
Debug with “time travel”: for GetProduct, the Catalog Read API replays the events stored for Product 123:
* Event 1: Product Created
* Event 2: Product Changed (price)
* Event 3: Product Changed (description)
* Event 4: Product Changed (stock-level)
13. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS
The Event Store stream for Product 123 keeps growing:
* Event 1: Product Created
* Event 2: Product Changed (price)
* Event 3: Product Changed (description)
* Event 4: Product Changed (stock-level)
* …
* Event 30: Product Changed (stock-level)
14. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS
The Event Store stream for Product 123 keeps growing:
* Event 1: Product Created
* Event 2: Product Changed (price)
* Event 3: Product Changed (description)
* Event 4: Product Changed (stock-level)
* …
* Event 30: Product Changed (stock-level)
The Snapshot Repository holds a Product 123 Snapshot.
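A snapshot lets the read side avoid replaying the whole stream. The sketch below assumes events are `(version, changes)` pairs and that a snapshot is simply the folded state at some version; `fold`, `load`, and the choice to snapshot at version 25 are hypothetical.

```python
def fold(state, events):
    """Apply (version, changes) pairs in order on top of a starting state."""
    state = dict(state)
    for version, changes in events:
        state.update(changes)
        state["version"] = version
    return state


# Event 1 creates the product; events 2..30 each change the stock level.
events = [(1, {"name": "Mug", "price": 4, "stock_level": 100})]
events += [(v, {"stock_level": 100 - v}) for v in range(2, 31)]

# Snapshot repository: product id -> state as of some version (here, 25).
snapshots = {"123": fold({"version": 0}, events[:25])}


def load(product_id, events, snapshots):
    """Start from the snapshot and replay only the events newer than it."""
    snapshot = snapshots.get(product_id, {"version": 0})
    newer = [e for e in events if e[0] > snapshot["version"]]
    return fold(snapshot, newer), len(newer)


state, replayed = load("123", events, snapshots)
```

Only the five events after the snapshot are replayed, instead of all thirty.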
15. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS - Advantages
The Catalog Read API can build more than one snapshot (Product Snapshot, Inventory snapshot) from the same events of Product 123:
* Event 1: Product Created
* Event 2: Product Changed (price)
* Event 3: Product Changed (description)
* Event 4: Product Changed (stock-level)
16. Taming Distributed Systems @NSilnitsky
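Event sourcing and CQRS - Disadvantages
Catalog CRUD: Create/Read Product goes straight to the Product DB.
Event sourcing and CQRS: CreateProduct goes through the Catalog Write API, which appends Product Created to the Event Store; the Catalog Read API reads the Product 123 Snapshot from the Snapshot Repository.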
17. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS - Disadvantages
The Catalog Write API appends events for Product 123:
* Event 1: Product Created (price $4)
* Event 2: Product Changed (price $6)
The Product snapshot Consumer, which updates the Snapshot Repository, is delayed.
18. Taming Distributed Systems @NSilnitsky
Event sourcing and CQRS - Disadvantages
The Catalog Write API has appended two events for Product 123:
* Event 1: Product Created (price $4)
* Event 2: Product Changed (price $6), delayed
The Product snapshot Consumer has applied only event 1 to the Snapshot Repository.
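The read-your-own-writes problem can be sketched with a queue between the write side and the snapshot consumer. Everything here is a hypothetical, in-memory stand-in: `pending` plays the role of the events in flight, and `consume_one` is the snapshot consumer catching up by one event.

```python
from collections import deque

event_store = []    # source of truth, written synchronously by the write API
pending = deque()   # events not yet processed by the snapshot consumer
snapshot = {}       # read model served by the read API


def write(changes):
    """Write side: append the event and hand it to the consumer's queue."""
    event = (len(event_store) + 1, changes)
    event_store.append(event)
    pending.append(event)


def consume_one():
    """Snapshot consumer: apply the next pending event to the snapshot."""
    version, changes = pending.popleft()
    snapshot.update(changes)
    snapshot["version"] = version


def read_price():
    """Read side: only ever sees the snapshot."""
    return snapshot.get("price")


write({"price": 4})    # Product Created (price $4)
consume_one()
write({"price": 6})    # Product Changed (price $6); the consumer is delayed
stale = read_price()   # the write succeeded, but the read still sees $4
consume_one()          # the consumer catches up
fresh = read_price()
```

Between the second write and the second `consume_one`, a client that just changed the price reads back the old value.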
22. Taming Distributed Systems
Event sourcing and CQRS - Disadvantages
* Complexity
* Eventual consistency only
* Massive scale
* Corrupted snapshot
* Read your own writes
24. Taming Distributed Systems @NSilnitsky
CRUD - platformized
The Catalog API exposes CreateProduct, ReadProduct, UpdateProduct, and DeleteProduct over a single Product Document Store. Unified!
25. Taming Distributed Systems @NSilnitsky
Wix’s Open Platform
Was (independent “startups”): CRUD + Event sourcing
Now (single open platform): CRUD
26. Taming Distributed Systems @NSilnitsky
Wix’s Open Platform
Was (independent “startups”):
* CRUD + Event sourcing
* APIs - TDD + FE driven
Now (single open platform):
* CRUD
* API First
30. Taming Distributed Systems @NSilnitsky
Wix Stores on Wix’s Open Platform
API: the Wix Cart service is called by a 3rd party PoS app and by Site Code extensions (“Velo”).
31. Taming Distributed Systems @NSilnitsky
Wix Stores on Wix’s Open Platform
SPI: the Wix Cart service calls out to a 3rd party Tax Calculator and to the Wix Product catalog service.
32. Taming Distributed Systems @NSilnitsky
Wix Stores on Wix’s Open Platform
Events: the Wix Cart service publishes Cart Item Added, which is consumed by a 3rd party Analytics app and by Site Code extensions (“Velo”).
33. Taming Distributed Systems @NSilnitsky
Wix Stores on Wix’s Open Platform
The Wix Cart service integrates in three ways:
* API: called by a 3rd party PoS app and by Site Code extensions (“Velo”)
* SPI: calls out to a 3rd party Tax Calculator and to the Wix Product catalog service
* Events: Cart Item Added, consumed by a 3rd party Analytics app
34. Taming Distributed Systems @NSilnitsky
API First - platformized CRUD
Domains and services on the platform: Stores, Bookings, Events, Forms, Loyalty, Rewards, Tickets, Policies, Checkout, Time Slots, Schemas, Submissions, Guests, Coupons, Calendar, Orders, Waitlist, Cart, Catalog, Programs
API consumers:
* Internal Wix Developer
* External App Developer
* External Wix Site Developer (Velo)
35. Taming Distributed Systems @NSilnitsky
API First - platformized CRUD
Catalog API:
* CreateProduct
* ReadProduct
* UpdateProduct
* DeleteProduct
42. Taming Distributed Systems @NSilnitsky
EDA - Domain Events
The Catalog Service handles CreateProduct, UpdateProduct, and DeleteProduct against the Product Document Store and emits a matching domain event for each: Product Created, Product Updated, Product Deleted.
* DE describes…
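As a rough illustration of a CRUD service that emits one domain event per mutation, consider the sketch below. The `CatalogService` class, the event shape (`type`, `entity_id`, `entity`), and the `publish` callback are assumptions made for the example; the slides do not show the actual event schema.

```python
import uuid


class CatalogService:
    """CRUD over an in-memory document store; each write emits a domain event."""

    def __init__(self, publish):
        self._docs = {}          # stand-in for the product document store
        self._publish = publish  # stand-in for the event producer

    def create_product(self, fields):
        product_id = str(uuid.uuid4())
        doc = {"id": product_id, **fields}
        self._docs[product_id] = doc
        self._emit("ProductCreated", doc)
        return doc

    def read_product(self, product_id):
        return self._docs[product_id]  # reads emit nothing

    def update_product(self, product_id, changes):
        doc = self._docs[product_id]
        doc.update(changes)
        self._emit("ProductUpdated", doc)
        return doc

    def delete_product(self, product_id):
        doc = self._docs.pop(product_id)
        self._emit("ProductDeleted", doc)

    def _emit(self, event_type, doc):
        # Copy the document so later updates do not mutate published events.
        self._publish({"type": event_type, "entity_id": doc["id"],
                       "entity": dict(doc)})


events = []
catalog = CatalogService(publish=events.append)
mug = catalog.create_product({"name": "Mug", "price": 4})
catalog.update_product(mug["id"], {"price": 6})
catalog.delete_product(mug["id"])
```

Unlike event sourcing, the document store is the source of truth here; the events are a notification of what changed, not the data itself.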
43. Taming Distributed Systems @NSilnitsky
EDA - Domain Events
The Catalog Service handles CreateProduct, UpdateProduct, and DeleteProduct through the Product SDL (Simple Data Layer) and emits Product Created, Product Updated, and Product Deleted.
* SDL is document based; no direct SQL.
44. Taming Distributed Systems @NSilnitsky
EDA - Domain Events
The Catalog Service declares the domain event for an RPC in its protobuf service definition:
service ProductService {
  ...
  rpc CreateProduct (CreateProductRequest) returns (CreateProductResponse) {
    ...
    option (callback) = {
      event_type: CREATED
    };
  }
}
45. Taming Distributed Systems @NSilnitsky
EDA - Domain Events
The Product Created, Product Updated, and Product Deleted events from the Catalog Service also flow to the Data warehouse/Lake.
* debugging corruption
46. Taming Distributed Systems @NSilnitsky
EDA - Domain Events
The ebay-bridge Service consumes Product Created, Product Updated, and Product Deleted from the Catalog Service.
49. Taming Distributed Systems @NSilnitsky
Make DB Update & Event Producing Atomic
The Catalog Service updates its DB and produces an event that the Ebay Bridge Service consumes.
* atomic, otherwise
50. Taming Distributed Systems @NSilnitsky
Resilient Producer - Catch Unsent Events
The Catalog Service produces unsent events to S3.
51. Taming Distributed Systems @NSilnitsky
Resilient Producer - Fallback to S3 and Heal
The Catalog Service produces to Kafka and, as a fallback, produces the event to S3. The Healer Service polls S3 and produces the events to Kafka.
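The fallback-and-heal flow can be sketched with in-memory stand-ins. `FlakyBroker` plays the role of Kafka and `FallbackStore` the role of the S3 bucket; both classes, and the `produce` and `heal` functions, are hypothetical and do not use a real Kafka or S3 client.

```python
class FlakyBroker:
    """Stand-in for Kafka: raises ConnectionError while `up` is False."""

    def __init__(self):
        self.up = True
        self.messages = []

    def send(self, topic, message):
        if not self.up:
            raise ConnectionError("broker unavailable")
        self.messages.append((topic, message))


class FallbackStore:
    """Stand-in for the S3 bucket that catches unsent events."""

    def __init__(self):
        self.objects = []

    def put(self, topic, message):
        self.objects.append((topic, message))

    def drain(self):
        items, self.objects = self.objects, []
        return items


def produce(broker, fallback, topic, message):
    """Resilient producer: try the broker first, park the event on failure."""
    try:
        broker.send(topic, message)
        return "kafka"
    except ConnectionError:
        fallback.put(topic, message)
        return "fallback"


def heal(broker, fallback):
    """Healer: poll the fallback store and re-produce whatever it finds."""
    healed = 0
    for topic, message in fallback.drain():
        try:
            broker.send(topic, message)
            healed += 1
        except ConnectionError:
            fallback.put(topic, message)  # still down; keep for the next poll
    return healed


broker, fallback = FlakyBroker(), FallbackStore()
broker.up = False
outcome = produce(broker, fallback, "products", {"type": "ProductCreated"})
broker.up = True
healed = heal(broker, fallback)
```

One consequence of this design is that healed events can arrive after events produced later, so consumers should not assume strict ordering.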
52. Taming Distributed Systems @NSilnitsky
Make DB Update & Event Producing Atomic
On the consuming side, the Ebay Bridge Service uses consumer retries + DLQ (dead-letter queue) for events from the Catalog Service.
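Consumer retries with a dead-letter queue can be sketched as a bounded retry loop. The `consume` function, the `max_attempts` parameter, and the sample handler are assumptions for illustration; a production consumer would typically add backoff between attempts and retry without blocking the partition.

```python
def consume(messages, handler, max_attempts=3):
    """Try each message up to max_attempts times, then send it to the DLQ."""
    processed, dlq = [], []
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(message)
                processed.append(message)
                break
            except Exception as exc:
                if attempt == max_attempts:
                    # Out of retries: park the message with its last error.
                    dlq.append((message, repr(exc)))
    return processed, dlq


attempts = {}


def handler(message):
    """Sample handler: 'flaky' fails once, 'poison' always fails."""
    attempts[message] = attempts.get(message, 0) + 1
    if message == "poison":
        raise ValueError("cannot parse message")
    if message == "flaky" and attempts[message] < 2:
        raise TimeoutError("downstream timed out")


processed, dlq = consume(["ok", "flaky", "poison"], handler)
```

Transient failures are absorbed by the retries, while a message that can never succeed ends up in the DLQ instead of blocking the stream.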
53. Taming Distributed Systems @NSilnitsky
Alternative - use outbox pattern and/or CDC
* Transaction: the Catalog Service writes to its database in a single transaction that covers the Product Table (Insert, Update, Delete) and the Outbox Table (Insert). This gives instant read-your-own-writes consistency in the Catalog service.
* CDC: the Debezium connector, running in Kafka Connect, reads from the Outbox Table and publishes messages to the Kafka brokers.
* The Ebay Bridge Service consumes from the Kafka Broker, which gives eventually consistent data exchange with the Ebay Bridge Service.
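A minimal sketch of the outbox pattern, using Python's built-in `sqlite3` module in place of the service's real database. The table layout, the `sent` flag, and the polling `relay` function are assumptions; `relay` stands in for the Debezium connector, which would read the change log rather than poll.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE product (id TEXT PRIMARY KEY, price REAL)")
db.execute(
    "CREATE TABLE outbox ("
    "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
    "payload TEXT NOT NULL, "
    "sent INTEGER NOT NULL DEFAULT 0)"
)


def create_product(product_id, price):
    """Insert the product row and its event in one transaction."""
    event = {"type": "ProductCreated", "product_id": product_id, "price": price}
    with db:  # commits both inserts together, or rolls both back on error
        db.execute("INSERT INTO product (id, price) VALUES (?, ?)",
                   (product_id, price))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps(event),))


def relay(publish):
    """Publish unsent outbox rows in order, marking each one as sent."""
    rows = db.execute(
        "SELECT seq, payload FROM outbox WHERE sent = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE seq = ?", (seq,))
    return len(rows)


published = []
create_product("123", 4.0)
first = relay(published.append)
second = relay(published.append)
```

If the relay stops between publishing and marking a row as sent, the event is published again on the next run, so delivery is at-least-once and consumers should be idempotent.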
55. Taming Distributed Systems @NSilnitsky
Query latency - naive
FilterProductWithInventory needs data from two services:
* Step 1: the Catalog Service queries its Product SDL
* Step 2: the Catalog Service calls the Inventory Service (Inventory SDL) over RPC
56. Taming Distributed Systems @NSilnitsky
Query latency - naive
Multi-step: FilterProductWithInventory with the filter price < 100 and stock > 4 takes two steps:
* Step 1: the Catalog Service queries its Product SDL
* Step 2: the Catalog Service calls the Inventory Service (Inventory SDL) over RPC
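The naive multi-step query can be sketched as a local filter followed by a remote call. The sample data and the `inventory_get_stock` function, which stands in for the RPC to the Inventory Service, are made up for illustration.

```python
PRODUCTS = {"p1": 50, "p2": 80, "p3": 150}  # product id -> price (Catalog)
STOCK = {"p1": 10, "p2": 2, "p3": 20}       # product id -> stock (Inventory)
rpc_calls = []                              # record of simulated RPC calls


def inventory_get_stock(product_ids):
    """Stand-in for the RPC to the Inventory Service."""
    rpc_calls.append(list(product_ids))
    return {pid: STOCK.get(pid, 0) for pid in product_ids}


def filter_products_with_inventory(max_price, min_stock):
    # Step 1: filter on price using the Catalog's own data.
    candidates = [pid for pid, price in PRODUCTS.items() if price < max_price]
    # Step 2: fetch stock for the candidates from the other service.
    stock = inventory_get_stock(candidates)
    return [pid for pid in candidates if stock[pid] > min_stock]


result = filter_products_with_inventory(max_price=100, min_stock=4)
```

Every query pays for the remote round trip, and step 1 may return many candidates that step 2 then discards.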
57. Taming Distributed Systems @NSilnitsky
Query latency - DB join
FilterProductWithInventory is answered with a DB level Join between the Product SDL of the Catalog Service and the Inventory SDL of the Inventory Service.
58. Taming Distributed Systems @NSilnitsky
Query Latency - Materializer
The Materializer consumes events from the Catalog Service and the Inventory Service (for example, the Inventory updated Event) and maintains a combined Product + Inventory view that serves FilterProductWithInventory.
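A materializer can be sketched as an event handler that keeps a joined view up to date, so the query becomes a single local lookup. The event shapes, the `view` dictionary, and the function names are assumptions for this example.

```python
view = {}  # product id -> {"price": ..., "stock": ...}, the materialized view


def on_event(event):
    """Fold product and inventory events into one combined row per product."""
    pid = event["product_id"]
    if event["type"] == "ProductDeleted":
        view.pop(pid, None)
        return
    row = view.setdefault(pid, {"price": None, "stock": None})
    if event["type"] in ("ProductCreated", "ProductUpdated"):
        row["price"] = event["price"]
    elif event["type"] == "InventoryUpdated":
        row["stock"] = event["stock"]


def filter_products_with_inventory(max_price, min_stock):
    """Answer the query from the view alone, with no cross-service call."""
    return sorted(
        pid for pid, row in view.items()
        if row["price"] is not None and row["stock"] is not None
        and row["price"] < max_price and row["stock"] > min_stock
    )


for e in [
    {"type": "ProductCreated", "product_id": "p1", "price": 50},
    {"type": "ProductCreated", "product_id": "p2", "price": 80},
    {"type": "InventoryUpdated", "product_id": "p1", "stock": 10},
    {"type": "InventoryUpdated", "product_id": "p2", "stock": 2},
    {"type": "ProductUpdated", "product_id": "p2", "price": 70},
    {"type": "InventoryUpdated", "product_id": "p2", "stock": 9},
]:
    on_event(e)

matches = filter_products_with_inventory(max_price=100, min_stock=4)
```

The trade-off is that the view is eventually consistent with its source services: a query can briefly miss an update that has not been consumed yet.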
59. Taming Distributed Systems @NSilnitsky
Comparing Event sourcing to Wix’s CRUD based solution
Approaches compared: Event Sourcing; CRUD with SDL+Domain Events and Materializer
Criteria:
* Simplicity
* Onboarding new team member
* Write performance
* Read performance
* Consistency
* Audit log/time machine
* Projections/queries
* no consistency
67. Taming Distributed Systems @NSilnitsky
Summary
Wix successfully shifted its vast distributed system entirely to CRUD-based microservices, moving away from a CRUD/event sourcing hybrid.
This transformation was driven by a commitment to standardization, managed infrastructure with automated code generation, and a decoupled architecture.
68. Taming Distributed Systems @NSilnitsky
Summary
Advanced tools were also implemented to boost development speed, ensure system resilience, and optimize for scale and performance:
* Domain Events
* Resilient Producer
* Materializer
* Simple Data Layer