Asynchronous processing in big system

Asynchronous Processing
in Big System
Lê Minh Nghĩa
Solution Architect at Tiki.vn

Big System’s Targets
• High Performance
• High Scalability
• High Reliable
• Low Cost Maintenance

Problems
• IO Bottleneck
• Scale Processing
• Handle a Huge Concurrent Request
• Availability and Partition Fault Tolerance
• Deal with consistency and concurrency

Split to Scale
• If you can’t split, you can’t scale it
• From Monolithic System to Distributed System
• From One to Many Processing
• From One to Many Persistent
• From Single to Parallel Processing
• From Synchronous to Asynchronous

Data Replication
• Every nodes need a way to communicate with
other
• Data replication is the most important in
distributed system
• The reliability of a system depends on the way
data replication

Replicate Model
• Primary Data Backup
• State Machine Model
(Active Active Model)

What’s Reliable
Replication?
• No Lost Data
• Guarantee Ordering
• High Scalability
• Easy Integration

Message Queue
• Message Queue is the key to split and scale
system
• It’s the solution for reliable replication
• But It’s not simple as we think…

1. Guarantee No Lost Data
We usually do both:
- Write Data To DB
- Send Message To
Queue
Database
Message
Queue
Processing
Problems, In fact:
- Can Write But Can’t
Send
- Can Send But Cant’
Write

1. Guarantee No Lost Data
• Solutions:
• Use One Way data flow:
Process —> Database —> Message Queue
• Use Transaction Log to Dispatch Data Changes

2. Guarantee
Sending Ordering
• Problems:
• Each request sending out
one message at the
same time, in different
threads
• One of the messages can
be fail in sending
• That cause the messages
are not in ordering

2. Guarantee
Sending Ordering
• Use Transaction Log To Append Un-dispatched
Message in Order
• Asynchronous Sending Un-Dispatched Message
to Message Queue

2. Guarantee
Sending Ordering
Transaction
Undispatched
Message
Write
Polling
Worker
1. Poll Messages
Message Bus
2. Publish Messages
3. Remove Messages

3. Guarantee
Delivery Ordering
E2 E3 E4E1
Worker 1 Worker 2 Worker 3 Worker 4
- Events are dequeued in
concurrency by many
workers
- Message Queue can
guarantee first in first out
- The later event can be
processed faster than the
earlier event —> cause lost
ordering

3. Guarantee
Delivery Ordering
• Solutions:
• if use Rabbit MQ/Active MQ: use only one
consumer for a queue
• If use Kafka, Kafka guarantee ordering delivery
message per each partition. Only one thread of a
consumer group can receive message from a
partition

4. Idempotemcy Filtering
• This is about duplicate message
• A message can be delivery more than one time
• Example: can deposit twice because receive
deposit message twice

4. Idempotemcy Filtering
• Solutions:
• Use UUID/GUID v4 for message id
• Use timestamp or version of message to detect
duplicate

5. Versoning Message
• Replicated data is
always eventually
consistency
• Sometime we
need to know
about how stale
data is
V4 V3 V2V5
Write V5 Read V1

5. Versoning Message
• Use timestamp
• Use incremental version (integer)
• Guarantee increase version consistency when
write data

6. Non Blocking IO
• How to handle million
messages in a queue?
• Solutions:
• Processing message in
pipeline.
• Split processing in three
separated phases: receiving,
handling and completing
message
• Each phase is processing in
parallel
receiving
handling
completing

7. Capture Data Changes
• Is the way capture data changes of DB to
replicate data to Message Queue
• Use specific mechanism of DB to know the
changes of Data

MySQL Bin Log
• Decode My
SQL Bin Log
to know new
data changes
MySQL My SQL Binlog
Event Handler
Decode Bin Log
Message Queue

Postgresql Notification
• Use Postgres
Notification to
notify the
changes of
data
Postgresql
Notification
Receiver
Message Queue
Notify

Thank You
• Contact: Lê Minh Nghĩa
• Email: nghia.fit@gmail.com
• Facebook: /nghialeminh

Asynchronous processing in big system

More Related Content

What's hot

Similar to Asynchronous processing in big system

Recently uploaded

Asynchronous processing in big system