Distributed transactions

- ARITRA DAS

Database transactions
A group of operations executed as a single logical unit of work
to retrieve or update data.

ACID properties
➔ Atomicity
➔ Consistency
➔ Isolation
➔ Durability

Consistency
➔ Constraints are satisfied
➔ Data integrity is maintained

Durability
➔ Changes that have been committed to the database should
remain even in the case of software and hardware failure

Atomicity
➔ All or nothing
➔ Abortability
➔ Log for crash recovery
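
The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module. The accounts table, the CHECK constraint and the transfer helper are assumptions made up for this illustration, not something from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Apply both updates or neither (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The debit would make alice's balance negative, so the constraint fails
# and the credit to bob that already ran is rolled back as well.
ok = transfer(conn, "alice", "bob", 150)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The credit is deliberately issued first so that the rollback has something to undo.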

Isolation
➔ Concurrently running transactions shouldn’t interfere
with each other.

Dirty write
➔ One transaction overwrites uncommitted data of another
transaction

Dirty read
➔ One transaction sees uncommitted data of a different
transaction

Non-repeatable reads
➔ Different reads within the same transaction give back
different values for the same object.

Phantom reads
➔ A write in one transaction changes the result of a search
query in another transaction (rows matching the criteria
appear or disappear)

Read uncommitted
➔ Transaction takes a write-lock on the row whose data it’s
modifying.
➔ Prevents dirty write.
➔ Highest performance, since reads take no lock (dirty reads
are possible).

Read committed
➔ Database maintains the last committed value in memory.
➔ Read operations are given the last committed value (no
dirty reads)
➔ Write operations take a row-level write-lock (no dirty
writes)

Snapshot isolation
➔ Transaction sees all data as it was at the moment the
transaction started.
➔ Database maintains several versions of the same data
(multi-version concurrency control, MVCC)
➔ Takes write-lock.
➔ Prevents dirty write, dirty read, non-repeatable reads.
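
A toy sketch of the multi-version idea, assuming a single logical clock and one write per commit. MVCCStore and its methods are hypothetical names, and real databases track versions per transaction rather than per call.

```python
class MVCCStore:
    """Toy multi-version store: each committed write adds a new version."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # timestamp of the latest commit

    def begin(self):
        # A transaction's snapshot is the commit timestamp at its start.
        return self.clock

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        visible = [v for ts, v in self.versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

    def commit_write(self, key, value):
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))


store = MVCCStore()
store.commit_write("x", 1)
snap = store.begin()            # reader starts here
store.commit_write("x", 2)      # a concurrent writer commits afterwards
first = store.read("x", snap)
second = store.read("x", snap)  # repeatable: still the snapshot's value
latest = store.read("x", store.begin())
```

Both reads inside the snapshot return the same value, while a transaction that starts later sees the new version.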

Serializable
➔ Transactions may execute in parallel, but the end result is
the same as if they had executed one at a time, serially,
without any concurrency.
➔ Complex and slower than others.

Serializable isolation techniques
➔ Execute in serial order
➔ 2-phase locking (2PL)
◆ Shared lock for reading, exclusive lock for writing
◆ Pessimistic
➔ Serializable snapshot isolation
◆ No locks, database checks for conflicts when commit attempt is made.
◆ In case of a conflict transactions are aborted.
◆ Optimistic
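
The optimistic approach can be sketched as a commit-time check. This is a simplified version-validation scheme, not the full serializable snapshot isolation algorithm, and Store, Txn and Conflict are hypothetical names.

```python
class Conflict(Exception):
    """Raised when a value read by the transaction changed before commit."""


class Store:
    def __init__(self):
        self.data = {}  # key -> (version, value)


class Txn:
    def __init__(self, store):
        self.store = store
        self.reads = {}   # key -> version observed when read
        self.writes = {}  # key -> value buffered until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        version, value = self.store.data.get(key, (0, None))
        self.reads[key] = version
        return value

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # No locks were taken; detect conflicts only now.
        for key, seen in self.reads.items():
            current, _ = self.store.data.get(key, (0, None))
            if current != seen:
                raise Conflict(key)
        for key, value in self.writes.items():
            version, _ = self.store.data.get(key, (0, None))
            self.store.data[key] = (version + 1, value)


db = Store()
db.data["counter"] = (1, 10)
t1, t2 = Txn(db), Txn(db)
t1.write("counter", t1.read("counter") + 1)
t2.write("counter", t2.read("counter") + 1)
t1.commit()
try:
    t2.commit()
    t2_aborted = False
except Conflict:
    t2_aborted = True  # the application would retry t2
```

The second transaction is aborted instead of silently overwriting the first increment.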

Isolation levels on different dbs

Distributed systems
➔ The nodes operate concurrently.
➔ The nodes fail independently.
➔ The nodes do not share a global clock.

Distributed transactions
➔ Transactions in a distributed system that span two or more
nodes.
➔ The nodes involved communicate over a network, which can
partition.

2 phase commit
➔ Provides atomicity
➔ Synchronous
➔ Requires a global coordinator (most of the time)
➔ 2 Phases
◆ Prepare
◆ Commit
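
The two phases can be sketched in-process as follows. Participant and two_phase_commit are hypothetical names; a real implementation would send these calls over the network and log every step durably.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Vote yes only if the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit): commit only on a unanimous yes, otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


happy = [Participant("P1"), Participant("P2")]
happy_result = two_phase_commit(happy)

unhappy = [Participant("P1"), Participant("P2", can_commit=False)]
unhappy_result = two_phase_commit(unhappy)
```

A single "no" vote in the prepare phase aborts the transaction on every node, which is where the atomicity comes from.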

2PC failures: after prepare
1. P1 -> Prepared ACK
2. P2 -> Prepared ACK
3. P1 -> Committed ACK
4. P2 -> Commit fails
5. Coordinator -> Retry indefinitely
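
Step 5 can be sketched as a retry loop. It assumes the participant's commit is idempotent; FlakyParticipant and commit_with_retry are hypothetical, and a real coordinator would back off between attempts.

```python
class FlakyParticipant:
    """Participant whose commit fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures_left = failures
        self.committed = False

    def commit(self):
        if self.failures_left > 0:
            self.failures_left -= 1
            raise ConnectionError("commit not acknowledged")
        self.committed = True  # safe to repeat: committing twice changes nothing


def commit_with_retry(participant):
    # After a successful prepare the decision is final, so the coordinator
    # cannot abort any more; it keeps retrying until the commit is acknowledged.
    attempts = 0
    while True:
        attempts += 1
        try:
            participant.commit()
            return attempts
        except ConnectionError:
            continue


p2 = FlakyParticipant(failures=3)
attempts = commit_with_retry(p2)
```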

2PC failures: coordinator failure
➔ Before prepare
◆ Clients abort
➔ After getting ACK from participants
◆ Participants wait for the coordinator to come back up
◆ Coordinator comes back, reads its log and acts on it
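
A minimal sketch of what "reads the log and acts" could look like. The log format and the rule of aborting transactions with no logged decision are assumptions for illustration; the slides do not specify either.

```python
def recover(log):
    """Derive the outcome of each transaction from the coordinator's log.

    log is a list of (transaction_id, record) tuples in write order.
    """
    last_record = {}
    for txid, record in log:
        last_record[txid] = record  # later records replace earlier ones

    decisions = {}
    for txid, record in last_record.items():
        # A logged commit decision must be re-sent to the participants.
        # With no decision logged, this sketch assumes it is safe to abort.
        decisions[txid] = "commit" if record == "commit" else "abort"
    return decisions


log = [
    ("t1", "prepared"),
    ("t1", "commit"),    # decision was logged before the crash
    ("t2", "prepared"),  # crash happened before any decision
]
decisions = recover(log)
```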

2PC benefits
➔ Guarantees atomicity
➔ Provides read-write isolation
➔ Provides strong consistency

2PC disadvantages
➔ Synchronous and blocking
➔ Participants hold locks until the transaction is resolved
➔ Some problems in 2PC are addressed by 3PC

SAGA
➔ Async and reactive
➔ Communication over message bus
➔ Compensating transactions on failure
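
The compensation idea can be sketched synchronously. A real saga would exchange messages over a bus; run_saga and the order steps below are hypothetical.

```python
def run_saga(steps):
    """Run (action, compensation) pairs; undo completed steps on failure."""
    completed = []
    for action, compensate in steps:
        try:
            action()
            completed.append(compensate)
        except Exception:
            # Compensate in reverse order of completion.
            for undo in reversed(completed):
                undo()
            return False
    return True


events = []


def charge_card():
    raise RuntimeError("card declined")


steps = [
    (lambda: events.append("order created"),
     lambda: events.append("order cancelled")),
    (lambda: events.append("stock reserved"),
     lambda: events.append("stock released")),
    (charge_card,
     lambda: events.append("payment refunded")),
]
succeeded = run_saga(steps)
```

The failed step itself is not compensated, since it never completed.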

Saga pros and cons
➔ Pros
◆ Async, non-blocking
◆ Atomicity (through compensating transactions)
➔ Cons
◆ No isolation

Types of saga
➔ Choreography
➔ Orchestration

Choreography
➔ No central coordinator
➔ Participants emit and subscribe to messages
➔ Simpler to implement
➔ Provides loose coupling
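
A sketch of choreography with an in-process publish/subscribe bus standing in for a real message broker. The Bus class, the topic names and the payment rule are all assumptions for illustration.

```python
from collections import defaultdict


class Bus:
    """Minimal synchronous publish/subscribe bus."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.log = []  # topics in the order they were published

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, payload):
        self.log.append(topic)
        for handler in self.handlers[topic]:
            handler(payload)


bus = Bus()

# Order service: reacts to payment outcomes, knows nothing about who sent them.
bus.subscribe("PaymentSucceeded", lambda order: bus.publish("OrderApproved", order))
bus.subscribe("PaymentFailed", lambda order: bus.publish("OrderRejected", order))


# Payment service: reacts to new orders.
def on_order_created(order):
    topic = "PaymentSucceeded" if order["amount"] <= 100 else "PaymentFailed"
    bus.publish(topic, order)


bus.subscribe("OrderCreated", on_order_created)

bus.publish("OrderCreated", {"id": 1, "amount": 50})
```

No component holds the overall flow; it emerges from the subscriptions.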

Orchestration
➔ Central orchestrator coordinates the events
➔ Works in an asynchronous command/reply style
➔ Less coupling
➔ Smart orchestrator, dumb services
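
The same kind of flow with an orchestrator might look like this. Direct function calls stand in for asynchronous command/reply messages, and the service names and message types are hypothetical.

```python
def payment_service(command):
    # "Dumb" service: handles one command and replies, no knowledge of the flow.
    if command["amount"] <= 100:
        return {"type": "PaymentAuthorized"}
    return {"type": "PaymentDeclined"}


def kitchen_service(command):
    return {"type": "TicketCreated"}


class Orchestrator:
    """'Smart' orchestrator: owns the sequence of commands and the decisions."""

    FLOW = [
        ("payment", "AuthorizePayment", "PaymentAuthorized"),
        ("kitchen", "CreateTicket", "TicketCreated"),
    ]

    def __init__(self, services):
        self.services = services
        self.history = []  # (command type, reply type) pairs

    def run(self, order):
        for service, command_type, expected_reply in self.FLOW:
            reply = self.services[service]({"type": command_type, **order})
            self.history.append((command_type, reply["type"]))
            if reply["type"] != expected_reply:
                return "REJECTED"
        return "APPROVED"


services = {"payment": payment_service, "kitchen": kitchen_service}
approved = Orchestrator(services).run({"id": 1, "amount": 50})
declined = Orchestrator(services)
rejected = declined.run({"id": 2, "amount": 500})
```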

Saga isolation anomalies
➔ Lost update
➔ Dirty read
➔ Non-repeatable reads

Countermeasures
➔ Semantic lock: An application-level lock. This can be an actual DB lock,
or an indicator that the record is being updated, such as *_PENDING added
to the status.
➔ Commutative updates: Design update operations to be executable in any
order.
➔ Pessimistic view: Reorder the steps of a saga to minimize business risk.
➔ Reread value: Prevent dirty writes by rereading data to verify that it’s
unchanged before overwriting it.
➔ Version file: Record the updates to a record so that they can be
reordered.
➔ By value: Use each request’s business risk to dynamically select the
concurrency mechanism.
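
As one illustration of these countermeasures, commutative updates can be sketched with balance deltas. The numbers are made up; the point is that credits and debits expressed as increments give the same result in every order.

```python
from itertools import permutations


def apply_all(balance, deltas):
    """Apply increments one by one; addition makes the order irrelevant."""
    for delta in deltas:
        balance += delta
    return balance


deltas = [50, -20, 5]
# Try every possible ordering of the three updates.
results = {apply_all(100, order) for order in permutations(deltas)}
```

An update written as "set balance to X" would not have this property, which is why the operations have to be designed for it.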

Semantic lock: lost update
1. Create order SAGA -> Flag :: PENDING_APPROVAL
2. Cancel order SAGA -> can’t cancel as flag is
PENDING_APPROVAL
3. Create order SAGA -> Flag :: APPROVED
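
The three steps above can be sketched as follows. The Order class and the rule that a cancel is refused while the flag is set are assumptions; a real system might queue or retry the cancel instead.

```python
class Order:
    def __init__(self):
        # Step 1: the create-order saga sets the semantic lock.
        self.status = "PENDING_APPROVAL"


def cancel(order):
    # Step 2: another saga must respect the flag and back off.
    if order.status == "PENDING_APPROVAL":
        return False
    order.status = "CANCELLED"
    return True


def approve(order):
    # Step 3: the create-order saga finishes and releases the lock.
    if order.status == "PENDING_APPROVAL":
        order.status = "APPROVED"


order = Order()
cancel_during_create = cancel(order)
status_during_create = order.status
approve(order)
cancel_after_create = cancel(order)
```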

Reread value: lost update
1. Create order saga creates an order
2. Cancel order saga cancels the order
3. Create order saga tries to approve the order
a. Rereads the order status
b. Order is cancelled, hence it doesn’t do anything
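
A sketch of the reread step, using a plain dict as a stand-in for the order table. The status names are assumptions, and a real system would make the check and the write atomic, for example with a conditional update.

```python
def approve_order(orders, order_id):
    """Approve only if the order is still in the state the saga expects."""
    status = orders[order_id]  # reread instead of trusting an earlier read
    if status != "PENDING":
        return False           # someone else changed it: do nothing
    orders[order_id] = "APPROVED"
    return True


orders = {1: "PENDING"}         # 1. create-order saga creates the order
orders[1] = "CANCELLED"         # 2. cancel-order saga cancels it
approved = approve_order(orders, 1)  # 3. create-order saga tries to approve
```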

Conclusion
➔ Maintaining ACID properties in distributed systems is
hard.
➔ It’s essential to understand atomicity and isolation to
determine what is required by your systems.
➔ If you can identify what kind of anomalies might happen
in your system, then you can take countermeasures.
➔ There’s no silver bullet.
