
Using chained transactions for maximum concurrency under load (QCONSF 2010)

My chained transaction talk for handling maximum concurrency in the presence of lock contention like shopping cart checkout.

Published in: Technology, Business
4 Comments
  • Hello Billy,

    We use chaining to build scalable distributed transactions* and pair it with intra-transaction parallelism. Consider the following architecture for a shopping cart with 5 items:

    - The BPM engine executes a workflow with 5 parallel streams; each stream performs a 1PC transaction on an individual DB shard and a synchronous replication transaction on that shard’s backup.
    - Semantic reasons to abort individual transactions are prevented with an invariant, which guarantees the necessary post-execution integrity conditions.

    As a result, compensators are not needed: no transaction has to abort for semantic reasons, and hardware/software/network faults on an individual shard will not prevent a transaction from committing on that shard’s backup.

    Thus, availability and partition tolerance are enhanced without trading off consistency. The price is just a 15% deterioration in performance.

    Ivan

    * http://transactum.com/Ivan_Klianev_HPTS_2011.pdf
  • Hello Billy,

    Do you have any insight into the availability of ’Chained transactions’ in WebSphere eXtreme Scale? Could it be there for Christmas?

    Congratulations on WXS; it is really innovative in the IMDG sector. You solved most of the big problems I faced with competing products.

    Cyrille
  • Reminds me of itinerary-based routing from the early Sonic ESB days. All the state was kept in the message and it enabled a distributed deployment. This caused early excitement and the creation of WS-Routing by Microsoft and talk of adopting it in BizTalk... but eventually they dropped it as it was felt there were major drawbacks to this approach as a general facility, security being the big one. Having said that, there's no question it has beneficial concurrency properties in an orchestration scenario.

    On the other hand, I'm not sure why the flow state doesn't scale when it is its own database (or set of shards), as long as you ensure downstream updates (of the SKUs) are decoupled and asynchronous. The state embedded in the message just seems like icing on the cake; the real win is in splitting a large-footprint transaction into lots of little ones, executing asynchronously, with compensations if a rule is violated. This is basically what some of the better BPM engines do (not naming names ;)
  • @dehora
    Bill de hOra
    'Slide 20 is fried gold. That one slide captures what's wrong with 99% of workflow systems. plus without getting all actor theoretic.'
Using chained transactions for maximum concurrency under load (QCONSF 2010)

  1. Thoughts on Transactions: Chained transactions for scaling ACID (@billynewport)
  2. Agenda
     - Explain scenario
     - Implement using a single database
     - Explain concurrency issues under load
     - Implement using a sharded database
     - Implement using WebSphere eXtreme Scale and chained transactions
  3. Scenario
     - Large eCommerce web site
     - Problem is order checkout
     - We track inventory levels for each SKU
     - Orders during checkout need to adjust available inventory.
  4. Shopping cart metrics
     - Millions of SKUs
     - Cart size of 5 items for electronics/big ticket items
     - Cart size of 20 items for clothing
     - Expect concurrent load of 2500 checkouts per second
  5. Database
     - Begin
     - For each item in cart
       - Select for update where sku = item.sku
       - Decrement available sku level
       - If not available then rollback…
       - Update level where sku = item.sku
     - Commit
     Cart items randomly distributed amongst all 2m items, lots of concurrency. Simple enough, right? All is good?
  6. Problem: cabbage patch dolls
     - Cabbage patch dolls are popular this fall…
  7. Database killers!
     - The dolls cause major concurrency problems
     - Lots of row level locks
     - Contention on doll rows
     - Possible table lock escalation
     - App server thread issues
     - Connection pools empty
     - Then DEATH!
     - They aren’t sweet and cuddly any more…
  8. Database killers
     - We need a way to get locks to decrement inventory
     - But, we don’t want to hold the lock for very long
     - Bigger carts make the problem worse, all the locks held for longer
     - Ideally, hold locks for constant time
     - Any contentious items make problem worse
  9. Solution
     - Hold lock on inventory rows for as short a time as possible
     - Decouple this from size of cart.
     - How?
  10. Chained transactions
     - Programmers think of transactions in synchronous terms.
     - Begin / Do Work / Hold locks / Commit
     - Easy to program, bad for concurrency.
  11. Inspiration
     - Microsoft had COM objects with apartment model threading.
     - Modern Actor support is similar. Some state with a mailbox.
     - BPEL supports flows with compensation
     - Data meets actors is a good analogy
     - Send a message (cart) to a group of actors identified using their keys with a compensator
  12. Alternative
     - We need to think asynchronously in terms of flows with compensation
     - Map of <SKU Key/SKU Amount>
     - Brick:
       - Do { code to reduce inventory level for SKU }
       - Undo { code to increase inventory level for SKU }
     - Provide Map with do/undo bricks
     - Easy to program, great concurrency.
  13. Transactions and sharded stores
     - Option 1: Write transaction to one shard then spread out asynchronously
     - Option 2: 2 phase commit
     - Option 3: Chained transactions
  14. Transactions and sharded stores
     - Option 1: Write transaction to one shard then spread out asynchronously
     - Option 2: 2 phase commit
     - Option 3: Chained transactions
  15. Transactions and sharded stores
     - Option 1: Write transaction to one shard then spread out asynchronously
     - Option 2: 2 phase commit
     - Option 3: Chained transactions
  16. Implementation
     - 1PC only required
     - Data store supporting
       - Row locks
       - Row oriented data
       - Integrated FIFO messaging
     - IBM WebSphere eXtreme Scale provides these capabilities.
  17. Implementation
     - Application makes map and code bricks
     - Submits transaction as an asynchronous job.
     - Uses a Future to check on job outcome.
     - Do blocks can trigger flow reversal if a problem occurs.
     - Invoke undo block for each completed step
  18. Mechanism
     - Loop
       - Receive message for actor key
       - Process it
       - Send modified cart to next ‘sku’ using local ‘transmit q’
       - Commit transaction
     - Background thread pushes messages in transmit q to the destination shards using exactly once semantics.
  19. Performance
     - A single logical transaction will be slower than a 1PC DBMS implementation.
     - However, under concurrent load it will deliver:
       - Higher throughput
       - Better response times
       - Through better contention management
     - Each ‘SKU’ is only locked for a very short period
  20. Generalization
     - This could be thought of as a workflow engine.
     - But, a big difference here is that a workflow engine usually talks with a remote store.
     - Here:
       - the flow state is the MESSAGE
       - It moves around to where the data is for the next step
     - Using a MESSAGE for flow state rather than a database means it scales linearly.
     - The message ‘store’ is integrated and scales with the data store.
  21. Architecture Comparison: Conventional vs. Message oriented
     [Diagram; recoverable labels: Flow DB, Appl DB, Msg Store, BP Engine, Flow State, Flow Edge = Msg, Integrated Msg/Data store, Write behind]
  22. Sample implementation
     - Coming soon.
     - Running in lab
     - Working with several eCommerce customers looking to implement soon.
     - Soon to be published on github as sample code.
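
The single-database checkout on slide 5 can be sketched in a few lines to make the contention problem concrete. This is only an illustration, not the talk's code: an in-memory dictionary with one `threading.Lock` per SKU stands in for the database and its row locks, and the names `Inventory` and `checkout_single_transaction` are hypothetical. The sketch also acquires locks in sorted SKU order to avoid deadlock, which the slide does not mention.

```python
import threading


class Inventory:
    """Hypothetical in-memory stand-in for the inventory table."""

    def __init__(self, levels):
        self.levels = dict(levels)
        # One lock per SKU plays the role of a row-level lock.
        self.locks = {sku: threading.Lock() for sku in levels}


def checkout_single_transaction(inv, cart):
    """Run the whole cart as one transaction; cart maps sku -> quantity.

    Every lock taken is held until the end, so lock hold time grows
    with cart size. Returns True on commit, False on rollback.
    """
    held = []
    pending = {}
    try:
        for sku in sorted(cart):  # fixed order: an assumption of this sketch
            inv.locks[sku].acquire()  # stands in for SELECT ... FOR UPDATE
            held.append(sku)
            if inv.levels[sku] < cart[sku]:
                return False  # rollback: no level has been written yet
            pending[sku] = inv.levels[sku] - cart[sku]
        inv.levels.update(pending)  # commit all decrements together
        return True
    finally:
        for sku in held:  # locks are only released at commit or rollback
            inv.locks[sku].release()
```

With a 20-item cart, a popular SKU locked early in the loop stays locked while the other 19 rows are processed, which is the behaviour slides 7 and 8 describe.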

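The do/undo bricks (slide 12), flow reversal (slide 17) and receive/process/send loop (slide 18) can be sketched the same way. This is a single-process simulation under stated assumptions, not the WebSphere eXtreme Scale implementation: `CartMessage`, `InventoryStore`, `process` and `run_chain` are hypothetical names, a `deque` stands in for the transmit queue, and the caller gets the outcome synchronously rather than through a Future.

```python
import threading
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CartMessage:
    """The flow state travels in the message, not in a separate database."""
    remaining: list                            # [(sku, qty), ...] still to run
    done: list = field(default_factory=list)   # completed steps, kept for undo
    undoing: bool = False
    outcome: Optional[str] = None


class InventoryStore:
    def __init__(self, levels):
        self.levels = dict(levels)
        self.locks = {sku: threading.Lock() for sku in levels}

    def do_brick(self, sku, qty):
        with self.locks[sku]:  # lock held for this one SKU only
            if self.levels[sku] < qty:
                return False
            self.levels[sku] -= qty
            return True

    def undo_brick(self, sku, qty):
        with self.locks[sku]:
            self.levels[sku] += qty


def process(store, msg):
    """Handle one step; return the message to forward, or None when finished."""
    if msg.undoing:
        sku, qty = msg.done.pop()  # compensate the most recent completed step
        store.undo_brick(sku, qty)
    else:
        sku, qty = msg.remaining[0]
        if store.do_brick(sku, qty):
            msg.done.append(msg.remaining.pop(0))
            if not msg.remaining:
                msg.outcome = "committed"
                return None
            return msg
        msg.undoing = True  # do block failed: reverse the flow
    if not msg.done:
        msg.outcome = "compensated"
        return None
    return msg


def run_chain(store, cart):
    """Drive one cart through the chain; cart maps sku -> quantity."""
    if not cart:
        return "committed"
    msg = CartMessage(remaining=list(cart.items()))
    transmit_q = deque([msg])  # stands in for the local transmit queue
    while transmit_q:
        forwarded = process(store, transmit_q.popleft())
        if forwarded is not None:
            transmit_q.append(forwarded)
    return msg.outcome
```

Each step touches exactly one SKU lock, so lock hold time no longer depends on cart size. The cost is that other checkouts can see intermediate inventory levels until the chain either completes or is compensated.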