Transaction Timestamping in Temporal Databases

  1. 1. German Shegalov, Transaction Timestamping in Temporal Databases, FR Informatik Graduiertenkolleg Ringvorlesung, May 26th, 2003, based on the research by D. Lomet, C. Jensen and R. Snodgrass
  2. 2. Outline <ul><li>Introduction: conventional vs. temporal </li></ul><ul><ul><li>Temporal databases: valid-time, transaction-time </li></ul></ul><ul><ul><li>Databases and transactions (ACID principles) </li></ul></ul><ul><ul><li>(Optimistic) concurrency control: TO, … </li></ul></ul><ul><ul><li>(Pessimistic) concurrency control: 2PL, … </li></ul></ul><ul><li>Timestamping in transaction-time databases </li></ul><ul><ul><li>Timestamping and strong 2PL (SS2PL) </li></ul></ul><ul><ul><li>Timestamping in a distributed setting (2PC) </li></ul></ul><ul><ul><li>Timestamping since SQL-92 (CURRENT_TIME) </li></ul></ul>
  3. 3. Conventional vs. Temporal DB <ul><li>A conventional DB captures only the most current state of the modeled world </li></ul><ul><ul><li>e.g. current account balance, employee's salary </li></ul></ul><ul><li>A temporal DB supports a time domain and is thus able to manage time-varying data </li></ul><ul><ul><li>real-time stock quotes </li></ul></ul><ul><ul><li>employees' salaries between 1997 and 2000 </li></ul></ul>
  4. 4. Notions of Time <ul><li>Transaction-time </li></ul><ul><ul><li>is the time when a fact is stored in the database; it allows for as-of queries </li></ul></ul><ul><li>Valid-time </li></ul><ul><ul><li>is the time when a fact becomes effective (valid) in reality </li></ul></ul><ul><li>Bitemporal databases support both of the above </li></ul>
  5. 5. Transaction (ACID contract) <ul><li>Atomicity (all or nothing in case of a failure) </li></ul><ul><ul><li>begin; </li></ul></ul><ul><ul><li>acc1 -= money; acc2 += money; </li></ul></ul><ul><ul><li>commit; </li></ul></ul><ul><li>Consistency </li></ul><ul><ul><li>roll back updates upon a failed consistency check </li></ul></ul><ul><li>Isolation </li></ul><ul><ul><li>mask inconsistent intermediate states resulting from concurrent execution </li></ul></ul><ul><li>Durability </li></ul><ul><ul><li>committed updates must be failure-resilient </li></ul></ul>
  6. 6. Transaction Isolation <ul><li>Lost Update: x=0; r1(x=0) r2(x=0) w2(x=x+20) w1(x=x+10); result x=10 instead of x=30 </li></ul><ul><li>Dirty Read: x=0; w1(x=10) r2(x=10) abort1 = w1⁻¹(x) w2(x=x+10); x goes from 0 to 10 to 20 </li></ul><ul><li>Inconsistent Read: x=0, y=0; r1(x=0) w2(x=5) w2(y=10) r1(y=10); transaction 1 sees x=0, y=10 </li></ul><ul><li>Read/Write, Write/Read and Write/Write pairs on the same item do not commute </li></ul>
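The lost-update anomaly on this slide can be replayed with plain arithmetic. The following is a minimal sketch; the function names `interleaved` and `serial` are illustrative and not from the slides.

```python
def interleaved():
    """Replay r1(x) r2(x) w2(x=x+20) w1(x=x+10) on a shared variable."""
    x = 0
    seen_by_1 = x        # r1(x=0)
    seen_by_2 = x        # r2(x=0)
    x = seen_by_2 + 20   # w2 writes 20
    x = seen_by_1 + 10   # w1 overwrites with 10: the update of transaction 2 is lost
    return x


def serial():
    """Run transaction 1 to completion, then transaction 2."""
    x = 0
    x = x + 10           # transaction 1
    x = x + 20           # transaction 2
    return x
```

The interleaved run ends with x=10, while any serial order ends with x=30, which is why the two writes must not be reordered freely.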
  7. 7. CC Protocols <ul><li>Basic Timestamp Ordering (BTO) </li></ul><ul><ul><li>each transaction i obtains a timestamp t_i right away </li></ul></ul><ul><ul><li>operations are executed in the scheduled order </li></ul></ul><ul><ul><li>r_i(x): if t_i ≥ w-time(x) then schedule else abort_i </li></ul></ul><ul><ul><li>w_i(x): if t_i ≥ max{w-time(x), r-time(x)} then schedule else abort_i </li></ul></ul><ul><li>Two Phase Locking (2PL) </li></ul><ul><ul><li>prior to the execution of an operation, an appropriate lock is requested </li></ul></ul><ul><ul><li>no further lock requests after some lock has been released </li></ul></ul><ul><ul><li>a lock is granted when no conflicting lock is present </li></ul></ul><ul><ul><li>otherwise an edge is added to the Wait-For-Graph (WFG) </li></ul></ul><ul><ul><li>outperforms BTO </li></ul></ul><ul><li>Strong 2PL (SS2PL) </li></ul><ul><ul><li>locks are held until commit (IBM DB2, MS SQL Server, …) </li></ul></ul>
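The two BTO rules on this slide can be sketched as a tiny scheduler. This is an illustration under assumptions, not the slides' implementation: the class name `BTOScheduler`, integer timestamps, the default timestamp 0 for untouched items, and the bookkeeping that raises r-time and w-time after a scheduled operation are all assumed here.

```python
class Abort(Exception):
    """Raised when an operation arrives too late and its transaction must abort."""


class BTOScheduler:
    def __init__(self):
        self.r_time = {}  # item -> largest timestamp of any scheduled reader
        self.w_time = {}  # item -> timestamp of the last scheduled writer

    def read(self, t_i, x):
        # r_i(x): schedule only if t_i >= w-time(x)
        if t_i < self.w_time.get(x, 0):
            raise Abort(f"r({x}) with timestamp {t_i} is too late")
        self.r_time[x] = max(self.r_time.get(x, 0), t_i)

    def write(self, t_i, x):
        # w_i(x): schedule only if t_i >= max{w-time(x), r-time(x)}
        if t_i < max(self.w_time.get(x, 0), self.r_time.get(x, 0)):
            raise Abort(f"w({x}) with timestamp {t_i} is too late")
        self.w_time[x] = t_i
```

A transaction with timestamp 1 that tries to write x after a reader with timestamp 2 has been scheduled is aborted, which illustrates why BTO tends to restart transactions.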
  8. 8. Outline <ul><li>Introduction: conventional vs. temporal </li></ul><ul><ul><li>Temporal databases: valid-time, transaction-time </li></ul></ul><ul><ul><li>Databases and transactions (ACID principles) </li></ul></ul><ul><ul><li>(Optimistic) concurrency control: TO, … </li></ul></ul><ul><ul><li>(Pessimistic) concurrency control: 2PL, … </li></ul></ul><ul><li>Timestamping in transaction-time databases </li></ul><ul><ul><li>Timestamping and strong 2PL (SS2PL) </li></ul></ul><ul><ul><li>Timestamping in a distributed setting (2PC) </li></ul></ul><ul><ul><li>Timestamping since SQL-92 (CURRENT_TIME) </li></ul></ul>
  9. 9. TT Database Semantics <ul><li>Each record has a timestamp </li></ul><ul><li>Insert creates a new record </li></ul><ul><li>Update inserts a new record version </li></ul><ul><li>Delete inserts an empty record version (delete-stub) for the record being deleted </li></ul><ul><li>Timeslice Q(t) executes Q against the DB as of t </li></ul><ul><ul><li>returns for each qualifying record the latest version with timestamp ≤ t unless it is a delete-stub </li></ul></ul><ul><ul><li>implies that timestamp order must agree with serialization order </li></ul></ul>
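The insert, update, delete and timeslice semantics above can be sketched with an append-only version store. The names `TTStore` and `DELETED`, the integer timestamps and the in-memory dictionary are assumptions made for illustration only.

```python
DELETED = object()  # marker standing in for a delete-stub


class TTStore:
    def __init__(self):
        self.versions = {}  # key -> list of (timestamp, value) versions

    def _append(self, key, value, ts):
        self.versions.setdefault(key, []).append((ts, value))

    def insert(self, key, value, ts):
        self._append(key, value, ts)    # creates a new record

    def update(self, key, value, ts):
        self._append(key, value, ts)    # inserts a new record version

    def delete(self, key, ts):
        self._append(key, DELETED, ts)  # inserts a delete-stub

    def timeslice(self, t):
        """Return the latest version with timestamp <= t for every record, skipping delete-stubs."""
        result = {}
        for key, history in self.versions.items():
            visible = [(ts, value) for ts, value in history if ts <= t]
            if visible:
                _, value = max(visible, key=lambda version: version[0])
                if value is not DELETED:
                    result[key] = value
        return result
```

Because old versions are never overwritten, the same `timeslice(t)` can be asked again later and should return the same answer, provided timestamps agree with the serialization order.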
  10. 10. Timestamp Selection (simple) <ul><li>BTO provides proper timestamp order automatically </li></ul><ul><ul><li>but it causes too many transaction restarts </li></ul></ul><ul><li>SS2PL for any p_i(x) < q_j(x) in conflict: </li></ul><ul><ul><li>pl_i(x) < p_i(x) < pul_i(x) < c_i < ql_j(x) < qul_j(x) < c_j </li></ul></ul><ul><ul><li>commit order agrees with serialization order </li></ul></ul><ul><ul><li>choose the commit time as the timestamp </li></ul></ul><ul><ul><li>timestamping is not used for CC, thus no additional concurrency limitation </li></ul></ul>
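  11. 11. Two Phase Commit (2PC) <ul><li>Participants: Coordinator, DB1, DB2 (timeline) </li></ul><ul><li>Coordinator: force-log begin; send prepare to DB1 and DB2 </li></ul><ul><li>DB1, DB2: force-log prepared; reply yes </li></ul><ul><li>Coordinator: force-log commit; send commit to DB1 and DB2 </li></ul><ul><li>DB1, DB2: force-log commit; reply ack </li></ul><ul><li>Coordinator: force-log end </li></ul>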
  12. 12. Timestamping Issues in 2PC <ul><li>Problem </li></ul><ul><ul><li>network latencies and loosely synced clocks </li></ul></ul><ul><ul><li>commit points are different at all sites </li></ul></ul><ul><ul><li>max_commit_time < begin_time is possible, as perceived by the user </li></ul></ul><ul><li>Observation </li></ul><ul><ul><li>when X is prepared, all conflicting concurrent transactions will commit after X </li></ul></ul><ul><li>Solution: </li></ul><ul><ul><li>each database i votes with EARLIEST_i, its earliest acceptable timestamp, which is updated after logging prepared </li></ul></ul><ul><ul><li>commit with max{EARLIEST_i, begin_time} </li></ul></ul>
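  13. 13. 2PC for Transaction Time DB <ul><li>Participants: Coordinator, DB1, DB2 (timeline) </li></ul><ul><li>Coordinator: force-log begin(10) /* begin_time = 10 */; send prepare to DB1 and DB2 </li></ul><ul><li>DB1 /* EARLIEST_1 = 8 */: force-log prepared; EARLIEST_1++; reply yes(9) </li></ul><ul><li>DB2 /* EARLIEST_2 = 10 */: force-log prepared; EARLIEST_2++; reply yes(11) </li></ul><ul><li>Coordinator: force-log commit(11); send commit(11) to DB1 and DB2 </li></ul><ul><li>DB1, DB2: force-log commit(11); reply ack </li></ul><ul><li>Coordinator: force-log end </li></ul>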
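The timestamp choice in this exchange reduces to a maximum over the votes and the begin time. A minimal sketch, assuming integer timestamps; the function names `vote` and `choose_commit_timestamp` are illustrative and not from the slides.

```python
def vote(earliest):
    """Participant side: after logging prepared, advance EARLIEST_i and send it with the yes vote."""
    return earliest + 1


def choose_commit_timestamp(begin_time, votes):
    """Coordinator side: commit with max{EARLIEST_i, begin_time}."""
    return max([begin_time, *votes])
```

With the slide's numbers (begin_time = 10, EARLIEST_1 = 8, EARLIEST_2 = 10) the votes are 9 and 11, so `choose_commit_timestamp(10, [vote(8), vote(10)])` selects 11, the value carried by commit(11).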
  14. 14. Timestamping since SQL-92 <ul><li>An SQL query can ask for the current time with some precision: year, month, date, …, millisecond </li></ul><ul><li>SQL-92 only requires the current time value to be fixed within a single SQL statement </li></ul><ul><li>In a TTDB a transaction logically takes place at a single point in time </li></ul><ul><ul><li>the current time value must not change until commit </li></ul></ul>
  15. 15. "Current Time" Matters <ul><li>X1 reads non-current y as of t_current </li></ul><ul><li>X3 updates unlocked current y (e.g. a stock goes up enormously) </li></ul><ul><li>some time later: was X1 aware of X3?! </li></ul><ul><ul><li>based on transaction timestamps: ct1 > ct3 => YES! </li></ul></ul><ul><ul><li>in fact: NOT GUILTY! </li></ul></ul><ul><li>current time determines user-perceived transaction time </li></ul>[Figure: timeline with X1 (fix t_current, r(y0), buy, commit at ct1), X2 (w2(y2), commit at ct2) and X3 (w3(y3), commit at ct3)]
  16. 16. Inconsistent Timeslice <ul><li>Schedule (initially x=0, y=0, z=0), by time: 1: X1 fixes t_current(X1); 2: w1(x=1); 3: X2 fixes t_current(X2); 4: w2(y=1); 5: c2; 6: r1(y=1); 7: w1(z=2); 8: c1 </li></ul><ul><li>SS2PL accepts the schedule above </li></ul><ul><li>X1 reads y from X2 (hence, c2 < c1) => serialization X2 < X1 </li></ul><ul><li>timeslice(2) = {(x,1), (y,0), (z,2)}, when taken after time 8, is transaction-inconsistent: this state has never been current </li></ul><ul><li>Reason: t_X2 > t_X1 although X2 < X1 </li></ul>
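The inconsistent result can be reproduced by stamping each write with the current time its transaction fixed (1 for X1, 3 for X2) and then taking the timeslice at 2. This is a sketch; the `versions` dictionary, the timestamp 0 for the initial values and the helper `timeslice` are illustrative assumptions.

```python
def timeslice(versions, t):
    """Return, per item, the value of the latest version with timestamp <= t."""
    result = {}
    for item, history in versions.items():
        visible = [(ts, value) for ts, value in history if ts <= t]
        if visible:
            result[item] = max(visible, key=lambda version: version[0])[1]
    return result


# Initial values carry timestamp 0; X1's writes carry 1, X2's write carries 3.
versions = {
    "x": [(0, 0), (1, 1)],  # w1(x=1)
    "y": [(0, 0), (3, 1)],  # w2(y=1)
    "z": [(0, 0), (1, 2)],  # w1(z=2)
}
```

`timeslice(versions, 2)` contains X1's writes to x and z but not the y=1 that X1 itself read from X2, so it describes a state the database was never in.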
  17. 17. Unrepeatable Timeslice <ul><li>Schedule (initially y=0), by time: 1: X2 fixes t_current(X2); 2: X1 fixes t_current(X1); 3: timeslice1(t_current(X1)) sees y=0; 4: c1; 5: w2(y=1); 6: c2; 7: timeslice3(t_current(X1)) sees y=1; 8: c3 </li></ul><ul><li>writers after a timeslice have to commit with a later timestamp than that of the concurrent timeslicing transaction </li></ul>
  18. 18. Solution Requirements <ul><li>SS2PL remains the primary CC mechanism </li></ul><ul><ul><li>reduce the likelihood of transaction aborts </li></ul></ul><ul><li>If X has t_current = t, then X has started and not yet committed at time t </li></ul><ul><li>If X1 and X2 have t_current(X1) < t_current(X2), then </li></ul><ul><ul><li>X1 must not see X2's updates </li></ul></ul><ul><ul><li>there exists an equivalent serial schedule: X1 < X2 </li></ul></ul>
  19. 19. Algorithm Design <ul><li>each data item d has a write timestamp d.TT </li></ul><ul><li>and a read timestamp d.TR in volatile memory </li></ul><ul><li>reads define the lower transaction time bound t_l </li></ul><ul><ul><li>initially t_l := t_s (transaction start time) </li></ul></ul><ul><li>V_R (initially ∅): the transaction's volatile read set </li></ul><ul><li>V_I (initially ∅): the transaction's volatile write set (newly inserted versions) </li></ul><ul><li>t_X: timestamp of transaction X </li></ul>
  20. 20. Before t_X Assignment <ul><li>Read(d): /* sync t_X with conflicting write */ </li></ul><ul><ul><li>t_l := max{t_l, d.TT} /* prevent t_X ≤ d.TT */ </li></ul></ul><ul><ul><li>V_R := V_R ∪ {d} /* will have to update d.TR */ </li></ul></ul><ul><li>Write(d): /* sync t_X with conflicting write and read */ </li></ul><ul><ul><li>t_l := max{t_l, d.TR, d.TT} /* prevent t_X ≤ d.TT and t_X ≤ d.TR */ </li></ul></ul><ul><ul><li>V_I := V_I ∪ {d} /* will have to update d.TT */ </li></ul></ul>
  21. 21. Timestamp t_X Assignment <ul><li>if t_X is assigned because of a CURRENT_TIME request </li></ul><ul><ul><li>t_X := t_current /* safe because t_current > t_l */ </li></ul></ul><ul><li>if t_X is assigned immediately before COMMIT </li></ul><ul><ul><li>t_X := t_l++ /* smallest possible time greater than t_l */ </li></ul></ul>
  22. 22. "Who comes too late …" <ul><li>will be punished by the scheduler </li></ul><ul><li>Read(d): </li></ul><ul><ul><li>if t_X < d.TT then abort X </li></ul></ul><ul><ul><li>else V_R := V_R ∪ {d} /* as before */ </li></ul></ul><ul><li>Write(d): </li></ul><ul><ul><li>if t_X < max{d.TR, d.TT} then abort X </li></ul></ul><ul><ul><li>else V_I := V_I ∪ {d} /* as before */ </li></ul></ul>
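The rules from the last three slides (tracking t_l before assignment, fixing t_X, and aborting late accesses afterwards) can be combined into one sketch. The classes `Item`, `Transaction` and `Abort`, the integer timestamps and the `ValueError` for a clock that is not ahead of t_l are assumptions made for illustration; the abort tests use the strict comparisons exactly as written on the slide.

```python
class Abort(Exception):
    """Raised when the scheduler must abort a transaction."""


class Item:
    """A data item d with write timestamp d.TT and volatile read timestamp d.TR."""

    def __init__(self, tt=0, tr=0):
        self.tt = tt
        self.tr = tr


class Transaction:
    def __init__(self, start_time):
        self.t_l = start_time   # lower bound t_l, initially the start time t_s
        self.t_x = None         # transaction timestamp t_X, not yet assigned
        self.read_set = set()   # V_R
        self.write_set = set()  # V_I

    def read(self, d):
        if self.t_x is None:
            self.t_l = max(self.t_l, d.tt)   # keeps the later t_X above d.TT
        elif self.t_x < d.tt:
            raise Abort("read arrives too late")
        self.read_set.add(d)

    def write(self, d):
        bound = max(d.tr, d.tt)
        if self.t_x is None:
            self.t_l = max(self.t_l, bound)  # keeps the later t_X above d.TR and d.TT
        elif self.t_x < bound:
            raise Abort("write arrives too late")
        self.write_set.add(d)

    def current_time(self, clock):
        """Fix t_X on the first CURRENT_TIME request; later requests return the same value."""
        if self.t_x is None:
            if clock <= self.t_l:
                raise ValueError("the sketch assumes t_current > t_l")
            self.t_x = clock
        return self.t_x

    def assign_at_commit(self):
        """Without a CURRENT_TIME request, take the smallest time greater than t_l."""
        if self.t_x is None:
            self.t_x = self.t_l + 1
        return self.t_x
```

A transaction that never asks for the current time can always pick a timestamp just above everything it touched, so only transactions that fixed t_X early risk being aborted.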
  23. 23. Optimization I (Precision) <ul><li>user-specified current time precision allows for a broader range of acceptable timestamps </li></ul><ul><ul><li>e.g. the current year is the same "now" and on Dec 31st, 2003, 23:59:59.999 </li></ul></ul><ul><li>instead of t_X := t_current, choose t_X from (t_l, t_h], where t_h = max(t_current, p) </li></ul><ul><li>data accesses are allowed as before and may thus increase t_l </li></ul><ul><li>if t_l ≥ t_h then abort X </li></ul><ul><li>if X could be executed completely, t_X := t_l++ </li></ul>
  24. 24. Optimization II (RTT) <ul><li>no way to maintain d.TR for every data item in main memory </li></ul><ul><li>fixed-size hash table RTT: e.g. 1024 entries </li></ul><ul><li>D_i := {d | hash(d) = i} for i in 1 … 1024 </li></ul><ul><li>trade-off: RTT size vs. read timestamp accuracy </li></ul><ul><li>on Write(d), the RTT is checked immediately </li></ul><ul><ul><li>t_l := max{t_l, RTT[hash(d)], d.TT} </li></ul></ul><ul><li>Redefine V_R to be a 1024-bit bit vector with </li></ul><ul><ul><li>V_R[i] = 1, if d has been read and i = hash(d) </li></ul></ul><ul><ul><li>128 bytes of overhead to track accessed data items </li></ul></ul>
  25. 25. Commit Processing <ul><li>/* update volatile RTT */ for i := 1 to 1024 do if V_R[i] = 1 then RTT[i] := max{RTT[i], t_X} </li></ul><ul><li>/* timestamp data, part of transaction; either directly or via an X-id to t_X mapping */ for each d in V_I do d.TT := t_X </li></ul>
  26. 26. System Crashes <ul><li>Observation </li></ul><ul><ul><li>timestamping for committed X is safe </li></ul></ul><ul><ul><li>the RTT is lost in the crash, and </li></ul></ul><ul><ul><ul><li>so are the crash-interrupted X which needed the RTT </li></ul></ul></ul><ul><ul><ul><li>committed transactions do not need the RTT </li></ul></ul></ul><ul><ul><li>last commit time is before crash time </li></ul></ul><ul><ul><li>each new X will start and commit after crash time </li></ul></ul><ul><li>Recovery Action </li></ul><ul><ul><li>RTT[i] := last commit time </li></ul></ul><ul><ul><li>conservative read-write sync detection w/o penalty </li></ul></ul>
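The read-timestamp table, its update at commit and the recovery action can be sketched together. The class `ReadTimestampTable` and its method names are hypothetical; Python's built-in `hash` stands in for the unspecified hash function, and slots are indexed from 0 rather than from 1 as on the slides.

```python
RTT_SIZE = 1024  # number of slots, as in the slides' example


class ReadTimestampTable:
    def __init__(self, size=RTT_SIZE, initial=0):
        self.slots = [initial] * size

    def slot(self, key):
        """Map a data item key to its slot; all items of one slot share a read timestamp."""
        return hash(key) % len(self.slots)

    def bound_for_write(self, key, t_l, tt):
        """Write(d): t_l := max{t_l, RTT[hash(d)], d.TT}."""
        return max(t_l, self.slots[self.slot(key)], tt)

    def commit(self, read_bits, t_x):
        """Raise every slot whose bit is set in the read bit vector V_R to at least t_X."""
        for i, bit in enumerate(read_bits):
            if bit:
                self.slots[i] = max(self.slots[i], t_x)

    def recover(self, last_commit_time):
        """After a crash, conservatively set every slot to the last commit time."""
        self.slots = [last_commit_time] * len(self.slots)
```

Two keys that collide in one slot share a read timestamp, so a write to one of them may be pushed to a later time than strictly necessary; that is the size versus accuracy trade-off named on the RTT slide.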
  27. 27. Summary <ul><li>transaction-consistent view on historical data </li></ul><ul><ul><li>timestamp order consistent with transaction serialization order </li></ul></ul><ul><li>Simple timestamp selection at commit time </li></ul><ul><li>Solution for distributed transactions with 2PC </li></ul><ul><li>Solution for "CURRENT_TIME" requests </li></ul>
  28. 28. Outlook <ul><li>Impact on multiversion concurrency control </li></ul><ul><ul><li>Read-Only Multiversion, Snapshot Isolation [Weikum + Vossen 01] </li></ul></ul>
  29. 29. Questions
