ABA Problem
ABA problem and solution

ABA Problem Presentation Transcript

  • 1. ABA Problem Dr. C.V. Suresh Babu
  • 2. Introduction
      • Queues are everywhere in parallel applications and operating systems
      • Many researchers have proposed queues
          – Hwang and Briggs
          – Gottlieb
          – Massalin
          – et al., etc.
      • Queue performance can be critical to operating system performance
          – Scheduling queues
          – Free memory lists
          – Many other critical kernel operations
  • 3. Concurrent FIFO Queue algorithms
      • Blocking algorithms risk performance degradation
          – A process can be delayed or halted at inopportune moments
              • Scheduling preemption
              • Page faults
              • Cache misses
          – Slow processes can prevent faster ones from completing indefinitely
      • Non-blocking algorithms must solve the ABA problem
          – During contention, some process will complete within a given number of operations
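The non-blocking style described on this slide is built around a compare-and-swap (CAS) retry loop. The sketch below is only an illustration: Python has no hardware CAS, so the hypothetical `AtomicCell` class uses an internal lock as a stand-in for a single atomic instruction, and `increment` is an assumed helper name, not something from the slides.

```python
import threading


class AtomicCell:
    """Hypothetical stand-in for a memory word with an atomic CAS instruction."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # models hardware atomicity only

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        """Store `new` only if the cell still holds `expected`."""
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


def increment(cell, times):
    """Retry loop: a failed CAS means another thread made progress first."""
    for _ in range(times):
        while True:
            old = cell.load()
            if cell.cas(old, old + 1):
                break


counter = AtomicCell(0)
workers = [threading.Thread(target=increment, args=(counter, 1000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(counter.load())  # 4000: no increment is lost
```

A thread whose CAS fails simply re-reads and tries again, so a delayed thread never prevents the others from finishing their own updates.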
  • 4–13. ABA problem
      Pseudocode:
          Pop() {
            loop
              value = SM
              newVal = value - 1
              data = Stack Data
              CAS(&SM, value, newVal)
              break        (taken when the CAS succeeds)
            return data
          }
          Push(d) {
            loop
              value = SM
              newVal = value + 1
              Stack Data = d
              CAS(&SM, value, newVal)
              break        (taken when the CAS succeeds)
          }
      Initial state: SM = 5, top of stack holds x
      Trace (time runs downward):
          1. THREAD1 calls v1 = Pop(): value = 5, newVal = 4, data = x; it is delayed before its CAS
          2. THREAD2 calls v2 = Pop(): value = 5, newVal = 4, data = x; CAS(&SM, value, newVal) succeeds; SM = 4; v2 = x
          3. THREAD2 calls Push(z): value = 4, newVal = 5, Stack Data = z; CAS(&SM, value, newVal) succeeds; SM = 5, top of stack holds z
          4. THREAD1 resumes: CAS(&SM, value, newVal) with value = 5 succeeds; SM = 4; v1 = x
      • CAS should fail but it succeeds
      • Thread1 has Thread2's data
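The trace on these slides can be replayed deterministically. The sketch below is a single-threaded simulation under stated assumptions: `SharedWord` is a hypothetical stand-in for the CAS-able word SM, the stack is modelled as a list of slots indexed by SM, and the lower slot contents ("a" to "d") are invented for illustration because the slides only show "…".

```python
class SharedWord:
    """Hypothetical stand-in for the word SM that CAS operates on."""

    def __init__(self, value):
        self.value = value

    def cas(self, expected, new):
        if self.value == expected:
            self.value = new
            return True
        return False


# slots[1..5]; slots[5] is the top of the stack and holds "x" as on the slides.
slots = [None, "a", "b", "c", "d", "x"]
sm = SharedWord(5)

# THREAD1 starts Pop() and is delayed just before its CAS.
t1_value = sm.value           # 5
t1_new = t1_value - 1         # 4
t1_data = slots[t1_value]     # "x"

# THREAD2 runs a complete Pop().
t2_value = sm.value           # 5
t2_data = slots[t2_value]     # "x"
t2_pop_ok = sm.cas(t2_value, t2_value - 1)   # SM becomes 4
v2 = t2_data

# THREAD2 runs a complete Push("z").
t2_value = sm.value           # 4
slots[t2_value + 1] = "z"
t2_push_ok = sm.cas(t2_value, t2_value + 1)  # SM is 5 again (A -> B -> A)

# THREAD1 resumes with its stale snapshot; the CAS cannot see the change.
t1_ok = sm.cas(t1_value, t1_new)
v1 = t1_data

print(t1_ok, v1, v2, sm.value)  # True x x 4
```

Both threads return x, and the pushed z is left above the new top of the stack, which is the failure the slides label "CAS should fail but it succeeds".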
  • 14. Solutions for ABA problem
      • Cache Kernel
          – Add version # to data structures
          – Increment # during every CAS instruction
      • LL/SC
          – Fail if cache line has been written to
  • 15–16. Solution for ABA problem
      Pseudocode (the CAS now also carries a version number):
          Pop() {
            loop
              value = SM
              newVal = value - 1
              data = Stack Data
              DCAS(&SM, value, <ver++, newVal>)
              break        (taken when the DCAS succeeds)
            return data
          }
          Push(d) {
            loop
              value = SM
              newVal = value + 1
              Stack Data = d
              DCAS(&SM, value, <ver++, newVal>)
              break        (taken when the DCAS succeeds)
          }
      Initial state: SM = 5, top of stack holds x
      Trace (time runs downward):
          1. THREAD1 calls v1 = Pop(): value = 5, newVal = 4, data = x; it is delayed before its DCAS
          2. THREAD2 calls v2 = Pop(): value = 5, newVal = 4, data = x; DCAS(&SM, value, ver, newVal) succeeds; v2 = x
          3. THREAD2 calls Push(z): value = 4, newVal = 5, Stack Data = z; DCAS(&SM, value, ver, newVal) succeeds; SM = 5, top of stack holds z
          4. THREAD1 resumes: DCAS(&SM, value, ver, newVal) will not incorrectly succeed (ver != ver+2)
          5. THREAD1 repeats the loop: value = 5, newVal = 4, data = z; DCAS succeeds; SM = 4; v1 = z
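The versioned fix can be replayed the same way. This is a sketch under assumptions rather than the slides' exact mechanism: the DCAS is modelled as one CAS over a `(value, version)` pair, `SharedWord` is a hypothetical stand-in for the shared word, and the lower stack slots are invented placeholders.

```python
class SharedWord:
    """Hypothetical stand-in for a word updated with a single atomic CAS."""

    def __init__(self, value):
        self.value = value

    def cas(self, expected, new):
        if self.value == expected:
            self.value = new
            return True
        return False


slots = [None, "a", "b", "c", "d", "x"]   # slots[5] is the top ("x")
sm = SharedWord((5, 0))                   # (SM value, version)

# THREAD1 starts Pop() and is delayed before its CAS.
t1_snap = sm.value                        # (5, 0)
t1_data = slots[t1_snap[0]]               # "x"

# THREAD2: complete Pop(); every successful CAS bumps the version.
snap = sm.value
v2 = slots[snap[0]]
sm.cas(snap, (snap[0] - 1, snap[1] + 1))  # now (4, 1)

# THREAD2: complete Push("z").
snap = sm.value
slots[snap[0] + 1] = "z"
sm.cas(snap, (snap[0] + 1, snap[1] + 1))  # now (5, 2)

# THREAD1 resumes: the value matches but the version does not, so CAS fails.
t1_first = sm.cas(t1_snap, (t1_snap[0] - 1, t1_snap[1] + 1))

# THREAD1 repeats the loop with a fresh snapshot and succeeds.
t1_snap = sm.value                        # (5, 2)
t1_data = slots[t1_snap[0]]               # "z"
t1_second = sm.cas(t1_snap, (t1_snap[0] - 1, t1_snap[1] + 1))
v1 = t1_data

print(t1_first, t1_second, v1, v2, sm.value)  # False True z x (4, 3)
```

The version counter turns the A, B, A sequence into three distinct states, so the stale snapshot is rejected and THREAD1 ends up with z rather than a second copy of x.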
  • 17. Correctness Properties
      1. The linked list is always connected
      2. Nodes are only inserted after the last node
      3. Nodes are only deleted from the beginning
      4. Head always points to the first node
      5. Tail always points to a node in the list
  • 18. Queue # 1 • Non-Blocking Concurrent Queue – enqueue()
  • 19–34. Non-blocking queue: data structures, initialize() and enqueue()
      struct pointer_t { node_t *ptr; uint count }
      struct node_t    { data_type value; pointer_t next }
      struct queue_t   { pointer_t Head; pointer_t Tail }

      initialize(Q: pointer to queue_t)
        node = new_node()
        node->next.ptr = NULL
        Q->Head.ptr = Q->Tail.ptr = node

      enqueue(Q: pointer to queue_t, value: data_type)
        node = new_node()
        node->value = value
        node->next.ptr = NULL
        loop
          tail = Q->Tail
          next = tail.ptr->next
          if tail == Q->Tail
            if next.ptr == NULL
              if CAS(&tail.ptr->next, next, <node, next.count+1>)
                break
              endif
            else
              CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)
            endif
          endif
        endloop
        CAS(&Q->Tail, tail, <node, tail.count+1>)

      Walk-through of enqueue(myQueue, D1) on a freshly initialized myQueue:
          1. initialize() creates one node whose next is [NULL]; Head and Tail both point to it
          2. enqueue() allocates a new node holding D1, with next [NULL]
          3. tail = Q->Tail and next = tail.ptr->next are read; next is [NULL]
          4. tail == Q->Tail still holds and next.ptr == NULL, so the first CAS links the D1 node after the last node and increments that pointer's count (+1)
          5. The loop exits and the final CAS swings Tail to the D1 node, incrementing Tail's count (+1)
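The enqueue() pseudocode on these slides translates fairly directly. The following is a minimal sketch, not the authors' implementation: `Cell` is a hypothetical lock-based stand-in for a CAS-able `pointer_t` word, `Pointer` mirrors the `<ptr, count>` pair, and `contents` is an assumed helper added only to inspect the list.

```python
import threading
from collections import namedtuple

Pointer = namedtuple("Pointer", ["ptr", "count"])  # mirrors pointer_t


class Cell:
    """Hypothetical stand-in for a pointer_t word that supports atomic CAS."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # models hardware atomicity only

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, value=None):
        self.value = value
        self.next = Cell(Pointer(None, 0))


class Queue:
    def __init__(self):
        dummy = Node()  # initialize(): Head and Tail share one dummy node
        self.head = Cell(Pointer(dummy, 0))
        self.tail = Cell(Pointer(dummy, 0))


def enqueue(q, value):
    node = Node(value)
    while True:
        tail = q.tail.load()
        nxt = tail.ptr.next.load()
        if tail == q.tail.load():          # snapshot still consistent?
            if nxt.ptr is None:            # Tail really is the last node
                if tail.ptr.next.cas(nxt, Pointer(node, nxt.count + 1)):
                    break
            else:                          # Tail lags behind: help advance it
                q.tail.cas(tail, Pointer(nxt.ptr, tail.count + 1))
    q.tail.cas(tail, Pointer(node, tail.count + 1))


def contents(q):
    """Walk the list after the dummy node (inspection helper)."""
    out, node = [], q.head.load().ptr.next.load().ptr
    while node is not None:
        out.append(node.value)
        node = node.next.load().ptr
    return out


my_queue = Queue()
enqueue(my_queue, "D1")
enqueue(my_queue, "D2")
print(contents(my_queue))  # ['D1', 'D2']
```

Because Python is garbage-collected, the counts are not needed for memory safety here; they are kept only to mirror the counted pointers in the pseudocode.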
  • 35. Concurrent enqueues • Suppose two processes call enqueue() at the same time
  • 36–38. Concurrent enqueues on myQueue (already holding D1); both processes run the enqueue() pseudocode from the preceding slides
      • Process 1: Enqueue(myQueue, ABC)
      • Process 2: Enqueue(myQueue, XYZ)
      Trace:
          1. Each process allocates its node (ABC [NULL], XYZ [NULL]) and reads the same tail and next
          2. Process 2's CAS(&tail.ptr->next, next, <node, next.count+1>) succeeds: XYZ is linked after D1 and the count is incremented (+1)
          3. Process 1's CAS on the same location fails because next has changed; on its next pass next.ptr is no longer NULL, so it executes CAS(&Q->Tail, tail, <next.ptr, tail.count+1>) to advance Tail before trying again
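One possible interleaving of the two enqueues can be stepped through by hand. This sketch is single-threaded and illustrative only: `Cell` is a hypothetical CAS stand-in, the names `p1_*` and `p2_*` are invented for the two processes, and the assumption that Process 2 is delayed before swinging Tail is mine, chosen to show the helping step.

```python
from collections import namedtuple

Pointer = namedtuple("Pointer", ["ptr", "count"])  # mirrors pointer_t


class Cell:
    """Hypothetical stand-in for a CAS-able pointer_t word (single-threaded replay)."""

    def __init__(self, value):
        self.value = value

    def load(self):
        return self.value

    def cas(self, expected, new):
        if self.value == expected:
            self.value = new
            return True
        return False


class Node:
    def __init__(self, value):
        self.value = value
        self.next = Cell(Pointer(None, 0))


d1, abc, xyz = Node("D1"), Node("ABC"), Node("XYZ")
tail_cell = Cell(Pointer(d1, 0))          # Q->Tail points at D1

# Both processes read the same tail and next.
p1_tail = tail_cell.load(); p1_next = p1_tail.ptr.next.load()
p2_tail = tail_cell.load(); p2_next = p2_tail.ptr.next.load()

# Process 2 links XYZ first, then is delayed before swinging Tail.
p2_linked = p2_tail.ptr.next.cas(p2_next, Pointer(xyz, p2_next.count + 1))

# Process 1's link attempt uses a stale `next`, so it fails.
p1_linked = p1_tail.ptr.next.cas(p1_next, Pointer(abc, p1_next.count + 1))

# Process 1 loops: next.ptr is not NULL, so it helps advance Tail.
p1_tail = tail_cell.load(); p1_next = p1_tail.ptr.next.load()
helped = tail_cell.cas(p1_tail, Pointer(p1_next.ptr, p1_tail.count + 1))

# Process 1 loops again and now links ABC after XYZ, then swings Tail.
p1_tail = tail_cell.load(); p1_next = p1_tail.ptr.next.load()
p1_retry = p1_tail.ptr.next.cas(p1_next, Pointer(abc, p1_next.count + 1))
tail_cell.cas(p1_tail, Pointer(abc, p1_tail.count + 1))

# Process 2 resumes; its final CAS on Tail is stale and fails harmlessly.
p2_swing = tail_cell.cas(p2_tail, Pointer(xyz, p2_tail.count + 1))

print(p2_linked, p1_linked, helped, p1_retry, p2_swing)  # True False True True False
```

The list ends up as D1, XYZ, ABC with Tail on ABC, even though Process 2 never completed its own Tail update.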
  • 39. Queue #1 (cont.) • Non-Blocking Concurrent Queue – dequeue()
  • 40–48. Non-blocking queue: dequeue()
      struct pointer_t { node_t *ptr; uint count }
      struct node_t    { data_type value; pointer_t next }
      struct queue_t   { pointer_t Head; pointer_t Tail }

      dequeue(Q: pointer to queue_t, pvalue: pointer to data_type): boolean
        loop
          head = Q->Head
          tail = Q->Tail
          next = head.ptr->next
          if head == Q->Head
            if head.ptr == tail.ptr
              if next.ptr == NULL
                return FALSE
              endif
              CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)
            else
              # Read value before CAS, otherwise another
              # dequeue might free the next node
              *pvalue = next.ptr->value
              if CAS(&Q->Head, head, <next.ptr, head.count+1>)
                break
              endif
            endif
          endif
        endloop
        free(head.ptr)
        return TRUE

      Walk-through of dequeue(myQueue, pvalue) on myQueue holding D1 (Tail already points at the D1 node):
          1. head = Q->Head, tail = Q->Tail and next = head.ptr->next are read
          2. head == Q->Head still holds and head.ptr != tail.ptr, so the else branch runs
          3. *pvalue receives D1 from next.ptr->value
          4. The CAS swings Head to the D1 node and increments Head's count (+1)
          5. free(head.ptr) releases the old first node; the D1 node remains as the new first node
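The dequeue() pseudocode can be sketched in the same style. Assumptions are stated in the code: `Cell` is a hypothetical lock-based CAS stand-in, the queue state (dummy node followed by D1) is built by hand to match the walk-through, the function returns a `(success, value)` tuple instead of writing through `pvalue`, and `free(head.ptr)` is left to Python's garbage collector.

```python
import threading
from collections import namedtuple

Pointer = namedtuple("Pointer", ["ptr", "count"])  # mirrors pointer_t


class Cell:
    """Hypothetical stand-in for a pointer_t word that supports atomic CAS."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # models hardware atomicity only

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, value=None):
        self.value = value
        self.next = Cell(Pointer(None, 0))


class Queue:
    def __init__(self):
        dummy = Node()
        self.head = Cell(Pointer(dummy, 0))
        self.tail = Cell(Pointer(dummy, 0))


def dequeue(q):
    """Return (True, value) on success or (False, None) if the queue is empty."""
    while True:
        head = q.head.load()
        tail = q.tail.load()
        nxt = head.ptr.next.load()
        if head == q.head.load():              # snapshot still consistent?
            if head.ptr is tail.ptr:
                if nxt.ptr is None:
                    return False, None         # queue is empty
                # Tail lags behind: help advance it, then retry.
                q.tail.cas(tail, Pointer(nxt.ptr, tail.count + 1))
            else:
                value = nxt.ptr.value          # read before the CAS
                if q.head.cas(head, Pointer(nxt.ptr, head.count + 1)):
                    return True, value         # old head is garbage-collected


# Hand-built state matching the walk-through: dummy -> D1, Tail on D1.
my_queue = Queue()
d1 = Node("D1")
my_queue.head.load().ptr.next = Cell(Pointer(d1, 1))
my_queue.tail = Cell(Pointer(d1, 1))

first = dequeue(my_queue)
second = dequeue(my_queue)
print(first, second)  # (True, 'D1') (False, None)
```

After the first call the D1 node plays the role of the new dummy node, which is why the second call sees an empty queue.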
  • 49. Concurrent dequeues • Suppose two processes call dequeue() at the same time
  • 50. Concurrent dequeues: two processes call dequeue(myQueue, pvalue) on myQueue holding D1
      • Both run the dequeue() pseudocode from slides 40–48, which ends with
            free(head.ptr)
            return TRUE
      • Only one process's CAS(&Q->Head, head, <next.ptr, head.count+1>) can succeed for a given head; the other fails the CAS and repeats the loop with fresh values
  • 51. Queue #2 • Two-lock Concurrent Queue
  • 52. Two-lock queue pseudocode
      struct node_t  { data_type value; node_t *next }
      struct queue_t { node_t *Head; node_t *Tail; lock_type H_lock; lock_type T_lock }

      initialize(Q: pointer to queue_t)
        node = new_node()
        node->next = NULL
        Q->Head = Q->Tail = node
        Q->H_lock = Q->T_lock = FREE

      enqueue(Q: pointer to queue_t, value: data_type)
        node = new_node()
        node->value = value
        node->next = NULL
        lock(&Q->T_lock)
          Q->Tail->next = node
          Q->Tail = node
        unlock(&Q->T_lock)

      dequeue(Q: pointer to queue_t, pvalue: pointer to data_type): boolean
        lock(&Q->H_lock)
          node = Q->Head
          new_head = node->next
          if new_head == NULL
            unlock(&Q->H_lock)
            return FALSE
          endif
          *pvalue = new_head->value
          Q->Head = new_head
        unlock(&Q->H_lock)
        free(node)
        return TRUE

      • The algorithms have the same general structure, only different data types
      • No loops; 'busy waiting' on the locks instead
      • Only dequeues access the Head lock
      • Only enqueues access the Tail lock
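A minimal sketch of the two-lock queue follows, assuming Python's `threading.Lock` in place of the slide's `lock_type` (it blocks rather than busy-waits) and returning a `(success, value)` tuple instead of writing through `pvalue`. The class and method names are illustrative.

```python
import threading


class Node:
    def __init__(self, value=None):
        self.value = value
        self.next = None


class TwoLockQueue:
    def __init__(self):
        dummy = Node()                   # Head and Tail start on one dummy node
        self.head = dummy
        self.tail = dummy
        self.h_lock = threading.Lock()   # taken only by dequeue
        self.t_lock = threading.Lock()   # taken only by enqueue

    def enqueue(self, value):
        node = Node(value)
        with self.t_lock:
            self.tail.next = node        # link after the last node
            self.tail = node             # swing Tail

    def dequeue(self):
        with self.h_lock:
            new_head = self.head.next
            if new_head is None:
                return False, None       # queue is empty
            value = new_head.value
            self.head = new_head         # old dummy is garbage-collected
            return True, value


q = TwoLockQueue()
received = []


def producer():
    for i in range(1000):
        q.enqueue(i)


def consumer():
    while len(received) < 1000:
        ok, value = q.dequeue()
        if ok:
            received.append(value)


workers = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(received == list(range(1000)))  # True
```

The dummy node is what lets an enqueue and a dequeue proceed at the same time: enqueuers only touch Tail, dequeuers only touch Head.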
  • 53. Performance Parameters
      • Net execution time for one million enqueue/dequeue pairs
      • 12-processor Silicon Graphics Challenge multiprocessor
      • Algorithms compiled using the highest optimization level
      • Including many hand optimizations
  • 54. [Figures: performance results, one graph per configuration]
      • Dedicated multiprocessor
      • Multiprogrammed system with 3 processes per processor
      • Multiprogrammed system with 2 processes per processor
  • 55. Conclusion
      • NBS (non-blocking synchronization) is the clear winner for multiprogrammed multiprocessor systems
      • Above 5 processors, use the new non-blocking queue
      • If the hardware only supports test-and-set, use the two-lock queue
      • For two or fewer processors, use a single-lock algorithm for queues