# ABA Problem

ABA problem and solution

Published in: Education

### Transcript

• 1. ABA Problem Dr. C.V. Suresh Babu
• 2. Introduction
  • Queues are everywhere in parallel applications and operating systems
  • Many researchers have proposed queues
    – Hwang and Briggs
    – Gottlieb
    – Massalin
    – et al., etc.
  • Queue performance can be critical to operating system performance
    – Scheduling queues
    – Free memory lists
    – Many other critical kernel operations
• 3. Concurrent FIFO Queue algorithms
  • Blocking algorithms risk performance degradation
    – A process can be delayed or halted at inopportune moments
      • Scheduling preemption
      • Page faults
      • Cache misses
    – Slow processes can prevent faster ones from completing indefinitely
  • Non-blocking algorithms must solve the ABA problem
    – During contention, some process will complete within a given number of operations
• 4-13. ABA problem (animation: one timeline built up over ten slides)
  Pseudocode shown on every frame (SM is the shared word the CAS operates on; Stack Data is the slot it designates):
      Pop() {
        loop
          value = SM
          newVal = value - 1
          data = Stack Data
          if CAS(&SM, value, newVal)
            break
        return data
      }
      Push(d) {
        loop
          value = SM
          newVal = value + 1
          Stack Data = d
          if CAS(&SM, value, newVal)
            break
      }
  Timeline, one step per frame:
    – 4. The stack starts with SM = 5 and x on top. THREAD1 calls v1 = Pop() and THREAD2 calls v2 = Pop(); each reads value = 5, newVal = 4, data = x
    – 5. THREAD2's CAS(&SM, value, newVal) succeeds: SM = 4
    – 6. THREAD2's Pop() returns: v2 = x
    – 7. THREAD2 calls Push(z)
    – 8. THREAD2 reads value = 4
    – 9. THREAD2 computes newVal = 5
    – 10. THREAD2's CAS succeeds: SM = 5, with z on top
    – 11. THREAD1 finally runs its CAS(&SM, value, newVal) with the stale value = 5, newVal = 4
    – 12. The CAS succeeds: SM = 4 and v1 = x
    – 13. The CAS should fail but it succeeds: THREAD1 has THREAD2's data
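The timeline above can be replayed deterministically. The following is a minimal sketch, not the presenter's code: Python has no hardware CAS, so `AtomicCell` is a hypothetical stand-in that makes compare-and-swap atomic with a lock, and `ArrayStack`, `pop_read`, and the five sample items are assumptions chosen to match the SM = 5 starting state on the slides. THREAD1's stall is modelled by splitting its `Pop()` into a read step and a later CAS.

```python
import threading


class AtomicCell:
    """Hypothetical stand-in for a memory word with CAS (a lock makes it atomic)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        # Store `new` only if the cell still holds `expected`.
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class ArrayStack:
    def __init__(self, items, capacity=8):
        self.slots = list(items) + [None] * (capacity - len(items))
        self.sm = AtomicCell(len(items))  # SM counts items; top is slots[SM - 1]

    def pop_read(self):
        # First half of Pop(): value = SM, newVal = value - 1, data = Stack Data
        value = self.sm.load()
        return value, value - 1, self.slots[value - 1]

    def pop(self):
        while True:
            value, new_val, data = self.pop_read()
            if self.sm.cas(value, new_val):
                return data

    def push(self, d):
        while True:
            value = self.sm.load()
            self.slots[value] = d  # written before the CAS, as on the slides
            if self.sm.cas(value, value + 1):
                return


stack = ArrayStack(["a", "b", "c", "d", "x"])   # SM = 5, x on top
t1_value, t1_new, t1_data = stack.pop_read()    # THREAD1 reads, then stalls
v2 = stack.pop()                                # THREAD2: SM 5 -> 4, v2 = x
stack.push("z")                                 # THREAD2: SM 4 -> 5, z on top
t1_cas_ok = stack.sm.cas(t1_value, t1_new)      # THREAD1 resumes with stale data
v1 = t1_data
print(t1_cas_ok, v1, v2, stack.sm.load())       # prints: True x x 4
```

Both threads end up holding x, and z is silently dropped, because the CAS only compares the value 5 and cannot tell that SM went 5, 4, 5 in between.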
• 14. Solutions for ABA problem
  • Cache Kernel
    – Add version # to data structures
    – Increment # during every CAS instruction
  • LL/SC
    – Fail if cache line has been written to
• 15-16. Solution for ABA problem (animation: the same timeline with a version number)
  Pseudocode: Pop() and Push(d) are unchanged except that the update becomes DCAS(&SM, value, <ver++, newVal>), which compares both SM and its version and increments the version on success.
  Timeline:
    – 15. THREAD1 and THREAD2 both start Pop() with value = 5, newVal = 4, data = x. THREAD2 completes its Pop() (v2 = x) and then Push(z), each through a successful DCAS, leaving SM = 5 with z on top and the version advanced twice. THREAD1's DCAS(&SM, value, ver, newVal) will not incorrectly succeed (ver != ver+2)
    – 16. THREAD1 retries its loop, now reads data = z, and its DCAS succeeds: SM = 4 and v1 = z
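The version-number fix can be sketched the same way. This is an illustration under assumptions, not the Cache Kernel implementation: `VersionedCell` is a hypothetical helper that stores a `(version, value)` pair and uses a lock to emulate one atomic compare of both fields, which is the role DCAS plays on the slides. `VersionedStack` and the sample items are likewise made up for the demo.

```python
import threading


class VersionedCell:
    """Hypothetical (version, value) word; cas() compares both fields atomically."""

    def __init__(self, value):
        self._state = (0, value)
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._state

    def cas(self, expected_state, new_value):
        # Succeeds only if neither the version nor the value has changed,
        # and bumps the version on every successful update.
        with self._lock:
            if self._state == expected_state:
                self._state = (expected_state[0] + 1, new_value)
                return True
            return False


class VersionedStack:
    def __init__(self, items, capacity=8):
        self.slots = list(items) + [None] * (capacity - len(items))
        self.sm = VersionedCell(len(items))

    def pop_read(self):
        state = self.sm.load()
        value = state[1]
        return state, value - 1, self.slots[value - 1]

    def pop(self):
        while True:
            state, new_val, data = self.pop_read()
            if self.sm.cas(state, new_val):
                return data

    def push(self, d):
        while True:
            state = self.sm.load()
            self.slots[state[1]] = d
            if self.sm.cas(state, state[1] + 1):
                return


stack = VersionedStack(["a", "b", "c", "d", "x"])
t1_state, t1_new, t1_data = stack.pop_read()    # THREAD1 sees (ver 0, SM 5), stalls
v2 = stack.pop()                                # THREAD2: state becomes (1, 4)
stack.push("z")                                 # THREAD2: state becomes (2, 5)
first_try_ok = stack.sm.cas(t1_state, t1_new)   # stale version 0 != 2, so it fails
v1 = stack.pop()                                # THREAD1 retries and reads z
print(first_try_ok, v1, v2, stack.sm.load())    # prints: False z x (3, 4)
```

The value of SM is 5 both times THREAD1 looks, but the version has moved from 0 to 2, so the stale update is rejected and the retry returns z.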
• 17. Correctness Properties
  1. The linked list is always connected
  2. Nodes are only inserted after the last node
  3. Nodes are only deleted from the beginning
  4. Head always points to the first node
  5. Tail always points to a node in the list
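Properties 1 and 5 can be checked on a quiescent snapshot of the list. The sketch below is only a debugging aid under assumptions: `Node`, `nodes_from`, and `tail_lag` are hypothetical names, the list uses plain `next` references rather than counted pointers, and properties 2 and 3 concern operations over time, so they are not covered here.

```python
class Node:
    def __init__(self, value=None):
        self.value = value
        self.next = None


def nodes_from(head):
    """Return the nodes reachable from head, raising if the chain loops."""
    seen, order = set(), []
    node = head
    while node is not None:
        if id(node) in seen:
            raise ValueError("list contains a cycle")
        seen.add(id(node))
        order.append(node)
        node = node.next
    return order


def tail_lag(head, tail):
    """Nodes that follow tail, or None if tail is not in the list (property 5 broken)."""
    order = nodes_from(head)
    for index, node in enumerate(order):
        if node is tail:
            return len(order) - 1 - index
    return None


# dummy -> D1 -> D2
dummy, d1, d2 = Node(), Node("D1"), Node("D2")
dummy.next, d1.next = d1, d2

lag_ok = tail_lag(dummy, d2)                  # Tail at the last node
lag_behind = tail_lag(dummy, d1)              # Tail one node behind, still legal
lag_broken = tail_lag(dummy, Node("stray"))   # Tail outside the list
print(lag_ok, lag_behind, lag_broken)         # prints: 0 1 None
```

A lag of 1 is a legal intermediate state: property 5 only requires Tail to point at some node in the list, not necessarily the last one.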
• 18. Queue #1 • Non-Blocking Concurrent Queue – enqueue()
• 19-34. Non-blocking enqueue (animation: the same pseudocode on every slide, next to a diagram of myQueue that is built up frame by frame)
      struct pointer_t { node_t *ptr; uint count }
      struct node_t { data_type value; pointer_t next }
      struct queue_t { pointer_t Head; pointer_t Tail }

      initialize(Q: pointer to queue_t)
        node = new_node()
        node->next.ptr = NULL
        Q->Head.ptr = Q->Tail.ptr = node

      enqueue(Q: pointer to queue_t, value: data type)
        node = new_node()
        node->value = value
        node->next.ptr = NULL
        loop
          tail = Q->Tail
          next = tail.ptr->next
          if tail == Q->Tail
            if next.ptr == NULL
              if CAS(&tail.ptr->next, next, <node, next.count+1>)
                break
              endif
            else
              CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)
            endif
          endif
        endloop
        CAS(&Q->Tail, tail, <node, tail.count+1>)
  Walkthrough of the diagram:
    – 20-23. initialize(myQueue): Head and Tail both point to a single dummy node whose next.ptr is NULL
    – 24-27. enqueue(myQueue, D1) allocates a node holding D1 and sets its next.ptr to NULL
    – 28-30. The loop snapshots tail = Q->Tail and next = tail.ptr->next, then rechecks tail == Q->Tail
    – 31-33. next.ptr is NULL, so CAS(&tail.ptr->next, ...) links D1 after the dummy node and increments that pointer's count (+1); the loop breaks
    – 34. The final CAS swings Q->Tail to D1 and increments its count (+1)
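The enqueue steps can be mirrored in Python as a sketch. Everything named here is an assumption for illustration: `Pointer` models the `<ptr, count>` pair of `pointer_t`, and `AtomicPointer` is a hypothetical wrapper whose lock emulates a CAS wide enough to cover both fields.

```python
import threading
from collections import namedtuple

Pointer = namedtuple("Pointer", ["ptr", "count"])  # mirrors pointer_t


class AtomicPointer:
    """Hypothetical <ptr, count> word with a lock-emulated CAS over both fields."""

    def __init__(self, ptr=None, count=0):
        self._value = Pointer(ptr, count)
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, value=None):
        self.value = value
        self.next = AtomicPointer()  # next.ptr starts as NULL (None)


class Queue:
    def __init__(self):
        dummy = Node()               # initialize(): one dummy node
        self.head = AtomicPointer(dummy)
        self.tail = AtomicPointer(dummy)


def enqueue(q, value):
    node = Node(value)
    while True:
        tail = q.tail.load()
        nxt = tail.ptr.next.load()
        if tail == q.tail.load():            # is the snapshot still consistent?
            if nxt.ptr is None:              # Tail really was the last node
                if tail.ptr.next.cas(nxt, Pointer(node, nxt.count + 1)):
                    break                    # node is linked in
            else:                            # Tail is lagging: help advance it
                q.tail.cas(tail, Pointer(nxt.ptr, tail.count + 1))
    # Try to swing Tail to the new node; a failure means another thread helped.
    q.tail.cas(tail, Pointer(node, tail.count + 1))


my_queue = Queue()
enqueue(my_queue, "D1")
enqueue(my_queue, "D2")
first = my_queue.head.load().ptr.next.load().ptr
print(first.value, my_queue.tail.load().ptr.value, my_queue.tail.load().count)
# prints: D1 D2 2
```

Python's garbage collector means nodes are never reused while a reference exists, so the counts are not strictly needed here; they are kept only to follow the pseudocode.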
• 35. Concurrent enqueues • Suppose two processes call enqueue() at the same time
• 36-38. Concurrent enqueues (animation over the enqueue() pseudocode of slide 19; myQueue already holds D1)
    – 36. Process 1 calls Enqueue(myQueue, ABC) and Process 2 calls Enqueue(myQueue, XYZ). Each allocates its node with next.ptr = NULL and takes the same snapshot: tail is the D1 node and next is NULL. Both reach CAS(&tail.ptr->next, next, <node, next.count+1>)
    – 37. Process 2's CAS succeeds: D1's next now points to XYZ with its count incremented (+1). Process 1's CAS on the same pointer fails because next has changed
    – 38. Process 1 loops again, finds next.ptr != NULL, and runs CAS(&Q->Tail, tail, <next.ptr, tail.count+1>) to help advance Tail to XYZ before retrying with ABC
• 39. Queue #1 (cont.) • Non-Blocking Concurrent Queue – dequeue()
• 40-48. Non-blocking dequeue (animation: the same pseudocode on every slide; structs and initialize() as on slide 19; myQueue holds the dummy node followed by D1)
      dequeue(Q: pointer to queue_t, pvalue: pointer to data type): boolean
        loop
          head = Q->Head
          tail = Q->Tail
          next = head.ptr->next
          if head == Q->Head
            if head.ptr == tail.ptr
              if next.ptr == NULL
                return FALSE
              endif
              CAS(&Q->Tail, tail, <next.ptr, tail.count+1>)
            else
              # Read value before CAS, otherwise another
              # dequeue might free the next node
              *pvalue = next.ptr->value
              if CAS(&Q->Head, head, <next.ptr, head.count+1>)
                break
              endif
            endif
          endif
        endloop
        free(head.ptr)
        return TRUE
  Walkthrough of the diagram:
    – 40. dequeue(myQueue, pvalue) is called
    – 41-44. The loop snapshots head, tail and next, then rechecks head == Q->Head
    – 45-47. *pvalue is read from the next node, and CAS(&Q->Head, ...) advances Head to that node and increments its count (+1)
    – 48. The old first node is freed; the node that held D1 is now the dummy node
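A matching sketch of the dequeue loop follows, under the same assumptions as any Python rendering of this pseudocode: `Pointer` and `AtomicPointer` are hypothetical helpers that emulate a double-width CAS with a lock, `dequeue` returns an `(ok, value)` pair instead of writing through `pvalue`, and there is no `free()` because Python reclaims the old node itself. The queue is built by hand with Tail deliberately left one node behind so that the helping branch runs.

```python
import threading
from collections import namedtuple

Pointer = namedtuple("Pointer", ["ptr", "count"])  # mirrors pointer_t


class AtomicPointer:
    """Hypothetical <ptr, count> word with a lock-emulated CAS over both fields."""

    def __init__(self, ptr=None, count=0):
        self._value = Pointer(ptr, count)
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class Node:
    def __init__(self, value=None):
        self.value = value
        self.next = AtomicPointer()


class Queue:
    def __init__(self, first, last):
        self.head = AtomicPointer(first)
        self.tail = AtomicPointer(last)


def dequeue(q):
    while True:
        head = q.head.load()
        tail = q.tail.load()
        nxt = head.ptr.next.load()
        if head == q.head.load():                 # snapshot still consistent?
            if head.ptr is tail.ptr:
                if nxt.ptr is None:
                    return False, None            # queue is empty
                # Tail is lagging behind: help advance it, then retry.
                q.tail.cas(tail, Pointer(nxt.ptr, tail.count + 1))
            else:
                value = nxt.ptr.value             # read before the CAS
                if q.head.cas(head, Pointer(nxt.ptr, head.count + 1)):
                    return True, value


# dummy -> D1, with Tail still pointing at the dummy node
dummy, d1 = Node(), Node("D1")
dummy.next = AtomicPointer(d1, 1)
my_queue = Queue(first=dummy, last=dummy)

first_result = dequeue(my_queue)
second_result = dequeue(my_queue)
print(first_result, second_result)   # prints: (True, 'D1') (False, None)
```

On the first call the loop runs twice: once to swing Tail forward, once to take D1. Afterwards the D1 node serves as the new dummy, so the second call reports an empty queue.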
• 49. Concurrent dequeues • Suppose two processes call dequeue() at the same time
• 50. Concurrent dequeues: two processes each call dequeue(myQueue, pvalue) on a queue that holds only D1. The slide repeats the dequeue() pseudocode of slide 40, ending with free(head.ptr) and return TRUE. Only one process's CAS(&Q->Head, ...) can succeed; the other sees that Head has changed, retries, finds the queue empty, and returns FALSE.
• 51. Queue #2 • Two-lock Concurrent Queue
• 52. Two-lock queue pseudocode
      struct node_t { data_type value; node_t *next }
      struct queue_t { node_t *Head; node_t *Tail; lock_type H_lock; lock_type T_lock }

      initialize(Q: pointer to queue_t)
        node = new_node()
        node->next = NULL
        Q->Head = Q->Tail = node
        Q->H_lock = Q->T_lock = FREE

      enqueue(Q: pointer to queue_t, value: data type)
        node = new_node()
        node->value = value
        node->next = NULL
        lock(&Q->T_lock)
          Q->Tail->next = node
          Q->Tail = node
        unlock(&Q->T_lock)

      dequeue(Q: pointer to queue_t, pvalue: pointer to data type): boolean
        lock(&Q->H_lock)
          node = Q->Head
          new_head = node->next
          if new_head == NULL
            unlock(&Q->H_lock)
            return FALSE
          endif
          *pvalue = new_head->value
          Q->Head = new_head
        unlock(&Q->H_lock)
        free(node)
        return TRUE
  • Algorithms have the same general structure, only different data types
  • No loops; ‘busy waiting’ on the locks instead
  • Only dequeues access the Head lock
  • Only enqueues access the Tail lock
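The two-lock queue maps directly onto Python's `threading.Lock`. The sketch below follows the pseudocode above with a few labelled assumptions: the class name `TwoLockQueue` and the `produce` helper are made up for the demo, `dequeue` returns an `(ok, value)` pair rather than writing through `pvalue`, and `free(node)` is omitted because Python reclaims the old dummy node on its own.

```python
import threading


class Node:
    def __init__(self, value=None):
        self.value = value
        self.next = None


class TwoLockQueue:
    def __init__(self):
        dummy = Node()                     # Head and Tail start at a dummy node
        self.head = dummy
        self.tail = dummy
        self.h_lock = threading.Lock()     # taken only by dequeue
        self.t_lock = threading.Lock()     # taken only by enqueue

    def enqueue(self, value):
        node = Node(value)                 # allocate outside the critical section
        with self.t_lock:
            self.tail.next = node
            self.tail = node

    def dequeue(self):
        with self.h_lock:
            node = self.head
            new_head = node.next
            if new_head is None:
                return False, None         # queue is empty
            value = new_head.value
            self.head = new_head           # new_head becomes the dummy node
        return True, value


def produce(queue, base, count):
    for i in range(count):
        queue.enqueue(base + i)


q = TwoLockQueue()
producers = [threading.Thread(target=produce, args=(q, base, 500))
             for base in (0, 1000)]
for t in producers:
    t.start()
for t in producers:
    t.join()

drained = []
while True:
    ok, value = q.dequeue()
    if not ok:
        break
    drained.append(value)
print(len(drained))   # prints: 1000
```

Because the dummy node separates Head from Tail, one enqueuer and one dequeuer can proceed at the same time without ever taking the same lock.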
• 53. Performance Parameters
  • Net execution time for one million enqueue/dequeue pairs
  • 12-processor Silicon Graphics Challenge multiprocessor
  • Algorithms compiled using the highest optimization level
  • Including many hand optimizations
• 54. [Three result plots, not reproduced in the transcript: dedicated multiprocessor; multiprogrammed system with 3 processes per processor; multiprogrammed system with 2 processes per processor]
• 55. Conclusion
  • NBS is the clear winner for multiprogrammed multiprocessor systems
  • Above 5 processors, use the new non-blocking queue
  • If the hardware only supports test-and-set, use the two-lock queue
  • For two or fewer processors, use a single-lock algorithm for queues