
Lock Free Queue introduction


In computer science, locking is complex. In cloud computing, lock contention is a problem that must be solved or avoided, so there are many ways to implement lock-free data structures. This article introduces some basic knowledge of lock-free queues.

Published in: Education


  1. Lock Free Queue
  2. Multi-thread
  3. Multi-thread: pessimistic lock
  4. Multi-thread: optimistic lock
  5. Scalability. NoSQL databases such as MongoDB: sharding. Multi-core: lock-free queue.
  6. CAS: Compare and Swap/Set (cmpxchg). It compares the contents of a memory location to a given value and, only if they are the same, modifies the contents of that memory location to a given new value. This is done as a single atomic operation. The atomicity guarantees that the new value is calculated based on up-to-date information; if the value had been updated by another thread in the meantime, the write would fail. The semantics, written as plain C (the real operation executes as one atomic step):

        int compare_and_swap(int *reg, int oldval, int newval)
        {
            int old_reg_val = *reg;
            if (old_reg_val == oldval)
                *reg = newval;
            return old_reg_val;
        }
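As a small illustration of the success and failure cases described above, here is a sketch using C++11 `std::atomic`. The helper name `cas` is an assumption made for this example; it is not from the slides.

```cpp
#include <atomic>

// Hypothetical helper: stores newval and returns true only if reg
// still holds oldval; otherwise leaves reg untouched and returns false.
inline bool cas(std::atomic<int>& reg, int oldval, int newval) {
    // compare_exchange_strong overwrites its first argument with the
    // current value on failure; oldval is a local copy, so the caller
    // is not affected.
    return reg.compare_exchange_strong(oldval, newval);
}
```

A call such as `cas(r, 5, 7)` succeeds only while `r` still contains 5; a second call with the same stale `oldval` fails and leaves `r` unchanged.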
  7. CAS in C/C++

     GCC:

        bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)
        type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)

     Windows:

        InterlockedCompareExchange ( __inout LONG volatile *Target, __in LONG Exchange, __in LONG Comperand);

     C++11:

        template< class T >
        bool atomic_compare_exchange_weak( std::atomic<T>* obj, T* expected, T desired );
        template< class T >
        bool atomic_compare_exchange_weak( volatile std::atomic<T>* obj, T* expected, T desired );
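The weak C++11 variant may fail spuriously, so it is normally called in a retry loop. The sketch below shows that pattern with the member function `compare_exchange_weak`; the function `fetch_max` is a made-up example, not part of any library.

```cpp
#include <atomic>

// Hypothetical example: raise target to candidate if candidate is larger.
// Returns the value observed just before the update (or the current
// value when no update was needed).
inline int fetch_max(std::atomic<int>& target, int candidate) {
    int observed = target.load();
    while (observed < candidate &&
           !target.compare_exchange_weak(observed, candidate)) {
        // On failure, observed has been refreshed with the current
        // value, so the loop condition is re-evaluated against it.
    }
    return observed;
}
```

The loop ends either because the CAS succeeded or because another thread already stored a value at least as large as `candidate`.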
  8. Lock-free queue: list implementation

        EnQueue(x)
        {
            q = new record();
            q->value = x;
            q->next = NULL;

            do {
                p = tail;
            } while (CAS(p->next, NULL, q) != TRUE);

            CAS(tail, p, q);  // why do we NOT care about the return value?
        }
        // Only thread T1's CAS in the while loop succeeds; all the other
        // threads fail and keep retrying. After T1 updates the tail pointer,
        // one of the other threads can pick up the new tail pointer.
  9. Lock-free queue: enhancement. If thread T1 stalls before it updates the tail pointer, all the other threads loop forever. Walking forward from tail to the real last node avoids this:

        EnQueue(x)
        {
            q = new record();
            q->value = x;
            q->next = NULL;

            p = tail;
            oldp = p;
            do {
                while (p->next != NULL)
                    p = p->next;
            } while (CAS(p->next, NULL, q) != TRUE);

            CAS(tail, oldp, q);
        }
  10. Lock-free queue: dequeue

        DeQueue()
        {
            do {
                p = head;  // head is a dummy node
                if (p->next == NULL) {
                    return ERR_EMPTY_QUEUE;
                }
            } while (CAS(head, p, p->next) != TRUE);
            return p->next->value;
        }
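The EnQueue and DeQueue pseudocode can be sketched in C++11 roughly as follows. This is a minimal sketch under stated assumptions, not the author's implementation: the class name `LockFreeQueue` is invented here, a thread that finds the tail lagging helps advance it (a variation on the "walk forward" enhancement), and dequeued nodes are deliberately never freed until the queue is destroyed, which sidesteps address reuse at the cost of unbounded memory growth.

```cpp
#include <atomic>

template <typename T>
class LockFreeQueue {  // hypothetical name, for illustration only
    struct Node {
        T value{};
        std::atomic<Node*> next{nullptr};
    };
    Node* const first_;        // original dummy node, kept for cleanup
    std::atomic<Node*> head_;  // always points at the current dummy node
    std::atomic<Node*> tail_;  // last node, or a node slightly behind it

public:
    LockFreeQueue() : first_(new Node), head_(first_), tail_(first_) {}
    LockFreeQueue(const LockFreeQueue&) = delete;
    LockFreeQueue& operator=(const LockFreeQueue&) = delete;

    // Assumption: no other thread uses the queue during destruction.
    ~LockFreeQueue() {
        for (Node* n = first_; n != nullptr;) {
            Node* next = n->next.load();
            delete n;
            n = next;
        }
    }

    void enqueue(const T& x) {
        Node* q = new Node;
        q->value = x;
        for (;;) {
            Node* p = tail_.load();
            Node* next = p->next.load();
            if (next != nullptr) {
                // Tail is lagging behind: help move it forward, then retry.
                tail_.compare_exchange_weak(p, next);
                continue;
            }
            Node* expected = nullptr;
            if (p->next.compare_exchange_weak(expected, q)) {
                // Linked. If this CAS fails, another thread already
                // advanced the tail on our behalf, so the result is ignored.
                tail_.compare_exchange_strong(p, q);
                return;
            }
        }
    }

    // Returns false when the queue is empty.
    bool dequeue(T& out) {
        for (;;) {
            Node* p = head_.load();  // dummy node
            Node* next = p->next.load();
            if (next == nullptr) return false;
            if (head_.compare_exchange_weak(p, next)) {
                out = next->value;   // next becomes the new dummy node
                return true;         // p is intentionally not freed here
            }
        }
    }
};
```

Because nodes stay linked and are only released in the destructor, a stale pointer read by a slow thread always refers to a live node. A production queue needs a real reclamation scheme instead.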
  11. CAS ABA issue. It's possible that between the time the old value is read and the time CAS is attempted, some other processors or threads change the memory location two or more times such that it acquires a bit pattern which matches the old value. The problem arises if this new bit pattern, which looks exactly like the old value, has a different meaning. CAS compares only the pointer address; what if that address has been freed and reused for a different node?
  12. ABA solution: double-length CAS. On a 32-bit system, use a 64-bit CAS. The second half is used to hold a counter. The compare part of the operation compares the previously read value of the pointer *and* the counter to the current pointer and counter. If they match, the swap occurs (the new value is written), but the new value has an incremented counter.
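The counter idea can be sketched with a single 64-bit atomic word that holds a 32-bit value and a 32-bit counter. All names below (`pack`, `value_of`, `count_of`, `versioned_cas`) are assumptions made for this illustration, and the 32-bit "value" stands in for a 32-bit pointer or node index.

```cpp
#include <atomic>
#include <cstdint>

// Pack a 32-bit value and a 32-bit version counter into one 64-bit
// word so that a single CAS covers both halves.
inline std::uint64_t pack(std::uint32_t value, std::uint32_t count) {
    return (static_cast<std::uint64_t>(count) << 32) | value;
}
inline std::uint32_t value_of(std::uint64_t word) {
    return static_cast<std::uint32_t>(word);
}
inline std::uint32_t count_of(std::uint64_t word) {
    return static_cast<std::uint32_t>(word >> 32);
}

// Swap in new_value only if BOTH the value and the counter still match
// the previously read word `seen`. Each successful swap bumps the counter.
inline bool versioned_cas(std::atomic<std::uint64_t>& cell,
                          std::uint64_t seen, std::uint32_t new_value) {
    const std::uint64_t desired = pack(new_value, count_of(seen) + 1);
    return cell.compare_exchange_strong(seen, desired);
}
```

If the cell goes from 10 to 20 and back to 10, the value half matches an old snapshot again, but the counter has advanced by two, so a CAS based on that snapshot fails instead of silently succeeding.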
  13. Double-length CAS

        SafeRead(q)
        {
        loop:
            p = q->next;
            if (p == NULL) {
                return p;
            }

            Fetch&Add(p->refcnt, 1);

            if (p == q->next) {
                return p;
            } else {
                Release(p);
            }
            goto loop;
        }
  14. Lock-free queue in Disruptor: ring-buffer implementation. Array index = sequence mod array length. There is only a tail pointer. It is faster because the backing array is cache-friendly (its elements can be pre-loaded), the entries are pre-allocated, and there is nothing to clean up.
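The "sequence mod array length" mapping can be illustrated with a small single-producer, single-consumer ring buffer. This is a simplified sketch, not the Disruptor's design: it is written in C++ rather than Java, it keeps both a head and a tail sequence, and the name `SpscRing` is made up. When the length is a power of two, the modulo reduces to a bit mask.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: exactly one producer thread and one consumer thread.
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");
    T slots_[N];                          // pre-allocated storage
    std::atomic<std::uint64_t> head_{0};  // next sequence to read
    std::atomic<std::uint64_t> tail_{0};  // next sequence to write

public:
    bool try_push(const T& v) {
        const std::uint64_t t = tail_.load(std::memory_order_relaxed);
        if (t - head_.load(std::memory_order_acquire) == N) return false;  // full
        slots_[t & (N - 1)] = v;          // sequence mod N == array index
        tail_.store(t + 1, std::memory_order_release);
        return true;
    }

    bool try_pop(T& out) {
        const std::uint64_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;      // empty
        out = slots_[h & (N - 1)];
        head_.store(h + 1, std::memory_order_release);
        return true;
    }
};
```

Sequences only ever grow; the slot is derived from the sequence on each access, so the array itself never needs to be shifted or cleared.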
  15. Lock-free queue: adding data to the Disruptor. http://ifeve.com/disruptor-writing-ringbuffer/
  16. Cache line. A cache line is typically 64 bytes. A Java long is 8 bytes, so one cache line holds 8 long variables.
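  17. False sharing. Take two variables, one the head and the other the tail. If both sit in the same cache line, a thread that updates one invalidates the cached copy held by a thread that only uses the other.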
  18. False sharing

        struct foo {
            int x;
            int y;
        };
        static struct foo f;

        /* The two following functions are running concurrently: */
        int sum_a(void)
        {
            int s = 0;
            int i;
            for (i = 0; i < 1000000; ++i)
                s += f.x;
            return s;
        }

        void inc_b(void)
        {
            int i;
            for (i = 0; i < 1000000; ++i)
                ++f.y;
        }
  19. Eliminate false sharing: Disruptor cache line padding

        public long p1, p2, p3, p4, p5, p6, p7; // cache line padding
        private volatile long cursor = 0;

      http://www.drdobbs.com/parallel/eliminate-falsesharing/217500206
      http://ifeve.com/false-sharing/
      http://ifeve.com/volatile/
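The same padding idea can be expressed in C++11 with `alignas`, which places each hot variable on its own cache line. This is a sketch assuming a 64-byte line; the struct name `PaddedCounters` and the constant `kCacheLine` are invented for the example.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 64-byte cache lines, as stated on the Cache Line slide.
constexpr std::size_t kCacheLine = 64;

// head and tail each start on their own cache line, so a writer to one
// does not invalidate the line that holds the other.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<long> head{0};
    alignas(kCacheLine) std::atomic<long> tail{0};
};

static_assert(alignof(PaddedCounters) == kCacheLine,
              "struct must be cache-line aligned");
static_assert(sizeof(PaddedCounters) == 2 * kCacheLine,
              "each counter should occupy its own line");
```

The cost is memory: two 8-byte counters now take 128 bytes, which is usually an acceptable trade for variables that different threads update constantly.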
  20. Memory barrier. A type of barrier instruction which causes a central processing unit (CPU) or compiler to enforce an ordering constraint on memory operations issued before and after the barrier instruction. This typically means that certain operations are guaranteed to be performed before the barrier, and others after.
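  21. Memory barrier. Compilers and CPUs may reorder instructions to improve performance, as long as the output stays the same. A memory barrier also forces the caches of the different CPUs to be updated.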
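  22. volatile. For a volatile field, the Java memory model inserts a write barrier instruction after a write and a read barrier instruction before a read. Once you complete a write, any thread that accesses the field gets the latest value. Before your write, everything that happened earlier is guaranteed to have happened, and any updated data values are also visible, because the memory barrier flushes the earlier writes to the cache. http://hedengcheng.com/?p=725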
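A rough C++11 counterpart of this write-then-publish guarantee uses release and acquire ordering on a `std::atomic` flag. Note that the C++ `volatile` keyword does not provide these guarantees; the sketch uses atomics instead. The names `Mailbox`, `publish` and `try_consume` are assumptions made for the example.

```cpp
#include <atomic>

struct Mailbox {  // hypothetical type, for illustration only
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // publication flag
};

// Writer side: the release store keeps the payload write from being
// reordered after the flag write.
inline void publish(Mailbox& m, int value) {
    m.payload = value;
    m.ready.store(true, std::memory_order_release);
}

// Reader side: the acquire load keeps the payload read from being
// reordered before the flag read. Returns false if nothing is published.
inline bool try_consume(const Mailbox& m, int& out) {
    if (!m.ready.load(std::memory_order_acquire)) return false;
    out = m.payload;
    return true;
}
```

A reader that sees `ready == true` is guaranteed to see the payload written before the flag was set, which mirrors the "everything that happened earlier is visible" rule described for Java volatile fields.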
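  23. Summary (term: description)

      Shared variable: A variable that can be shared among multiple threads. Shared variables include all instance variables, static variables and array elements. They are all stored in heap memory. volatile applies only to shared variables.
      Memory barriers: A set of processor instructions used to enforce ordering constraints on memory operations.
      Cache line: The smallest unit of storage that can be allocated in the cache. When the processor fills a cache line it loads the entire line, which takes multiple main-memory read cycles.
      Atomic operations: An operation, or a series of operations, that cannot be interrupted.
      Cache line fill: When the processor recognizes that an operand being read from memory is cacheable, it reads the entire cache line into the appropriate cache (L1, L2, L3, or all of them).
      Cache hit: If the memory location of a cache line fill is still the address the processor accesses next, the processor reads the operand from the cache instead of from memory.
      Write hit: When the processor writes an operand back to a cached region of memory, it first checks whether that memory address is in a cache line. If a valid cache line exists, the processor writes the operand to the cache rather than to memory. This operation is called a write hit.
      Write misses the cache: A valid cache line is written to a memory region that does not exist.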
  24. http://ifeve.com/disruptor/ - Disruptor. Thanks
  25. Concurrency vs parallelism
