Your SlideShare is downloading. ×
Volatile
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Volatile

875

Published on

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
875
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
1
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. 'volatile' is volatile mark@veltzer.net
  • 2. Memory... ● When programming we use memory all the time – Reading/Writing data structures on the heap, stack or data segment. – Reading/Writing from/to hardware – By “Memory” I do not refer to registers in this presentation (since every core has it's own registers)
  • 3. What kinds of guarantees do we want from memory operations? ● That the operation is not optimized away completely ● That the operation does not take place in registers (that are not visible to other cores by definition) ● Visibility to other cores (bypass/flush/sync CPU caches) ● Visibility to hardware ● Atomicity ● Order between such two or more memory operations ● Any combination of the above (possibly none of them...) ● Regular C programming (without using special features, that is), the compiler and the CPU provides none of the above guarantees
  • 4. So what guarantees does volatile provide? ● It's just not clear! ● The first two, yes ● The others, maybe. No in a lot of the architectures. ● Depends on your compiler, it's version, your compilation flags, the astrological sign of the compiler authors best friend... ● To be specific: most volatile implementations do not imply atomicity or ordering. ● And what does volatile mean for bigger than word or int or long structures? Pass me the joint as things are getting hazy... ● Don't use volatile (true in most cases!)
  • 5. Memory reordering ● Imagine the following code (with no compiler optimizations): ● ● ● What states do you expect other cores to see? ● Or maybe ● ● Yes! The CPU does this. (well, not Intel, but others) X=5 Y=6 X=7 Y=8 X=5, Y=6 X=7, Y=6 X=7, Y=8 X=5, Y=6 X=5, Y=8 X=7, Y=8
  • 6. Is the Compiler/CPU allowed to do that? ● Yes. Actually there are many types of reordering that the Compiler/CPU is allowed to perform ● Common CPU reordering include: – Load reordered after load – Load reordered after store – Store reordered after load – Store reordered after store – Store reordered after atomics – Load reordered after atomics – Dependant load reordered (YES! Alpha does this, they should all be locked up...)
  • 7. So what do the compiler/CPU guarantee? ● They guarantee results in one thread. ● This means that they may alter your code, reorder it, discard parts of it, use different operations than the ones you use and more. ● But all of these guarantee that the results you will get will be the same, in the same thread that they are in. ● But sometimes you want your code to be left unaltered. ● This is especially true when other threads or hardware is involved. ● In these cases the order matters, the specific operations matter, etc.
  • 8. Enter memory barrier/fence ● A machine memory barrier is a special machine instruction or a special type of memory access instruction that guarantees order of execution between memory instructions before it and after it. ● __sync_synchronize() in gcc (user space). ● asm volatile ("mfence" ::: "memory") ● (smp_?)mb(),(smp_?)rmb(),(smp_?)wmb() in kernel development. ● In most cases atomic operations imply a memory barrier of some sort and new C++11 has nice API with memory model included.
  • 9. OK, prove it to me... ● Time for a demo. ● Two threads, when we start we have: ● ● ● ● ● Could it be that R1==R2==0 at the end? X=0 Y=0 X=1 R1=Y Y=1 R2=X
  • 10. Hey, but I need volatile to overcome the compiler! ● No, you don't ● There is something called a “compiler barrier” ● Compiler barriers usually offer several features: – Forces the compiler to sync unsynchronized registers with memory so that memory writes before the barrier will go to memory (no cache flush, no memory barrier) – Forces the compiler to read from memory after the barrier even if the compiler thinks it knows the value of certain memory locations. – Forces order of memory operations at the compiler level (not machine level) in relation to the barrier location in the code ● A compiler barrier is not a machine instruction (as opposed to memory barrier ● It is a compiler directive, influencing how to the compiler will generate machine code after the directive is given. ● The compiler may emit machine instructions or it may not (depends on many factors) ● Time for another demo again...
  • 11. References ● “What every programmer should know about memory” by Ulrich Drepper ● “memory-barries.txt” from the Linux kernel. ● The example for memory barriers shown is derived from “Memory Reordering Caught in the Act” by Jeff Preshing ● “Volatile_variable” from wikipedia ● “Memory_barrier” from wikipedia ● All examples can be found at linuxapi project at GitHub by me.
  • 12. Questions?

×