Add causality bubble for communication between the blue and yellow work-groups.
Unlike OpenCL 1.x, the execution model and memory model of OpenCL 2.0 are well defined, so it is possible to reason rigorously about the behavior of OpenCL 2.0 applications.
Need to confirm that these will be the names of the scopes in HSAIL.
We say both work-items and threads here because work-items executing on HSA components and threads executing concurrently on other HSA Agents may access the same data structure at the same time, in which case both must satisfy the requirements.
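Clearly, all wait-free algorithms are lock-free: if every thread completes in a bounded number of steps, then some thread is always making progress.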
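To make the distinction concrete, here is a minimal C++ sketch (not taken from the slides) of two ways to increment a shared counter. The function names are hypothetical. The CAS loop is lock-free, because a failed CAS means some other thread succeeded, but an individual thread may retry without bound. The `fetch_add` version is wait-free only under the assumption that the target implements it as a single native atomic add; on hardware that lowers it to a retry loop, it is merely lock-free.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Lock-free: the loop only repeats when another thread changed the counter,
// so the system as a whole progresses, but this thread has no step bound.
inline std::uint64_t increment_lock_free(std::atomic<std::uint64_t>& counter) {
    std::uint64_t old = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(old, old + 1,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
        // On failure, 'old' has been refreshed with the current value; retry.
    }
    return old;  // value observed before the increment
}

// Wait-free ONLY IF the hardware provides a native atomic add (assumption).
inline std::uint64_t increment_wait_free(std::atomic<std::uint64_t>& counter) {
    return counter.fetch_add(1, std::memory_order_acq_rel);
}
```

Both functions return the value seen before the increment, so they are interchangeable from the caller's point of view; only the progress guarantee differs.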
On the following slides we highlight some aspects of the algorithm and show how HSA's memory model helps enforce memory consistency.
Double CAS would allow us to store pointers here, rather than the more limited 54-bit value, which has to be used as an index.
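The single-word approach can be sketched in C++ as follows. This is an illustration under stated assumptions, not the layout used in the talk: it assumes the 54-bit index sits in the low bits and that all 10 remaining bits of the 64-bit word hold a version counter. The helper names (`pack`, `index_of`, `version_of`, `try_update`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr unsigned kIndexBits = 54;  // from the notes: 54-bit value used as an index
constexpr std::uint64_t kIndexMask = (std::uint64_t{1} << kIndexBits) - 1;

// ASSUMED layout: [ version : 10 bits | index : 54 bits ]
constexpr std::uint64_t pack(std::uint64_t index, std::uint64_t version) {
    return (version << kIndexBits) | (index & kIndexMask);
}
constexpr std::uint64_t index_of(std::uint64_t word) { return word & kIndexMask; }
constexpr std::uint64_t version_of(std::uint64_t word) { return word >> kIndexBits; }

// Replace the index with a single 64-bit CAS, bumping the version so that a
// stale snapshot fails even if the same index is later stored again (ABA).
inline bool try_update(std::atomic<std::uint64_t>& slot,
                       std::uint64_t expected, std::uint64_t new_index) {
    const std::uint64_t desired = pack(new_index, version_of(expected) + 1);
    return slot.compare_exchange_strong(expected, desired,
                                        std::memory_order_acq_rel,
                                        std::memory_order_acquire);
}
```

The version counter wraps after 1024 updates in this layout, so the protection against ABA is probabilistic rather than absolute; a double-width CAS would leave room for a full pointer plus a wider counter.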
LL/SC (which ARM supports) might be an alternative: it would avoid the need for the version counter and allow the bottom 2 bits of an aligned address to store the null type information.
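The pointer-tagging half of that idea can be sketched in C++ as below. It shows only how two tag bits fit into a 4-byte-aligned address, not LL/SC itself, which standard C++ does not expose directly. `Node` and the helper names are hypothetical, and the sketch assumes a platform where a pointer round-trips through `std::uintptr_t` unchanged.

```cpp
#include <cassert>
#include <cstdint>

// Any type aligned to at least 4 bytes has the bottom 2 address bits clear.
struct alignas(4) Node {
    int value;
};

constexpr std::uintptr_t kTagMask = 0x3;  // bottom 2 bits

// Store a 2-bit tag (e.g. a "null type" code) in the unused low bits.
inline std::uintptr_t tag_pointer(Node* p, unsigned tag) {
    return reinterpret_cast<std::uintptr_t>(p) | (tag & kTagMask);
}

// Recover the original pointer by clearing the tag bits.
inline Node* pointer_of(std::uintptr_t word) {
    return reinterpret_cast<Node*>(word & ~kTagMask);
}

// Recover the tag.
inline unsigned tag_of(std::uintptr_t word) {
    return static_cast<unsigned>(word & kTagMask);
}
```

The tagged word must always be passed through `pointer_of` before it is dereferenced.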
Align to a cache-line boundary.
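In C++ the same alignment can be requested with `alignas`, as in this minimal sketch. The 64-byte line size is an assumption (it varies by hardware), and `PaddedSlot` is a hypothetical name. Giving each atomic word its own line keeps two heavily updated words from sharing a line and interfering with each other.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // ASSUMED cache-line size in bytes

// Each slot starts on a cache-line boundary and is padded to a full line.
struct alignas(kCacheLine) PaddedSlot {
    std::atomic<std::uint64_t> word{0};
};

static_assert(alignof(PaddedSlot) == kCacheLine, "slot must be line-aligned");
static_assert(sizeof(PaddedSlot) % kCacheLine == 0, "slot must fill whole lines");
```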
Register mapping: Queue is mapped to $d0, Current to $d1, Next to $d2, and K to $s0.
ISCA final presentation - Memory Model
HSA MEMORY MODEL
BEN GASTER, ENGINEER, QUALCOMM