Memoization and similar techniques save on memory accesses. This technique goes further: it saves both the accesses and the computation that depends on the data those accesses return.
E.g., summing all nodes of a 100-node linked list: every node has to be accessed even when, say, only 2 have changed. That’s 98 redundant loads.
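A minimal software sketch of the linked-list example, to make the redundant loads concrete. All names here (`node`, `summed_list`, `full_sum`, `list_store`) are hypothetical, and the incremental update is a plain software stand-in for the idea, not the hardware mechanism from the paper: `full_sum` reloads every node, while `list_store` touches the cached sum only when a store actually changes a value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
struct node {
    int value;
    struct node *next;
};

struct summed_list {
    struct node *head;
    long cached_sum;   /* kept consistent by list_store() */
};

/* Baseline: reloads every node, even if almost none have changed. */
long full_sum(const struct node *head)
{
    long sum = 0;
    for (const struct node *n = head; n != NULL; n = n->next)
        sum += n->value;
    return sum;
}

/* Store that adjusts the cached sum only when the value changes.
 * Returns 1 if the store changed memory, 0 if it was silent. */
int list_store(struct summed_list *list, struct node *n, int new_value)
{
    assert(list != NULL && n != NULL);
    if (n->value == new_value)
        return 0;                       /* silent store: nothing to redo */
    list->cached_sum += (long)new_value - n->value;
    n->value = new_value;
    return 1;
}
```

With this arrangement, changing 2 of 100 nodes costs two adjustments to `cached_sum` rather than 100 loads in `full_sum`.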
If the value of SP that is calculated differs from what is in memory, a support thread (S) is spawned to carry out the computation of code section B. The main thread then skips code section B, since its data has already been calculated. The instructions for B are left in place because the support thread may have failed to spawn; in that case nothing is skipped and the code is executed by the main thread.
Programmers express this with C pragma constructs.
Every time the variable is WRITTEN to (and the write actually changes its value), the associated DTThread is executed.
If the programmer has reason to suspect that the thread may crash or be aborted, they can place the #cancel pragma. This ensures that only the main thread executes this block; the support thread will not be registered.
This function is triggered in a new thread (support thread) when control reaches “#block xxx”
Start PC: the PC of the skippable code in the main thread. Destination PC denotes the end of the skippable region. Post-skip PC is the address after the region is skipped.
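The three PCs in this note can be pictured as fields of one table entry. The sketch below is an assumption-laden illustration, not the paper's hardware layout: `skip_entry`, `in_skippable_region`, and `next_fetch_pc` are hypothetical names, and treating the destination PC as the last address inside the region is a guess made only so the example is concrete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry holding the three PCs described above. */
struct skip_entry {
    uint64_t start_pc;        /* first PC of the skippable code */
    uint64_t destination_pc;  /* end of the skippable region */
    uint64_t post_skip_pc;    /* where the main thread continues after a skip */
};

/* Assumption: destination_pc is the last address inside the region. */
bool in_skippable_region(const struct skip_entry *e, uint64_t pc)
{
    assert(e != NULL);
    return pc >= e->start_pc && pc <= e->destination_pc;
}

/* Decide where the main thread fetches next. The region is skipped only
 * when the main thread is at start_pc and the support thread has finished;
 * otherwise the main thread simply keeps executing at pc. */
uint64_t next_fetch_pc(const struct skip_entry *e, uint64_t pc,
                       bool support_done)
{
    assert(e != NULL);
    if (pc == e->start_pc && support_done)
        return e->post_skip_pc;
    return pc;
}
```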
Computer Architecture Seminar
Data-Triggered Threads: Eliminating Redundant Computation<br />(HPCA 2011)<br />Hung-Wei Tseng and Dean M. Tullsen<br />Department of Computer Science and Engineering <br />University of California, San Diego<br />Seminar by: Naman Kumar for http://carg.uwaterloo.ca<br />
Eliminating Redundant Computation<br />Silent Store:<br />A memory store operation that does not change the contents at that location<br />20-68% of all stores are silent [Lepak and Lipasti]<br />How about eliminating the entire stream of computation surrounding a silent store!<br />
Eliminating Redundant Computation<br />Redundant loads: <br /> silent stores result in redundant loads<br /> (last time this load loaded this address, it fetched the same value)<br /> SPEC2000 C:<br />78% of all loads are redundant<br />50% of all instructions depend on redundant loads<br />
DTT: Implementation<br />The Programming Model<br />Place redundant computation in a separate thread:<br />Thread is restartable<br />Thread may be aborted/restarted multiple times<br />Thread management is through architectural changes.<br />Easy to check for data races, since the thread's lifetime spans only the time between the triggering store and the main thread's join point.<br />
DTT: Implementation<br />The Programming Model<br />Trigger is placed in data section, not code section<br />
DTT: Implementation<br />Architectural Support<br />Following tables are all implemented in hardware<br />Thread registry (table) <br />Thread Queue (table)<br />Thread Status Table (table) PC <br />
DTT: Implementation<br />Architectural Support<br />ISA modifications<br />tstore – generate a thread when the store is not silent (the memory contents are actually modified)<br />tspawn – spawn the thread using the thread registry<br />treturn – finish execution of the current thread<br />tcancel – terminate a running thread <br />
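A rough software emulation of the `tstore` behaviour listed above, offered as a sketch only. The real instructions are implemented in hardware; here the thread registry is an ordinary array, the spawned thread is a direct function call (standing in for `tspawn` through `treturn`), and `tcancel` is not modelled. The names `registry_entry`, `emu_tstore`, and `count_runs` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*dtt_fn)(void *arg);

/* Hypothetical software stand-in for one thread registry entry. */
struct registry_entry {
    int    *addr;    /* address whose modification triggers the thread */
    dtt_fn  thread;  /* body of the data-triggered thread */
    void   *arg;     /* argument handed to the thread body */
};

/* Emulated tstore: performs the store and, if it is not silent and the
 * address is registered, runs the associated thread body.
 * Returns 1 if a thread was triggered, 0 otherwise. */
int emu_tstore(const struct registry_entry *reg, size_t n,
               int *addr, int value)
{
    assert(addr != NULL);
    if (*addr == value)
        return 0;                       /* silent store: no thread */
    *addr = value;
    for (size_t i = 0; i < n; i++) {
        if (reg[i].addr == addr) {
            reg[i].thread(reg[i].arg);  /* emulates tspawn ... treturn */
            return 1;
        }
    }
    return 0;                           /* ordinary store, nothing registered */
}

/* Example thread body: counts how many times it has been triggered. */
void count_runs(void *arg)
{
    ++*(int *)arg;
}
```

Registering `count_runs` against a variable and storing the same value twice should trigger it at most once, which is the property the silent-store check is meant to provide.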