IMAX3: Amazing Dataflow-Centric CGRA and its Applications
I present these slides to all hungry engineers who are tired of CPUs, GPUs, FPGAs, tensor cores, and AI cores, who want something challenging with no black box inside, and who want to improve on their own.
16. Example: unroll-let
(defkernelmacro unroll-let ((n offset vars) &body body)
  (let ((unrolled-vars (gensym))
        (collected-vars (gensym)))
    ;; unrolled-vars: for each (name init) in vars, n fresh (gensym init) bindings
    (setf unrolled-vars (loop for (name init) in vars
                              collect (loop for i below n
                                            collect (list (gensym (symbol-name name))
                                                          init)))
          ;; collected-vars: for each i below n, the (name gensym) pairs of copy i
          collected-vars (loop for i below n
                               collect (mapcar #'list
                                               (mapcar #'car vars)
                                               (mapcar #'(lambda (x) (car (nth i x)))
                                                       unrolled-vars))))
    ;; A single LET binds every copy; each body form is then repeated n times
    (append (list 'let
                  (reduce #'append unrolled-vars))
            (loop for expr in body
                  append (loop for i below n
                               collect `(symbol-macrolet ((,offset ,i)
                                                          ,@(nth i collected-vars))
                                          ,expr))))))
Intermediate variable definitions are unrolled as well.
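To see what this unrolling produces, here is a minimal sketch that can be tried at an ordinary Common Lisp REPL. It assumes that defkernelmacro builds its expansion the same way defmacro does, so plain defmacro is swapped in. The names unroll-let-demo and demo-unroll-let are hypothetical and do not appear in the slides. n must be a literal integer, because the loop runs at macroexpansion time.

```lisp
;; Stand-in for unroll-let using DEFMACRO (assumption: same expansion logic).
(defmacro unroll-let-demo ((n offset vars) &body body)
  (let* ((unrolled-vars
           ;; For each (name init) in vars: n fresh (gensym init) bindings.
           (loop for (name init) in vars
                 collect (loop repeat n
                               collect (list (gensym (symbol-name name)) init))))
         (collected-vars
           ;; For each copy i: (name gensym) pairs used by SYMBOL-MACROLET.
           (loop for i below n
                 collect (mapcar #'list
                                 (mapcar #'car vars)
                                 (mapcar (lambda (x) (car (nth i x)))
                                         unrolled-vars)))))
    `(let ,(reduce #'append unrolled-vars)
       ,@(loop for expr in body
               append (loop for i below n
                            collect `(symbol-macrolet ((,offset ,i)
                                                       ,@(nth i collected-vars))
                                       ,expr))))))

;; Hypothetical usage: two copies of ACC, each body form emitted twice.
(defun demo-unroll-let ()
  (let ((out '()))
    (unroll-let-demo (2 k ((acc 0)))
      (setf acc (+ 10 k))   ; copy 0 stores 10, copy 1 stores 11
      (push acc out))       ; copy 0 is pushed first, then copy 1
    (nreverse out)))
```

Calling (demo-unroll-let) returns (10 11). Each body form is expanded for k = 0 and k = 1 before the next form starts, and acc refers to a different generated variable in each copy.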
19. Example: unroll-sphere
(defkernelmacro unroll-sphere ((r vx vy vz) &body body)
  `(progn
     ,@(loop for z from (- r) to r
             append (loop for y from (- r) to r
                          append (loop for x from (- r) to r
                                       when (>= r (sqrt (+ (* z z)
                                                           (* y y)
                                                           (* x x))))
                                         collect `(symbol-macrolet ((,vz ,z)
                                                                    (,vy ,y)
                                                                    (,vx ,x))
                                                    ,@body))))))
Expands accesses over a sphere of radius r.
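To get a feel for how many copies of the body this emits, the sketch below lists the integer offsets that pass the radius test. It uses the same loop order (z outermost, x innermost) and the same comparison as the macro. sphere-offsets is a hypothetical helper for illustration only, written as an ordinary function rather than a kernel macro.

```lisp
;; Hypothetical helper: the (x y z) offsets for which unroll-sphere
;; would emit one copy of BODY, in emission order.
(defun sphere-offsets (r)
  (loop for z from (- r) to r
        append (loop for y from (- r) to r
                     append (loop for x from (- r) to r
                                  ;; Same test as the macro: distance <= r
                                  when (>= r (sqrt (+ (* z z)
                                                      (* y y)
                                                      (* x x))))
                                    collect (list x y z)))))

(length (sphere-offsets 1)) ; => 7  (center plus the 6 face neighbours)
(length (sphere-offsets 2)) ; => 33
```

The count grows roughly with the cube of r, so the unrolled code size does too. That is worth keeping in mind when choosing r.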