This document describes the design of a 21x21-bit radix-8 multiplier unit for specific purposes. It presents a modification to the radix-8 multiplication algorithm to speed up the generation of the odd multiple (3Y) of the multiplicand by taking advantage of the fact that the multiplicand belongs to a known set of numbers stored in memory. This allows decomposing the previous adder into three 7-bit adders that can operate in parallel, reducing the delay by about 42% compared to a standard 21-bit adder. The design includes a radix-8 recoding of the multiplier, multiplexers to generate the partial products, a Wallace tree using 4-2 compressors to reduce the partial