Abstract: General Matrix Multiply (GEMM) units, consisting of multiply-accumulate (MAC) arrays, perform bulk of the computation in deep learning (DL). Recent work has proposed a novel MAC design, ...