Abstract: The rising popularity of deep learning algorithms demands special accelerators for matrix-matrix multiplication. Most of the matrix multipliers are designed based on the systolic array ...
Abstract: General Matrix Multiplication (GEMM) is a ubiquitous compute kernel in deep learning (DL). To support energy-efficient edge-native processing, new GEMM hardware units have been proposed that ...