This document introduces highly optimized GPU kernels for neural network layers with block-sparse weights, that is, weight matrices in which only selected blocks are nonzero. The kernels efficiently evaluate and differentiate linear layers of this kind and show significant performance improvements over traditional methods. This efficiency makes new network architectures practical, and the authors report advances in text sentiment analysis and generative modeling. By making the kernels publicly available, the authors aim to foster further innovation in model design and algorithms.
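To make the idea of evaluating and differentiating a block-sparse linear layer concrete, the following is a minimal NumPy sketch of the arithmetic involved. It is not the authors' GPU implementation and does not reflect their API: the function names, the 0/1 `layout` array, and the storage of nonzero blocks as a stacked array are all assumptions chosen for illustration.

```python
import numpy as np

# Illustrative sketch only: names and data layout are hypothetical,
# not taken from the kernels described in this document.

def block_slices(i, j, block_size):
    """Row and column slices covering block (i, j) of the weight matrix."""
    rows = slice(i * block_size, (i + 1) * block_size)
    cols = slice(j * block_size, (j + 1) * block_size)
    return rows, cols

def block_sparse_matmul(x, blocks, layout, block_size):
    """Forward pass y = x @ W, visiting only the nonzero blocks of W.

    layout: 0/1 array of shape (in_blocks, out_blocks) marking nonzero blocks.
    blocks: array of shape (nnz, block_size, block_size), ordered like
            np.nonzero(layout) (row-major).
    """
    y = np.zeros((x.shape[0], layout.shape[1] * block_size), dtype=x.dtype)
    for k, (i, j) in enumerate(zip(*np.nonzero(layout))):
        rows, cols = block_slices(i, j, block_size)
        y[:, cols] += x[:, rows] @ blocks[k]
    return y

def block_sparse_weight_grad(x, dy, layout, block_size):
    """Gradient of the loss w.r.t. each nonzero block, given dy = dL/dy."""
    grads = []
    for i, j in zip(*np.nonzero(layout)):
        rows, cols = block_slices(i, j, block_size)
        grads.append(x[:, rows].T @ dy[:, cols])
    return np.stack(grads)

def to_dense(blocks, layout, block_size):
    """Expand the stored blocks into a dense matrix (for checking only)."""
    w = np.zeros((layout.shape[0] * block_size, layout.shape[1] * block_size))
    for k, (i, j) in enumerate(zip(*np.nonzero(layout))):
        rows, cols = block_slices(i, j, block_size)
        w[rows, cols] = blocks[k]
    return w
```

The sketch skips zero blocks entirely, so the work scales with the number of nonzero blocks rather than with the full matrix size. A GPU kernel would process the blocks in parallel instead of in a Python loop.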