MIVE: A Minimalist Integer Vector Engine for Softmax, LayerNorm, and RMSNorm Acceleration
The rapid growth of Large Language Models (LLMs) has intensified the need for specialized hardware accelerators that can satisfy stringent inference latency and power constraints. Although matrix multiplications dominate the overall computational workload, non-linear vector normalization operations, such as LayerNorm, RMSNorm, and Softmax, can become critical hardware bottlenecks. Existing accelerators typically implement these functions using ded...
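For readers unfamiliar with the three operations named above, the following is a minimal floating-point reference sketch of their standard textbook definitions. It is not the integer implementation the article describes; the function names and the `eps` default are assumptions chosen for illustration, and the learned scale and shift parameters usually applied after LayerNorm and RMSNorm are omitted.

```python
import math

def softmax(x):
    # Subtract the maximum first so exp() cannot overflow.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(x, eps=1e-5):
    # Normalize to zero mean and (approximately) unit variance.
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]

def rms_norm(x, eps=1e-5):
    # Normalize by the root mean square only; no mean subtraction.
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms for v in x]

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    print(softmax(x))
    print(layer_norm(x))
    print(rms_norm(x))
```

Each function first reduces the whole vector to one or two scalars (a maximum and a sum, a mean and a variance, or a mean square) and then rescales every element, which is why these operations are handled as vector rather than matrix workloads.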