NeuronFabric: A Software Reference Architecture for On-Chip Transformer Training with Local Adam
Publicly documented accelerator architectures generally separate training computation from optimizer-state updates or rely on external memory and host orchestration. This paper presents NeuronFabric, a software reference architecture intended for future FPGA and ASIC implementations of transformer training with local Adam updates. A complete C# prototype implements forward pass, backpropagation, and Adam optimization without external machine-lea...
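
To make the idea of a "local" Adam update concrete, here is a minimal sketch in Python (not the paper's C# prototype). It assumes that "local" means each weight keeps its own first- and second-moment state next to it, so an update needs only that weight's gradient and no shared optimizer memory. That reading is an assumption, not something the abstract confirms. The names `LocalAdamParam` and `fit_quadratic` are hypothetical, and the hyperparameters are the commonly used Adam defaults rather than values taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class LocalAdamParam:
    """One trainable weight bundled with its own Adam state (hypothetical layout)."""
    value: float
    m: float = 0.0  # first-moment (mean of gradients) estimate
    v: float = 0.0  # second-moment (mean of squared gradients) estimate
    t: int = 0      # number of updates applied so far

    def step(self, grad, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        """Apply one standard Adam update using only this weight's gradient."""
        self.t += 1
        self.m = beta1 * self.m + (1.0 - beta1) * grad
        self.v = beta2 * self.v + (1.0 - beta2) * grad * grad
        # Bias correction compensates for m and v starting at zero.
        m_hat = self.m / (1.0 - beta1 ** self.t)
        v_hat = self.v / (1.0 - beta2 ** self.t)
        self.value -= lr * m_hat / (math.sqrt(v_hat) + eps)
        return self.value


def fit_quadratic(steps, target=3.0, lr=0.1):
    """Toy usage: minimize (w - target)^2 starting from w = 0."""
    w = LocalAdamParam(value=0.0)
    for _ in range(steps):
        grad = 2.0 * (w.value - target)  # d/dw of (w - target)^2
        w.step(grad, lr=lr)
    return w.value
```

Because all the state an update needs lives in the parameter record itself, a hardware version could in principle place the update logic beside the weight storage. Whether NeuronFabric is organized this way is not stated in the excerpt above.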