A Three-Phase Factual Recall Circuit in Gemma-2B and Gemma-12B-IT (opens in new tab)
Activation patching reveals how facts are stored, routed, and read out across transformer layers, and why the residual stream does most of the work The post appeared first on .
Read the original article