Why Transformer Representations Tend to Live on Hyperspheres
Cosine similarity, angular distance, vector directions, activation steering, and representation arithmetic are everywhere in modern deep…
Read the original article