self-attention, transformers, multi-head attention, scaled dot-product