transformer architecture, attention mechanism, self-attention, BERT, GPT