🔄 Transformers (Specific): Attention Mechanisms, Large Language Models, BERT, Encoder-Decoder Architecture