LLM
markusheimerl/gpt: A generative pretrained transformer implementation
Transformers | Content type: Code
Hasse Diagrams for Attention: A Partial Order Framework for Designing Transformer Masks
Transformers | Content type: Academic
Less-relevant results