Gradient Descent
Backpropagation Without the Magic: A First-Principles Derivation
📊 Empirical Bayes | Content type: Blog
Machine learning from scratch, what to build before using scikit-learn
📈 Linear Models | Content type: Tutorial
markusheimerl/gpt: A generative pretrained transformer implementation
🔢 Embeddings | Content type: Code
A Mean-Field Analysis of Multi-Head Self-Attention under Cross-Entropy Training
📊 Empirical Bayes | Content type: Academic
Less-relevant results