Tour de Language Modeling: Part 1
conda.bearblog.dev·4d
23 Dec, 2025

Large Language Models are all the rage right now. It’s been a goal of mine to understand how these models work, so I’ve been ramping up on their academic lineage and implementing what I learn in code. For more background on these topics, I highly recommend Andrej Karpathy’s Zero to Hero series on YouTube.

This post is Part 1 of an n-part series on language modeling. Language modeling covers a diverse range of problems, such as translation, semantic analysis, and prediction. This series focuses on next-token prediction: given a sequence of tokens (such as characters or words), we want to build a model that predicts the next token. Mathematically this amo…
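
To make the next-token setup concrete, here is a minimal sketch using characters as tokens. It is only an illustration, not necessarily the approach this series takes: it counts which character most often follows each character in a toy corpus and predicts that one. The names `train_bigram`, `predict_next`, and the corpus string are all made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each character, how often each next character follows it."""
    counts = defaultdict(Counter)
    for current, nxt in zip(text, text[1:]):
        counts[current][nxt] += 1
    return counts


def predict_next(counts, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = counts.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus, chosen only for illustration.
corpus = "hello hello hello"
model = train_bigram(corpus)

print(predict_next(model, "h"))  # "h" is always followed by "e" in this corpus
print(predict_next(model, "z"))  # "z" never appears, so there is no prediction
```

A counting model like this only looks at one previous token, so it is a starting point for intuition rather than a realistic language model.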
