Writing an LLM from scratch, part 22 – training our LLM
gilesthomas.com


This post wraps up my notes on chapter 5 of Sebastian Raschka's book "Build a Large Language Model (from Scratch)". Understanding cross entropy loss and perplexity was the hard part of this chapter for me – the remaining 28 pages were more a case of plugging bits together and running the code to see what happens.
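For anyone who hasn't hit that part of the book yet, the core relationship between the two is small enough to sketch in a few lines of plain Python. This is just an illustration with made-up per-token probabilities, not the book's PyTorch training code:

```python
import math

# Toy example: suppose the model assigned these probabilities to the
# *correct* next token at three positions in a sequence (values invented
# purely for illustration).
probs_of_correct_token = [0.7, 0.1, 0.9]

# Cross entropy loss is the average negative log-probability the model
# gave to the correct tokens. Lower is better.
loss = -sum(math.log(p) for p in probs_of_correct_token) / len(probs_of_correct_token)

# Perplexity is just exp(loss): roughly, the number of tokens the model
# is "effectively choosing between" at each step.
perplexity = math.exp(loss)

print(f"loss = {loss:.4f}, perplexity = {perplexity:.4f}")
```

The confident predictions (0.7, 0.9) pull the loss down, while the bad one (0.1) pushes it up sharply, which is exactly why a few badly-predicted tokens dominate the loss during training.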

The shortness of this post almost feels like a damp squib. After writing so much in the last 22 posts, there’s really not all that much to say – but that hi…
