gilesthomas.com

8 posts in the last 30 days

Writing an LLM from scratch, part 32l -- Interventions: updated instruction fine-tuning results

gilesthomas.com·11h·Hacker News

How an LLM becomes more coherent as we train it

gilesthomas.com·3d·Hacker News

Writing an LLM from scratch, part 32k -- Interventions: training a better model locally with gradient accumulation

gilesthomas.com·5d·Hacker News

Writing an LLM from scratch, part 32j -- Interventions: trying to train a better model in the cloud

gilesthomas.com·1w·Hacker News

Writing an LLM from scratch, part 32i -- Interventions: what is in the noise?

gilesthomas.com·1w·Hacker News

Writing an LLM from scratch, part 32h – Interventions: full fat float32

gilesthomas.com·2w·Hacker News

Automating starting Lambda Labs instances

gilesthomas.com·2w·Hacker News

Writing an LLM from scratch, part 32g – Interventions: weight tying

gilesthomas.com·3w·Hacker News