Pre, Mid, Post-Training Way of Life
fakepixels.substack.com·1d·

*"For here there is no place / that does not see you. You must change your life."*

— Rilke’s Archaic Torso of Apollo

Building a large language model happens in three main stages.

Pre-training processes trillions of tokens scraped from everywhere, taken in with little discrimination. The model learns to predict the next word from the words before it, and this deceptively simple task demands massive compute: tens of thousands of GPUs running for months, tolerating messy, uncurated data dredged from the internet’s sediment. Each layer of data becomes another layer of silt, building a multidimensional echo of the collected human experience. It is the accumulated text of human civilization, fed forward without editorial judgment. Here, more …
