Generating Shakespeare Without Neural Networks
Learning how to generate Shakespeare has become the “Hello World” of language models.1 Recently, I’ve been messing with alternative language models (diffusion language models instead of autoregressive transformers) and came across unbounded n-gram models. These models are purely statistical and don’t require optimizing weights or training. A year ago, I read the paper Infini-gram, which scaled an unbounded n-gram model to trillions of tokens. While their model had applications like supplement...
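
To make the "purely statistical, no training" idea concrete, here is a minimal sketch of unbounded n-gram prediction: back off to the longest suffix of the context that appears in the corpus, then count which tokens follow it. This is a naive linear scan written for illustration, not the Infini-gram implementation (which relies on suffix arrays to scale), and the function name `next_token_counts` and the toy corpus are assumptions made up for this example.

```python
from collections import Counter


def next_token_counts(corpus, context):
    """Return (counts, n) for the longest suffix of `context` found in `corpus`.

    `counts` maps each token that follows the matched suffix to its frequency;
    `n` is the length of the matched suffix (0 means plain unigram counts).
    Hypothetical helper for illustration only: O(len(corpus) * len(context)).
    """
    for start in range(len(context) + 1):
        suffix = context[start:]  # try the longest suffix first, then shorter ones
        n = len(suffix)
        counts = Counter(
            corpus[i + n]
            for i in range(len(corpus) - n)
            if corpus[i:i + n] == suffix
        )
        if counts:  # this suffix occurs with at least one following token
            return counts, n
    return Counter(), 0  # only reached when the corpus is empty


# Toy corpus (an assumption for the example), tokenized by whitespace.
corpus = "to be or not to be that is the question".split()

counts, n = next_token_counts(corpus, ["not", "to", "be"])
print(n, dict(counts))

counts, n = next_token_counts(corpus, ["perhaps", "to", "be"])
print(n, dict(counts))
```

Normalizing `counts` by its total gives a next-token distribution, and sampling from it repeatedly generates text. There are no weights to optimize: the corpus itself is the model.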