Is gzip a language model?
A while back I wrote about generating Shakespeare with an unbounded n-gram model: no weights, no training, just counting. Fortuitously, I came across a paper that mentioned the compression–prediction equivalence: every prediction model is inherently a compressor, and every compression algorithm is a prediction model. This led to the natural question: can gzip do language modeling? No neural network, no learned parameters, nothing. Just the compressor that ships with your operati...
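To make the compression–prediction equivalence concrete, here is a minimal sketch of one way a compressor can act as a predictor: a continuation that costs fewer extra compressed bytes after a given context is treated as more likely. This is an illustration under stated assumptions, not necessarily the approach the article takes. It uses Python's `zlib` module, which implements the same DEFLATE algorithm gzip is built on, and the helper names `compressed_size`, `extra_bytes`, and `rank_continuations` are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib (DEFLATE) encoding of `data`."""
    return len(zlib.compress(data, 9))


def extra_bytes(context: str, continuation: str) -> int:
    """Additional compressed bytes needed to append `continuation` to `context`."""
    base = compressed_size(context.encode("utf-8"))
    full = compressed_size((context + continuation).encode("utf-8"))
    return full - base


def rank_continuations(context: str, candidates: list[str]) -> list[tuple[str, int]]:
    """Sort candidates from cheapest to most expensive to encode after `context`."""
    costs = [(c, extra_bytes(context, c)) for c in candidates]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Illustrative inputs only: a highly repetitive context, one familiar
    # continuation, and one continuation the compressor has never seen.
    context = "the cat sat on the mat. " * 20
    candidates = ["the cat sat on the mat. ", "zqxjv kwpfy bmhgd ulnro "]
    for text, cost in rank_continuations(context, candidates):
        print(f"{cost:3d} extra bytes  {text!r}")
```

The signal is coarse: compressed sizes come in whole bytes, and DEFLATE only looks back over a 32 KB window, so this ranking is a rough proxy for a probability rather than a calibrated one.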