Continuous Autoregressive Language Models
shaochenze.github.io

Introduction

Large Language Models (LLMs) represent the central paradox of modern AI. On one hand, their capabilities are unprecedented. We’ve engineered models with hundreds of billions of parameters that can synthesize vast knowledge, execute intricate reasoning, and generate everything from nuanced prose to production-ready code. In short, we’ve built Ferrari-class engines.

And yet, we’ve placed them on a narrow country road, never letting them get out of first gear. This road is the dominant paradigm of autoregressive generation: predicting text one discrete token at a time. No matter how powerful the engine, its throughput is ultimately bottlenecked by the road. This mismatch is why state-of-the-art LLMs are so inefficient and computationally expensive to run…
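To make the bottleneck concrete, here is a minimal sketch of the standard token-by-token decoding loop. The `next_token_logits` function is a hypothetical stand-in for a full LLM forward pass, not any real model's API; the point is purely structural: each step produces exactly one token and cannot begin until the previous step has finished.

```python
import numpy as np

VOCAB_SIZE = 32  # toy vocabulary for illustration

def next_token_logits(tokens: list[int]) -> np.ndarray:
    """Hypothetical stand-in for one full LLM forward pass over the prefix."""
    rng = np.random.default_rng(seed=sum(tokens))
    return rng.standard_normal(VOCAB_SIZE)

def generate(prompt: list[int], max_new_tokens: int) -> list[int]:
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)     # one expensive forward pass...
        tokens.append(int(np.argmax(logits)))  # ...yields exactly one discrete token
    return tokens

print(generate([1, 2, 3], max_new_tokens=8))
```

However large the model behind `next_token_logits`, generating n tokens requires n strictly sequential passes; that serial dependency, not model capacity, caps throughput.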
