draft model, token speculation, LLM inference speedup, Medusa decoding