Making Local LLM Go Brrr
How to run your local LLM well: fast, reliable, and with good quality. Key metrics:
Prefill speed: prompt/input tokens per second
Decode speed: generated tokens
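As a rough illustration of the two metrics, here is a minimal Python sketch that turns token counts and wall-clock timings into throughput figures. The function name, the sample numbers, and the assumption that decode speed is measured per second (like prefill speed) are all hypothetical and are not taken from the original article.

```python
def tokens_per_second(token_count: int, seconds: float) -> float:
    """Return throughput in tokens per second for one phase of a request."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return token_count / seconds


# Hypothetical measurements for a single request (not from the article).
prompt_tokens = 2048        # tokens in the prompt/input
prefill_seconds = 2.0       # time spent processing the prompt
generated_tokens = 256      # tokens produced by the model
decode_seconds = 8.0        # time spent generating them

# Prefill speed: prompt/input tokens per second.
prefill_speed = tokens_per_second(prompt_tokens, prefill_seconds)

# Decode speed: generated tokens per second (assumed unit, see note above).
decode_speed = tokens_per_second(generated_tokens, decode_seconds)

print(f"prefill: {prefill_speed:.1f} tok/s")
print(f"decode:  {decode_speed:.1f} tok/s")
```

Measuring the two phases separately matters because a long prompt with a short answer is dominated by prefill time, while a short prompt with a long answer is dominated by decode time.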