Making Local LLM Go Brrr

Covers 6 stories including AlexsJones/llmfit: 94 models. 30 providers. One command to find what runs on your hardware.

How to run your local LLM well: fast, reliable, and with good quality. Key metrics:
Prefill speed: prompt/input tokens per second
Decode speed: generated tokens
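As a rough illustration of the two metrics named above, the following sketch computes each from token counts and wall-clock timings. The `RunTiming` type, its field names, and the helper functions are hypothetical and not taken from the article. The source text for decode speed is cut off, so treating it as generated tokens per second is an assumption here.

```python
from dataclasses import dataclass


@dataclass
class RunTiming:
    """Hypothetical container for one inference run's measurements."""
    prompt_tokens: int       # tokens in the prompt/input
    generated_tokens: int    # tokens produced by the model
    prefill_seconds: float   # time spent processing the prompt
    decode_seconds: float    # time spent generating output tokens


def prefill_speed(t: RunTiming) -> float:
    """Prompt/input tokens processed per second."""
    if t.prefill_seconds <= 0:
        raise ValueError("prefill_seconds must be positive")
    return t.prompt_tokens / t.prefill_seconds


def decode_speed(t: RunTiming) -> float:
    """Generated tokens per second (assumed unit; the source is truncated)."""
    if t.decode_seconds <= 0:
        raise ValueError("decode_seconds must be positive")
    return t.generated_tokens / t.decode_seconds


if __name__ == "__main__":
    # Made-up numbers, purely to show the arithmetic.
    run = RunTiming(prompt_tokens=1000, generated_tokens=200,
                    prefill_seconds=2.0, decode_seconds=8.0)
    print(f"prefill: {prefill_speed(run):.1f} tok/s")
    print(f"decode:  {decode_speed(run):.1f} tok/s")
```

Keeping the two phases separate matters because a long prompt with a short answer is dominated by prefill time, while a short prompt with a long answer is dominated by decode time.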
