GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)
GGUF, GPTQ, AWQ, Q4_K_M, NF4: the quantization alphabet soup, explained for people who just want to fit a bigger model in the VRAM they have. Covers what each format is, the real VRAM math, and a decision table for which to use.