Ask HN: What's the best LLM that runs on a 24 GB VRAM GPU?
What’s the best model right now that outperforms Qwopus3.6-27B-v2-MTP-GGUF at 8-bit on a 24 GB VRAM GPU? I'm looking for real-world reviews. I found 4-bit quantization unusable in production.