Ask HN: What's the best LLM that runs on a 24 GB VRAM GPU?

Discussed on Hacker News

What's the best model right now that outperforms Qwopus3.6-27B-v2-MTP-GGUF at 8-bit on a 24 GB VRAM GPU? I'm looking for reviews from real use. I found 4-bit quantization unusable in production.
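
For context when comparing answers, here is a rough back-of-envelope sketch of how much memory the weights alone take at a given quantization. The function name `weight_gib` is made up for illustration. The bits-per-weight figures assume the GGUF Q8_0 and Q4_0 block layouts (32 weights plus one 16-bit scale per block, so 8.5 and 4.5 bits per weight); other quant types differ. KV cache, activations, and runtime overhead are not included, so real usage will be higher.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Assumed effective bits per weight for two GGUF quant types.
    for label, bpw in [("Q8_0", 8.5), ("Q4_0", 4.5)]:
        print(f"27B @ {label}: {weight_gib(27, bpw):.1f} GiB (weights only)")
```

Comparing the printed figures against the card's capacity gives a first hint of whether a given model and quant can sit fully in VRAM or will need partial CPU offload.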
