Local LLMs
Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation
🔥PyTorch Content type: Academic
Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization
🤖LLMs Content type: Academic
MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better
📝NLP Content type: News Content type: Blog