Mlx-optiq: per-layer mixed-precision LLM quantization for Apple Silicon

Quantize, fine-tune and serve LLMs locally on Apple Silicon (M1 to M5). MLX-native, no PyTorch, no cloud. On PyPI.
