⚡ Quantization - jhcha.oyo · Scour

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

vettedconsumer.com··Hacker News

Understanding Quantization-Aware Training: Gradients at Quantized Weights Bias to the Low-Loss Basin

🤖AI Academic

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA

🎮Game Engines

everylocalai.com··DEV

Qwen 3.6 27B AutoRound GGUF, need your feedback

huggingface.co··r/LocalLLaMA

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

💬LLMs News Blog

blog.google··Hacker News

Ollama 0.30 delivers faster NVIDIA GPU performance and wider hardware support

🎮Game Engines

alternativeto.net

Unsloth Gemma 4 QAT

Remove padding and multiple D2D copies for MTP by gaugarg-nv · Pull Request #24086 · ggml-org/llama.cpp

🎮Game Engines Code

github.com··r/LocalLLaMA

MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better

🎛️Fine-tuning News Blog

kaitchup.substack.com··r/LocalLLaMA

MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 TPS

⚡Speculative Decoding Blog

mimo.xiaomi.com··Hacker News, r/LocalLLaMA

local llm on laptop 780M GPU using llama + gemma 4 qat

💬LLMs Blog

alper.bearblog.dev

Here's a llama.cpp CLI Command builder.

llamabuilding.com··r/LocalLLaMA

Optimal Post-Training Quantization Scales and Where to Find Them

💬LLMs Academic

DeskDash - a free Windows tool to easily manage your GGUF files

✍️Prompt Engineering

gerry7.itch.io··r/LocalLLaMA

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

deemwar-products.github.io··Hacker News

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

✍️Prompt Engineering

local-llm.utop.workers.dev··Hacker News

Introducing the Third Generation of Apple’s Foundation Models

machinelearning.apple.com··Hacker News, r/apple

Quality Is Not a Safety Proxy Under Quantization

🔐Cryptography Academic

Launch HN: General Instinct (YC P26) – Frontier models on edge devices

✍️Prompt Engineering Discussion

news.ycombinator.com··Hacker News

1-bit and 1.58 bit LLM Benchmarking on Jetson Orin Nano Super | Bonsai LM

smolhub.com··r/LocalLLaMA
