🗜️ Quantization - bugrakadirhan · Scour

Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization

📐Model Architecture Academic

Less-relevant results

Daily Hacker News for 2026-06-06

daemonology.net

[AINews] FrontierCode: Benchmarking for Code Quality over Slop

⚡ML Inference News


MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 TPS

🖥️Systems ML Blog

mimo.xiaomi.com · Hacker News, r/LocalLLaMA

not much happened today | AINews

🤖Machine Learning

Xiaomi MiMo-V2.5-Pro Just Hit 1,000 Tokens Per Second!

Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models

🖥️Systems ML Academic

Apple rebuilt its on-device AI stack at WWDC 2026

🤖Machine Learning Blog

ziraph.com · Hacker News

OpenAI govt stake 🇺🇸, Google compute deal 🚀, Microsoft Scout launch 🤖

🧠Deep Learning

☕🤖 Claude Now Writes Most of Its Own Code

⚙️Systems Programming News Blog

theaibreak.substack.com · Substack

UniSVQ: 2-bit Unified Scalar-Vector Quantization

🖥️Systems ML Academic

MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better

⚡ML Inference News Blog

kaitchup.substack.com · r/LocalLLaMA

On Low-Bit Quantization Errors in Speaker Verification: Diagnostic and Mitigation

🖥️Systems ML Academic

Where to Host Your Open-Source Model (Under 10B Parameters)

⚡ML Inference

digitalocean.com

FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models

⚙️Model Training Academic

ScaleSweep: Accurate NVFP4 Post-Training Quantization of LLMs via Block Scale Initialization

🖥️Systems ML Academic

alexziskind1/model-shelf: Model Shelf is a local-first model resolver that helps AI agents and scripts find model weights on your own storage before downloading from Hugging Face. Point it at an internal SSD, NAS, external SSD, or Thunderbolt DAS, and it returns the best local path for GGUF, MLX, safetensors, Ollama, vLLM, and other local AI workflows.

🧠Deep Learning Code

Dew Drop - June 8, 2026 (#4685)

alvinashcraft.com

#068 - Apple runs Siri on Google's Gemini, OpenAI files a secret IPO at $852B, Xiaomi clocks 1,000 tps

⚡ML Inference

indiehacker.news

AI Week in Review 26.06.06

🧠Deep Learning News Blog

patmcguinness.substack.com · Substack
