Quantization

Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization

 📐Model Architecture  Content type: Academic
arxiv.org·
Less-relevant results

Daily Hacker News for 2026-06-06

 🔄MLOps
daemonology.net·

[AINews] FrontierCode: Benchmarking for Code Quality over Slop

 ML Inference  Content type: News
latent.space·

MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 TPS

 🖥️Systems ML  Content type: Blog

not much happened today | AINews

 🤖Machine Learning
news.smol.ai·

Xiaomi MiMo-V2.5-Pro Just Hit 1,000 Tokens Per Second!

 🔧MLIR
gizchina.com·

Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models

 🖥️Systems ML  Content type: Academic
arxiv.org·

Apple rebuilt its on-device AI stack at WWDC 2026

 🤖Machine Learning  Content type: Blog
ziraph.com·Hacker News

OpenAI govt stake 🇺🇸, Google compute deal 🚀, Microsoft Scout launch 🤖

 🧠Deep Learning
tldr.tech·

☕🤖 Claude Now Writes Most of Its Own Code

 ⚙️Systems Programming  Content type: News  Content type: Blog

UniSVQ: 2-bit Unified Scalar-Vector Quantization

 🖥️Systems ML  Content type: Academic
arxiv.org·

MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better

 ML Inference  Content type: News  Content type: Blog

On Low-Bit Quantization Errors in Speaker Verification: Diagnostic and Mitigation

 🖥️Systems ML  Content type: Academic
arxiv.org·

Where to Host Your Open-Source Model (Under 10B Parameters)

 ML Inference
digitalocean.com·

FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models

 ⚙️Model Training  Content type: Academic
arxiv.org·

ScaleSweep: Accurate NVFP4 Post-Training Quantization of LLMs via Block Scale Initialization

 🖥️Systems ML  Content type: Academic
arxiv.org·

alexziskind1/model-shelf: Model Shelf is a local-first model resolver that helps AI agents and scripts find model weights on your own storage before downloading from Hugging Face. Point it at an internal SSD, NAS, external SSD, or Thunderbolt DAS, and it returns the best local path for GGUF, MLX, safetensors, Ollama, vLLM, and other local AI workflows.

 🧠Deep Learning  Content type: Code
github.com·

Dew Drop - June 8, 2026 (#4685)

 🔄MLOps
alvinashcraft.com·

#068 - Apple runs Siri on Google's Gemini, OpenAI files a secret IPO at $852B, Xiaomi clocks 1,000 tps

 ML Inference
indiehacker.news·

AI Week in Review 26.06.06

 🧠Deep Learning  Content type: News  Content type: Blog
