Model Compression


Knowledge Distillation for Visual Autoregressive Models

 ⚙️AutoML  Content type: Academic
arxiv.org·

Shrinking a Neural Network Often Makes It Smarter

 💬Prompt Engineering
siliconopera.com·

A generalist biomedical vision-language model via multi-CLIP knowledge distillation

 💬Prompt Engineering  Content type: Academic
nature.com·

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

 ⚙️AutoML  Content type: News  Content type: Blog
blog.google··Hacker News

The latest Gemma 4 models use a training trick to slash their on-device memory footprint

 💬Prompt Engineering
androidauthority.com·

Pruned YOLOv8 ONNX INT8 Fails: 3 Fixes That Work

 💬Prompt Engineering  Content type: Blog  Content type: Discussion
tildalice.io·

Holding the FP8 Quality Ceiling at 8-Bit Weights and Activations: INT8 and GGUF Post-Training Quantization of Ideogram 4.0 for Consumer GPUs

 ⚙️AutoML  Content type: Academic
arxiv.org·

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

 💬Prompt Engineering

NVIDIA Squeezes Five Modalities into a Single Set of Weights

 Cryptocurrency
ai-brief.liziran.com·

Less-relevant results

The key steps that will enable organizations to scale Physical AI

 🧬AGI Self-Evolution
techradar.com·

Optimal Post-Training Quantization Scales and Where to Find Them

 💬Prompt Engineering  Content type: Academic
arxiv.org·

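Two government departments: by the end of 2026, key products such as humanoid robots will be the first to complete application validation and routine deployment in a set of representative scenarios - IT之家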

 Cryptocurrency
ithome.com·

OpenAI govt stake 🇺🇸, Google compute deal 🚀, Microsoft Scout launch 🤖

 🧬AGI Self-Evolution
tldr.tech·

UniSVQ: 2-bit Unified Scalar-Vector Quantization

 🔍Vector Databases  Content type: Academic
arxiv.org·

Physics-Distilled Neural Network enabled by Large Language Models for Manufacturing Process-Property Predictive Modeling

 💬Prompt Engineering  Content type: Academic
arxiv.org·

apple/coreai-models: Model export recipes, Python primitives, and Swift runtime utilities for on-device AI

 🧠Symbolic AI  Content type: Code
github.com··Hacker News

Joint Structural Pruning and Mixed-Precision Quantization for LLM Compression

 ⚙️AutoML  Content type: Academic
arxiv.org·

Finding Sparse Subnetworks in One Training Cycle via Progressive Magnitude-Based Pruning

 ⚙️AutoML  Content type: Academic
arxiv.org·

Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization

 ⚙️AutoML  Content type: Academic
arxiv.org·

LLM Research Papers: The 2026 List (January to May)

 🧬AGI Self-Evolution  Content type: News
