Model Compression


Trainable Smooth-Rotation Transforms with Learned Channel Scales for LLM Quantization

⚙️ AutoML | Content type: Academic
arxiv.org
Less-relevant results

iblameandrew/open-deepthink: Grok-heavy at the price of API cost. You choose the model. An unlimited army to think about your problem.

馃Multi-Agent SystemsContent type: Code
github.comr/LocalLLaMA

Heterophily-Aware Adaptive Knowledge Distillation for Hypergraph Neural Networks

⚙️ AutoML | Content type: Academic
arxiv.org

MODF-SIR: A Multi-agent Omni-modal Distilled Framework for Social Intelligence Reasoning

💬 Prompt Engineering | Content type: Academic
arxiv.org

Understanding Quantization-Aware Training: Gradients at Quantized Weights Bias to the Low-Loss Basin

⚙️ AutoML | Content type: Academic
arxiv.org

ScaleSweep: Accurate NVFP4 Post-Training Quantization of LLMs via Block Scale Initialization

⚙️ AutoML | Content type: Academic
arxiv.org

Dew Drop - June 8, 2026 (#4685)

💬 Prompt Engineering
alvinashcraft.com

LLM-Based User Personas for Recommendations at Scale

🔍 Vector Databases | Content type: Academic
arxiv.org

PADD: Path-Aligned Decompression Distillation for Non-Router Teacher to Guide MoE Student Learning

🔍 Vector Databases | Content type: Academic
arxiv.org

Cross-Modal Knowledge Distillation without Paired Data: Theoretical Foundation and Algorithm

⚙️ AutoML | Content type: Academic
arxiv.org

Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models

⚙️ AutoML | Content type: Academic
arxiv.org

Beyond Dark Knowledge: Mixup-Based Distillation for Reliable Predictions

🔍 Vector Databases | Content type: Academic
arxiv.org

TENP: Trapezoidal Expert Neuron Pruning For Mixture-of-Experts

💬 Prompt Engineering | Content type: Academic
arxiv.org

FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models

💬 Prompt Engineering | Content type: Academic
arxiv.org

LLMCodec: Adapting Video Codecs for Efficient Weight Compression of Large Language Models

⚙️ AutoML | Content type: Academic
arxiv.org

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters

⚙️ AutoML | Content type: Academic
arxiv.org

Value-and-Structure Alignment for Routing-Consistent Quantization of Mixture-of-Experts Models

⚙️ AutoML | Content type: Academic
arxiv.org

Unsupervised Continual Clustering via Forward-Backward Knowledge Distillation

⚙️ AutoML | Content type: Academic
arxiv.org

Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation

💬 Prompt Engineering | Content type: Academic
arxiv.org

Distilling first-principles accuracy into compact machine learning potentials for condensed-phase chemistry

💬 Prompt Engineering | Content type: Academic
arxiv.org
