📉 Model Quantization
INT8, Post-Training, QAT, Pruning, Model Compression
EyesOff: Why Some Models Quantize Better Than Others
ym2132.github.io · 1d · Discuss: Hacker News
🏎️ TensorRT
ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression
arxiv.org · 1d
🏎️ TensorRT
batteryphil/Primal-Discrete-LLM-Training:
Component | The "Secret Sauce"
Memory | Zero-Shadow Training: Training without FP16 master weights.
Math | Prime-Grid LUT: Better precision-per-bit than standard INT4.
Stability | Vote Buffering: Making Gradient Accumulation work for discrete weights.
github.com · 12h · Discuss: Hacker News
📊 Gradient Accumulation
BetaZero V2: A Diffusion Model for Setting Boulder Problems
evmojo37.substack.com · 12h · Discuss: Substack
📊 Gradient Accumulation
mradermacher/Qwen3-Coder-Next-REAM-GGUF
huggingface.co · 1d · Discuss: r/LocalLLaMA
📜 TorchScript
Microgpt.py
gist.github.com · 1d · Discuss: Hacker News
🔍 Type Checkers
NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models
arxiv.org · 4d · Discuss: Hacker News, r/LocalLLaMA
🏎️ TensorRT
Show HN: A segmentation model client-side via WASM
qtoolkit.dev · 22h · Discuss: Hacker News
🧩 Attention Kernels
Gibbs Measures from Deep Shaped Multilayer Perceptrons
link.aps.org · 23h
📊 Gradient Accumulation
Architectural and Mathematical Foundations of Machine Learning: A Rigorous Synthesis of Theory, Geometry, and Implementation
chizkidd.github.io · 1d · Discuss: Hacker News
🔗 Kernel Fusion
Quantization-Aware Distillation
ternarysearch.blogspot.com · 5d · Discuss: Hacker News, ternarysearch.blogspot.com
🎓 Model Distillation
Running Machine Learning on Arduino Nano
hackster.io · 2h
🎯 Tensor Cores
Antigravity: Beyond the Basics of AI Coding
dev.to · 3h · Discuss: DEV
🤖 AI Coding Tools
The 4 Mixture of Experts Architectures: How to Train 100B Models at 10B Cost
pub.towardsai.net · 23h
🎓 Model Distillation
Breaking the Tractability Barrier: A Generic Low-Level Solver for NP-Hard Instances (N=63) on Commodity 64-Bit Silicon
zenodo.org · 2h · Discuss: Hacker News
🎯 Tensor Cores
Diffusion Models for ARC-AGI: A Retrospective
christopherhwood.com · 1d · Discuss: Hacker News
🏎️ TensorRT
Building an Embedding API with Rust, Arm, and EmbeddingGemma on AWS Lambda
sobolev.substack.com · 1h · Discuss: Substack
🔄 ONNX
Float vs Int Confidence Scores: Why LLM Output Format Changes Model Behavior
hackernoon.com · 1d
🧠 BF16
A Hands-On Introduction to Restricted Boltzmann Machines with a Minimal NumPy Implementation
github.com · 1d · Discuss: DEV
🏎️ TensorRT
MAformer: A multivariate prediction framework with adaptive multi-scale decomposition and phase correction for water quality in aquaculture environments
sciencedirect.com · 1h
🧮 cuDNN