📉 Model Quantization
INT8, Post-Training, QAT, Pruning, Model Compression
Scoured 81922 posts in 496.6 ms
Quantization-Aware Distillation
ternarysearch.blogspot.com · 22h · Discuss: Hacker News
🎓 Model Distillation
Regularized Calibration with Successive Rounding for Post-Training Quantization
arxiv.org · 2d
🏎️ TensorRT
Main Content || Math ∩ Programming
jeremykun.com · 2h
🔗 Kernel Fusion
Writing a ONNX Neural Network Inference Engine from Scratch in C to run image classification with MobileNetV2
flexw.github.io · 6h · Discuss: r/C_Programming
⚡ ONNX Runtime
Adaptive Neuro-Symbolic Planning for smart agriculture microgrid orchestration in hybrid quantum-classical pipelines
dev.to · 15h · Discuss: DEV
⚡ ONNX Runtime
D$^2$Quant: Accurate Low-bit Post-Training Weight Quantization for LLMs
arxiv.org · 4d
🎯 Tensor Cores
25W06. Learning a language with the machine
z1nz0l1n.com · 14h
🛠 Ml-eng
Show HN: Model Training Memory Simulator
czheo.github.io · 15h · Discuss: Hacker News
📊 Gradient Accumulation
🥇Top AI Papers of the Week
nlp.elvissaravia.com · 10h
⚡ ONNX Runtime
Convolutional Neural Networks using Logarithmic Data Representation
dev.to · 20h · Discuss: DEV
🧮 cuDNN
Fastfood: Approximate Kernel Expansions in Loglinear Time
paperium.net · 23h · Discuss: DEV
🔗 Kernel Fusion
**Abstract**
freederia.com · 2d
🏎️ TensorRT
AI Sees And Understands Images Far More Efficiently With New Embedding Technique
quantumzeitgeist.com · 2d
👁️ Attention Optimization
Normal Map Compression Revisited
ludicon.com · 13h
⚡ CUDA Programming Patterns
Why are Neural Networks architected that way in the first place?
threads.championswimmer.in · 1d
📊 Gradient Accumulation
deepmriprep: voxel-based morphometry preprocessing via deep neural networks
nature.com · 2d
🏎️ TensorRT
Proposal: A Framework for Discovering Alien Physics via Optimal Compression
lesswrong.com · 2d
⚡ ONNX Runtime
Crafting the Eyes for Thinking Machines: Rewiring the Retina - The Anatomy of ViTStruct
pub.towardsai.net · 1d
👁️ Attention Optimization
How I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS
mohammedeabdelaziz.github.io · 1d · Discuss: Hacker News
🏎️ TensorRT
Writing an LLM from scratch, part 32b -- Interventions: gradient clipping
gilesthomas.com · 3d · Discuss: Hacker News
📊 Gradient Accumulation