📉 Model Quantization
Specific: INT8, Post-Training, QAT, Pruning, Model Compression
Frequency Matters: Fast Model-Agnostic Data Curation for Pruning and Quantization
arxiv.org · 18h
🏎️ TensorRT
OpenAI turns model compression into a talent hunt with its 16 MB "Parameter Golf" challenge
the-decoder.com · 4h
🏎️ TensorRT
Post Training Quantization for Efficient Dataset Condensation
arxiv.org · 1d
🏎️ TensorRT
Choosing the Right AI Model: Cost, Performance & Trade-offs
peggie7191.medium.com · 12h
🎓 Model Distillation
From Exact kNN to DiskANN: The Evolution of High-Performance Vector Search
hackernoon.com · 16h
⚡ ONNX Runtime
Divetoxx/Mandelbrot: True 24-bit BGR TrueColor. High-Precision Rendering (80-bit). Multi-threaded performance (OpenMP). True SSAA 8x8 (64 independent samples per pixel) direct RGB-space integration. G, B, R - The Red, Green, and Blue channels are calculated using sine and cosine waves
github.com · 1d · Discuss: r/programming
✂️ CUTLASS
From Reactive to Predictive: AI-Driven Optimization for ATE Performance & Reliability
semiengineering.com · 15h
⏱️ CUDA Events
Phonological complexity, speech style, and individual differences influence ASR performance for Tarifit
nature.com · 1d
📊 Gradient Accumulation
`quantized_matmul` performance degrades significantly with `group_size=32` vs `group_size=128` · Issue #3251
github.com · 2d · Discuss: r/LocalLLaMA
✂️ CUTLASS
50x Faster Post-Training
workshoplabs.ai · 5d · Discuss: Hacker News, r/LocalLLaMA
🏎️ TensorRT
Quantization Explained: Q4_K_M vs AWQ vs FP16 for Local LLMs
sitepoint.com · 5d
🎯 Tensor Cores
Less-relevant results
Analyzing the Performance of the K-Nearest Neighbors (KNN) Algorithm with Different Values of k
medium.com · 4d
🔗 Kernel Fusion
How post-training shapes legal representations: probing SCOTUS opinions across model families
lesswrong.com · 3d
🔄 ONNX
Show HN: We Built Private Post-Training and Inference for Frontier Models
workshoplabs.ai · 2d · Discuss: Hacker News
⚡ ONNX Runtime
MolmoPoint: Better pointing architecture for vision-language models
allenai.org · 7h
👁️ Attention Optimization
AI on HPC Workshop 2026
ai-on-hpc.github.io · 5h
⚡ ONNX Runtime
Fine-Tuning Phi-3 & Gemma 2: The Budget Path to GPT-4 Performance at a Fraction of the Cost
dev.to · 5d · Discuss: DEV
⚡ ONNX Runtime
Explainable artificial intelligence for early Alzheimer’s diagnosis using enhanced grey relational features and multimodal data
nature.com · 1d
👁️ Attention Optimization
How we optimized Dash's relevance judge with DSPy
dropbox.tech · 18h · Discuss: Hacker News
👁️ Attention Optimization
The Performance and Architecture of LeanStack AI
builder.aws.com · 6d · Discuss: DEV
⚡ ONNX Runtime