📉 Model Quantization
INT8, Post-Training, QAT, Pruning, Model Compression
Quantization-Aware Distillation · ternarysearch.blogspot.com · 2d · Discuss: Hacker News · 🎓 Model Distillation
Regularized Calibration with Successive Rounding for Post-Training Quantization · arxiv.org · 4d · 🏎️ TensorRT
Image Classification with Convolutional Neural Networks · dev.to · 12h · Discuss: DEV · 🧮 cuDNN
Main Content || Math ∩ Programming · jeremykun.com · 1d · 🔗 Kernel Fusion
Tutorial – What is a variational autoencoder? · jaan.io · 14h · Discuss: Hacker News · 🏎️ TensorRT
Autoregressive Image Generation with Masked Bit Modeling · arxiv.org · 3h · ⚡ Flash Attention
Quantized Tensor Train Compression For Turbulent Flow Simulation: O(log N) Scaling with Reynolds-Independent Bond Dimension · zenodo.org · 19h · Discuss: Hacker News · 🏎️ TensorRT
Faster AI Training Unlocked With New System For Massive Language Models · quantumzeitgeist.com · 18h · 🎯 Tensor Cores
Writing a ONNX Neural Network Inference Engine from Scratch in C to run image classification with MobileNetV2 · flexw.github.io · 1d · Discuss: r/C_Programming · ⚡ ONNX Runtime
Guide: Getting started with choosing a Machine Learning CLIP Model for Smart Search · immich-app/immich · github.com · 8h · 👁️ Attention Optimization
Expectation and Copysets · buttondown.com · 13h · Discuss: Hacker News · 🔄 ONNX
SAE Feature Matchmaking (Layer-to-Layer) · lesswrong.com · 3h · 🔄 ONNX
A Note on Flat Abstract Syntax Trees · gist.github.com · 13h · Discuss: Hacker News · 🔬 Static Analysis
Manufacturing QMS Software · samrian.com · 16h · Discuss: Hacker News · ⏱️ Benchmarking
Sense8 WorldToolKit Demo v1.01 : Sense8 : Free Download, Borrow, and Streaming · archive.org · 9h · 🏎️ TensorRT
the mathematics of compression in database systems · bitsxpages.com · 12h · 📈 Occupancy Optimization
Gated Attention & DeltaNets: The Missing Link for Long-Context AI · pub.towardsai.net · 2h · 👁️ Attention Optimization
Scale LLM fine-tuning with Hugging Face and Amazon SageMaker AI · aws.amazon.com · 15h · 🎓 Model Distillation
Geometrically Allocated Ads in AI Conversations · june.kim · 5h · Discuss: Hacker News · 🧩 Attention Kernels
Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy · developer.nvidia.com · 13h · 🏎️ TensorRT