Skip to main content
Scour
Browse
Getting Started
Login
Sign Up
You are offline. Trying to reconnect...
Copied to clipboard
Unable to share or copy to clipboard
๐ Model Quantization
Specific
INT8, Post-Training, QAT, Pruning, Model Compression
Filter Results
Timeframe
Fresh
Past Hour
Today
This Week
This Month
Feeds to Scour
Subscribed
All
Scoured
133
posts in
16.7
ms
baidu-baige/LoongForge: A modular, scalable, high-performance
training
framework for LLMs, VLMs, diffusion, and embodied
models
.
ย
๐๏ธ
TensorRT
github.com
ยท
4h
ยท
Hacker News
AMD Confirms FSR 4.1 Support for Radeon RX 7000 in July, RX 6000 GPUs Get it in 2027
ย
๐
Nsight
gizchina.com
ยท
6d
A cheap fix that saves the AI $400M dollars a year and brings 4B people online
ย
๐
ONNX
codecai.net
ยท
4d
ยท
Hacker News
CAR-SAM: Cross-Attention Reconstruction for
Post-Training
Quantization
of the Segment Anything Model
ย
๐งฉ
Attention Kernels
arxiv.org
ยท
2d
Whisper.cpp
vs Faster-Whisper: Why Speed Tests Lie
ย
๐
Profiling Tools
tildalice.io
ยท
4d
AMD surprises RDNA 3 and RDNA 2 owners with FSR 4.1 support, arriving in July for RX 7000 series
ย
๐
Nsight
tweaktown.com
ยท
6d
AMD FSR4 ยท Issue #2 ยท Korthos-Software/low_latency_layer
ย
๐
GPU Occupancy
github.com
ยท
2d
ยท
r/linux_gaming
AsymFlow Claims More Realistic AI Images by Moving Beyond Latent Diffusion
ย
๐๏ธ
TensorRT
firethering.com
ยท
4d
ยท
Hacker News
Rockchip unveils RK3572 processor with 4 TOPS NPU and LPDDR5X support
ย
๐ง
CPU Architecture
linuxgizmos.com
ยท
3d
Nonlinear Bipolar Compensation: Handling Outliers in
Post-Training
Quantization
ย
๐
Gradient Accumulation
arxiv.org
ยท
2d
AMD makes FSR 4 upscaling official for Radeon RX 7000- and 6000-series cards โ RDNA 3 and RDNA 2 chips will soon enjoy improved visuals
ย
๐ฎ
NVIDIA
tomshardware.com
ยท
6d
AI boom is pricing out PC builders and bootstrapped AI startups
ย
๐ค
AI Coding Tools
startupfortune.com
ยท
4d
AMD's FSR 4 coming to RDNA 2 could give the Xbox Series X a PS5 Pro-like upgrade
ย
๐ง
PTX
tweaktown.com
ยท
6d
MARR: Module-Adaptive Residual Reconstruction for
Low-Bit
Post-Training
Quantization
ย
๐๏ธ
TensorRT
arxiv.org
ยท
2d
TriAxialKV: Toward Extreme Low-Precision KV-Cache
Quantization
for Agentic Inference Tasks
ย
๐ฏ
Tensor Cores
arxiv.org
ยท
2d
KV Cache Optimization: 3x Faster LLM Inference on 24GB VRAM
ย
๐๏ธ
CUDA Optimization
tildalice.io
ยท
6d
PyTorch, rewritten from scratch in pure Rust
ย
๐
TorchScript
github.com
ยท
6d
ยท
Hacker News
Runtime-Certified Bounded-Error
Quantized
Attention
ย
๐๏ธ
Attention Optimization
arxiv.org
ยท
12h
Breaking Modality Heterogeneity in
Low-Bit
Quantization
for Large Vision-Language
Models
ย
๐๏ธ
TensorRT
arxiv.org
ยท
1d
LoopQ:
Quantization
for Recursive Transformers
ย
๐
Compiler Optimization
arxiv.org
ยท
2d
« Page 1
ยท
Page 3 »
Log in to enable infinite scrolling
Keyboard Shortcuts
Navigation
Next / previous item
j
/
k
Open post
o
or
Enter
Preview post
v
Post Actions
Love post
a
Like post
l
Dislike post
d
Undo reaction
u
Save / unsave
s
Recommendations
Add interest / feed
Enter
Not interested
x
Go to
Home
g
h
Interests
g
i
Feeds
g
f
Likes
g
l
History
g
y
Changelog
g
c
Settings
g
s
Browse
g
b
Search
/
Pagination
Next page
n
Previous page
p
General
Show this help
?
Submit feedback
!
Close modal / unfocus
Esc
Press
?
anytime to show this help