Quantization of LLMs
Scoured 3880 posts in 65.5 ms
D$^2$Quant: Accurate Low-bit Post-Training Weight Quantization for LLMs
arxiv.org · 4d
Large Language Models (LLMs)
Hybrid Gated Flow (HGF): Stabilizing 1.58-bit LLMs via Selective Low-Rank Correction
arxiv.org · 2d
🔧 Systems-level optimizations for LLM serving
Optimizing Communication for Mixture-of-Experts Training with Hybrid Expert Parallel
developer.nvidia.com · 5d
🔧 Systems-level optimizations for LLM serving
How Painkiller RTX Uses Generative AI to Modernize Game Assets at Scale
developer.nvidia.com · 2d
⚙️ AI Infrastructure Automation