Scour
📊 IVF Indexes
Specific
Inverted File Index, Vector Clustering, Quantization, ANN Search
Scoured 25597 posts in 74.5 ms
MUXQ: Mixed-to-Uniform Precision MatriX Quantization via Low-Rank Outlier Decomposition
🎯 Vector Quantization · arxiv.org · 6d
Quantization, LoRA, and the 8% Problem: Benchmarking Local LLMs for Production AI
🏗️ LLM Infrastructure · walsenburgtech.com · 1d · Hacker News
Quantization Meets Projection: A Happy Marriage for Approximate k-Nearest Neighbor Search
🔍 Vector Search Algorithms · vldb.org · 2d
TurboQuant - Extreme KV Cache Quantization · ggml-org llama.cpp
🔬 RaBitQ · github.com · 5d · r/LocalLLaMA
Breaking the Memory Wall: TurboQuant KV Cache Quantization on Apple Silicon
🖥️ Hardware Architecture · pub.towardsai.net · 4d
Optimal Privacy-Aware Co-Design of Quantizer and Controller in Networked Control Systems
🔬 RaBitQ · arxiv.org · 1h
DeFakeQ: Enabling Real-Time Deepfake Detection on Edge Devices via Adaptive Bidirectional Quantization
🔢 BitNet · arxiv.org · 1h
I Ran My KYB Engine at Three Quantization Levels. Accuracy Didn't Move. Cost Dropped 6x.
🏗️ LLM Infrastructure · walsenburgtech.com · 3d · Hacker News
Weight Group-wise Post-Training Quantization for Medical Foundation Model
🔬 RaBitQ · arxiv.org · 3d
REAM: Merging Improves Pruning of Experts in LLMs
🧩 MoE · arxiv.org · 6d
MoBiE: Efficient Inference of Mixture of Binary Experts under Post-Training Quantization
🧩 MoE · arxiv.org · 4d
STQuant: Spatio-Temporal Adaptive Framework for Optimizer Quantization in Large Multimodal Model Training
📦 Batch Embeddings · arxiv.org · 4d
QaRL: Rollout-Aligned Quantization-Aware RL for Fast and Stable Training under Training-Inference Mismatch
🧠 LLM Inference · arxiv.org · 3d
Geometric Properties of the Voronoi Tessellation in Latent Semantic Manifolds of Large Language Models
📉 Embeddings Optimization · arxiv.org · 4d
3DTurboQuant: Training-Free Near-Optimal Quantization for 3D Reconstruction Models
🔬 RaBitQ · arxiv.org · 5d
Zero-Shot Quantization via Weight-Space Arithmetic
🎯 Vector Quantization · arxiv.org · 6d
Initialisation Determines the Basin: Efficient Codebook Optimisation for Extreme LLM Quantization
🧠 LLM Inference · arxiv.org · 3d
Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees
🧠 LLM Inference · arxiv.org · 4d
DiffHDR: Re-Exposing LDR Videos with Video Diffusion Models
📊 Embeddings · arxiv.org · 5d
Rethinking Residual Errors in Compensation-based LLM Quantization
🧠 LLM Inference · arxiv.org · 3d