⚡ Quantization
Specific
GGUF, GPTQ, AWQ, int4, model compression
168 posts in 12.8 ms
RedToasty/llama.cpp_qts: Fixing --split-mode tensor, with different KV cache quantization types.
🔓 Open Source AI · github.com · 3d · r/LocalLLaMA
GPU Memory Math for LLMs: Formula That Tells You What Fits on Your GPU
🚀 LLM Deployment · theahmadosman.substack.com · 8h · Substack, r/LocalLLaMA
DiRotQ: Rotation-Aware Quantization for 4-bit Diffusion Transformers
⚙️ Transformers · arxiv.org · 2d
Why Shrinking an AI Model Often Makes It More Useful
🏢 LLM Adoption · siliconopera.com · 20h
Luce DFlash + PFlash on 7900XTX: Qwen3.6-27B at 2.24x decode and 3.05x prefill vs llama.cpp HIP
🎯 LLM Finetuning · lucebox.com · 2d · r/LocalLLaMA
DreamFast/Qwen3.6-27B-Uncensored-HauhauCS-Aggressive-Safetensor-Benchmark
🎯 LLM Finetuning · huggingface.co · 3d
Benchmarking llama.cpp's brand-new MTP support on Strix Halo
🎯 LLM Finetuning · calebcoffie.com · 2d · Hacker News
Ollama vs vLLM vs llama.cpp: Which Wins for Your Use Case
🚀 LLM Deployment · tildalice.io · 5d
Science and Technology News and Commentary: Aardvark Daily
💻 Local AI · aardvark.co.nz · 15h
HF downloader utility tampermonkey
🔓 Open Source AI · greasyfork.org · 2d · r/LocalLLaMA
Find bugs in YOUR code using OpenCode, Llama.cpp and Qwen3.6
💻 Local AI · wtarreau.blogspot.com · 3d · Lobsters, Hacker News, wtarreau.blogspot.com
Command A+: Making sovereign agentic capabilities available to all
🤖 AI Agents · cohere.com · 12h · Hacker News
Unleashing Blackwell's 4-bit: a surgical look at MXFP4 and NVFP4
🎯 LLM Finetuning · emre570.bearblog.dev · 1d
Can You Run LLMs Locally Without a GPU? I Tested 8 Models on Linux
🎯 LLM Finetuning · itsfoss.com · 5d · Hacker News
Building a Controllable Inference Platform on Kubernetes with AI Runway
🚀 LLM Deployment · techcommunity.microsoft.com · 2d
qskousen/ggufy: CLI/GUI tool for efficient and easy safetensors and gguf model conversion
🎯 LLM Finetuning · github.com · 3h · r/StableDiffusion
Tokenizer Tampering
🧪 Synthetic Data · hiddenlayer.com · 2d
What's in a GGUF, besides the weights - and what's still missing?
🧠 LLMs · nobodywho.ooo · 6d · Hacker News, r/LocalLLaMA
tvall43/Qwen3.5-14B-A3B-Claude-4.6-Opus-Reasoning-Distilled-reap-gguf at main
💻 Local AI · huggingface.co · 18h · r/LocalLLaMA
Qwen 3.7 Preview
🚀 LLM Deployment · news.ycombinator.com · 2d · Hacker News