Scour
🧠 Local llm
Scoured 76 posts in 6.9 ms
Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM
🧠 LLM Inference · deemwar-products.github.io · 5 days ago · Hacker News
Fixing a stuck Ollama runner and building a GPU watchdog
🏠 Self-Hosting · patrickmccanna.net · 2 days ago · Hacker News
martidu4/honey-ai: 🍯 All-in-one AI honeypot powered by local LLMs. SSH, HTTP, FTP, Telnet, SMTP, MySQL, Redis, Git, VNC, RDP — with canary tokens, tarpits, GZIP bombs, and threat intel reporting.
🤖 Qwen · Content type: Code · github.com · 8 hours ago · Hacker News
Qwen 3.6 27B AutoRound GGUF, need your feedback
⚡ LLM Quantization · huggingface.co · 1 day ago · r/LocalLLaMA
GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)
⚡ LLM Quantization · vettedconsumer.com · 4 days ago · Hacker News
On-device AI is a margin decision
🧠 LLM Inference · Content type: Blog · ziraph.com · 5 hours ago · Hacker News
Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency
🧠 LLM Inference · Content type: News, Blog · blog.google · 5 days ago · Hacker News
A system programmer’s guide to LLM inference
🧠 LLM Inference · Content type: Blog · blog.xiangpeng.systems · 2 days ago · Hacker News
Token4Token — pay-per-token inference on Gnosis + Swarm
🧠 LLM Inference · t4t.eth.link · 1 day ago · Hacker News
MoQ GGUFs and GSQ: Low-Bit GGUFs Are About to Get Much Better
🧠 LLM Inference · Content type: News, Blog · kaitchup.substack.com · 5 days ago · r/LocalLLaMA
KaiFelixBennett/gemma4-turboquant-rdna4: Run Gemma-4-31B at full 256K context on a $1,400 AMD RDNA4 GPU (gfx1201): TurboQuant KV cache + HIP-graph-safe Flash-Attention for llama.cpp, fully measured on real hardware.
🧠 LLM Inference · Content type: Code · github.com · 7 hours ago · Hacker News
local AI agents for Cursor with pre-tuned marketplace/commu
🔌 Model Context Protocol · locaible.com · 10 hours ago · Hacker News
Running Qwen 35B MoE at 450k Context on a Single 32GB GPU
🧠 LLM Inference · local-llm.utop.workers.dev · 3 days ago · Hacker News
Here's a llama.cpp CLI Command builder.
🧠 LLM Inference · llamabuilding.com · 1 day ago · r/LocalLLaMA
Purpose-built local AI agents
🤖 Qwen · Content type: Blog · samihonkonen.com · 2 days ago · Hacker News
Run (your largest) local models from your iPhone
🧠 LLM Inference · Content type: Blog · lmstudio.ai · 6 days ago · Hacker News, r/LocalLLaMA
Evaluating bigaspv2-5, a Flow Matching Alternative to SDXL
⚡ LLM Quantization · hackernoon.com · 12 hours ago
DeskDash - a free Windows tool to easily manage your GGUF files
⚡ LLM Quantization · gerry7.itch.io · 3 days ago · r/LocalLLaMA
Remove padding and multiple D2D copies for MTP by gaugarg-nv · Pull Request #24086 · ggml-org/llama.cpp
🧠 LLM Inference · Content type: Code · github.com · 5 hours ago · r/LocalLLaMA
Omnifs: APIs and data sources as files you can ls, cat, grep, and pipe
🕸️ WebAssembly · omnifs.dev · 1 day ago · Hacker News