GitHub here. You can follow the build instructions below as well. Change -DGGML_CUDA=ON to -DGGML_CUDA=OFF if you don't have a GPU or just want CPU inferen...
github.com · 168 weeks ago · DEV, r/GooglePixel, r/LocalLLaMA
Covered in 46 articles
GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)
vettedconsumer.com · 1 week ago · Hacker News
I switched from LM Studio to llama.cpp, and I'm never going back to a bloated wrapper
howtogeek.com · 1 week ago
Pairing Claude Code with Local Models
kdnuggets.com · 6 days ago
I built a private ChatGPT for my family
fulghum.io · 3 days ago · Hacker News
llama-bench skipped FA on capable GPUs — b9437 corrects it
dev.to · 7 hours ago · DEV
Run GLM-5.2 Locally: The Open Model Nobody Can Ban
dev.to · 3 days ago · DEV
The 0$ AI Achitecture Stack (2026)
dev.to · 1 week ago · DEV
Introducing LlamaStash: a zero-overhead, terminal-native llama.cpp launcher
dev.to · 2 weeks ago · DEV
How fast is LlamaStash? Overhead, throughput, and a fair comparison with Ollama and LM Studio
dev.to · 2 weeks ago · DEV
Gemma 4 Didn't Just Get Smarter. It Became a Different Kind of Model. Here's What the Agentic Numbers Actually Mean.
dev.to · 4 weeks ago · DEV
Building llama.cpp from source on a Dell Precision T5820 with an RTX 3090 Ti (after seven power cycles)
dev.to · 4 weeks ago · DEV
AI Gave the Solo Creator a Studio. The Studio Is Rented.
dev.to · 4 weeks ago · DEV
Google unveils DiffusionGemma, an AI model that breaks free of left-to-right processing
infoworld.com · 5 days ago
Benchmarking a real Futhark application
futhark-lang.org · 3 weeks ago
LLM, give me a JSON. Make no mistakes.
nobodywho.ooo · 2 weeks ago · Hacker News
What's in a GGUF, besides the weights - and what's still missing?
nobodywho.ooo · 4 weeks ago · Hacker News, r/LocalLLaMA
DeepSeek-V4-Flash makes LLM steering interesting again
seangoedecke.com · 4 weeks ago · Lobsters, Hacker News
Why and How to Run Local Models in Zed
zed.dev · 4 weeks ago · Hacker News
A Comma and a Question Mark
thetypicalset.com · 3 weeks ago · Hacker News
yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF
huggingface.co · 7 hours ago
bartowski/command-a-plus-05-2026-GGUF
huggingface.co · 1 day ago · r/LocalLLaMA
Qwen 3.6 27B AutoRound GGUF, need your feedback
huggingface.co · 1 week ago · r/LocalLLaMA
llama.cpp vs. vLLM: Choosing the right local LLM inference engine
developers.redhat.com · 3 days ago
Unsloth Gemma 4 QAT
unsloth.ai · 1 week ago
AI game jam starting today: Token Game Jam 1
itch.io · 4 weeks ago · r/vibecoding
How to Setup a Local Coding Agent on macOS
ikyle.me · 6 days ago · Hacker News
Improved performance and model support with GGUF
ollama.com · 1 week ago
How to Run an LLM Locally on Your Mobile Phone with QVAC and Expo
freecodecamp.org · 2 weeks ago
The LLM Inference Optimization Stack: From Quantization to Speculative Decoding Part 1
digitalocean.com · 3 weeks ago
Running LLMs locally on a Mac
danmackinlay.name · 3 weeks ago
Anthropic raises $65B in Series H at a $965B post-money valuation, releases Opus 4.8 and Dynamic Workflows
news.smol.ai · 3 weeks ago
not much happened today
news.smol.ai · 3 weeks ago
not much happened today
news.smol.ai · 4 weeks ago
Llama.cpp now has an official website: llama.app
llama.app · 2 weeks ago · Hacker News
147th airhacks tv: Local LLMs, LightMetal, ZSmith Agents, AI Rails, Saving Tokens
adambien.blog · 1 week ago
146th airhacks tv: Rust, Java 25, AI Agents, BCE, Web Components, zunit, zb
adambien.blog · 1 week ago
lightmetal: GPU LLM Inference From a Single Java 25 JAR
adambien.blog · 1 week ago
The teleskopio MCP Server and llama.cpp
rkiselenko.dev · 3 weeks ago
Hosting AI on your own computer? Learn how to do it
421.news · 3 days ago
Using local LLMs for agentic coding
blog.alexewerlof.com · 2 weeks ago
In other languages
Developers showcase local AI tools on a variety of devices (개발자들, 다양한 기기에서 로컬 AI 도구 선보여)
kite.kagi.com · 1 week ago
Compiling llama.cpp with CUDA support on Fedora 44: a complete guide (在 Fedora 44 上编译支持 CUDA 的 llama.cpp:完整指南)
insidentally.com · 3 weeks ago
Hosting AI on your own computer? Learn how to do it (¿Hostear la IA en tu propia computadora? Aprendé cómo hacerlo)
421.news · 3 days ago
AI News Daily 2026/5/19 (AI资讯日报 2026/5/19)
hex2077.dev · 4 weeks ago
DeepSeek-V4-Flash makes LLM steering interesting again (DeepSeek-V4-Flash로 LLM 조향(Steering)이 다시 흥미로워졌다)
news.hada.io · 4 weeks ago
What's in a GGUF besides the weights, and what's still missing? (GGUF에는 가중치 외에 무엇이 들어 있고, 아직 무엇이 빠져 있나?)
news.hada.io · 4 weeks ago