LocalLlama · Scour

Void-Compute/AMD-Ghost-Enviroment: Allows AMD GPUs to run CUDA-only software

github.com·2w·r/LocalLLaMA

Qwen3.6 (35B-A3B) with OpenCode. Running locally with llama.cpp

youtube.com·2w·r/LocalLLaMA

vignesan/mnemic-mre: 🧬 Zero-cost autonomous expert steering for Mixture-of-Experts language models — detect domains and amplify relevant experts at inference time, no training required.

github.com·2w·r/LocalLLaMA

raullenchai/Rapid-MLX: The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool parsers, prompt cache, reasoning separation, cloud routing. Drop-in OpenAI replacement. Works with Claude Code, Cursor, Aider.

github.com·2w·Hacker News, r/LocalLLaMA

What I got by 5060Ti 16GB + Qwen3.6-35B-A3B-UD-Q5_K_M

huggingface.co·2w·r/LocalLLaMA

Intel Arc Pro B70 Open-Source Linux Performance Against NVIDIA RTX & AMD Radeon AI PRO Review

phoronix.com·2w·Hacker News, r/LocalLLaMA

ibm-granite/granite-4.1-8b

huggingface.co·2w·r/LocalLLaMA

DeepSeek seeks $300M in first outside funding at $10B valuation

cryptobriefing.com·2w·r/LocalLLaMA

Qwen 3.6-35B-A3B on dual 5060 Ti with --cpu-moe: 21.7 tok/s at 90K context, with benchmarks vs dense 3.5 and Coder variant

llmkube.com·2w·r/LocalLLaMA

Use RLS in Postgres instead of app-level filtering by prvnsmpth · Pull Request #165

github.com·2w·r/LocalLLaMA

Qwen 3.6 35 UD 2 K_XL is pulling beyond its weight and quantization (No one is GPU Poor now)

github.com·2w·r/LocalLLaMA

Made a video game that uses local LLMs

quarter2.itch.io·9w·r/LocalLLaMA

Thunderbird Team Unveils Thunderbolt Self-Hostable AI Client

linuxiac.com·2w·r/LocalLLaMA

Pushing the Limits of On-Device Streaming ASR: A Compact, High-Accuracy English Model for Low-Latency Inference

arxiv.org·2w·r/LocalLLaMA

The case for AI “Cooperatives”

nunodonato.com·2w·Hacker News, r/LocalLLaMA

Cache TTL silently regressed from 1h to 5m around early March 2026, causing quota and cost inflation · Issue #46829

github.com·3w·Hacker News, r/ClaudeAI, r/LocalLLaMA

The only metric that matters: "[Qwen3.6-35B-A3B-GGUF] drew a better pelican riding a bicycle than Opus 4.7 did!"

news.ycombinator.com·2w·r/LocalLLaMA

Claude may require identity verification in some cases

support.claude.com·3w·Hacker News, r/LocalLLaMA, r/privacy

ResBM: Residual Bottleneck Models for Low-Bandwidth Pipeline Parallelism

arxiv.org·3w·r/LocalLLaMA

Mozilla Announces "Thunderbolt" As An Open-Source, Enterprise AI Client

phoronix.com·2w·Hacker News, r/LocalLLaMA
