Scour
LocalLlama · reddit.com
Siriusquirrel/SongGeneration: Memory-optimized SongGeneration (v2 Large) for 16GB VRAM GPUs. Features 8-bit µ-law KV-caching, fused layers, and SDPA/Triton integration.
github.com · 3w · r/LocalLLaMA
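The "8-bit µ-law KV-caching" in this entry refers to companding quantization: cache values are passed through the µ-law curve before uniform 8-bit rounding, which concentrates precision near zero where most activations live. A minimal NumPy sketch of the standard µ-law transform (µ = 255 and the [-1, 1] input range are assumptions here; the repo's actual cache layout and scaling may differ):

```python
import numpy as np

MU = 255.0  # companding constant; the standard choice for 8-bit µ-law

def mulaw_encode(x: np.ndarray) -> np.ndarray:
    """Compress values in [-1, 1] to uint8 via µ-law companding."""
    y = np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)  # still in [-1, 1]
    return np.round((y + 1.0) / 2.0 * 255.0).astype(np.uint8)

def mulaw_decode(q: np.ndarray) -> np.ndarray:
    """Invert the companding back to float32."""
    y = q.astype(np.float32) / 255.0 * 2.0 - 1.0
    return np.sign(y) * ((1.0 + MU) ** np.abs(y) - 1.0) / MU

x = np.float32([0.001, -0.01, 0.5, -1.0])
x_hat = mulaw_decode(mulaw_encode(x))
# small-magnitude values keep far more relative precision than with linear 8-bit
```

Compared to linear int8, the logarithmic spacing trades a little accuracy on large values for much better resolution on small ones.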
cmhamiche/kld-sweep: A cross-platform Python script to evaluate and compare GGUF quantizations of a model against its BF16/F16 baseline using KL Divergence and Perplexity, powered by llama.cpp.
github.com · 3w · r/LocalLLaMA
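The comparison this script performs can be illustrated directly: run the baseline and the quantized model over the same tokens, compute token-level KL divergence between their predictive distributions, and compute perplexity from next-token log-probs. A toy sketch with random logits standing in for the two models' outputs (the real tool drives llama.cpp; all names here are illustrative):

```python
import numpy as np

def log_softmax(logits: np.ndarray) -> np.ndarray:
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def mean_kl(base_logits: np.ndarray, quant_logits: np.ndarray) -> float:
    """Mean KL(base || quant) over positions, in nats."""
    lp, lq = log_softmax(base_logits), log_softmax(quant_logits)
    return float((np.exp(lp) * (lp - lq)).sum(axis=-1).mean())

def perplexity(logits: np.ndarray, tokens: np.ndarray) -> float:
    """exp(mean negative log-likelihood) of the observed next tokens."""
    lp = log_softmax(logits)
    nll = -lp[np.arange(len(tokens)), tokens]
    return float(np.exp(nll.mean()))

rng = np.random.default_rng(0)
base = rng.normal(size=(128, 512))                      # [positions, vocab]
quant = base + rng.normal(scale=0.1, size=base.shape)   # "quantization noise"
tokens = rng.integers(0, 512, size=128)
print(mean_kl(base, base))      # 0.0: identical models diverge nowhere
print(mean_kl(base, quant))     # > 0, growing with the noise scale
print(perplexity(quant, tokens))
```

Mean KL against the full-precision baseline is often a more sensitive quality signal than perplexity alone, since it compares whole distributions rather than only the probability of the observed token.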
ahb-sjsu/turboquant-pro: First open-source TurboQuant (Zandieh et al., ICLR 2026) for LLM KV cache compression. 5× memory reduction, 0.978 cosine similarity.
github.com · 3w · Hacker News, r/LocalLLaMA
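The "0.978 cosine similarity" figure is the usual way KV-cache reconstruction quality is reported: quantize the cached key/value vectors, dequantize, and measure cosine similarity against the originals. A toy sketch using plain per-vector symmetric int8 quantization as a stand-in (TurboQuant's actual transform is more involved; this only shows how the metric is computed):

```python
import numpy as np

def quantize_int8(kv: np.ndarray):
    """Per-vector symmetric int8 quantization of a [tokens, dim] cache."""
    scale = np.abs(kv).max(axis=-1, keepdims=True) / 127.0
    return np.round(kv / scale).astype(np.int8), scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

def mean_cosine(a: np.ndarray, b: np.ndarray) -> float:
    num = (a * b).sum(axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    return float((num / den).mean())

rng = np.random.default_rng(0)
kv = rng.normal(size=(1024, 128)).astype(np.float32)  # stand-in K or V cache
q, s = quantize_int8(kv)
kv_hat = dequantize(q, s)
print(mean_cosine(kv, kv_hat))  # close to 1.0; 4x smaller than fp32, ignoring scales
```

Reaching 5× compression while keeping cosine similarity this high is what requires the cleverer transform; naive int8 tops out around 4× before per-vector scale overhead.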
JordiSilvestre/Spectral-AI: "O(log N) MoE expert routing via RT Core ray tracing. BVH traversal replaces matrix multiplication in neural language models."
github.com · 3w · Hacker News, r/LocalLLaMA
Gemma 4 vs Qwen3.5: benchmarking quantized local LLMs on Go coding
msf.github.io · 3w · r/LocalLLaMA
Finetuned a 270M model on CPU only - full weights, no LoRA, no GPU
promptinjection.net · 3w · r/LocalLLaMA
Using OCR models with llama.cpp (by ngxson)
huggingface.co · 3w · r/LocalLLaMA
RyjoxTechnologies/Octopoda-OS: The open-source memory operating system for AI agents. Persistent memory, semantic search, loop detection, agent messaging, crash recovery, and real-time observability.
github.com · 4w · Hacker News, r/LocalLLaMA
Huawei’s Atlas 300I Duo offers 96GB VRAM for local LLMs under $1500. Is this the budget VRAM breakthrough?
hardware-corner.net · 3w · r/LocalLLaMA
VoxCPM2 is out - 2B params, 30 languages. Major upgrade over VoxCPM1.5.
huggingface.co · 3w · r/LocalLLaMA
AuthBits/webmcp: A lightweight, prompt-driven MCP web research server for high-quality LLM-powered information extraction.
github.com · 3w · Hacker News, r/LocalLLaMA
From 1939 to voice clones in 3 seconds — the full AI speech timeline and where it's heading
youtu.be · 3w · r/LocalLLaMA
pwilkin/catapult: A Tauri-based cross-platform launcher / updater / model manager for llama.cpp
github.com · 3w · r/LocalLLaMA
atomicmemory/llm-wiki-compiler: The knowledge compiler. Raw sources in, interlinked wiki out. Inspired by Karpathy's LLM Wiki pattern.
github.com · 4w · Hacker News, r/LLM, r/LocalLLaMA, r/PromptEngineering, r/artificial
AI Cybersecurity After Mythos: The Jagged Frontier
aisle.com · 3w · Hacker News, r/BetterOffline, r/LocalLLaMA, r/singularity
The Mythos Preview "Safety" Gaslight: Anthropic is just hiding insane compute costs. Open models are already doing this.
youtube.com · 3w · r/LocalLLaMA
ggml: backend-agnostic tensor parallelism by JohannesGaessler · Pull Request #19378
github.com · 12w · r/LocalLLaMA
gemma-4-31b-abliterated-Q4_K_M.gguf · paperscarecrow/Gemma-4-31B-it-abliterated at main
huggingface.co · 3w · r/LocalLLaMA
vocab: add gemma4 tokenizer tests, fix edge case by pwilkin · Pull Request #21534
github.com · 3w · r/LocalLLaMA
New Model! LGAI-EXAONE/EXAONE-4.5-33B
huggingface.co · 3w · r/LocalLLaMA