LocalLlama
reddit.com
Attention Is All You Need, But All You Can't Afford
codeberg.org · 4w · r/LocalLLaMA, r/artificial
OpenAI, Anthropic, Google Unite to Combat Model Copying in China
bloomberg.com · 4w · Hacker News, Hacker News, r/LocalLLaMA
Emotion Concepts and their Function in a Large Language Model
transformer-circuits.pub · 4w · DEV, Hacker News, r/LocalLLaMA, r/artificial, r/singularity
I benchmarked 37 LLMs on MacBook Air M5 32GB — full results + open-source tool to benchmark your own Mac
github.com · 4w · r/LocalLLaMA
Rtalabs-ai/aura-research: LLM-powered research knowledge base — compile raw documents into a living wiki with persistent agent memory and RAG retrieval.
github.com · 4w · r/LocalLLaMA
trevorgordon981/alfred-abliterate: Residual-stream abliteration toolkit for MoE models (Qwen3.5-397B-A10B) on Apple Silicon. Removes PRC-aligned content policies from local inference. Tested on Mac Studio M3 Ultra 512GB.
github.com · 4w · r/LocalLLaMA
ai-infos/vllm-gfx906-mobydick: A high-throughput and memory-efficient inference and serving engine for LLMs - Optimized for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
github.com · 4w · r/LocalLLaMA
If an Agent only works on my machine, that's usually state leakage, not bad prompting
github.com · 4w · r/LocalLLaMA, r/PromptEngineering, r/opensource
gemma-4-26B-A4B-it-UD-IQ4_XS.gguf · unsloth/gemma-4-26B-A4B-it-GGUF at main
huggingface.co · 4w · r/LocalLLaMA
a-ghorbani/pocketpal-ai: An app that brings language models directly to your phone.
github.com · 4w · r/LocalLLaMA
I made a 35% REAP of 397B with potentially usable quality in 96GB GPU
huggingface.co · 4w · r/LocalLLaMA
intelb70vsrtx4070superdata/README.md at main · hungryblocko/intelb70vsrtx4070superdata
github.com · 4w · r/LocalLLaMA, r/hardware
lechmazur/nyt-connections: Benchmark that evaluates LLMs using 759 NYT Connections puzzles extended with extra trick words
github.com · 13w · r/LocalLLaMA, r/LocalLLaMA, r/singularity
Embarrassingly Simple Self-Distillation Improves Code Generation
arxiv.org · 4w · Lobsters, Hacker News, Hacker News, r/LocalLLaMA
REPRODUCE.md · nohurry/gemma-4-26B-A4B-it-heretic-GUFF at main
huggingface.co · 4w · r/LocalLLaMA
JohannaWeb/Monarch: Custom Small Language Model Acting as falcon expert
github.com · 4w · r/LocalLLaMA
I let Gemma 4 (31B) debate Gemini 3 Deepthink. The result is insane.
litter.catbox.moe · 4w · r/LocalLLaMA, r/singularity
Gemma-4-31B NVFP4 inference numbers on 1x RTX Pro 6000
huggingface.co · 4w · r/LocalLLaMA
nvidia/nemotron-ocr-v2
huggingface.co · 4w · Hacker News, r/LocalLLaMA
Gemma 4 31B at 256K Full Context on a Single RTX 5090 — TurboQuant KV Cache Benchmark
github.com · 4w · r/LocalLLaMA