LocalLlama
reddit.com
I spent 96 hours setting up dual DGX Sparks and a Mac Studio M3 Ultra for the same 397B model. Neither won.
alooftwaffle.substack.com · 5w · r/LocalLLaMA
RYS Part 3: LLMs think in geometry, not language — new results across 4 models, including code and math
dnhkng.github.io · 5w · Hacker News, r/LocalLLaMA
Google's TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x
arstechnica.com · 6w · Hacker News, r/LocalLLaMA
Who is liable when the AI decides?
aifactoryinsider.com · 6w · Hacker News, r/LocalLLaMA
Standard LoRA is quietly losing 68% of quality on FP8 hardware and most people have no idea
koscak.ai · 5w · r/LocalLLaMA
kevin-hs-sohn/memaware: Benchmark for measuring memory awareness in AI agents — the ability to surface relevant past context without being asked
github.com · 5w · r/LocalLLaMA
TurboQuant for weights: near‑optimal 4‑bit LLM quantization with lossless 8‑bit residual
github.com · 5w · r/LocalLLaMA
chromadb/context-1
huggingface.co · 5w · r/LocalLLaMA
soy-tuber/nemotron: Local multimodal LLM gateway unifying NVIDIA Nemotron models on a single GPU
github.com · 5w · r/LocalLLaMA
Judge blocks Pentagon’s effort to ‘punish’ Anthropic by labeling it a supply chain risk
cnn.com · 5w · Hacker News, r/LocalLLaMA
lightningpixel/modly: Desktop app to generate 3D models from images using local AI — runs entirely on your GPU
github.com · 5w · r/LocalLLaMA
Apple stopped selling 512gb URAM mac studios, now the max amount is 256GB!
apple.com · 5w · r/LocalLLaMA
1 Million Tokens Per Second: Qwen 3.5 27B on GKE with B200 GPUs
medium.com · 5w · Hacker News, r/LocalLLaMA
mistralai/Voxtral-4B-TTS-2603
huggingface.co · 5w · r/LocalLLaMA
CohereLabs/cohere-transcribe-03-2026
huggingface.co · 5w · r/LocalLLaMA
nokodo-labs/os1: the next-gen open source AI platform
github.com · 8w · r/LocalLLaMA
Quantization from the ground up
ngrok.com · 6w · Lobsters, Hacker News, r/LocalLLaMA, r/programming
steipete/mcporter: Call MCPs via TypeScript, masquerading as simple TypeScript API. Or package them as cli.
github.com · 14w · Hacker News, r/LocalLLaMA
kevdogg102396-afk/free-claude-code: Nemo Code — FREE Claude Code alternative. NVIDIA open models + Claude Code CLI framework. One command install. Zero cost. By ClawdWorks.
github.com · 5w · r/LocalLLaMA, r/coding
Level1techs initial review of ARC B70 for Qwen and more. (He has 4 B70 pros)
youtu.be · 5w · r/LocalLLaMA