LocalLlama
reddit.com
Local models are a godsend when it comes to discussing personal matters
reddit.com · 3w · r/LocalLLaMA
michiosw/oamc: Local-first LLM wiki for research workflows with Obsidian, a dashboard, and a macOS menubar runtime.
github.com · 3w · r/LocalLLaMA
MiniMax released MMX-CLI: one CLI for text, image, video, speech, music, vision, and web search — no MCP server needed. Works natively in Claude Code, Cursor, O...
aiuniverse.news · 3w · r/LocalLLaMA
gtausa197-svg/-Project-Nord-Spiking-Neural-Network-Language-Model: The first pure SNN language model trained from scratch with a fully original architecture. 618M parameters • 93% sparsity • Runs on phone • Online learning via STDP • $260 total training cost
github.com · 3w · r/LocalLLaMA, r/OpenAI
Aryagm/dflash-mlx: Exact speculative decoding on Apple Silicon, powered by MLX.
github.com · 3w · r/LocalLLaMA
mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) by ngxson · Pull Request #19441
github.com · 3w · r/LocalLLaMA
patilyashvardhan2002-byte/lazy-moe: The GPU-free LLM inference engine. Combines lazy expert loading + TurboQuant KV compression to run models that shouldn't fit on your hardware. Built from scratch, fully local, zero cloud.
github.com · 3w · r/LocalLLaMA
Turning my phone into a local AI server (open source project update)
github.com · 3w · r/LocalLLaMA
Unsloth MiniMax M2.7 quants just finished uploading to HF
huggingface.co · 3w · r/LocalLLaMA
ai-dynamo/aitune: NVIDIA AITune is an inference toolkit designed for tuning and deploying deep learning models with a focus on NVIDIA GPUs.
github.com · 3w · r/LocalLLaMA
MiniMax-M2.7 GGUF Quants — Full Set (Q2_K to Q8_0 + BF16)
huggingface.co · 3w · r/LocalLLaMA
MiniMax-M2.7 Q3_K_L & Q8_0 — First GGUF quants, Apple Silicon (M3 Max 128GB)
huggingface.co · 3w · r/LocalLLaMA
LICENSE · MiniMaxAI/MiniMax-M2.7 at main
huggingface.co · 3w · r/LocalLLaMA
MiniMax M2.7 Weights Released
huggingface.co · 3w · Hacker News, r/LocalLLaMA
A Mac Studio for Local AI
spicyneuron.substack.com · 3w · Substack, r/LocalLLaMA
Analysis of spilling MoE weights onto SSD: GLM-5 is surprisingly usable even with over 1/3rd of weights left on SSD, due to caching dynamics
rentry.org · 3w · r/LocalLLaMA
chat_template.jinja · froggeric/Qwen3.5-35B-A3B-Uncensored-FernflowerAI-MLX-8bit at main
huggingface.co · 3w · r/LocalLLaMA
Simulating human cognition in LLM agents: a free 126K-word book covering memory decay, emotion engines, personality drift, and 12 other cognitive subsystems
github.com · 3w · r/LocalLLaMA
ShaikhWarsi/free-ai-tools: Curated list of free and low-cost AI tools, LLM APIs, IDEs, agents, and infrastructure for building real AI apps
github.com · 3w · r/LocalLLaMA, r/PromptEngineering, r/artificial
Siriusquirrel/SongGeneration: Memory-optimized SongGeneration (v2 Large) for 16GB VRAM GPUs. Features 8-bit µ-law KV-caching, fused layers, and SDPA/Triton integration.
github.com · 3w · r/LocalLLaMA