Local LLMs


Doubling Qwen3.6-27B on One RTX 3090: ollama → llama.cpp + MTP, Lever by Lever (35.7 → 80.2 tok/s)

 🖥️Local AI  Content type: Blog
dev.to··DEV

I switched from LM Studio to llama.cpp, and I'm never going back to a bloated wrapper

 🖥️Local AI
howtogeek.com·

LM Studio now lets you use your iPhone to talk to local models on your Mac

 🖥️Local AI
9to5mac.com··r/apple

On-device AI agents hit a hard memory limit. Apple's new architecture routes around it.

 ☁️GCP
venturebeat.com·

alexziskind1/model-shelf: Model Shelf is a local-first model resolver that helps AI agents and scripts find model weights on your own storage before downloading from Hugging Face. Point it at an internal SSD, NAS, external SSD, or Thunderbolt DAS, and it returns the best local path for GGUF, MLX, safetensors, Ollama, vLLM, and other local AI workflows.

 🖥️Local AI  Content type: Code
github.com·

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

 🔓Open Source AI  Content type: News  Content type: Blog
blog.google··Hacker News

I built a fully local AI coding assistant in Windows with Ollama and VS Code

 🧠LLM Tooling
howtogeek.com·

On-Device AI in SwiftUI Apps

 Quantization  Content type: Blog
dev.to··DEV

Google’s latest on-device AI model is custom-made for your laptop

 🔓Open Source AI
androidauthority.com·

zaydmulani09/mnemo: Local-first AI memory layer for any LLM. Persistent knowledge graph, entity extraction, semantic retrieval. Works with Ollama, OpenAI, Anthropic, or any OpenAI-compatible backend.

 🦙Ollama  Content type: Code
github.com··Hacker News

Running a Local AI Engineering Agent with deepstrain: A Step-by-Step Tutorial

 🧠LLM Tooling  Content type: Blog
dev.to··DEV

shoo99/paper-rag: A private, fully-local RAG over your own PDFs: BGE-M3 + embedded Qdrant + a local LLM via Ollama. ~150 lines, nothing leaves your machine.

 🤖Large Language Models  Content type: Code
github.com··DEV

sancheznot/Godot-AI-Assistant: Golem-AI is an AI-powered editor assistant for Godot 4. Chat with local or cloud models (Ollama, LM Studio, OpenAI, Anthropic, Gemini, Cursor) directly from an editor dock.

 🖥️Local AI  Content type: Code
github.com··DEV

Run Coding Agents on Local AI — Zero Cloud, Full Control

 🧠LLM Tooling  Content type: Blog
dev.to··DEV

NVIDIA and Apple Solved the Hardware. Here's What's Left to Build.

 Quantization  Content type: Blog
dev.to··DEV

I Connected PewDiePie's Odysseus to a Cloud Memory Stack — Zero API Costs, Persistent Memory

 🧠LLM Tooling  Content type: Blog
dev.to··DEV

Your AI Vendor Says 'Trust Us' with Your Data. There's a Better Option.

 🔒Privacy  Content type: Blog
dev.to··DEV

I Benchmarked 3 Local LLMs on My Laptop — Here's What the Numbers Actually Show

 🧠LLM  Content type: Blog
dev.to··DEV

Running AI Locally: Skip the API Bills and Build Faster

 🖥️Local AI  Content type: Blog
dev.to··DEV

I kept using Claude Code. Added one thing to it. Cut AI engineering costs by 62%.

 🤖Large Language Models  Content type: Blog
dev.to··DEV
