Open Source LLMs

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

 🧠LLMs  Content type: News  Content type: Blog
blog.google··Hacker News

Doubling Qwen3.6-27B on One RTX 3090: ollama → llama.cpp + MTP, Lever by Lever (35.7 → 80.2 tok/s)

 🛠️AI Tooling  Content type: Blog
dev.to··DEV

Neo-X7/Neo-AI: A fully offline AI assistant powered by Ollama. Stores and retrieves conversations using SQLite + LanceDB vector search. No cloud. No API keys. Runs entirely on your machine.

 🛠️AI Tooling  Content type: Code
github.com··DEV

Integrate on-device AI models into your app using Core AI - WWDC26 - Videos

 💡AI

Ask HN: Is it feasible to run a model on device for complete privacy?

 🧠LLMs  Content type: Discussion

Token4Token — pay-per-token inference on Gnosis + Swarm

 🛠️AI Tooling

Show HN: Audit any AI/data pairing with Veritrooper

 🧠LLMs
Less-relevant results

Node.js Annual Releases, Terraform 1.15, Gemma 4 Multimodal

 Vibe Coding  Content type: Discussion
thedevsignal.com··DEV

No Cloud, No Cost: Build an Offline Visual AI Agent with Gemma 4

 🛠️AI Tooling  Content type: Blog
dev.to··DEV

Introducing the Google Colab CLI

 ⚙️Workflow Automation  Content type: Blog

local AI agents for Cursor with pre-tuned marketplace/commu

 🛠️AI Tooling
locaible.com··Hacker News

Claude Now Writes 80% of Its Own Code — Anthropic's Self-Improvement Milestone Arrives Faster Than Expected

 🔶Claude

Job Searcher

 💡AI  Content type: Blog
huggingface.co·

How I benchmarked a 100% local RAG pipeline to 9/9 (zero API keys)

 📚RAG
buy.polar.sh··DEV

Large companies can add a local LLM filter layer to considerably reduce their AI costs

 🧠LLMs

Project Log #2: The AI Phone Agent Has a Repo

 🛠️AI Tooling  Content type: Blog
dev.to··DEV

Introducing Gemma 4 12B: a unified, encoder-free multimodal model

 💡AI  Content type: Blog

Purpose-built local AI agents

 ✍️Prompt Engineering  Content type: Blog

fix(memory-core): filter stale recall entries in REM harness preview · openclaw/openclaw@92418fc

 🛠️AI Tooling  Content type: Code
github.com·

ComfyUI NVFP4 in 2026: 3× Faster Image Generation on RTX 50-Series (and the Right Format for RTX 40-Series)

 💡AI  Content type: Blog
dev.to··DEV
