🖥️ Local AI - buckman

alexziskind1/model-shelf: Model Shelf is a local-first model resolver that helps AI agents and scripts find model weights on your own storage before downloading from Hugging Face. Point it at an internal SSD, NAS, external SSD, or Thunderbolt DAS, and it returns the best local path for GGUF, MLX, safetensors, Ollama, vLLM, and other local AI workflows.

🔓Open Source AI Code

github.com·

Tales of an Ollama Honeypot (Part 3): More Traffic, More Findings

🍯Deception Technology

posts.inthecyber.com·

LM Studio now lets you use your iPhone to talk to local models on your Mac

💻Local LLMs

9to5mac.com··r/apple

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

🔓Open Source AI News Blog

blog.google··Hacker News

I built a fully local AI coding assistant in Windows with Ollama and VS Code

🧠LLM Tooling

howtogeek.com·

How to Tune llama.cpp --n-gpu-layers: A Practical VRAM Guide (2026)

🟩Nvidia Blog

dev.to··DEV

Google Gemma 4 12B: Architecture, Benchmarks, Access, and Hands-on Guide for Developers

🔓Open Source AI Blog

analyticsvidhya.com·

zaydmulani09/mnemo: Local-first AI memory layer for any LLM. Persistent knowledge graph, entity extraction, semantic retrieval. Works with Ollama, OpenAI, Anthropic, or any OpenAI-compatible backend.

🦙Ollama Code

github.com··Hacker News

Open-LLM-VTuber Review: Offline AI Companion with Live2D

🧠LLM Blog

dev.to··DEV

shoo99/paper-rag: A private, fully-local RAG over your own PDFs: BGE-M3 + embedded Qdrant + a local LLM via Ollama. ~150 lines, nothing leaves your machine.

🤖Large Language Models Code

github.com··DEV

Running a Local AI Engineering Agent with deepstrain: A Step-by-Step Tutorial

🧠LLM Tooling Blog

dev.to··DEV

sancheznot/Godot-AI-Assistant: Golem-AI is an AI-powered editor assistant for Godot 4. Chat with local or cloud models (Ollama, LM Studio, OpenAI, Anthropic, Gemini, Cursor) directly from an editor dock.

♟️Game Theory Code

github.com··DEV

Run Coding Agents on Local AI — Zero Cloud, Full Control

🧠LLM Tooling Blog

dev.to··DEV

How to Tune --n-gpu-layers for Your VRAM Budget

📊Compute Markets Blog

dev.to··DEV

Running AI Locally: Skip the API Bills and Build Faster

💻Local LLMs Blog

dev.to··DEV

I built a self-hosted AI workspace for macOS — meet Odysee

🧠LLM Tooling Blog

dev.to··DEV

Run Gemma-4 12B on WSL2 with llama.cpp

🔓Open Source AI Blog

dev.to··DEV

Built an AI-Powered Spring Boot Log Analyzer Using RAG + Ollama

🤖Large Language Models Blog

dev.to··DEV


I switched from LM Studio to llama.cpp, and I'm never going back to a bloated wrapper

Doubling Qwen3.6-27B on One RTX 3090: ollama → llama.cpp + MTP, Lever by Lever (35.7 → 80.2 tok/s)
