🖥️ Ollama - moderation

DevMando/MandoCode: A .NET C# CLI Coding Agent powered by Ollama + Semantic Kernel and RazorConsole. Run locally or in the cloud. Refactors code, proposes diffs, and updates your project safely — no API keys required.

🔧MCP Code

github.com·Hacker News

Ollama's highest performance on Apple Silicon yet with MLX

🧠LLMs Blog

ollama.com

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

🧠LLMs

vettedconsumer.com·Hacker News

6. Air-Gapped Claude Code - The Claude Code SRE Handbook

💻AI

har-ki.github.io·Hacker News

How to Run an LLM Locally: Ultimate Guide to Local AI 2026

🧠LLMs Blog

cswithsanjay.blogspot.com

From Chatbot Hallucinations to Deterministic Agents: Forcing Local LLMs to Run Production-Grade…

🧠LLMs Blog

medium.com

local llm on laptop 780M GPU using llama + gemma 4 qat

🧠LLMs Blog

alper.bearblog.dev

What Ollama Reveals About Local AI, Agents, and Open Models

🤖AI software development Blog

odsc.medium.com

I gave a local LLM access to my Docker containers, and it replaced my monitoring scripts

💻Wezterm

xda-developers.com

I've tested so many desktop AI tools, but Hermes with Ollama is my new favorite - here's why

🤖AI Agents News Tutorial

zdnet.com

lightmetal: GPU LLM Inference From a Single Java 25 JAR

💻Wezterm Blog

adambien.blog

MTP Isn't Always a Win: 1.95x on My 3090, but Speculative Decoding Is Hardware-Dependent

💻Wezterm Blog

bric.pe.kr·DEV·Cited by 1 article

Ask HN: What's the best LLM model that on a 24 GB VRAM GPU?

🧠LLMs Discussion

news.ycombinator.com·Hacker News

fix: resolve managed secretref provider auth (#92235) · openclaw/openclaw@9386d62

🔌LSP Code Release

github.com

Self-hosted remote access for Ollama without complicated setup

🔌LSP

oab.arc-i.co.uk·r/selfhosted

Less-relevant results

DiffusionGemma: 4x Faster Text Generation

🧠LLMs News Blog 19

blog.google·Hacker News, r/LocalLLaMA, r/singularity·Cited by 21 articles

Fixing a stuck Ollama runner and building a GPU watchdog

💻Wezterm

patrickmccanna.net·Hacker News

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA

Unsloth Minimax M3 GGUF

Ollama 0.30 delivers faster NVIDIA GPU performance and wider hardware support
