🏠 Local AI - bigkevuk · Scour

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

🛠️Developer Tooling

deemwar-products.github.io · Hacker News

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA

everylocalai.com · DEV

147th airhacks tv: Local LLMs, LightMetal, ZSmith Agents, AI Rails, Saving Tokens

🤖AI Agents Blog

adambien.blog

Ollama 0.30 delivers faster NVIDIA GPU performance and wider hardware support

alternativeto.net

I built an open-source persistent memory layer for AI coding agents

💻AI Coding Code

github.com · r/GithubCopilot

Using Scikit-LLM with Open-Source LLMs

📊Data Science

machinelearningmastery.com

LLM Routing: From Strategy Selection to Production Architecture

⚙️AI Automation Blog

A system programmer’s guide to LLM inference

🧠LLMs Blog

blog.xiangpeng.systems · Hacker News

On-device AI is a margin decision

💾ARM Blog

ziraph.com · Hacker News

Improved performance and model support with GGUF

💾ARM Blog

You don't need Copilot for code completion, try this instead

mistral.ai · r/GithubCopilot

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

🛡️AI Safety Academic

I've tested so many desktop AI tools, but Hermes with Ollama is my new favorite - here's why

🤖AI Agents News Tutorial

local llm on laptop 780M GPU using llama + gemma 4 qat

⚙️LLM Fine-tuning Blog

alper.bearblog.dev

Qwen 3.6 27B AutoRound GGUF, need your feedback

huggingface.co · r/LocalLLaMA

I added this open-source tool to my local AI stack, and my local LLM finally has persistent memory

✍️Prompt Engineering

xda-developers.com

Unsloth Gemma 4 QAT

Fixing a stuck Ollama runner and building a GPU watchdog

🖥️Self-Hosting

patrickmccanna.net · Hacker News

Using local LLMs for agentic coding

⚙️LLM Fine-tuning Blog

blog.alexewerlof.com

Gemma 4 QAT on 10GB Laptop: Local AI with 6.7GB VRAM

everylocalai.com · DEV
