🧠 Local LLMs - flipperjibber · Scour

Improved performance and model support with GGUF

🟣Claude Blog

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA

🔬Deep Learning

everylocalai.com · DEV

Ollama 0.30 delivers faster NVIDIA GPU performance and wider hardware support

alternativeto.net

UniSVQ: 2-bit Unified Scalar-Vector Quantization

🤖Machine Learning Academic

Qwen 3.6 27B AutoRound GGUF, need your feedback

huggingface.co · r/LocalLLaMA

Neo-X7/Neo-AI: A fully offline AI assistant powered by Ollama. Stores and retrieves conversations using SQLite + LanceDB vector search. No cloud. No API keys. Runs entirely on your machine.

🗄️SQL Code

github.com · DEV

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

deemwar-products.github.io · Hacker News

lightmetal: GPU LLM Inference From a Single Java 25 JAR

🧠OpenAI Blog

adambien.blog

I've tested so many desktop AI tools, but Hermes with Ollama is my new favorite - here's why

🤖AI Agents News Tutorial

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

vettedconsumer.com · Hacker News

A system programmer’s guide to LLM inference

📝NLP Blog

blog.xiangpeng.systems · Hacker News

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support

martidu4/honey-ai: 🍯 All-in-one AI honeypot powered by local LLMs. SSH, HTTP, FTP, Telnet, SMTP, MySQL, Redis, Git, VNC, RDP — with canary tokens, tarpits, GZIP bombs, and threat intel reporting.

🤖Qwen Code

github.com · Hacker News

Unsloth Gemma 4 QAT

I added this open-source tool to my local AI stack, and my local LLM finally has persistent memory

👨‍💻AI Coding

xda-developers.com

Optimal Post-Training Quantization Scales and Where to Find Them

🤖LLMs Academic

On-device AI is a margin decision

🧠OpenAI Blog

ziraph.com · Hacker News

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

📝NLP News Blog

blog.google · Hacker News

Token4Token — pay-per-token inference on Gnosis + Swarm

t4t.eth.link · Hacker News

Fixing a stuck Ollama runner and building a GPU watchdog

✍️Prompt Engineering

patrickmccanna.net · Hacker News
