🧠 Local LLM - akapaka

Neo-X7/Neo-AI: A fully offline AI assistant powered by Ollama. Stores and retrieves conversations using SQLite + LanceDB vector search. No cloud. No API keys. Runs entirely on your machine.

📝SQLite WAL Code

github.com · DEV

lightmetal: GPU LLM Inference From a Single Java 25 JAR

⚡LLM Quantization Blog

adambien.blog

I've tested so many desktop AI tools, but Hermes with Ollama is my new favorite - here's why

🤖Qwen News Tutorial

zdnet.com

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

⚡LLM Quantization

vettedconsumer.com · Hacker News

Google Shrank Gemma 4 by 72% and Unsloth Fixed the 4-Bit Bug Nobody Else Caught on One 4090, and 4-Bit Shouldn’t Be This Good

🧠LLM Inference Blog

towardsai.net

martidu4/honey-ai: 🍯 All-in-one AI honeypot powered by local LLMs. SSH, HTTP, FTP, Telnet, SMTP, MySQL, Redis, Git, VNC, RDP — with canary tokens, tarpits, GZIP bombs, and threat intel reporting.

🤖Qwen Code

github.com · Hacker News

Unsloth Gemma 4 QAT

⚡LLM Quantization

unsloth.ai

Fixing a stuck Ollama runner and building a GPU watchdog

🏠Self-Hosting

patrickmccanna.net · Hacker News

Video: How to install and use local AI models

🧠LLM Inference News

heise.de

local llm on laptop 780M GPU using llama + gemma 4 qat

⚡LLM Quantization Blog

alper.bearblog.dev

I added this open-source tool to my local AI stack, and my local LLM finally has persistent memory

🧠LLM Inference

xda-developers.com

An LLM that reviews your code, challenges your decisions, but never writes code for you

🤖Qwen Blog

blog.adafruit.com

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

🧠LLM Inference News Blog

blog.google · Hacker News

Tales of an Ollama Honeypot (Part 3): More Traffic, More Findings

📊Prometheus

posts.inthecyber.com

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

Ollama 0.30 delivers faster NVIDIA GPU performance and wider hardware support

Ollama 0.30 GPU Boost: Faster local Qwen inference on NVIDIA

What Ollama Reveals About Local AI, Agents, and Open Models

Qwen 3.6 27B AutoRound GGUF, need your feedback

Improved performance and model support with GGUF
