🤗 Open Source AI - kudolink · Scour

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

🏠Local LLMs News Blog

blog.google··Hacker News

google/gemma-4-31B-it · fix: chat template — null handling, reasoning preservation, turn-tag balance, input validation

huggingface.co··r/LocalLLaMA

defai-digital/ax-engine: Apple Silicon LLM runtime supporting Gemma 4 and Qwen 3.6 MTP modes

🏠Local LLMs Code

github.com··Hacker News

NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI

🟢NVIDIA Blog

blogs.nvidia.com·

Breaking the Ice: Analyzing Cold Start Latency in vLLM

🧠LLMs Academic

arxiv.org··Hacker News

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

deemwar-products.github.io··Hacker News

Inferoa AI harness claimed 90% cache savings. We ran it and measured 97.8%

zozo123.github.io··Hacker News

Fixing a stuck Ollama runner and building a GPU watchdog

patrickmccanna.net··Hacker News

DiffusionGemma: The Developer Guide - Google Developers Blog

🧠LLMs Blog

developers.googleblog.com··r/LocalLLaMA

Apples to Apples: MLX vs. Llama.cpp for Gemma 4 12B on an M1 16GB

🏠Local LLMs Blog

ziraph.com··Hacker News

Running Two LLMs on a Mini PC Sounds Great Until the Benchmarks Arrive

hackernoon.com·

"North Mini Code"; open weights, 30B param, Canadian coding model

🧠LLMs Blog

cohere.com··Hacker News

A drop-in replacement chat template for google/gemma-4-31B-it tuned for open-source agentic coding harnesses.

gist.github.com··r/LocalLLaMA

Re-quantizing a local LLM 14x faster by skipping the tensors that didn't change

🏠Local LLMs News Blog

andreaborio.substack.com··Substack

Token4Token — pay-per-token inference on Gnosis + Swarm

t4t.eth.link··Hacker News

Large companies can add a local LLM filter layer to considerably reduce their AI costs

umrashrf.github.io··Hacker News

DeepSeekV4 1.6T Day 0 to Day 43 Performance Over Time - Huawei, GB300 NVL72, MI355X, B200

🧠LLMs News

newsletter.semianalysis.com··Hacker News

martidu4/honey-ai: 🍯 All-in-one AI honeypot powered by local LLMs. SSH, HTTP, FTP, Telnet, SMTP, MySQL, Redis, Git, VNC, RDP — with canary tokens, tarpits, GZIP bombs, and threat intel reporting.

🏠Local LLMs Code

github.com··Hacker News

Florian Brand, Prime Intellect research engineer, adopts Gemma 4 E4B 6-bit quantized as his primary local Mac LLM

🏠Local LLMs News

digg.com··Hacker News

Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information

venturebeat.com··Hacker News
