🦙 Local LLM - tyler · Scour

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

deemwar-products.github.io · Hacker News

lightmetal: GPU LLM Inference From a Single Java 25 JAR

🧠AI Blog

adambien.blog

Ollama 0.30 delivers faster NVIDIA GPU performance and wider hardware support

alternativeto.net

Unsloth Gemma 4 QAT

Improved performance and model support with GGUF

🧠AI Blog

local llm on laptop 780M GPU using llama + gemma 4 qat

🤖LLMs Blog

alper.bearblog.dev

zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability

🧠AI Code

github.com · Hacker News

Using Scikit-LLM with Open-Source LLMs

machinelearningmastery.com

147th airhacks tv: Local LLMs, LightMetal, ZSmith Agents, AI Rails, Saving Tokens

🧠AI Blog

adambien.blog

google/gemma-4-12B-it-qat-q4_0-gguf

huggingface.co

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

🧠AI News Blog

blog.google · Hacker News

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

vettedconsumer.com · Hacker News

Apples to Apples: MLX vs. Llama.cpp for Gemma 4 12B on an M1 16GB

🧠AI Blog

ziraph.com · Hacker News

The latest Gemma 4 models use a training trick to slash their on-device memory footprint

androidauthority.com

Google Gemma4 12B released

🧠AI Blog

Google’s new Mac app keeps your AI chats off the internet

cultofmac.com

Creating ADK Agent using locally running Gemma 4

🐍Python Blog

docs: document lmstudio runtime contracts · openclaw/openclaw@82710b4

🔌LSP Protocol Code

alexziskind1/model-shelf: Model Shelf is a local-first model resolver that helps AI agents and scripts find model weights on your own storage before downloading from Hugging Face. Point it at an internal SSD, NAS, external SSD, or Thunderbolt DAS, and it returns the best local path for GGUF, MLX, safetensors, Ollama, vLLM, and other local AI workflows.

🧠AI Code

Google’s latest on-device AI model is custom-made for your laptop

androidauthority.com
