Local LLM

Show HN: Run Llama.cpp In-Process from Java with Project Panama FFM

馃AI

lightmetal: GPU LLM Inference From a Single Java 25 JAR

馃AIContent type: Blog
adambien.blog

Ollama 0.30 delivers faster NVIDIA GPU performance and wider hardware support

馃AI
alternativeto.net

Unsloth Gemma 4 QAT

馃AI
unsloth.ai

Improved performance and model support with GGUF

馃AIContent type: Blog
ollama.com

local llm on laptop 780M GPU using llama + gemma 4 qat

馃LLMsContent type: Blog
alper.bearblog.dev

zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability

馃AIContent type: Code
github.comHacker News

Using Scikit-LLM with Open-Source LLMs

馃LLMs

147th airhacks tv: Local LLMs, LightMetal, ZSmith Agents, AI Rails, Saving Tokens

馃AIContent type: Blog
adambien.blog

google/gemma-4-12B-it-qat-q4_0-gguf

馃AI
huggingface.co

Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency

馃AIContent type: NewsContent type: Blog
blog.googleHacker News

GGUF vs GPTQ vs AWQ: The Plain-English Guide to LLM Quantization (and Which One to Pick)

馃AI

Apples to Apples: MLX vs. Llama.cpp for Gemma 4 12B on an M1 16GB

馃AIContent type: Blog
ziraph.comHacker News

The latest Gemma 4 models use a training trick to slash their on-device memory footprint

馃AI
androidauthority.com

Google Gemma4 12B released

馃AIContent type: Blog
medium.com

Google’s new Mac app keeps your AI chats off the internet

🍎 Apple
cultofmac.com

Creating ADK Agent using locally running Gemma 4

🐍 Python · Content type: Blog
medium.com

docs: document lmstudio runtime contracts · openclaw/openclaw@82710b4

🔌 LSP Protocol · Content type: Code
github.com

alexziskind1/model-shelf: Model Shelf is a local-first model resolver that helps AI agents and scripts find model weights on your own storage before downloading from Hugging Face. Point it at an internal SSD, NAS, external SSD, or Thunderbolt DAS, and it returns the best local path for GGUF, MLX, safetensors, Ollama, vLLM, and other local AI workflows.

馃AIContent type: Code
github.com

Google’s latest on-device AI model is custom-made for your laptop

馃AI
androidauthority.com
