Local LLMs

Building & Benchmarking: LLMs on a 16GB Jetson Orin NX for Hermes Agent

macOS · Content type: Blog
dnhkng.github.io

The latest Gemma 4 models use a training trick to slash their on-device memory footprint

Akkoma
androidauthority.com

How I benchmarked a 100% local RAG pipeline to 9/9 (zero API keys)

Akkoma
buy.polar.sh · DEV

Run (your largest) local models from your iPhone

Apple · Content type: Blog
Less-relevant results

fix(memory-core): filter stale recall entries in REM harness preview · openclaw/openclaw@92418fc

Akkoma · Content type: Code
github.com

Indirect Prompt Injection remains a fundamental security challenge for AI

Apple · Content type: Blog
brave.com

DeskDash - a free Windows tool to easily manage your GGUF files

Akkoma
gerry7.itch.io · r/LocalLLaMA

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support

Akkoma
phoronix.com

LM Link launches on iPhone, bringing local AI model access to iOS devices

Apple
alternativeto.net

andreyvgavrilov/food_database: AI agent to evaluate recipe nutrition

Apple · Content type: Code
github.com · r/mcp

Ideogram4 GGUF is out!

WCAG

Clairvoyant: Predictive SJF Scheduling to Mitigate Head-of-Line Blocking in Serial LLM Backends

LoRa · Content type: Academic
arxiv.org

1-bit and 1.58 bit LLM Benchmarking on Jetson Orin Nano Super | Bonsai LM

Tailscale
smolhub.com · r/LocalLLaMA

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

Windows

"AI" Is Eating Platform Monopolist Free Cash Flow, Not the World: CHART OF THE DAY

Progression Fantasy · Content type: News · Content type: Blog

zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability

Apple · Content type: Code
github.com · Hacker News

Apple rebuilt its on-device AI stack at WWDC 2026

Apple · Content type: Blog
ziraph.com · Hacker News

Large companies can add a local LLM filter layer to considerably reduce their AI costs

LoRa

My Notes on the Progression from Context to Prompt to Harness engineering in making GPT LLMs Useful: (TUESDAY) MAMLMs

Screen Readers · Content type: News · Content type: Blog

Self-hosted remote access for Ollama without complicated setup

Self-hosting
oab.arc-i.co.uk · r/selfhosted