🤖 Local LLMs - samuelfastfinge · Scour

Building & Benchmarking: LLMs on a 16GB Jetson Orin NX for Hermes Agent

💻macOS Blog

dnhkng.github.io

The latest Gemma 4 models use a training trick to slash their on-device memory footprint

androidauthority.com

How I benchmarked a 100% local RAG pipeline to 9/9 (zero API keys)

buy.polar.sh · DEV

Run (your largest) local models from your iPhone

🍎Apple Blog

lmstudio.ai · Hacker News, r/LocalLLaMA

Less-relevant results

fix(memory-core): filter stale recall entries in REM harness preview · openclaw/openclaw@92418fc

🦋Akkoma Code

Indirect Prompt Injection remains a fundamental security challenge for AI

🍎Apple Blog

DeskDash - a free Windows tool to easily manage your GGUF files

gerry7.itch.io · r/LocalLLaMA

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support

LM Link launches on iPhone, bringing local AI model access to iOS devices

alternativeto.net

andreyvgavrilov/food_database: AI agent to evaluate recipe nutrition

🍎Apple Code

github.com · r/mcp

Ideogram4 GGUF is out!

huggingface.co · r/StableDiffusion

Clairvoyant: Predictive SJF Scheduling to Mitigate Head-of-Line Blocking in Serial LLM Backends

📡LoRa Academic

1-bit and 1.58 bit LLM Benchmarking on Jetson Orin Nano Super | Bonsai LM

smolhub.com · r/LocalLLaMA

Running Qwen 35B MoE at 450k Context on a Single 32GB GPU

local-llm.utop.workers.dev · Hacker News

"AI" Is Eating Platform Monopolist Free Cash Flow, Not the World: CHART OF THE DAY

⚔️Progression Fantasy News Blog

braddelong.substack.com · Substack

zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability

🍎Apple Code

github.com · Hacker News

Apple rebuilt its on-device AI stack at WWDC 2026

🍎Apple Blog

ziraph.com · Hacker News

Large companies can add a local LLM filter layer to considerably reducing their AI costs

umrashrf.github.io · Hacker News

My Notes on the Progression from Context to Prompt to Harness engineering in making GPT LLMs Useful: (TUESDAY) MAMLMs

🔊Screen Readers News Blog

braddelong.substack.com

Self-hosted remote access for Ollama without complicated setup

🏠Self-hosting

oab.arc-i.co.uk · r/selfhosted
