🤖 AI - emschwartz · Scour

The Best Open Source and Open-Weight LLM Models to Run Locally in 2026 🏗️LLM Infrastructure

huggingface.co·2d

Find bugs in YOUR code using OpenCode, Llama.cpp and Qwen3.6 💉Prompt Injection

wtarreau.blogspot.com·3d·Lobsters, Hacker News, wtarreau.blogspot.com

Qwen 3.7 🤖, Cursor Composer 2.5 👨‍💻, Anthropic acquires Stainless 🛠️ 🔥Prometheus

Can You Run LLMs Locally Without a GPU? I Tested 8 Models on Linux 🏗️LLM Infrastructure

itsfoss.com·5d·Hacker News

ndom91/llama-dash: The operations layer for your local LLM stack 🏗️LLM Infrastructure

github.com·1d·Hacker News

Running PyTorch Models on Apple Silicon GPUs with the ExecuTorch MLX Delegate 🕯️Candle

pytorch.org·2d·Hacker News

Local LLMs have one advantage ChatGPT and Claude can’t match, and it's why I'm switching 🏗️LLM Infrastructure

makeuseof.com·3d

My local LLM can call Claude when it's stuck, and it changed everything about my local-first setup 🏗️LLM Infrastructure

xda-developers.com·3d

Towards local plug-and-play AI 🏗️LLM Infrastructure

adlrocha.substack.com·3d·Substack

antoinezambelli/forge: A Python framework for self-hosted LLM tool-calling and multi-step agentic workflows 🦙Ollama

github.com·1d·Hacker News

Doorman11991/smallcode: AI coding agent optimized for small LLMs. 87% benchmark with 4B-active model. 💻Coding Agents

github.com·2d·Hacker News, r/SideProject, r/vibecoding

Gemini Extended Thinking ✨, ChatGPT finance 📱, Claude Code at scale 👨‍💻 🖥GPUs

BuffaloTechRider/Autodidact: Self-learning AI agent that gets smarter and cheaper over time. Routes between local and cloud LLMs, learns from every interaction, remembers everything. 🆕New AI

github.com·1d·Hacker News

MegaTrain Full Precision Training of 100B+ Parameter LLMs on a Single GPU 🏗️LLM Infrastructure

github.com·3d·Hacker News

RedToasty/llama.cpp_qts: Fixing --split-mode tensor, with different KV cache quantization types. 🏗️LLM Infrastructure

github.com·3d·r/LocalLLaMA

I've updated my glorified Llama fork (LLM Inference Server) for P40's to utilise MTP + TurboQuant + DFlash 🏗️LLM Infrastructure

github.com·4d·r/LocalLLaMA

2.3x KV Cache Compression at 32k Context 🏗️LLM Infrastructure

github.com·6d·Hacker News

5p00kyy/club-5060ti: Practical local LLM recipes and benchmarks for RTX 5060 Ti setups 🏗️LLM Infrastructure

github.com·6d·r/LocalLLaMA

oobabooga/textgen: Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. 🚀Modal

Why Gemma-4 26B MoE works in HuggingFace but breaks in prod inference engines 🏗️LLM Infrastructure

github.com·5d·Hacker News


