Model Optimization, GPU Acceleration, Inference, Privacy
LoRA-PAR: A Flexible Dual-System LoRA Partitioning Approach to Efficient LLM Fine-Tuning
arxiv.org·18h
Optimizing enterprise AI assistants: How Crypto.com uses LLM reasoning and feedback for enhanced efficiency
aws.amazon.com·1d
A roboticist's journey with JAX: Finding efficiency in optimal control and simulation
developers.googleblog.com·4h
How to Evaluate Graph Retrieval in MCP Agentic Systems
towardsdatascience.com·7h
When Do I Need to Use an LLM?
kdnuggets.com·8h