🚀 Model Serving - micaleel · Scour

shreyansh26/Speculative-Decoding: Speculative Decoding Implementations: EAGLE-3, Medusa-1, PARD, Draft Models, N-gram and Suffix Decoding from scratch 🔨LLVM

github.com·3d·r/LLM, r/LocalLLaMA

My Calculator Is a Transformer 🐍Programming

sinclairs.gitlab.io·3h·Hacker News, r/LocalLLaMA

Prefetching Weights in llama.cpp 🔨LLVM

am17an.bearblog.dev·2d

inclusionAI/Ling-2.6-1T 🐹Go

huggingface.co·23h·r/LocalLLaMA

Vibe Training - Auto Train a Small Language Model for Your Use Case 🤖Transformers

diamantai.substack.com·2d·Substack, r/LocalLLaMA

I Built a WebAssembly Runtime in 5 Days 🐹Go

tingouw.com·3h·Hacker News

Maybe I was too harsh on deep learning theory (three days ago) 🤖Machine Learning

lesswrong.com·10h

Lambda Calculus Benchmark for AI 🔄Concurrency

victortaelin.github.io·5d·Hacker News

LingBot-Map: Streaming 3D reconstruction with geometric context transformer 📓Jupyter Notebooks

technology.robbyant.com·2d·Hacker News

Lessons from Building an OTel Normalizer for GenAI (Part 1) 🛠️Feature Engineering

groundcover.com·12h·Hacker News

Scaling Pain of Coding Agent Serving: Lessons from Debugging GLM-5 at Scale 🐍Programming

z.ai·15h·Lobsters, Hacker News

Qwen 3.6-35B-A3B KV cache bench: f16 vs q8_0 vs turbo3 vs turbo4 from 0 to 1M context on M5 Max 🔨LLVM

llmkube.com·2d·r/LocalLLaMA

Vibin’ With Erlang 🐹Go

blog.whenhen.com·6d·Lobsters

Granite 4.1: IBM's 8B Model Is Competing With Models Four Times Its Size 🛠️Feature Engineering

firethering.com·6h·Hacker News

Changes, New Features, and Fixes 🔨LLVM

gcc.gnu.org·5h·Hacker News, r/cpp

How we built ten custom subagents to tame a 500K-line Clojure codebase 🛠️Feature Engineering

metabase.com·2d·Hacker News, r/programming

Clojure is the future of AI coding, but you won't use it 🛠️Feature Engineering

latypoff.com·21h·Hacker News

vLLM-Lens: Fast Interpretability Tooling That Scales to Trillion-Parameter Models 🔨LLVM

lesswrong.com·6d

Sequoia Ascent 2026 summary 🛠️Feature Engineering

karpathy.bearblog.dev·1h

Letting AI play my game – building an agentic test harness to help play-testing 🤖Transformers

blog.jeffschomay.com·1d·Hacker News
