🧠 Inference Serving - emschwartz · Scour

OSTEP Chapter 8

muratbuffalo.blogspot.com·14h·

Discuss: Blogger

Understanding Dynamic Compute Allocation in Recurrent Transformers

arxiv.org·1d

🧠LLM Inference

A practical systems engineering guide: Architecting AI-ready infrastructure for the agentic era

thenewstack.io·1d

🏗️LLM Infrastructure

Generative Engine Optimization: The Patterns Behind AI Visibility

searchengineland.com·1h

📊Feed Optimization

When Bigger Instances Don’t Scale

scylladb.com·1d·

Discuss: r/programming

⚡Systems Performance

Show HN: Latent-k – Persistent dependency map to reduce AI coding token usage

latentk.org·5h·

Discuss: Hacker News

🔌Claude Plugins

Machine Learning Based SPAM Detection Using ONNX in Java

foojay.io·1d

🧹Spam Filters

AFMTJ Model For In-Memory Computing (University of Arizona)

semiengineering.com·1d

📦In-process Databases

How LinkedIn Built a Next-Gen Service Discovery for 1000s of Services

blog.bytebytego.com·1d

🌐Distributed systems

hit-box/hitbox: Highly customizable async caching framework for Rust - from in-memory to distributed solutions, designed for high-performance applications

github.com·2d·

Discuss: r/rust

Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy

developer.nvidia.com·2d

🏗️LLM Infrastructure

How We Built Platybot: An AI-Powered Analytics Assistant

pulumi.com·18h

Garnix Blog: Forwardly-evaluated build systems

garnix.io·6h·

Discuss: Lobsters

🏗️Build Systems

Carnegie Mellon at NeurIPS 2025

blog.ml.cmu.edu·2h

🛡️AI Safety

Search Engine Tooling

joshbeckman.org·3h

📊Search Ranking

Benchmark & Compare the Best AI Models

arena.ai·3h

🏆LLM Benchmarking

🎲 Architecting for Resilience: When 150 RPS Becomes 2,000: Finding the Bottleneck

d13z.dev·1d

🏗️Infrastructure Economics

Mastra: Build AI agents with a modern TypeScript stack

producthunt.com·23h

Structured Context Engineering for File-Native Agentic Systems

simonwillison.net·1d

🏗️LLM Infrastructure

DFlash: Block Diffusion for Flash Speculative Decoding

z-lab.ai·1d·

Discuss: Hacker News

🕯️Candle ML
