🧠 Inference Serving - emschwartz · Scour

🔗 Explainer: Tree-sitter vs. LSP

yellowduck.be·7h

Generative Engine Optimization: The Patterns Behind AI Visibility

searchengineland.com·23h

📊Feed Optimization

🎲 Architecting for Resilience: When 150 RPS Becomes 2,000: Finding the Bottleneck

d13z.dev·2d

🏗️Infrastructure Economics

A Reference Architecture for a Next Generation Global Reporting Platform

cockroachlabs.com·1d

🏗️Search Infrastructure

Eye on AI 👁️2/12

itsdougholland.com·8h

A training principle for drifting models

breno.bearblog.dev·5h

🧠LLM Inference

The middle ground between canonical models and data mesh

frederickvanbrabant.com·2d·

Discuss: r/programming

☁️Cloudflare D1

Building a Regex Engine with a team of parallel Claudes

lesswrong.com·1d

🔍RegEx Engines

How we cut Vertex AI latency by 35% with GKE Inference Gateway

cloud.google.com·5d

🏗️LLM Infrastructure

Learning to Detect Baked Goods with Limited Supervision

arxiv.org·1d

Efficient Remote Prefix Fetching with GPU-native Media ASICs

arxiv.org·1d

🔮Prefetching

The AI-Moderated Research Platform

localhost·2d

Is Local Hardware Is All You Need?

wwws.nightwatchcybersecurity.com·2d·

Discuss: Hacker News

🏗️LLM Infrastructure

Vibe Coding for Scientists

vibe-coding-intro.vercel.app·1d

🔌Claude Plugins

Show HN: Fighting the War Against Expensive Reinforcement Learning

cadenza-landing-qtu7gbjwb-akshparekh123-3457s-projects.vercel.app·9h·

Discuss: Hacker News

Testing 80 LLMs on spatial reasoning on grids

mihai.page·3d·

Discuss: Hacker News

🏆LLM Benchmarking

Embedded Agency (full-text version)

lesswrong.com·1d

🪄Prompt Engineering

How I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS

mohammedeabdelaziz.github.io·5d·

Discuss: Hacker News

🏗️LLM Infrastructure

LLM Performance in Astro, React, Tailwind and Cloudflare

10xbench.ai·1d·

Discuss: Hacker News

🏆LLM Benchmarking

ktop: Python-based terminal system resource monitor for hybrid LLM workloads

github.com·22h

⚡Systems Performance
