⚡ Performance - nmarshall · Scour

uiCA: Accurate Throughput Prediction of Basic Blocks on Recent Intel Microarchitectures

⚙️CPU Microarchitecture Academic

arxiv.org··Hacker News

Two Leaps to 1000 Tokens/s on a 1T-Parameter Model: On Inference Systems, Execution Boundaries, and Co-Design

🧠AI Blog

tilert.ai··Hacker News

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🤖AI Inference Code

github.com··Hacker News

DiffusionGemma: 4x Faster Text Generation

🌟Ray Tracing News Blog

blog.google··Hacker News, r/LocalLLaMA, r/singularity

Why AI code optimization needs production-grounded benchmarks

⚙️Performance Profiling Blog

datadoghq.com··Hacker News

The Edge LLM Offload Story

semiengineering.com

On-device AI is a margin decision

🧠AI Blog

ziraph.com··Hacker News

Homebrew, Again

🧠AI Blog

jerryz.bearblog.dev

How much do amd64 microarchitecture levels help in Go?

🏗Computer Architecture Blog

lemire.me··Lobsters, Hacker News, r/golang

The economics of speculative decoding

✨vibe-coding Blog

fergusfinn.com··Hacker News

How We Ditched Postgres for ClickHouse to Process 12 Billion Caches Per Day

📊ClickHouse Blog

momentic.ai··Hacker News

A cute little trick to running classic IIR filters on the GPU

🎵Audio DSP Blog

themaister.net··Hacker News

Global memory shortage throws wrench into IT pros’ budgets, planning

🏗️System Design News

itbrew.com··Hacker News

Launch HN: General Instinct (YC P26) – Frontier models on edge devices

💻Local LLMs Discussion

news.ycombinator.com··Hacker News

18 months later, the RTX 50 series' biggest feature is still waiting for games that don't exist

🌟Ray Tracing

xda-developers.com

Blaise v0.10.0 (alpha) — Mandatory `()`, Native Backend, Threads & Incremental Compilation 🎉 · graemeg blaise · Discussion #82

🔨Build Systems Code

github.com··Hacker News, r/programming

Passing DBs Through Continuations

↩️Continuation Passing Blog

remy.wang··Lobsters, Hacker News

Beyond the Memory Wall: The CPU Was Helping You All Along

💾Cache Optimization Blog

prawns.dev··Hacker News

Why I care so much about energy per token

💻Local LLMs Blog

ziraph.com··Hacker News

AMD shipped Nvidia's new AI laptop over a year ago, and the software is finally catching up

⚡Hardware Acceleration

xda-developers.com
