Performance

uiCA: Accurate Throughput Prediction of Basic Blocks on Recent Intel Microarchitectures

⚙️ CPU Microarchitecture · Content type: Academic
arxiv.org · Hacker News

Two Leaps to 1000 Tokens/s on a 1T-Parameter Model: On Inference Systems, Execution Boundaries, and Co-Design

🧠 AI · Content type: Blog
tilert.ai · Hacker News

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

🤖 AI Inference · Content type: Code
github.com · Hacker News

Why AI code optimization needs production-grounded benchmarks

⚙️ Performance Profiling · Content type: Blog
datadoghq.com · Hacker News

The Edge LLM Offload Story

🤖 LLMs
semiengineering.com

Why I care so much about energy per token

💻 Local LLMs · Content type: Blog
ziraph.com · Hacker News

How much do amd64 microarchitecture levels help in Go?

🏗 Computer Architecture · Content type: Blog

Homebrew, Again

🧠 AI · Content type: Blog
jerryz.bearblog.dev

How We Ditched Postgres for ClickHouse to Process 12 Billion Caches Per Day

📊 ClickHouse · Content type: Blog
momentic.ai · Hacker News

The economics of speculative decoding

vibe-coding · Content type: Blog

fully offline, human-powered local AI

🍓 single board computers

A cute little trick to running classic IIR filters on the GPU

🎵 Audio DSP · Content type: Blog

Global memory shortage throws wrench into IT pros’ budgets, planning

🏗️ System Design · Content type: News
itbrew.com · Hacker News

Launch HN: General Instinct (YC P26) – Frontier models on edge devices

💻 Local LLMs · Content type: Discussion

Blaise v0.10.0 (alpha) — Mandatory `()`, Native Backend, Threads & Incremental Compilation 🎉 · graemeg blaise · Discussion #82

🔨 Build Systems · Content type: Code

18 months later, the RTX 50 series' biggest feature is still waiting for games that don't exist

🌟 Ray Tracing
xda-developers.com

Passing DBs Through Continuations

↩️ Continuation Passing · Content type: Blog

Beyond the Memory Wall: The CPU Was Helping You All Along

💾 Cache Optimization · Content type: Blog
prawns.dev · Hacker News

AMD shipped Nvidia's new AI laptop over a year ago, and the software is finally catching up

Hardware Acceleration
xda-developers.com

Aperio: Lightweight search engine in Rust – GBs of data in < 1ms, < 256MB RAM

🦀 Rust · Content type: Code
