⚡ High Performance - CWhiting · Scour

Benchmarking Subquadratic's latest model and SSA Kernel 📊AI Performance Profiling

appen.com·10h·Hacker News

How Superhuman and Databricks built a 200K QPS inference platform together 🏗️LLM Infrastructure

databricks.com·6d

Exploring LLMs Speed Benchmarks 🏗️LLM Infrastructure

mlops.community·1d

Let's talk benchmarking 📊Benchmarking

spacetimedb.com·8h·r/rust

nviennot/core-to-core-latency: Measures the latency between CPU cores ⚙️CPUs

github.com·2d·Hacker News

AMD uProf 5.3: Profiling Tool Gets DuckDB, Faster Reports and More Zen Analysis ⚡Performance Tools

igorslab.de·21h

AI versus Throughput 📊AI Performance Profiling

michaelnygard.com·3d

MLCommons Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces ⚡Performance Mythology

Tenstorrent Unveils Galaxy AI Platform Targeting Scale and Efficiency 🌊Streaming Systems

forbes.com·17h·Hacker News

Scaling PCIe Controllers for AI Bandwidth: A Multistream Architecture Analysis for 64 GT/s and 128 GT/s 🎮GPU Microarchitecture

semiengineering.com·1d

What Breaks at 1M AI Requests per Day? 📊Model Serving Economics

digitalocean.com·3d

Jankmarking: Janky Benchmarking 📊AI Performance Profiling

williamangel.net·6d·Hacker News

Non-Monotonic Latency in Apple MPS Decoding: KV Cache Interactions and Execution Regimes 🎯Emulator Accuracy

FractalSortCPU: Bandwidth-Efficient Compressed Radix Sort on CPU 📊Columnar Databases

arxiv.org·2d·Hacker News

KV-RM: Regularizing KV-Cache Movement for Static-Graph LLM Serving 🎯Data Locality

Beyond Static Policies: Exploring Dynamic Policy Selection for Single-Thread Performance Optimization ⏱️Runtime Performance Analysis

Enhancing Instruction Prefetching via Cache and TLB Management 🖥️Hardware Architecture

CUDAHercules: Benchmarking Hardware-Aware Expert-level CUDA Optimization for LLMs 🏗️LLM Infrastructure

Latency Analysis and Optimization of Alpamayo 1 via Efficient Trajectory Generation ⚡Interpreter Optimization

gym-invmgmt: An Open Benchmarking Framework for Inventory Management Methods 🏆LLM Benchmarking
