Attention Is All You Need for KV Cache in Diffusion LLMs
paperium.net·1d·
Discuss: DEV
🏗️NUMA
Myths Programmers Believe about CPU Caches
software.rajivprab.com·5d·
Discuss: Hacker News
🧠Memory Models
Predicting & Mitigating Data Corruption in Pure Storage Flash Arrays via Adaptive Bit Error Rate Modeling
dev.to·21h·
Discuss: DEV
📋Columnar Storage
Moving past speculation: How deterministic CPUs deliver predictable AI performance
venturebeat.com·3d
📊Profile-Guided Optimization
How to Design Efficient Memory Architectures for Agentic AI Systems
pub.towardsai.net·12h
🧠Memory Models
Enabling Trillion-Parameter Models on AWS EFA
research.perplexity.ai·7h·
Discuss: Hacker News
Performance Engineering
Inside Pinecone: Slab Architecture
pinecone.io·14h·
Discuss: Hacker News
📋Columnar Storage
Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
daft.ai·14h·
Discuss: Hacker News
🏗️NUMA
No Free Lunch: Deconstruct Efficient Attention with MiniMax M2
lmsys.org·1d
📊Profile-Guided Optimization
Don't let these 3 CPU specs trick you into paying more
xda-developers.com·1d
Performance Engineering
On Designing Low-Latency Systems for High-Traffic Environments
hackernoon.com·1d
⚖️Load Balancing
Porting Lean to the ESP32-C3 RISC-V Microcontroller
kuruczgy.com·6h·
⚙️Systems Programming
Crushing ML Latency: The (Un)Official Best Practices for Systems Optimisation
pub.towardsai.net·2h
📊Profile-Guided Optimization
My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io·2d·
Discuss: Hacker News
🗑️Garbage Collection
Low-Level Hacks
blog.raycursive.com·1d·
Discuss: Hacker News
⚙️Systems Programming
Sable and Able: A Tale of Two ASIs
lesswrong.com·1h
🔐Capability Systems
MemSearcher: Training LLMs to Reason, Search and Manage Memory via End-to-End Reinforcement Learning
arxiv.org·2h
🧠Memory Models
Co-Simulation Framework for Parallel DNN Execution on Chiplet-Based Systems (UW–Madison, Washington State)
semiengineering.com·1d
🔌Embedded Systems
Showcase: In Memoria - Rust core with TypeScript/NAPI interface for high-performance AI tooling
reddit.com·15h·
Discuss: r/rust
🕸️WebAssembly
Reliability assessment of multi-performance system incorporating multiple common buses and transformation devices
sciencedirect.com·16h
🔌Embedded Systems