Attention Is All You Need for KV Cache in Diffusion LLMs
paperium.net·22h·
Discuss: DEV
💾Cache Design
Myths Programmers Believe about CPU Caches
software.rajivprab.com·5d·
Discuss: Hacker News
⚙️Systems Programming
Predicting & Mitigating Data Corruption in Pure Storage Flash Arrays via Adaptive Bit Error Rate Modeling
dev.to·15h·
Discuss: DEV
🔌Embedded Systems
Moving past speculation: How deterministic CPUs deliver predictable AI performance
venturebeat.com·2d
SIMD
How to Design Efficient Memory Architectures for Agentic AI Systems
pub.towardsai.net·7h
🛡️Memory Safety
Enabling Trillion-Parameter Models on AWS EFA
research.perplexity.ai·2h·
Discuss: Hacker News
Performance Engineering
Inside Pinecone: Slab Architecture
pinecone.io·9h·
Discuss: Hacker News
🗄️Database Internals
Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
daft.ai·9h·
Discuss: Hacker News
🔢algo
No Free Lunch: Deconstruct Efficient Attention with MiniMax M2
lmsys.org·1d
📝Parser Combinators
Don't let these 3 CPU specs trick you into paying more
xda-developers.com·1d
Performance Engineering
On Designing Low-Latency Systems for High-Traffic Environments
hackernoon.com·1d
⚖️Load Balancing
Porting Lean to the ESP32-C3 RISC-V Microcontroller
kuruczgy.com·1h
⚙️Systems Programming
My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io·2d·
Discuss: Hacker News
🗑️Garbage Collection
Low-Level Hacks
blog.raycursive.com·23h·
Discuss: Hacker News
⚙️Systems Programming
Co-Simulation Framework for Parallel DNN Execution on Chiplet-Based Systems (UW–Madison, Washington State)
semiengineering.com·1d
🔌Embedded Systems
Showcase: In Memoria - Rust core with TypeScript/NAPI interface for high-performance AI tooling
reddit.com·10h·
Discuss: r/rust
🕸️WebAssembly
Reliability assessment of multi-performance system incorporating multiple common buses and transformation devices
sciencedirect.com·11h
🔌Embedded Systems
Show HN: Polyglot standard library HTTP client C/C++/Rust/Python and benchmarks
github.com·21h·
Discuss: Hacker News
🔨Compiler Design
Benchmarking the cost of Java's EnumSet - A Second Look
kinnen.de·6h·
Discuss: r/programming
#️⃣Hash Tables
Running MiniMax-M2 locally - Existing Hardware Advice
reddit.com·9h·
Discuss: r/LocalLLaMA
⚙️Systems Programming