Attention Is All You Need for KV Cache in Diffusion LLMs
paperium.net·1d·
Discuss: DEV
🔁Cache Coherence
Low-Level Hacks
blog.raycursive.com·1d·
Discuss: Hacker News
🦀Rust
Crushing ML Latency: The (Un)Official Best Practices for Systems Optimisation
pub.towardsai.net·12h
🚀Performance
Inside Pinecone: Slab Architecture
pinecone.io·1d·
Discuss: Hacker News
📋Columnar Storage
10 Smart Performance Hacks For Faster Python Code
blog.jetbrains.com·7h
FastAPI
Run LLMs Locally
ikangai.com·10m·
Discuss: Hacker News
🚀Performance
Moving past speculation: How deterministic CPUs deliver predictable AI performance
venturebeat.com·3d
🏗Computer Architecture
A C example with objects and an arena for allocations, what do you think?
reddit.com·6h·
🦀Rust
DCcluster-Opt: Benchmarking Dynamic Multi-Objective Optimization for Geo-Distributed Data Center Workloads
arxiv.org·1d
🏗️System Design
Boosting React Performance: A Guide to Optimization
dev.to·8h·
Discuss: DEV
📊Performance Tools
Disassembling Terabytes of Random Data with Zig and Capstone to Prove a Point
jstrieb.github.io·8h·
🔓Binary Exploitation
Porting Lean to the ESP32-C3 RISC-V Microcontroller
kuruczgy.com·16h·
⚙️Systems Programming
No Free Lunch: Deconstruct Efficient Attention with MiniMax M2
lmsys.org·1d
📱Edge AI
Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
daft.ai·1d·
Discuss: Hacker News
🎴TAO
[Talk] Improving the Incremental System in the Rust Compiler
blog.goose.love·22h
🔨Incremental Compilation
Benchmarking the cost of Java's EnumSet - A Second Look
kinnen.de·22h·
Discuss: r/programming
📏Linear Types
The state of SIMD in Rust in 2025
shnatsel.medium.com·2h·
Discuss: r/rust
🔀SIMD Programming
Thoughts on "Static Retrieval Revisited"
curiouscoding.nl·1d
#️⃣Hash Tables
Generalizing Test-Time Compute-Optimal Scaling as an Optimizable Graph
huggingface.co·13h·
Discuss: Hacker News
🎴TAO
Show HN: A pragmatic SQLite schema for application-level caching
gist.github.com·2d·
Discuss: Hacker News
🗄️SQLite