Attention Is All You Need for KV Cache in Diffusion LLMs
paperium.net·1d·
Discuss: DEV
🔁Cache Coherence
Low-Level Hacks
blog.raycursive.com·1d·
Discuss: Hacker News
🦀Rust
Crushing ML Latency: The (Un)Official Best Practices for Systems Optimisation
pub.towardsai.net·7h
🚀Performance
Inside Pinecone: Slab Architecture
pinecone.io·20h·
Discuss: Hacker News
📋Columnar Storage
DCcluster-Opt: Benchmarking Dynamic Multi-Objective Optimization for Geo-Distributed Data Center Workloads
arxiv.org·1d
🏗️System Design
Moving past speculation: How deterministic CPUs deliver predictable AI performance
venturebeat.com·3d
🏗Computer Architecture
Benchmarking the cost of Java's EnumSet - A Second Look
kinnen.de·17h·
Discuss: r/programming
📏Linear Types
Cutting LLM Batch Inference Time in Half: Dynamic Prefix Bucketing at Scale
daft.ai·19h·
Discuss: Hacker News
🎴TAO
A C example with objects and an arena for allocations, what do you think?
reddit.com·1h·
🦀Rust
Boosting React Performance: A Guide to Optimization
dev.to·3h·
Discuss: DEV
📊Performance Tools
Disassembling Terabytes of Random Data with Zig and Capstone to Prove a Point
jstrieb.github.io·3h·
Discuss: Hacker News
🔓Binary Exploitation
Porting Lean to the ESP32-C3 RISC-V Microcontroller
kuruczgy.com·11h·
⚙️Systems Programming
No Free Lunch: Deconstruct Efficient Attention with MiniMax M2
lmsys.org·1d
📱Edge AI
[Talk] Improving the Incremental System in the Rust Compiler
blog.goose.love·17h
🔨Incremental Compilation
Building a highly-available web service without a database
screenshotbot.io·4h·
Discuss: r/programming
🦀Rust
Thoughts on "Static Retrieval Revisited"
curiouscoding.nl·1d
#️⃣Hash Tables
Essential Things to Know Before Upgrading Your Computer Memory
buysellram.com·1d·
Discuss: Hacker News
🧠Memory Management
Generalizing Test-Time Compute-Optimal Scaling as an Optimizable Graph
huggingface.co·8h·
Discuss: Hacker News
🎴TAO
Show HN: A pragmatic SQLite schema for application-level caching
gist.github.com·1d·
Discuss: Hacker News
🗄️SQLite
Myths Programmers Believe about CPU Caches
software.rajivprab.com·5d·
Discuss: Hacker News
🔁Cache Coherence