H-FA: A Hybrid Floating-Point and Logarithmic Approach to Hardware Accelerated FlashAttention
arxiv.org·11h
🔢SIMD
Low-Level Hacks
🦀Rust
Detailed Technical Documentation on AI Implementation Logic (Taking Large Language Models as an Example)
🏗️CPU Architecture
TIL: For long-lived LLM sessions, swapping KV Cache to RAM is ~10x faster than recalculating it. Why isn't this a standard feature?
📊Performance Tools
Building blobd: single-machine object store with sub-millisecond reads and 15 GB/s uploads
⚡Zig
Free Functions Don't Change Performance (Much)
🦀Rust
Dive into Systems
🖥️Operating Systems
GDM: Consistency Training Helps Limit Sycophancy and Jailbreaks in Gemini 2.5 Flash
lesswrong.com·15m
✨Shader Programming
Taming the Billion Dollar Mistake: Maarten Balliauw’s Guide to C# Nullable Reference Types
blog.jetbrains.com·3h
🦀Rust
Radar Trends to Watch: November 2025
oreilly.com·4h
⌨Programming
Don't let these 3 CPU specs trick you into paying more
xda-developers.com·20h
🔬RISC-V
Reverse Engineering Google's BotGuard
🔩Assembly