My First Multi-GPU Kernel: Writing All-to-All for AMD MI300X
gau-nernst.github.io·11h·
Discuss: Hacker News
📊Performance Profiling
Why Multimodal AI Broke the Data Pipeline — And How Daft Is Beating Ray and Spark to Fix It
hackernoon.com·7h
SIMD
Vectorizing for Fun and Performance
ibm.com·4d·
Discuss: Hacker News
SIMD
Kimi Linear: An Expressive, Efficient Attention Architecture
arxiviq.substack.com·1d·
Discuss: Substack
SIMD
Can-t stop till you get enough
cant.bearblog.dev·18h·
Discuss: Hacker News
📏Linear Types
Programming for Computations: Matlab/Octave
link.springer.com·8h·
Discuss: Hacker News
💻Programming languages
Cons Should Not Cons Its Arguments, Part II: Cheney on the MTA
web.archive.org·6h·
Discuss: Hacker News
💻Programming languages
A hitchhiker's guide to CUDA programming
seanzhang.me·3d·
Discuss: Hacker News
📊Performance Profiling
Building Yantra: A Visual Workflow Automation Engine
patali.dev·9h·
Discuss: Hacker News
🌊Stream Processing
Cycle-accurate 6502 emulator as coroutine in Rust
github.com·1d·
🔄Async Rust
A Thesis and Playbook for Edge AI
ondeviceguy.substack.com·2h·
Discuss: Substack
🌐Edge Computing
Machine Scheduler in LLVM – Part II
myhsu.xyz·1d·
📊Performance Profiling
A New Faster Algorithm for Gregorian Date Conversion
benjoffe.com·5h·
Discuss: Hacker News, r/cpp
💹Rust Finance
The Evolution of GPUs: How Floating-Point Changed Computing
dell.com·22h·
Discuss: Hacker News
SIMD
Incremental Compilation in Recursive‑Descent Parser (Roslyn)
langdev.stackexchange.com·17h·
Discuss: Hacker News
💻Programming languages
Generation at the Speed of Thought: Speculative Decoding
bittere.substack.com·1d·
Discuss: Substack
💻Programming languages
Doo: A Simple, Fast Programming Language Built on Rust and LLVM
news.ycombinator.com·4h·
Discuss: Hacker News
💻Programming languages
Scaling Coding-Agent RL to 32x H100s. 160% Improvement on Stanford's TBench
github.com·20m·
Discuss: Hacker News
🌐Distributed systems
The next RISC-V processor frontier: AI
edn.com·3d·
Discuss: Hacker News
SIMD
A Practitioner's Guide to Kolmogorov-Arnold Networks
arxiviq.substack.com·18h·
Discuss: Substack
📈Time Series ML