CUDA Linear Algebra, Matrix Operations, GPU BLAS, cuBLASLt

Flying with Andes and Condor
jonpeddie.com·10h
⏱️CUDA Events
Enhanced spatial clustering of single-molecule localizations with graph neural networks
nature.com·2d
🔗Kernel Fusion
I just trained a physics-based earthquake forecasting model on a $1000 GPU
news.ycombinator.com·1d·
Discuss: Hacker News
🔗NCCL
🏚️ CSS Art: Haunted House with Parallax Layers
codepen.io·54m·
Discuss: DEV
Flash Attention
Defeating KASLR by Doing Nothing at All
googleprojectzero.blogspot.com·1d·
📊Profiling Tools
Supercharging the ML and AI Development Experience at Netflix
netflixtechblog.com·12h
🤖AI Coding Tools
Petri Dish Neural Cellular Automata
pub.sakana.ai·8h·
Discuss: Hacker News
🔗NCCL
Get Ready for Clojure, GPU, and AI in 2026 with CUDA 13.0
dragan.rocks·5d·
Discuss: Hacker News
⏱️CUDA Events
Playing Around with ARM Assembly
blog.nobaralabs.com·1d·
Discuss: Hacker News
📊Profiling Tools
Optimizing filtered vector queries from tens of seconds to single-digit milliseconds in PostgreSQL
reddit.com·15m·
Discuss: r/programming
🐕Ruff
Kahn’s Algorithm and Cycle Detection in Directed Graphs
dev.to·1d·
Discuss: DEV
🔀Operator Fusion
Explore More, Learn Better: Parallel MLLM Embeddings under Mutual Information Minimization
arxiv.org·1d
🛠Ml-eng
Show HN: a Rust ray tracer that runs on any GPU – even in the browser
github.com·1d·
Discuss: Hacker News
🔍Nsight
Biological Regulatory Network Inference through Circular Causal Structure Learning
arxiv.org·4h
🔄ONNX
On-chip cavity electro-acoustics using lithium niobate phononic crystal resonators
arxiv.org·2d
🔄ONNX
The Curvature Rate λ: A Scalar Measure of Input-Space Sharpness in Neural Networks
arxiv.org·1d
📉Model Quantization
I Built Figma for AI Coding (Using Itself)
dev.to·18h·
Discuss: DEV
🤖AI Coding Tools
Unlock Linear Solver Speed: Symbolic Preconditioning for Hyper-Performance
dev.to·6d·
Discuss: DEV
🎯Tensor Cores
Transformer-Based Decoding in Concatenated Coding Schemes Under Synchronization Errors
arxiv.org·1d
Flash Attention