🎯 GPU Kernels - miterion · Scour

No high-quality results found.

Less-relevant results

CUDA-Oxide 0.2 Brings Early Improvements To Pure Rust CUDA Kernels

⚡CUDA Programming Patterns

Neural Cellular Automata with WebGPU

🔗NCCL Blog

ivanludvig.dev · Hacker News

On the Limits of Performance Portability in Directive-Based GPU Programming

🌐Distributed Computing Academic

GPUsnek is Python on nVidia’s CUDA

⚡CUDA Programming Patterns Blog

blog.adafruit.com

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

📊CUDA Graphs Code

github.com · Hacker News

The Inference Alpha: Maximizing Frontier Models on AMD

📈Occupancy Optimization Blog

digitalocean.com

Making FlashAttention-4 faster for inference

🎯Tensor Cores Blog

modal.com · Hacker News

The Parallel Revolution: A Comprehensive Guide to GPU Computing

🔥PyTorch Blog

fitservers.com

Virtual Thread Pinning: The Silent Performance Killer in Your Codebase

📈Occupancy Optimization

javacodegeeks.com

Game Porting Toolkit 4 is here!

🏗️Build Systems

developer.apple.com · Hacker News, r/macgaming · Cited by 1 article

🥇Top AI Papers of the Week

📈Occupancy Optimization News

nlp.elvissaravia.com

Density Field State Space Models: 1-Bit Distillation, Efficient Inference, and Knowledge Organization in Mamba-2

🎯Tensor Cores Academic

Nvidia DGX Spark GB10 – AI Models and Guide with vLLM and Autonomous Script

🎮NVIDIA Code

github.com · Hacker News

AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis

⚡CUDA Programming Patterns Academic

arxiv.org · Hacker News

Coupling Complementary Simulations for Combined Performance and Energy Optimization

🌐Distributed Computing Academic