🎮 SIMT Execution - hello · Scour

CUDA 13.3: NVIDIA continues to move GPU programming from the thread to the tile 🎨WGPU

igorslab.de·5d

CUDA Cores vs. Tensor Cores ⚡Hardware Acceleration

+12 years of programming, now what? 🎮WebGPU

en.wikipedia.org·21h·r/programming

Nvidia Maxwell Architecture 🎮WebGPU

developer.nvidia.com·14h·Hacker News

Caspar: CUDA Accelerator for Symbolic Programming with Adaptive Reordering 🌀Naiad

llama.cpp B9387 Significant AMD/ROCm PP Update 🧮MKL

github.com·4d·r/LocalLLaMA

When does fragmentation occur in the CUDA caching allocator? 🧱Slab Allocation

docs.pytorch.org·11h·Hacker News

Nvidia ARM Laptop Chip N1X Confirmed for Computex: CUDA and RTX 5070 GPU Onboard ⚡Hardware Acceleration

techtimes.com·2d

Nvidia's long-awaited N1/N1X SoC specs leak ahead of Computex launch — N1 to feature up to 20 Arm-based cores, standard N1 equipped with 12- and 10-core configs ⚡Hardware Acceleration

tomshardware.com·1d

ROS2 vs Isaac ROS: 8x Perception Speedup with NITROS ⚡LMAX Disruptor

tildalice.io·3d

openclaw/clawpatch v0.5.0 ⚡Ruff

RAFI -- A Ray/Work Forwarding Infrastructure for Data Parallel Multi-Node/Multi-GPU Computing 🚀Milvus

avencera/speakrs: Speaker diarization in Rust. 312–912x realtime on Apple Silicon, 50–121x on CUDA. Matches pyannote accuracy. 🍱Nom

github.com·6d·Hacker News, r/rust

Towards Feedback-to-Plan Decisions for Self-Evolving LLM Agents in CUDA Kernel Generation 💬Prompt Engineering

jmaczan/tiny-vllm: Build your own high performance LLM inference engine in C++ and CUDA - a smaller version of vLLM 🦙Ollama

github.com·3d·Hacker News

TC-MIS: Maximal Independent Set on Tensor-cores 🕸️GraphBLAS

jndean/gpusnek: GPU-Parallelizing Arbitrary Python Code By Running 1 Million Python Interpreters on a GPU 🐍 🎮WebGPU

github.com·5d·Hacker News

zayokami/Talos-XII: A deep learning framework based on the gacha mechanics of Arknights: Endfield. 以《明日方舟：终末地》的抽卡学习为基准的深度学习框架。 ⚙️XLA

github.com·3d·r/rust

NVIDIA CUDA 13.3 Rolls Out CUDA Python 1.0, CUDA Tile For C++ ⚡Hardware Acceleration

phoronix.com·5d·Hacker News
