GPU Programming

Vortex 3.0 Released As Full-Stack, Open-Source RISC-V GPU Now With 3D Pipeline

 💾Computer Architecture
phoronix.com

AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis

 🤖AI  Content type: Academic
arxiv.org · Hacker News

KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++

 Flash Attention  Content type: Code
github.com · Hacker News

RightNow-AI/AutoMegaKernel: An agent harness that compiles a model into one provably-correct, self-retargeting CUDA megakernel and self-tunes it past cuBLAS at batch-1 LLM decode.

 🤖AI  Content type: Code
github.com · Hacker News

LLM-Based Porting of Optimized C++ to CUDA Through Deoptimization and Reoptimization

 💬LLMs  Content type: Academic
arxiv.org

SET: Stream-Event-Triggered Scheduling for Efficient CUDA Graph Pipelines

 Hardware Acceleration  Content type: Academic
arxiv.org

xarray/osgverse: osgVerse, a complete 3d engine solution based on OpenSceneGraph. It supports OpenGL/OpenGLES/Vulkan/DirectX/Metal backends, and also works on modern browsers using WASM.

 Computer Graphics  Content type: Code
github.com

Communication Strategy Selection for Multi-GPU 3D FDTD with Convolutional Perfectly Matched Boundary Layers

 Computer Graphics  Content type: Academic
arxiv.org

Vulkan 1.4.353 Released With Three New Extensions

 🎮Game Engines
phoronix.com

Trystan-SA/rproc: A Linux resource & process monitor inspired by Windows 11's Task Manager. Written in Rust with Slint

 Hardware Acceleration  Content type: Code
github.com · DEV

On GPU Implementation for Multi-Precision Integer Division

 Hardware Acceleration  Content type: Academic
arxiv.org

HigherOrderCO/Bend: A massively parallel, high-level programming language

 Computer Graphics  Content type: Code
github.com

MusaCoder: Native GPU Kernel Generation with Full-Stack Training on Moore Threads GPU

 🎮Reinforcement Learning  Content type: Academic
arxiv.org

zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability

 💬LLMs  Content type: Code
github.com · Hacker News

CodegenBench: Can LLMs Write Efficient Code Across Architectures?

 🤖AI  Content type: Academic
arxiv.org · Hacker News

Show HN: One-Shot Program Generation Through Direct Memory Diffusion

 🤖AI  Content type: Code
github.com · Hacker News
