🖥️ GPU Programming
CUDA, Parallel Computing, Graphics APIs, Compute Shaders
Vortex 3.0 Released As Full-Stack, Open-Source RISC-V GPU Now With 3D Pipeline
💾 Computer Architecture · phoronix.com · 1 day ago
AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis
🤖 AI · Content type: Academic · arxiv.org · 1 day ago · Hacker News
KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++
⚡ Flash Attention · Content type: Code · github.com · 3 days ago · Hacker News
RightNow-AI/AutoMegaKernel: An agent harness that compiles a model into one provably-correct, self-retargeting CUDA megakernel and self-tunes it past cuBLAS at batch-1 LLM decode.
🤖 AI · Content type: Code · github.com · 2 days ago · Hacker News
LLM-Based Porting of Optimized C++ to CUDA Through Deoptimization and Reoptimization
💬 LLMs · Content type: Academic · arxiv.org · 5 days ago
SET: Stream-Event-Triggered Scheduling for Efficient CUDA Graph Pipelines
⚡ Hardware Acceleration · Content type: Academic · arxiv.org · 5 days ago
xarray/osgverse: osgVerse, a complete 3d engine solution based on OpenSceneGraph. It supports OpenGL/OpenGLES/Vulkan/DirectX/Metal backends, and also works on modern browsers using WASM.
✨ Computer Graphics · Content type: Code · github.com · 4 days ago
Communication Strategy Selection for Multi-GPU 3D FDTD with Convolutional Perfectly Matched Boundary Layers
✨ Computer Graphics · Content type: Academic · arxiv.org · 2 days ago
Vulkan 1.4.353 Released With Three New Extensions
🎮 Game Engines · phoronix.com · 5 days ago
Trystan-SA/rproc: A Linux resource & process monitor inspired by Windows 11's Task Manager. Written in Rust with Slint
⚡ Hardware Acceleration · Content type: Code · github.com · 2 days ago · DEV
On GPU Implementation for Multi-Precision Integer Division
⚡ Hardware Acceleration · Content type: Academic · arxiv.org · 5 days ago
HigherOrderCO/Bend: A massively parallel, high-level programming language
✨ Computer Graphics · Content type: Code · github.com · 6 days ago
MusaCoder: Native GPU Kernel Generation with Full-Stack Training on Moore Threads GPU
🎮 Reinforcement Learning · Content type: Academic · arxiv.org · 6 days ago
zhongkaifu/TensorSharp: A C# inference engine for running large language models (LLMs) locally using GGUF model files. TensorSharp provides a console application, a web-based chatbot interface, and Ollama/OpenAI-compatible HTTP APIs for programmatic access. It supports Windows/MacOS/Linux with full GPU capability
💬 LLMs · Content type: Code · github.com · 6 days ago · Hacker News
CodegenBench: Can LLMs Write Efficient Code Across Architectures?
🤖 AI · Content type: Academic · arxiv.org · 6 days ago · Hacker News
Show HN: One-Shot Program Generation Through Direct Memory Diffusion
🤖 AI · Content type: Code · github.com · 5 days ago · Hacker News