Parallel Prefix Scan
Specific
prefix sum, Blelloch scan, inclusive scan, exclusive scan
Scoured 149 posts in 10.9 ms
KJLdefeated/RL.cu: RLVR training for LLM in CUDA/C++
🔢 Tensor Cores · Content type: Code · github.com · 6 days ago · Hacker News
WarpGuard: Protected-Site Control-Flow Integrity for CUDA SASS Binaries
⚡ Hardware Acceleration · Content type: Academic · arxiv.org · 2 days ago
Less-relevant results
Training Cycle Halved: LoongForge End-to-End Optimization for GR00T N1.6 Delivers 2.3× Throughput
🖥️ GPU Computing · baidu-baige.github.io · 6 hours ago · Hacker News
Nvidia GeForce RTX 2080 Ti Super prototype shows what could have been, with 4,608 CUDA cores
🖥️ GPU Computing · club386.com · 1 day ago
GPUsnek is Python on nVidia’s CUDA
🖥️ GPU Computing · Content type: Blog · blog.adafruit.com · 3 days ago
AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support
⚡ Hardware Acceleration · phoronix.com · 3 days ago · r/artificial · Cited by 1 article
Polars GPU engine — cudf 26.06.01 documentation
🖥️ GPU Computing · Content type: Reference · docs.rapids.ai · 2 days ago · Hacker News
Framework Desktop AMD 395+ (rdna 3.5) cannot run confyui err Fix 2026
⚡ Hardware Acceleration · Content type: Blog · runaihome.com · 5 days ago · DEV
RTX 5080 + RTX 3090 Setup: 80+ Tok/s on Qwen 3.6 27B Q8
🥾 Bootloaders · Content type: Blog · imil.net · 14 hours ago · Hacker News, r/LocalLLaMA · Cited by 2 articles
Gerrymandering the Warp: Non-Control-Data Attacks on CUDA Collective Decision
⚡ Hardware Acceleration · Content type: Academic · arxiv.org · 2 days ago
Exploiting GPU Tensor Cores from Java using Babylon [Juan Fumero]
🔢 Tensor Cores · openjdk.org · 4 days ago · Lobsters, r/java
How to fit Qwen 3.6 35B A3B into 16GB of VRAM, & run it with Llama.cpp on an RTX 3080
⚡ Hardware Acceleration · autodidacts.io · 51 minutes ago
Redditor buys RTX 2080 Ti Super engineering sample on eBay, has the same number of cores as an RTX Titan but half the VRAM
🖥️ GPU Computing · Content type: News · tweaktown.com · 1 day ago
Nvidia’s RTX Spark to fuel Adobe creative apps
🖥️ GPU Computing · jonpeddie.com · 1 day ago
NVIDIA RTX Pro 6000 Blackwell: 96GB GDDR7 and the End of VRAM Anxiety
🖥️ GPU Computing · Content type: Blog · fitservers.com · 4 days ago
Making FlashAttention-4 faster for inference
🔢 Tensor Cores · Content type: Blog · modal.com · 2 days ago · Hacker News
Building & Benchmarking: LLMs on a 16GB Jetson Orin NX for Hermes Agent
🖥️ GPU Computing · Content type: Blog · dnhkng.github.io · 5 days ago
hasktorch/hasktorch: Tensors and neural networks in Haskell
🤖 AI · Content type: Code · github.com · 20 hours ago
nomp: A Framework for Building Domain Specific Compilers
⚡ Hardware Acceleration · Content type: Academic · arxiv.org · 1 day ago
Flatpak 1.18 adds AMD ROCm support, improved error output, and faster Fish shell start-up
⚡ Hardware Acceleration · alternativeto.net · 4 days ago