🔧 PTX
GPU Assembly, CUDA ISA, Kernel Optimization, Low-level Programming
Scoured 123752 posts in 1.01 s
ZipFlow: a Compiler-based Framework to Unleash Compressed Data Movement for Modern GPUs
arxiv.org · 1d
🌊 CUDA Streams
How Anam Achieved 250% Faster Inference Using Zymtrace Continuous GPU Profiling
zymtrace.com · 2d
🔍 Nsight
Rewrote my Node.js data generator in Rust. 20x faster, but the 15MB binary (vs 500MB node_modules) is the real win.
algomimic.com · 18h · Discuss: r/rust
📊 Profiling Tools
Running my kernel on real hardware
kamkow1lair.pl · 1d · Discuss: Hacker News
🏗️ Build Systems
Simulate Faster with SimAI Software for High Returns at a Low Cost of Ownership
semiengineering.com · 8h
🔄 SIMD Programming
Reverse Engineering the PROM for the Silicon Graphics O2
blog.adafruit.com · 23h
⏱️ CUDA Events
Parallel Track Transformers: Enabling Fast GPU Inference with Reduced Synchronization
machinelearning.apple.com · 1d
⏱️ CUDA Events
Our testing shows the Ryzen 7 9800X3D can match the pricier Ryzen 7 9850X3D with simple PBO settings — AMD's latest CPU can't leverage extra clock speed in game...
tomshardware.com · 2h
⏱️ Benchmarking
RISC-V Mentorship Taught Me the RISC-V ISA Is Far More Than a Reference Manual
riscv.org · 22h
🔄 SIMD Programming
JUXT Blog: From specification to stress test: a weekend with Claude
juxt.pro · 16h
⚡ CUDA Programming Patterns
AMD Medusa Halo “Ryzen AI MAX” SoCs rely on LPDDR6 – bandwidth as a strategic lever
igorslab.de · 11h
⏱️ CUDA Events
MPSpeed: Implementing and Optimizing MPC-in-the-Head Digital Signatures in Hardware
eprint.iacr.org · 2d
⚡ CUDA Programming Patterns
New microkernel OS in 10 days: From zero to Google Compute Engine
seiya.me · 1d · Discuss: Hacker News
⚙️ Systems Programming
ALPHA-PIM: Analysis of Linear Algebraic Processing for High-Performance Graph Applications on a Real Processing-In-Memory System
arxiv.org · 11h
🔢 cuBLAS
DeepComputing Unveils RVA23-Compliant Mainboard III for Linux on Framework 13
lxer.com · 1h
🎯 Tensor Cores
Anubis OSS — Local LLM Benchmarking for Apple Silicon
devpadapp.com · 1d · Discuss: r/opensource
📊 Profiling Tools
Getting Started with Sapphire Edge+ and AMD Embedded+
hackster.io · 1d · Discuss: Hacker News
⏱️ CUDA Events
building cuda-gdb from sources
redplait.blogspot.com · 3d · Discuss: redplait.blogspot.com
⚡ CUDA Programming Patterns
Heterogeneous Processing: A Strategy for Augmenting Moore's Law (2006)
linuxjournal.com · 3d · Discuss: Hacker News
⚡ CUDA Programming Patterns
CUDA Guide: Workflow for Performance Tuning
digitalocean.com · 6d
⚡ CUDA Programming Patterns