🎯 GPU Kernels
CUDA Kernels, Optimization, Memory Coalescing, Shared Memory
Execution-Centric Characterization of FP8 Matrix Cores, Asynchronous Execution, and Structured Sparsity on AMD MI300A
arxiv.org · 13h
🌊 CUDA Streams
The Linux driver implementer’s API guide
kernel.org · 9h
📊 Profiling Tools
AI Inference Needs A Mix-And-Match Memory Strategy
semiengineering.com · 9h
🎯 Tensor Cores
borodark/exmc: Probabilistic programming in BEAM
github.com · 21h
⚡ ONNX Runtime
DeepComputing Unveils RVA23-Compliant Mainboard III for Linux on Framework 13
lxer.com · 1d
🎯 Tensor Cores
Timing and Memory Telemetry on GPUs for AI Governance
arxiv.org · 1d
⏱️ CUDA Events
New AMD Adrenalin Driver
bluesnews.com · 16h
🎮 NVIDIA
AI agent sandboxing in 2026: how to choose between primitives, runtimes, and platforms
manveerc.substack.com · 23h · Discuss: Substack
🏗️ Bazel
How a ‘zombie’ chipmaker became Nvidia’s vital AI ally
ft.com · 1d
⚡ CUDA Programming Patterns
Hitting 1,000 tokens per second on a single RTX 5090
blog.alpindale.net · 3d · Discuss: Hacker News
🎛️ CUDA Optimization
Ph42oN/dxvk-gplasync
gitlab.com · 20h
⏱️ CUDA Events
Inference Providers Leverage NVIDIA Blackwell to Drive 10x Reduction in Token Costs
storagereview.com · 1h
🏎️ TensorRT
Intel Releases New Compute Runtime, Upstreams More SYCL Code To LLVM
phoronix.com · 1d
🔧 PTX
building cuda-gdb from sources
redplait.blogspot.com · 4d · Discuss: redplait.blogspot.com
⚡ CUDA Programming Patterns
Running Mistral-7B on Intel NPU — 12.6 tokens/s, zero CPU/GPU usage
github.com · 12h · Discuss: r/LocalLLaMA
📊 Profiling Tools
How to connect Convex to RunPod for serverless GPU workloads
stack.convex.dev · 2d
🔧 PTX
What Nvidia, Google and Meta Are Building Beyond Chips and Compute
pymnts.com · 1d
🔍 Nsight
AMD's 3D V-Cache is still the best gaming upgrade money can buy
xda-developers.com · 19h
🔧 PTX
AI inference costs dropped up to 10x on Nvidia's Blackwell — but hardware is only half the equation
venturebeat.com · 2h
🏎️ TensorRT
Building DamN64: LLM-Assisted N64 Development
vieux.fr · 22h · Discuss: Hacker News
💡 LSP