Scour
🎯 GPU Kernels
CUDA Kernels, Optimization, Memory Coalescing, Shared Memory
Scoured 122691 posts in 1.18 s
GamingOnLinux - Vulkan-based translation layer D7VK officially expands to include Direct3D 5 support · store.steampowered.com · 3d · 📈 GPU Occupancy
Learning Optimization Tools · trendhunter.com · 2d · 🔗 Kernel Fusion
Guney-olu/nanoslg: A from-scratch implementation of distributed LLM inference in simple readable Python · github.com · 3d · Discuss: Hacker News, r/LLM · ⏱️ CUDA Events
Kaoru Pairs A Novel Parallel Readout Architecture via Software-Level Transistor Grouping · zenodo.org · 2d · Discuss: Hacker News · ⚡ CUDA Programming Patterns
How GPU Cloud Providers Handle Long-Tail Job Backlogs · acecloud.ai · 3d · Discuss: DEV · 🚀 MLOps
Google Tensor G3: Difference between revisions · wiki.postmarketos.org · 2d · 🎯 Tensor Cores
The RAM shortage finally convinced me to learn memory overclocking · xda-developers.com · 3h · 📈 Occupancy Optimization
GeForce RTX 6090 in 2028 at the earliest: When memory shortages dictate Nvidia's roadmap · igorslab.de · 3d · ⏱️ CUDA Events
Stable kernels for Wednesday · lwn.net · 1d · 🔗 Kernel Fusion
Graphics Programming Conference · graphicsprogrammingconference.com · 4d · 🎮 NVIDIA
AI's GPU problem is actually a data delivery problem · venturebeat.com · 3d · ⏱️ CUDA Events
H100 GPU: Powering the Next Era of AI and High-Performance Computing · dev.to · 6d · Discuss: DEV · 🔗 NCCL
AndPuQing/gflow: A lightweight, single-node GPU job scheduler implemented in Rust. · github.com · 1d · Discuss: Hacker News · 📊 CUDA Graphs
Is Micron the New Nvidia? · finance.yahoo.com · 3d · 🔍 Nsight
ECHO: Efficient Covertly-Secure Three-party Computation with Applications to Private Machine Learning · eprint.iacr.org · 2d · 🎯 Tensor Cores
Memory Bandwidth Napkin Math · forrestthewoods.com · 4d · 🔲 Loop Tiling
🦆 Lance x DuckDB SQL Retrieval, 🚗 Uber-Scale Storage, ⚡ 1.5M IOPS · lancedb.com · 3d · 🌳 Git Internals
AFMTJ Model For In-Memory Computing (University of Arizona) · semiengineering.com · 1d · ⚡ CUDA Programming Patterns
Quantized Tensor Train Compression For Turbulent Flow Simulation: O(log N) Scaling with Reynolds-Independent Bond Dimension · zenodo.org · 3d · Discuss: Hacker News · 🏎️ TensorRT
Using Accelerated Computing to Live-Steer Scientific Experiments at Massive Research Facilities · developer.nvidia.com · 1d · 🌊 CUDA Streams