🔢 cuBLAS
CUDA Linear Algebra, Matrix Operations, GPU BLAS, cuBLASLt
Scoured 81088 posts in 675.6 ms
ggml: backend-agnostic tensor parallelism by JohannesGaessler · Pull Request #19378
github.com · 3d · Discuss: r/LocalLLaMA
🎯 Tensor Cores
The Little Book of Linear Algebra
little-book-of.github.io · 2d
📉 Model Quantization
Heterogeneous Processing: A Strategy for Augmenting Moore's Law (2006)
linuxjournal.com · 16h · Discuss: Hacker News
⚡ CUDA Programming Patterns
building cuda-gdb from sources
redplait.blogspot.com · 16h · Discuss: redplait.blogspot.com
⚡ CUDA Programming Patterns
Hitting 1,000 tokens per second on a single RTX 5090
blog.alpindale.net · 6h · Discuss: Hacker News
🎛️ CUDA Optimization
Main Content || Math ∩ Programming
jeremykun.com · 7h
📉 Model Quantization
Fastfood: Approximate Kernel Expansions in Loglinear Time
paperium.net · 1d · Discuss: DEV
🔗 Kernel Fusion
How Anam Achieved 250% Faster Inference Using Zymtrace Continuous GPU Profiling
zymtrace.com · 4h
🔍 Nsight
🚀 OLSRT v1.2: A Powerful Runtime for All Programming Languages!
dev.to · 6h · Discuss: DEV
💡 LSP
Getting started with C++ MathGL on Windows and Linux
solarianprogrammer.com · 1d
✂️ CUTLASS
Graphics Programming Conference
graphicsprogrammingconference.com · 13h
🎮 NVIDIA
Hierarchical Low‑Rank Multigrid Preconditioning for 3‑D Acoustic Boundary Element Matrices A Practical Path Toward Real‑Time Scattering Simulation
freederia.com · 2d
🎯 Tensor Cores
llama.cpp guide - Running LLMs locally, on any hardware, from scratch
blog.steelph0enix.dev · 1h
💡 LSP
Build a Compiler in Five Projects
kmicinski.com · 1d
🚀 Compiler Optimization
From Prediction to Compilation: A Manifesto for Intrinsically Reliable AI
news.ycombinator.com · 17h · Discuss: Hacker News
🤖 AI Coding Tools
How PCIe, NVLink, and NUMA Topology Affect GPU Scheduling Outcomes
dev.to · 16m · Discuss: DEV
📊 CUDA Graphs
Reducing the Computational Cost Scaling of Tensor Network Algorithms via Field-Programmable Gate Array Parallelism
arxiv.org · 3d
🎯 Tensor Cores
Dirk Eddelbuettel: chronometre: A new package (pair) demo for R and Python
dirk.eddelbuettel.com · 12h
🔄 ONNX
Normal Map Compression Revisited
ludicon.com · 18h
⚡ CUDA Programming Patterns
MicroBlaze MCS Seven-Segment Counter on Basys 3 FPGA
hackster.io · 1d
🔧 PTX