Scour
🖥️ OpenCL
Specific
Heterogeneous Computing, GPU Programming, Kernels, SPIR-V
Scoured 121897 posts in 28.2 ms
A n00b PM’s guide to vibe coding kernels from scratch
⚡ Hardware Acceleration · ddmckinnon.com · 4d · Hacker News
CUDA Tile Programming Now Available for BASIC!
🎮 SIMT Execution · developer.nvidia.com · 1d · Hacker News
New stable kernels for Thursday
🦅 Falco · lwn.net · 9h
KTransformers Adds AVX2 MoE Support For Viable Performance On CPUs Without AMX/AVX-512
⚡ Hardware Acceleration · phoronix.com · 12h
Inside the Together AI kernels team
🧩 mimalloc · together.ai · 1d
Model2Kernel: Model-Aware Symbolic Execution For Safe CUDA Kernels
🔍 KLEE · arxiv.org · 6d
From one model to seven — what it took to make TurboQuant model-portable
🦙 Ollama · dev.to · 1d · DEV
yash27-lab/batch_forge: A high-performance, bare-metal inference engine for JAX and Equinox models written in Rust. Features zero-copy Safetensors loading and hand-optimized Metal/Vulkan compute kernels for Transformers, Vision Language Models, and State-Space Models
🏛️ Embassy · github.com · 3d · Hacker News
LLM Quantization, Kernels, and Deployment: How to Fine-Tune Correctly, Part 5
⏪ Deoptimization · pub.towardsai.net · 1d
MXFP8 GEMM: Up to 99% of cuBLAS Performance Using CUDA and PTX
🧩 mimalloc · danielvegamyhre.github.io · 4d · Hacker News
PyTorch Call Stack Deep Dive: Tracing Tensor Operations from Python to C++ Kernels
🔥 PyTorch · next.redhat.com · 6d
[Testing Update] 2026-03-28 - Kernels, Plasma 6.6.3, LibreOffice, Systemd, Mesa
🐧 Linux · forum.manjaro.org · 5d
Standard Quantum Phase Estimation Detects All Eigenvalues via Randomized Initial States
⚛️ Quantum Computing · arxiv.org · 18h
SCALE-TRACK: Asynchronous Euler-Lagrange particle tracking on heterogeneous computing architecture
🔍 DTrace · arxiv.org · 2d
Open-Source RadeonSI + Rusticl Nearing Formal OpenCL 3.0 Conformance
⚡ Hardware Acceleration · phoronix.com · 3d
Loop Control Management in Tightly Coupled Processor Arrays (TCPAs)
🗺️ Memory-Mapped Queues · arxiv.org · 2d
Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML
🗜️ Columnar Compression · arxiv.org · 6d
Agent Factories for High Level Synthesis: How Far Can General-Purpose Coding Agents Go in Hardware Optimization?
🎭 Program Synthesis · arxiv.org · 6d
Adaptive Learned Image Compression with Graph Neural Networks
🗜️ Roaring Bitmaps · arxiv.org · 6d
Non-local Potts model on random lattice and chromatic number of a plane
🔲 Cellular Automata · arxiv.org · 3d