🔲 Loop Tiling
Cache Optimization, Blocking, Matrix Multiplication, Locality
Scoured 112619 posts in 396.2 ms
Versor: A Geometric Sequence Architecture
arxiv.org · 1d
⚡ CUDA Programming Patterns
Evaluating Claude’s C Compiler Against GCC
shbhmrzd.github.io · 1d · Discuss: r/C_Programming
🚀 Compiler Optimization
Proximity-driven acceleration of challenging solid-phase peptide couplings
pnas.org · 2d
⚡ ONNX Runtime
Benchmarking for Single Feature Attribution with Microarchitecture Cliffs
arxiv.org · 16h
🧠 CPU Architecture
Unleashing Computational Power: Ultimate Latency Optimization of Qwen3 and Qwen3-VL on AMD MI300X Series
lmsys.org · 2d
🎛️ CUDA Optimization
christopherkarani/Wax: 🍯 Memory layer for on-device AI Agents. Replace complex RAG pipelines with a serverless, single-file memory layer.
github.com · 4h · Discuss: Hacker News
⚡ Flash Attention
Breaking the Tractability Barrier: A Generic Low-Level Solver for NP-Hard Instances (N=63) on Commodity 64-Bit Silicon
zenodo.org · 11h · Discuss: Hacker News
🎯 Tensor Cores
Bitsum. Real-time CPU Optimization and Automation
bitsum.com · 1d
📊 Profiling Tools
Beyond Latency and Communication Complexity - A Tutorial on the Pipes Model
decentralizedthoughts.github.io · 16h
🌊 CUDA Streams
BalatroBench Benchmarks Large Language Models Playing Balatro
balatrobench.com · 10h · Discuss: Hacker News
⚡ ONNX Runtime
OpenAI GPT-5.3-Codex-Spark Now Running at 1K Tokens Per Second on BIG Cerebras Chips
servethehome.com · 4h · Discuss: Hacker News
⚡ Flash Attention
Minimum Energy Per Query
semiengineering.com · 1d
📈 Occupancy Optimization
Best CPU 2026 – the top AMD Ryzen and Intel Core processors tested
club386.com · 11h
🧠 CPU Architecture
SIEVE: an Efficient Turn-Key Eviction Algorithm for Web Caches
cachemon.github.io · 2d · Discuss: Hacker News
📊 Profiling Tools
Zero State Architecture deep dive
news.ycombinator.com · 1d · Discuss: Hacker News
🎯 Tensor Cores
Nvidia’s new technique cuts LLM reasoning costs by 8x without losing accuracy
venturebeat.com · 23h · Discuss: r/LocalLLaMA
🔗 NCCL
Parallel Track Transformers: Enabling Fast GPU Inference with Reduced Synchronization
machinelearning.apple.com · 3d
⏱️ CUDA Events
How octorus Renders 300K Lines of Diff at High Speed
dev.to · 14h · Discuss: DEV
🏗️ Build Optimization
Allocators from C to Zig
antonz.org · 1d · Discuss: Lobsters, Hacker News, r/C_Programming, r/programming
🧠 CUDA Memory Management
RocksDB 10 and TidesDB 8 Benchmark Analysis on Dedicated Threadripper
tidesdb.com · 22h · Discuss: Hacker News
📊 Profiling Tools