🔲 Loop Tiling
Cache Optimization, Blocking, Matrix Multiplication, Locality
FCDP: Fully Cached Data Parallel for Communication-Avoiding Large-Scale Training
arxiv.org · 1d
🔗 NCCL

DualMap: Enabling Both Cache Affinity and Load Balancing for Distributed LLM Serving
arxiv.org · 1d
📈 Occupancy Optimization

Show HN: LocalGPT – A local-first AI assistant in Rust with persistent memory
dev.to · 2d · Discuss: DEV
💡 LSP

Efficiency and Performance
dev.to · 3d · Discuss: DEV
⚙️ Systems Programming

Performance Tip of the Week #53: Precise C++ benchmark measurements with Hardware Performance Counters
abseil.io · 2d
📊 Profiling Tools

Performance Tip of the Week #74: Avoid sweeping street lights under rugs
abseil.io · 2d
📊 Profiling Tools

ggml: backend-agnostic tensor parallelism by JohannesGaessler · Pull Request #19378
github.com · 4d · Discuss: r/LocalLLaMA
🎯 Tensor Cores

The Adventures of a Pythonista in Schemeland/29
artima.com · 2d
🔍 Type Checkers

ahead-of-time wasm gc in wastrel
wingolog.org · 4d · Discuss: Lobsters, Hacker News
🚀 Compiler Optimization

llOOPy lOOPs (Dave Jarvis)
dave.autonoma.ca · 4d
⚙️ Systems Programming

Clojure’s Persistent Data Structures: Immutability Without the Performance Hit
javacodegeeks.com · 4d
⚡ CUDA Programming Patterns

4x NVMe SSD home server (CNC aluminum and walnut)
umbrel.com · 4d · Discuss: Hacker News
🏗️ Build Systems

Virtual AI Memory Chips
pgsgrove.com · 4d
⚡ Flash Attention

Training language models on TPUs shouldn't be scary
dogac.dev · 4d · Discuss: Hacker News
🏎️ TensorRT

[$] Modernizing swapping: the end of the swap map
lwn.net · 5d
⚡ CUDA Programming Patterns

Great Power, Great Latency: The Spider-Sense of NUMA Tuning
mydbanotebook.org · 5d
📊 Profiling Tools

feldera/feldera: The Feldera Incremental Computation Engine
github.com · 3d
🏗️ Build Optimization

1M token context: The good, the bad and the ugly (2025)
micron.com · 4d · Discuss: Hacker News
⏱️ CUDA Events

**Abstract:** Modern ray tracing implementations in real-time rendering engines face significant performance bottlenecks in fragment shaders due to the compl...
freederia.com · 5d
🔍 Nsight

Efficient Benchmarking of Logical Magic State
link.aps.org · 5d
⏱️ Benchmarking