Scour
🔲 Loop Tiling
Cache Optimization, Blocking, Matrix Multiplication, Locality
Scoured 82840 posts in 1.65 s
WritePolicyBench: Benchmarking Memory Write Policies under Byte Budgets
arxiv.org · 8h · 📈 Occupancy Optimization
Stratum: Architecting a Configurable Cache Simulator with C++ and Racket
thecloudlet.github.io · 2d · Discuss: Hacker News · 🧠 CPU Architecture
Mitigating Staleness in Asynchronous Pipeline Parallelism via Basis Rotation
arxiv.org · 8h · 🌊 CUDA Streams
The Heartbeat of Tetris 🟥🟥🟥🟥: What a 1x1 Pixel Taught Me About Concurrency
qianarthurwang.substack.com · 20h · Discuss: r/programming · ⚡ CUDA Programming Patterns
How Virtual Textures Really Work
shlom.dev · 1h · Discuss: Hacker News · 📈 GPU Occupancy
Show HN: C discrete event SIM w stackful coroutines runs 45x faster than SimPy
github.com · 21h · Discuss: Hacker News · ⏱️ CUDA Events
Millets: A practical memory-safety and thread-safety experiment
eagledot.xyz · 1d · Discuss: Lobsters, Hacker News · ⚙️ Systems Programming
Go Deep Dive: Mutex vs RWMutex
dev.to · 1h · Discuss: DEV · ⚡ CUDA Programming Patterns
ML for Energy-Performance-Aware Scheduling On Heterogeneous Multicore Architectures (Cambridge)
semiengineering.com · 1d · 📈 Occupancy Optimization
Semantic LLM Cache: Vector-Based Caching for Java (Spring Boot)
dev.to · 5h · Discuss: DEV · 🏗️ Build Optimization
Diffusion LLM Sampling Achieves 70% Latency Reduction With Novel NPU Design
quantumzeitgeist.com · 2d · 🎯 Tensor Cores
slow abstraction
steel-water.bearblog.dev · 6h · 🐕 Ruff
Demystifying ARM SME to Optimize General Matrix Multiplications
news.ycombinator.com · 2d · Discuss: Hacker News · 🔄 SIMD Programming
WebGPU Cameras
webgpufundamentals.org · 5h · 🎮 NVIDIA
Using Nsight Compute with large codebases - Part 2: Profiling large code bases
blog.ncompass.tech · 21h · Discuss: Hacker News · 🔍 Nsight
**Abstract:** This paper introduces a novel approach to stabilizing simulated spacetime geometries in high-performance computing environments by leveraging h...
freederia.com · 1h · ✂️ CUTLASS
Linear-time classical approximate optimization of cubic-lattice classical spin glasses
link.aps.org · 1d · 🔀 Operator Fusion
Intel attacks the workstation segment with Xeon 600 featuring up to 86 cores and a new platform
igorslab.de · 8h · 🧠 CPU Architecture
Claude Code's renderer is more complex than a game engine
spader.zone · 1d · Discuss: Hacker News · 📈 GPU Occupancy
Anthropic's Performance Take-Home: A 65x Optimization (For Dummies)
ikot.blog · 23h · Discuss: Hacker News · 🎛️ CUDA Optimization