⚡ Parallel Computing
Scoured 154 posts in 24.8 ms
Scalable parallel 3-D TEM inversion via rational approximation of the matrix exponential
🏺
Computational Archaeology
arxiv.org
·
1d
(VBS-NN) ML – 512k context length pre-training on a 12GB GPU
🖥️
Modern CPU
github.com
·
3d
·
Hacker News
Uncle Sam's next big supercomputer might use something more exotic than GPUs
🖥️
Terminal Renaissance
theregister.com
·
2d
·
r/hardware
Is it worth to study HPC and GPU programming?
🔩
Systems Programming
news.ycombinator.com
·
3d
·
Hacker News
China bypasses US GPU bans with 1.54-exaflops 'LineShine' supercomputer - CPU-only monster packs 2.4 million Huawei-designed Armv9 cores
⚡
Nordic Processors
tomshardware.com
·
3d
·
Hacker News, r/hardware
Mojo: SIMD
⚡
SIMD Optimization
mojolang.org
·
3d
·
Hacker News
One Problem, Four Languages, Two Paradigms (2021)
🎞️
Tape Combinatorics
ashermancinelli.com
·
4d
·
Hacker News
Artain-AI/ignite-ms: Fast self-hosted embedding engine for search, RAG, and reindexing workloads on NVIDIA GPUs. Built in Rust + TensorRT for teams that care about scale, cost, and control.
📊
Performance Profiling
github.com
·
12h
·
Hacker News
Book review: The Thinking Machine
🖥️
Terminal Renaissance
muratbuffalo.blogspot.com
·
4d
·
Hacker News, Blogger
Show HN: FlashAttention-2 in Cute, from Scratch
⚡
Homebrew CPUs
blog.echen.io
·
3d
·
Hacker News
VMT19937: A SIMD-Friendly Pseudo Random Number Generator based on Mersenne Twister 19937
⚡
SIMD Optimization
arxiv.org
·
2d
ASSESSING THE STOCHASTIC PROPERTIES OF MODERN PSEUDO-RANDOM GENERATORS FOR PARALLEL COMPUTING
📼
Cassette Combinators
arxiv.org
·
2d
AutoVecCoder: Teaching LLMs to Generate Explicitly Vectorized Code
🦀
Rust Macros
arxiv.org
·
2d
MegaTrain Full Precision Training of 100B+ Parameter LLMs on a Single GPU
💻
Local LLMs
github.com
·
3d
·
Hacker News
AdaptiveLoad: Towards Efficient Video Diffusion Transformer Training
🌊
Streaming Algorithms
arxiv.org
·
2d
vyasgiridhar/moleqular: Molecular dynamics on Apple M4 - NEON intrinsics, SME2, Metal compute shaders, OpenMP. Pushing Apple Silicon to its limits.
⚡
Homebrew CPUs
github.com
·
4d
·
Hacker News
Source-to-Source Transformations for GPU Code Generation
⚡
SIMD Vectorization
arxiv.org
·
6d
APWA: A Distributed Architecture for Parallelizable Agentic Workflows
🌊
Streaming Systems
arxiv.org
·
6d
Malleable Molecular Dynamics Simulations with GROMACS and DMR
⚙️
Modern Assembly
arxiv.org
·
6d
Heuristic-Based Merging of HPC Traces to Extend Hardware Counter Coverage
💨
Cache Analysis
arxiv.org
·
3d