⚡ SIMD
Vectorization, AVX, Performance, Parallel Processing
Shortest-Path FFT: Optimal SIMD Instruction Scheduling via Graph Search
🚫 Branch-Free Programming
arxiv.org · 1d
themankindproject/simd-bp128-rs: High-performance SIMD-BP128 integer compression library for Rust with scalar, SSE4.1, AVX2, and AVX-512 backends.
📦 Nixpkgs
github.com · 4d
SOTA Normalization Performance with Torch.compile
🔮 Branch Prediction
pytorch.org · 20h · Hacker News
Agentic AI Demands More Than GPUs
⚙️ Performance Profiling
semiwiki.com · 12h
Optimising a Pipelined RISC-V Core: From Naive Pipeline to Near-Superscalar Performance
⚙️ CPU Pipeline
mummanajagadeesh.github.io · 1d · Lobsters, Hacker News
Portability and the Road Ahead
⚙️ Performance Profiling
modular.com · 6d
Vector Database Performance Compared: pgvector vs Pinecone vs Qdrant vs Weaviate
⚙️ Performance Profiling
vecstore.app · 3d · r/programming
Analyzing Persistent Alltoallv RMA Implementations for High-Performance MPI Communication
⚙️ Performance Profiling
arxiv.org · 23h
Fast Cross-Operator Optimization of Attention Dataflow
🧩 Memory Pooling
arxiv.org · 1d
teamchong/turboquant-wasm: TurboQuant WASM SIMD vector compression, 3 bits/dim with fast dot product. Requires relaxed SIMD (Chrome 114+, Firefox 128+, Safari 18+, Node 20+)
🐛 Fuzz Testing
github.com · 4d · Hacker News
3D-Stacked NMP, LLM Decoding, Systolic Array Microarchitecture, Multi-Core Scheduling
🧩 Memory Pooling
arxiv.org · 1d
NEURA: A Unified and Retargetable Compilation Framework for Coarse-Grained Reconfigurable Architectures
🚫 Branch-Free Programming
arxiv.org · 1d
DeepStack: Scalable and Accurate Design Space Exploration for Distributed 3D-Stacked AI Accelerators
⚙️ Performance Profiling
arxiv.org · 1d
L-SPINE: A Low-Precision SIMD Spiking Neural Compute Engine for Resource-efficient Edge Inference
🚫 Branch-Free Programming
arxiv.org · 1d
Adaptive Parallel Monte Carlo Tree Search for Efficient Test-time Compute Scaling
🧩 Memory Pooling
arxiv.org · 6d
Computer Architecture's AlphaZero Moment: Automated Discovery in an Encircled World
🚫 Branch-Free Programming
arxiv.org · 1d
Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference
🧩 Memory Pooling
arxiv.org · 1d
Is RISC-V Ready for Machine Learning? Portable Gaussian Processes Using Asynchronous Tasks
🔮 Branch Prediction
arxiv.org · 6d
Minos: Systematically Classifying Performance and Power Characteristics of GPU Workloads on HPC Clusters
⚙️ Performance Profiling
arxiv.org · 1d
Making Array-Based Translation Practical for Modern, High-Performance Buffer Management
🧩 Memory Pooling
arxiv.org · 6d