⚡ SIMD Optimization
AVX-512, Vectorization, Loop Unrolling, Auto-vectorization
Exposing More Parallelism Is the Hidden Reason Why Some Vectorized Loops Are Faster - Not Vectorization per se · johnnysswlab.com · 2d · Discuss: Hacker News · ⚡ SIMD
Fast Autoscheduling for Sparse ML Frameworks · fredrikbk.com · 5h · Discuss: Hacker News · 🕯️ Candle
μpack: Faster & more flexible integer compression · blog.cf8.gg · 2d · Discuss: r/programming, r/rust · 🔬 RaBitQ
[Benchmark] Qwen3.5-122B-A10B FP8 weights / bf16 KV on 8x RTX PRO 6000 (SM120): 1,985 tok/s burst, MTP 2.75x, fp8 KV silent corruption finding · Issue #19603 · github.com · 3h · Discuss: r/LocalLLaMA · 🖥 GPUs
Deep Dive: How StarRocks Built a High-Performance Vectorized Engine · starrocks.io · 4d · ⚡ Vectorized Execution
anadim/AdderBoard: Smallest transformer that can add two 10-digit numbers · github.com · 17h · 🎯 Vector Quantization
Bitwise Systolic Array Architecture for Runtime-Reconfigurable Multi-precision Quantized Multiplication on Hardware Accelerators · arxiv.org · 2d · ⚡ Hardware Acceleration
Why Structured Kernels? · modular.com · 2d · ⚡ Hardware Acceleration
The RISC Concept - A Survey of Implementations · inf.fu-berlin.de · 1d · 🖥️ Hardware Architecture
Gilles Darold: pgdsat version 2.0 · postgr.es · 2h · 🐘 pgvector
Every Hardware Deserves a Coder: Devstral Small 2 24B and Qwen3 Coder 30B · byteshape.com · 8h · Discuss: Hacker News · 🔬 RaBitQ
Parallelizable Search-Space Decomposition for Large-Scale Combinatorial Optimization Problems Using Ising Machines · arxiv.org · 2d · ⚡ Vectorized Execution
Build Your Own Key-Value Storage Engine—Week 7 · read.thecoder.cafe · 2d · 🌳 Data Structures
Maximizing GPU Utilization with NVIDIA Run:ai and NVIDIA NIM · developer.nvidia.com · 1d · 📊 Model Serving Economics
Essential Python Libraries for Data Science · pub.towardsai.net · 1d · 🕯️ Candle
An AI-Native Architecture That Eliminates GPU Inefficiencies · semiwiki.com · 2d · ⚡ Hardware Acceleration
fast-servers: an interesting pattern · geocar.sdf1.org · 14h · Discuss: Lobsters · 🧵 Async
TENSURE: Fuzzing Sparse Tensor Compilers (Registered Report) · ndss-symposium.org · 5h · Discuss: Hacker News · 🕯️ Candle
An FPGA-based Accelerator Addressing Bottlenecks in GNN Preprocessing (KAIST et al.) · semiengineering.com · 2d · ⚡ Hardware Acceleration
Profile-guided optimization · go.dev · 1d · ⚡ Systems Performance