Scour
📐 SIMD
Specific
SIMD, AVX, SSE, vectorization, data parallelism, CPU intrinsics
Scoured 238 posts in 8.7 ms
Shortest-Path FFT: Optimal SIMD Instruction Scheduling via Graph Search
⚡ SIMD Vectorization · arxiv.org · 3d
the value of a performance oracle
⚙️ Mechanical Sympathy · wingolog.org · 2d · Lobsters, Hacker News
The Insert Benchmark vs MariaDB 10.2 to 13.0 on a 32-core server
📝 Database WAL · smalldatum.blogspot.com · 1d
Working with identity columns and sequences in Aurora DSQL
⚙️ Database Internals · aws.amazon.com · 3d
Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC
⚡ Hardware Transactional Memory · arxiv.org · 4h
Wattlytics: A Web Platform for Co-Optimizing Performance, Energy, and TCO in HPC Clusters
⚙️ Mechanical Sympathy · arxiv.org · 4h
Scheduling Coflows in Multi-Core OCS Networks with Performance Guarantee
🌐 Distributed Systems · arxiv.org · 4h
Sparsity-Aware Roofline Models for Sparse Matrix-Matrix Multiplication
⚡ SIMD Vectorization · arxiv.org · 1d
Beyond Dense Connectivity: Explicit Sparsity for Scalable Recommendation
💰 Cost-Based Optimization · arxiv.org · 4h
The Insert Benchmark vs MariaDB 10.2 to 13.0 on a 24-core server
⚙️ Database Internals · smalldatum.blogspot.com · 2d
ENEC: A Lossless AI Model Compression Method Enabling Fast Inference on Ascend NPUs
⚡ SIMD Optimization · arxiv.org · 3d
Ensembles at Any Cost? Accuracy-Energy Trade-offs in Recommender Systems
💰 Cost-Based Optimization · arxiv.org · 4h
Analyzing Persistent Alltoallv RMA Implementations for High-Performance MPI Communication
🔗 RDMA · arxiv.org · 2d
DeepStack: Scalable and Accurate Design Space Exploration for Distributed 3D-Stacked AI Accelerators
⚡ SIMD Vectorization · arxiv.org · 3d
NL-CPS: Reinforcement Learning-Based Kubernetes Control Plane Placement in Multi-Region Clusters
👑 Leader Election · arxiv.org · 4h
Making Room for AI: Multi-GPU Molecular Dynamics with Deep Potentials in GROMACS
⚡ SIMD Vectorization · arxiv.org · 1d
Efficient Dataset Selection for Continual Adaptation of Generative Recommenders
💰 Cost-Based Optimization · arxiv.org · 4h
JZ-Tree: GPU friendly neighbour search and friends-of-friends with dual tree walks in JAX plus CUDA
🌳 B+ Trees · arxiv.org · 2d
Minos: Systematically Classifying Performance and Power Characteristics of GPU Workloads on HPC Clusters
⚙️ Mechanical Sympathy · arxiv.org · 3d
ALTO: Adaptive LoRA Tuning and Orchestration for Heterogeneous LoRA Training Workloads
📊 Columnar Execution · arxiv.org · 2d