⏱️ Instruction Scheduling
Pipeline, Out-of-Order, Dependencies, Latency
Scoured 184833 posts in 30.3 ms
AccelSync: Verifying Synchronization Coverage in Accelerator Pipeline Programs
🛡️ Intel CET
arxiv.org · 1d
Three CPU Generations That Changed Everything: A Latency-Focused History of x86
⚡ BOLT
lucisqr.substack.com · 6d · Substack
nviennot/core-to-core-latency: Measures the latency between CPU cores
📊 Intel PMU
github.com · 6h · Hacker News
How to Eliminate Pipeline Friction in AI Model Serving
🌀 Naiad
developer.nvidia.com · 1h
Efficient Remote Memory Ordering for Non-Coherent Systems
🔁 Cache Coherence
danglingpointers.substack.com · 6h · Substack
FPGA Spectrum Engine
🔌 FPGA Programming
hackaday.io · 5d
Per-Phase Fidelity Attribution for Quantum Compilers using HBR Decomposition
📊 Profile-Guided Optimization
arxiv.org · 15h
CCD-Level and Load-Aware Thread Orchestration for In-Memory Vector ANNS on Multi-Core CPUs
🚀 Milvus
arxiv.org · 15h
Beyond Static Policies: Exploring Dynamic Policy Selection for Single-Thread Performance Optimization
📍 CPU Pinning
arxiv.org · 4d
Non-Monotonic Latency in Apple MPS Decoding: KV Cache Interactions and Execution Regimes
🔁 Cache Coherence
arxiv.org · 15h
31.1 A 14.08-to-135.69Token/s ReRAM-on-Logic Stacked Outlier-Free Large-Language-Model Accelerator with Block-Clustered Weight-Compression and Adaptive Parallel-Speculative-Decoding
🌊 Memory Bandwidth
arxiv.org · 15h
AutoRAGTuner: A Declarative Framework for Automatic Optimization of RAG Pipelines
📊 Performance Tools
arxiv.org · 6d
KV-RM: Regularizing KV-Cache Movement for Static-Graph LLM Serving
🎴 TAO
arxiv.org · 15h
Janus: Compiler-Based Defense Against Transient Execution Attacks Using ARM Hardware Primitives
🏷️ Memory Tagging
arxiv.org · 15h
PoTAcc: A Pipeline for End-to-End Acceleration of Power-of-Two Quantized DNNs
🧮 Intel MKL-DNN
arxiv.org · 4d
Scratchpad Patching: Decoupling Compute from Patch Size in Byte-Level Language Models
🏷️ Pointer Tagging
arxiv.org · 15h
UniVer: A Unified Perspective for Multi-step and Multi-draft Speculative Decoding
📦 Folly
arxiv.org · 5d
Nitsum: Serving Tiered LLM Requests with Adaptive Tensor Parallelism
🦙 Ollama
arxiv.org · 4d
Triage: An Adaptive Parallel Window Decoding Scheduler for Real-time Fault-Tolerant Quantum Computation
⚛️ Quantum Computing
arxiv.org · 5d
Low-Latency Out-of-Core ANN Search in High-Dimensional Space
📊 Vectorized Query Execution
arxiv.org · 4d