Scour
Memory Management
Garbage Collection, Reference Counting, Linear Types, Region-based
Scoured 2647 posts in 9.0 ms
KV Cache Offloading for Context-Intensive Tasks
Computer Architecture · arxiv.org · 10h
ForkKV: Scaling Multi-LoRA Agent Serving via Copy-on-Write Disaggregated KV Cache
Computer Architecture · arxiv.org · 1d
CIDER: Boosting Memory-Disaggregated Key-Value Stores with Pessimistic Synchronization
Computer Architecture · arxiv.org · 4d
Dual-Pool Token-Budget Routing for Cost-Efficient and Reliable LLM Serving
Cryptography · arxiv.org · 10h
Top-K Retrieval with Fixed-Size Linear-Attention Completion: Backbone- and KV-Format-Preserving Attention for KV-Cache Read Reduction
Probabilistic Programming · arxiv.org · 2d
FluxMoE: Decoupling Expert Residency for High-Performance MoE Serving
Probabilistic Programming · arxiv.org · 4d
MIPT-SSM: Scaling Language Models with $O(1)$ Inference Cache via Phase Transitions
Probabilistic Programming · arxiv.org · 10h
AudioKV: KV Cache Eviction in Efficient Large Audio Language Models
Computer Architecture · arxiv.org · 1d
Geometric Entropy and Retrieval Phase Transitions in Continuous Thermal Dense Associative Memory
Information Theory · arxiv.org · 10h
TRAPTI: Time-Resolved Analysis for SRAM Banking and Power Gating Optimization in Embedded Transformer Inference
Computer Architecture · arxiv.org · 1d
A Synthesis Method of Safe Rust Code Based on Pushdown Colored Petri Nets
Rust · arxiv.org · 4d
StructKV: Preserving the Structural Skeleton for Scalable Long-Context Inference
Probabilistic Programming · arxiv.org · 1d
Attention Editing: A Versatile Framework for Cross-Architecture Attention Conversion
Probabilistic Programming · arxiv.org · 2d
Salt: Self-Consistent Distribution Matching with Cache-Aware Training for Fast Video Generation
Computer Architecture · arxiv.org · 4d
DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models
Probabilistic Programming · arxiv.org · 2d
Your LLM Agent Can Leak Your Data: Data Exfiltration via Backdoored Tool Use
Cryptography · arxiv.org · 2d
PG-MDP: Profile-Guided Memory Dependence Prediction for Area-Constrained Cores
Computer Architecture · arxiv.org · 10h
Lightweight LLM Agent Memory with Small Language Models
Probabilistic Programming · arxiv.org · 10h
Fast Heterogeneous Serving: Scalable Mixed-Scale LLM Allocation for SLO-Constrained Inference
Probabilistic Programming · arxiv.org · 10h
MemCoT: Test-Time Scaling through Memory-Driven Chain-of-Thought
Probabilistic Programming · arxiv.org · 10h