Context Windows

STAR-KV: Low-Rank KV Cache Compression via Soft Thresholding for Adaptive Rank Control

LLM | Content type: Academic
arxiv.org

markusheimerl/gpt: A generative pretrained transformer implementation

💬 LLMs | Content type: Code
github.com | Hacker News

Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

LLM | Content type: Academic
arxiv.org

mingusb/transformer-golf: The Fully Unrolled Transformer: An experimental repository for architecture simplification and compilation. [2026]

💬 LLMs | Content type: Code
github.com | Hacker News

From Rigid to Dynamic: Entropy-Guided Adaptive Inference for Long-Context LLMs

LLM | Content type: Academic
arxiv.org

harshuljain13/llm-inference-at-scale: A Practitioner handbook for production llm serving.

LLM | Content type: Code
github.com | Hacker News

RKSC: Reasoning-Aware KV Cache Sharing and Confident Early Exit for Multi-Step LLM Inference

LLM | Content type: Academic
arxiv.org

Dynamic Linear Attention

💬 LLMs | Content type: Academic
arxiv.org

A Unifying View of Attention Sinks: Two Algorithms, Two Solutions

💾 Cognitive Offloading | Content type: Academic
arxiv.org

Attention Expansion: Enhancing Keyphrase Extraction from Long Documents with Attention-Augmented Contextualized Embeddings

💬 LLMs | Content type: Academic
arxiv.org

Parallel Causal Associative Fields: Gated Sparse Memory for Long-Context Language Modeling

💬 LLMs | Content type: Academic
arxiv.org

When Vision Misleads, Let Location Speak: A Worldwide Image Geo-Localization Method via Location Attention Mechanism and Large Multimodal Models

💬 LLMs | Content type: Academic
arxiv.org

One Step Closer to Ground Truth: A Multi-Scale Residual-Aware Representation Learning Pipeline for Predicting Time Series Data

💬 LLMs | Content type: Academic
arxiv.org

Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It

💬 LLMs | Content type: Academic
arxiv.org

Still: Amortized KV Cache Compaction in a Single Forward Pass

💾 Cognitive Offloading | Content type: Academic
arxiv.org

Beyond Patches: Superpixel Token-based Transformers for Attribute-Specific Fashion Retrieval

💬 LLMs | Content type: Academic
arxiv.org

SpectrumKV: Per-Token Mixed-Precision KV Cache Transfer for Prefill-Decode Disaggregated LLM Serving

LLM | Content type: Academic
arxiv.org

EASE-TTT: Evidence-Aligned Selective Test-Time Training for Long-Context Question Answering

LLM | Content type: Academic
arxiv.org

Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels

💬 LLMs | Content type: Academic
arxiv.org

A Four-Condition Diagnostic Protocol for Evidence Utilization in Long-Context and Retrieval-Augmented Language Models

LLM | Content type: Academic
arxiv.org
