LLMs


Alignment Collapse Under KV Cache Quantization: Diagnosis and Mitigation

LLM | Content type: Academic
arxiv.org

Reachability and asymptotics of Gaussian Transformer dynamics

LLM | Content type: Academic
arxiv.org

LLM-Based Code Documentation Generation and Multi-Judge Evaluation

LLM | Content type: Academic
arxiv.org

The Order Matters: Sequential Fine-Tuning of LLaMA for Coherent Automated Essay Scoring

LLM | Content type: Academic
arxiv.org

A retrieval conditioned rebinding circuit for dynamic entity tracking in large language models

LLM | Content type: Academic
arxiv.org

RedKnot: Efficient Long-Context LLM Serving with Head-Aware KV Reuse and SegPagedAttention

LLM | Content type: Academic
arxiv.org

YouZhi: Towards High-Concurrency Financial LLMs via Adaptive GQA-to-MLA Transition

AI Tools | Content type: Academic
arxiv.org

Tangram: Unlocking Non-Uniform KV Cache for Efficient Multi-turn LLM Serving

Operating Systems | Content type: Academic
arxiv.org, Hacker News

Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models

LLM | Content type: Academic
arxiv.org

LLMCodec: Adapting Video Codecs for Efficient Weight Compression of Large Language Models

Natural Language Processing | Content type: Academic
arxiv.org

SigmaScale: LLM Compression with SVD-based Low-Rank Decomposition and Learned Scaling Matrices

LLM | Content type: Academic
arxiv.org

Empirical Evaluation of Large Language Models for Migration of Code Fragments to Post-Quantum Cryptography

LLM | Content type: Academic
arxiv.org
