Transformers

Selective Coupling of Decoupled Informative Regions: Masked Attention Alignment for Data-Free Quantization of Vision Transformers

 🤖AI  Content type: Academic
arxiv.org·

Instruction Finetuning DeepSeek-R1-8B Model Using LoRA and NEFTune

 🎯Fine-Tuning  Content type: Academic
arxiv.org·

Chiaroscuro Attention: Spending Compute in the Dark

 Flash Attention  Content type: Academic
arxiv.org·

Dynamic Linear Attention

 🤖AI  Content type: Academic
arxiv.org·

LazyAttention: Efficient Retrieval-Augmented Generation with Deferred Positional Encoding

 🤖AI  Content type: Academic
arxiv.org·

Look Less, Reason More: Block-wise Attention Skipping for Efficient Multimodal LLMs

 👁️Computer Vision  Content type: Academic
arxiv.org·

InA-Probe: Instruction-Aware Active Probing for Time Series Forecasting with LLMs

 📈Time Series Analysis  Content type: Academic
arxiv.org·

Query-based Cross-Modal Projector Bolstering Mamba Multimodal LLM

 🤖AI  Content type: Academic
arxiv.org·

FuseFSS: Efficient Secure LLM Inference with Function Secret Sharing

 💬LLMs  Content type: Academic
arxiv.org·

Less-relevant results

Inside the Latent Flow: Causal Deciphering of Attention Dynamics in Audio Separation Foundation Models

 📈Optimization  Content type: Academic
arxiv.org·

Signed Dual Attention: Capturing Signed Dependencies in Time Series Forecasting

 🤖AI  Content type: Academic
arxiv.org·

When Vision Misleads, Let Location Speak: A Worldwide Image Geo-Localization Method via Location Attention Mechanism and Large Multimodal Models

 🤖AI  Content type: Academic
arxiv.org·

Inside the LLM Word Factory

 💬Natural Language Processing  Content type: Academic
arxiv.org·

Transformer-Enhanced Reinforcement Learning: Fundamentals and Applications in Communication Networks

 🤖AI  Content type: Academic
arxiv.org·

TextEconomizer: Enhancing Lossy Text Compression with Denoising Transformers and Entropy Coding

 🤖AI  Content type: Academic
arxiv.org·

ATT-CR: Adaptive Triangular Transformer for Cloud Removal

 🧮Complexity Theory  Content type: Academic
arxiv.org·

Towards Tight Bounds for Streaming Attention

 🤖AI  Content type: Academic
arxiv.org·

Depth-Attention: Cross-Layer Value Mixing for Language Models

 📈Optimization  Content type: Academic
arxiv.org·

Beyond Item IDs: Scaling Short-Form-Video Recommendation via Semantic-Native Long Sequence Modeling

 🧮Complexity Theory  Content type: Academic
arxiv.org·

Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels

 🤖AI  Content type: Academic
arxiv.org·
