Task-Aware Structured Memory for Dynamic Multi-modal In-Context Learning
Multi-modal large language models (MLLMs) depend on in-context learning (ICL) for rapid task adaptation, but their scalability is severely limited by finite context windows and the growing cost of key-value (KV) caches in long multi-modal sequences. Existing memory compression approaches typically rely on rigid token removal or sample-dependent importance estimation, which introduces bias, disrupts semantic structure, particularly for visual rep...