Characterizing Mamba's Selective Memory using Auto-Encoders
arxiv.org·1d

Abstract: State space models (SSMs) are a promising alternative to transformers for language modeling because they use fixed memory during inference. However, this fixed memory usage requires some information loss in the hidden state when processing long sequences. While prior work has studied the sequence length at which this information loss occurs, it does not characterize the types of information SSM language models (LMs) tend to forget. In this paper, we address this knowledge gap by identifying the types of tokens (e.g., parts of speech, named entities) and sequences (e.g., code, math problems)…