RAG and Long Context Aren't Enough for Agent Memory. δ-mem Is a Third Option (opens in new tab)

Covers $\delta$-mem: Efficient Online Memory for Large Language ModelsDiscussed on DEV

An 8×8 online state lifted Qwen3-4B from 46.79% to 51.66%, with the backbone untouched. δ-mem stores an LLM’s conversation history inside an 8×8 matrix and uses it to steer attention. The backbone stays frozen. No prompt growth. No fine-tuning. On Qwen3-4B-Instruct, that small matrix lifts the average score across five benchmarks from 46.79% to 51.66%, with 4.87M trainable parameters (0.12% of the model). The adapter is public on Hugging Face under CC-BY-4.0. The arXiv paper landed on May 12,...

Read the original article