Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

Category: Reasoning, Memory, and Inference-Time Control · Rating: Breakthrough

Published: 2026-05-21
arXiv: 2605.22791

Curated summary

Gated DeltaNet-2 improves linear attention by decoupling the erase and write operations that update the recurrent memory state.

The method generalizes earlier gated delta and Kimi Delta Attention variants with channel-wise erase and write gates, plus efficient chunkwise training and backward-pass machinery.
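To make the idea of separate erase and write gates concrete, here is a minimal pure-Python sketch of a single recurrent memory update. It is an illustration under stated assumptions, not the paper's formulation: the function names (`matvec`, `decoupled_delta_step`), the gate parameters `erase` and `write`, and the exact update form are hypothetical, and any decay gate or chunkwise training machinery is omitted. In this sketch, setting both gates to the same scalar recovers a classic delta-rule update, and setting `erase` to zero gives purely additive linear attention.

```python
def matvec(S, k):
    """Return S @ k for a matrix S stored as a list of rows."""
    return [sum(s_ij * k_j for s_ij, k_j in zip(row, k)) for row in S]


def decoupled_delta_step(S, k, v, erase, write):
    """One recurrent update with separate channel-wise erase and write gates.

    Assumed (hypothetical) form:
        S_new = S - (S k)(erase * k)^T + v (write * k)^T

    S:     d_v x d_k memory state (list of rows)
    k:     key vector, length d_k
    v:     value vector, length d_v
    erase: per-key-channel erase gate in [0, 1], length d_k
    write: per-key-channel write gate in [0, 1], length d_k
    """
    old = matvec(S, k)  # value currently retrieved by key k
    ek = [e * kj for e, kj in zip(erase, k)]  # gated key used for erasing
    wk = [w * kj for w, kj in zip(write, k)]  # gated key used for writing
    return [
        [S[i][j] - old[i] * ek[j] + v[i] * wk[j] for j in range(len(k))]
        for i in range(len(v))
    ]


# Usage: store a value under a key, then overwrite it.
S0 = [[0.0, 0.0], [0.0, 0.0]]
key = [1.0, 0.0]
ones = [1.0, 1.0]
zeros = [0.0, 0.0]

S1 = decoupled_delta_step(S0, key, [3.0, 4.0], erase=ones, write=ones)
S2 = decoupled_delta_step(S1, key, [5.0, 6.0], erase=ones, write=ones)
S3 = decoupled_delta_step(S1, key, [5.0, 6.0], erase=zeros, write=ones)

read_after_overwrite = matvec(S2, key)    # full erase: old value replaced
read_after_accumulate = matvec(S3, key)   # no erase: values accumulate
```

With full erase, reading the key back returns only the newest value; with erase disabled, the old and new values add up. Decoupling the two gates lets a model choose any point between those behaviors independently per channel.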

The reported results show strong long-context retrieval behavior and competitive language-modeling performance among recurrent and hybrid sequence models.

For this repository, the paper matters as a reusable memory-update primitive for efficient long-context modeling and non-softmax attention architectures.

Links

Paper link · Code