Trajectory-Informed Memory Generation for Self-Improving Agent Systems

Gaodan Fang; Vatche Isahagian; K. R. Jayaram; Ritesh Kumar; Vinod Muthusamy; Punleuk Oum; Gegi Thomas

Agents and Autonomous Science · Breakthrough tier

Published: 2026-03-11
arXiv: 2603.10600

Curation Notes

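A long-standing unsolved problem for LLM agents is that, even when a task gets completed, the failure modes, inefficient steps, and recovery strategies encountered along the way do not automatically turn into capabilities that can be reused later. Existing memory systems tend to store only conversational facts or scattered experience without really understanding the decision structure in an agent's execution trajectory, which makes it hard to keep improving performance on future tasks.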

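This work turns execution trajectories into retrievable, structured learning signals. The framework consists of semantic trajectory analysis, decision attribution for failures and recoveries, generation of three kinds of tips (strategy, recovery, and optimization) based on execution quality, and adaptive memory retrieval that injects tips dynamically according to contextual similarity. The point is not to add yet another static memory store, but to have memory come from interpretable trajectory learning and to keep its provenance.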
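To make the retrieval-and-injection step concrete, here is a minimal Python sketch. It is not the authors' implementation. The `Tip` fields, the `app` context dimension, the token-overlap similarity, the 0.7/0.3 weights, and the example tips are all assumptions made for illustration; the paper's actual "multi-dimensional similarity" is not specified in this summary.

```python
from dataclasses import dataclass
from enum import Enum


class TipType(Enum):
    STRATEGY = "strategy"          # learned from successful patterns
    RECOVERY = "recovery"          # learned from failure handling
    OPTIMIZATION = "optimization"  # learned from inefficient but successful runs


@dataclass(frozen=True)
class Tip:
    tip_type: TipType
    text: str
    task_description: str  # context in which the tip was learned
    app: str               # hypothetical extra context dimension
    trajectory_id: str     # provenance: which trajectory produced the tip
    step_index: int        # provenance: which step inside that trajectory


def _tokens(s: str) -> set[str]:
    return set(s.lower().split())


def similarity(tip: Tip, task_description: str, app: str) -> float:
    """Toy two-dimensional similarity (assumed): Jaccard overlap plus app match."""
    a, b = _tokens(tip.task_description), _tokens(task_description)
    jaccard = len(a & b) / len(a | b) if (a | b) else 0.0
    app_match = 1.0 if tip.app == app else 0.0
    return 0.7 * jaccard + 0.3 * app_match


def retrieve(memory: list[Tip], task_description: str, app: str,
             k: int = 2, min_score: float = 0.2) -> list[Tip]:
    """Return up to k tips scoring at least min_score, best first."""
    scored = [(similarity(t, task_description, app), t) for t in memory]
    scored = [(s, t) for s, t in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:k]]


def inject(prompt: str, tips: list[Tip]) -> str:
    """Append retrieved tips, with provenance, to the agent prompt."""
    if not tips:
        return prompt
    lines = [
        f"- [{t.tip_type.value}] {t.text} "
        f"(source: {t.trajectory_id}, step {t.step_index})"
        for t in tips
    ]
    return prompt + "\n\nGuidance from past executions:\n" + "\n".join(lines)


# Hypothetical memory contents, invented for this example.
memory = [
    Tip(TipType.STRATEGY, "List playlists before adding songs.",
        "add a song to a playlist", "spotify", "traj-001", 3),
    Tip(TipType.RECOVERY, "On a 401 response, log in again and retry.",
        "send money to a friend", "venmo", "traj-002", 7),
    Tip(TipType.OPTIMIZATION, "Fetch all pages in one loop, not one call per item.",
        "list all songs in a playlist", "spotify", "traj-003", 5),
]

tips = retrieve(memory, "add a song to my workout playlist", "spotify")
prompt = inject("Task: add a song to my workout playlist", tips)
print(prompt)
```

Keeping `trajectory_id` and `step_index` on every tip is what lets an injected piece of guidance be traced back to the execution that produced it, which is the provenance property the summary emphasizes.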

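Within this repository it is a high-value methods paper on the main agent systems track, directly related to self-improving agents, memory-augmented agents, and long-horizon task improvement. It has clear spillover value for bridging agent training and inference, and is worth reading alongside the existing tool-use, exploration, and self-improvement lines of work.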

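It is not promoted to a higher tier for now, because the evidence is concentrated on AppWorld-style benchmarks and it has not yet shown itself to be the default approach to general-purpose agent memory. It is also still at the arXiv stage, so cross-environment reproduction and long-term adoption need further validation.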

Abstract

LLM-powered agents face a persistent challenge: learning from their execution experiences to improve future performance. While agents can successfully complete many tasks, they often repeat inefficient patterns, fail to recover from similar errors, and miss opportunities to apply successful strategies from past executions. We present a novel framework for automatically extracting actionable learnings from agent execution trajectories and utilizing them to improve future performance through contextual memory retrieval. Our approach comprises four components: (1) a Trajectory Intelligence Extractor that performs semantic analysis of agent reasoning patterns, (2) a Decision Attribution Analyzer that identifies which decisions and reasoning steps led to failures, recoveries, or inefficiencies, (3) a Contextual Learning Generator that produces three types of guidance—strategy tips from successful patterns, recovery tips from failure handling, and optimization tips from inefficient but successful executions—and (4) an Adaptive Memory Retrieval System that injects relevant learnings into agent prompts based on multi-dimensional similarity. Unlike existing memory systems that store generic conversational facts, our framework understands execution patterns, extracts structured learnings with provenance, and retrieves guidance tailored to specific task contexts. Evaluation on the AppWorld benchmark demonstrates consistent improvements, with up to 14.3 percentage point gains in scenario goal completion on held-out tasks and particularly strong benefits on complex tasks (28.5 pp scenario goal improvement, a 149% relative increase).
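The two complex-task figures in the abstract can be cross-checked against each other. The sketch below assumes that the 28.5 pp gain and the 149% relative increase describe the same metric on the same task subset, and that the relative increase is measured against the baseline. Under those assumptions the baseline is the gain divided by the relative increase. Both inputs are rounded, so the outputs are approximate and are not figures reported in this summary.

```python
def implied_baseline(gain_pp: float, relative_increase_pct: float) -> float:
    """Baseline (in %) implied by an absolute gain and a relative increase.

    relative_increase = gain / baseline  =>  baseline = gain / relative_increase
    """
    return gain_pp / (relative_increase_pct / 100.0)


gain_pp = 28.5           # absolute gain, from the abstract
relative_pct = 149.0     # relative increase, from the abstract

baseline = implied_baseline(gain_pp, relative_pct)
after = baseline + gain_pp
print(f"implied baseline ~ {baseline:.1f}%, implied result ~ {after:.1f}%")
```

The low implied starting point is what lets an absolute gain of under 30 points show up as a relative increase well above 100%.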

Links

Paper link