Harness Updating Is Not Harness Benefit: Disentangling Evolution Capabilities in Self-Evolving LLM Agents

Category: Agents and Autonomous Science | Tier: Breakthrough

Published: 2026-05-28
arXiv: 2605.30621

Key Points

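Problem/Background: The paper offers an important diagnostic framework for self-evolving agents: being able to write a harness update is not the same as being able to benefit from one.
Method/Mechanism: It splits the capability into harness-updating and harness-benefit and finds that the two relate differently to a model's base capability: harness-updating does not improve noticeably with model strength, whereas harness-benefit is non-monotonic.
Inclusion value: It gives the training of agent skill/memory/tool harnesses an evaluation boundary and a resource-allocation principle: do not train only the evolver; also train artifact invocation and long-horizon instruction following.
Risks and limitations: This is still the first arXiv version, and the core conclusions need further replication across models, environments, and real deployment settings; the paper is therefore graded as breakthrough rather than disruptive/paradigm.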

Paper Abstract

The paper studies self-evolving LLM agents whose editable external harnesses include prompts, skills, memories, and tools. It separates harness-updating, the ability to produce useful persistent updates, from harness-benefit, the ability to benefit from updated harnesses. Results show harness-updating is flat across base capability tiers, while harness-benefit is non-monotonic, with mid-tier models benefiting most and weak-tier models failing through activation and faithful-following failures.
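One way to picture the separation of the two capabilities is a crossed design in which every model tier acts both as the evolver that writes a harness update and as the solver that uses it. The sketch below is an illustration under stated assumptions, not the paper's actual protocol or metric: the tier names, the table layout, the averaging scheme, and every number are hypothetical and synthetic, chosen only to exercise the code.

```python
from statistics import mean

# ASSUMPTION: synthetic success rates, not results from the paper.
# BASELINE[solver] = success rate of `solver` with the original harness.
BASELINE = {"weak": 0.20, "mid": 0.40, "strong": 0.70}

# UPDATED[evolver][solver] = success rate of `solver` when it runs with
# the harness written by `evolver` (hypothetical crossed evaluation).
UPDATED = {
    "weak":   {"weak": 0.21, "mid": 0.52, "strong": 0.74},
    "mid":    {"weak": 0.20, "mid": 0.54, "strong": 0.75},
    "strong": {"weak": 0.22, "mid": 0.53, "strong": 0.76},
}


def gain(evolver: str, solver: str) -> float:
    """Success-rate change for `solver` after adopting `evolver`'s update."""
    return UPDATED[evolver][solver] - BASELINE[solver]


def harness_updating(evolver: str) -> float:
    """Score the evolver: average gain its update yields across all solvers."""
    return mean(gain(evolver, solver) for solver in BASELINE)


def harness_benefit(solver: str) -> float:
    """Score the solver: average gain it extracts across all evolvers' updates."""
    return mean(gain(evolver, solver) for evolver in UPDATED)


if __name__ == "__main__":
    for tier in BASELINE:
        print(tier, round(harness_updating(tier), 3), round(harness_benefit(tier), 3))
```

Averaging over the other factor is the design choice that keeps the two scores apart: a tier's harness-updating score does not depend on how well that same tier uses harnesses, and its harness-benefit score does not depend on how well it writes them.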
