PANDO: Efficient Multimodal AI Agents via Online Skill Distillation

Agents and Autonomous Science | Breakthrough tier

Inclusion Notes

Multimodal web agents often raise their success rates through rollout search, verifier passes, and specialist stacks, but each of these drives inference cost steadily higher.

PANDO analyzes VisualWebArena trajectories to identify sources of inefficiency, including repeat-action loops, hidden discovery costs, and low prompt-cache reuse, and proposes online skill distillation in response.
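To make "repeat-action loops" concrete, here is a minimal sketch of how such loops could be counted in a logged trajectory. It is an illustration only, not PANDO's actual analysis: the action-string format, the function names, and the `min_run` threshold are all assumptions, and a "loop" is simplified to a run of identical consecutive actions.

```python
from itertools import groupby


def repeat_action_loops(actions, min_run=2):
    """Return (action, run_length) for every run of identical consecutive actions.

    Assumption: a trajectory is a list of hashable action labels, and a
    "loop" is any run of at least `min_run` identical actions in a row.
    """
    loops = []
    for action, group in groupby(actions):
        run = len(list(group))
        if run >= min_run:
            loops.append((action, run))
    return loops


def wasted_steps(actions, min_run=2):
    """Count steps beyond the first in each loop, a rough proxy for wasted calls."""
    return sum(run - 1 for _, run in repeat_action_loops(actions, min_run))


# Hypothetical trajectory; the labels are made up for illustration.
trajectory = [
    "click:search", "type:query",
    "click:next", "click:next", "click:next",
    "scroll", "scroll",
    "click:item",
]
loops = repeat_action_loops(trajectory)
wasted = wasted_steps(trajectory)
```

In this toy trajectory, `loops` holds the two repeated runs and `wasted` is 3, since two of the three `click:next` steps and one of the two `scroll` steps are redundant under this simplified definition.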

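The core idea is to have the agent distill repeated experience into skills, so that accumulated experience lowers future cost instead of adding ever more test-time compute.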
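The cost argument can be sketched with a toy skill library. This is not PANDO's mechanism, which the summary does not describe. It is a hypothetical illustration in which an action sequence that succeeds `promote_after` times for the same task type is stored as a skill and replayed later without calling the planner. `SkillLibrary`, `run_episode`, `toy_planner`, and the one-model-call-per-action cost model are all assumptions.

```python
from collections import Counter


class SkillLibrary:
    """Toy store that promotes repeated action sequences to reusable skills."""

    def __init__(self, promote_after=2):
        self.promote_after = promote_after
        self.counts = Counter()   # (task_type, action tuple) -> times seen
        self.skills = {}          # task_type -> stored action list

    def lookup(self, task_type):
        return self.skills.get(task_type)

    def record(self, task_type, actions):
        key = (task_type, tuple(actions))
        self.counts[key] += 1
        if self.counts[key] >= self.promote_after:
            self.skills[task_type] = list(actions)


def run_episode(task_type, library, planner):
    """Return (actions, model_calls). Replaying a stored skill costs 0 calls."""
    skill = library.lookup(task_type)
    if skill is not None:
        return skill, 0
    actions = planner(task_type)
    library.record(task_type, actions)
    # Assumed cost model: one model call per planned action.
    return actions, len(actions)


def toy_planner(task_type):
    # Stand-in for an expensive multimodal model; always returns the same plan.
    return ["open_search", "type_query", "click_result"]


library = SkillLibrary(promote_after=2)
costs = [run_episode("find_item", library, toy_planner)[1] for _ in range(4)]
```

Here `costs` comes out as `[3, 3, 0, 0]`: the first two episodes pay for planning, and once the sequence has been promoted to a skill, later episodes of the same task type incur no planner calls. A real system would also need to verify that a replayed skill still succeeds, which this sketch omits.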

It merits inclusion because it directly connects agent experience, skill distillation, and inference efficiency, a key systems problem for agents deployed over the long term.