A Benchmark for Interactive World Models with a Unified Action Generation Framework

Multimodal Generation and World Models | Breakthrough tier

Inclusion Rationale

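Interactive world models have long lacked a unified evaluation interface; in particular, the action interfaces of different models are often simply incompatible. The key contribution of this work is not yet another world model but a unified action generation framework that aligns evaluation across models.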
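To make the incompatibility concrete, here is a minimal sketch of how one model-agnostic action could be translated into two different action interfaces. It is an illustration under stated assumptions, not the paper's framework: `CanonicalAction`, `KeyboardAdapter`, `PoseDeltaAdapter`, the W/A/S/D key mapping, and the step sizes are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CanonicalAction:
    """Hypothetical model-agnostic action (not taken from the paper)."""
    forward_m: float  # metres to move forward (negative = backward)
    yaw_deg: float    # degrees to turn left (negative = right)


class ActionAdapter(Protocol):
    """Anything that can turn a canonical action into a model-specific one."""
    def encode(self, action: CanonicalAction) -> object: ...


class KeyboardAdapter:
    """For an assumed model driven by discrete key presses."""

    def __init__(self, step_m: float = 0.5, turn_deg: float = 15.0) -> None:
        self.step_m = step_m      # assumed distance covered per key press
        self.turn_deg = turn_deg  # assumed rotation per key press

    def encode(self, action: CanonicalAction) -> list[str]:
        # Quantise the continuous action into repeated key presses.
        n_steps = round(abs(action.forward_m) / self.step_m)
        n_turns = round(abs(action.yaw_deg) / self.turn_deg)
        keys = ["W" if action.forward_m > 0 else "S"] * n_steps
        keys += ["A" if action.yaw_deg > 0 else "D"] * n_turns
        return keys


class PoseDeltaAdapter:
    """For an assumed model conditioned on continuous camera-pose deltas."""

    def encode(self, action: CanonicalAction) -> dict[str, float]:
        return {"dz": action.forward_m, "dyaw": action.yaw_deg}


def encode_for_all(
    action: CanonicalAction, adapters: dict[str, ActionAdapter]
) -> dict[str, object]:
    """Translate one canonical action for every registered model."""
    return {name: adapter.encode(action) for name, adapter in adapters.items()}


if __name__ == "__main__":
    action = CanonicalAction(forward_m=1.0, yaw_deg=-30.0)
    adapters = {"keyboard": KeyboardAdapter(), "pose": PoseDeltaAdapter()}
    print(encode_for_all(action, adapters))
```

With one canonical action space, a benchmark task only has to be specified once; each model contributes an adapter rather than a separate copy of the task.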

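The value of iWorld-Bench lies in explicitly breaking interaction-related abilities down into measurable tasks, such as distance perception and memory, so that world models built on different interaction paradigms can be compared under the same benchmark.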

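It merits formal inclusion because benchmark primitives are scarcer in this area than yet another model paper. As long as world models keep moving from pure generation toward serving as an interactive agent substrate, this kind of unified evaluation interface matters.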

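It is not rated higher because it still mainly addresses the unification of benchmarks and interfaces, and does not directly demonstrate the long-term dominance of any particular world-model architecture.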