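How can you tell whether an AI is actually reasoning or just reciting memorized answers? Researchers use SuperARC's compression-ratio metric to strip the makeup off large models | DAST Papers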

Corresponding paper

SuperARC: a test for artificial superintelligence based on compressed modelling, recursive prediction and problem complexity

Video description

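This Nature Communications paper proposes SuperARC, which attempts to evaluate frontier AI using compressed modelling, recursive prediction, and problem complexity rather than continuing to rely on human question-and-answer benchmarks. It places AGI/ASI evaluation within the framework of algorithmic information theory and universal prediction, emphasizing the relationship between compression over algorithmic space and the predictive power of formal theories. It is worth including because the repository needs to track AI evaluation interfaces that do not depend on human-authored question banks; even though the paper's claims about ASI are on the strong side, SuperARC is still an evaluation problem definition that can be discussed and reproduced. Under the current rules, it is an AI evaluation framing paper; its limitation is that the theoretical claims and the benchmark are contested, and whether it becomes a long-term standard is highly uncertain, so it is rated breakthrough only, not disruptive.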
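To make the link between compression and structure concrete, here is a minimal Python sketch. It is not SuperARC's method: it swaps in the general-purpose `zlib` compressor as a crude stand-in, and the helper name `compression_ratio` and both test sequences are hypothetical, chosen only for illustration. A compressor of this kind only picks up statistical repetition, so it is at best a rough proxy for the algorithmic notion of complexity the paper appeals to.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# A sequence generated by a short rule: the byte values 0..9 repeated.
regular = bytes(i % 10 for i in range(10_000))

# A pseudo-random byte sequence; the seed is fixed so the run is reproducible.
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(10_000))

print(f"regular: {compression_ratio(regular):.3f}")
print(f"noisy:   {compression_ratio(noisy):.3f}")
```

The rule-generated sequence compresses to a small fraction of its size, while the pseudo-random one barely compresses at all. Note that the pseudo-random sequence is itself produced by a short program and a seed, which zlib cannot detect; that gap between statistical and algorithmic compressibility is the kind of distinction a compression-based evaluation has to confront.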

External video link

Paper link

Paper details page