Towards end-to-end automation of AI research

Chris Lu; Cong Lu; Robert Tjarko Lange; Yutaro Yamada; Shengran Hu; Jakob Foerster; David Ha; Jeff Clune

doi:10.1038/s41586-026-10265-5

Scientific discovery · Flagship work · Disruptive · Explainer video available

Curation and commentary: DAST AI

Published: 2026-03-25

Curatorial commentary

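Automated science has long had no shortage of point tools: idea generation, code writing, experiment execution, paper writing, literature search, and review assistance have each seen progress. What has been missing is a way to connect the entire research life cycle into a single agentic workflow that can be run and evaluated. The AI Scientist targets exactly this gap.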

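The paper proposes an end-to-end research pipeline: it automatically generates research directions and plans, runs experiments, visualizes and records the results, writes a complete paper, and then has an Automated Reviewer review it. The system supports two experimental paths, template-based and template-free, and in the latter it introduces tree search to scale test-time compute.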
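The tree search mentioned for the template-free path can be illustrated with a minimal sketch, assuming a generic best-first search over candidate experiment plans. The names `Node`, `best_first_search`, `toy_expand` and `toy_score` are hypothetical and do not come from the paper. The toy scoring function merely stands in for "run the experiment and measure how promising it is", and the paper's actual search procedure may differ.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(order=True)
class Node:
    # heapq is a min-heap, so the negated score is used as the sort key.
    neg_score: float
    plan: str = field(compare=False)
    depth: int = field(compare=False, default=0)


def best_first_search(
    root_plan: str,
    expand: Callable[[str], List[str]],
    score: Callable[[str], float],
    budget: int,
) -> Node:
    """Expand the highest-scoring plan first until `budget` expansions are spent."""
    root = Node(-score(root_plan), root_plan)
    frontier = [root]
    best = root
    for _ in range(budget):
        if not frontier:
            break
        node = heapq.heappop(frontier)
        for child_plan in expand(node.plan):
            child = Node(-score(child_plan), child_plan, node.depth + 1)
            if child.neg_score < best.neg_score:
                best = child
            heapq.heappush(frontier, child)
    return best


# Hypothetical stand-ins: a real system would propose plan refinements with a
# foundation model and score them by actually running experiments.
def toy_expand(plan: str) -> List[str]:
    return [plan + "a", plan + "b"]


def toy_score(plan: str) -> float:
    return float(plan.count("a"))


result = best_first_search("", toy_expand, toy_score, budget=3)
print(result.plan, result.depth)
```

A larger `budget` means more expansions, which is the sense in which such a search trades extra test-time compute for a wider exploration of candidate plans.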

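The significance of this work lies not in any single submodule being the strongest, but in demonstrating for the first time the complete AI research workflow, "from conception to submission", as a runnable system, with workshop submission and reviewer prediction serving as external validation. For agent-driven scientific workflows, this is a clear change in research framing.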

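It was not rated one level higher, as a paradigm, and the reasons are clear: it is currently limited mainly to computable research tasks such as machine learning; manuscripts were manually screened before the submission experiment; what it passed was the first round of workshop review rather than the higher bar of a full-paper main track; and the link between reviewer automation and the quality of the generated research remains open to gaming and contamination.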

Abstract

The automation of science is a long-standing ambition in artificial intelligence (AI) research. Although the community has made substantial progress in automating individual components of the scientific process, a system that autonomously navigates the entire research life cycle, from conception to publication, has remained out of reach. Here we present a pipeline for automating the entire scientific process end to end. We present The AI Scientist, which creates research ideas, writes code, runs experiments, plots and analyses data, writes the entire scientific manuscript, and performs its own peer review. Its ideas, execution and presentation are of sufficient quality that the manuscript generated by this AI system passed the first round of peer review for a workshop of a top-tier machine learning conference. The workshop had an acceptance rate of 70%. Our system leverages modern foundation models within a complex agentic system. We evaluate The AI Scientist in two settings: a focused mode using human-provided code templates as an initial scaffold for conducting research on a specific topic and a template-free, open-ended mode that leverages agentic search for wider scientific exploration. Both settings produce diverse ideas and automatically test, report on and evaluate them. This achievement demonstrates the growing capacity of AI for making scientific contributions and signifies a potential paradigm shift in how research is conducted. As with any impactful new technology, there could be important risks, including taxing overwhelmed review systems and adding noise to the scientific literature. However, if developed responsibly, such autonomous systems could greatly accelerate scientific discovery.

Explainer video

Available on: video page · Bilibili · YouTube

Links

Paper: doi:10.1038/s41586-026-10265-5
