Safety, Governance, and Reliability | Breakthrough-tier
Published
2026-06-10
arXiv
2606.12320

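Key Points

Problem / Background
The paper addresses the governance of production AI agents. It argues that agents move risk away from the data boundary and into the workflow itself, where a sequence of individually permitted actions can still add up to an unauthorized change, and it proposes a five-plane reference architecture in response.
Method / Mechanism
The architecture consists of one reasoning plane plus four enforcement planes (network, identity, endpoint, and data). It also introduces stop-anywhere mediation, composite principals, capability attenuation, and a structured audit substrate.
Results / Evidence
The paper defines a taxonomy of six interruption primitives and four correctness invariants, and it walks through seven production-agent threats across five concrete workflows to show how these controls block unauthorized workflows. A reference implementation of the policy-engine core supplies the measured evidence.
Why It Is Included
It is worth including because it moves agent safety beyond prompt and model behavior to runtime authority, delegation chains, audit evidence, and workflow control boundaries, and the result is a reusable system architecture.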
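
As an illustration of capability attenuation over a delegation chain, the sketch below models a composite principal whose authority can only shrink as it is delegated. It is not the paper's implementation: the `Principal` class, the `delegate` and `is_allowed` helpers, and the capability strings are all hypothetical names chosen for this example, and set intersection is just one simple way to realize attenuation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Principal:
    """Hypothetical composite principal: name, capabilities, and delegator."""
    name: str
    capabilities: FrozenSet[str]
    parent: Optional["Principal"] = None

    def delegate(self, name: str, requested: set) -> "Principal":
        # Attenuation: the delegate gets at most what the delegator holds.
        granted = self.capabilities & frozenset(requested)
        return Principal(name, granted, parent=self)

    def chain(self) -> list:
        # Walk the delegation chain from this principal back to the root.
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


def is_allowed(principal: Principal, capability: str) -> bool:
    # Allowed only if every principal in the chain holds the capability.
    node = principal
    while node is not None:
        if capability not in node.capabilities:
            return False
        node = node.parent
    return True


# Illustrative chain: a user delegates to an agent, which delegates to a connector.
user = Principal("alice", frozenset({"crm:read", "crm:write", "mail:send"}))
agent = user.delegate("sales-agent", {"crm:read", "crm:write", "db:drop"})
tool = agent.delegate("crm-connector", {"crm:read", "mail:send"})

print(tool.chain())                     # delegation chain, leaf to root
print(is_allowed(tool, "crm:read"))     # held by every link in the chain
print(is_allowed(tool, "mail:send"))    # never granted to the agent
```

In this sketch the agent's request for `db:drop` is silently dropped because the user never held it, and the connector cannot regain `mail:send` because the agent was not given it. A real policy engine would presumably also track state across actions, which this example does not attempt.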

Original Abstract

Title (translated)
A Five-Plane Reference Architecture for the Runtime Governance of Production AI Agents

Abstract

Enterprise security was built to govern data boundaries: the protected surface was data at rest and in transit, and the controls (access control, data-loss prevention, perimeter inspection) governed crossings of that boundary. Production AI agents dissolve this assumption. An agent reads context, calls tools, invokes connectors, and modifies systems of record on an enterprise’s behalf, so risk moves inside the workflow, into sequences of individually-permitted actions that may transform a business process no one authorized. Existing policy engines do not extend to this regime: they evaluate request-time decisions against atomic principals, where agentic systems require stateful evaluation against composite principals whose authority attenuates through delegation chains. We present a reference architecture for the runtime governance of production agents, built from four composable primitives: a five-plane decomposition (a reasoning plane that adjudicates intent, plus four enforcement planes for network, identity, endpoint, and data that realize the decision); stop-anywhere mediation; composite principals with capability attenuation; and audit as a structured evidence substrate. We define a taxonomy of six interruption primitives that generalize allow and deny, state and argue for four correctness invariants, demonstrate the foreclosure of seven production-agent threats across five concrete workflows, and propose an evaluation framework that treats safety and utility as joint objectives. A reference implementation of the policy-engine core supplies measured evidence: attenuation correctness and evidence reconstructability hold on every trial, adjudication runs in single-digit microseconds, and the audit substrate’s tamper-evidence behaves exactly as designed. 
We are explicit about scope: the architecture governs delegated action, not model behavior; its invariants are argued structurally, not formally proved; and the reference implementation validates the architecture’s internal claims, leaving a full-system evaluation against a live agent benchmark as the invited next step.
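
The abstract mentions an audit substrate with tamper-evidence but does not say how it is built. One common way to obtain that property is a hash chain, sketched below under that assumption. The record layout, the `append` and `verify` helpers, and the event fields are hypothetical and are not taken from the paper.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash: str, event: dict) -> str:
    # Hash the previous record's hash together with a canonical event encoding.
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def append(log: list, event: dict) -> None:
    # Each record commits to its predecessor, forming a chain.
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log: list) -> bool:
    # Recompute every link; any edited or reordered record breaks the chain.
    prev_hash = GENESIS
    for record in log:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True


log = []
append(log, {"principal": "sales-agent", "tool": "crm.read", "decision": "allow"})
append(log, {"principal": "sales-agent", "tool": "mail.send", "decision": "deny"})

ok_before = verify(log)                 # untouched log verifies
log[0]["event"]["tool"] = "crm.delete"  # simulate after-the-fact tampering
ok_after = verify(log)                  # stored hash no longer matches

print(ok_before, ok_after)
```

A chain like this only makes tampering detectable, it does not prevent it. An attacker who can rewrite every record and recompute every hash would go unnoticed unless the latest hash is anchored somewhere outside the log.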