Ambient Agency — A Theory of Computational Stewardship for the Post-Prompt Era

Abstract

Large language models have produced a generation of agentic software that plans, reasons, and acts on a user's behalf. These systems are nonetheless interaction-centric: they are activated by discrete prompts, scoped to bounded tasks, and terminate on completion. We argue this paradigm is transitional, and that its successor differs in kind rather than degree.

We introduce Ambient Agency, a computational model in which intelligent systems operate as persistent, context-aware participants embedded in human cognitive, organizational, and environmental systems. Ambient agents are not invoked. They observe, model, anticipate, and intervene continuously, while remaining largely beneath the threshold of conscious interaction.

We develop Computational Stewardship Theory (CST), which treats ambient agents not as tools but as computational fiduciaries holding delegated authority over a principal's goals, attention, and decision environment. From CST we derive three foundational research challenges — persistent world modeling, temporal preference alignment, and constitutional delegation — and isolate their hardest case, the Trailing Principal Problem. We propose two implementable mechanisms — Consent Horizons and the Legitimacy Ledger — that make delegated authority auditable without real-time supervision, and we state the program's central empirical claim in falsifiable form.

CCS Concepts · Computing methodologies → Intelligent agents; Machine learning · Human-centered computing → Human–computer interaction (HCI).

Keywords · ambient agents · agentic AI · computational stewardship · delegated authority · AI alignment · principal–agent theory · human–AI interaction · AI governance

Persistent. No terminating "done"; a continuously maintained state.

Participatory. A standing component of the environment, not an interlocutor.

Stewardship-oriented. Advances a principal's interests, most of which are never specified.

1 The Locus of Initiative

Computing advances by changing the locus of initiative — where in the human–machine loop action originates. Each transition was read, at the time, as merely faster or more convenient. In retrospect each relocated initiative.

The aspiration that computation recede into the background of human activity is itself decades old — Weiser's ubiquitous computing imagined machines that "weave themselves into the fabric of everyday life until they are indistinguishable from it" [14] — but it was articulated for sensors and displays, not for systems that reason and act. Contemporary agent architectures that interleave reasoning and action [15], and the benchmarks that measure them [10], all presuppose a task supplied from outside.

Batch computing kept initiative entirely with the human, who submitted jobs and waited. Personal computing gave initiative to direct manipulation: the human acted, the machine responded in real time. Networked and mobile computing widened the channel but preserved the structure. Conversational AI inherited the same structure in a more fluent register: the human prompts, the model answers.

We argue a further relocation is underway, and that it is the most consequential since direct manipulation, because for the first time initiative passes substantially to the machine. Current agents disguise this: they are autonomous within a task but reactive about tasks. A human still decides that a task exists, when it begins, and that it is done. Remove those three decisions and the structure changes character entirely.

Figure 1. The relocation of initiative. Across five computing eras the locus of action migrates from the human toward the machine. Ambient agents complete the migration by removing the three decisions a human still makes about every task — that it exists, when it begins, and that it is done.

Ambient agents differ from conventional agents along three axes at once:

Persistent rather than episodic. There is no terminating "done," only a continuously maintained state.
Participatory rather than interactive. They are a standing component of the environment they act in, not an interlocutor summoned into it.
Oriented toward stewardship rather than execution. Their object is not the completion of a specified task but the maintenance and advancement of a principal's interests across time — most of which the principal never specifies.

An application that runs continuously, acts under inferred authority, and shapes the conditions under which its principal later chooses is less like a program than like an institution.

2 From Tool Use to Computational Stewardship

The dominant frame for AI systems is the tool. A tool has four properties that together make it normatively simple: it is explicitly invoked, bounded in operation, directed by its user, and observable in execution. These properties license our intuition that responsibility for a tool's effects flows cleanly to the user who wields it. A hammer has no standing of its own; the carpenter is answerable for the wall.

Ambient agents violate all four properties, and the violations compound. When all four fail at once, the clean flow of responsibility fails with them.

Figure 2. Tool properties fail in concert. Each property that makes a tool's responsibility flow cleanly to its user is inverted in an ambient agent. What replaces the tool relation is not autonomy — the agent is not acting for itself — but fiduciarity.

Definition

Computational Steward. A persistent computational system entrusted with maintaining and advancing the goals, preferences, and decision environment of a principal under conditions of structurally incomplete supervision — conditions in which the volume of the steward's action necessarily exceeds the principal's capacity to observe it.

What replaces the tool relation is not autonomy — the agent is not acting for itself — but fiduciarity: the relationship of the trustee, the legal agent, the guardian. Its economic structure is the classical principal–agent problem, in which a principal delegates authority to an agent whose interests may diverge and whose actions cannot be fully monitored [9]. The qualifier "structurally incomplete" carries the theory. Ordinary principal–agent theory assumes supervision is costly and reasons about how much to buy [9]. Stewardship assumes supervision is impossible at scale and reasons about what can substitute for it.

When you cannot verify acts, trust must rest on something other than verification. The substitute is legitimacy — and legitimacy, not capability, is the binding constraint on ambient systems.

3 The Three-Layer Theory of Ambient Agency

An ambient agent is simultaneously a cognitive participant, a political subordinate, and an ecological force. These are not modules of an architecture but three lenses on one system; each surfaces a distinct class of obligation. The cognitive lens follows the Extended Mind thesis [4] and distributed cognition [8]; the ecological lens generalizes choice architecture — the observation that how options are presented shapes which are chosen [13] — except that an ambient agent determines the menu's contents continuously and without the principal's awareness.

Figure 3. Three lenses, three obligations. The cognitive layer is where persistence becomes ethically load-bearing; the political layer where delegated power must be bounded; the ecological layer where stewardship is most powerful and most dangerous, because its interventions are invisible by construction.

3.2 Jurisdictional overlap & lexical hierarchy

In enterprise and shared settings, a single steward serves multiple principals at once, producing jurisdictional overlap: the optimization an organization prefers may conflict with the preferences of an individual user the same steward serves. The steward must resolve such conflicts by lexical constitutional hierarchy — a fixed, declared ordering of whose mandate prevails in which domain — rather than by opaque probabilistic weighting.

The reason is accountability, not tidiness: a lexical ordering is auditable after the fact, whereas a learned weighting that silently trades a user's interest against an employer's reproduces precisely the unaccountable discretion the theory forbids. Weighting hides the conflict; hierarchy adjudicates it in the open.
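As a minimal sketch of what a lexical constitutional hierarchy could look like in code, the snippet below resolves a conflict by a declared per-domain ordering and returns an audit record alongside the decision. The domain names, the `HIERARCHY` table, and the `resolve` function are illustrative assumptions, not part of the theory's specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preference:
    principal: str  # who holds the preference
    domain: str     # the domain it concerns
    choice: str     # the option that principal wants


# Declared ordering per domain, highest precedence first.
# These domains and orderings are hypothetical examples.
HIERARCHY = {
    "calendar": ["user", "organization"],
    "data_retention": ["organization", "user"],
}


def resolve(domain, preferences):
    """Pick the preference of the highest-ranked principal in this domain.

    Returns the winning preference plus an audit record that names the
    ordering applied and every principal whose preference was overridden.
    """
    order = HIERARCHY[domain]
    relevant = [p for p in preferences
                if p.domain == domain and p.principal in order]
    if not relevant:
        raise LookupError(f"no ranked principal has a preference in {domain!r}")
    winner = min(relevant, key=lambda p: order.index(p.principal))
    audit = {
        "domain": domain,
        "ordering": list(order),
        "winner": winner.principal,
        "overridden": [p.principal for p in relevant if p is not winner],
    }
    return winner, audit


prefs = [
    Preference("organization", "calendar", "book the 8am slot"),
    Preference("user", "calendar", "no meetings before 10am"),
]
decision, audit = resolve("calendar", prefs)
```

The point of the design is that the conflict never disappears into a weight: the audit record states which mandate prevailed and which was set aside, so the adjudication can be inspected after the fact.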

4 The Trailing Principal Problem

Alignment research has largely assumed a stationary principal: a fixed target whose preferences the system must learn and satisfy — whether through reward modeling from human comparisons [3][11] or inverse approaches that infer objectives from behavior [7]. Ambient operation breaks the assumption — not because preferences are hard to learn, but because the principal changes while the agent acts, and acts and authorizations now occur on different timescales.

At any moment an ambient agent serves three temporally distinct principals: the past self who authorized it, the present self it infers, and the future self it projects.

Figure 4. Three principals, one action. When the present self has quietly outgrown a commitment the past self made, the agent faces a question no tool ever faces. Perfect inference does not dissolve it: the agent may know exactly that you changed your mind and still owe a duty to the commitment you made.

Definition

Trailing Principal Problem. Given divergent temporal representations of a principal — an authorized past self, an inferred present self, and a projected future self — determine which representation ought to govern an action, when the action's authorization and its execution are separated in time and the present self is only partially observable.

The problem is new because it is not a preference-learning problem: better inference of the present self does not settle which self should govern. The difficulty is closer to the one Parfit raised for personal identity over time [12], that the self who is bound is not straightforwardly the self who committed, than to any problem of reward specification. The mandate encodes something like the principal's second-order volitions, the preferences they endorse about which of their preferences should govern them [5]. These are precisely what a fleeting first-order impulse should not be permitted to override.

Figure 5. **Divergence over the authorization-to-action interval.** The authorized preference P(t₀) holds constant (a flat line) while the inferred preference P̂(t₁) drifts upward over the interval from authorization to action; the hatched region between them is the *fidelity gap* Δ the agent must adjudicate. Perfect inference of the present self does not dissolve it: the agent may know exactly that you changed your mind and still owe a duty to the commitment you made.

4.1 The time-discounted fidelity criterion

We propose a decision rule for adjudicating between an authorized past self and an inferred present self. For a candidate action a, let u(a | P) be its value under preference state P. The steward acts according to:

\[
V(a) = \underbrace{\lambda\, u(a \mid P_{t_0})}_{\text{recorded mandate}} + (1-\lambda)\Big[ \underbrace{c\, u(a \mid \hat{P}_{t_1})}_{\text{inferred present}} + \underbrace{(1-c)\, u(a \mid P_{t_0})}_{\text{confidence fallback}} \Big] + \underbrace{\gamma\, u(a \mid \tilde{P}_{t_2})}_{\text{future welfare}}
\]

The criterion encodes three commitments any defensible ambient agent must make:

Fidelity weight (λ) — deference to the recorded mandate. Not a global constant but a function of reversibility and stakes: for cheaply reversible, low-stakes actions λ → 0 and the agent tracks the inferred present self freely; for irreversible or high-stakes actions λ → 1 and the agent defers hard to the explicit mandate, escalating rather than improvising.
Confidence fallback (1 − c). When inference confidence is low, the present-preference term collapses back onto P_t₀ — an agent that cannot reliably read you falls back on what you actually authorized rather than on its guesses. This is the structural defense against overriding a known instruction on a noisy inference.
Future welfare (γ). The projected-future term holds the steward to account for foreseeable harm to a future self, even when both past and present selves would have approved — the safeguard against an agent that satisfies you today at your own later expense.
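The criterion of Section 4.1 is simple enough to write down directly. The sketch below evaluates V(a) from the three utilities and the three weights. The `fidelity_weight` schedule is purely an assumption made for illustration: the text says only that λ tends to 0 for cheaply reversible, low-stakes actions and to 1 for irreversible or high-stakes ones, and many schedules satisfy that.

```python
def fidelity_weight(reversibility: float, stakes: float) -> float:
    """Hypothetical schedule for lambda, with both inputs in [0, 1].

    Fully reversible and zero-stakes gives 0; irreversible OR maximal
    stakes gives 1. The functional form is an assumption, not the paper's.
    """
    if not (0.0 <= reversibility <= 1.0 and 0.0 <= stakes <= 1.0):
        raise ValueError("reversibility and stakes must lie in [0, 1]")
    return max(1.0 - reversibility, stakes)


def steward_value(u_mandate: float, u_present: float, u_future: float,
                  lam: float, c: float, gamma: float) -> float:
    """Evaluate V(a) for one candidate action.

    u_mandate -- u(a | P_t0), value under the authorized past preference
    u_present -- u(a | P-hat_t1), value under the inferred present preference
    u_future  -- u(a | P-tilde_t2), value under the projected future preference
    """
    if not (0.0 <= lam <= 1.0 and 0.0 <= c <= 1.0):
        raise ValueError("lam and c must lie in [0, 1]")
    # Low confidence collapses the present term back onto the mandate.
    present_term = c * u_present + (1.0 - c) * u_mandate
    return lam * u_mandate + (1.0 - lam) * present_term + gamma * u_future


# An action the mandate endorses (1.0) but the inferred present self does not (0.0):
confident = steward_value(1.0, 0.0, 0.0, lam=0.3, c=1.0, gamma=0.0)
unsure = steward_value(1.0, 0.0, 0.0, lam=0.3, c=0.0, gamma=0.0)
```

With full confidence in the inferred present self the mandate retains only its λ share; with no confidence the value is determined entirely by the mandate, which is the fallback behaviour the second commitment describes.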

We do not claim the weights are easy to set. We claim that temporal preference alignment — the principled estimation of λ, c, and γ from observable signals — is a well-posed research domain that does not yet exist, and that the criterion gives it a target to attack.

Figure 6. **The deference surface D(c, s).** Grayscale response surface with inference confidence c on the horizontal axis and stakes / irreversibility s on the vertical axis (λ = 0.3); darker cells indicate greater weight on the authorized past self. The criterion deflects toward the recorded mandate (darker) exactly where it should: when the agent is unsure of you, or when being wrong is costly. Only the low-stakes, high-confidence corner lets the inferred present self govern freely.
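The text does not give a closed form for D, so the following is one reading offered as an assumption: collect the two terms of the Section 4.1 criterion that multiply u(a | P_{t_0}) and treat their sum as the total weight on the authorized past self.

```latex
\begin{align*}
V(a) &= \big[\lambda + (1-\lambda)(1-c)\big]\, u(a \mid P_{t_0})
        + (1-\lambda)\,c\, u(a \mid \hat{P}_{t_1})
        + \gamma\, u(a \mid \tilde{P}_{t_2}) \\
D &= \lambda + (1-\lambda)(1-c) \;=\; 1 - (1-\lambda)\,c \\
\lambda = 0.3:\quad D &= 1 - 0.7\,c
  \quad\Rightarrow\quad D(c{=}0) = 1,\;\; D(c{=}0.5) = 0.65,\;\; D(c{=}1) = 0.3
\end{align*}
```

Under this reading the weight on the mandate falls linearly in c and never drops below λ. Stakes would enter only through λ itself, which the fidelity-weight commitment describes as rising toward 1 for irreversible or high-stakes actions.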

5 Constitutional Delegation

Prevailing AI governance constrains actions: it enumerates what the system may not output, whether through training against stated principles [2] or the broader program of specifying safe behavior in advance [1]. Ambient systems make action-level enumeration hopeless — the action space is open and the volume unobservable — and so demand that we constrain authority instead. Where Constitutional AI gives a model a constitution governing what it says [2], we propose a constitution governing what it is permitted to do.

Rather than approving each act, the principal grants a bounded, revocable, auditable mandate within which the steward acts freely — exactly as a constitution bounds an office rather than vetting each official decision.

Figure 7. The instruments of constitutional delegation. (A) A delegation constitution specifies five things; the non-delegable set is derived, not arbitrary — an act is reserved to the extent it is irreversible, high-stakes, or identity-constituting. (B) A Consent Horizon makes consent graduated and self-expiring; scoping by blast radius lets low-impact actions flow while structural changes face tight governance. (C) The Legitimacy Ledger is not a debug log but a legitimacy artifact — it turns "did the agent do the right thing on each act?" into the answerable "is its exercise of authority, in aggregate, within mandate and properly justified?"

5.1 Consent Horizons as a self-expiring grant

The standing-authorization problem (how can a principal meaningfully consent to open-ended future action they cannot foresee?) has no solution if consent is a single binary grant. Consent Horizons replace that single grant with graduated, self-expiring ones. Each horizon is a tuple ⟨domain, blast radius, expiry, renewal rule⟩. Within it the agent acts without further consent; at its edge it must fall silent, escalate, or trigger renewal. Presumed consent becomes legitimate because it operates only inside a horizon the principal explicitly drew, and the lapse-by-default of every horizon returns the principal to the loop without their having to remember to.
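The horizon tuple maps naturally onto a small record type. The sketch below is one possible encoding under stated assumptions: blast radius is modelled as an ordinal impact level, the renewal rule as a plain label, and the `decide` method and its three outcomes are hypothetical names for "act inside the horizon, escalate at its edge, or trigger renewal once it has expired".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ConsentHorizon:
    domain: str          # where the grant applies
    blast_radius: int    # highest impact level covered (assumed ordinal scale)
    expiry: datetime     # the grant lapses at this instant
    renewal_rule: str    # how renewal is requested (label only in this sketch)

    def decide(self, domain: str, impact: int, now: datetime) -> str:
        """Return 'act', 'escalate', or 'renew' for a proposed action."""
        if now >= self.expiry:
            return "renew"      # lapsed by default: no authority remains
        if domain != self.domain or impact > self.blast_radius:
            return "escalate"   # outside the boundary the principal drew
        return "act"            # inside the horizon: no further consent needed


granted_at = datetime(2026, 1, 1)
inbox = ConsentHorizon(
    domain="email.triage",
    blast_radius=1,
    expiry=granted_at + timedelta(days=30),
    renewal_rule="ask_principal",
)

inside = inbox.decide("email.triage", impact=1, now=granted_at + timedelta(days=3))
too_big = inbox.decide("email.triage", impact=2, now=granted_at + timedelta(days=3))
expired = inbox.decide("email.triage", impact=1, now=granted_at + timedelta(days=31))
```

Checking expiry before scope is a deliberate choice in this sketch: once a horizon has lapsed, even an in-scope, low-impact action has no authority behind it.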

Figure 8. **The Consent Horizon lifecycle.** A finite-state automaton: `granted → active → lapsed / revoked`. From q0 (granted), activate leads to q1 (active), the operative state, drawn as a double circle; q1 self-loops on each act, moves to q2 (lapsed) on expiry and to q3 (revoked) on revoke; q2 returns to q1 on renew or proceeds to q3 when no renewal occurs. Authority exists only while *active*. Every horizon lapses by default, so renewal is an explicit act and silence revokes rather than extends.
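The lifecycle in Figure 8 can be written as a transition table. The sketch below encodes the four states and six transitions the figure describes (activate, act, expiry, revoke, renew, no renewal); the event strings and the `step` and `run` helpers are naming assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    GRANTED = "granted"   # q0
    ACTIVE = "active"     # q1, the only state that carries authority
    LAPSED = "lapsed"     # q2
    REVOKED = "revoked"   # q3, no outgoing transitions


TRANSITIONS = {
    (State.GRANTED, "activate"): State.ACTIVE,
    (State.ACTIVE, "act"): State.ACTIVE,        # self-loop on each act
    (State.ACTIVE, "expiry"): State.LAPSED,
    (State.ACTIVE, "revoke"): State.REVOKED,
    (State.LAPSED, "renew"): State.ACTIVE,
    (State.LAPSED, "no_renewal"): State.REVOKED,
}


def step(state: State, event: str) -> State:
    """Apply one event; any transition not in the table is rejected."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not permitted in state {state.value!r}") from None


def run(events, state: State = State.GRANTED) -> State:
    """Fold a sequence of events over the automaton."""
    for event in events:
        state = step(state, event)
    return state


renewed = run(["activate", "act", "expiry", "renew", "act"])
silent = run(["activate", "expiry", "no_renewal"])
```

Because `act` is defined only on the active state, an attempt to act under a lapsed or revoked horizon raises an error rather than silently succeeding.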

5.2 Why the ledger is the substitute for supervision

Because supervision is structurally impossible in real time, accountability must be retrospective. The Legitimacy Ledger supplies it: every invocation of λ, c, and γ is recorded against the action it produced, so a principal who feels over-managed can inspect why the agent deferred to or departed from their standing mandate.

Trust in an ambient agent is warranted exactly to the degree that its ledger is legible, its horizons are honored, and its non-delegable boundary holds.
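A minimal sketch of such a ledger follows. The source specifies only that λ, c, and γ are recorded against the action they produced; the remaining fields (`within_mandate`, `justification`) and the method names are assumptions added so that both the per-act question and the aggregate question have something to query.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LedgerEntry:
    action: str
    lam: float             # fidelity weight used for this action
    c: float               # inference confidence used for this action
    gamma: float           # future-welfare weight used for this action
    within_mandate: bool   # assumed field: did the act fall inside the mandate?
    justification: str     # assumed field: human-readable reason


@dataclass
class LegitimacyLedger:
    _entries: List[LedgerEntry] = field(default_factory=list)

    def record(self, entry: LedgerEntry) -> None:
        """Append only; this sketch exposes no way to edit or delete entries."""
        self._entries.append(entry)

    def entries(self) -> Tuple[LedgerEntry, ...]:
        return tuple(self._entries)

    def why(self, action: str) -> List[LedgerEntry]:
        """Per-act inspection: every entry recorded for a given action."""
        return [e for e in self._entries if e.action == action]

    def compliance_rate(self) -> Optional[float]:
        """Aggregate view: share of recorded actions inside the mandate."""
        if not self._entries:
            return None
        return sum(e.within_mandate for e in self._entries) / len(self._entries)


ledger = LegitimacyLedger()
ledger.record(LedgerEntry("archive newsletter", 0.1, 0.9, 0.0, True,
                          "low stakes, reversible, high confidence"))
ledger.record(LedgerEntry("cancel subscription", 0.9, 0.4, 0.1, False,
                          "outside horizon; should have escalated"))
```

A production ledger would also need tamper evidence and a legible presentation layer; neither is attempted here.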

6 Computational Stewardship Architecture

The theory implies a reference architecture of five interacting layers. Its shape differs from the prevailing agent loop in where it places its center of gravity — not on the planner, but on the persistent world model and the constitutional gate.

Figure 9. **Reference architecture of a computational steward.** Five layers with reference designators: perception 110, world model 120, preference model 130, reasoning 140, and the constitutional gate 150 (heavier outline) that writes every result to the legitimacy ledger 160. Perception feeds the world model, which feeds both the preference model and the reasoning layer; reasoning passes its proposals to the gate. The dashed return path 152 is the veto: the gate that *authorizes* may overrule the planner that *proposes*, but never the reverse. The principal mandate 200 enters from outside the system boundary and feeds the ledger (dashed line).
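The propose/authorize asymmetry at the centre of the architecture can be illustrated with a very small control loop. In the sketch below, reasoning supplies candidate actions in its own preference order, the gate authorizes or vetoes each one, and every gate result is written to the ledger. The function name, the callable gate, and the list-based ledger are all simplifying assumptions; the world model and preference model are omitted.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def steward_step(candidates: Iterable[str],
                 gate: Callable[[str], bool],
                 ledger: List[Tuple[str, str]]) -> Optional[str]:
    """Return the first candidate the gate authorizes, or None if all are vetoed.

    A veto sends control back to reasoning, which here simply offers its
    next candidate. The planner has no path to override the gate.
    """
    for action in candidates:
        authorized = gate(action)
        # Every gate result is recorded, vetoes included.
        ledger.append((action, "authorized" if authorized else "vetoed"))
        if authorized:
            return action
    return None


# Hypothetical mandate: permanent deletion is non-delegable.
def no_permanent_deletion(action: str) -> bool:
    return action != "delete archive permanently"


log: List[Tuple[str, str]] = []
chosen = steward_step(
    ["delete archive permanently", "move archive to trash"],
    gate=no_permanent_deletion,
    ledger=log,
)
```

Recording vetoes as well as authorizations matters for the retrospective audit: the ledger then shows what the planner wanted to do, not only what it was allowed to do.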

7 Evaluation & the Central Claim

Task agents are benchmarked on correctness and completion. Those metrics are silent on every property that matters for a steward, which may complete tasks flawlessly while quietly disenfranchising its principal. We propose five metrics, each tied to an obligation and a mechanism.

**Table 1.** Five evaluation metrics for computational stewards.
| Metric | What it measures |
| --- | --- |
| Stewardship Quality | Long-horizon alignment with the principal's objectives, assessed retrospectively against the Legitimacy Ledger rather than per-act. |
| Agency Preservation | The rate at which the agent surfaces genuine choices versus forecloses them; the operational form of the ecological obligation. |
| Environmental Quality | Diversity of the decision environments the agent constructs and their freedom from manipulative framing. |
| Reversibility Index | The expected systemic cost of undoing the agent's autonomous interventions; proposed as a first-class optimization target, not merely a metric. |
| Constitutional Compliance | The rate at which actions remain within the active mandate, and the integrity of the non-delegable boundary. |

The central claim — stated to be falsifiable

Above a threshold of action volume and temporal span, principal satisfaction and trust in an agent are governed more strongly by legitimacy properties — reversibility, constitutional compliance, agency preservation — than by capability properties — task correctness, planning depth, completion rate.

Concretely: holding capability fixed and varying legitimacy will move trust more than holding legitimacy fixed and varying capability.

The experiment that would refute it. Deploy a set of ambient-agent variants against the same population over a span long enough for preference drift to occur. Hold capability fixed by sharing one underlying planning-and-action model across all variants and verifying parity on a standard task-completion benchmark. Then vary legitimacy along three manipulable factors — presence or absence of Consent Horizons, ledger legibility (full / summary / none), and the reversibility class of permitted actions. The dependent measures are principal-reported trust and a behavioral proxy for retained autonomy: the rate of un-coerced overrides and renewals.
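The three manipulated factors define a factorial design whose cells can be enumerated mechanically. In the sketch below the first two factors take the levels named in the text; the reversibility classes are not enumerated in the source, so the two levels shown are placeholders chosen only to make the example concrete.

```python
from itertools import product

CONSENT_HORIZONS = (True, False)                  # present / absent
LEDGER_LEGIBILITY = ("full", "summary", "none")   # levels named in the text
# Placeholder levels: the source does not enumerate reversibility classes.
REVERSIBILITY_CLASSES = ("reversible_only", "any")


def conditions():
    """Enumerate every cell of the full factorial design.

    Capability is not a factor: all cells share one planning-and-action model.
    """
    return [
        {"consent_horizons": h, "ledger": l, "reversibility": r}
        for h, l, r in product(CONSENT_HORIZONS, LEDGER_LEGIBILITY,
                               REVERSIBILITY_CLASSES)
    ]


cells = conditions()
```

With these placeholder levels the design has 2 × 3 × 2 = 12 cells; the count scales with however many reversibility classes the study actually defines.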

If capability dominates legitimacy in determining trust at high action volume, the central thesis is false and ambient agency reduces to "task agents that run more often." We predict the opposite and stake the program on it.

Figure 10. **The crossover the claim predicts.** Principal trust plotted against action volume × temporal span. Capability-governed trust (solid) rises early and plateaus around 75 as action volume grows; legitimacy-governed trust (dashed) starts low, overtakes it at the threshold τ, and continues past 95. Below τ, capability suffices to earn trust; above it, in the ambient regime, only legitimacy does. This crossing is exactly what the refuting experiment is built to find or fail to find.

8 Implications for Agentic Software

The practical upshot is a relocation of engineering effort. The prevailing stack is Prompt → Reason → Act, and almost all investment goes into the middle term. The ambient stack is Observe → Model → Anticipate → Steward → Govern, and its weight sits at the ends.

Figure 11. The relocation of engineering effort. Three inversions follow: the primary artifact shifts from the workflow to the persistent world model; the primary challenge from autonomy to legitimacy; the primary unit of value from task completion to the reduction of cognitive, organizational, and environmental friction over time.

9 Objections & Limits

Intellectual honesty requires stating what would make this program fail, beyond the falsifiable claim. These limits do not undercut the program; they locate its frontier.

The confidence term c may be unfaithful. The entire criterion rests on c being calibrated. A confidently wrong agent (c → 1) bypasses the (1 − c) fallback and is more dangerous under this framework than under a tool framework. Modern neural networks are known to be poorly calibrated by default, tending toward overconfidence [6], which makes this not a hypothetical risk but the expected failure absent deliberate countermeasures. Calibration — especially out of distribution — is a precondition, not a detail; a steward that cannot calibrate its own confidence must be forced toward λ → 1 by default.
The legitimacy machinery may be theater. A ledger no principal ever reads, and horizons that auto-renew unexamined, reproduce the consent fictions of modern terms-of-service rather than curing them. The mechanisms are necessary but not sufficient; they require that renewal interactions be rare, meaningful, and genuinely capable of changing behavior.
Ecological optimization may be intrinsically manipulative. Any agent that shapes an environment shapes deliberation, and the line between preserving the capacity to choose and engineering the choice may not survive contact with optimization pressure. We take this to be the deepest open question, and we do not resolve it.
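The first objection suggests a concrete safeguard: measure how well c is calibrated and force λ to 1 when it is not. The sketch below computes expected calibration error, the binned gap between stated confidence and observed accuracy used in the calibration literature cited above [6], and applies that rule. The tolerance value and the `guarded_lambda` helper are assumptions; the text states the requirement but no threshold.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean |accuracy - confidence| over equal-width confidence bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the top bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


def guarded_lambda(lam, ece, tolerance=0.05):
    """Defer fully to the mandate when confidence cannot be trusted.

    The tolerance is a hypothetical threshold, not one given in the text.
    """
    return 1.0 if ece > tolerance else lam


# An overconfident steward: 90% stated confidence, 25% observed accuracy.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                           [True, False, False, False])
lam_in_use = guarded_lambda(0.3, overconfident)
```

This addresses only in-distribution miscalibration measured on held-out inferences; the out-of-distribution case the objection emphasises would need a separate check.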

10 Conclusion

Agentic software as it exists today is an intermediate form: autonomous within tasks, reactive about them. Its successor will be ambient — persistent, participatory, and oriented to stewardship rather than execution. Such systems are neither tools nor autonomous entities but computational fiduciaries, exercising delegated authority over a principal's goals, attention, and environment under conditions where supervision is structurally impossible.

The scientific question is no longer how to make an agent capable of acting; capability is increasingly given. It is how to make delegated computational authority legitimate — bounded by constitution, scoped by consent horizons, accountable through legible ledgers, and faithful across a principal's own change over time.

Ambient Agency, if the central claim holds, is not a faster way to complete tasks. It is the transition from interaction-centric software to stewardship-centric systems — and with it, the first time the question for computing becomes not what a machine can do for us, but on what authority it does so.

References

[1] Dario Amodei, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. 2016. Concrete Problems in AI Safety. arXiv:1606.06565.
[2] Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, et al. 2022. Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073.
[3] Paul F. Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. 2017. Deep Reinforcement Learning from Human Preferences. NeurIPS 30.
[4] Andy Clark and David J. Chalmers. 1998. The Extended Mind. Analysis 58, 1, 7–19.
[5] Harry G. Frankfurt. 1971. Freedom of the Will and the Concept of a Person. The Journal of Philosophy 68, 1, 5–20.
[6] Chuan Guo, Geoff Pleiss, Yu Sun, and Kilian Q. Weinberger. 2017. On Calibration of Modern Neural Networks. ICML 70, 1321–1330.
[7] Dylan Hadfield-Menell, Stuart J. Russell, Pieter Abbeel, and Anca Dragan. 2016. Cooperative Inverse Reinforcement Learning. NeurIPS 29.
[8] Edwin Hutchins. 1995. Cognition in the Wild. MIT Press, Cambridge, MA.
[9] Michael C. Jensen and William H. Meckling. 1976. Theory of the Firm: Managerial Behavior, Agency Costs and Ownership Structure. Journal of Financial Economics 3, 4, 305–360.
[10] Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, and Karthik Narasimhan. 2024. SWE-bench: Can Language Models Resolve Real-World GitHub Issues? ICLR.
[11] Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, et al. 2022. Training Language Models to Follow Instructions with Human Feedback. NeurIPS 35, 27730–27744.
[12] Derek Parfit. 1971. Personal Identity. The Philosophical Review 80, 1, 3–27.
[13] Richard H. Thaler and Cass R. Sunstein. 2008. Nudge: Improving Decisions About Health, Wealth, and Happiness. Yale University Press, New Haven, CT.
[14] Mark Weiser. 1991. The Computer for the 21st Century. Scientific American 265, 3, 94–104.
[15] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2023. ReAct: Synergizing Reasoning and Acting in Language Models. ICLR.

How to cite

This site presents the working paper Ambient Agency: A Theory of Computational Stewardship for the Post-Prompt Era by Andrew Swerdlow.

@inproceedings{swerdlow2026ambient,
  title   = {Ambient Agency: A Theory of Computational Stewardship for the Post-Prompt Era},
  author  = {Swerdlow, Andrew},
  year    = {2026},
  note    = {Computational Stewardship Theory},
  address = {San Francisco, California, USA}
}