The durable execution engine that powers every agent on Metaprise. Crash recovery, event replay, saga compensation, and checkpoint persistence — tasks are never lost, even across infrastructure failures.
AURA Runtime is organized into three layers: execution, durability, and monitoring. Together they guarantee that every agent task completes — even across crashes, timeouts, and infrastructure failures.
Task Execution Engine
The core execution engine that receives Missions, orchestrates tool calls, manages agent state, and drives the mission through its lifecycle. Every execution is deterministic and reproducible.
State Memory Management
Manages the agent's working memory during execution — MissionStateData reads and writes, context retrieval, and inter-step state passing. Every write generates an AuditChain entry.
DualToken-Authenticated Gateway
Every tool call passes through the Tool Gateway, which validates both AgentIdentity and ExecutionToken before allowing the call to proceed. No valid DualToken, no execution.
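The gateway check can be pictured with a small sketch. The source only states that both AgentIdentity and ExecutionToken are validated before a tool call proceeds; the fields on each class, the `GatewayDenied` exception, and the `ToolGateway.call` signature below are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Set


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str  # assumed field


@dataclass(frozen=True)
class ExecutionToken:
    agent_id: str      # agent the token was issued to (assumed field)
    mission_id: str    # mission the token is scoped to (assumed field)
    expires_at: float  # expiry as a Unix timestamp (assumed field)


class GatewayDenied(Exception):
    """Raised when either half of the DualToken fails validation."""


class ToolGateway:
    def __init__(self, known_agents: Set[str], tools: Dict[str, Callable[..., Any]]) -> None:
        self._known_agents = known_agents
        self._tools = tools

    def call(self, identity: AgentIdentity, token: ExecutionToken,
             tool: str, now: float, **kwargs: Any) -> Any:
        # Check 1: the caller must be a known agent.
        if identity.agent_id not in self._known_agents:
            raise GatewayDenied("unknown agent identity")
        # Check 2: the token must belong to that agent and still be valid.
        if token.agent_id != identity.agent_id:
            raise GatewayDenied("token was not issued to this agent")
        if token.expires_at <= now:
            raise GatewayDenied("execution token expired")
        # Only after both checks pass does the tool run.
        return self._tools[tool](**kwargs)
```

In this sketch the tool function is never reached unless both checks pass, which mirrors the "no valid DualToken, no execution" rule.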
Audit Write Engine
Dedicated engine for synchronous audit writes. Every action — tool calls, state changes, permission checks — is written to the AuditChain before execution continues.
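A minimal sketch of "audit first, then continue" is shown here. It assumes the AuditChain is an append-only log in which each entry stores the hash of the previous one; the source does not describe the chain's internal format, so the `AuditChain` class, its methods, and the `audited` helper are illustrative assumptions.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

GENESIS = "0" * 64  # placeholder hash that precedes the first entry (assumption)


def _digest(action: str, detail: Dict[str, Any], prev_hash: str) -> str:
    body = json.dumps({"action": action, "detail": detail, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class AuditChain:
    """Append-only log; each entry is linked to the previous one by hash."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def append(self, action: str, detail: Dict[str, Any]) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        entry_hash = _digest(action, detail, prev_hash)
        self.entries.append(
            {"action": action, "detail": detail, "prev": prev_hash, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        # Recompute every hash; any edited entry breaks the chain.
        prev_hash = GENESIS
        for entry in self.entries:
            expected = _digest(entry["action"], entry["detail"], prev_hash)
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


def audited(chain: AuditChain, action: str, detail: Dict[str, Any],
            fn: Callable[[], Any]) -> Any:
    # The audit write completes before the action itself is allowed to run.
    chain.append(action, detail)
    return fn()
```

Because `audited` returns only after `append` has finished, the action can never run without a corresponding audit entry already in place.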
Crash Recovery + Event Replay
If the runtime crashes mid-execution, the Durable Execution Engine replays the recorded event history to reconstruct the exact state the task had at the moment of the crash. The task resumes instead of being lost.
Automatic Multi-Step Rollback
When a multi-step Mission fails partway through, the Saga engine executes compensating transactions to roll back completed steps — automatically and in reverse order.
Per-Step State Persistence
After each execution step, the agent's state is checkpointed. On recovery, execution resumes from the last checkpoint — not from scratch.
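Per-step checkpointing might look like the following sketch. It assumes the state is JSON-serializable and that checkpoints live in a local file; the `run_with_checkpoints` function and the checkpoint layout are hypothetical, since the source does not specify where or how checkpoints are stored.

```python
import json
import os
from typing import Any, Callable, Dict, List

Step = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_with_checkpoints(steps: List[Step], path: str) -> Dict[str, Any]:
    # Load the last checkpoint if one exists; otherwise start from step 0.
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            checkpoint = json.load(f)
    else:
        checkpoint = {"next_step": 0, "state": {}}

    for index in range(checkpoint["next_step"], len(steps)):
        state = steps[index](checkpoint["state"])
        checkpoint = {"next_step": index + 1, "state": state}
        # Write to a temp file and rename it, so a crash during the write
        # cannot leave a half-written checkpoint behind.
        tmp_path = path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(checkpoint, f)
        os.replace(tmp_path, path)

    return checkpoint["state"]
```

Calling `run_with_checkpoints` again with the same path after a failure skips every step that already completed and continues with the first one that did not.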
Progress Monitoring
Long-running tasks emit heartbeats to prove they're still alive. If heartbeats stop, the runtime intervenes — rescheduling on healthy infrastructure or triggering compensation.
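Heartbeat-based liveness detection can be sketched as a table of last-seen timestamps. The `HeartbeatMonitor` class and its timeout parameter are hypothetical; what the runtime does with a stalled task (rescheduling or compensation) is left out of the sketch.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Tracks the last heartbeat per task and reports tasks that went silent."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def beat(self, task_id: str, now: float) -> None:
        # Called by the task each time it reports progress.
        self._last_seen[task_id] = now

    def stalled(self, now: float) -> List[str]:
        # A task is stalled once its last heartbeat is older than the timeout.
        return sorted(
            task_id
            for task_id, seen in self._last_seen.items()
            if now - seen > self.timeout_seconds
        )
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the sketch deterministic and easy to test.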
Configurable Retry Strategies
Exponential backoff, maximum attempts, non-retryable error classification — all configurable per Mission. Failed steps retry automatically according to policy.
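A retry policy of this kind can be sketched as below. The field names (`initial_interval`, `backoff_coefficient`, `maximum_interval`, `maximum_attempts`, `non_retryable`) and the `run_with_retry` helper are assumptions made for illustration, not the actual Mission configuration schema.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class RetryPolicy:
    initial_interval: float = 1.0     # seconds before the first retry
    backoff_coefficient: float = 2.0  # multiplier applied after each failure
    maximum_interval: float = 60.0    # upper bound on any single delay
    maximum_attempts: int = 5         # total attempts, including the first
    non_retryable: Tuple[Type[BaseException], ...] = ()

    def delay(self, attempt: int) -> float:
        """Delay after failed attempt number `attempt` (1-based)."""
        raw = self.initial_interval * self.backoff_coefficient ** (attempt - 1)
        return min(raw, self.maximum_interval)


def run_with_retry(step: Callable[[], T], policy: RetryPolicy,
                   sleep: Callable[[float], None] = time.sleep) -> T:
    attempt = 1
    while True:
        try:
            return step()
        except policy.non_retryable:
            raise  # classified as non-retryable: fail immediately
        except Exception:
            if attempt >= policy.maximum_attempts:
                raise  # attempts exhausted: surface the last error
            sleep(policy.delay(attempt))
            attempt += 1
```

With the defaults above, the delays between attempts grow as 1, 2, 4, and 8 seconds before the fifth and final attempt.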
State Renewal for Ultra-Long Executions
For executions whose event history would exceed the limit of a single run, Continue-As-New starts a fresh history and carries the current state forward. Progress is preserved, and the execution is no longer bounded by per-run platform limits.
Every execution step is recorded as an event. If the runtime crashes — hardware failure, network partition, OOM kill — the Durable Execution Engine replays the event history to recover exact state. The agent resumes from where it stopped, not from the beginning.
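The replay idea can be sketched in a few lines. Here the event history is assumed to be an in-memory list of step results, and `run_mission` is a hypothetical function; a real engine would persist the history durably, which the sketch does not attempt.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Dict[str, Any]
NamedStep = Tuple[str, Callable[[Dict[str, Any]], Any]]


def run_mission(steps: List[NamedStep], history: List[Event]) -> Dict[str, Any]:
    """Run steps in order, replaying recorded results instead of re-executing them."""
    state: Dict[str, Any] = {}
    for index, (name, step) in enumerate(steps):
        if index < len(history):
            # This step already ran before the crash: reuse its recorded result.
            event = history[index]
            if event["step"] != name:
                raise ValueError("history does not match the mission definition")
            result = event["result"]
        else:
            # First time through this step: execute it, then record the event.
            result = step(state)
            history.append({"step": name, "result": result})
        state[name] = result
    return state
```

Running `run_mission` a second time with the surviving history rebuilds the same state from the recorded results and only executes the steps that had not finished, which is why steps need to be deterministic with respect to that state.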
Multi-step Missions often involve side effects that cannot be undone by a simple rollback: sending emails, transferring funds, updating records. When a later step fails, the Saga engine executes compensating transactions that counteract the completed steps, in reverse order. Checkpoints ensure that recovery starts from the last successful step, not from scratch.
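The saga pattern itself is compact enough to sketch. Each step is paired with a compensation; `run_saga` and the step names in the usage note are hypothetical and only illustrate the reverse-order behavior.

```python
from typing import Callable, List, Tuple

# Each saga step pairs an action with the compensation that counteracts it.
SagaStep = Tuple[Callable[[], None], Callable[[], None]]


def run_saga(steps: List[SagaStep]) -> None:
    completed: List[Callable[[], None]] = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Compensate the steps that did complete, newest first.
            for undo in reversed(completed):
                undo()
            raise
        # Register the compensation only once the action has succeeded.
        completed.append(compensate)
```

For example, if a Mission reserves inventory, charges a card, and then fails while shipping, this sketch would refund the charge first and release the reservation second. The failed step itself is not compensated, because it never completed.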
Long-running agent tasks, such as multi-hour data analysis, overnight compliance scans, and continuous monitoring, need special handling. Activity Heartbeat provides liveness detection, while Continue-As-New lets executions run indefinitely by refreshing state before platform limits are reached.
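Continue-As-New can be illustrated with a toy run loop. The sketch assumes a per-run cap on history length (`max_events`) and carries a cursor and running total into each new run; `ContinueAsNew`, `run_once`, and `run_to_completion` are hypothetical names, not the runtime's API.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class ContinueAsNew:
    """State carried into the next run when the current history is full."""
    cursor: int
    total: int


def run_once(items: List[int], cursor: int, total: int,
             max_events: int) -> Tuple[Union[int, ContinueAsNew], List[Tuple[str, int]]]:
    history: List[Tuple[str, int]] = []  # every run starts with an empty history
    while cursor < len(items):
        if len(history) >= max_events:
            # History limit reached: hand the progress to a fresh run.
            return ContinueAsNew(cursor, total), history
        total += items[cursor]
        history.append(("processed", cursor))
        cursor += 1
    return total, history


def run_to_completion(items: List[int], max_events: int) -> Tuple[int, int]:
    if max_events < 1:
        raise ValueError("max_events must be at least 1")
    cursor, total, runs = 0, 0, 0
    while True:
        outcome, _history = run_once(items, cursor, total, max_events)
        runs += 1
        if isinstance(outcome, ContinueAsNew):
            cursor, total = outcome.cursor, outcome.total
            continue
        return outcome, runs
```

No single run ever records more than `max_events` events, yet the overall execution can process an input of any length because the carried state is small and the history restarts each time.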
Every step is checkpointed, every action is audited, every failure triggers recovery. The runtime ensures completion — not just execution.