
Storage Gateway

AegisFSAL architecture, user-space NFSv3, FileHandle structure, UID/GID squashing, path canonicalization, and SeaweedFS integration.


The AEGIS Storage Gateway is the security boundary for all filesystem access by agent containers. It is implemented as a user-space NFSv3 server (AegisFSAL) running on the orchestrator host, with agent containers mounting their volumes via the kernel NFS client.


Design Philosophy

Traditional container volume mounts (bind mounts, CAP_SYS_ADMIN FUSE mounts) give agent containers unrestricted access to mounted storage once the mount is established. AEGIS takes a different approach:

Every POSIX operation is routed through the orchestrator-controlled AegisFSAL before reaching SeaweedFS. This means:

  • Per-operation authorization: The orchestrator validates every read, write, create, and delete against the execution's manifest policies.
  • Full audit trail: Every file operation is published as a StorageEvent domain event.
  • Path traversal prevention: Server-side path canonicalization blocks ../ attempts before they reach SeaweedFS.
  • No elevated privileges: Agent containers require zero special capabilities (CAP_SYS_ADMIN is not needed).

Component Hierarchy

Agent Container (Docker)
  │  kernel NFS client
  │  mount: addr=orchestrator_host, nfsvers=3, proto=tcp, nolock
  │  /workspace → NFS server

Orchestrator Host: NFS Server Gateway (user-space, tcp, port 2049)
  │  NFSv3 protocol handler (nfsserve Rust crate)

AegisFSAL (File System Abstraction Layer)
  │  receive: LOOKUP, READ, WRITE, READDIR, GETATTR, CREATE, REMOVE
  ├──► Decode FileHandle → extract execution_id + volume_id
  ├──► Authorize: does execution own this volume?
  ├──► Canonicalize path: reject ".." components
  ├──► Enforce FilesystemPolicy (manifest allowlists)
  ├──► Apply UID/GID squashing (return agent container's UID/GID, not real ownership)
  ├──► Enforce quota (size_limit_bytes)
  └──► Publish StorageEvent to Event Bus

StorageProvider trait (via StorageRouter)
  ├── SeaweedFS POSIX API client (default)
  ├── OpenDalStorageProvider (S3, GCS, Azure)
  ├── LocalHostStorageProvider (NVMe, bind mounts)
  └── SealStorageProvider (Remote Node execution coordination)
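As a sketch, the StorageProvider abstraction can be pictured as a trait with one method per POSIX operation. The method names and the in-memory backend below are illustrative assumptions, not the actual AEGIS trait definition:

```rust
use std::collections::HashMap;
use std::io;

// Illustrative provider trait; the real trait's methods and signatures
// are assumptions here.
pub trait StorageProvider {
    fn read(&self, path: &str, offset: u64, len: u32) -> io::Result<Vec<u8>>;
    fn write(&mut self, path: &str, offset: u64, data: &[u8]) -> io::Result<u32>;
    fn remove(&mut self, path: &str) -> io::Result<()>;
}

/// A trivial in-memory provider, standing in for the SeaweedFS client.
#[derive(Default)]
pub struct InMemoryProvider {
    files: HashMap<String, Vec<u8>>,
}

impl StorageProvider for InMemoryProvider {
    fn read(&self, path: &str, offset: u64, len: u32) -> io::Result<Vec<u8>> {
        let data = self
            .files
            .get(path)
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))?;
        let start = (offset as usize).min(data.len());
        let end = (start + len as usize).min(data.len());
        Ok(data[start..end].to_vec())
    }

    fn write(&mut self, path: &str, offset: u64, data: &[u8]) -> io::Result<u32> {
        let file = self.files.entry(path.to_string()).or_default();
        let end = offset as usize + data.len();
        if file.len() < end {
            file.resize(end, 0); // zero-fill any gap before the write offset
        }
        file[offset as usize..end].copy_from_slice(data);
        Ok(data.len() as u32)
    }

    fn remove(&mut self, path: &str) -> io::Result<()> {
        self.files
            .remove(path)
            .map(|_| ())
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}
```

Because all backends implement the same trait, the FSAL's security pipeline stays identical regardless of which backend a volume routes to.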

AegisFileHandle

The NFSv3 protocol requires servers to return an opaque FileHandle for each file and directory. AEGIS encodes authorization information directly into the FileHandle:

FileHandle layout (48 bytes raw, ~52 bytes serialized, ≤64 bytes NFSv3 limit):
┌──────────────────────────────────────────────────┐
│  execution_id  (UUID binary, 16 bytes)            │
│  volume_id     (UUID binary, 16 bytes)            │
│  path_hash     (u64, 8 bytes)  — FNV hash of path │
│  created_at    (i64, 8 bytes)  — Unix timestamp   │
└──────────────────────────────────────────────────┘

Because NFSv3's 64-byte limit does not allow storing a full file path in the handle, path_hash contains a hash of the canonical path. The NFS server maintains a bidirectional in-memory FileHandleTable (fileid3 ↔ AegisFileHandle) that maps numeric file IDs to handles and reconstructs paths on demand. This table is per-execution and discarded when the execution ends.
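The 48-byte layout above can be sketched directly in Rust. The fixed byte offsets, little-endian encoding, and the FNV-1a variant below are assumptions for illustration; the real implementation serializes the handle via bincode:

```rust
use std::convert::TryInto;

const NFS3_FH_LIMIT: usize = 64; // hard NFSv3 protocol limit

#[derive(Debug, PartialEq)]
pub struct AegisFileHandle {
    pub execution_id: [u8; 16], // UUID bytes
    pub volume_id: [u8; 16],    // UUID bytes
    pub path_hash: u64,         // hash of the canonical path
    pub created_at: i64,        // Unix timestamp
}

/// 64-bit FNV-1a hash of a canonical path (the exact FNV variant used by
/// AEGIS is an assumption here).
pub fn fnv1a64(path: &str) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for b in path.as_bytes() {
        h ^= *b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

impl AegisFileHandle {
    pub fn encode(&self) -> [u8; 48] {
        let mut out = [0u8; 48];
        out[0..16].copy_from_slice(&self.execution_id);
        out[16..32].copy_from_slice(&self.volume_id);
        out[32..40].copy_from_slice(&self.path_hash.to_le_bytes());
        out[40..48].copy_from_slice(&self.created_at.to_le_bytes());
        out
    }

    pub fn decode(raw: &[u8; 48]) -> Self {
        let mut execution_id = [0u8; 16];
        let mut volume_id = [0u8; 16];
        execution_id.copy_from_slice(&raw[0..16]);
        volume_id.copy_from_slice(&raw[16..32]);
        let path_hash = u64::from_le_bytes(raw[32..40].try_into().unwrap());
        let created_at = i64::from_le_bytes(raw[40..48].try_into().unwrap());
        AegisFileHandle { execution_id, volume_id, path_hash, created_at }
    }
}
```

The raw layout is 48 bytes, comfortably under the 64-byte protocol limit even after serialization overhead.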

On every NFS operation, AegisFSAL decodes the FileHandle, extracts execution_id and volume_id, and verifies that the requesting execution is authorized to access that volume. If the execution does not own the volume, the operation fails with NFS3ERR_ACCES and an UnauthorizedVolumeAccess event is published.
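That ownership check reduces to two comparisons, sketched below with illustrative names. In the real server the failure path maps to NFS3ERR_ACCES and publishes UnauthorizedVolumeAccess:

```rust
#[derive(Debug, PartialEq)]
pub enum AccessCheck {
    Ok,
    Nfs3ErrAcces, // rejected; also triggers an UnauthorizedVolumeAccess event
}

/// Illustrative per-operation ownership check (names are assumptions).
pub fn authorize_volume_access(
    handle_execution_id: &[u8; 16],
    handle_volume_id: &[u8; 16],
    requester_execution_id: &[u8; 16],
    owned_volumes: &[[u8; 16]],
) -> AccessCheck {
    // The execution baked into the FileHandle must be the one issuing the RPC...
    if handle_execution_id != requester_execution_id {
        return AccessCheck::Nfs3ErrAcces;
    }
    // ...and the volume must belong to that execution.
    if !owned_volumes.contains(handle_volume_id) {
        return AccessCheck::Nfs3ErrAcces;
    }
    AccessCheck::Ok
}
```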

The 64-byte NFSv3 FileHandle size limit is a hard protocol constraint enforced by the kernel NFS client. The current layout serializes to ~52 bytes via bincode, safely within the limit.


UID/GID Squashing

When SeaweedFS stores files, they carry a real POSIX UID/GID. Agent containers run as varying user IDs. Without squashing, file ownership mismatches would cause permission errors.

AegisFSAL overrides all file metadata returned by GETATTR to report the agent container's UID/GID rather than the real file ownership:

  • All GETATTR responses return uid = agent_container_uid, gid = agent_container_gid.
  • POSIX permission bits (e.g., a file set to mode 600) are not enforced by the NFS server.
  • Authorization is handled entirely by the manifest FilesystemPolicy, not kernel permission bits.

The agent_container_uid and agent_container_gid are stored in the Execution metadata when the container is created and retrieved by AegisFSAL during each operation.
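The squashing rule is a pure metadata rewrite, sketched below with an illustrative attribute struct (field names are assumptions):

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct FileAttr {
    pub uid: u32,
    pub gid: u32,
    pub mode: u32,
    pub size: u64,
}

/// Replace real ownership with the agent container's identity; every other
/// attribute passes through unchanged.
pub fn squash_attrs(real: &FileAttr, agent_uid: u32, agent_gid: u32) -> FileAttr {
    FileAttr {
        uid: agent_uid,
        gid: agent_gid,
        ..real.clone()
    }
}
```

From the agent's point of view, every file in the volume simply appears to be owned by the container's own user.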


Path Canonicalization

All incoming paths are canonicalized before reaching the StorageProvider:

  1. Resolve any . components.
  2. Detect any .. components.
  3. If .. is detected, reject the entire operation with NFS3ERR_ACCES and publish a PathTraversalBlocked event.
  4. Strip the volume's root prefix to produce a path relative to the SeaweedFS bucket.

Example:

Incoming:    /workspace/../etc/passwd
Step 2:      ".." detected
Step 3:      REJECTED → NFS3ERR_ACCES
             PathTraversalBlocked event published
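Steps 1 through 3 can be sketched as a single pass over path components (step 4, the volume-prefix strip, is omitted here). The function name and error type are illustrative:

```rust
/// Resolve "." components and reject any ".." component outright; the caller
/// maps the error to NFS3ERR_ACCES and publishes PathTraversalBlocked.
pub fn canonicalize(path: &str) -> Result<String, &'static str> {
    let mut parts: Vec<&str> = Vec::new();
    for comp in path.split('/') {
        match comp {
            "" | "." => continue,                 // step 1: drop "." and empty components
            ".." => return Err("path traversal"), // steps 2-3: reject, never resolve
            other => parts.push(other),
        }
    }
    Ok(format!("/{}", parts.join("/")))
}
```

Rejecting rather than resolving `..` is deliberate: resolving would silently let a crafted path escape the volume root.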

Filesystem Policy Enforcement

Each WRITE, CREATE, and REMOVE operation is validated against the manifest's FilesystemPolicy:

spec:
  security:
    filesystem:
      read:
        - /workspace
        - /agent
      write:
        - /workspace

If an agent attempts to write to /agent/config.py while only /workspace appears in the write list, the operation is blocked with NFS3ERR_PERM and a FilesystemPolicyViolation event is published.
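The allowlist check itself is a prefix match on the canonical path. Note that a naive starts_with would wrongly admit sibling paths such as /workspace2; the sketch below (illustrative names) guards against that:

```rust
/// True only when `path` equals an allowlisted prefix or sits beneath it.
pub fn write_allowed(path: &str, write_allowlist: &[&str]) -> bool {
    write_allowlist
        .iter()
        .any(|prefix| path == *prefix || path.starts_with(&format!("{}/", prefix)))
}
```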


Quota Enforcement

When size_limit is set in the volume declaration, AegisFSAL tracks cumulative bytes written to the volume. Before each WRITE:

current_volume_size + write_size > parsed(size_limit)?
  → YES: fail with NFS3ERR_NOSPC, emit VolumeQuotaExceeded event
  → NO:  proceed with write

Quota accounting is maintained in-memory per execution and persisted to PostgreSQL. It is not affected by file deletions in Phase 1 (quota only tracks bytes written, not net storage used).
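The gate above is a single comparison, sketched with illustrative names (the real code also emits the quota event and persists the counter):

```rust
#[derive(Debug, PartialEq)]
pub enum WriteDecision {
    Proceed,
    Nfs3ErrNospc, // rejected; also emits a quota-exceeded event
}

/// Pre-WRITE gate: cumulative bytes written plus the incoming write must
/// stay within the parsed size_limit.
pub fn check_quota(current_volume_size: u64, write_size: u64, size_limit: u64) -> WriteDecision {
    if current_volume_size.saturating_add(write_size) > size_limit {
        WriteDecision::Nfs3ErrNospc
    } else {
        WriteDecision::Proceed
    }
}
```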


Storage Routing

The StorageRouter sits between AegisFSAL and the concrete backends, allowing different volumes to use distinct providers (such as OpenDAL or LocalHost). On every POSIX operation, AegisFSAL asks the StorageRouter to resolve the StorageProvider for the volume_id encoded in the FileHandle.
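A minimal sketch of that dispatch, with an illustrative enum standing in for the real provider objects and an assumed default of SeaweedFS when no explicit route exists:

```rust
use std::collections::HashMap;

/// Backend kinds, mirroring the provider list earlier in this page.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Backend {
    SeaweedFs,
    OpenDal,
    LocalHost,
    Seal,
}

#[derive(Default)]
pub struct StorageRouter {
    routes: HashMap<[u8; 16], Backend>, // volume_id -> backend
}

impl StorageRouter {
    pub fn register(&mut self, volume_id: [u8; 16], backend: Backend) {
        self.routes.insert(volume_id, backend);
    }

    /// Resolve the backend for the volume_id recovered from the FileHandle.
    pub fn provider_for(&self, volume_id: &[u8; 16]) -> Backend {
        self.routes.get(volume_id).copied().unwrap_or(Backend::SeaweedFs)
    }
}
```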

For operational details on the individual storage backends, see Storage Backends.

AegisFSAL is designed as a transport-agnostic core. The NFSv3 server is the Phase 1 transport for Docker-based deployments. In Phase 2 (Firecracker), a virtio-fs frontend will use the same AegisFSAL security and authorization logic with zero code duplication:

Phase 1: Docker
  NFSv3 Frontend → AegisFSAL → StorageProvider

Phase 2: Firecracker
  virtio-fs Frontend → AegisFSAL → StorageProvider

The FSAL authorization logic, path canonicalization, UID/GID squashing, quota tracking, and event publishing are written once in AegisFSAL and shared across both transports.


Volume Lifecycle

Volumes follow a deterministic state machine managed by the orchestrator:

Creating ──► Available ──► Attached ──► Detached
    │                          │             │
    │                          └─────────────┤
    │                                        │
    └──────────────────────────────────► Deleting ──► Deleted

Any non-terminal state ──► Failed

State      Meaning
Creating   Directory being provisioned in SeaweedFS and quota being set
Available  SeaweedFS directory ready; no container has mounted it yet
Attached   NFS export active; container is mounted and I/O is live
Detached   Container stopped; NFS export removed; volume data intact in SeaweedFS
Deleting   Delete request accepted; SeaweedFS directory removal in progress
Deleted    SeaweedFS directory confirmed removed; record retained briefly for audit trail
Failed     A state transition failed (e.g., SeaweedFS unreachable during creation or deletion)

Available → Attached occurs when the container starts and the NFS mount is confirmed active. Attached → Detached occurs when the container stops or is killed. Ephemeral volumes with no active execution proceed immediately from Detached to Deleting. Persistent volumes remain in Detached until explicitly deleted via the CLI or API.

Failed volumes are surfaced through volume management APIs and can be retried by reissuing the delete request through the orchestrator API.
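The diagram and table translate into a small transition predicate. Variant names mirror the states above; the predicate is a sketch rather than the orchestrator's actual state machine code:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum VolumeState {
    Creating,
    Available,
    Attached,
    Detached,
    Deleting,
    Deleted, // terminal
    Failed,  // terminal
}

/// Transition table matching the diagram; any non-terminal state may also
/// move to Failed.
pub fn transition_allowed(from: VolumeState, to: VolumeState) -> bool {
    use VolumeState::*;
    matches!(
        (from, to),
        (Creating, Available)
            | (Available, Attached)
            | (Attached, Detached)
            | (Attached, Deleting)
            | (Detached, Deleting)
            | (Creating, Deleting)
            | (Deleting, Deleted)
            | (Creating | Available | Attached | Detached | Deleting, Failed)
    )
}
```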


Phase 1 Constraints

nolock Mount Option

All NFS mounts in Phase 1 use nolock. This disables the NLM (Network Lock Manager) protocol, meaning POSIX advisory file locks (flock, fcntl) are not coordinated across agents.

This is safe for the common case of single-agent-per-volume. For multi-agent coordination (swarms), use the ResourceLock mechanism provided by the swarm coordination context instead of POSIX locks.

Single-Writer Constraint

Persistent volumes with ReadWrite access can only be mounted by one execution at a time. Attempting a second ReadWrite mount on the same volume returns VolumeAlreadyMounted. Multiple executions may hold ReadOnly mounts simultaneously.

SRE & Performance Tuning

To optimize the AegisFSAL NFS Server Gateway for varied agent workloads, operators should tune the kernel NFS client mount options. By default, the orchestrator mounts volumes with the following options:

addr=<orchestrator_host>,nfsvers=3,proto=tcp,soft,timeo=10,nolock,acregmin=3,acregmax=60
  • Graceful Degradation (soft,timeo=10): A soft mount ensures that if the NFS server crashes or becomes unreachable, the agent container's I/O operations will return an EIO error rather than hanging indefinitely in a D-state. The timeo=10 parameter specifies a 1-second timeout (measured in deciseconds).
  • Client Caching (acregmin, acregmax): The kernel NFS client caches file attributes. Lowering these values (acregmin=1) reduces cache staleness at the cost of more GETATTR calls to the orchestrator. For high-throughput artifact generation, keeping the defaults (acregmin=3, acregmax=60) reduces load on the orchestrator.
  • Latency Overhead: Because every POSIX operation routes through the AegisFSAL orchestrator process for authorization, there is an expected 1-2ms latency overhead per operation compared to a direct FUSE mount. This is generally acceptable for agent-driven code generation, but may affect high-frequency I/O workloads.

Export Path Routing

Each volume gets a unique NFS export path derived from its tenant and volume identifiers:

/{tenant_id}/{volume_id}

The orchestrator maintains a runtime NfsVolumeRegistry — a concurrent map of VolumeId → NfsVolumeContext. When an execution mounts a volume, its export path is registered. When the execution ends and the volume is detached, the entry is removed. Volumes that are not currently mounted have no active export and cannot be reached via NFS.
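A minimal sketch of such a registry follows. The concrete concurrent-map type is an implementation detail of the orchestrator; this sketch uses a mutex-guarded HashMap, and the method names are assumptions:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Concurrent map of volume_id -> active export path; entries exist only
/// while a volume is mounted.
#[derive(Clone, Default)]
pub struct NfsVolumeRegistry {
    inner: Arc<Mutex<HashMap<String, String>>>,
}

impl NfsVolumeRegistry {
    /// Register the export path when an execution mounts the volume.
    pub fn register(&self, tenant_id: &str, volume_id: &str) -> String {
        let export = format!("/{}/{}", tenant_id, volume_id);
        self.inner
            .lock()
            .unwrap()
            .insert(volume_id.to_string(), export.clone());
        export
    }

    /// Remove the entry when the volume is detached.
    pub fn unregister(&self, volume_id: &str) {
        self.inner.lock().unwrap().remove(volume_id);
    }

    /// None means the volume has no active export and is unreachable via NFS.
    pub fn export_path(&self, volume_id: &str) -> Option<String> {
        self.inner.lock().unwrap().get(volume_id).cloned()
    }
}
```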

The agent container's NFS mount is configured to target the orchestrator host at the volume's export path:

addr=<orchestrator_host>,nfsvers=3,proto=tcp,soft,timeo=10,nolock
device: :/<tenant_id>/<volume_id>
target: /workspace  (or mount_path from manifest)

Storage Events

Every file operation handled by AegisFSAL publishes a StorageEvent to the event bus. These events are persisted to PostgreSQL by a background StorageEventPersister task and form the complete file-level audit trail for each execution.

Event                      Trigger
FileOpened                 Agent opens a file (open() / create())
FileRead                   Bytes are read from a file; includes offset and bytes_read
FileWritten                Bytes are written to a file; includes offset and bytes_written
FileClosed                 File handle is released
DirectoryListed            readdir is called on a directory
FileCreated                A new file is created
FileDeleted                A file is removed
PathTraversalBlocked       A ".." component was detected in the incoming path
FilesystemPolicyViolation  An operation violated a manifest read/write allowlist
QuotaExceeded              A write would exceed the volume's size_limit
UnauthorizedVolumeAccess   The requesting execution does not own the volume

All events carry execution_id, volume_id, and a timestamp. File operation events additionally carry the canonicalized path, byte counts, and latency in milliseconds.
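Sketched as a single envelope struct, with field names that are assumptions consistent with the text (the real StorageEvent is a domain event with per-variant payloads):

```rust
#[derive(Debug)]
pub struct StorageEvent {
    pub kind: &'static str,   // e.g. "FileWritten", "PathTraversalBlocked"
    pub execution_id: String,
    pub volume_id: String,
    pub timestamp_ms: i64,
    // Present on file-operation events only:
    pub path: Option<String>, // canonicalized path
    pub bytes: Option<u64>,   // bytes_read / bytes_written
    pub latency_ms: Option<u64>,
}
```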


SeaweedFS Integration

SeaweedFS is the default StorageProvider. The orchestrator communicates with SeaweedFS through its HTTP Filer API (port 8888), which serves two distinct roles:

Role                   Used For
Directory lifecycle    create, delete, set quota, get usage, list; called by VolumeManager during volume provisioning and GC
POSIX file operations  open, read, write, stat, readdir, create, rename, delete; called by AegisFSAL on every NFS LOOKUP, READ, WRITE, READDIR, GETATTR, etc.

Volume data is stored in SeaweedFS at the following path structure:

/{tenant_id}/{volume_id}/{file_path}

For example, a file /workspace/main.py written by an execution exec-abc on volume vol-xyz in the default single-tenant environment is stored at:

/00000000-0000-0000-0000-000000000001/vol-xyz/main.py
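That mapping, including the mount-prefix strip from canonicalization step 4, can be sketched as follows (the function name is illustrative):

```rust
/// Map a canonical in-container path to its SeaweedFS location: strip the
/// volume's mount prefix, then prepend /{tenant_id}/{volume_id}.
pub fn seaweed_path(tenant_id: &str, volume_id: &str, mount_path: &str, file_path: &str) -> String {
    let rel = file_path
        .strip_prefix(mount_path)
        .unwrap_or(file_path)
        .trim_start_matches('/');
    format!("/{}/{}/{}", tenant_id, volume_id, rel)
}
```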

Replication

SeaweedFS replication is configured independently of AEGIS at the SeaweedFS layer. The AEGIS orchestrator does not set the replication factor on volume directories — this is controlled by the SeaweedFS default replication setting and can be overridden in SeaweedFS collection configuration.

A common convention for AEGIS deployments is:

Storage Class  SeaweedFS Replication              Rationale
Ephemeral      000 (no replication)               TTL-based; durability not required
Persistent     001 (one copy on different nodes)  Survives single node failure

Health Checks

The orchestrator checks SeaweedFS health via the Filer API on startup and periodically thereafter. If SeaweedFS is unreachable and fallback_to_local is enabled in the node configuration, the orchestrator falls back to a local filesystem StorageProvider. The local fallback does not support S3 artifact inspection or multi-node access.
