
Agents

The agent manifest format, lifecycle states, runtime selection, and BYOLLM model alias system.


An Agent in AEGIS is a stateless compute process defined entirely by a declarative YAML manifest (kind: Agent). The orchestrator reads the manifest to determine the runtime environment, what tools the agent is allowed to use, what security constraints to enforce, what resources to allocate, and how to validate the output.

Agents do not maintain state between executions. Context is injected by the orchestrator at the start of each execution and tool results are returned to the agent via the orchestrator proxy.


Execution Context Overrides

Direct agent executions can inject a structured dictionary of context variables at start time using context_overrides.

  • The dictionary must be a JSON or YAML object.
  • Keys are exposed as top-level template variables during prompt rendering.
  • The same normalized object is forwarded into the runtime task context, so the agent receives the structured values directly, not only through prompt interpolation.
  • Reserved built-in keys cannot be overridden. Attempts to replace fields such as instruction, input, or iteration metadata fail validation.

This makes it possible to keep a reusable agent manifest while swapping execution-specific variables from a file or API call.

aegis task execute python-coder \
  --input @input.json \
  --context @context.yaml

Example context.yaml:

repo_url: https://github.com/example/service
branch: main
review:
  severity: high

If the prompt template references {{repo_url}} or {{review.severity}}, the override values win over same-named non-reserved context variables for that execution.
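The override merge described above can be sketched in a few lines of Python. The RESERVED_KEYS set here is an illustrative subset; the actual reserved names are defined by the orchestrator:

```python
# Illustrative subset of reserved built-in keys; the orchestrator defines the real set.
RESERVED_KEYS = {"instruction", "input"}

def merge_context(base: dict, overrides: dict) -> dict:
    """Merge execution-time context overrides into the base template context.

    Reserved built-in keys cannot be replaced; attempting to do so is a
    validation error, mirroring the behaviour described above.
    """
    clash = RESERVED_KEYS & overrides.keys()
    if clash:
        raise ValueError(f"cannot override reserved keys: {sorted(clash)}")
    merged = dict(base)
    merged.update(overrides)  # non-reserved override values win for this execution
    return merged
```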


Manifest Structure

All agent manifests follow a Kubernetes-style format:

apiVersion: 100monkeys.ai/v1
kind: Agent
metadata:
  name: code-reviewer
  version: "1.0.0"
  labels:
    team: platform
    environment: production
spec:
  runtime:
    language: python
    version: "3.11"
    isolation: docker

  task:
    instruction: "Reviews pull requests and outputs structured feedback."

  security:
    network:
      mode: allow
      allowlist:
        - api.github.com
        - api.openai.com
    filesystem:
      read:
        - /workspace
      write:
        - /workspace
    resources:
      cpu: 1000
      memory: "1Gi"
      timeout: "300s"

  execution:
    mode: iterative
    max_iterations: 10
    validation:
      system:
        must_succeed: true
      output:
        format: json

  env:
    LOG_LEVEL: info

  volumes:
    - name: workspace
      storage_class: ephemeral
      mount_path: /workspace
      access_mode: read-write
      ttl_hours: 1
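As a rough illustration of what gets checked on deploy, a minimal manifest validator might look like the following. This is a sketch over a plain dict, not the orchestrator's actual validation logic:

```python
# Top-level fields every agent manifest carries, per the example above.
REQUIRED_TOP = {"apiVersion", "kind", "metadata", "spec"}

def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the manifest
    passes these basic structural checks."""
    errors = [f"missing field: {f}" for f in sorted(REQUIRED_TOP - manifest.keys())]
    if manifest.get("kind") != "Agent":
        errors.append("kind must be 'Agent'")
    if manifest.get("apiVersion") != "100monkeys.ai/v1":
        errors.append("unsupported apiVersion")
    if not manifest.get("metadata", {}).get("name"):
        errors.append("metadata.name is required")
    return errors
```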

Manifest Fields

The table below covers the most common fields. For the complete field-by-field specification including all options, defaults, and validation rules, see the Agent Manifest Reference.

metadata

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique identifier for the agent on this node. Used in CLI commands and gRPC calls. |
| labels | map[string]string | No | Arbitrary key-value tags for filtering and organization. |

spec

| Field | Type | Required | Description |
|---|---|---|---|
| description | string | No | Human-readable description injected into the system prompt. |
| runtime | object | Yes | Language, version, and isolation mode. |
| task | object | Yes | Instruction and prompt template. Community skills can be imported via the skill-import workflow. |
| security | object | No | Network policy, filesystem policy, resource limits (deny-by-default). |
| execution | object | No | Iteration mode, max iterations, and validation criteria. |
| tools | object[] or string[] | No | MCP tools the agent may invoke. |
| env | map[string]string | No | Environment variables injected into the agent container. |
| volumes | object[] | No | Volume mounts. See Configuring Storage. |

spec.security

| Field | Type | Description |
|---|---|---|
| network.mode | allow \| deny \| none | Policy mode. allow = allowlist; none = no network. |
| network.allowlist | string[] | Allowed domain names and CIDR blocks. |
| network.denylist | string[] | Explicitly blocked domains. |
| filesystem.read | string[] | Readable paths inside the container. Glob patterns supported. |
| filesystem.write | string[] | Writable paths inside the container. Glob patterns supported. |
| resources.cpu | integer | CPU in millicores (1000 = 1 core). Default: 1000. |
| resources.memory | string | Memory limit. Human-readable: "512Mi", "1Gi". Default: "512Mi". |
| resources.timeout | string | Total execution timeout. Human-readable: "300s", "5m". Max "1h". |
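The human-readable resource values above can be parsed as in this sketch. The unit sets shown are assumptions based on the examples in the table:

```python
import re

_MEM_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
_TIME_UNITS = {"s": 1, "m": 60, "h": 3600}

def parse_memory(value: str) -> int:
    """Parse a "512Mi" / "1Gi" style limit into bytes."""
    m = re.fullmatch(r"(\d+)(Ki|Mi|Gi)", value)
    if not m:
        raise ValueError(f"bad memory limit: {value!r}")
    return int(m.group(1)) * _MEM_UNITS[m.group(2)]

def parse_timeout(value: str) -> int:
    """Parse a "300s" / "5m" style timeout into seconds, capped at the 1h max."""
    m = re.fullmatch(r"(\d+)([smh])", value)
    if not m:
        raise ValueError(f"bad timeout: {value!r}")
    seconds = int(m.group(1)) * _TIME_UNITS[m.group(2)]
    if seconds > 3600:
        raise ValueError("timeout exceeds the 1h maximum")
    return seconds
```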

spec.execution

| Field | Type | Description |
|---|---|---|
| mode | one-shot \| iterative | Execution strategy. iterative enables the 100monkeys refinement loop. Default: one-shot. |
| max_iterations | integer | Maximum refinement loops (for iterative mode). Default: 5. |
| iteration_timeout | string | Per-iteration timeout. Human-readable: "30s", "60s", "5m". Default: "300s" (5 minutes). Each iteration gets this much time for LLM calls + tool invocations. |
| llm_timeout_seconds | integer | HTTP socket timeout for bootstrap.py LLM calls. Default: 300 seconds. |
| memory | boolean | Enable Cortex memory system. Default: false. |
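Iterative mode can be pictured as a loop like the following sketch. The step and validate callables stand in for the LLM call and the validation criteria; the real loop lives in the orchestrator and enforces timeouts more strictly:

```python
import time

def run_iterative(step, validate, max_iterations=5, iteration_timeout=300.0):
    """Iterative refinement loop: run a step, validate the output, and retry
    until validation passes or the iteration budget is exhausted."""
    output = None
    for i in range(1, max_iterations + 1):
        started = time.monotonic()
        output = step(output)  # stands in for the LLM call + tool invocations
        # A real implementation enforces the deadline while the step runs;
        # this sketch only checks elapsed time afterwards.
        if time.monotonic() - started > iteration_timeout:
            raise TimeoutError(f"iteration {i} exceeded {iteration_timeout}s")
        if validate(output):
            return output, i
    raise RuntimeError(f"no valid output after {max_iterations} iterations")
```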

Agent Lifecycle

An agent transitions through the following states after deploy:

deployed ⇄ paused
    └──────────────────────→ archived
| State | Description |
|---|---|
| deployed | The manifest is registered. New executions can be started against this agent. |
| paused | The manifest is retained but no new executions are accepted. Running executions complete normally. |
| archived | The manifest is soft-deleted. Cannot be unarchived. Historical execution records are retained. |

Lifecycle operations are performed via the CLI:

aegis agent deploy ./agent.yaml
aegis agent list
aegis agent show <id>
aegis agent remove <id>
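The transitions above can be modeled as a small state machine. Whether a paused agent can be archived directly is an assumption in this sketch; the diagram only shows archiving from deployed:

```python
from enum import Enum

class AgentState(Enum):
    DEPLOYED = "deployed"
    PAUSED = "paused"
    ARCHIVED = "archived"

# deploy ⇄ pause; archiving is terminal (archived agents cannot be unarchived).
# PAUSED → ARCHIVED is assumed here, not confirmed by the diagram.
_TRANSITIONS = {
    AgentState.DEPLOYED: {AgentState.PAUSED, AgentState.ARCHIVED},
    AgentState.PAUSED: {AgentState.DEPLOYED, AgentState.ARCHIVED},
    AgentState.ARCHIVED: set(),
}

def transition(current: AgentState, target: AgentState) -> AgentState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in _TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```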

Semantic Discovery (Enterprise)

On nodes with the discovery service configured, deployed agents are automatically indexed for semantic search. This allows agents and operators to find existing agents by natural-language intent (e.g. "review code for security issues") rather than by exact name, using the aegis.agent.search MCP tool. This is an enterprise feature; nodes without discovery configured use aegis.agent.list for enumeration.


Runtime Selection

Two separate concerns govern how an agent runs:

  1. The container image — defined entirely by the agent manifest (spec.runtime). Each agent declares its own image, independently of other agents on the same node.
  2. The isolation technology — determined by node configuration (aegis-config.yaml). This controls whether images run inside Docker containers or Firecracker microVMs.

Container Image: Manifest-Driven

AEGIS supports two mutually exclusive ways to specify the container image in spec.runtime:

| Mode | Manifest Fields | How the Image Is Resolved |
|---|---|---|
| Standard Runtime | language + version | Orchestrator resolves to a pinned official image (e.g., python:3.11-slim). No image to build or maintain. |
| Custom Runtime | image | Orchestrator pulls directly from your registry. You build and maintain the image. |

See Standard Runtime Registry for the full language-version table, and Custom Runtime Agents for the custom image path.
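Resolution of the two mutually exclusive modes can be sketched as follows. STANDARD_IMAGES here is a one-entry stand-in for the Standard Runtime Registry:

```python
# One-entry stand-in for the pinned-image table in the Standard Runtime Registry.
STANDARD_IMAGES = {("python", "3.11"): "python:3.11-slim"}

def resolve_image(runtime: dict) -> str:
    """Resolve a spec.runtime block to a container image reference."""
    if "image" in runtime and ("language" in runtime or "version" in runtime):
        raise ValueError("image and language/version are mutually exclusive")
    if "image" in runtime:
        return runtime["image"]  # Custom Runtime: pulled from your registry
    key = (runtime["language"], runtime["version"])
    if key not in STANDARD_IMAGES:
        raise ValueError(f"no standard image for {key}")
    return STANDARD_IMAGES[key]  # Standard Runtime: pinned official image
```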

Isolation Technology: Node-Config-Driven

The AgentRuntime trait abstracts all isolation backends. Switching a node from Docker to Podman or Firecracker requires only a config change — no agent manifest changes needed.

| Isolation | Use Case | Requirement |
|---|---|---|
| docker | Development, staging, and production | Docker or Podman runtime accessible via container_socket_path |
| firecracker | Hardened production | Bare-metal or KVM-passthrough host, Linux kernel 5.10+ |

The docker isolation type works with both Docker and Podman. The orchestrator auto-detects the container engine from the configured socket. Podman rootless mode provides additional security by eliminating the privileged daemon.

See Docker Deployment, Podman Deployment, and Firecracker Runtime for deployment details.


BYOLLM: Model Alias System

Agent manifests reference LLM models by alias, not by provider name. This decouples agent definitions from infrastructure choices.

Specify the model alias in your manifest (for example in semantic validation settings), and let the node operator map aliases to providers in node configuration.

In aegis-config.yaml, the node operator maps aliases to providers:

llm:
  providers:
    - name: openai-gpt4o
      type: openai
      api_key: "env:OPENAI_API_KEY"
      model: gpt-4o
    - name: claude-sonnet-4-5
      type: anthropic
      api_key: "env:ANTHROPIC_API_KEY"
      model: claude-sonnet-4-5
  aliases:
    default: openai-gpt4o
    fast: claude-sonnet-4-5
    reasoning: openai-gpt4o

To swap the model backing the default alias for all agents on the node, change the alias mapping in config and restart the daemon — no manifest changes required.
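Alias resolution against an aegis-config.yaml-style structure can be sketched like this; the env: prefix handling mirrors the api_key convention shown in the config above:

```python
import os

def resolve_alias(llm_config: dict, alias: str) -> dict:
    """Resolve a model alias to its provider entry, expanding env:-prefixed
    api_key values from the environment."""
    providers = {p["name"]: p for p in llm_config["providers"]}
    provider = dict(providers[llm_config["aliases"][alias]])
    key = provider.get("api_key", "")
    if key.startswith("env:"):
        provider["api_key"] = os.environ.get(key[4:], "")
    return provider
```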

See Configuring LLM Providers for the full provider configuration reference.

