Management via MCP

How to use built-in management tools to automate agent and workflow lifecycles.

AEGIS is designed for a high degree of autonomy. While the CLI is well suited to human operators, AI agents can manage the platform themselves using the built-in Management MCP Tools.

This enables powerful patterns like:

  • Agent Factories: Agents that create specialized "child" agents on demand.
  • Self-Optimizing Workflows: Workflows that analyze their own performance and update their definitions.
  • Generative Authoring: Generating complete agent/workflow manifests from natural language and deploying them immediately.

The Management Toolset

All aegis.* tools are built directly into the Orchestrator as first-class built-in tools. They require no external MCP servers and are available to any agent whose SecurityContext has the appropriate capabilities.

Capability | Tools
Agent Lifecycle | aegis.agent.list, aegis.agent.search, aegis.agent.create, aegis.agent.update, aegis.agent.export, aegis.agent.delete, aegis.agent.generate
Agent Observability | aegis.agent.logs
Workflow Lifecycle | aegis.workflow.list, aegis.workflow.search, aegis.workflow.create, aegis.workflow.update, aegis.workflow.export, aegis.workflow.delete, aegis.workflow.run, aegis.workflow.status, aegis.workflow.executions.list, aegis.workflow.executions.get, aegis.workflow.generate, aegis.workflow.logs, aegis.workflow.cancel, aegis.workflow.signal, aegis.workflow.remove
Task Execution | aegis.task.execute, aegis.task.status, aegis.task.wait, aegis.task.list, aegis.task.logs, aegis.task.cancel, aegis.task.remove
System Introspection | aegis.system.info, aegis.system.config
Schema & Validation | aegis.schema.get, aegis.schema.validate

For a complete parameter reference, see the Management Tools documentation.


Pattern: The Agent Factory

An "Agent Factory" is an agent specialized in authoring other agents. It typically follows a structured loop to ensure safety and correctness.

Step 1: Discover Existing Capabilities

The factory should first check what agents already exist to avoid duplication.

// Tool Call: aegis.agent.list
{}

The response includes description, labels, and tags for each agent, enabling the factory to assess capability fit without running a full semantic search:

// aegis.agent.list — example response
{
  "agents": [
    {
      "name": "python-code-reviewer",
      "version": "1.0.0",
      "description": "Reviews Python code for style, bugs, and security issues.",
      "labels": { "role": "worker", "category": "code-review" },
      "tags": ["code-review", "python", "security-review", "static-analysis"]
    }
  ]
}
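
A factory can screen this response locally before deciding whether a new agent is needed. The following is a minimal Python sketch, assuming the response shape shown above; find_candidates and its "all required tags present" rule are hypothetical helpers for illustration, not part of AEGIS.

```python
from typing import Any


def find_candidates(list_response: dict[str, Any], required_tags: set[str]) -> list[str]:
    """Return the names of agents whose tags include every required tag."""
    matches = []
    for agent in list_response.get("agents", []):
        tags = set(agent.get("tags", []))
        if required_tags <= tags:  # subset test: all required tags are present
            matches.append(agent["name"])
    return matches


# The example response from aegis.agent.list shown above.
response = {
    "agents": [
        {
            "name": "python-code-reviewer",
            "version": "1.0.0",
            "description": "Reviews Python code for style, bugs, and security issues.",
            "labels": {"role": "worker", "category": "code-review"},
            "tags": ["code-review", "python", "security-review", "static-analysis"],
        }
    ]
}

print(find_candidates(response, {"python", "code-review"}))  # ['python-code-reviewer']
```

If the returned list is non-empty, the factory can reuse or update an existing agent instead of creating a duplicate.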

Step 2: Retrieve the Schema

To write a valid manifest, the agent retrieves the current canonical schema.

// Tool Call: aegis.schema.get
{ "key": "agent/manifest/v1" }

Step 3: Author and Validate

After composing the YAML, the agent validates it before attempting deployment.

// Tool Call: aegis.schema.validate
{
  "kind": "agent",
  "manifest_yaml": "apiVersion: aegis.ai/v1\nkind: Agent\n..."
}

Step 4: Deploy or Update

Finally, the agent deploys the new manifest. When updating an existing agent, it should either increment the version or pass force: true.

// Tool Call: aegis.agent.create
{
  "manifest_yaml": "..."
}
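
The four steps above can be sketched as a single loop. This is an illustration only: call_tool is a hypothetical stand-in for however your agent runtime invokes MCP tools, and the "valid" field on the validation result is an assumption, since the response shape of aegis.schema.validate is not shown in this guide.

```python
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> parsed JSON response.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def deploy_agent(call_tool: ToolCaller, name: str, manifest_yaml: str) -> str:
    """Run the discover -> schema -> validate -> create loop for one manifest."""
    # Step 1: avoid duplicating an agent that already exists.
    existing = call_tool("aegis.agent.list", {})
    if any(a.get("name") == name for a in existing.get("agents", [])):
        return "skipped: agent already exists"

    # Step 2: fetch the canonical schema. A real factory would feed this to
    # its authoring step; this sketch takes the manifest as an argument.
    call_tool("aegis.schema.get", {"key": "agent/manifest/v1"})

    # Step 3: validate before deploying.
    result = call_tool(
        "aegis.schema.validate",
        {"kind": "agent", "manifest_yaml": manifest_yaml},
    )
    # ASSUMPTION: the validation result carries a boolean "valid" field.
    if not result.get("valid", False):
        return "rejected: manifest failed validation"

    # Step 4: deploy.
    call_tool("aegis.agent.create", {"manifest_yaml": manifest_yaml})
    return "created"
```

Passing the tool caller in as a parameter keeps the loop testable with a stub and independent of any particular MCP client.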

Pattern: Safe Workflow Updates

Updating a workflow requires care to ensure the new definition is sound. The aegis.workflow.create tool provides a high-assurance path by running both deterministic checks and semantic judge evaluations.

  1. Export Current: Call aegis.workflow.export to get the current source.
  2. Modify: Apply changes to the YAML (increment version).
  3. Validate & Register: Call aegis.workflow.create with a task_context explaining the change.

To intentionally overwrite an existing workflow with the same name and version, pass force: true to aegis.workflow.create rather than introducing a separate deployment path.

// Tool Call: aegis.workflow.create
{
  "manifest_yaml": "...",
  "task_context": "Updating the data-science-pipeline to include a new CSV-parsing stage.",
  "judge_agents": ["workflow-generator-judge"],
  "min_score": 0.85
}

If the semantic judge score is below the threshold, the tool returns the judge's feedback, allowing the authoring agent to refine the manifest and try again.
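
A refine-and-retry loop around this call might look like the sketch below. It is hedged heavily: call_tool and refine are hypothetical callables supplied by the caller, and the "score" and "feedback" response fields are assumptions, because this guide does not show the exact response of aegis.workflow.create.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]
Refiner = Callable[[str, str], str]  # (manifest_yaml, feedback) -> new manifest


def register_workflow(
    call_tool: ToolCaller,
    manifest_yaml: str,
    task_context: str,
    refine: Refiner,
    min_score: float = 0.85,
    max_attempts: int = 3,
) -> tuple[bool, int, str]:
    """Try to register a workflow, refining the manifest on low judge scores.

    Returns (accepted, attempts_used, last_manifest).
    """
    for attempt in range(1, max_attempts + 1):
        result = call_tool(
            "aegis.workflow.create",
            {
                "manifest_yaml": manifest_yaml,
                "task_context": task_context,
                "judge_agents": ["workflow-generator-judge"],
                "min_score": min_score,
            },
        )
        # ASSUMPTION: the response exposes "score" and "feedback" fields.
        if result.get("score", 0.0) >= min_score:
            return True, attempt, manifest_yaml
        manifest_yaml = refine(manifest_yaml, result.get("feedback", ""))
    return False, max_attempts, manifest_yaml
```

Capping the number of attempts keeps an authoring agent from looping indefinitely when the judge never accepts the manifest.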

To execute a workflow pinned to a specific version:

// Tool Call: aegis.workflow.run (with version)
{
  "name": "deploy-pipeline",
  "version": "1.0.0",
  "input": { "environment": "staging" }
}

Once a workflow is running, the following tools expose read-only views of the execution:

  • aegis.workflow.status and aegis.workflow.executions.get for status, blackboard, and persisted execution state
  • aegis.workflow.logs for the execution event log and live follow mode

Pattern: Monitor and Manage Running Tasks

Management tools give agents complete control over the task execution lifecycle. This is particularly useful for orchestrator agents that launch, monitor, and clean up child tasks.

Step 1: Launch a Task

// Tool Call: aegis.task.execute
{
  "agent_id": "3f2504e0-4f89-11d3-9a0c-0305e82c3301",
  "input": { "target_url": "https://example.com/api" }
}

Returns { "execution_id": "6ba7b810-...", "status": "started" }.

To pin execution to a specific agent version:

// Tool Call: aegis.task.execute (with version)
{
  "agent_id": "code-reviewer",
  "version": "2.1.0",
  "input": { "repo": "my-project" }
}

Step 2: Wait for Completion

Use aegis.task.wait to block until the execution reaches a terminal state instead of manually polling with aegis.task.status:

// Tool Call: aegis.task.wait
{ "execution_id": "6ba7b810-...", "timeout_seconds": 300 }

Returns the final status object once the execution completes, fails, or is cancelled. If the timeout is exceeded, returns { "status": "timed_out", ... }.

Alternatively, you can poll manually with aegis.task.status:

// Tool Call: aegis.task.status
{ "execution_id": "6ba7b810-..." }

Returns { "status": "running", "iteration_count": 3, "last_output": "...", ... }.
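
When manual polling is preferred, a bounded loop avoids waiting forever. The sketch below is illustrative: call_tool is a hypothetical tool invoker, and treating only "started" and "running" as in-progress is an assumption based on the status values shown in this guide. The "timed_out" result here is produced client-side by the sketch, mirroring what aegis.task.wait returns.

```python
import time
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]

# ASSUMPTION: these are the only non-terminal status values.
IN_PROGRESS = {"started", "running"}


def poll_until_done(
    call_tool: ToolCaller,
    execution_id: str,
    interval_seconds: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll aegis.task.status until the execution leaves the in-progress set."""
    for _ in range(max_polls):
        status = call_tool("aegis.task.status", {"execution_id": execution_id})
        if status.get("status") not in IN_PROGRESS:
            return status
        sleep(interval_seconds)
    # Client-side timeout after max_polls attempts.
    return {"status": "timed_out", "execution_id": execution_id}
```

The sleep function is injectable so the loop can be exercised in tests without real delays.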

Step 3: Fetch Task Logs

For task executions, retrieve a chronological event log:

// Tool Call: aegis.task.logs
{ "execution_id": "6ba7b810-...", "limit": 50, "offset": 0 }

Use offset to paginate through the event history of a long-running task. For live tailing, use the CLI instead: aegis task logs <execution_id> --follow.
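
Paging through the full history with limit and offset can be sketched as follows. call_tool is a hypothetical tool invoker, and the "events" key in the response is an assumption, since the response shape of aegis.task.logs is not shown here.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def fetch_all_events(
    call_tool: ToolCaller, execution_id: str, page_size: int = 50
) -> list[Any]:
    """Collect every event for an execution by advancing offset page by page."""
    events: list[Any] = []
    offset = 0
    while True:
        page = call_tool(
            "aegis.task.logs",
            {"execution_id": execution_id, "limit": page_size, "offset": offset},
        )
        batch = page.get("events", [])  # ASSUMPTION: events live under "events"
        events.extend(batch)
        if len(batch) < page_size:  # a short (or empty) page means we are done
            return events
        offset += page_size
```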

Step 4: Cancel if Needed

// Tool Call: aegis.task.cancel
{ "execution_id": "6ba7b810-..." }

Step 5: Clean Up

After a task concludes, remove the execution record from the registry:

// Tool Call: aegis.task.remove
{ "execution_id": "6ba7b810-..." }

Pattern: Workflow Execution Control

Management tools provide full lifecycle control over workflow executions, including cancellation, human-in-the-loop signaling, and cleanup.

Cancel a Stuck Workflow

If a workflow execution is unresponsive or no longer needed, cancel it gracefully:

// Tool Call: aegis.workflow.cancel
{ "execution_id": "6ba7b810-9dad-11d1-80b4-00c04fd430c8" }

Returns { "tool": "aegis.workflow.cancel", "cancelled": true, "execution_id": "6ba7b810-..." }.

Send Human Input to a Paused Workflow

Workflows using human_input states will pause and wait for a signal. Deliver the response via MCP:

// Tool Call: aegis.workflow.signal
{
  "execution_id": "6ba7b810-9dad-11d1-80b4-00c04fd430c8",
  "response": "approved"
}

Returns { "tool": "aegis.workflow.signal", "signalled": true, "execution_id": "6ba7b810-..." }.

Clean Up Old Executions

After a workflow execution reaches a terminal state, remove the record from the registry:

// Tool Call: aegis.workflow.remove
{ "execution_id": "6ba7b810-9dad-11d1-80b4-00c04fd430c8" }

Returns { "tool": "aegis.workflow.remove", "removed": true, "execution_id": "6ba7b810-..." }.


Pattern: Agent Activity Logs

Retrieve deployment events, execution summaries, and lifecycle transitions for a specific agent. This is useful for auditing and diagnostics.

// Tool Call: aegis.agent.logs
{
  "agent_id": "3f2504e0-4f89-11d3-9a0c-0305e82c3301",
  "limit": 100,
  "offset": 0
}

Returns a paginated list of agent events. Use offset to page through longer histories.


Pattern: System Introspection

Before executing complex operations, a management agent should confirm the node is healthy and verify which tools and capabilities are available.

// Tool Call: aegis.system.info
{}

Returns the node version, health status (healthy | unhealthy), and the full list of enabled tool capabilities. Use this to gate decisions on feature availability.

// Tool Call: aegis.system.config
{}

Returns the full node-config.yaml as a string — useful for agents performing self-diagnostic checks or configuration audits.
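
A management agent could gate its work on the aegis.system.info result along the lines of the sketch below. The field names "health" and "capabilities" are assumptions made for illustration; this guide describes the contents of the response but not its exact keys.

```python
from typing import Any


def missing_tools(info: dict[str, Any], required: list[str]) -> list[str]:
    """Return the required tools that the node does not report as enabled."""
    enabled = set(info.get("capabilities", []))  # ASSUMPTION: key name
    return [tool for tool in required if tool not in enabled]


def ready(info: dict[str, Any], required: list[str]) -> bool:
    """True only if the node reports healthy and every required tool is enabled."""
    # ASSUMPTION: health is reported under "health" as "healthy" or "unhealthy".
    return info.get("health") == "healthy" and not missing_tools(info, required)


# Hypothetical response, shaped per the assumptions above.
info = {
    "version": "1.0.0",
    "health": "healthy",
    "capabilities": ["aegis.agent.list", "aegis.schema.get"],
}
print(ready(info, ["aegis.agent.list"]))            # True
print(missing_tools(info, ["aegis.workflow.run"]))  # ['aegis.workflow.run']
```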


Pattern: Semantic Discovery (Enterprise)

When the discovery service is configured, agents and workflows can be found by natural-language intent rather than exact name. This is useful for agent factories that need to check whether a suitable agent already exists before creating a new one, and for orchestrator agents selecting the right workflow for a user's request.

// Tool Call: aegis.agent.search
{
  "query": "review pull requests for security vulnerabilities",
  "limit": 5,
  "min_score": 0.7
}

Returns a ranked list of agents whose descriptions semantically match the query. Similarly, aegis.workflow.search finds workflows by intent:

// Tool Call: aegis.workflow.search
{
  "query": "deploy Node.js application to Kubernetes",
  "labels": { "category": "deployment" }
}

Discovery is an enterprise feature. If the node does not have the discovery section configured, these tools return an error. Use aegis.agent.list and aegis.workflow.list as the universal fallback. Both list tools include description, labels, and tags in each result so that callers can assess capability fit even without semantic search:

// aegis.workflow.list — example response
{
  "workflows": [
    {
      "name": "dev-pipeline",
      "version": "1.0.0",
      "description": "Requirements analysis, implementation, and code review pipeline.",
      "labels": { "category": "development", "team": "platform" },
      "tags": ["code-review", "ci-cd", "human-in-the-loop", "software-development"]
    }
  ]
}
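
The search-then-fallback behaviour described above can be sketched like this. Several names are assumptions: call_tool is a hypothetical tool invoker, ToolError is a hypothetical exception standing in for however your client surfaces a tool error, and the "workflows" key on the search response is assumed to match the list response.

```python
from typing import Any, Callable

ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


class ToolError(Exception):
    """Hypothetical error raised by call_tool when a tool call fails."""


def find_workflows(
    call_tool: ToolCaller, query: str, fallback_tags: list[str]
) -> list[dict[str, Any]]:
    """Prefer semantic search; fall back to tag matching over the full list."""
    try:
        found = call_tool("aegis.workflow.search", {"query": query})
        return found.get("workflows", [])  # ASSUMPTION: same key as list
    except ToolError:
        # Discovery is not configured on this node: use the universal fallback.
        listing = call_tool("aegis.workflow.list", {})
        wanted = {tag.lower() for tag in fallback_tags}
        return [
            wf
            for wf in listing.get("workflows", [])
            if wanted & {tag.lower() for tag in wf.get("tags", [])}
        ]
```

Tag matching is a deliberately simple substitute for semantic ranking; a real orchestrator might also compare descriptions and labels.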

Security and Permissions

Managing agents and workflows is a highly privileged operation. Access to these tools is governed by the agent's SecurityContext.

By default, standard "worker" agents do not have access to aegis.* tools. You must explicitly grant these capabilities in the Security Context assigned to your authoring agents.

# In node-config.yaml or via SecurityContext management
name: "agent-authoring-context"
capabilities:
  - name: "aegis.agent.*"
  - name: "aegis.workflow.*"
  - name: "aegis.schema.*"

The aegis.workflow.* capability family covers workflow definition management, execution inspection, logging, signaling, and the higher-trust execution control flows documented above.
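
To reason about which tools a context like the one above would cover, the wildcard patterns can be checked locally. This sketch assumes glob-style matching via Python's fnmatch, which may differ from how the Orchestrator actually evaluates capability names; is_allowed is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Capability patterns from the example context above.
GRANTED = ["aegis.agent.*", "aegis.workflow.*", "aegis.schema.*"]


def is_allowed(tool: str, patterns: list[str]) -> bool:
    """True if the tool name matches any granted pattern (glob semantics)."""
    return any(fnmatchcase(tool, pattern) for pattern in patterns)


print(is_allowed("aegis.agent.create", GRANTED))              # True
print(is_allowed("aegis.workflow.executions.list", GRANTED))  # True
print(is_allowed("aegis.task.execute", GRANTED))              # False
```

Under these assumed semantics, the example context does not cover the aegis.task.* or aegis.system.* tools, so an orchestrator agent that launches tasks or inspects the node would need those patterns added.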

