@ThomasK33 (Member) commented Dec 18, 2025

Implements “sub-workspaces as subagents” by introducing agent Tasks backed by child workspaces spawned via the new task tool.

  • Built-in presets: research, explore
  • Config + UI for max parallel tasks / nesting depth
  • Restart-safe orchestration (queueing, report delivery to parent, auto-resume)
  • Explicit reporting via agent_report + leaf auto-cleanup
  • Sidebar nesting for child workspaces

Validation:

  • make static-check
  • bun test src/node/services/tools/task.test.ts src/node/services/taskService.test.ts

📋 Implementation Plan

🤖 Sub-workspaces as subagents (Mux)

Decisions (confirmed)

  • Lifecycle: auto-delete the subagent workspace after it completes (and after its child tasks complete).
  • Isolation (runtime-aware): create subagent workspaces using the parent workspace’s runtime (runtimeConfig); prefer runtime.forkWorkspace(...) (when implemented) so the child starts from the parent’s branch.
  • Results: when the subagent finishes, it calls agent_report and we post the report back into the parent workspace.
  • Limits (configurable): max parallel subagents + max nesting depth (defaults: 3 parallel, depth 3).
  • Durability: if Mux restarts while tasks are running, tasks resume and the parent awaits existing tasks (no duplicate spawns).
  • Delegation: expose a task tool so any agent workspace can spawn sub-agent tasks (depth-limited).
  • Built-in presets (v1): Research + Explore.

Recommended approach: Workspace Tasks (net +~1700 LoC product code)

Represent each subagent as a Task (as described in subagents.md), implemented as a child workspace plus orchestration.

This keeps the v1 scope small while keeping the API surface task-shaped so we can later reuse it for non-agent tasks (e.g., background bashes).

High-level architecture

flowchart TD
  Parent["Parent workspace"]
  TaskTool["tool: task"]
  Spawn["Task.create(parentId, kind=agent, agentType, prompt)"]
  Child["Child workspace (agent)"]

  ReportTool["tool-call-end: agent_report"]
  Report["Append report message to parent history + emit chat event"]
  Cleanup["Remove child workspace + delete runtime resources"]

  StreamEndNoReport["stream-end (no report)"]
  Reminder["Send reminder: toolPolicy requires agent_report"]

  Parent --> TaskTool --> Spawn --> Child
  Child --> ReportTool --> Report --> Cleanup
  Child --> StreamEndNoReport --> Reminder --> Child

Data model

Alignment with subagents.md (what we’re matching)
  • Agent identity: Claude’s agentId maps cleanly to Mux’s workspaceId for the child workspace.
  • Spawning: Claude’s Task(subagent_type=…, prompt=…) becomes Mux tool task, backed by Task.create({ parentWorkspaceId, kind: "agent", agentType, prompt }).
  • Tool filtering: Claude’s tools/disallowedTools maps to Mux’s existing toolPolicy (applied in order).
  • Result propagation: Agent tasks use an explicit agent_report tool call (child → parent) plus a backend retry if the tool wasn’t called. (Future: bash tasks can map to existing background bash output, or be unified behind a Task.output API.)
  • Background vs foreground: task({ run_in_background: true, ... }) returns immediately; otherwise the tool blocks until the child calls agent_report (with a timeout).

Extend workspace metadata with optional fields:

  • parentWorkspaceId?: string — enables nesting in the UI
  • agentType?: "research" | "explore" | string — selects an agent preset

(These are optional so existing configs stay valid.)

Agent presets (built-in)

Create a small registry of agent presets that define:

  • toolPolicy (enforced)
  • systemPrompt (preset-defined; can replace or append; v1 uses replace so each subagent can fully override the parent’s user instructions)
    • Implementation detail: for agent task workspaces, treat the preset’s systemPrompt as the effective prompt (internal mode), instead of always appending to the parent workspace’s system message.
  • A required reporting mechanism: the agent must call agent_report exactly once when it has a final answer

Initial presets:

  • Research: allow web_search + web_fetch (and optionally file_read), disallow edits.
  • Explore: allow read-only repo exploration (likely file_read + bash for rg/git), disallow file edits.

Both presets should enable:

  • task (so agents can spawn subagents when useful)
  • agent_report (so leaf tasks have a single, explicit channel for reporting back)
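A minimal sketch of what such a preset registry could look like. The rule shape (`pattern` plus `action`), the `allowOnly` helper, the `promptMode` field, and the prompt strings are all assumptions made for illustration; the plan only names toolPolicy, systemPrompt, and the tools each preset allows.

```typescript
// Hypothetical rule shape: an anchored regex source plus an action.
// The real toolPolicy type in Mux may differ.
type ToolRule = { pattern: string; action: "enable" | "disable" };

interface AgentPreset {
  systemPrompt: string;
  promptMode: "replace" | "append"; // v1 uses "replace"
  toolPolicy: ToolRule[];
}

// Deny everything first, then re-enable the listed tools (rules apply in order).
function allowOnly(tools: string[]): ToolRule[] {
  return [
    { pattern: ".*", action: "disable" },
    ...tools.map((tool): ToolRule => ({ pattern: tool, action: "enable" })),
  ];
}

// Tools both built-in presets enable, per the plan.
const SHARED_TOOLS = ["task", "agent_report"];

const AGENT_PRESETS: Record<string, AgentPreset> = {
  research: {
    systemPrompt: "Research the question on the web, then call agent_report exactly once.",
    promptMode: "replace",
    toolPolicy: allowOnly(["web_search", "web_fetch", ...SHARED_TOOLS]),
  },
  explore: {
    systemPrompt: "Explore the repository read-only, then call agent_report exactly once.",
    promptMode: "replace",
    toolPolicy: allowOnly(["file_read", "bash", ...SHARED_TOOLS]),
  },
};

// Returns undefined for unknown agent types (including inherited keys such as "constructor").
function getAgentPreset(agentType: string): AgentPreset | undefined {
  return Object.prototype.hasOwnProperty.call(AGENT_PRESETS, agentType)
    ? AGENT_PRESETS[agentType]
    : undefined;
}
```

Starting each policy from a deny-all rule keeps a preset safe by default when new tools are added later; whether Mux's policy format supports that pattern is not confirmed by the plan.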

Enforce max nesting depth from settings (default 3) in the backend to prevent runaway recursion.

Note: Mux doesn’t currently have a “grep/glob” tool; Explore will either need bash or we add a future safe-search tool.


Implementation steps

1) Schemas + types (IPC boundary)

Net +~50 LoC

  • Extend:
    • WorkspaceMetadataSchema / FrontendWorkspaceMetadataSchema (src/common/orpc/schemas/workspace.ts)
    • WorkspaceConfigSchema (src/common/orpc/schemas/project.ts)
  • Thread the new fields through WorkspaceMetadata / FrontendWorkspaceMetadata types.

2) Persist config (workspace tree + task settings)

Net +~320 LoC

  • Workspace tree fields

    • Ensure config write paths preserve parentWorkspaceId and agentType.
    • Update Config.getAllWorkspaceMetadata() to include the new fields when constructing metadata.
  • Task settings (global; shown in Settings UI)

    • Persist taskSettings in ~/.mux/config.json, e.g.:
      • maxParallelAgentTasks (default 3)
      • maxTaskNestingDepth (default 3)
    • Settings UI
      • Add a small Settings section (e.g. “Tasks”) with two number inputs.
      • Read via api.config.getConfig(); persist via api.config.saveConfig().
      • Clamp to safe ranges (e.g., parallel 1–10, depth 1–5) and show the defaults.
  • Task durability fields (per agent task workspace)

    • Persist a minimal task state for child workspaces (e.g., taskStatus: queued|running|awaiting_report) so we can rehydrate and resume after restart.
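The clamping described above could be sketched as follows. The field names, defaults, and ranges come from the plan; `normalizeTaskSettings`, `clampSetting`, and the limits table are hypothetical helpers, not the merged implementation.

```typescript
interface TaskSettings {
  maxParallelAgentTasks: number;
  maxTaskNestingDepth: number;
}

// Raw values as they might arrive from config.json or a form input.
interface RawTaskSettings {
  maxParallelAgentTasks?: unknown;
  maxTaskNestingDepth?: unknown;
}

interface Range {
  min: number;
  max: number;
}

const DEFAULT_TASK_SETTINGS: TaskSettings = {
  maxParallelAgentTasks: 3,
  maxTaskNestingDepth: 3,
};

// Example safe ranges from the plan: parallel 1-10, depth 1-5.
const TASK_SETTING_LIMITS: Record<keyof TaskSettings, Range> = {
  maxParallelAgentTasks: { min: 1, max: 10 },
  maxTaskNestingDepth: { min: 1, max: 5 },
};

// Non-numeric or non-finite input falls back to the default; numbers are rounded and clamped.
function clampSetting(value: unknown, range: Range, fallback: number): number {
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(range.max, Math.max(range.min, Math.round(value)));
}

function normalizeTaskSettings(raw: RawTaskSettings | undefined): TaskSettings {
  return {
    maxParallelAgentTasks: clampSetting(
      raw?.maxParallelAgentTasks,
      TASK_SETTING_LIMITS.maxParallelAgentTasks,
      DEFAULT_TASK_SETTINGS.maxParallelAgentTasks,
    ),
    maxTaskNestingDepth: clampSetting(
      raw?.maxTaskNestingDepth,
      TASK_SETTING_LIMITS.maxTaskNestingDepth,
      DEFAULT_TASK_SETTINGS.maxTaskNestingDepth,
    ),
  };
}
```

Normalizing on read as well as on save means a hand-edited config file cannot push the limits outside the safe range.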

3) Backend Task API: Task.create

Net +~450 LoC

Add a new task operation (ORPC + service) that is intentionally generic:

  • Task.create({ parentWorkspaceId, kind, ... })
    • Return a task-shaped result: { taskId, kind, status }.

V1: implement kind: "agent" (sub-workspace agent task):

  1. Validate parent workspace exists.
  2. Enforce limits from taskSettings (configurable):
    • Max nesting depth (maxTaskNestingDepth, default 3) by walking the parentWorkspaceId chain.
    • Max parallel agent tasks (maxParallelAgentTasks, default 3) by counting running agent tasks globally (across the app).
    • If parallel limit is reached: persist as status: "queued" and start later (FIFO).
  3. Create a new child workspace ID + generated name (e.g., agent_research_<id>; must match [a-z0-9_-]{1,64}).
  4. Runtime-aware: create the child workspace using the parent workspace’s runtimeConfig (Local/Worktree/SSH).
    • Prefer runtime.forkWorkspace(...) (when implemented) so the child starts from the parent’s branch.
    • Otherwise fall back to runtime.createWorkspace(...) with the same runtime config (no branch isolation).
  5. Write workspace config entry including { parentWorkspaceId, agentType, taskStatus }.
  6. When the task is started, send the initial prompt message into the child workspace.
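Steps 1 to 3 can be sketched as a pure admission check. The names `admitAgentTask`, `nestingDepth`, and `WorkspaceEntry` are hypothetical, and two details are assumptions the plan leaves open: a top-level workspace is treated as depth 0, and only tasks in status "running" occupy a parallel slot.

```typescript
type TaskStatus = "queued" | "running" | "awaiting_report" | "reported";

interface WorkspaceEntry {
  id: string;
  parentWorkspaceId?: string;
  taskStatus?: TaskStatus;
}

type Admission =
  | { ok: true; status: "queued" | "running" }
  | { ok: false; reason: string };

// Generated child names must match this pattern (from the plan).
const TASK_WORKSPACE_NAME = /^[a-z0-9_-]{1,64}$/;

// Depth = number of ancestors found by walking the parentWorkspaceId chain.
function nestingDepth(byId: Map<string, WorkspaceEntry>, workspaceId: string): number {
  const seen = new Set<string>([workspaceId]);
  let depth = 0;
  let current = byId.get(workspaceId);
  while (current !== undefined && current.parentWorkspaceId !== undefined) {
    const parentId = current.parentWorkspaceId;
    if (seen.has(parentId)) throw new Error(`cycle in workspace tree at ${parentId}`);
    seen.add(parentId);
    depth += 1;
    current = byId.get(parentId);
  }
  return depth;
}

function admitAgentTask(
  workspaces: WorkspaceEntry[],
  parentWorkspaceId: string,
  settings: { maxParallelAgentTasks: number; maxTaskNestingDepth: number },
): Admission {
  const byId = new Map(workspaces.map((w) => [w.id, w] as const));
  if (!byId.has(parentWorkspaceId)) {
    return { ok: false, reason: "parent workspace not found" };
  }
  // The new child sits one level below its parent.
  const childDepth = nestingDepth(byId, parentWorkspaceId) + 1;
  if (childDepth > settings.maxTaskNestingDepth) {
    return { ok: false, reason: "max task nesting depth exceeded" };
  }
  // Parallelism is counted globally, not per parent.
  const running = workspaces.filter((w) => w.taskStatus === "running").length;
  return { ok: true, status: running >= settings.maxParallelAgentTasks ? "queued" : "running" };
}
```

Keeping the check free of I/O makes both limits easy to unit test before any workspace or worktree is created.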

Durability / restart:

  • On app startup, rehydrate queued/running tasks from config and resume them:

    • queued tasks are scheduled respecting maxParallelAgentTasks
    • running tasks get a synthetic “Mux restarted; continue + call agent_report” message.
  • Parent await semantics (restart-safe):

    • While a parent workspace has any descendant agent tasks in queued|running|awaiting_report, treat it as “awaiting” and avoid starting new subagent tasks from it.
    • When the final descendant task reports, automatically resume any parent partial stream that was waiting on the task tool call.
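One way the startup rehydration could be planned, shown as a sketch. `planResume`, the `queuedAt` timestamp used for FIFO ordering, and the `ResumePlan` shape are assumptions; the plan only states that queued tasks respect maxParallelAgentTasks and running tasks receive a restart message.

```typescript
type PersistedStatus = "queued" | "running" | "awaiting_report" | "reported";

interface PersistedTask {
  workspaceId: string;
  taskStatus: PersistedStatus;
  queuedAt: number; // hypothetical timestamp used for FIFO ordering
}

interface ResumePlan {
  nudge: string[]; // running tasks that get the "Mux restarted; continue" message
  start: string[]; // queued tasks that fit into the free slots
  stillQueued: string[]; // queued tasks that keep waiting
}

function planResume(tasks: PersistedTask[], maxParallelAgentTasks: number): ResumePlan {
  const running = tasks.filter((t) => t.taskStatus === "running");
  // filter() returns a fresh array, so sorting here does not mutate the input.
  const queued = tasks
    .filter((t) => t.taskStatus === "queued")
    .sort((a, b) => a.queuedAt - b.queuedAt);
  const freeSlots = Math.max(0, maxParallelAgentTasks - running.length);
  return {
    nudge: running.map((t) => t.workspaceId),
    start: queued.slice(0, freeSlots).map((t) => t.workspaceId),
    stillQueued: queued.slice(freeSlots).map((t) => t.workspaceId),
  };
}
```

Because the plan is derived from persisted state alone, running it twice after a restart yields the same result, which helps avoid duplicate spawns.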

Design note: keep the return type “task-shaped” (e.g., { taskId, kind, status }) so we can later add kind: "bash" tasks that wrap existing background bashes.

4) Tool: task (agents can spawn sub-agents)

Net +~250 LoC

Expose a Claude-like Task tool to the LLM (but backed by Mux workspaces):

  • Tool: task

    • Input (v1): { subagent_type: string, prompt: string, description?: string, run_in_background?: boolean }

    • Behavior:

      • Spawn (or enqueue) a child agent task via Task.create({ parentWorkspaceId: <current workspaceId>, kind: "agent", agentType: subagent_type, prompt, ... }).

      • If run_in_background is true: return { status: "queued" | "running", taskId } immediately.

      • Otherwise: block (potentially across queue + execution) until the child calls agent_report (or timeout) and return { status: "completed", reportMarkdown }.

      • Durability: if this foreground wait is interrupted (app restart), the child task continues; when it reports, we persist the tool output into the parent message and auto-resume the parent stream.

    • Wire-up: add to TOOL_DEFINITIONS + register in getToolsForModel(); inject taskService into ToolConfiguration so the tool can call Task.create.

  • Guardrails

    • Enforce maxTaskNestingDepth and maxParallelAgentTasks from settings (defaults: depth 3, parallel 3).
      • If parallel limit is reached, new tasks are queued and the parent blocks/awaits until a slot is available.
    • Disallow spawning new tasks after the workspace has called agent_report.

5) Enforce preset tool policy + system prompt

Net +~130 LoC

In the backend send/stream path:

  • Compute an effective tool policy:
    • effectivePolicy = [...(options.toolPolicy ?? []), ...presetPolicy]
    • Apply presetPolicy last so callers cannot re-enable restricted tools.
  • System prompt strategy for agent task workspaces (per preset):
    • Replace (default): ignore the parent workspace’s user instructions and use the preset’s systemPrompt as the effective instructions (internal-only agent mode).
    • Implementation: add an internal system-message variant (e.g., "agent") that starts from an empty base prompt (no user custom instructions), then apply preset.systemPrompt.
    • Append (optional): keep the normal workspace system message and append preset instructions.
  • Ensure the preset prompt covers:
    • When/how to delegate via the task tool (available subagent_types).
    • When/how to call agent_report (final answer only; after any spawned tasks complete).
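The ordering argument for the effective policy can be illustrated with a small evaluator in which the last matching rule wins. The rule shape, `isToolEnabled`, and the tool name `file_edit_replace` are hypothetical; only the merge expression itself comes from the plan.

```typescript
// Hypothetical rule shape: an anchored regex source plus an action.
type PolicyRule = { pattern: string; action: "enable" | "disable" };

// Preset rules go last so they override anything the caller supplied.
function mergeToolPolicy(
  callerPolicy: PolicyRule[] | undefined,
  presetPolicy: PolicyRule[],
): PolicyRule[] {
  return [...(callerPolicy ?? []), ...presetPolicy];
}

// Rules are applied in order; the last rule whose pattern matches decides.
// Assumes every pattern is a valid regular expression source.
function isToolEnabled(policy: PolicyRule[], toolName: string, enabledByDefault = true): boolean {
  let enabled = enabledByDefault;
  for (const rule of policy) {
    if (new RegExp(`^(?:${rule.pattern})$`).test(toolName)) {
      enabled = rule.action === "enable";
    }
  }
  return enabled;
}

const presetRules: PolicyRule[] = [{ pattern: "file_edit_.*", action: "disable" }];
const callerRules: PolicyRule[] = [{ pattern: "file_edit_.*", action: "enable" }];
const effectivePolicy = mergeToolPolicy(callerRules, presetRules);
```

With this order, `isToolEnabled(effectivePolicy, "file_edit_replace")` is false even though the caller tried to enable the tool; reversing the spread order would let the caller win.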

6) Auto-report back + auto-delete (orchestrator)

Net +~450 LoC

Add a small reporting tool + orchestrator that ensures the child reports back explicitly, and make it durable across restarts.

  • Tool: agent_report

    • Input: { reportMarkdown: string, title?: string } (or similar)
    • Execution: no side effects; return { success: true } (the backend uses the tool-call args as the report payload)
    • Wire-up: add to TOOL_DEFINITIONS + register in getToolsForModel() as a non-runtime tool
  • Orchestrator behavior

    • Primary path: handle tool-call-end for agent_report

      1. Validate workspaceId is an agent task workspace and has parentWorkspaceId.
      2. Persist completion (durable):
        • Update child workspace config: taskStatus: "reported" (+ reportedAt).
      3. Deliver report to the parent (durable):
        • Append an assistant message to the parent workspace history (so the user can read the report).
        • If the parent has a partial assistant message containing a pending task tool call, update that tool part from input-available to output-available with { reportMarkdown, title } (like the ask_user_question restart-safe fallback).
        • Emit tool-call-end + workspace.onChat events so the UI updates immediately.
      4. Auto-resume the parent (durable tool call semantics):
        • If the parent has a partial message and no active stream, call workspace.resumeStream(parent) after writing the tool output.
        • Only auto-resume once the parent has no remaining running descendant tasks (so it doesn’t spawn duplicates).
      5. Cleanup:
        • If the task has no remaining child tasks, delete the workspace + runtime resources (branch/worktree if applicable).
        • Otherwise, mark it pending cleanup and delete it once its subtree is gone.
    • Enforcement path: if a stream ends without an agent_report call

      1. Send a synthetic "please report now" message into the child workspace with a toolPolicy that requires only agent_report.
      2. If still missing after one retry, fall back to posting the child's final text parts (last resort) and clean up to avoid hanging sub-workspaces.
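The primary and enforcement paths reduce to a small decision made when a child stream ends. The sketch below assumes a hypothetical `ChildStreamEnd` summary and a reminder counter; the real orchestrator reacts to tool-call-end and stream-end events rather than a single record.

```typescript
type StreamEndAction =
  | { kind: "deliver_report"; reportMarkdown: string; title?: string }
  | { kind: "send_reminder" }
  | { kind: "fallback_report"; reportMarkdown: string };

// Hypothetical summary of what the orchestrator knows when a child stream ends.
interface ChildStreamEnd {
  agentReportArgs?: { reportMarkdown: string; title?: string };
  finalText: string; // concatenated final text parts of the child
  remindersSent: number;
}

// The plan allows one retry before falling back.
const MAX_REPORT_REMINDERS = 1;

function decideOnStreamEnd(end: ChildStreamEnd): StreamEndAction {
  if (end.agentReportArgs !== undefined) {
    // Primary path: the tool-call args are the report payload.
    return { kind: "deliver_report", ...end.agentReportArgs };
  }
  if (end.remindersSent < MAX_REPORT_REMINDERS) {
    // Enforcement path: ask again with a policy that requires agent_report.
    return { kind: "send_reminder" };
  }
  // Last resort: post the child's final text so the parent is not left hanging.
  return { kind: "fallback_report", reportMarkdown: end.finalText };
}
```

Persisting `remindersSent` with the task state would keep the single-retry limit intact across restarts; the plan does not say where that counter lives.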

7) UI: nested sidebar rows

Net +~100 LoC

  • Update sorting/rendering so child workspaces appear directly below the parent with indentation.
  • Add a small depth prop to WorkspaceListItem and adjust left padding.
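A sketch of the flattening step that would feed the depth prop. `flattenWorkspaceTree` and the row shape are hypothetical; treating a workspace whose parent is missing as a top-level row is an assumption, not something the plan specifies.

```typescript
interface SidebarWorkspace {
  id: string;
  parentWorkspaceId?: string;
}

interface SidebarRow {
  id: string;
  depth: number; // 0 for top-level workspaces; drives left padding
}

// Depth-first flattening: each child appears directly below its parent,
// and siblings keep their input order.
function flattenWorkspaceTree(workspaces: SidebarWorkspace[]): SidebarRow[] {
  const knownIds = new Set(workspaces.map((w) => w.id));
  const childrenOf = new Map<string, SidebarWorkspace[]>();
  const roots: SidebarWorkspace[] = [];

  for (const workspace of workspaces) {
    const parentId = workspace.parentWorkspaceId;
    if (parentId === undefined || !knownIds.has(parentId)) {
      roots.push(workspace); // orphans render as top-level rows
      continue;
    }
    const siblings = childrenOf.get(parentId) ?? [];
    siblings.push(workspace);
    childrenOf.set(parentId, siblings);
  }

  const rows: SidebarRow[] = [];
  const visit = (workspace: SidebarWorkspace, depth: number): void => {
    rows.push({ id: workspace.id, depth });
    for (const child of childrenOf.get(workspace.id) ?? []) {
      visit(child, depth + 1);
    }
  };
  for (const root of roots) visit(root, 0);
  return rows;
}
```

A list item could then derive its indentation from `depth`, for example a fixed number of pixels per level.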

8) No user-facing launcher (agent-orchestrated only)

Net +~0 LoC

  • Do not add slash commands / command palette actions for spawning tasks.
  • Tasks are launched exclusively via the model calling the task tool from the parent workspace.

9) Tests

~200 LoC tests (not counted in product LoC estimate)

  • Unit test: workspace tree flattening preserves parent→child adjacency.
  • Unit/integration test: task tool spawns/enqueues a child agent task and enforces maxTaskNestingDepth.
  • Unit/integration test: queueing respects maxParallelAgentTasks (extra tasks stay queued until a slot frees).
  • Unit/integration test: agent_report posts report to parent, updates waiting task tool output (restart-safe), and triggers cleanup (and reminder path when missing).
  • Unit test: toolPolicy merge guarantees presets can’t be overridden.

Follow-ups (explicitly out of scope for v1)

  • More presets (Review, Writer). “Writer” likely needs non-auto-delete so the branch/diff persists.
  • Task.create(kind: "bash") tasks that wrap existing background bashes (and optionally render under the parent like agent tasks).
  • Safe “code search” tools (Glob/Grep) to avoid granting bash to Explore.
  • Deeper nesting UX (collapse/expand, depth cap visuals).

Generated with codex cli • Model: gpt-5.2 • Thinking: xhigh

@ThomasK33 force-pushed the codex-cli-subagents branch 3 times, most recently from ce6d41b to 1c1ae0d on December 18, 2025 13:11
@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. You're on a roll.

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Another round soon, please!

@ThomasK33
Member Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Nice work!

@ThomasK33 ThomasK33 added this pull request to the merge queue Dec 20, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 20, 2025
Change-Id: I98401f98f52a9ba82adc854ef796fa7da0494553
Signed-off-by: Thomas Kosiewski <[email protected]>
- Queue agent tasks without creating worktrees until dequeued.
- Persist queued prompts in metadata and render them in queued workspaces without auto-resume/backoff.
- Disable task/task_* tools once maxTaskNestingDepth is reached.

Signed-off-by: Thomas Kosiewski <[email protected]>

---
_Generated with `mux` • Model: unknown • Thinking: unknown_
<!-- mux-attribution: model=unknown thinking=unknown -->

Change-Id: Icf17d2634b2aff2061f75b44fdd8a6b63b887247
Consolidate shared thinking policy and reduce tool/task scaffolding.

Signed-off-by: Thomas Kosiewski <[email protected]>

---
_Generated with `mux` • Model: gpt-5.2 • Thinking: unknown_
<!-- mux-attribution: model=gpt-5.2 thinking=unknown -->

Change-Id: Id77858efe746d9b7551ab266f98886dc7712a5f3
Consolidate repeated task graph traversal maps and share task queue debug logging across services.

Signed-off-by: Thomas Kosiewski <[email protected]>

---
_Generated with `mux` • Model: gpt-5.2 • Thinking: unknown_
<!-- mux-attribution: model=gpt-5.2 thinking=unknown -->

Change-Id: I767343544a6d54d3e5bc284c4eac807e3e435780
Change-Id: I83a14ee950e087ba93bf1e6f964ba97c2e65f150
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I8a3433b010c7b657a26e2ecd81e493f77b7ca680
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I0b89b2cc25cb1c1f6eb7ee84b96723397ac5c3c3
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: If1559cc6776d7d5668d11f23ca2acb93e984ffe1
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Ie853bc24cc969052136abad09f41025d2101abbc
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I07b5be1009713ab9b61ccec8a65741c21370d5e0
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I64ae7ece054cefa9d4862beabe565bc32a47a921
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Icfab3a441e0dc7979e720a84468a2905b2255a19
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I7486265efb796be3831f5e97692dfba2f0fb163e
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I5363d2c940e341f3c6b0623603564388eb777c83
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I638aef9035b6de82d84dde49314ba15f81ee9332
Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I5cf741604f12af5586f6245ae890678405a402a7
Signed-off-by: Thomas Kosiewski <[email protected]>
- Enforce subagent_type enum (explore|exec)
- Make subagents non-recursive; Plan Mode spawns explore only
- Hide Exec sub-agent settings; exec always inherits

---
_Generated with  • Model: unknown • Thinking: unknown_
<!-- mux-attribution: model=unknown thinking=unknown -->

Change-Id: Ib7e8e5b9340bc9a1117e349f696779caea60ac18
Signed-off-by: Thomas Kosiewski <[email protected]>
Fixes Storybook test flake by scoping the "Exec" absence check to the Sub-agents section (avoid matching the global mode selector).

---
_Generated with  • Model: unknown • Thinking: unknown_
<!-- mux-attribution: model=unknown thinking=unknown -->

Change-Id: Ie4298b1cefc43d5866b878da06b3fcf4562636c1
Signed-off-by: Thomas Kosiewski <[email protected]>
Fixes TS typecheck failure by removing a duplicate `log` import.

Signed-off-by: Thomas Kosiewski <[email protected]>

---
_Generated with `mux` • Model: unknown • Thinking: unknown_
<!-- mux-attribution: model=unknown thinking=unknown -->

Change-Id: I2390570dced5f8f6875fab325035755c7a80bf9b
Signed-off-by: Thomas Kosiewski <[email protected]>
@ThomasK33 ThomasK33 enabled auto-merge December 20, 2025 20:24
@ThomasK33 ThomasK33 added this pull request to the merge queue Dec 20, 2025
Merged via the queue into main with commit ca2367a Dec 20, 2025
20 checks passed
@ThomasK33 ThomasK33 deleted the codex-cli-subagents branch December 20, 2025 20:40