-
Notifications
You must be signed in to change notification settings - Fork 35
🤖 refactor: consolidate bash execution + await/output tools #1308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
7860d42 to
986955f
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
215bd13 to
db76e83
Compare
|
Fixed in db76e83 - |
|
@codex review Follow-up: fixed static-check by aligning RefreshController unit tests with the current (post-revert) RefreshController API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
mux/src/node/services/tools/task_await.ts
Lines 174 to 176 in f84415d
| try { | |
| const report = await taskService.waitForAgentReport(taskId, { | |
| timeoutMs, |
task_await now allows timeout_secs=0 (per schema/docs for non‑blocking bash), but createTaskAwaitTool forwards timeoutMs directly into taskService.waitForAgentReport. When timeout_secs is 0 and any agent task is included (explicitly or via the default “all active tasks”), waitForAgentReport asserts timeoutMs > 0 and throws, so those agent results come back as error instead of queued|running|awaiting_report. Consider treating 0 as “no wait” for agent tasks (return current status) or defaulting to the 10‑minute wait when timeout_secs is 0.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
Fixed in cac38ca.
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@codex review Addressed the truncation metadata issue for
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@codex review |
|
Codex Review: Didn't find any major issues. Another round soon, please! ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
84a1d42 to
ee8d4c4
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
0523720 to
a5fc33c
Compare
|
@codex review Fixed pending task finalization to skip Also extended workspace activity snapshots to include the last thinking level for the VS Code extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
bbe9ddd to
853b615
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
dc653d6 to
81cd629
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
10b63d2 to
4a2ea12
Compare
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
4a2ea12 to
ec44579
Compare
Change-Id: Iae7e5a1f29002e2ccbc5bf1e5f82c2e72c975400 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I6a731b8665141f37c4fa4bbdf80302db201d1171 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I057eac2e131fd9e8a00281c028ae3aee6c49b3e9 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Ib254ceb7cadcc2632bcd9f249767c0d1004b1167 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I5f986c0596dee69aa17c901ffb94322fa390fb69 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Ib563d7fe7a5882aed0a61f338147e43e747bb5a2 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I44d0d38c54c9136a4f2459abe21b87d8c30bb45f Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I64827f3a8e4305fbe3a46978fa49ad78523e4662 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Ibb874d6bbb7d38a458024deb1ed710d58c87d975 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I836b13fa243fd999eac70ae006d47501f57e30a6 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I3129f859e0f44bf6eaed3855a6a8222ed107dc4a Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I3dc061b4728cc6cf70f680628401eb02c540d876 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I5145c268728aa0476723eb1753e56e687d9e81af Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Ica79dd1f0b0f528331922bb4bdeabb9da1222395 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I470711377e0d439be2724d4ed909e18ef76bd0cc Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I4fe6af0fe450b12528335bbb04310dbe37d866cf Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Ib1a91a17f9d88b5b02314d1e608135887e835d90 Signed-off-by: Thomas Kosiewski <[email protected]>
- Make task tool agent-only (spawn child workspaces) - Keep bash as the shell tool; background bash now returns taskId="bash:<processId>" - Update tool policies + tests to call bash directly; keep task_await unified Signed-off-by: Thomas Kosiewski <[email protected]> --- _Generated with [`mux`](https://github.com/coder/mux) • Model: `openai:gpt-5.2` • Thinking: `xhigh`_ Change-Id: Ia398d1ceccf403313ef6ff3b454eb72004d1cde5
Change-Id: Ib79033d160d8c2a0883e96035e8e5c6adbbc8035 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I4a1facddeb9208fc76fa6c5518ec0a30b4308330 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Idd97ea5edaf559f75b0ed4b39933e06c3c0abab3 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I7fc46dbbb1cab4ae6df6c89fac5c3d2ee9a57f08 Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: Id7967e79ada797727244f419274f34daa100450a Signed-off-by: Thomas Kosiewski <[email protected]>
Change-Id: I0d2daaa2c42c59e43be1e1e1d96067e8cca094d7 Signed-off-by: Thomas Kosiewski <[email protected]>
7a02a2b to
db508a8
Compare
Change-Id: If30ef50d80f9768371f7b52aa86500d7840b0ac5 Signed-off-by: Thomas Kosiewski <[email protected]>
This PR consolidates the await/output surface for background bash processes and agent tasks, while keeping the
bash(shell execution) andtask(subagent spawn) tools separate:task: spawn sub-agent workspaces onlybash: execute shell commands onlytask_await/task_list/task_terminate: remain the unified surface for waiting / rediscovering / terminating both agent tasks and background bash processesKey changes
taskremains agent-only (nokind="bash").bash(run_in_background=true)now returnstaskId: "bash:<processId>"(in addition tobackgroundProcessId) so callers cantask_await/task_terminatewithout needing bash-specific wrappers.bashas its own tool (no moretask(kind="bash")special-casing).bashinstead oftask(kind="bash").Compatibility / behavior notes
task(kind="bash")is intentionally not supported anymore; it will be rejected by the tool schema and backend.task(kind="bash")are handled best-effort by the UI (converted/rendered as bash output when possible).Verification
make static-check📋 Implementation Plan
Plan: Consolidate waiting/output for bash + agent tasks under
task_awaitGoal
Keep the CC-style split, but consolidate waiting/output:
task: spawn sub-agent workspaces only (subagent_type,prompt,title,run_in_background).bash: execute shell commands only (script,timeout_secs,run_in_background,display_name).task_await: remain the single “wait/output” tool for both agent tasks and background bash processes.Recommended approach net LoC (product code only): ~-60 to -110 LoC (mostly deleting the
task(kind="bash")branch + schema fields, plus a small additive tweak tobashbackground results).Why this is needed (context)
We want a single, obvious tool for shell execution (
bash) and a single, obvious tool for waiting/output (task_await) across both agent tasks and background bash processes. This avoids models reaching for bash-specific wrappers whentask_awaitis the unified output surface.Should we prefix agent task IDs as
agent:<workspaceId>?Recommendation: don’t (for now).
task_awaitneeds a cheap routing signal.taskId: "bash:<processId>"tobashbackground results gets us the consistency win (bothtaskandbashreturn ataskIdusable bytask_await) without changing agent ID semantics.If we ever want stronger typing, prefer adding an explicit
kind: "agent" | "bash"field totask_listresults (low-risk) rather than changing the agent task ID format.Implementation Steps
1) Split tool input surface in
TOOL_DEFINITIONS(agent-onlytask)Files:
src/common/utils/tools/toolDefinitions.tssrc/common/utils/tools/tools.tsActions:
taskagent-only again:TaskToolArgsSchema = TaskToolAgentArgsSchema.TaskToolBashArgsSchemaand thesuperRefineunion enforcement wrapper.TOOL_DEFINITIONS.task.description: remove “Unified … (bash)” wording and only document agent spawning.TOOL_DEFINITIONS.bash.description: update background guidance to point attask_await/task_list/task_terminate(notbash_output/bash_background_*).TaskToolResultSchemato agent-only:TaskToolCompletedResultSchema(bashResult,exitCode,note,truncated).2) Remove bash dispatch from backend
tasktoolFile:
src/node/services/tools/task.tsActions:
task(kind="bash")execution branch.createBashTool,resolveBashDisplayName,toBashTaskId,BashToolResult, andformatBashReport().3) Return
taskIdfrombashbackground results (directly awaitable)Files:
src/common/utils/tools/toolDefinitions.ts(result schema)src/node/services/tools/bash.tssrc/node/services/tools/taskId.ts(import-only)Actions:
BashToolResultSchema:taskId: z.string()to the background-success variant, wheretaskIdis the prefixed form (bash:<processId>).bash.ts:run_in_background=true), includetaskId: toBashTaskId(spawnResult.processId).taskId: toBashTaskId(processId).4) Keep
task_awaitunified (no behavior change), but fix UX affordancesFiles:
src/node/services/tools/task_await.ts(likely no code changes)src/common/utils/tools/taskToolTypeGuards.tssrc/browser/stores/WorkspaceStore.tssrc/browser/components/tools/*wheretask(kind="bash")is special-casedActions:
taskIdprefix (bash:→ backgroundProcessManager; else agent task).task(kind="bash"):TaskToolArgshavingkind.unknown → LegacyTaskBashArgsguard for rendering old history.isTaskKindBashToolCall()withisBashLikeToolCall():toolName === "bash"→ bashtoolName === "task" && input.kind === "bash"→ legacy bash-in-tasktoolName === "task" && result.taskId startsWith("bash:")/bashResult→ legacy fallback5) Update token accounting buckets
Files:
src/common/utils/tokens/tokenStatsCalculator.tssrc/common/utils/tokens/tokenStatsCalculator.test.tsActions:
getConsumerInfoForToolCall():toolName === "bash"is its own consumer bucket.toolName === "task"andinput.kind === "bash", label astask (bash legacy).6) Update tests and fixtures to use
bash, nottask(kind="bash")Targeted updates (non-exhaustive but high-signal):
src/common/utils/tools/toolDefinitions.test.tstaskaccepts{ kind: "bash" … }.taskrejects bash-shaped input.src/node/services/tools/task.bash.test.tscreateTaskTool()withkind:"bash".task_await,task_list, andtask_terminatework withtaskId="bash:<processId>".tests/ipc/runtimeExecuteBash.test.tsbash(instead oftask).toolName === "bash"(noargs.kind).tests/ipc/sendMessage.context.test.ts,tests/ipc/ollama.test.tssrc/node/services/taskService.test.tstoolName:"task" input:{kind:"bash",…}withtoolName:"bash" input:{script,…}.Update any strict equality assertions for background bash results to allow / expect the new
taskIdfield.7) Verification
Run a tight set of checks after changes:
bun test src/node/services/tools/task.test.tsbun test src/node/services/tools/task.bash.test.tsbun test src/common/utils/tools/toolDefinitions.test.tsbun test src/common/utils/tokens/tokenStatsCalculator.test.tsTEST_INTEGRATION=1 bun x jest tests/ipc/runtimeExecuteBash.test.ts -t "execute simple bash commands"Generated with
mux• Model:openai:gpt-5.2• Thinking:xhigh