plan: orchestration.md v2.2 — phase-scoped Administrator push allowlist (67d1276a) · Commits · Vacaliuc, Bogdan / tasking

plan/orchestration.md

+78 −49

		@@ -105,13 +105,21 @@ as the historical record of plans + learnings; it is never merged
		to `{base-branch}` and may be deleted post-effort at the user's
		discretion.

		Administrator is read-only on the remote (v2.1). The
		Administrator agent does not push, fetch, delete, tag, or otherwise
		modify any ref on `{remote}`. Phase 1 verification of write
		capability (initialization.md §2.5) is performed either by the user
		manually or by the v1 standalone `Initialization-prompt.md`
		fallback session, which retains its narrow `init-check-*` scratch
		namespace.
		Administrator's push capability is phase-scoped (v2.2). During
		Phase 1 (Initialization) only, the Administrator may push and
		delete `init-check-<YYYYMMDD-HHMMSS>-*` scratch refs to verify
		write capability per initialization.md §2.5; all such refs are
		deleted at the end of Phase 1. Once the Administrator declares
		"Phase 1 complete," its in-memory push allowlist drops to empty
		for the remainder of the session — **Phase 2 (Active monitoring)
		is strictly read-only** (cheap `git ls-remote` only; no push, no
		delete, no fetch). This phase-scoped split was the v2.2 walk-back
		of v2.1's blanket read-only restriction (which had pushed §2.5
		verification onto the user or the v1 fallback at the cost of
		ergonomics — see redesign §17.7). The v1 standalone
		`Initialization-prompt.md` fallback remains for re-verification
		after credential rotation without disturbing a running Phase 2
		loop.
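The phase-scoped rule in the added paragraph above can be read as a one-way state machine. The following is a minimal sketch under stated assumptions, not the Administrator's actual implementation: the class name `PushAllowlist`, its methods, and the regular expression are hypothetical, and the sketch assumes refs are checked by short name (without a `refs/heads/` or `refs/tags/` prefix).

```python
import re
from enum import Enum


class Phase(Enum):
    INITIALIZATION = 1  # Phase 1: init-check-* pushes and deletions allowed
    MONITORING = 2      # Phase 2: allowlist is empty


# Hypothetical pattern for init-check-<YYYYMMDD-HHMMSS>-* scratch refs.
# Assumption: the timestamp is 8 digits, a hyphen, then 6 digits.
INIT_CHECK = re.compile(r"init-check-\d{8}-\d{6}-\S+")


class PushAllowlist:
    """In-memory allowlist that can only shrink; it never grows back."""

    def __init__(self) -> None:
        self._phase = Phase.INITIALIZATION

    @property
    def phase(self) -> Phase:
        return self._phase

    def declare_phase1_complete(self) -> None:
        # One-way transition: there is deliberately no method that
        # returns the object to Phase.INITIALIZATION.
        self._phase = Phase.MONITORING

    def may_push(self, ref: str) -> bool:
        """Return True only for init-check scratch refs during Phase 1."""
        if self._phase is not Phase.INITIALIZATION:
            return False
        return INIT_CHECK.fullmatch(ref) is not None


allowlist = PushAllowlist()
print(allowlist.may_push("init-check-20260401-120000-branch"))
print(allowlist.may_push("feature/some-slug"))
allowlist.declare_phase1_complete()
print(allowlist.may_push("init-check-20260401-120000-branch"))
```

Because the only state change offered is `declare_phase1_complete()`, re-verification after a credential rotation has to go through a separate session, which matches the role the paragraph assigns to the v1 fallback.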

		Base branch for feature branches: `{base-branch}` — see §7 for
		the knob. For this effort, `{base-branch}` is `new_workflow_ui_plan`
		@@ -172,7 +180,7 @@ branch for the `*-learning.md` pile.
		\| `qa/{slug}` \| Developer pushes tag \| "Ready for QA" signal \|
		\| `review/{slug}` \| Integrator pushes tag \| "Tests failed; read todo.md on feature/{slug}" \|
		\| `review/{slug}-escalate` \| Analyst pushes (annotated) tag \| Human attention required; retry cap hit \|
		\| `init-check-<YYYYMMDD-HHMMSS>-*` \| v1 fallback only: standalone Initialization session creates (and deletes) \| Scratch refs for the §2.5 write-capability check; NOT used by Administrator in v2.1 (Administrator is read-only — see §3) \|
		\| `init-check-<YYYYMMDD-HHMMSS>-*` \| Administrator (Phase 1 only) or v1 standalone Initialization session creates (and deletes) \| Scratch refs for the §2.5 write-capability check; ephemeral — deleted at end of Phase 1 (Administrator) or at end of the v1 session. The Administrator's authorization to push these refs is phase-scoped and dropped before Phase 2 begins (see §3 / §6.4 / §8) \|

		`{effort-name}` for this effort is `new_workflow-repairs-2026-04`.

		@@ -315,27 +323,30 @@ poll for its liveness, and do not depend on its push allowlist).
		```
		┌──────────────── Administrator ────────────┐
		│ │
		user starts ─────┼── PHASE 1 — Initialization (READ-ONLY v2.1) │
		│ Walk initialization.md §1, §2.1-§2.4, │
		│ §3-§11 — read-only checks only. │
		│ §2.5 (write-capability verification) is │
		│ NOT done here in v2.1 — Administrator │
		│ does not push. The user runs §2.5 │
		│ manually or launches the v1 standalone│
		│ Initialization-prompt.md session for │
		│ it. Phase 1 asks the user to confirm │
		│ write capability has been verified │
		│ (last 24 h or since cred rotation). │
		user starts ─────┼── PHASE 1 — Initialization (v2.2) │
		│ Walk initialization.md §1-§11 in full, │
		│ including §2.5 write-capability check.│
		│ Use a timestamped scratch ref namespace │
		│ init-check-<YYYYMMDD-HHMMSS> for §2.5 │
		│ pushes; delete every init-check-* ref │
		│ (remote and local) at end of Phase 1. │
		│ Phase-1 push allowlist (per §8 v2.2): │
		│ init-check-<YYYYMMDD-HHMMSS>-* refs │
		│ and their deletions; nothing else. │
		│ If DRY_RUN=1 also walk dry-run.md §5.1. │
		│ If any section fails, STOP — hand the │
		│ user a specific action item; do NOT │
		│ enter phase 2. │
		│ On success, report "Phase 1 complete. │
		│ Ready to launch worker agents." │
		│ Ready to launch worker agents." This │
		│ drops the in-memory push allowlist to │
		│ empty for the remainder of the session│
		│ — the transition is irrevocable per │
		│ §8 v2.2. │
		│ │
		user confirms ───┼── PHASE 2 — Active monitoring │
		worker startup │ (No push at phase 2 start — v2.1. │
		│ Administrator is read-only.) │
		user confirms ───┼── PHASE 2 — Active monitoring (READ-ONLY) │
		worker startup │ Push allowlist is now empty — strictly │
		│ read-only on the remote. ls-remote only.│
		│ │
		poll (TM secs) ──┼── Read {admin-state-dir}/agent-state- │
		│ {Analyst,Developer,Integrator}.json │
		@@ -361,17 +372,18 @@ poll for its liveness, and do not depend on its push allowlist).
		│ large reads) unless user asks. │
		│ Resume the poll loop after answering. │
		│ │
		│ Administrator NEVER (v2.1 — read-only): │
		│ Administrator NEVER (any phase): │
		│ - restarts a stalled worker │
		│ - rewrites worker state files │
		│ - opens PRs/MRs │
		│ - pushes any protocol ref │
		│ (analysis/, triage/, feature/, qa/, │
		│ review/, {base-branch}) │
		│ Phase-2 specifically NEVER: │
		│ - pushes ANY ref to {remote} │
		│ (no protocol refs, no init-check-*, │
		│ no admin-status-*, no informational │
		│ tags — Administrator is read-only on │
		│ the repository per §3 v2.1 contract) │
		│ (no init-check-*, no anything) │
		│ - deletes any ref on {remote} │
		│ - fetches objects (cheap ls-remote only)│
		│ - fetches objects (ls-remote only) │
		└───────────────────────────────────────────┘
		```
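The Phase 2 "ls-remote only" rule in the diagram above could be enforced by a guard in front of every git command addressed to the remote. This is a hedged sketch only: `check_phase2_remote_command` and `ReadOnlyViolation` are hypothetical names, and the guard assumes the subcommand is always the second argv element, so invocations that use global options such as `git -C <dir> ...` are rejected rather than parsed.

```python
from typing import List, Sequence

# Assumption: ls-remote is the only git subcommand the Administrator
# may run against the remote during Phase 2.
PHASE2_ALLOWED_SUBCOMMANDS = frozenset({"ls-remote"})


class ReadOnlyViolation(Exception):
    """Raised when a Phase 2 command would do more than ls-remote."""


def check_phase2_remote_command(argv: Sequence[str]) -> List[str]:
    """Return argv unchanged if it is a permitted Phase 2 remote command."""
    if len(argv) < 2 or argv[0] != "git":
        raise ValueError("expected an argv of the form ['git', <subcommand>, ...]")
    subcommand = argv[1]
    if subcommand not in PHASE2_ALLOWED_SUBCOMMANDS:
        # push, fetch, pull, and anything else are refused outright.
        raise ReadOnlyViolation(
            f"'git {subcommand}' is not permitted in Phase 2; escalate to the user"
        )
    return list(argv)


print(check_phase2_remote_command(["git", "ls-remote", "origin"]))
try:
    check_phase2_remote_command(["git", "fetch", "origin"])
except ReadOnlyViolation as exc:
    print(exc)
```

The guard refuses by default rather than enumerating forbidden subcommands, so a new runbook step that needs a write fails loudly and can be escalated as the text prescribes.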

		@@ -504,20 +516,32 @@ user approval.
		- `review/{slug}-escalate` annotated tag (Analyst — terminal,
		never auto-deleted)

		**v1-fallback allowlist (Initialization-prompt.md only, NOT
		Administrator):**

		- `init-check-<YYYYMMDD-HHMMSS>-*` branches and tags, plus their
		deletions, are pushed by the v1 standalone Initialization session
		for the §2.5 write-capability check. The Administrator agent in
		v2.1 does not use this namespace.
		Administrator allowlist (v2.2 — phase-scoped):

		- Phase 1 only: `init-check-<YYYYMMDD-HHMMSS>-*` branches and
		tags (plus their deletions) for the initialization.md §2.5
		write-capability check. All such refs are deleted at the end of
		Phase 1; the in-memory allowlist drops to empty when the
		Administrator declares "Phase 1 complete." The transition is
		irrevocable — re-entering Phase 1 logic mid-session is not
		permitted; for re-init after a credential rotation, the user
		launches the v1 standalone `Initialization-prompt.md` session
		instead (which retains the same narrow allowlist).
		- Phase 2: empty. The Administrator is strictly read-only
		on `{remote}` — `git ls-remote` only; no push, no delete, no
		fetch. If a runbook step appears to require Administrator to
		write to the remote during Phase 2, that is a bug in the runbook
		— escalate to the user.
		- Always denied (any phase): every protocol ref (`analysis/`,
		`triage/`, `feature/`, `qa/`, `review/`, `{base-branch}`),
		PR/MR creation, and any ref outside the `init-check-*`
		namespace.

		Administrator (v2.1): no push allowlist. The Administrator
		agent is read-only on `{remote}`. It never pushes, deletes, fetches
		objects, or otherwise modifies any ref. Its only network operation
		on `{remote}` is `git ls-remote` (cheap, read-only). If a runbook
		step appears to require Administrator to write to the remote, that
		is a bug in the runbook — escalate to the user.
		v1-fallback allowlist (Initialization-prompt.md): the v1
		standalone session retains the same `init-check-*` push allowlist
		(plus deletions) as Administrator Phase 1. Use it for
		re-verification after credential rotation without disturbing a
		running Phase 2 loop, or for three-session-only deployments.
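The end-of-Phase-1 cleanup named in the allowlist above (delete every `init-check-*` ref) can be sketched as a filter over `git ls-remote` output followed by a single deletion push. This is an illustration under assumptions, not the runbook's procedure: the function names and the regular expression are hypothetical, and the sketch only builds the argv list without running git.

```python
import re
from typing import List

# Hypothetical pattern: init-check scratch refs as branches or tags.
INIT_CHECK_REF = re.compile(r"refs/(?:heads|tags)/init-check-\d{8}-\d{6}-\S+")


def init_check_refs(ls_remote_output: str) -> List[str]:
    """Pick init-check scratch refs out of `git ls-remote` output.

    Each output line has the form "<object-id>\t<refname>".
    """
    refs = []
    for line in ls_remote_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue
        ref = parts[1]
        # Annotated tags also appear as peeled "<ref>^{}" entries; skip them.
        if ref.endswith("^{}"):
            continue
        if INIT_CHECK_REF.fullmatch(ref):
            refs.append(ref)
    return refs


def deletion_command(remote: str, refs: List[str]) -> List[str]:
    """Build one `git push --delete` argv, or [] when nothing is left."""
    if not refs:
        return []
    return ["git", "push", "--delete", remote, *refs]


sample = (
    "1111111\trefs/heads/init-check-20260401-120000-branch\n"
    "2222222\trefs/heads/feature/some-slug\n"
    "3333333\trefs/tags/init-check-20260401-120000-tag\n"
)
print(deletion_command("origin", init_check_refs(sample)))
```

Running `init_check_refs` once more after the deletion and expecting an empty list is one way to confirm the namespace is clean before "Phase 1 complete" is declared.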

		Still requires explicit user confirmation:

		@@ -1091,8 +1115,8 @@ role's prompt into its session.

		\| Role \| Robust default \| Cost-efficient alternative \| Why \|
		\|---\|---\|---\|---\|
		\| Initialization (deprecated; v1 fallback only — Administrator phase 1 covers most checks read-only) \| Sonnet 4.6 / `medium` \| Haiku 4.5 / `medium` \| Procedural checklist with occasional diagnostic suggestions when a section fails. Sonnet reads tool stderr well and offers actionable fixes. Haiku is fine for the happy path but weaker on diagnosing why something failed. \|
		\| Administrator \| Sonnet 4.6 / `medium` \| Haiku 4.5 / `medium` \| Phase 1 is the procedural Initialization checklist (Sonnet's strength). Phase 2 is small-summary aggregation across four state files plus a single ls-remote — Sonnet handles the "is anything stalled?" synthesis cheaply. Haiku is viable when phase-1 diagnostics aren't expected to fire (stable env). Do NOT bump to Opus: this role is read-only on the repo; reasoning depth is not the bottleneck. \|
		\| Initialization (deprecated for routine startup — Administrator Phase 1 walks the same checklist with phase-scoped push allowlist; v1 fallback retained for re-init after credential rotation or three-session-only deployments) \| Sonnet 4.6 / `medium` \| Haiku 4.5 / `medium` \| Procedural checklist with occasional diagnostic suggestions when a section fails. Sonnet reads tool stderr well and offers actionable fixes. Haiku is fine for the happy path but weaker on diagnosing why something failed. \|
		\| Administrator \| Sonnet 4.6 / `medium` \| Haiku 4.5 / `medium` \| Phase 1 is the procedural Initialization checklist (Sonnet's strength), with a narrow phase-scoped push allowlist for §2.5 verification. Phase 2 is small-summary aggregation across four state files plus a single ls-remote — Sonnet handles the "is anything stalled?" synthesis cheaply. Haiku is viable when Phase-1 diagnostics aren't expected to fire (stable env). Do NOT bump to Opus: Phase 2 is read-only on the repo; reasoning depth is not the bottleneck. \|
		\| Analyst \| Opus 4.7 / `max` \| — do not downgrade — \| Plan-quality is the highest-leverage variable in the protocol. Every Analyst error propagates to Developer + Integrator cycles; the retry math (see below) makes max-effort Opus the cheap option in expectation. \|
		\| Developer \| Opus 4.7 / `xhigh` \| Sonnet 4.6 / `max` \| Implementation under a clear spec. Opus reduces test-failure retries; Sonnet at max is competent for spec-driven implementation but produces a small uptick in retry rate. \|
		\| Integrator \| Opus 4.7 / `xhigh` \| Sonnet 4.6 / `high` \| Test-failure-diagnosis quality (the todo.md hypothesis ranking) determines how easy the Analyst's retry will be. Opus produces sharper hypotheses; Sonnet is fine for happy-path PR/MR creation and weaker on failure ranking. \|
		@@ -1314,10 +1338,15 @@ is complete, not every case is auto-resolved.

		1. In <path-to-session-4>, launch `claude`. Set model/effort per
		§9.5 (Administrator row), then paste the Administrator opening
		prompt (§9.6 → `plan/Administrator-prompt.md`). Wait for
		"Phase 1 complete. Ready to launch worker agents." If phase 1
		reports a failure, fix the cited issue and ask the Administrator
		to re-run phase 1 — do NOT proceed to step 2.
		prompt (§9.6 → `plan/Administrator-prompt.md`). The Administrator
		walks initialization.md §1-§11 in full (read-only checks plus
		§2.5 write-capability verification under its phase-scoped
		`init-check-` allowlist per §8 v2.2; all init-check- refs are
		deleted at end of Phase 1). Wait for "Phase 1 complete. Ready to
		launch worker agents." — at that point Administrator drops to
		strictly read-only on the remote for the rest of the session. If
		Phase 1 reports a failure, fix the cited issue and ask the
		Administrator to re-run Phase 1 — do NOT proceed to step 2.
		2. In <path-to-session-1>, launch `claude`. Paste the
		Analyst opening prompt (§9.1).
		3. Wait for the Analyst to finish the initial triage (four triage