Commit 67d1276a authored by Vacaliuc, Bogdan

plan: orchestration.md v2.2 — phase-scoped Administrator push allowlist



Walks back v2.1's blanket "Administrator is read-only on the
repository" restriction. The v2.1 design pushed initialization.md
§2.5 (write-capability verification) onto the user or the v1
fallback at the cost of ergonomics — the spirit of the reviewer's
original concern (minimize Administrator's blast radius) is
preserved with a strictly *phase-scoped* allowlist instead.

§3 contract:
  - "Administrator is read-only on the remote (v2.1)" replaced
    with "Administrator's push capability is phase-scoped (v2.2)".
  - Phase 1: may push and delete init-check-<YYYYMMDD-HHMMSS>-*
    refs (deleted at end of Phase 1).
  - Phase 2: empty allowlist (strictly read-only; ls-remote only).
  - Transition is irrevocable: re-init after credential rotation
    uses the v1 standalone Initialization-prompt.md fallback.

§5 naming conventions: init-check-* row restored as
"Administrator (Phase 1) or v1 standalone Initialization session";
both authorize the same scratch namespace.

§6.4 Administrator state machine:
  - Phase 1 box updated to walk initialization.md §1-§11 in full
    (including §2.5 push verification) under the phase-scoped
    allowlist; explicit "delete every init-check-* ref at end of
    Phase 1" cleanup; explicit irrevocable allowlist drop at the
    "Phase 1 complete" boundary.
  - Phase 2 box explicitly READ-ONLY.
  - "NEVER (any phase)" denial block split into "any phase"
    (protocol refs, PRs, worker restarts/state-rewrites) vs
    "Phase 2 specifically" (any push at all).

§8 push allowlist:
  - Replaced the v2.1 "Administrator (v2.1): no push allowlist"
    block with v2.2 phase-scoped split: Phase 1 init-check-* /
    Phase 2 empty / always-denied protocol refs.
  - The transition is documented as irrevocable.
  - v1-fallback note: same allowlist as Administrator Phase 1;
    use case narrowed to re-verify after credential rotation
    without disturbing a running Phase 2 loop, or three-session-
    only deployments.

§9.5 model/effort table:
  - Initialization row: clarify "deprecated for routine startup"
    (v1 fallback retained for re-init / three-session deployments).
  - Administrator row: clarify Phase 1 has a "narrow phase-scoped
    push allowlist for §2.5 verification"; Phase 2 read-only.

§13.1 runbook step 1: explicit "walks initialization.md §1-§11 in
full ... all init-check-* refs are deleted at end of Phase 1 ...
Administrator drops to strictly read-only on the remote for the
rest of the session."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent db752b91
+78 −49
@@ -105,13 +105,21 @@ as the historical record of plans + learnings; it is never merged
to `{base-branch}` and may be deleted post-effort at the user's
discretion.

**Administrator is read-only on the remote (v2.1).** The
Administrator agent does not push, fetch, delete, tag, or otherwise
modify any ref on `{remote}`. Phase 1 verification of write
capability (initialization.md §2.5) is performed either by the user
manually or by the v1 standalone `Initialization-prompt.md`
fallback session, which retains its narrow `init-check-*` scratch
namespace.
**Administrator's push capability is phase-scoped (v2.2).** During
**Phase 1 (Initialization)** only, the Administrator may push and
delete `init-check-<YYYYMMDD-HHMMSS>-*` scratch refs to verify
write capability per initialization.md §2.5; all such refs are
deleted at the end of Phase 1. Once the Administrator declares
"Phase 1 complete," its in-memory push allowlist drops to **empty**
for the remainder of the session — **Phase 2 (Active monitoring)
is strictly read-only** (cheap `git ls-remote` only; no push, no
delete, no fetch). This phase-scoped split was the v2.2 walk-back
of v2.1's blanket read-only restriction (which had pushed §2.5
verification onto the user or the v1 fallback at the cost of
ergonomics — see redesign §17.7). The v1 standalone
`Initialization-prompt.md` fallback remains for re-verification
after credential rotation without disturbing a running Phase 2
loop.
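
The phase-scoped rule above can be read as a single predicate over (phase, ref). The following is a minimal sketch, not the Administrator's actual implementation: the `Phase` enum, the `may_push` name, the use of fully qualified `refs/heads/` and `refs/tags/` names, and the exact regular expression are all assumptions made for illustration.

```python
import re
from enum import Enum


class Phase(Enum):
    INITIALIZATION = 1  # Phase 1
    MONITORING = 2      # Phase 2


# Assumed shape: init-check-<YYYYMMDD-HHMMSS>-<suffix>, as a branch or a tag.
SCRATCH_REF = re.compile(r"^refs/(?:heads|tags)/init-check-\d{8}-\d{6}-.+$")


def may_push(phase: Phase, ref: str) -> bool:
    """Return True only for init-check scratch refs during Phase 1."""
    if phase is not Phase.INITIALIZATION:
        return False  # the Phase 2 allowlist is empty
    return SCRATCH_REF.match(ref) is not None
```

Because deletions of scratch refs are covered by the same namespace, a caller would presumably apply the same check before both a push and a delete.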

**Base branch for feature branches:** `{base-branch}` — see §7 for
the knob. For **this** effort, `{base-branch}` is `new_workflow_ui_plan`
@@ -172,7 +180,7 @@ branch for the `*-learning.md` pile.
| `qa/{slug}` | Developer pushes tag | "Ready for QA" signal |
| `review/{slug}` | Integrator pushes tag | "Tests failed; read todo.md on feature/{slug}" |
| `review/{slug}-escalate` | Analyst pushes (**annotated**) tag | Human attention required; retry cap hit |
| `init-check-<YYYYMMDD-HHMMSS>-*` | **v1 fallback only**: standalone Initialization session creates (and deletes) | Scratch refs for the §2.5 write-capability check; **NOT used by Administrator** in v2.1 (Administrator is read-only — see §3) |
| `init-check-<YYYYMMDD-HHMMSS>-*` | Administrator (Phase 1 only) **or** v1 standalone Initialization session creates (and deletes) | Scratch refs for the §2.5 write-capability check; ephemeral — deleted at end of Phase 1 (Administrator) or at end of the v1 session. The Administrator's authorization to push these refs is **phase-scoped** and dropped before Phase 2 begins (see §3 / §6.4 / §8) |
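
As an illustration of the `init-check-<YYYYMMDD-HHMMSS>-*` naming row, the sketch below derives one timestamped namespace and a branch/tag pair under it. The `-branch` and `-tag` suffixes, the function names, and the choice to pass the clock in explicitly are hypothetical; the plan only fixes the `init-check-<YYYYMMDD-HHMMSS>-` prefix.

```python
from datetime import datetime


def scratch_namespace(now: datetime) -> str:
    """Build the init-check-<YYYYMMDD-HHMMSS> prefix for one Phase 1 run."""
    return "init-check-" + now.strftime("%Y%m%d-%H%M%S")


def scratch_refs(now: datetime) -> dict:
    """Return one scratch branch and one scratch tag (suffixes are hypothetical)."""
    ns = scratch_namespace(now)
    return {
        "branch": f"refs/heads/{ns}-branch",
        "tag": f"refs/tags/{ns}-tag",
    }
```

Taking the timestamp once per run and reusing it keeps every scratch ref from that run under a single prefix, which makes the end-of-phase cleanup a simple prefix match.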

`{effort-name}` for this effort is `new_workflow-repairs-2026-04`.

@@ -315,27 +323,30 @@ poll for its liveness, and do not depend on its push allowlist).
```
                  ┌──────────────── Administrator ────────────┐
                  │                                           │
 user starts ─────┼── PHASE 1 — Initialization (READ-ONLY v2.1) │
                  │   Walk initialization.md §1, §2.1-§2.4,   │
                  │     §3-§11 — read-only checks only.       │
                  │   §2.5 (write-capability verification) is │
                  │     NOT done here in v2.1 — Administrator │
                  │     does not push. The user runs §2.5     │
                  │     manually or launches the v1 standalone│
                  │     Initialization-prompt.md session for  │
                  │     it. Phase 1 asks the user to confirm  │
                  │     write capability has been verified    │
                  │     (last 24 h or since cred rotation).   │
 user starts ─────┼── PHASE 1 — Initialization (v2.2)         │
                  │   Walk initialization.md §1-§11 in full,  │
                  │     including §2.5 write-capability check.│
                  │   Use a timestamped scratch ref namespace │
                  │     init-check-<YYYYMMDD-HHMMSS> for §2.5 │
                  │     pushes; delete every init-check-* ref │
                  │     (remote and local) at end of Phase 1. │
                  │   Phase-1 push allowlist (per §8 v2.2):   │
                  │     init-check-<YYYYMMDD-HHMMSS>-* refs   │
                  │     and their deletions; nothing else.    │
                  │   If DRY_RUN=1 also walk dry-run.md §5.1. │
                  │   If any section fails, STOP — hand the   │
                  │     user a specific action item; do NOT   │
                  │     enter phase 2.                        │
                  │   On success, report "Phase 1 complete.   │
                  │     Ready to launch worker agents."       │
                  │     Ready to launch worker agents." This  │
                  │     drops the in-memory push allowlist to │
                  │     empty for the remainder of the session│
                  │     — the transition is irrevocable per   │
                  │     §8 v2.2.                              │
                  │                                           │
 user confirms ───┼── PHASE 2 — Active monitoring             │
 worker startup   │   (No push at phase 2 start — v2.1.       │
                  │    Administrator is read-only.)           │
 user confirms ───┼── PHASE 2 — Active monitoring (READ-ONLY) │
 worker startup   │   Push allowlist is now empty — strictly  │
                  │   read-only on the remote. ls-remote only.│
                  │                                           │
 poll (TM secs) ──┼── Read {admin-state-dir}/agent-state-     │
                  │     {Analyst,Developer,Integrator}.json   │
@@ -361,17 +372,18 @@ poll for its liveness, and do not depend on its push allowlist).
                  │   large reads) unless user asks.          │
                  │   Resume the poll loop after answering.   │
                  │                                           │
                  │ Administrator NEVER (v2.1 — read-only):   │
                  │ Administrator NEVER (any phase):          │
                  │   - restarts a stalled worker             │
                  │   - rewrites worker state files           │
                  │   - opens PRs/MRs                         │
                  │   - pushes any protocol ref               │
                  │     (analysis/, triage/, feature/, qa/,   │
                  │      review/, {base-branch})              │
                  │ Phase-2 specifically NEVER:               │
                  │   - pushes ANY ref to {remote}            │
                  │     (no protocol refs, no init-check-*,   │
                  │      no admin-status-*, no informational  │
                  │      tags — Administrator is read-only on │
                  │      the repository per §3 v2.1 contract) │
                  │     (no init-check-*, no anything)        │
                  │   - deletes any ref on {remote}           │
                  │   - fetches objects (cheap ls-remote only)│
                  │   - fetches objects (ls-remote only)      │
                  └───────────────────────────────────────────┘
```
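
The one-way Phase 1 to Phase 2 transition in the diagram could be modelled roughly as follows. This is a sketch under stated assumptions: the `AdministratorSession` class and its method names are hypothetical, the prefix check is looser than the timestamped pattern, and refusing to complete Phase 1 while scratch refs remain is one possible reading of "delete every init-check-* ref at end of Phase 1".

```python
class AdministratorSession:
    """Tracks the Administrator phase and its in-memory push allowlist."""

    def __init__(self) -> None:
        self.phase = 1
        self.allowlist = frozenset({"init-check-"})  # Phase 1 scratch namespace

    def may_push(self, ref_name: str) -> bool:
        # An empty allowlist (Phase 2) denies everything.
        return any(ref_name.startswith(prefix) for prefix in self.allowlist)

    def complete_phase_1(self, remaining_scratch_refs: list) -> str:
        if self.phase != 1:
            raise RuntimeError("Phase 1 already complete; transition is irrevocable")
        if remaining_scratch_refs:
            raise RuntimeError(f"delete scratch refs first: {remaining_scratch_refs}")
        self.phase = 2
        self.allowlist = frozenset()  # dropped for the remainder of the session
        return "Phase 1 complete. Ready to launch worker agents."
```

There is deliberately no method that restores the allowlist; re-verification after a credential rotation goes through the v1 standalone session instead.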

@@ -504,20 +516,32 @@ user approval.
- `review/{slug}-escalate` annotated tag (Analyst — terminal,
  never auto-deleted)

**v1-fallback allowlist (Initialization-prompt.md only, NOT
Administrator):**

- `init-check-<YYYYMMDD-HHMMSS>-*` branches and tags, plus their
  deletions, are pushed by the v1 standalone Initialization session
  for the §2.5 write-capability check. The Administrator agent in
  v2.1 does **not** use this namespace.
**Administrator allowlist (v2.2 — phase-scoped):**

- **Phase 1 only:** `init-check-<YYYYMMDD-HHMMSS>-*` branches and
  tags (plus their deletions) for the initialization.md §2.5
  write-capability check. All such refs are deleted at the end of
  Phase 1; the in-memory allowlist drops to empty when the
  Administrator declares "Phase 1 complete." The transition is
  **irrevocable** — re-entering Phase 1 logic mid-session is not
  permitted; for re-init after a credential rotation, the user
  launches the v1 standalone `Initialization-prompt.md` session
  instead (which retains the same narrow allowlist).
- **Phase 2:** **empty.** The Administrator is strictly read-only
  on `{remote}`: `git ls-remote` only; no push, no delete, no
  fetch. If a runbook step appears to require Administrator to
  write to the remote during Phase 2, that is a bug in the runbook
  — escalate to the user.
- **Always denied (any phase):** every protocol ref (`analysis/`,
  `triage/`, `feature/`, `qa/`, `review/`, `{base-branch}`),
  PR/MR creation, and any ref outside the `init-check-*`
  namespace.

**Administrator (v2.1): no push allowlist.** The Administrator
agent is read-only on `{remote}`. It never pushes, deletes, fetches
objects, or otherwise modifies any ref. Its only network operation
on `{remote}` is `git ls-remote` (cheap, read-only). If a runbook
step appears to require Administrator to write to the remote, that
is a bug in the runbook — escalate to the user.
**v1-fallback allowlist (Initialization-prompt.md):** the v1
standalone session retains the same `init-check-*` push allowlist
(plus deletions) as Administrator Phase 1. Use it for
re-verification after credential rotation without disturbing a
running Phase 2 loop, or for three-session-only deployments.
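
For the end-of-Phase-1 cleanup (or the equivalent step in the v1 session), one way to find leftover scratch refs is to filter `git ls-remote` output, which lists one `<object-id><TAB><ref>` pair per line. The sketch below only parses text and builds argument lists; it does not run git. The function names are hypothetical, and the remote name passed in stands in for `{remote}`.

```python
def leftover_scratch_refs(ls_remote_output: str) -> list:
    """Return fully qualified init-check-* branch and tag refs from ls-remote output."""
    found = []
    for line in ls_remote_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        ref = parts[1]
        if ref.endswith("^{}"):
            continue  # peeled entry for an annotated tag, not a separate ref
        if not ref.startswith(("refs/heads/", "refs/tags/")):
            continue
        short_name = ref.split("/", 2)[2]
        if short_name.startswith("init-check-"):
            found.append(ref)
    return found


def delete_commands(remote: str, refs: list) -> list:
    """Build one `git push <remote> --delete <ref>` argument list per ref."""
    return [["git", "push", remote, "--delete", ref] for ref in refs]
```

An empty result from `leftover_scratch_refs` is the condition under which declaring "Phase 1 complete" would be consistent with the cleanup requirement.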

**Still requires explicit user confirmation:**

@@ -1091,8 +1115,8 @@ role's prompt into its session.

| Role | Robust default | Cost-efficient alternative | Why |
|---|---|---|---|
| Initialization (deprecated; v1 fallback only — Administrator phase 1 covers most checks read-only) | Sonnet 4.6 / `medium` | Haiku 4.5 / `medium` | Procedural checklist with occasional diagnostic suggestions when a section fails. Sonnet reads tool stderr well and offers actionable fixes. Haiku is fine for the happy path but weaker on diagnosing why something failed. |
| **Administrator** | **Sonnet 4.6 / `medium`** | **Haiku 4.5 / `medium`** | Phase 1 is the procedural Initialization checklist (Sonnet's strength). Phase 2 is small-summary aggregation across four state files plus a single ls-remote — Sonnet handles the "is anything stalled?" synthesis cheaply. Haiku is viable when phase-1 diagnostics aren't expected to fire (stable env). Do NOT bump to Opus: this role is read-only on the repo; reasoning depth is not the bottleneck. |
| Initialization (deprecated for routine startup — Administrator Phase 1 walks the same checklist with phase-scoped push allowlist; v1 fallback retained for re-init after credential rotation or three-session-only deployments) | Sonnet 4.6 / `medium` | Haiku 4.5 / `medium` | Procedural checklist with occasional diagnostic suggestions when a section fails. Sonnet reads tool stderr well and offers actionable fixes. Haiku is fine for the happy path but weaker on diagnosing why something failed. |
| **Administrator** | **Sonnet 4.6 / `medium`** | **Haiku 4.5 / `medium`** | Phase 1 is the procedural Initialization checklist (Sonnet's strength), with a narrow phase-scoped push allowlist for §2.5 verification. Phase 2 is small-summary aggregation across four state files plus a single ls-remote — Sonnet handles the "is anything stalled?" synthesis cheaply. Haiku is viable when Phase-1 diagnostics aren't expected to fire (stable env). Do NOT bump to Opus: Phase 2 is read-only on the repo; reasoning depth is not the bottleneck. |
| **Analyst** | **Opus 4.7 / `max`** | **— do not downgrade —** | Plan-quality is the highest-leverage variable in the protocol. Every Analyst error propagates to Developer + Integrator cycles; the retry math (see below) makes max-effort Opus the cheap option in expectation. |
| Developer | Opus 4.7 / `xhigh` | Sonnet 4.6 / `max` | Implementation under a clear spec. Opus reduces test-failure retries; Sonnet at max is competent for spec-driven implementation but produces a small uptick in retry rate. |
| Integrator | Opus 4.7 / `xhigh` | Sonnet 4.6 / `high` | Test-failure-diagnosis quality (the todo.md hypothesis ranking) determines how easy the Analyst's retry will be. Opus produces sharper hypotheses; Sonnet is fine for happy-path PR/MR creation and weaker on failure ranking. |
@@ -1314,10 +1338,15 @@ is complete, not every case is auto-resolved.

1. In <path-to-session-4>, launch `claude`. Set model/effort per
   §9.5 (Administrator row), then paste the Administrator opening
   prompt (§9.6 → `plan/Administrator-prompt.md`). Wait for
   "Phase 1 complete. Ready to launch worker agents." If phase 1
   reports a failure, fix the cited issue and ask the Administrator
   to re-run phase 1 — do NOT proceed to step 2.
   prompt (§9.6 → `plan/Administrator-prompt.md`). The Administrator
   walks initialization.md §1-§11 in full (read-only checks plus
   §2.5 write-capability verification under its phase-scoped
   `init-check-*` allowlist per §8 v2.2; all init-check-* refs are
   deleted at end of Phase 1). Wait for "Phase 1 complete. Ready to
   launch worker agents." — at that point Administrator drops to
   strictly read-only on the remote for the rest of the session. If
   Phase 1 reports a failure, fix the cited issue and ask the
   Administrator to re-run Phase 1 — do NOT proceed to step 2.
2. In <path-to-session-1>, launch `claude`. Paste the
   Analyst opening prompt (§9.1).
3. Wait for the Analyst to finish the initial triage (four triage