Commit 0c5b518a authored by Vacaliuc, Bogdan

add Initialization agent findings for lr_reduction-new_workflow-repairs dry run



Documents all three init runs (2026-04-25, -27, -28): the four issues
encountered (missing tools, 780-violation ruff baseline, BLE001 commit
loop, HTTPS credential gap), their root causes, resolutions, and lessons.
Includes final section summary, dry-run configuration, test-slug matrix,
and known follow-up items.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
parent 74c5a6b9
# Initialization agent findings — lr_reduction-new_workflow-repairs dry run

**Date:** 2026-04-28 / 2026-04-29
**Clone:** `/media/ssd2/Projects/Claude/4` (uvdl3, session 4)
**Initialization runs:** 3 (initial on 2026-04-25, partial re-run on 2026-04-27, clean pass on 2026-04-28)
**Final status:** READY — all 10 sections + dry-run pre-flight pass

---

## Environment baseline (as of clean pass)

| Item | Value |
|---|---|
| Host | uvdl3 |
| Clone | `/media/ssd2/Projects/Claude/4/lr_reduction` |
| Remote (`{remote}`) | `agentic` → `https://github.com/bvacaliuc/LiquidsReflectometer.git` |
| Base branch | `new_workflow_ui_plan` |
| git | 2.34.1 |
| pixi | 0.48.1 |
| gh | 2.91.0 (token scopes: gist, read:org, repo, workflow) |
| glab | 1.93.0 (not needed — remote is GitHub) |
| jq | 1.6 |
| pre-commit | 4.5.1 via `pixi run pre-commit` (not in global PATH — acceptable) |
| SSH agent | 3 keys loaded (ed25519 code.ornl.gov, RSA, ed25519 vtwin) |
| Credential helper | `gh auth git-credential` scoped to `https://github.com` |
| Commit signing | disabled |
| Submodule | `tests/data/liquidsreflectometer-data` initialized at 872ef741 (heads/main) |
| Test collection | 281 tests via `pixi run test-reduction -- --collect-only` |
| Pre-commit baseline | fully clean (all checks passed) |

---

## Issues encountered and resolutions

### Issue 1 — Missing optional tools (2026-04-25, run 1)

**Symptom:** `gh`, `glab`, `jq` not in PATH.

**Impact:** §7 (platform/PR path) blocked; REST API fallback probe returned 401.

**Resolution:** User installed `gh`, `glab`, `jq`. Run 3 confirmed all present.

**Lesson for future init runs:** these tools are not in the base system image. Add installation to `setup/bootstrap.sh` or document as a prerequisite.
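
A prerequisite check along these lines could run at the start of an init pass. This is a minimal sketch: the function name `missing_tools` and the injectable `which` parameter are illustrative assumptions, not part of the existing setup scripts.

```python
import shutil
from typing import Callable, Iterable, Optional

# Tools that run 1 found missing from PATH.
OPTIONAL_TOOLS = ("gh", "glab", "jq")


def missing_tools(
    names: Iterable[str] = OPTIONAL_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools from `names` that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    absent = missing_tools()
    if absent:
        print("missing prerequisites:", ", ".join(absent))
    else:
        print("all optional tools present")
```

Passing `which` as a parameter keeps the check testable without touching the real PATH.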

---

### Issue 2 — Pre-commit baseline failing: 780 ruff violations (2026-04-25, run 1)

**Symptom:** `pixi run pre-commit run --all-files` exited non-zero with 780 ruff violations. Ruff auto-fixed 28 violations in 16 files, dirtying the working tree.

**Root cause:** The bvacaliuc fork's `new_workflow_ui_plan` branch had diverged from `upstream/new_workflow` and lost the entire `[tool.ruff.lint] ignore` list from `pyproject.toml`. Upstream's ignore list suppresses 14 rule codes (E402, E501, E722, N802, N803, N806, N812, N815, N999, F403, F405, F821, E741, E743) for patterns that are established in this scientific Qt codebase. Without the ignore list, the fork enforced all rules against code that was never written to comply with them.

**Why upstream isn't blocked:** upstream relies on `pre-commit.ci` (a hosted CI service) to run the hooks on pull requests and push auto-fixes; individual developers do not run hooks locally. Additionally, upstream's 14-rule ignore list covers most of the violations.

**Resolution sequence:**
1. Restored the upstream ignore list in `pyproject.toml` and added three further entries: `BLE001` (the optional-import fallback pattern using `except Exception:` is intentional and cannot be auto-fixed), `ARG001`/`ARG002`/`ARG005` (Qt slot callbacks, pytest fixtures), and `N801` (established class names `Direct_Beam`, `NR_Reduction`).
2. Applied `pixi run ruff check --fix .` + `pixi run ruff format .` across 41 files (46 auto-fixes, 39 reformatted).
3. Fixed remaining 16 violations in 5 deferred files manually: E702 (semicolons → separate lines), E712 (bool comparisons → truthiness or `# noqa`), F841 (unused variables → `# noqa` or `_` binding).
4. Committed in 4 commits on branch `new_workflow_ui_plan_with_ruff`, merged into `new_workflow_ui_plan`.

**Final state:** `pixi run pre-commit run --all-files` → all checks passed.

**Lesson for future init runs:** if the fork's `pyproject.toml` diverges from upstream's ruff config, check `git diff upstream/new_workflow -- pyproject.toml` first. The ignore list must be at least a superset of upstream's for the fork's code to have a passing baseline.

---

### Issue 3 — User commit loop: `--exit-non-zero-on-fix` + BLE001 (2026-04-27)

**Symptom:** User ran `pixi run ruff check --fix .` then staged all files and tried `git commit`. Commit failed with "files were modified by this hook" + 780 remaining violations → re-staged → commit failed again on BLE001 → permanent loop.

**Root cause — two mechanisms combined:**
- `--exit-non-zero-on-fix` in `.pre-commit-config.yaml`: ruff exits 1 whenever it makes auto-fixes, even successful ones. This is by design: the non-zero exit aborts the commit so that the files ruff modified can be re-staged. One retry after re-staging is therefore expected.
- `BLE001` is not auto-fixable (`except Exception:` → specific exception requires human judgment). After re-staging, ruff still finds BLE001 → exits 1 → no way out.

**Resolution:** Adding `BLE001` to the ignore list broke the loop. With that rule ignored, `--exit-non-zero-on-fix` produces only the expected single re-stage cycle.

**Lesson:** when a pre-commit ruff hook loops, check for violations in the "not auto-fixable" category. If they're established patterns (not bugs), add them to `ignore` rather than fighting them per-commit.
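
The interaction of the two mechanisms can be illustrated with a toy model. This is not ruff's or pre-commit's code: `attempts_until_commit` is a hypothetical helper, and which rules count as fixable is supplied by the caller.

```python
def attempts_until_commit(violations, fixable, ignored=frozenset(), max_attempts=5):
    """Count commit attempts in a toy model of a fixing hook that exits
    non-zero whenever it changes files or reports violations.

    Returns the attempt number that succeeds, or None if no attempt
    succeeds within `max_attempts`.
    """
    remaining = [v for v in violations if v not in ignored]
    for attempt in range(1, max_attempts + 1):
        fixed = [v for v in remaining if v in fixable]
        remaining = [v for v in remaining if v not in fixable]
        if not fixed and not remaining:
            return attempt  # hook changed nothing and found nothing
        # Otherwise the hook exits 1: it either modified files (re-stage
        # needed) or reported violations it cannot fix.
    return None


# One auto-fixable rule plus BLE001, which needs human judgment.
print(attempts_until_commit(["F401", "BLE001"], fixable={"F401"}))
print(attempts_until_commit(["F401", "BLE001"], fixable={"F401"}, ignored={"BLE001"}))
```

In this model the first call never succeeds, while the second succeeds on the second attempt, which corresponds to the single re-stage cycle.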

---

### Issue 4 — HTTPS push credential missing (2026-04-28, run 3)

**Symptom:** `git push agentic HEAD:refs/heads/${scratch}-branch` → `fatal: could not read Username for 'https://github.com': No such device or address`.

**Root cause:** `~/.gitconfig` had `credential.helper=` (empty string), which clears all credential helpers globally. `gh auth login` had been run (token stored in `~/.config/gh/hosts.yml`) but `gh auth setup-git` had not been run to wire gh as the git credential provider. The `ls-remote` worked because GitHub public repos allow unauthenticated reads; push requires auth.

**Resolution:** User ran `gh auth setup-git`, which added:
```
[credential "https://github.com"]
    helper =
    helper = !/usr/bin/gh auth git-credential
```
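The URL-scoped section applies to `https://github.com` remotes: its empty `helper =` line resets any helper list inherited from the global config, and the following line registers `gh` as the provider. The subsequent scratch-ref test passed all 6 operations.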

**Lesson:** `gh auth login` alone is not sufficient for `git push` over HTTPS; `gh auth setup-git` must also be run. Add a pre-flight note to the §3 checklist: "if using gh for HTTPS credentials, verify `gh auth setup-git` was run, not just `gh auth login`."
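
The helper resolution behind this issue can be sketched as a small model. It assumes git's documented rule that an empty `helper` value resets the list collected so far, and it simplifies URL matching to a plain prefix test (real git compares protocol, host and path); `effective_helpers` is a hypothetical name.

```python
def effective_helpers(entries, url):
    """Resolve the credential helper list that would apply to `url`.

    `entries` is a list of (scope, value) pairs in config-file order.
    `scope` is None for a plain `credential.helper` entry, or a URL
    prefix for a `credential.<url>.helper` entry.
    """
    helpers = []
    for scope, value in entries:
        if scope is not None and not url.startswith(scope):
            continue  # URL-scoped entry that does not apply to this remote
        if value == "":
            helpers = []  # an empty value clears everything collected so far
        else:
            helpers.append(value)
    return helpers


remote = "https://github.com/bvacaliuc/LiquidsReflectometer.git"
before = [(None, "")]  # global `credential.helper=` only
after = before + [
    ("https://github.com", ""),
    ("https://github.com", "!/usr/bin/gh auth git-credential"),
]
print(effective_helpers(before, remote))
print(effective_helpers(after, remote))
```

With only the global empty helper the list is empty, so git has to prompt for a username; after `gh auth setup-git` the list for github.com contains the `gh` helper.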

---

## Dry-run pre-flight (§5.1) — results

| Check | Result |
|---|---|
| `{dry-run-prefix}` namespace empty (`dry-run-2026-04-28-*/*`) | ✓ no refs found |
| Annotated tag push capability | ✓ verified with init-check annotated tag |
| Branch and tag delete capability | ✓ verified with init-check branch and tag |
| Network-loss simulation | out-of-scope (not mechanically verifiable without kernel network namespace tooling) |
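
The namespace-empty check can be approximated by filtering `git ls-remote` output. This is a sketch under stated assumptions: the ref layout in the example is hypothetical, and `refs_in_namespace` is an illustrative helper rather than the tooling used in the run.

```python
from fnmatch import fnmatchcase


def refs_in_namespace(ls_remote_output: str, pattern: str) -> list[str]:
    """Return refs from `git ls-remote` output whose short name matches `pattern`."""
    matches = []
    for line in ls_remote_output.splitlines():
        _sha, _, ref = line.partition("\t")
        if not ref:
            continue  # blank or malformed line
        short = ref
        for prefix in ("refs/heads/", "refs/tags/"):
            if ref.startswith(prefix):
                short = ref[len(prefix):]
                break
        if fnmatchcase(short, pattern):
            matches.append(ref)
    return matches
```

An empty result for the pattern `dry-run-2026-04-28-*/*` corresponds to the "no refs found" row in the table.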

---

## Orchestration readiness — final section summary

```
1. Local tools:        git 2.34.1 ✓, pixi 0.48.1 ✓, ssh/agent 3 keys ✓, curl 7.81.0 ✓,
                       gh 2.91.0 ✓, glab 1.93.0 ✓, jq 1.6 ✓,
                       pre-commit 4.5.1 via pixi ✓

2. Remote agentic:     read/branch-push/annotated-tag-push/branch-delete/tag-delete: ALL OK ✓
                       (scratch ref init-check-20260428T112216)

3. Agent auth:         gh token (repo+workflow scope) ✓; gh credential helper ✓;
                       3 SSH keys loaded ✓

4. Pre-commit hooks:   all checks passed ✓ (baseline fully clean)

5. Submodule init:     liquidsreflectometer-data at 872ef741 (heads/main) ✓

6. Commit signing:     disabled ✓

7. Platform:           github — gh 2.91.0 ✓; gh api user → bvacaliuc ✓

8. Test env:           pixi env present ✓; 281 tests collected ✓

9. Scratch refs:       cleaned — 0 local, 0 remote ✓

Dry-run pre-flight:    namespace empty ✓; annotated tag ✓; delete ✓;
                       network-loss: out-of-scope
```

---

## Dry-run configuration (for progress analysis)

```
DRY_RUN          = 1
{dry-run-prefix} = dry-run-2026-04-28
{dry-run-remote} = agentic
TA = TD = TI     = 10          (accelerated from 60 s)
N                = 3           (retry cap)
{base-branch}    = new_workflow_ui_plan
{remote}         = agentic
```

Test slugs and expected terminal states (see `dry-run.md §4`):

| Slug | Expected pathway | Terminal state |
|---|---|---|
| `dry-run-alpha` | pass first attempt | PR open on GitHub |
| `dry-run-beta` | fail → v2 passes | PR open on GitHub |
| `dry-run-gamma` | fail all N attempts | `review/dry-run-gamma-escalate` annotated tag |
| `dry-run-delta` | infrastructure failure | `plans/dry-run-delta-followup.md` on analysis branch; `review/dry-run-delta` deleted without re-triage |
| `dry-run-epsilon` | malformed triage branch | Developer logs and skips; no feature branch created |

## Known gaps / follow-up items

1. **`pre-commit` not in global PATH** — only accessible via `pixi run pre-commit`. The Developer and Integrator use it via pixi tasks, so this is not a blocker, but a future bootstrap task could `pixi global install pre-commit` to make it available globally.

2. **`glab` authenticated to gitlab.com only at SSH level** — `glab auth status` shows 401 for the REST API on gitlab.com. Not relevant for this effort (GitHub target), but note for any future effort targeting `code.ornl.gov`.

3. **`new_workflow_ui_plan_with_ruff` branch** — still exists locally after merge into `new_workflow_ui_plan`. Can be deleted once confirmed merged: `git branch -d new_workflow_ui_plan_with_ruff`.

4. **pixi-lock-check `stage` key warning** — `[WARNING] Unexpected key(s) present on local => pixi-lock-check: stage` appears on every commit. This is a mismatch between the key used in the pre-commit hook config (`stage:`) and the key name the locally installed pre-commit version accepts (pre-commit documents the key as `stages`). Harmless but noisy; the hook still runs correctly as a pre-push hook.