Commit db5ddda6 authored by Bogdan Vacaliuc

plan: document beamline-subdir pattern, skip list, and deferred items



Three changes feeding back from the §13 step 0 dry-run on bl4b-vtwin1:

§6.3 (EPICS module build)
  Document the "one module, per-beamline subdir" pattern used by
  motion today (bl4b-PPS, bl4b-Parker1) and vdet in the future. The
  classifier canonicalizes such references to mod/tag and records
  the trailing beamline-name residue as a VARIANT (expected) rather
  than a NOTE (investigate). Replace the prose classifier snippet
  with the actual shape from build_modules.sh so the plan and code
  don't drift.

  Also sharpen the NOTE: wording to explain that multi-component
  references like templates/makeBaseApp/top and lakeshore/main/
  lakeshore336 are canonicalized to the 2-level TOP; the build
  command remains `-r mod tag`, and the trailing residue exists on
  disk inside the module tarball.

§6.4 (IOC build skip list)
  Add bl4b-EICManager. It has a Makefile but no configure/RELEASE —
  the Makefile's install target only creates symlinks into
  /home/controls/releases/ and expects the EIC module to be managed
  out-of-band. The aggregator already skips it naturally; this is
  just documentation that matches reality.

plan/todo.md (new)
  Running list of items that surfaced but were deferred, starting with:

    - VDET time-stream dependency for nED on a vtwin (removed from
      bl4b-Det-nED for now; returns when simulated timing is needed)
    - git-release.sh recursive parser breaks on 3-component paths
    - PS71/PS72 TCP 4571 collision in bl4b-ProcServ-vtwin1 st.cmd
    - Latent --das3/--epics imbalance in share/scripts/dependencies.sh

Each item has a "Surfaced / Why / How to apply" structure so a
future session can pick it up without re-learning the context.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 97236717

plan/todo.md

0 → 100644
+132 −0
# vtwin bootstrap — deferred items

Running list of items that surfaced during the BL4B virtual twin
bootstrap work but were consciously deferred. Each item captures *what*
needs doing, *why* it matters, and *when* it becomes load-bearing so
that a future session can pick it up without re-learning the context.

Status keys: `[open]` not started · `[blocked]` waiting on external
input · `[in-progress]` under active work · `[done]` fixed, kept
here as a durable record.

---

## [open] VDET — provide time stream for nED when detached from timing system

**Surfaced:** §13 step 0 dry-run on 2026-04-11. `bl4b-Det-nED/configure/
RELEASE` previously referenced `VDET=/home/controls/common/vdet/merge/
bl4b`. Bogdan removed the reference to keep the aggregator clean, but
the dependency will return once the vtwin needs to simulate accelerator
timing for nED's event stream.

**Why it matters:** nED consumes the accelerator timing signal (RTDL /
60 Hz gate) to timestamp detector events. On a virtual twin with no
real timing link, nED needs a software substitute. VDET generates the
missing time stream so nED can run in simulation without errors.

**Classifier shape:** `/home/controls/common/vdet/merge/bl4b`

This is the same "one module, per-beamline subdir" pattern the plan
§6.3 now describes for `motion`: the build target is the module's
branch/tag (`merge`?), and the beamline consumes a subdir inside it.
`merge` looks unusual for a tag — it is probably a git **branch**, not
a git tag, and `git-release.sh` will need either a special case or a
branch-aware invocation to check it out. Worth confirming with Bogdan
whether VDET lives on code.ornl.gov / trac and which ref to pin.

**How to apply when it returns:**
1. Add `VDET=...` back to `bl4b-Det-nED/configure/RELEASE` (or a
   RELEASE.local).
2. Re-run `~/bootstrap/epics/build_modules.sh --show-only` and verify
   that vdet classifies cleanly as a `common` rule with the beamline
   subdir flagged as a VARIANT, not a NOTE.
3. If the `merge` ref turns out to be a branch, either:
   - extend `git-release.sh` to accept `-b <branch>` (upstream change), or
   - have `build_modules.sh` recognize `vdet` as a special case and
     check it out by branch via a direct `git clone -b merge` before
     the main build loop.
4. Add nED's simulated-timing validation to the `verify` target so a
   regression (e.g. vdet build silently failing) surfaces immediately.
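
Before choosing between the two options in step 3, it helps to know whether `merge` is a branch or a tag. The sketch below classifies a ref name from `git ls-remote` output. The function name `ref_kind` and the `VDET_URL` placeholder are assumptions made for illustration; the real repository location is still to be confirmed.

```bash
#!/usr/bin/env bash
# Classify a ref name using `git ls-remote` output read on stdin.
# Prints one of: branch, tag, both, none.
# ASSUMPTION: ref_kind is a hypothetical helper, not part of any existing script.
ref_kind() {
    local want=$1 sha ref is_branch=0 is_tag=0
    while read -r sha ref; do
        case "$ref" in
            "refs/heads/$want")                     is_branch=1 ;;
            "refs/tags/$want"|"refs/tags/$want^{}") is_tag=1 ;;
        esac
    done
    if   (( is_branch && is_tag )); then echo both
    elif (( is_branch ));           then echo branch
    elif (( is_tag ));              then echo tag
    else                                 echo none
    fi
}

# Usage against a real remote (VDET_URL is a placeholder, not a known location):
#   git ls-remote --heads --tags "$VDET_URL" merge | ref_kind merge
```

If the answer is `branch`, the direct-clone route in step 3 would use `git clone -b merge`, which accepts branch names as well as tags.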

---

## [open] `git-release.sh` recursive parser breaks on 3-component paths

**Surfaced:** §13 step 0 review of `/home/controls/share/scripts/git/
git-release.sh` lines 315-317. The recursive dependency resolver uses
`sed 's|.*/\([^/]*\)/[^/]*|\1|'` to pull REPO from the second-to-last
component, which only works when the IOC's RELEASE entry is a
2-component `/home/controls/<root>/<mod>/<tag>` path.

For 3-component references (e.g. `hplc/main/P61L`, `lakeshore/main/
lakeshore336`, `motion/rel.../bl4b`) it extracts the wrong REPO
(`rel1.11_20170630` instead of `motion`) and the recursive clone fails
with "No such repository".

**Why it matters:** every IOC whose RELEASE includes a module with a
3-component reference cannot be recursively built via
`git-release.sh -r`. Our own `build_modules.sh` classifier already
handles the canonicalization correctly at the *top-level* call, so the
workaround is to call `git-release.sh -r <mod> <tag>` directly instead
of letting `-r` discover the dependency transitively.

**How to apply:** either
(a) upstream-fix `git-release.sh` to take the basename two parents up
    (`$(basename "$(dirname "$(dirname "$DIR")")")`) when the last component
    is not a tag-like string, or
(b) have `build_modules.sh` enumerate the full transitive dependency
    set itself (via dump-deps.mk on each already-built module) and
    call `git-release.sh` non-recursively for every dep. Option (b)
    requires the aggregator to be run after each module is cloned,
    which is more complex but more robust.
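
A minimal sketch of option (a) follows. The notes do not define "tag-like", so the pattern in `is_tag_like` (`main` or `rel<digit>...`) is an assumption and would need to be agreed before any upstream change; both function names are hypothetical.

```bash
#!/usr/bin/env bash
# ASSUMPTION: "tag-like" means `main` or `rel<digit>...`. The real rule is not
# defined in the notes; extend the pattern for other refs in use.
is_tag_like() {
    case "$1" in
        main|rel[0-9]*) return 0 ;;
        *)              return 1 ;;
    esac
}

# Take REPO from the parent of the last component when that component looks
# like a tag, otherwise from two parents up.
repo_of() {
    local dir=${1%/}
    if is_tag_like "$(basename "$dir")"; then
        basename "$(dirname "$dir")"
    else
        basename "$(dirname "$(dirname "$dir")")"
    fi
}

repo_of /home/controls/prod/motion/rel1.11_20170630        # prints: motion
repo_of /home/controls/prod/motion/rel1.11_20170630/bl4b   # prints: motion
```

A ref such as `merge` is not covered by the assumed pattern, so a 2-component path ending in `merge` would be misread until the pattern is extended.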

Deferring until the `epics-modules` target is actually being built,
so we can see real failures before picking a fix.

---

## [open] PS71 / PS72 TCP 4571 collision in bl4b-ProcServ-vtwin1 st.cmd

**Surfaced:** plan §3.2. `bl4b-ProcServ-vtwin1/iocBoot/ioc/st.cmd`
allocates TCP port 4571 to *both* PS71 and PS72. On startup the second
`drvAsynIPPortConfigure` call will fail and one of the two IOCs will
not be reachable.
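
A collision like this can be flagged mechanically. The sketch below scans `st.cmd` text for `drvAsynIPPortConfigure` lines that share a `host:port`. The sample input is illustrative only: the actual `st.cmd` is not reproduced here, so the `localhost` host, the argument values, and the `PS73` entry are assumptions, as is the helper name `dup_ports`.

```bash
#!/usr/bin/env bash
# Print every host:port used by more than one drvAsynIPPortConfigure line,
# followed by the asyn port names that share it. Reads st.cmd text on stdin.
# ASSUMPTION: dup_ports is a hypothetical helper; arguments are double-quoted.
dup_ports() {
    awk -F'"' '
        /^[ \t]*drvAsynIPPortConfigure/ {
            hp = $4                                  # second quoted argument
            if (hp in count) names[hp] = names[hp] " " $2
            else             names[hp] = $2
            count[hp]++
        }
        END { for (hp in count) if (count[hp] > 1) print hp ": " names[hp] }
    '
}

# Illustrative input, not the real st.cmd:
dup_ports <<'EOF'
drvAsynIPPortConfigure("PS71","localhost:4571",0,0,0)
drvAsynIPPortConfigure("PS72","localhost:4571",0,0,0)
drvAsynIPPortConfigure("PS73","localhost:4573",0,0,0)
EOF
# prints: localhost:4571: PS71 PS72
```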

**Why it matters:** one of PS71 / PS72 is unreachable until the port is
changed. This is a flag-only finding per the plan (§9 check 7): the
bootstrap never mutates beamline-repo state, so the fix is Bogdan's
decision. The `verify` target will emit a `NOTE:` but still pass.

**How to apply when fixed:** re-run `ioc status --all` to confirm both
PS71 and PS72 reach "Active: active (running)".

---

## [open] Latent bug in `/home/controls/share/scripts/dependencies.sh`

**Surfaced:** plan §6.3. The shared `dependencies.sh` script passes
`--das3` when `*common*` appears in the dependency path but has no
parallel `-e` branch for `*epics*`. Our own `build_modules.sh` does
the correct split, so we are insulated, but if the upstream script is
ever fixed, the bootstrap can shrink by calling `dependencies.sh`
directly.

**How to apply when upstream is fixed:** replace
`epics/build_modules.sh`'s classifier loop with a single call to
`/home/controls/share/scripts/dependencies.sh` and remove the
classifier — dump-deps.mk stays as the aggregator input.

---

## Conventions

When adding new items:

- Lead with the `[status] Title` and the one-sentence *what*.
- **Surfaced:** where/when the issue became visible (plan section,
  dry-run, test run, etc.).
- **Why it matters:** what breaks if we never address it. Don't write
  this as "because the plan says so" — write the real failure mode.
- **How to apply:** concrete steps a future session can execute.
  Include commands, file paths, and success criteria.
- Promote to `[done]` in place rather than deleting, so the historical
  context stays readable.
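
Because every item leads with a `## [status] Title` heading, the list can be summarized from the shell. This is a small convenience sketch, not an existing tool; the function name `todo_summary` is an assumption.

```bash
#!/usr/bin/env bash
# Count todo items per status key from the "## [status] Title" headings on stdin.
# ASSUMPTION: todo_summary is a hypothetical helper, not part of the bootstrap.
todo_summary() {
    grep -oE '^## \[(open|blocked|in-progress|done)\]' \
        | sed 's/^## \[\(.*\)\]$/\1/' \
        | sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Usage: todo_summary < plan/todo.md
```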
+70 −17
@@ -477,22 +477,66 @@ apply, and calls `git-release.sh` per path in topological order.
   | `/home/controls/prod/<mod>/<tag>` (neither common nor epics) | `-r <mod> <tag>` (default PREFIX=`/home/controls/prod`) |
   | `/home/controls/common/<mod>/<tag>`                          | `--prefix /home/controls/common -r <mod> <tag>`    |
   | `/home/controls/epics/<mod>/<tag>`                           | `-e --prefix /home/controls -r <mod> <tag>`        |
-   | anything else                                                | warn, surface to operator, skip                    |
   | anything else                                                | UNHANDLED — surface to operator, exit 1            |

   **Path shape vs. path on disk.** A RELEASE file may point deeper than the
   module's canonical TOP — e.g. `TEMPLATE_TOP=$(EPICS_BASE)/templates/
   makeBaseApp/top` reaches *inside* base, and `LAKESHORE=…/lakeshore/main/
   lakeshore336` picks a device-specific subdir of the module. The classifier
   canonicalizes such references by taking the first two components after
   the root anchor (`prod/common`, `prod/epics`, `prod`, `common`, `epics`)
   as `mod/tag` and recording the trailing residue as a **NOTE:**. The
   build command remains `git-release.sh -r <mod> <tag>`; the residue
   exists on disk only because it lives inside the module tarball.

   **The "one module, per-beamline subdir" pattern** is a recurring
   complication. Some modules (today: `motion`; future: `vdet` — see
   `plan/todo.md`) are built once and then consumed by each beamline from
   a beamline-named subdirectory *inside* the release:

-   Classifier implementation in `build_modules.sh`:
   ```
   /home/controls/prod/motion/rel1.11_20170630/bl4b   (bl4b-PPS's MOTION)
   /home/controls/prod/motion/rel1.16_20181120/bl4b   (bl4b-Parker1's MOTION)
   ```

   The build step is still `-r motion rel1.11_20170630` — the `bl4b`
   subdir is part of the module tarball, not a second "tag" component.
   The classifier recognizes this pattern by checking the trailing
   residue against the current `$BEAMLINE` and against a canonical
   beamline-name shape (`bl\d+[a-z]?`, `cg\d+`, `hb\d+`, `ref*`). When it
   matches, the residue is recorded as a **VARIANT (beamline subdir)**
   instead of a NOTE — semantically "expected" rather than "investigate".
   Multiple beamlines can accumulate on the same canonical module
   (`VARIANT: bl4b|cg3` when the same release is shared across beamlines).

   Classifier outline in `build_modules.sh` (full shape in the script):
   ```bash
-   classify() {
-       local d="$1" mod tag
-       mod=$(basename "$(dirname "$d")")
-       tag=$(basename "$d")
-       case "$d" in
-           /home/controls/prod/common/*/*) echo "--das3 -r --all $mod $tag" ;;
-           /home/controls/prod/epics/*/*)  echo "-e -r $mod $tag" ;;
-           /home/controls/prod/*/*)        echo "-r $mod $tag" ;;
-           /home/controls/common/*/*)      echo "--prefix /home/controls/common -r $mod $tag" ;;
-           /home/controls/epics/*/*)       echo "-e --prefix /home/controls -r $mod $tag" ;;
   _classify() {                  # canonicalize path → TSV(canon, args, extra, rule)
       local d=$1 rel=${d#/home/controls/}
       IFS=/ read -ra parts <<< "$rel"
       case "$rel" in
         prod/common/*) mod=${parts[2]} tag=${parts[3]} rule=prod-common
                        args="--das3 -r --all $mod $tag" ;;
         prod/epics/*)  mod=${parts[2]} tag=${parts[3]} rule=prod-epics
                        args="-e -r $mod $tag" ;;
         prod/*)        mod=${parts[1]} tag=${parts[2]} rule=prod
                        args="-r $mod $tag" ;;
         common/*)      mod=${parts[1]} tag=${parts[2]} rule=common
                        args="--prefix /home/controls/common -r $mod $tag" ;;
         epics/*)       mod=${parts[1]} tag=${parts[2]} rule=epics
                        args="-e --prefix /home/controls -r $mod $tag" ;;
         *) echo "UNHANDLED $d" >&2; return 1 ;;
       esac
       canon="/home/controls/${rule/-//}/$mod/$tag"   # rule → root: prod-common → prod/common
       extra=${d#$canon}; extra=${extra#/}; [ -z "$extra" ] && extra="-"
       printf '%s\t%s\t%s\t%s\n' "$canon" "$args" "$extra" "$rule"
   }

   _is_beamline_subdir() {        # is trailing residue a beamline-name subdir?
       local first=${1%%/*}
       [ "$first" = "$BEAMLINE" ] && return 0
       case "$first" in bl[0-9]*|cg[0-9]*|hb[0-9]*|ref[a-z-]*) return 0 ;; esac
       return 1
   }
   ```

@@ -581,14 +625,23 @@ be built with a single `make -C <dir>`.
   - If the build succeeds, verify `$d/bin/linux-x86_64/*` has a non-empty
     result (or `$d/iocBoot/ioc*/st.cmd` is present and `chmod +x`-executable).

-3. **Skip list** (known stub/Python IOCs — no build step):
-   - `bl4b-ArchiveEngine` (no Makefile)
3. **Skip list** (known stub/Python/config-only IOCs — no build step):
   - `bl4b-ArchiveEngine` (no top-level Makefile; only `configure/RELEASE.local`)
   - `bl4b-ExperimentPlanning` (empty Makefile)
   - `bl4b-PlanNewExp` (empty Makefile)
-   - `bl4b-AdaraMonitor` (Python-only, configured via conda env)
-   - `bl4b-IPTS` (Python-only)
   - `bl4b-AdaraMonitor` (Python-only, configured via conda env; no
     `configure/`)
   - `bl4b-IPTS` (Python-only; no `configure/`)
   - `bl4b-EICManager` (config-only; no `configure/`. Its Makefile's
     `install` target only creates symlinks into `/home/controls/releases/`
     and expects `/home/controls/common/eic/ExternalInstrumentControl/main`
     to already exist. The linked EIC module is managed out-of-band.)
   - the 3 `bl4b-ProcServ-*` meta IOCs are handled separately (see (E))

   The aggregator in §6.3 already naturally skips these (no
   `configure/RELEASE*` ⇒ not scanned), which is how the dry-run on
   `bl4b-vtwin1` identified all six as "apps skipped" during §13 step 0.

4. **Build the ProcServ IOC for this host:**
   `make -C /home/controls/${BEAMLINE}/applications/${BEAMLINE}-ProcServ-${MACHINE}`
   (on bl4b-vtwin1 → `bl4b-ProcServ-vtwin1`). The bin is already present on