Commit 391fe8d8 authored by Vacaliuc, Bogdan

CLAUDE.md: capture established facts from DANGLE failure analysis



Adds a durable "established facts" section to the project instructions so
future investigations start from the correct ground truth. Covers:

- Runtime vs substitutions-file discrepancy on mDANGLE (MRES, SREV, VELO,
  BVEL are all different from the committed defaults after the 2026-02-19/20
  recalibration, and URIP=Yes is set by the profibus.template include).
- The BDST=0.5 deg backlash trap with the new MRES: every "-6330 step retry"
  in the Galil log is a motor record backlash pre-position, not an encoder
  glitch. Fix is BDST=0.
- The full DANGLE/RotationAxis air-pad sseq sequencing (including the 2s and
  7s timings in SeqFinish) and how it drives DANGLE.DMOV.
- The profibus.template URIP override that makes mDANGLE read DRBV from the
  Profibus absolute encoder via a calc record.
- Authoritative file locations on bl4a-dassrv1 for autosave, Galil command
  log, IOC log, scan server log, and motor/air-pad templates.
- Galil controller-to-axis map (DMC1-1 through DMC3-2).
- The still-open mechanical question: ~30-50 % step achievement on small
  correction moves, independent of BDST, needs hands-on diagnosis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 9f0f031a
+115 −0
@@ -17,6 +17,121 @@ There are files that have been collected during the investigation of the fault.

* /SNS/users/6ov/BL4A/2026/04/09/

## BL4A DANGLE/mDANGLE — established facts for future work

These are durable facts discovered during the 2026-04 investigation that closed with
`DANGLE-Motion-Failure-Analysis.md`. Always treat them as the starting point and verify
only if the beamline has been reconfigured since 2026-04-10.

### Runtime parameter values differ from the substitutions file

On 2026-02-19/20 `mDANGLE` was recalibrated via live `caput` and the substitutions file
`bl4a-Galil1.substitutions` line 55 was **not** updated. Since then the IOC has been running
with:

| Field | Substitutions default (line 55) | Runtime (authoritative) |
|---|---|---|
| `MRES` | `1.663148032e-04` | **`7.93e-05` deg/step** |
| `SREV` | `51200` | **`25600`** |
| `VELO` | `6.0906 deg/s` | **`1.45202 deg/s`** |
| `BVEL` | `6.0906` | `1.45202` |
| `BDST` | `0.5 deg` | `0.5 deg` *(unchanged, but now means 6305 steps not 3006 — see trap below)* |
| `RDBD` | `0.011` | has bounced between `0.001`, `0.005`, `0.01` — check latest `.sav` |
| `URIP` | *(not set in subs)* | `Yes` *(set by `profibus.template`)* |

**Always verify the live values with `caget` before computing any step-to-deg conversions.**
The substitutions file is stale; do not trust it for math.
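
A minimal sketch of that check, assuming the live values have already been read from the IOC (for example with `caget`) into a plain dict. The function name `find_stale_fields` and the tolerance are illustrative and not part of any existing tool; the expected values are the runtime column of the table above.

```python
import math

# Runtime values from the table above (authoritative as of 2026-04-10).
EXPECTED_RUNTIME = {
    "MRES": 7.93e-05,   # deg/step
    "SREV": 25600,
    "VELO": 1.45202,    # deg/s
    "BVEL": 1.45202,    # deg/s
    "BDST": 0.5,        # deg
}

def find_stale_fields(live, expected=EXPECTED_RUNTIME, rel_tol=1e-3):
    """Return {field: (expected, live)} for every field that differs.

    `live` maps field name -> value read from the running IOC.
    Fields missing from `live` are reported with a live value of None.
    """
    stale = {}
    for field, want in expected.items():
        got = live.get(field)
        if got is None or not math.isclose(got, want, rel_tol=rel_tol):
            stale[field] = (want, got)
    return stale

# Hypothetical reading that matches the substitutions file, not the runtime:
substitutions_like = {"MRES": 1.663148032e-04, "SREV": 51200,
                      "VELO": 6.0906, "BVEL": 6.0906, "BDST": 0.5}
print(sorted(find_stale_fields(substitutions_like)))
```

If this reports any field, either the beamline has been reconfigured or the values were taken from the stale substitutions file; redo the step-to-deg math with the live numbers.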

### The BDST = 0.5 deg backlash trap (root cause of the 2026-04-08 scan failures)

With `BDST = 0.5 deg` and runtime `MRES = 7.93e-05`, a backlash pre-position move is
`0.5 / 7.93e-05 ≈ 6305 motor steps`. This corresponds to the "-6320…-6339 step" retries
that appear in the Galil IOC log on every failing DANGLE move since 2026-02-24.
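
The arithmetic can be sketched as follows, using only the `BDST` and `MRES` values from the table above; the helper name `backlash_steps` is illustrative.

```python
def backlash_steps(bdst_deg, mres_deg_per_step):
    """Motor steps in one backlash pre-position move of size BDST."""
    return bdst_deg / mres_deg_per_step

BDST = 0.5  # deg, unchanged by the recalibration

old = backlash_steps(BDST, 1.663148032e-04)  # substitutions-file MRES
new = backlash_steps(BDST, 7.93e-05)         # runtime MRES

print(round(old))  # 3006
print(round(new))  # 6305
```

The same `BDST` in degrees roughly doubled in motor steps once `MRES` was halved, which is why an unchanged field became a problem after the recalibration.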

The trap fires whenever the retry phase makes `diff` flip sign at the target boundary:
`preferred_dir` flips from true to false inside one `do_work()` call, Case 3 of the motor
record's move-selection logic fires, and the motor executes a `-6305`-step reverse move
followed by a `+6305`-step takeup — both with step loss on this air-padded heavy arm.

The fix is `BDST=0` (air-padded rotary stages have negligible mechanical backlash).
See `DANGLE-Motion-Failure-Analysis.md` for the full derivation and all the evidence.

### DANGLE/RotationAxis air-pad sequencing

`BL4A:Mot:DANGLE` is a **virtual soft motor** that orchestrates a full air-pad lifecycle
around the physical `mDANGLE` move. Wiring (see `bl4a_airpad_signals.db` and
`bl4a_airpad_signals_motor.db`):

1. `DANGLE:Seq` (sseq): block RotationAxis via DISP → clear Done.VAL → RunCheck (abort if
   RotationAxis busy) → clear SeqError → `AirPadControl=1` → 3 s delay → `AirPadOnCheck`
   (abort if `AirPadStatus≠1`) → `Setpoint.VAL → mDANGLE CA` → FLNK `SeqFinish`
2. `DANGLE:SeqFinish` (sseq): 2 s delay → `AbortOnError` → `AirPadControl=0` → 7 s delay →
   `AirPadOffCheck` → `SetSeqDone` → `SetSeqDone2` → `SetDone.PROC` → `Done.VAL=1` → virtual
   motor's DINP sees 1 → `DANGLE.DMOV=1` → scan server `completion=True` put-callback returns.

`RotationAxis` has an analogous `RotationAxis:Seq` / `RotationAxis:SeqFinish` chain.
They share the same air pad and are mutually exclusive via DISP locking and the
`:RunCheck` calcouts.
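
A rough sketch of the fixed delay budget implied by the two chains above, useful when choosing scan timeouts. It assumes the listed delays run strictly one after another. Whether the `mDANGLE` move time adds to these delays or overlaps them is not stated here, so treat the total as a lower bound on per-move overhead. The step labels are paraphrased from the list, not record names.

```python
# (chain, step, delay in seconds), in execution order.
DANGLE_STEPS = [
    ("Seq", "block RotationAxis via DISP", 0.0),
    ("Seq", "clear Done.VAL", 0.0),
    ("Seq", "RunCheck", 0.0),
    ("Seq", "clear SeqError", 0.0),
    ("Seq", "AirPadControl=1", 0.0),
    ("Seq", "delay", 3.0),
    ("Seq", "AirPadOnCheck", 0.0),
    ("Seq", "Setpoint.VAL -> mDANGLE", 0.0),
    ("SeqFinish", "delay", 2.0),
    ("SeqFinish", "AbortOnError", 0.0),
    ("SeqFinish", "AirPadControl=0", 0.0),
    ("SeqFinish", "delay", 7.0),
    ("SeqFinish", "AirPadOffCheck", 0.0),
    ("SeqFinish", "SetSeqDone / SetSeqDone2 / SetDone.PROC", 0.0),
    ("SeqFinish", "Done.VAL=1", 0.0),
]

def fixed_delay(steps, chain=None):
    """Sum the hard-coded delays, optionally for one chain only."""
    return sum(d for c, _, d in steps if chain is None or c == chain)

print(fixed_delay(DANGLE_STEPS, "Seq"))        # 3.0
print(fixed_delay(DANGLE_STEPS, "SeqFinish"))  # 9.0
print(fixed_delay(DANGLE_STEPS))               # 12.0
```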

### mDANGLE readback chain (URIP=Yes via profibus.template)

The physical motor's `DRBV` does **not** come from the Galil step counter. The file
`bl4a-Galil1App/Db/profibus.template` (lines 45–49) unconditionally mutates every motor
that has a Profibus encoder:

```epics
record(motor, "$(S):Mot:$(M)") {
  field(RDBL, "$(S):Mot:$(M):EncPos")
  field(RRES, "1")
  field(URIP, "Yes")
}
```

For `mDANGLE`:
- Raw Profibus counts: `BL4A:Mot:mDANGLE:Enc` (modbus port `m1` = `10.111.8.46:502`, addr 44,
  100 ms poll interval)
- Converted to degrees: `BL4A:Mot:mDANGLE:EncPos` (calc, `A*B+C` where `A=:Enc`,
  `B=.ERES=-0.000466906`, `C=EOFF=3390`)
- The motor record samples this CP-linked value on every retry decision

Any noise/latency/transient in the Profibus encoder path is promoted to physical motion
by the retry logic. This is a secondary reliability concern; consider making the
`profibus.template` URIP override opt-in per axis.
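
To get a feel for the scale of this effect, the conversion can be sketched with the `EncPos` coefficients and the runtime `MRES` quoted above. The function names are illustrative, and the offset is assumed to be in the same units as the result (degrees).

```python
ERES = -0.000466906  # deg per Profibus count (B in the calc record)
EOFF = 3390          # offset (C in the calc record), assumed degrees
MRES = 7.93e-05      # deg per motor step (runtime value)

def enc_counts_to_deg(counts, eres=ERES, eoff=EOFF):
    """Mirror of the EncPos calc expression A*B+C."""
    return counts * eres + eoff

def steps_per_count(eres=ERES, mres=MRES):
    """Motor steps equivalent to a one-count change in the encoder."""
    return abs(eres) / mres

print(round(steps_per_count(), 2))  # 5.89
```

One encoder count is therefore worth almost six motor steps, so a few counts of jitter in the readback look like a position error of tens of steps to the retry logic.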

### Authoritative file locations on bl4a-dassrv1

| Purpose | Path |
|---|---|
| Motor record autosave (VELO, BVEL, BDST, RDBD, RTRY…) | `/home/controls/var/bl4a-Galil1/bl4a-Galil1.sav` + `.sav0/.sav1/.sav2` (rotating) |
| Motor record dated snapshots | `/home/controls/var/bl4a-Galil1/bl4a-Galil1.sav_YYMMDD-hhmmss` (on restart) |
| Motor record pass0 autosave (MRES, ERES, DVAL, OFF) | `/home/controls/var/bl4a-Galil1/bl4a-Galil1_pass0.sav*` |
| Galil command log (every PR/BG/MG/SH per controller) | `/home/controls/var/log/bl4a-Galil1.log` |
| Galil IOC stdout/stderr | `/home/controls/var/log/ioc_bl4a-Galil1.log` |
| Scan server stdout (has real RBV values in `TimeoutException`) | `/home/controls/var/scan/console.log` |
| Scan device definitions (tolerances, timeouts) | `/home/controls/bl4a/python/scantools/devices.py` |
| Air pad virtual motor wiring | `/home/controls/bl4a/applications/bl4a-Galil1/bl4a-Galil1App/Db/bl4a_airpad_signals*.db` |
| Motor substitutions (incl. mDANGLE at line 55) | `/home/controls/bl4a/applications/bl4a-Galil1/bl4a-Galil1App/Db/bl4a-Galil1.substitutions` |
| Profibus URIP override template | `/home/controls/bl4a/applications/bl4a-Galil1/bl4a-Galil1App/Db/profibus.template` |

### Galil controller-to-axis mapping

| Controller | IP | Notable axes |
|---|---|---|
| `DMC1-1` | `10.112.8.41` | F=SANGLE, G=PSC6, H=Slit3Trans |
| `DMC1-2` | `10.112.8.42` | D=mRotationAxis, G=LSlit3, H=RSlit3 |
| `DMC2-1` | `10.112.8.44` | FOMRot, PolLift, SMPol* |
| `DMC2-2` | `10.112.8.45` | Slit1*, Slit2* |
| `DMC3-1` | `10.112.8.50` | Slit0, SampleSlit, SF1Translation |
| `DMC3-2` | `10.112.8.51` | A=LSlit4, B=RSlit4, C=DTrans, **D=mDANGLE**, E=AnalyzerTrans, F=He3AnalyzerTrans, G=AnalyzerRot, H=AnalyzerLift |

### Open mechanical question

The stepper only achieves **~30–50 % of commanded steps** on small correction moves
(measured from the 04-08 archiver CSV vs Galil log). This persists independent of the
BDST trap and will limit achievable scan tolerance even after `BDST=0`. Candidate causes:
air-pad supply pressure or valve timing, motor shaft coupling slip, encoder-to-arm
coupling, drivetrain backlash/compliance. Needs hands-on diagnosis.
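
For repeat measurements, the achievement ratio can be computed as sketched below. It assumes the commanded step count comes from the Galil log and the encoder change from the archiver, as in the original measurement. The function name and the sample numbers are hypothetical, not measured data.

```python
def step_achievement(commanded_steps, encoder_delta_deg, mres=7.93e-05):
    """Fraction of a commanded move that the encoder actually observed."""
    commanded_deg = commanded_steps * mres
    if commanded_deg == 0:
        raise ValueError("commanded move is zero steps")
    return encoder_delta_deg / commanded_deg

# Hypothetical example: 100 steps commanded (0.00793 deg),
# encoder readback changed by 0.0032 deg.
print(f"{step_achievement(100, 0.0032):.0%}")  # 40%
```

Use the live `MRES` from `caget` rather than the default shown here if the motor has been recalibrated again.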

## Secure Temporary Files

When a task requires writing a temporary script or data file (e.g. to work around