slides: add slide 8 — A/B equivalence harness proposal (de244b55) · Commits · Vacaliuc, Bogdan / tasking

plan/quicknxsv2-modularization/slides/slide-8-equivalence-harness.png

0 → 100644

+232 KiB


plan/quicknxsv2-modularization/slides/slide-8-equivalence-harness.svg

0 → 100644

+202 −0

		<?xml version="1.0" encoding="UTF-8"?>
		<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1920 1080" width="1920" height="1080" font-family="Helvetica, Arial, sans-serif">
		<defs>
		<linearGradient id="headerGrad" x1="0" x2="0" y1="0" y2="1">
		<stop offset="0%" stop-color="#002E5D"/>
		<stop offset="100%" stop-color="#004B8D"/>
		</linearGradient>
		<linearGradient id="guiLane" x1="0" x2="0" y1="0" y2="1">
		<stop offset="0%" stop-color="#FFF3E6"/>
		<stop offset="100%" stop-color="#FEE4CB"/>
		</linearGradient>
		<linearGradient id="autoLane" x1="0" x2="0" y1="0" y2="1">
		<stop offset="0%" stop-color="#F0F7EE"/>
		<stop offset="100%" stop-color="#DDECD8"/>
		</linearGradient>
		<linearGradient id="diffBox" x1="0" x2="0" y1="0" y2="1">
		<stop offset="0%" stop-color="#FFF8E7"/>
		<stop offset="100%" stop-color="#FAEBC0"/>
		</linearGradient>
		<linearGradient id="passBox" x1="0" x2="0" y1="0" y2="1">
		<stop offset="0%" stop-color="#E2F1D9"/>
		<stop offset="100%" stop-color="#C9E4BB"/>
		</linearGradient>
		<linearGradient id="failBox" x1="0" x2="0" y1="0" y2="1">
		<stop offset="0%" stop-color="#FEEADF"/>
		<stop offset="100%" stop-color="#FBC8A6"/>
		</linearGradient>
		<marker id="flowArrow" viewBox="0 0 12 9" refX="11" refY="4.5" markerWidth="9" markerHeight="7" orient="auto">
		<path d="M 0 0 L 12 4.5 L 0 9 z" fill="#002E5D"/>
		</marker>
		<marker id="greenArrow" viewBox="0 0 12 9" refX="11" refY="4.5" markerWidth="9" markerHeight="7" orient="auto">
		<path d="M 0 0 L 12 4.5 L 0 9 z" fill="#1F5B1F"/>
		</marker>
		<marker id="redArrow" viewBox="0 0 12 9" refX="11" refY="4.5" markerWidth="9" markerHeight="7" orient="auto">
		<path d="M 0 0 L 12 4.5 L 0 9 z" fill="#C8102E"/>
		</marker>
		<filter id="softShadow" x="-10%" y="-10%" width="120%" height="120%">
		<feGaussianBlur in="SourceAlpha" stdDeviation="2"/>
		<feOffset dx="1" dy="2" result="offset"/>
		<feComponentTransfer><feFuncA type="linear" slope="0.22"/></feComponentTransfer>
		<feMerge><feMergeNode/><feMergeNode in="SourceGraphic"/></feMerge>
		</filter>
		</defs>

		<rect width="1920" height="1080" fill="#FAFAFA"/>

		<!-- Header -->
		<rect x="0" y="0" width="1920" height="100" fill="url(#headerGrad)"/>
		<text x="960" y="55" font-size="34" font-weight="700" fill="white" text-anchor="middle">A/B Equivalence Harness — Catch Disagreement Before and After the Refactor</text>
		<text x="960" y="85" font-size="17" fill="#B8D4E8" text-anchor="middle">One fixture set, two paths through the code, one diff: the regression barrier that lets the hack-a-thon team refactor with confidence</text>

		<!-- Sub-heading -->
		<text x="40" y="145" font-size="16" fill="#2A2A2A">The debt items in slide 3 are real because <tspan font-style="italic">nothing catches them</tspan> today. Building this harness is Tier-2 work (item 10 on slide 7) and is</text>
		<text x="40" y="167" font-size="16" fill="#2A2A2A">the highest-leverage investment for the hack-a-thon: every later refactor runs through it, every drift shows up as a failed diff.</text>

		<!-- ─── Input fixture stage ─── -->
		<g>
		<rect x="40" y="200" width="320" height="200" rx="12" fill="#FFFFFF" stroke="#002E5D" stroke-width="2.5" filter="url(#softShadow)"/>
		<rect x="40" y="200" width="320" height="40" rx="12" fill="#002E5D"/>
		<text x="200" y="226" font-size="16" font-weight="700" fill="white" text-anchor="middle">1. Reference fixture set</text>
		<g font-size="13" fill="#2A2A2A">
		<text x="60" y="265">● ~10 REF_M runs chosen by scientific lead:</text>
		<text x="75" y="284">- polarized and unpolarized</text>
		<text x="75" y="302">- single-peak and multi-peak</text>
		<text x="75" y="320">- known-good, published reductions</text>
		<text x="60" y="345">● Pinned template files / configurations</text>
		<text x="60" y="365">● Under LFS in the repo</text>
		<text x="60" y="385">● One expected-output file per run</text>
		</g>
		</g>

		<!-- Branching arrow from fixtures to both paths -->
		<path d="M 370 270 C 420 270 420 295 475 295" stroke="#002E5D" stroke-width="2.5" fill="none" marker-end="url(#flowArrow)"/>
		<path d="M 370 330 C 420 330 420 425 475 425" stroke="#002E5D" stroke-width="2.5" fill="none" marker-end="url(#flowArrow)"/>

		<!-- Split label -->
		<text x="415" y="365" font-size="13" fill="#606060" font-style="italic" text-anchor="middle">(same input)</text>

		<!-- ─── Path A: GUI reduction ─── -->
		<g>
		<rect x="490" y="230" width="540" height="150" rx="12" fill="url(#guiLane)" stroke="#E87722" stroke-width="2" filter="url(#softShadow)"/>
		<text x="760" y="265" font-size="18" font-weight="700" fill="#8C3A00" text-anchor="middle">Path A — GUI reduction</text>
		<text x="510" y="293" font-size="13" fill="#3E2C00">Drive the quicknxsv2 main window with <tspan font-family="monospace" font-weight="700">pytest-qt</tspan>:</text>
		<text x="530" y="313" font-size="13" fill="#3E2C00">load run → set ROIs from fixture → click Reduce</text>
		<text x="510" y="340" font-size="13" fill="#3E2C00">Capture the .dat file from the output directory.</text>
		<text x="510" y="360" font-size="12" fill="#8C3A00" font-style="italic">exercises every Qt signal, every deepcopy, the CSD/NexusData path, QuickNXS post-scaling…</text>
		</g>

		<!-- Path B: autoreduce reduction -->
		<g>
		<rect x="490" y="400" width="540" height="150" rx="12" fill="url(#autoLane)" stroke="#6DA64F" stroke-width="2" filter="url(#softShadow)"/>
		<text x="760" y="435" font-size="18" font-weight="700" fill="#1F5B1F" text-anchor="middle">Path B — autoreduce reduction</text>
		<text x="510" y="463" font-size="13" fill="#1F3A1F">Invoke <tspan font-family="monospace" font-weight="700">mr_reduction.ReductionProcess(...).reduce()</tspan></text>
		<text x="530" y="483" font-size="13" fill="#1F3A1F">with the same input file and matched options.</text>
		<text x="510" y="510" font-size="13" fill="#1F3A1F">Capture the .dat file from <tspan font-family="monospace">shared/autoreduce/</tspan>.</text>
		<text x="510" y="530" font-size="12" fill="#1F5B1F" font-style="italic">exercises the template contract and the autoreduce-specific default set</text>
		</g>

		<!-- Arrows merging into diff box -->
		<path d="M 1040 295 C 1080 295 1080 370 1115 370" stroke="#002E5D" stroke-width="2.5" fill="none" marker-end="url(#flowArrow)"/>
		<path d="M 1040 475 C 1080 475 1080 400 1115 400" stroke="#002E5D" stroke-width="2.5" fill="none" marker-end="url(#flowArrow)"/>

		<!-- ─── Diff stage ─── -->
		<g>
		<rect x="1125" y="300" width="330" height="180" rx="12" fill="url(#diffBox)" stroke="#D2A000" stroke-width="2.5" filter="url(#softShadow)"/>
		<rect x="1125" y="300" width="330" height="40" rx="12" fill="#D2A000"/>
		<text x="1290" y="327" font-size="16" font-weight="700" fill="white" text-anchor="middle">2. Compare R(Q) curves</text>
		<g font-size="13" fill="#3A2C00">
		<text x="1145" y="365">● Q values must match bin-by-bin</text>
		<text x="1145" y="385">● R and dR compared within relative tol.</text>
		<text x="1145" y="405">● Plus sample-log checksum:</text>
		<text x="1160" y="423" font-family="monospace" font-size="12">two_theta · lambda_min/max</text>
		<text x="1160" y="441" font-family="monospace" font-size="12">specular_pixel · scatt_peak_*</text>
		<text x="1145" y="463" font-style="italic">Thresholds agreed by scientific lead</text>
		</g>
		</g>

		<!-- Arrows to PASS / FAIL -->
		<path d="M 1460 360 L 1510 360" stroke="#1F5B1F" stroke-width="3" fill="none" marker-end="url(#greenArrow)"/>
		<path d="M 1460 420 L 1510 420" stroke="#C8102E" stroke-width="3" fill="none" marker-end="url(#redArrow)"/>

		<!-- PASS -->
		<g>
		<rect x="1520" y="310" width="360" height="90" rx="12" fill="url(#passBox)" stroke="#53A548" stroke-width="2.5" filter="url(#softShadow)"/>
		<text x="1700" y="345" font-size="20" font-weight="700" fill="#1F5B1F" text-anchor="middle">✓ PASS — equivalent</text>
		<text x="1700" y="370" font-size="13" fill="#1F5B1F" text-anchor="middle">refactor preserved the reduction</text>
		<text x="1700" y="389" font-size="13" fill="#1F5B1F" text-anchor="middle">merge the PR; the contract holds</text>
		</g>

		<!-- FAIL -->
		<g>
		<rect x="1520" y="410" width="360" height="130" rx="12" fill="url(#failBox)" stroke="#C8102E" stroke-width="2.5" filter="url(#softShadow)"/>
		<text x="1700" y="444" font-size="20" font-weight="700" fill="#8C0D24" text-anchor="middle">✗ FAIL — divergence</text>
		<text x="1540" y="470" font-size="13" fill="#3E2C00">● Plot both curves side-by-side</text>
		<text x="1540" y="490" font-size="13" fill="#3E2C00">● Emit the per-bin delta as JSON</text>
		<text x="1540" y="510" font-size="13" fill="#3E2C00">● Fail the CI job and block merge</text>
		<text x="1540" y="530" font-size="12" fill="#8C3A00" font-style="italic">→ refactor needs review or rollback</text>
		</g>

		<!-- ═════════════ BOTTOM HALF — Uses of the harness ═════════════ -->
		<text x="960" y="605" font-size="22" font-weight="700" fill="#002E5D" text-anchor="middle">How this harness pays off across the hack-a-thon</text>

		<!-- Three use cases -->
		<g>
		<rect x="40" y="630" width="600" height="370" rx="12" fill="#FFFFFF" stroke="#002E5D" stroke-width="2" filter="url(#softShadow)"/>
		<rect x="40" y="630" width="600" height="40" rx="12" fill="#002E5D"/>
		<text x="340" y="656" font-size="16" font-weight="700" fill="white" text-anchor="middle">Before the refactor (Day 1)</text>
		<text x="60" y="695" font-size="14" fill="#2A2A2A" font-weight="700">Baseline measurement:</text>
		<text x="60" y="720" font-size="13" fill="#2A2A2A">Run the harness on all fixtures <tspan font-style="italic">as-is</tspan>.</text>
		<text x="60" y="742" font-size="13" fill="#2A2A2A">Expect <tspan fill="#C8102E" font-weight="700">failures</tspan> on every fixture.</text>
		<text x="60" y="775" font-size="14" fill="#2A2A2A" font-weight="700">What the failures tell you:</text>
		<text x="60" y="800" font-size="13" fill="#2A2A2A">● which slide-3 tensions are active</text>
		<text x="60" y="820" font-size="13" fill="#2A2A2A">● how big the numerical impact actually is</text>
		<text x="60" y="840" font-size="13" fill="#2A2A2A">● which fixtures diverge most — the best</text>
		<text x="85" y="858" font-size="13" fill="#2A2A2A">candidates for deeper investigation</text>
		<text x="60" y="890" font-size="14" fill="#2A2A2A" font-weight="700">Outcome:</text>
		<text x="60" y="915" font-size="13" fill="#2A2A2A">A <tspan font-weight="700">debt manifest</tspan> with concrete numbers</text>
		<text x="60" y="935" font-size="13" fill="#2A2A2A">to attach to the Day-4 EWM tickets.</text>
		<text x="60" y="970" font-size="12" fill="#606060" font-style="italic">Duration: ~half a day to set up + populate fixtures.</text>
		</g>

		<g>
		<rect x="660" y="630" width="600" height="370" rx="12" fill="#FFFFFF" stroke="#E87722" stroke-width="2" filter="url(#softShadow)"/>
		<rect x="660" y="630" width="600" height="40" rx="12" fill="#E87722"/>
		<text x="960" y="656" font-size="16" font-weight="700" fill="white" text-anchor="middle">During the refactor (Days 2–5)</text>
		<text x="680" y="695" font-size="14" fill="#2A2A2A" font-weight="700">Every PR is gated by the harness:</text>
		<text x="680" y="720" font-size="13" fill="#2A2A2A">● <tspan font-family="monospace">build_mrr_kwargs()</tspan> lands → same diff</text>
		<text x="680" y="740" font-size="13" fill="#2A2A2A">● default-value pinning lands → some fixtures pass</text>
		<text x="680" y="760" font-size="13" fill="#2A2A2A">● _as_ints unification lands → more fixtures pass</text>
		<text x="680" y="780" font-size="13" fill="#2A2A2A">● QuickNXS-scale consolidation → all fixtures pass</text>
		<text x="680" y="815" font-size="14" fill="#2A2A2A" font-weight="700">Each green PR is earned:</text>
		<text x="680" y="840" font-size="13" fill="#2A2A2A">the refactor is not "done" until both paths</text>
		<text x="680" y="860" font-size="13" fill="#2A2A2A">produce <tspan font-weight="700">equivalent within tolerance</tspan> (or explicitly</text>
		<text x="680" y="880" font-size="13" fill="#2A2A2A">signed-off divergent) reductions.</text>
		<text x="680" y="915" font-size="14" fill="#2A2A2A" font-weight="700">Living contract:</text>
		<text x="680" y="940" font-size="13" fill="#2A2A2A">Each tension in slide 3 gets an explicit decision</text>
		<text x="680" y="960" font-size="13" fill="#2A2A2A">recorded in the fixture's expected-output file.</text>
		</g>

		<g>
		<rect x="1280" y="630" width="600" height="370" rx="12" fill="#FFFFFF" stroke="#6DA64F" stroke-width="2" filter="url(#softShadow)"/>
		<rect x="1280" y="630" width="600" height="40" rx="12" fill="#6DA64F"/>
		<text x="1580" y="656" font-size="16" font-weight="700" fill="white" text-anchor="middle">After the refactor (permanent)</text>
		<text x="1300" y="695" font-size="14" fill="#2A2A2A" font-weight="700">CI gate forever:</text>
		<text x="1300" y="720" font-size="13" fill="#2A2A2A">● Runs on every push and every PR</text>
		<text x="1300" y="740" font-size="13" fill="#2A2A2A">● Runs on every Mantid bump</text>
		<text x="1300" y="760" font-size="13" fill="#2A2A2A">● Runs when instrument geometry changes (IDF)</text>
		<text x="1300" y="795" font-size="14" fill="#2A2A2A" font-weight="700">Catches what slipped through:</text>
		<text x="1300" y="820" font-size="13" fill="#2A2A2A">Item 9e67585 (NX vs NY pixel clipping) would</text>
		<text x="1300" y="840" font-size="13" fill="#2A2A2A">have been caught the day it landed.</text>
		<text x="1300" y="860" font-size="13" fill="#2A2A2A">Item 2029db1 (stitching overwrites run #) too.</text>
		<text x="1300" y="895" font-size="14" fill="#2A2A2A" font-weight="700">Audit trail:</text>
		<text x="1300" y="920" font-size="13" fill="#2A2A2A">Each fixture pass/fail is a recorded event.</text>
		<text x="1300" y="940" font-size="13" fill="#2A2A2A">Scientists can point at a known-good reduction</text>
		<text x="1300" y="960" font-size="13" fill="#2A2A2A">date in the CI history when defending a paper.</text>
		</g>

		<!-- Footer -->
		<rect x="0" y="1020" width="1920" height="60" fill="#001A35"/>
		<text x="40" y="1058" font-size="13" fill="#6A90B0">Builds on existing pytest-qt / pytest-xvfb infrastructure — no new CI runner required. All fixture runs fit under the existing LFS quota.</text>
		<text x="1880" y="1058" font-size="13" fill="#6A90B0" text-anchor="end">Reflectometry Hack-a-thon 2026 · slide 8 of 8</text>
		</svg>
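
The comparison rules in the slide's "2. Compare R(Q) curves" box (Q matched bin-by-bin, R and dR within a relative tolerance, per-bin delta emitted as JSON on failure) could be sketched roughly as below. This is a minimal, stdlib-only illustration, not the harness itself: the function names `load_curve` and `compare_curves`, the assumption that the first three columns of the `.dat` file are Q, R and dR, and the default tolerances are all hypothetical. The real thresholds are to be agreed by the scientific lead, and the real file layout may differ.

```python
import json
import math


def load_curve(text):
    """Parse rows of whitespace-separated numbers into (Q, R, dR) tuples.

    Assumption: the first three columns are Q, R, dR and lines starting
    with '#' are header comments. The real .dat layout may differ.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        q, r, dr = (float(x) for x in line.split()[:3])
        rows.append((q, r, dr))
    return rows


def compare_curves(a, b, q_rtol=1e-9, r_rtol=1e-6):
    """Compare two curves bin by bin.

    Returns (passed, deltas). `deltas` lists only the offending bins,
    ready to be dumped as JSON. The tolerances are placeholders.
    """
    if len(a) != len(b):
        return False, [{"error": "bin count mismatch", "a": len(a), "b": len(b)}]
    deltas = []
    for i, ((qa, ra, da), (qb, rb, db)) in enumerate(zip(a, b)):
        ok = (
            math.isclose(qa, qb, rel_tol=q_rtol)
            and math.isclose(ra, rb, rel_tol=r_rtol)
            and math.isclose(da, db, rel_tol=r_rtol)
        )
        if not ok:
            deltas.append({"bin": i, "dQ": qb - qa, "dR": rb - ra, "ddR": db - da})
    return not deltas, deltas


if __name__ == "__main__":
    # Toy data standing in for the Path A (GUI) and Path B (autoreduce) outputs.
    path_a = load_curve("# Q R dR\n0.01 1.0 0.01\n0.02 0.5 0.02\n")
    path_b = load_curve("# Q R dR\n0.01 1.0 0.01\n0.02 0.6 0.02\n")
    passed, deltas = compare_curves(path_a, path_b)
    print("PASS" if passed else "FAIL")
    print(json.dumps(deltas, indent=2))
```

In a pytest-based harness, a function like `compare_curves` would be called once per fixture and the JSON written as a CI artifact when the assertion fails. The sample-log checksum on the slide (two_theta, lambda_min/max and so on) is not covered by this sketch.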