Commit 03b5fbe7 authored by Vacaliuc, Bogdan

plan: review pdf-tools/ fitness-for-purpose for proposal documents



The pdf-tools/ directory inherited on this branch was built for short
technical investigation reports (DANGLE, S3-Gap, tthd style) on the
instrument-motion-investigations branch. Proposal documents have
materially different needs: sponsor template compliance, page limits,
bibliographies, figure captions, multi-author track-change workflows,
and PDF/A and accessibility requirements.

This plan defers the actual evaluation to a future session — including
inventory of proposal-package needs, a representative test run of the
current tool, a survey of alternatives (Pandoc+LaTeX, Quarto, Typst,
Word pipeline), and a written decision-matrix recommendation. Scoped
deliberately as "review and recommend", not "refactor".

Cross-referenced with the related plan deferred on the
quicknxsv2-modularization branch, which raises the same question for
that use case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 110872a6
+128 −0
# Review pdf-tools/ fitness-for-purpose for proposal documents

**Status**: open — needs evaluation and decision before relying on
`pdf-tools/md2pdf.py` for proposal deliverables on this branch.

## Why this needs reviewing

The `pdf-tools/` directory in this branch (and in
`instrument-motion-investigations`) was originally built to convert
**investigation reports** (DANGLE, S3-Gap, tthd, hs-HLS Motion-Failure
analyses) into print-quality PDFs. The inputs are short technical
Markdown files (~30 KB) with code blocks, tables, and an amber
executive-summary callout; the outputs (~110 KB, ~15 pages) are
suitable for forwarding to instrument scientists over email/Teams.

Proposal documents have **different requirements**:

- US Government / DOE proposal templates with strict formatting:
  page limits, margin requirements, font conventions, page-numbering
  rules, line-spacing constraints
- Embedded figures with sub-captions, equations, structured
  bibliographies, cross-referenced sections
- Required artefacts (cover pages, biographical sketches, current and
  pending support, budget tables) that the investigation-report
  workflow has no concept of
- Multi-author content with track-change handoffs, often originating
  in Word or LaTeX rather than vanilla Markdown
- Output files that often need to match a sponsor-specified PDF/A
  profile, embed all fonts, and meet specific accessibility
  requirements

The investigation-report workflow's "Markdown in → opinionated PDF
out" pipeline does **not** address any of these. Treating it as the
proposal pipeline by default risks producing deliverables that the
sponsor rejects on formatting compliance alone.
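
The output-file requirements in the last bullet above can be given a crude first-pass check by scanning a generated PDF for well-known byte markers. The sketch below is an assumption-laden heuristic, not a validator: `scan_pdf_bytes`, `scan_pdf`, and the `MARKERS` table are hypothetical names introduced here, and a `False` result is inconclusive because PDFs that use compressed object streams can hide these keys from a raw byte search. A `True` for `pdfa_claim` only means the file *claims* PDF/A conformance in its XMP metadata, not that it conforms.

```python
from pathlib import Path

# Heuristic byte markers (assumptions, not a conformance test):
#   /FontFile, /FontFile2, /FontFile3 -> an embedded font program is referenced
#   pdfaid:part                       -> XMP metadata claims a PDF/A part
#   /StructTreeRoot                   -> a tagged-PDF structure tree is present
MARKERS = {
    "embedded_fonts": (b"/FontFile",),
    "pdfa_claim": (b"pdfaid:part",),
    "tagged_structure": (b"/StructTreeRoot",),
}


def scan_pdf_bytes(data: bytes) -> dict:
    """Report which markers appear anywhere in the raw PDF bytes."""
    return {
        name: any(marker in data for marker in markers)
        for name, markers in MARKERS.items()
    }


def scan_pdf(path) -> dict:
    """Convenience wrapper: read a file from disk and scan it."""
    return scan_pdf_bytes(Path(path).read_bytes())
```

Any marker reported as missing is a prompt to run a dedicated PDF/A or accessibility validator on the file, not a verdict on the tool.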

## What "review" means here

This plan is intentionally scoped as **review and recommend** — not
"refactor pdf-tools" or "switch to a different tool." The deliverable
of this plan is a written recommendation, with evidence, that the user
can act on.

### Evaluation steps

1. **Inventory the actual proposal-document needs** for the proposals
   active in `/media/ssd2/Projects/Radiasoft/`. Read 2–3 representative
   submission packages:
   - What template does the sponsor require?
   - What sections, page limits, formatting rules apply?
   - What's the source-of-truth format (Word? LaTeX? Markdown?
     Confluence?)
   - What tools are the other proposal authors using?

2. **Test the existing `pdf-tools/md2pdf.py`** against one realistic
   proposal section. Note specifically:
   - Does it honour page limits / forced page breaks?
   - Can it render figures with sponsor-required captions?
   - Does it support bibliographies and cross-references?
   - Does the output PDF meet the metadata requirements (PDF/A
     compliance, embedded fonts, accessibility tags)?

3. **Survey alternatives** that proposal-writing tooling typically
   draws from. Non-exhaustive starting list:
   - **Pandoc + LaTeX template** — the academic / DOE standard;
     handles bibliographies, equations, page rules natively
   - **Quarto** — modern Markdown-to-PDF with publication features
     (figures, captions, cross-refs, sponsor templates exist for
     several agencies)
   - **Typst** — newer, faster than LaTeX, good Markdown-ish syntax,
     templates emerging
   - **Word + a controlled Markdown→Word pipeline** — if the sponsor
     accepts .docx and other authors are in Word, fighting the
     workflow may be worse than joining it

4. **Decision matrix**: rank options against the proposal needs from
   step 1, weighting:
   - Sponsor-template compliance (must-have)
   - Figure/caption/bibliography quality
   - Track-change / multi-author workflow compatibility
   - Maintenance burden (who keeps templates current as sponsor specs
     change?)
   - Output stability across runs (same input → byte-identical output?)

5. **Recommend** one of:
   - Keep `pdf-tools/md2pdf.py` (with documented limitations)
   - Replace with one of the surveyed alternatives, with a one-pager
     migration plan
   - Use a hybrid (e.g., authoring in Markdown for drafts, switching
     to Word/LaTeX for final submission)
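
The decision matrix in step 4 can be kept honest by computing it rather than eyeballing it. The following is a minimal sketch under stated assumptions: the `WEIGHTS` values, the 0-5 scoring scale, and the names `Option`, `weighted_score`, and `rank` are all hypothetical placeholders, not part of this plan or of `pdf-tools/`. Sponsor-template compliance is treated as a gate that excludes an option outright, since the plan marks it as a must-have rather than a weighted criterion.

```python
from dataclasses import dataclass

# Placeholder weights for the non-gating criteria from step 4.
# These numbers are assumptions; the real weights come from step 1.
WEIGHTS = {
    "figures_captions_bibliography": 3,
    "multi_author_workflow": 3,
    "maintenance_burden": 2,
    "output_stability": 1,
}


@dataclass
class Option:
    name: str
    template_compliant: bool  # must-have: False removes the option entirely
    scores: dict  # criterion name -> score from 0 (poor) to 5 (excellent)


def weighted_score(option: Option) -> int:
    """Sum of weight * score over every weighted criterion."""
    return sum(WEIGHTS[c] * option.scores[c] for c in WEIGHTS)


def rank(options: list) -> list:
    """Drop options that fail the must-have gate, then sort best-first."""
    eligible = [o for o in options if o.template_compliant]
    scored = [(o.name, weighted_score(o)) for o in eligible]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Illustrative placeholder data only; not evaluation results.
    demo = [
        Option("option-a", True, {c: 2 for c in WEIGHTS}),
        Option("option-b", False, {c: 5 for c in WEIGHTS}),
        Option("option-c", True, {c: 4 for c in WEIGHTS}),
    ]
    for name, score in rank(demo):
        print(f"{name}: {score}")
```

Keeping the gate separate from the weights means a high score on the secondary criteria can never compensate for failing template compliance.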

## What pdf-tools/md2pdf.py currently does

(Quick inventory so the next session doesn't re-do discovery.)

Per the `pdf-tools/README.md` and `md2pdf.py` in this branch:
- `uv`-managed Python project
- Wraps `markdown` + a CSS-styling step + `weasyprint` (or similar)
  to produce PDFs
- Designed for the short-technical-report use case described in
  `instrument-motion-investigations/CLAUDE.md`'s
  "Investigation-report PDF workflow" section
- Output is opinionated (title page, TOC, callout styles) — not a
  generic PDF generator, more of a single-purpose template engine

## Related context

- Sister branch `instrument-motion-investigations` — where
  `pdf-tools/` originated and where it's *known* fit-for-purpose;
  reading its CLAUDE.md "Investigation-report PDF workflow" section
  shows the intended use case, which contrasts sharply with proposal
  needs.
- Branch `quicknxsv2-modularization` — also has this plan deferred
  for similar reasons (different use case, same question of whether
  the existing tool is the right tool).
- The user's preference (per `~/.claude/CLAUDE.md` `[ALWAYS] Design
  framing` section) is **robust over simplest** — a hasty "yes
  pdf-tools is fine" without doing the audit risks technical-debt
  cleanup later, which the user explicitly rejects.

## Suggested next session

1. Skim this file and its references.
2. Spend 30 minutes with one realistic proposal package in
   `/media/ssd2/Projects/Radiasoft/` to inventory needs.
3. Run `pdf-tools/md2pdf.py` against one chunk of proposal content.
4. Write the decision-matrix recommendation back into this file (or a
   sibling `pdf-tools-fitness-recommendation.md`).
5. Bring the recommendation to the user before doing any tool swap.
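
When running the tool in step 3, the "byte-identical output" criterion from the decision matrix can be checked by converting the same input twice and comparing file hashes. This is a minimal sketch: `sha256_of` and `outputs_identical` are hypothetical helper names, and the sketch deliberately does not invoke `md2pdf.py`, because its command-line interface is not described in this plan. It assumes the two PDFs have already been produced by separate runs.

```python
import hashlib
from pathlib import Path


def sha256_of(path) -> str:
    """Hex SHA-256 digest of a file's full contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def outputs_identical(first, second) -> bool:
    """True when two output files are byte-for-byte identical."""
    return sha256_of(first) == sha256_of(second)
```

A mismatch does not necessarily mean the rendered pages differ: PDF generators commonly embed creation timestamps and document IDs, so a failed comparison is a cue to check whether the tool offers a way to pin those fields.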