1. If jobs appear enqueued but never run, enable JSONL and check:
   - `event=ENQUEUED` per site
   - `snapshot.meta.waiting_jobs` and `snapshot.sites[site].queued_jobs`
2. If the queue grows without bound, increase `--max-waiting` or slow submissions with a larger `--submit-interval`.
3. If a site starves, switch to `--dispatch-policy round_robin` and reduce `--node-choices`.
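
The checks in step 1 can be scripted. The sketch below is only an illustration: it assumes each JSONL line is a JSON object with a top-level `event` key, a `site` key on enqueue records, and an optional nested `snapshot` object. Only the names `ENQUEUED`, `snapshot.meta.waiting_jobs`, and `snapshot.sites[site].queued_jobs` come from the notes above; the `SNAPSHOT` event name, the site names, and the helper `summarize_jsonl` are hypothetical.

```python
import json
from collections import Counter


def summarize_jsonl(lines):
    """Count ENQUEUED events per site and keep the most recent snapshot values.

    Assumed (not confirmed) record layout: {"event": ..., "site": ...,
    "snapshot": {"meta": {...}, "sites": {...}}}.
    """
    enqueued = Counter()
    waiting_jobs = None
    queued_by_site = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        record = json.loads(raw)
        if record.get("event") == "ENQUEUED":
            enqueued[record.get("site", "<unknown>")] += 1
        snapshot = record.get("snapshot")
        if isinstance(snapshot, dict):
            # Later snapshots overwrite earlier ones.
            waiting_jobs = snapshot.get("meta", {}).get("waiting_jobs")
            queued_by_site = {
                site: info.get("queued_jobs")
                for site, info in snapshot.get("sites", {}).items()
            }
    return enqueued, waiting_jobs, queued_by_site


# Hypothetical sample records; real logs may use a different layout.
sample = [
    '{"event": "ENQUEUED", "site": "site_a"}',
    '{"event": "ENQUEUED", "site": "site_a"}',
    '{"event": "ENQUEUED", "site": "site_b"}',
    '{"event": "SNAPSHOT", "snapshot": {"meta": {"waiting_jobs": 3}, '
    '"sites": {"site_a": {"queued_jobs": 2}, "site_b": {"queued_jobs": 1}}}}',
]

enqueued, waiting_jobs, queued_by_site = summarize_jsonl(sample)
print(dict(enqueued), waiting_jobs, queued_by_site)
```

To run it against a real log, pass an open file handle instead of `sample`. If a site shows `ENQUEUED` events while its `queued_jobs` count never drops, that site's jobs are likely being enqueued but not dispatched.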
## Known Environment Constraint
Creating multiprocessing semaphores can fail inside a sandbox. If a `PermissionError` is raised when semaphores are created, rerun outside the sandbox (escalated execution).
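
A quick preflight check can confirm whether the current environment is affected before starting a full run. This is a minimal sketch using only the standard library. The helper name `can_create_semaphore` is hypothetical and not part of this project.

```python
import multiprocessing


def can_create_semaphore():
    """Return True if a multiprocessing semaphore can be created here."""
    try:
        multiprocessing.Semaphore(1)
    except (OSError, ImportError) as exc:
        # PermissionError is a subclass of OSError; ImportError covers
        # platforms without a working sem_open implementation.
        print(f"Semaphore creation failed: {exc!r}")
        return False
    return True


if __name__ == "__main__":
    if can_create_semaphore():
        print("Semaphores are available.")
    else:
        print("Rerun outside the sandbox (escalated execution).")
```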