Implements the STALL_FLIT_PLAN.md Stage 1+2 math linking RAPS link loads
and slowdown factor to the Cassini hardware counter ratio:
(hni_tx_paused_0 + hni_tx_paused_1) / parbs_tarb_pi_posted_pkts
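As a rough illustration of the ratio above, here is a minimal sketch that assumes the three counters are available as plain integers. The function name `stall_ratio_from_counters` is hypothetical and is not the `compute_stall_ratio()` added in base.py. Returning 0.0 when no packets were posted is also an assumption.

```python
def stall_ratio_from_counters(
    hni_tx_paused_0: int,
    hni_tx_paused_1: int,
    parbs_tarb_pi_posted_pkts: int,
) -> float:
    """Ratio of paused-transmit counts to posted packets.

    Hypothetical helper for illustration only; the guard for a zero
    denominator is an assumed convention, not taken from the commit.
    """
    if parbs_tarb_pi_posted_pkts <= 0:
        # Assumption: no posted packets means no measurable stall.
        return 0.0
    return (hni_tx_paused_0 + hni_tx_paused_1) / parbs_tarb_pi_posted_pkts


if __name__ == "__main__":
    # 30 + 20 paused counts over 200 posted packets
    print(stall_ratio_from_counters(30, 20, 200))  # 0.25
```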
Key changes:
- base.py: add compute_stall_ratio(), compute_link_stall_packet_stats(),
  aggregate_link_stall_stats(); apply_job_slowdown now sets job.stall_ratio
- network/__init__.py: export new functions; add NetworkModel.compute_tick_stall_stats()
  using accumulated global_link_loads + mean packet size
- engine.py: collect per-job stall_ratios each tick; expose avg_stall_ratio,
  max_stall_ratio, total_posted_pkts, total_tx_paused in TickReturn/TickData;
  add stall_ratio_history time series
- job.py: JobStatistics records slowdown_factor and stall_ratio per completed job
- run_sim.py: write stall_ratio_history.parquet when network sim is active
- stats.py: avg_stall_ratio and max_stall_ratio added to Network Report
- config/frontier.yaml: add mean_packet_size_bytes=116, flit_size_bytes=64
- tests/unit/test_stall_ratio.py: 15 unit tests covering all new functions
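
The two values added to config/frontier.yaml can be related to link loads roughly as follows. This is a sketch under stated assumptions, not the code in this change: the helper names are hypothetical, link load is assumed to be in bytes, and rounding flits up to a whole number is an assumption.

```python
import math

# Values taken from config/frontier.yaml in this change.
MEAN_PACKET_SIZE_BYTES = 116
FLIT_SIZE_BYTES = 64


def estimate_posted_pkts(
    link_load_bytes: float,
    mean_packet_size_bytes: int = MEAN_PACKET_SIZE_BYTES,
) -> float:
    """Estimate packets carried by a link from its byte load (hypothetical helper)."""
    return link_load_bytes / mean_packet_size_bytes


def flits_per_packet(
    mean_packet_size_bytes: int = MEAN_PACKET_SIZE_BYTES,
    flit_size_bytes: int = FLIT_SIZE_BYTES,
) -> int:
    """Flits needed for one mean-sized packet, rounded up (assumed convention)."""
    return math.ceil(mean_packet_size_bytes / flit_size_bytes)


if __name__ == "__main__":
    print(estimate_posted_pkts(1160))  # 10.0
    print(flits_per_packet())          # 2
```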
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>