When WRITE_PERFORMANCE is enabled, SIMULATION_TIME and TOTAL_TIME were being stopped before asynchronous GPU operations (gpuMemcpyAsync) and MPI calls had completed. Component timers (COMPUTE_TIME, SWMM_TIME, etc.) therefore kept accumulating after the parent timers had stopped, so their sum exceeded SIMULATION_TIME and the reported "Other" time came out negative.

Added gpuStreamSynchronize(streams) and MPI_Barrier(ENSIFY_COMM_WORLD) before stopping SIMULATION_TIME in two locations:

1. Before the WRITE_PERFORMANCE checkpoint (line 2131)
2. Before the final simulation end (line 2177)

This ensures all outstanding GPU and MPI work completes before the timers are stopped, which preserves the timer hierarchy and eliminates the negative residual.
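For reviewers who want to see the ordering problem in isolation, here is a minimal host-only sketch. It is not the code from this change: `Timer` and `residual_other_time` are hypothetical names, and a `std::async` task stands in for the in-flight gpuMemcpyAsync/MPI work, with `work.wait()` playing the role of gpuStreamSynchronize plus MPI_Barrier.

```cpp
#include <chrono>
#include <future>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical accumulating timer, standing in for SIMULATION_TIME or COMPUTE_TIME.
struct Timer {
    Clock::time_point begin{};
    double seconds = 0.0;
    void start() { begin = Clock::now(); }
    void stop() {
        seconds += std::chrono::duration<double>(Clock::now() - begin).count();
    }
};

// Runs one simulated step and returns the "Other" residual:
// parent time minus component time.
double residual_other_time(bool sync_before_stop) {
    Timer simulation;  // parent timer
    Timer compute;     // component timer nested inside the parent

    simulation.start();
    compute.start();

    // Assumption: this task is a stand-in for asynchronous GPU/MPI work.
    std::future<void> work = std::async(std::launch::async, [] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
    });

    if (sync_before_stop) {
        work.wait();        // drain outstanding work first
        compute.stop();     // component stops inside the parent's window
        simulation.stop();
    } else {
        simulation.stop();  // parent stops while work is still in flight
        work.wait();
        compute.stop();     // component keeps accumulating past the parent
    }
    return simulation.seconds - compute.seconds;
}
```

With `sync_before_stop` set to false the component timer outlives its parent and the residual goes negative, which mirrors the reported symptom. With it set to true the component interval is nested inside the parent interval, so the residual cannot drop below zero.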