Commit c7288a35 authored by Morales Hernandez, Mario

Fix negative "Other" time in performance

When WRITE_PERFORMANCE is enabled, SIMULATION_TIME and TOTAL_TIME were
stopped before asynchronous GPU operations (gpuMemcpyAsync) and MPI calls
completed. This caused component timers (COMPUTE_TIME, SWMM_TIME, etc.)
to continue accumulating after the parent timers had already stopped,
resulting in their sum exceeding SIMULATION_TIME and producing negative
"Other" time values.
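
The arithmetic behind the negative residual can be sketched as follows. This is a minimal illustration, assuming "Other" is reported as the parent timer minus the sum of its component timers; the helper name `other_time` and the sample values are hypothetical and not taken from the Triton source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from the Triton source): the residual "Other"
// time is assumed to be the parent timer minus the sum of its components.
double other_time(double simulation_time, const std::vector<double>& components) {
  double sum = 0.0;
  for (double c : components) {
    assert(c >= 0.0);  // elapsed times are never negative
    sum += c;
  }
  return simulation_time - sum;
}
```

If the parent timer reads 10.0 s and the components read 4.0 s and 3.0 s, the residual is 3.0 s. If the components keep accumulating to 7.0 s and 4.5 s after the parent has stopped at 10.0 s, the residual becomes negative.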

Added gpuStreamSynchronize(streams) and, when size > 1,
MPI_Barrier(ENSIFY_COMM_WORLD) before stopping SIMULATION_TIME and
TOTAL_TIME in two locations:
1. Before WRITE_PERFORMANCE checkpoint (line 2131)
2. Before final simulation end (line 2177)

This ensures all outstanding GPU and MPI work completes before timers
are stopped, maintaining proper timer hierarchy and eliminating the
negative residual issue.
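
The synchronize-before-stop pattern can be illustrated without a GPU or MPI runtime. The sketch below is an analogy only: it uses `std::async` as a stand-in for asynchronous GPU or MPI work, and the names `launch_async_work` and `parent_elapsed_ms` are hypothetical. It shows that a parent timer stopped without waiting misses the time still being spent by outstanding work.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an asynchronous GPU copy or MPI exchange:
// the work runs on another thread and takes at least `ms` milliseconds.
std::future<void> launch_async_work(int ms) {
  assert(ms >= 0);
  return std::async(std::launch::async, [ms] {
    std::this_thread::sleep_for(std::chrono::milliseconds(ms));
  });
}

// Returns the elapsed milliseconds recorded by the parent timer.
// With wait_first == true the parent waits for the outstanding work
// before stopping, which plays the role of the synchronization calls.
double parent_elapsed_ms(int work_ms, bool wait_first) {
  auto start = Clock::now();
  std::future<void> pending = launch_async_work(work_ms);
  if (wait_first) {
    pending.wait();  // analogous to synchronizing before stopping the timer
  }
  auto stop = Clock::now();
  pending.wait();  // always let the work finish before returning
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `parent_elapsed_ms(300, false)` returns almost immediately with a small elapsed time, while `parent_elapsed_ms(300, true)` reports the full duration of the outstanding work.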
parent be56b7c0
+11 −1
@@ -2128,6 +2128,11 @@ namespace Triton
        st.stop(IO_TIME);

        #if WRITE_PERFORMANCE
          // Synchronize before stopping timers to ensure all async operations complete
          gpuStreamSynchronize(streams);
          if (size > 1) {
            MPI_Barrier(ENSIFY_COMM_WORLD);
          }
          st.stop(SIMULATION_TIME);
          st.stop(TOTAL_TIME);
          out.write_times(st, print_id);
@@ -2164,6 +2169,11 @@ namespace Triton
    }


    // Synchronize before stopping timers to ensure all async operations complete
    gpuStreamSynchronize(streams);
    if (size > 1) {
      MPI_Barrier(ENSIFY_COMM_WORLD);
    }
    st.stop(SIMULATION_TIME);
    st.stop(TOTAL_TIME);