Large memory consumption in BufferSTL when using PerformPuts and Flush instead of Begin/EndStep (BP4)
Created by: franzpoeschel
During performance evaluations of the ADIOS2 backend in the openPMD API, I noticed unusually large heap memory consumption in non-streaming workflows (in this case with the BP4 engine) and traced it back to memory not being freed from the marshalling buffer (`BufferSTL`).
Since openPMD's iterations cannot be easily modeled using the ADIOS2 step concept, this backend only uses steps for streaming engines. For disk-based engines, we use `Engine::PerformPuts` and `Engine::Flush` instead. The documentation for the latter method says:
> Manually flush to underlying transport to guarantee data is moved
I read this as implying that ADIOS no longer holds the data after this call, but I am not sure that is what is meant. The figure below shows the memory trace from a small example that writes 30 openPMD iterations from PIConGPU to disk, with several flushes per iteration. The memory buildup does not appear when using ADIOS steps.
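To make the distinction between "data is moved" and "memory is freed" concrete, here is a minimal standard-library sketch. It assumes that the marshalling buffer is backed by a `std::vector<char>`, which I have not verified for `BufferSTL`. `ToyBuffer`, its methods, and `CapacityAfterFlush` are hypothetical names made up for this illustration, not ADIOS2 API. The sketch only shows that a flush which resets the logical size can leave the allocation in place. It does not by itself explain a buffer that keeps growing across iterations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vector-backed marshalling buffer.
// This is NOT ADIOS2 code; every name here is made up for illustration.
struct ToyBuffer
{
    std::vector<char> data;

    // Append `bytes` bytes of payload, as a deferred write might.
    void Put(std::size_t bytes) { data.insert(data.end(), bytes, 'x'); }

    // Flush variant A: hand the bytes to the transport (omitted here),
    // then reset the logical size. clear() leaves the capacity unchanged,
    // so the heap allocation is kept.
    void FlushKeepAllocation() { data.clear(); }

    // Flush variant B: same, but also give the allocation back by
    // swapping with an empty temporary vector.
    void FlushAndRelease() { std::vector<char>().swap(data); }
};

// Returns the capacity still held after writing `bytes` and flushing.
inline std::size_t CapacityAfterFlush(std::size_t bytes, bool release)
{
    ToyBuffer buf;
    buf.Put(bytes);
    assert(buf.data.capacity() >= bytes);
    if (release)
        buf.FlushAndRelease();
    else
        buf.FlushKeepAllocation();
    return buf.data.capacity();
}
```

Calling `CapacityAfterFlush(n, false)` returns at least `n`, while `CapacityAfterFlush(n, true)` returns the capacity of a freshly constructed vector. If `Engine::Flush` behaves like variant A, the heap footprint would stay at its high-water mark even though the data has reached the transport.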
Is my understanding of `Engine::Flush` correct? If so, calling it should free the buffer. If not, is there a recommended way to write without ADIOS steps that does not build up heap memory as described above?