row/column major considerations
Created by: germasch
As @pnorbert seems to be working on BP4's on-disk representation (and dumping it), here's a wishlist item that I think would be worth considering:
Currently, ADIOS2 decides between row-major and col-major layout based on which API is invoked (C++/C: row-major, Fortran: col-major). There are some considerations with that, though. E.g., some (many?) projects are switching parts of their code/kernels to C/C++/CUDA, and as that transition continues, they may want to be able to write data in a single app from both languages. For example, they might want to write Fortran-layout data using the C++ API.
Also, I'd argue that C++'s data layout isn't natively row-major; it's mostly natively nonexistent (except for compile-time fixed-dimension arrays). Many of the libraries that do multi-d arrays (boost, xtensor, Kokkos) support both layouts, as does Python. So even in a native C++ application, one might use and want to write col-major data. Some apps may be using a mix of row-major and col-major layouts for performance reasons.
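To make the distinction concrete, here's a minimal sketch in plain Python (no ADIOS2 involved) of how the same logical 2x3 array ends up in memory under the two layouts. The `flatten` helper and the `"C"`/`"F"` order names are just illustrative choices, not anything from the ADIOS2 API.

```python
def flatten(array, order):
    """Flatten a 2-D nested list in row-major ("C") or col-major ("F") order."""
    rows, cols = len(array), len(array[0])
    if order == "C":
        # Row-major: the last index varies fastest.
        return [array[i][j] for i in range(rows) for j in range(cols)]
    if order == "F":
        # Col-major: the first index varies fastest.
        return [array[i][j] for j in range(cols) for i in range(rows)]
    raise ValueError(f"unknown order: {order!r}")


a = [[1, 2, 3],
     [4, 5, 6]]

row_major = flatten(a, "C")
col_major = flatten(a, "F")
print(row_major)
print(col_major)
```

The logical array is identical in both cases; only the order of elements in the flat buffer differs, which is why the writer and the reader have to agree on the layout somehow.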
So I think there's some work that could be done on the API side that lets the application specify the data layout it wants to use for a given variable. But that's something that can be added over time relatively easily. If supporting mixed layouts is an eventual goal, though, the output format needs to be able to handle it, too. I believe BP3 keeps a global layout flag (well, maybe per PG or something), so I think it may not be able to handle mixed layouts well without some kind of compatibility issues, and that's fine. But for BP4, it might be useful to consider this case now, while the binary format is still in flux. It'd probably be easy to add a row-major / col-major flag per Variable.
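As a rough illustration of what a per-Variable flag would buy, here's a sketch in plain Python. To be clear, `VariableRecord`, its `layout` field, and `read_element` are all hypothetical names made up for this example; this is not the BP3/BP4 format and not the ADIOS2 API, just the general idea of metadata that carries the layout alongside each variable.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VariableRecord:
    """Hypothetical per-variable metadata plus its flat data buffer."""
    name: str
    shape: Tuple[int, ...]
    layout: str          # "row-major" or "col-major" (hypothetical flag)
    data: List[float]


def linear_offset(shape, index, layout):
    """Map a multi-dimensional index to a flat offset for the given layout."""
    dims = list(zip(shape, index))
    if layout == "col-major":
        dims.reverse()   # the first index varies fastest
    elif layout != "row-major":
        raise ValueError(f"unknown layout: {layout!r}")
    offset = 0
    for extent, i in dims:
        offset = offset * extent + i
    return offset


def read_element(record, index):
    """Look up one logical element, honoring the record's own layout flag."""
    return record.data[linear_offset(record.shape, index, record.layout)]


# Two variables with the same logical contents but different layouts,
# stored side by side in the same (pretend) output.
pressure = VariableRecord("pressure", (2, 3), "row-major", [1, 2, 3, 4, 5, 6])
density = VariableRecord("density", (2, 3), "col-major", [1, 4, 2, 5, 3, 6])

print(read_element(pressure, (0, 1)), read_element(density, (0, 1)))
```

With the flag stored per variable, a reader can handle both records without needing a file-wide (or per-PG) layout setting.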
[One could also consider a more flexible approach that allows for general strides, e.g., for the case that some array dimensions used in an application have been padded for alignment.]
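A minimal sketch of the strided idea, again in plain Python and with made-up names (`element_offset`, `padded_row_length`): each dimension gets a stride, counted in elements here rather than bytes, and the flat offset is the dot product of index and strides. Dense row-major and col-major then become special cases, and padding is just a larger stride.

```python
def element_offset(index, strides):
    """Flat offset (in elements) of a multi-dimensional index given per-dimension strides."""
    return sum(i * s for i, s in zip(index, strides))


shape = (3, 5)                      # logical 3x5 array

# Dense layouts are special cases of the general strided description.
row_major_dense = (shape[1], 1)     # (5, 1)
col_major_dense = (1, shape[0])     # (1, 3)

# Row-major, but each row padded from 5 to 8 elements for alignment.
padded_row_length = 8               # assumed padding, for illustration only
row_major_padded = (padded_row_length, 1)

index = (2, 1)
offsets = {
    "row-major dense": element_offset(index, row_major_dense),
    "col-major dense": element_offset(index, col_major_dense),
    "row-major padded": element_offset(index, row_major_padded),
}
print(offsets)
```

The trade-off is that strides are more metadata to store and validate than a single flag, so a per-Variable row-major / col-major flag would likely remain the simpler first step.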