adios2 wishlist, suggestions and concerns
Created by: germasch
To preface all of this, I like ADIOS2. I think it provides a good and fairly easy to use API to isolate users from a lot of the intricacies of high-performance I/O and beyond that provides a good interface for streaming data / coupling codes. I also like the way the code is written and organized.
I say this because what comes below may be perceived as criticism (and I guess some of it is), but it's not meant in a negative way; it's meant to help improve certain things. Some issues may just be personal wishlist items; others are things where I think the current design should be improved. Many of these issues I've been sitting on for a while: as a newcomer to the project, I didn't want to appear to come in and start criticizing stuff all over, so previously I focused on smaller issues and on trying to fix them. As might be clear to people who've been following this, I wasn't all that successful in contributing more than obvious localized fixes. I know you guys have a meeting coming up, so I figured I'd put this list out there, as it might be useful in discussing future plans and priorities.
Let me note that the below is informed by my own personal views and experiences. I'll try to focus on things which I believe to be generally useful. I obviously don't know all of your use cases / apps, though. I'll try to be brief. If you would like clarification, more detail or a discussion on a particular topic, I think it'd be best to open a separate issue on github for that topic.
It appears that ADIOS2 doesn't have a well-defined approach to thread safety at this time. Another way to put this: if you want to use ADIOS2 from a threaded application, make sure you always call it from the master thread only. (I do see some locking in the code, but it's in the engines, and I don't think that's sufficient.) The other aspect is that ADIOS2 itself can spawn threads to achieve better performance; there is some code which does so, and that looks fine to me. In the age of high on-node parallelism, I think it'd be useful for ADIOS2 to support use in a threaded application. For example, I have my domain divided into a number of patches that all exist within the same MPI process. I can process these patches in parallel using OpenMP, and it'd be nice if each thread could also hand its respective patch of the domain to ADIOS2 directly. (The ability to handle multiple blocks is there, but I don't think it's safe to use it in a threaded context.)
Making things thread-safe is a good amount of work, so I doubt that's going to happen soon (nor do I think it's an essential feature at this time). However, implementing this later will be hard, because the API doesn't appear to be designed with it in mind. Example: `var.SetSelection(...); engine.Put(var, ...)`. The selection is stored on the variable itself, so two threads working on different blocks of the same variable would race on that shared state.
There is no easy way for a C++ application to keep track of a collection of variables or attributes, due to the fact that there are 13 types out there, so one needs not one but 13 containers, and unless one wants to handle those by hand, a bunch of macro-generated if-else chains is needed. The core code at least has a base class which makes it easier to use a single container for keeping track of things, but that's not exposed in the interface. Even with that, though, using variables / attributes generically is still a hassle. The best workaround I can think of is to use a list of names instead, but that still requires a lookup every time you want to use something, and a macro-generated if-else chain to handle the different types.
Due to its use of STL containers, ADIOS2 is good about cleaning everything up when objects are destroyed. However, it also, unfortunately, freely hands out references/pointers to internal objects to the user, and these become invalid once the underlying objects are destroyed. That leads to (arguably buggy) user code crashing, or worse. Other libraries in this realm use reference counting for good reason.
While one might argue that the issue above (references to internal objects becoming invalid) is a "buggy application" issue, it also causes the same problem in Python, where I don't think crashing is ever acceptable. `mpi4py` doesn't crash when an object goes away, even if the user does something unanticipated. With the `adios2` interface, it's easy to get the whole Python process to segfault.
There are at least some API inconsistencies between the Python API and the C++ API. For example, it's not clear to me that `var = io.DefineVariable("varName")` creates a string dataset. Other overloads don't take types, either, but rather actual numpy arrays -- not because they're actually using them, but just to get the type from them.
WISHLIST ITEM: I think one of the greatest things about HDF5 is h5py. Specifically, its lazy evaluation features (or whatever it's called): "opening" a 20 GB particle dataset doesn't actually read in the data; `dset = h5file["particles"]` just returns a proxy object. Only once you access, e.g., `dset[10000:20000]` does it read in the particular data you're actually using from the HDF5 file. ADIOS2 has fundamentally the same capability, but I believe it's only accessible through the low-level Python interface, and hence much less convenient to use.
Overall, this is actually pretty standard, which is a good thing in my book.
There are some inconsistencies in naming the functions compared to the C++ API, and some missing functionality.
Every Fortran API I've seen for projects actually implemented in C/C++ uses the same approach, i.e., simple Fortran wrappers that call through to implementations in C, with simple opaque handles. For example, that's how MPI, HDF5, and PETSc do it. ADIOS2 has a rather different Fortran interface, based on Fortran types replicating info also stored in the C++ core. Unfortunately, one has to be very careful for things to not get out of sync, and there are numerous fixed-size buffers not protected against overflow in the current implementation. I don't see a way to achieve proper C/C++ -- Fortran interoperability with the current interface. That is something that other libraries (like MPI, HDF5, PETSc) have: you can open a file in C, write some data, then call Fortran to write some other data to the same file. As codes transition kernels out of Fortran, e.g., to C/C++ to use CUDA, it'd be nice to have the same ADIOS2 objects accessible in both languages.
WISHLIST: As we all know, no one ever checks error codes (not even the included tests do). This affects both the Fortran and the C interface. I think it'd be very useful to have an internal error handler (default on) that stops the program when an error happens, rather than have things silently go wrong. MPI has chosen to do that by default, and I think it's very helpful.
Essentially the same issue as with the C++11 interface exists in the core: a lot of macro-generated glue code makes things kinda messy to deal with just about everywhere (in the core, in engines, etc.). I count > 200 invocations of these type-expansion macros.
It'd be nice to have all open engines closed automatically (i.e., `Close()` called) on destroying the underlying IO or ADIOS object. Because the way it looks to me right now, if you forget, not all of your data will make it to disk.
`core::Variable` acts in two capacities, which should have been designed separately: one is to interface with the user via the IO API; the other is to keep track of per-engine state, e.g., `m_BlocksSpan`. This demonstrably leads to buggy behavior when using more than one Engine per IO.
As should be already obvious, I have my own opinions on the Span interface. I'll open another issue for that, since it's complicated.
Supporting a hierarchical organization of Variables. I guess I already know this is a touchy topic. I don't think there is any argument that it's a useful way to organize things. People use hierarchies every day, on computers, in email, in filing cabinets, etc. It's not perfect, but it's better than everything in one single folder. It's a superset of a flat namespace, so it doesn't prevent you from doing things in a flat namespace if you prefer. Not having a hierarchy makes it harder to migrate to ADIOS2 from another file format (I'd argue even numpy load/save supports a hierarchy by means of the underlying file system structure). I found myself implementing a hierarchy on top of ADIOS2 on day 1 of using it. Standards like OpenPMD don't directly map to ADIOS2 because they expect to be able to use a hierarchy. Creating a hierarchy yourself can be done quite easily for writing, but reading is much less obvious. Regexps are no replacement, for one because you first have to discover what common path prefix you even want to match, and even then, queries are going to be O(N) at best. Asking users to implement this outside of the library is, I think, (a) going to lead to incompatible ways that people do it, (b) duplicated effort, and (c) not actually even possible to do right without rescanning the global namespace every time: say you advance to a new timestep, where a new variable got added. There is no way an application will be notified of that, so how would it add this new variable to its proper place in the externally maintained hierarchy?
What I will say here is that it's non-trivial to add support for a hierarchical organization of variables at this point, because of design choices which were made earlier.