Use single metadata block for fixed-shaped variable
Created by: rtobar
Hi,
For some time we have been experimenting with ADIOS2 as an I/O storage backend. While in principle it works fine, we are seeing a large file size overhead given the type of data we store.
In our data model we have a "column" composed of "cells". The latter have 2 dimensions. We use a single 3D ADIOS2 variable to store the column, and use adiosVar.Write to write data into each of the cells with the appropriate Selection specification. Our Write calls are small, though: each one writes a 4x1 array of single-precision complex values, for a total of 32 bytes. This creates a large overhead in the resulting BP files, which use about 20 times the storage that would be required if the data were stored sequentially. When experimenting with a simpler one-dimensional 4-element array of complex values, the overhead goes down to ~10x.
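To put those ratios in perspective, here is a rough back-of-the-envelope sketch in C++. It does not use ADIOS2 at all; the helper name `implied_overhead_bytes` is made up for illustration, and it assumes that all of the extra file size is paid once per Write call, which I have not verified against the BP format.

```cpp
#include <complex>
#include <cstddef>

// Payload of one Write call: a 4x1 array of single-precision complex values.
constexpr std::size_t kElementsPerWrite = 4 * 1;
constexpr std::size_t kPayloadBytes =
    kElementsPerWrite * sizeof(std::complex<float>);

// Assumption: float is 4 bytes, so one complex<float> is 8 bytes.
static_assert(sizeof(std::complex<float>) == 8,
              "this sketch assumes an 8-byte std::complex<float>");

// Hypothetical helper (not part of ADIOS2): extra bytes per write implied by
// an observed file-size ratio, assuming the whole overhead is per-write.
constexpr std::size_t implied_overhead_bytes(std::size_t payload,
                                             std::size_t ratio)
{
    return payload * (ratio - 1);
}

// Rough per-write overhead implied by the ratios quoted above.
constexpr std::size_t kOverheadAt20x = implied_overhead_bytes(kPayloadBytes, 20);
constexpr std::size_t kOverheadAt10x = implied_overhead_bytes(kPayloadBytes, 10);
```

Under that assumption, the ~20x case corresponds to roughly 600 extra bytes for every 32 bytes of payload, and the ~10x case to roughly 290 bytes.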
In our scenario the sizes of the two inner dimensions of our variable are always fixed. In principle users can keep adding data to our "column" indefinitely, but in practice we also fix the size of the outermost dimension, setting it to the max int value. Given this setup, it would be ideal to avoid paying the penalty of writing metadata with each piece of data that gets written through adiosVar.Write with the BP File storage backend.
I tried using constantDims = true in the call to IO.DefineVariable, thinking that this might help, only to learn later that it applies to all of the variable's dimension objects, including the selection, and not only to the shape.