Commit 11c6b8d2 authored by Zfp Upstream's avatar Zfp Upstream Committed by Ben Boeckel
Browse files

zfp 2018-02-14 (7af98d0e)

Code extracted from:

    https://gitlab.kitware.com/third-party/zfp.git

at commit 7af98d0ef98f5f1241f51a0c4c5709b4beba3776 (for/vtk-old).
parent 97494493
OVERVIEW
zfp consists of three distinct components: (1) a set of low-level C codecs
for compressing and decompressing block subsets of one-, two-, and three-
dimensional single- and double-precision arrays; (2) a set of corresponding
C++ compressed array classes that support random access; and (3) a high-level
C interface for compressing and decompressing entire floating-point arrays.
The compression codecs operate on individual d-dimensional blocks of size
4^d, e.g. 4 values in 1D, 4x4 = 16 values in 2D, and 4x4x4 = 64 values in
3D. The block being compressed need not be stored contiguously but can
be processed by specifying regular strides in each dimension. This is
useful if the block is initially stored uncompressed as part of a larger
array.
The array classes represent an entire array of floating-point values as
a collection of compressed blocks, each whose compressed size in number
of bits is fixed and specified by the user. The array classes cache
uncompressed blocks to reduce the number of compression and decompression
calls. Whenever an array value is read, the corresponding block is first
looked up in the cache, and if found the uncompressed value is returned.
Otherwise the block is first decompressed and stored in the cache.
Whenever an array element is written (whether actually modified or not),
a "dirty bit" is set with its cached block to indicate that the block
must be compressed back to persistent storage when evicted from the cache.
The libzfp C interface is useful for quickly compressing and archiving
large floating-point arrays of arbitrary dimensions without having to
understand the technical details of the compression algorithm and codec.
This library comes with utility functions for specifying the compression
rate, precision, or accuracy of the compressed data.
All code examples below are for 3D arrays of doubles, but it should be
clear how to modify the function calls for single precision and for 1D
or 2D arrays.
GENERAL DESIGN AND LIMITATIONS
The zfp API has been designed to facilitate integration with existing
applications. After initial array declaration, a zfp array can often
be used in place of a regular C/C++ array or STL vector, e.g. using flat
indexing via a[index] or using multidimensional indexing via a(i),
a(i, j), or a(i, j, k). There are, however, some important differences.
For instance, it is not possible to take the address of an array element,
i.e. constructions like &a[i] and a + i are not allowed. Moreover, the
operators [] and () do not return regular C++ references. Instead, a
proxy reference class is used (similar to how STL bit vectors are
implemented). These proxy references can, however, safely be passed to
functions and used where regular references can.
zfp does not support special floating-point values like infinities and
NaNs, although denormalized numbers are handled correctly. Similarly,
because the compressor assumes that the array values vary smoothly,
using finite but large values like HUGE_VAL in place of infinities is
not advised, as this will introduce large errors in smaller values
within the same block. Future extensions will provide support for a
bit mask to mark the presence of non-values.
The zfp C++ classes are implemented entirely as header files and make
extensive use of C++ templates to reduce code redundancy. Most classes
are wrapped in the 'zfp' namespace.
API OVERVIEW
The documentation is divided into three parts: the high-level libzfp
library; the low-level compression codecs; and the compressed array
classes (in that order). Users interested only in the compressed arrays,
which do not directly expose anything related to compression other than
compression rate control, may safely skip the next two sections.
ZFP HIGH-LEVEL C INTERFACE
Users concerned only with storing their floating-point data compressed may
use zfp as a black box that maps a possibly non-contiguous floating-point
array to a compressed bit stream. The intent of libzfp is to provide both
a high- and low-level interface to the compressor that can be called from
both C and C++ (and possibly other languages). libzfp supports strided
access, e.g. for compressing vector fields one scalar at a time, or for
compressing arrays of structs.
Consider compressing the 3D C/C++ array
// define an uncompressed array
double a[nz][ny][nx];
where nx, ny, and nz can be any positive dimensions. To invoke the libzfp
compressor, the dimensions and type must first be specified in a zfp_field
parameter object that encapsulates the type, size, and memory layout of the
array:
// allocate metadata for the 3D array a[nz][ny][nx]
uint dims = 3;
zfp_type type = zfp_type_double;
zfp_field* field = zfp_field_3d(&a[0][0][0], type, nx, ny, nz);
For single-precision data, use zfp_type_float. Note that the high-level
API does not support integer arrays (zfp_type_int32 and zfp_type_int64).
Such arrays must be compressed via the low-level interface.
Functions similar to zfp_field_3d exist for declaring 1D and 2D arrays.
If the dimensionality of the array is unknown at this point, then a generic
zfp_field_alloc() call can be made to just allocate a zfp_field struct,
which can be filled in later using the zfp_field_set_* functions. If the
array is non-contiguous, then zfp_field_set_stride_3d should be called.
The zfp_field parameter object holds information about the uncompressed
array. To specify the compressed array, a zfp_stream object must be
allocated:
// allocate metadata for a compressed stream
zfp_stream* zfp = zfp_stream_open(NULL);
We may now specify the rate, precision, or accuracy (see the README file
for more details on the meaning of these parameters):
// set compression mode and parameters
zfp_stream_set_rate(zfp, rate, type, dims, 0);
zfp_stream_set_precision(zfp, precision, type);
zfp_stream_set_accuracy(zfp, tolerance, type);
Note that only one of these three functions should be called. The return
value from these functions gives the actual rate, precision, or tolerance,
and may differ slightly from the argument passed due to constraints imposed
by the compressor, e.g. each block must be stored using a whole number of
bits at least as large as the number of bits in the floating-point exponent;
the precision cannot exceed the number of bits in a floating-point value
(i.e. 32 for single and 64 for double precision); and the tolerance must
be a (possibly negative) power of two.
The compression parameters have now been specified, but before compression
can occur a buffer large enough to hold the compressed bit stream must be
allocated. Another utility function exists for estimating how many bytes
are needed:
// allocate buffer for compressed data
size_t bufsize = zfp_stream_maximum_size(zfp, field);
uchar* buffer = new uchar[bufsize];
Note that zfp_stream_maximum_size returns the smallest buffer size
necessary to safely compress the data--the actual compressed size may be
smaller. If the members of zfp and field are for whatever reason not
initialized correctly, then zfp_stream_maximum_size returns 0.
Before compression can commence, we must associate the allocated buffer
with a bit stream used by the compressor to read and write bits:
// associate bit stream with allocated buffer
bitstream* stream = stream_open(buffer, bufsize);
zfp_stream_set_bit_stream(zfp, stream);
Finally, the array is compressed as follows:
// compress entire array
size_t size = zfp_compress(zfp, field);
The return value is the actual number of bytes of compressed storage, and
as already mentioned, size <= bufsize. If size = 0, then the compressor
failed. Since zfp 0.5.0, the compressor does not rewind the bit stream
before compressing, which allows multiple fields to be compressed one
after the other. The return value from zfp_compress is always the total
number of bytes of compressed storage so far relative to the memory
location pointed to by 'buffer'.
To decompress the data, the field and compression parameters must be
initialized with the same values as used for compression, either via
the same sequence of function calls as above, or by recording these
fields and setting them directly. Metadata such as array dimensions and
compression parameters are by default not stored in the compressed stream.
It is up to the caller to store this information, either separately from
the compressed data, or via the zfp_write_header and zfp_read_header calls.
These calls allow the user to specify what information to store in the
header, including a 'magic' format identifier, the field type and
dimensions, and the compression parameters (see the ZFP_HEADER_* macros).
In addition to this initialization, the bit stream has to be rewound to
the beginning:
// rewind compressed stream and decompress array
zfp_stream_rewind(zfp);
int success = zfp_decompress(zfp, field);
The return value is zero if the decompressor failed.
ZFP LOW-LEVEL COMPRESSION AND DECOMPRESSION CODEC
For applications that wish to compress or decompress portions of an array
on demand, a low-level interface is available. Since this API is useful
primarily for supporting random access, the user also needs to manipulate
the bit stream (see inc/bitstream.h), e.g. to position the bit pointer
to where data is to be read or written. Please be advised that the bit
stream functions have been optimized for speed, and do not check for
buffer overruns or other types of programmer error.
Like the high-level API, the low-level API also makes use of the zfp_stream
parameter object (see section above) to specify compression parameters and
storage, but does not encapsulate array metadata in a zfp_field object.
Functions exists for encoding and decoding complete or partial blocks, with
or without strided access. In non-strided mode, the uncompressed block to
be encoded or decoded is assumed to be stored contiguously. For example,
// compress a single contiguous block
double block[4 * 4 * 4] = { /* some set of values */ };
uint bits = zfp_encode_block_double_3(zfp, block);
The return value is the number of bits of compressed storage for the block.
For fixed-rate streams, if random access is desired, then the stream should
also be flushed after each block is encoded:
// flush any buffered bits
zfp_stream_flush(zfp);
This flushing should be done only after the last block has been compressed in
fixed-precision and fixed-accuracy mode, or when random access is not needed
in fixed-rate mode.
The block above could also have been compressed as follows using strides:
// compress a single contiguous block using strides
double block[4][4][4] = { /* some set of values */ };
int sx = &block[0][0][1] - &block[0][0][0]; // x stride = 1
int sy = &block[0][1][0] - &block[0][0][0]; // y stride = 4
int sz = &block[1][0][0] - &block[0][0][0]; // z stride = 16
uint bits = zfp_encode_block_strided_double_3(zfp, block, sx, sy, sz);
The strides are measured in number of scalars, not in bytes.
For partial blocks, e.g. near the boundaries of arrays whose dimensions
are not multiples of four, there are corresponding functions that accept
parameters (nx, ny, nz) to specify the actual block dimensions, with
1 <= nx, ny, nz <= 4. Corresponding functions exist for decompression.
To position a bit stream for reading (decompression), use
// position the stream at given bit offset for reading
stream_rseek(stream, offset);
where the offset is measured in number of bits from the beginning of the
stream. For writing (compression), a corresponding call exists:
// position the stream at given bit offset for writing
stream_wseek(stream, offset);
Note that it is possible to decompress fewer bits than are stored with a
compressed block to quickly obtain an approximation. This is done by
setting zfp->maxbits to fewer bits than used during compression, e.g. to
decompress only the first 256 bits of each block:
// modify decompression parameters to decode 256 bits per block
uint maxbits;
uint maxprec;
int minexp;
zfp_stream_params(zfp, NULL, &maxbits, &maxprec, &minexp);
assert(maxbits >= 256);
zfp_stream_set_params(zfp, 256, 256, maxprec, minexp);
This feature may be combined with progressive decompression, as discussed
further in the FAQ.
COMPRESSED ARRAYS
Currently there are six array classes for 1D, 2D, and 3D arrays, each of
which can represent single- or double-precision values. Although these
arrays store values in a form different from conventional single- and
double-precision floating point, the user interacts with the arrays via
floats and doubles.
The description below is for 3D arrays of doubles--the necessary changes
for other array types should be obvious. To declare and zero initialize
an array, use
// declare nx * ny * nz array of compressed doubles
zfp::array3<double> a(nx, ny, nz, rate);
This declaration is conceptually equivalent to
double a[nz][ny][nx] = {};
or
std::vector<double> a(nx * ny * nz, 0.0);
but with the user specifying the amount of storage used. (A predefined type
array3d also exists, while the suffix 'f' is used for floats.) Note that
the array dimensions can be arbitrary, and need not be multiples of four
(see above for a discussion of incomplete blocks). The 'rate' argument
specifies how many bits per value (amortized) to store in the compressed
representation. By default the block size is restricted to a multiple of
64 bits, and therefore the rate argument can be specified in increments of
64 / 4^d bits in d dimensions, i.e.
1D arrays: 16-bit granularity
2D arrays: 4-bit granularity
3D arrays: 1-bit granularity
For finer granularity, the BITSTREAM_WORD_TYPE macro needs to be set to a
type narrower than 64 bits, e.g. if set to uint8 the rate granularity
becomes 8 / 4^d bits in d dimensions, or
1D arrays: 2-bit granularity
2D arrays: 1/2-bit granularity
3D arrays: 1/8-bit granularity
Note that finer granularity implies lower performance. Also note that
because the arrays are stored compressed, their effective precision is
likely to be higher than the user-specified rate.
The array can also optionally be initialized from an existing contiguous
floating-point array stored at 'pointer' with an x stride of 1, y stride
of nx, and z stride of nx * ny:
// declare and initialize 3D array of doubles
zfp::array3d a(nx, ny, nz, precision, pointer, cache_size);
The 'cache_size' argument specifies the minimum number of bytes to allocate
for the cache of uncompressed blocks (see the section on Caching below for
more details).
If not already initialized, a function set() can be used to copy uncompressed
data to the compressed array:
const double* pointer; // pointer to uncompressed, initialized data
a.set(pointer); // initialize compressed array with floating-point data
Similarly, a get() function exists for retrieving uncompressed data:
double* pointer; // pointer to where to write uncompressed data
a.get(pointer); // decompress and store the array at pointer
The compressed representation of an array can also be queried or initialized
directly without having to convert to/from its floating-point representation:
size_t bytes = compressed_size(); // number of bytes of compressed storage
uchar* compressed_data(); // pointer to compressed data
The array can through this pointer be initialized from offline compressed
storage, but only after its dimensions and rate have been specified (see
above). For this to work properly, the cache must first be emptied via a
clear_cache() call (see below).
Through operator overloading, the array can be accessed in one of two ways.
For read accesses, use
double value = a[index]; // fetch value with given flat array index
double value = a(i, j, k); // fetch value with 3D index (i, j, k)
These access the same value if and only if index = i + nx * (j + ny * k).
Note that 0 <= i < nx, 0 <= j < ny, and 0 <= k < nz, and i varies faster
than j, which varies faster than k.
Array values may be written and updated using the usual set of C++ assignment
and compound assignment operators. For example:
a[index] = value; // set value at flat array index
a(i, j, k) += value; // increment value with 3D index (i, j, k)
Whereas one might expect these operators to return a (non-const) reference
to an array element, this would allow seating a reference to a value that
currently is cached but is transient, which could be unsafe. Moreover,
this would preclude detecting when an array element is modified. Therefore,
the return type of both operators [] and () is a proxy reference class,
similar to std::vector<bool>::reference from the STL library. Because
read accesses to a mutable object cannot call the const-qualified accessor,
a proxy reference may be returned even for read calls, e.g. in
a[i - 1] = a[i];
the array a clearly must be mutable to allow assignment to a[i - 1], and
therefore the read access a[i] returns type zfp::array3d::reference. The
value associated with the read access is obtained via an implicit conversion.
Array dimensions (nx, ny, nz) can be queried using these functions:
size_t size(); // total number of elements nx * ny * nz
uint size_x(); // nx
uint size_y(); // ny
uint size_z(); // nz
The array dimensions can also be changed dynamically, e.g. if not known
at declaration time, using
void resize(uint nx, uint ny, uint nz, bool clear = true);
When clear = true, the array is explicitly zeroed. In either case, all
previous contents of the array are lost. If nx = ny = nz = 0, all storage
is freed.
Finally, the rate supported by the array may be queried via
double rate(); // number of compressed bits per value
and changed using
void set_rate(rate); // change rate
This also destroys prior contents.
CACHING
As mentioned above, the array class maintains a software write-back cache
of at least one uncompressed block. When a block in this cache is evicted
(e.g. due to a conflict), it is compressed back to permanent storage only
if it has previously been modified.
The size cache to use is specified by the user, and is an important
parameter that needs careful consideration in order to balance the extra
memory usage, performance, and quality (recall that data loss is incurred
only when a block is evicted from the cache and compressed). Although the
best choice varies from one application to another, we suggest allocating
at least two layers of blocks (2 * (nx / 4) * (ny / 4) blocks) for
applications that stream through the array and perform stencil computations
such as gathering data from neighboring elements. This allows limiting the
cache misses to compulsory ones. If the cache_size parameter is set to
zero bytes, then this default of two layers is used.
The cache size can be set during construction, or can be set at a later
time via
void set_cache_size(bytes); // change cache size
Note that if bytes = 0, then the array dimensions must have already been
specified for the default size to be computed correctly. When the cache
is resized, it is first flushed if not already empty. The cache can
also be flushed explicitly if desired by calling
void flush_cache(); // empty cache by first compressing any modified blocks
To empty the cache without compressing any cached data, call
void clear_cache(); // empty cache without compression
To query the byte size of the cache, use
size_t cache_size(); // actual cache size in bytes
include_directories(
"${CMAKE_CURRENT_SOURCE_DIR}/inc"
"${CMAKE_CURRENT_BINARY_DIR}")
set(zfp_sources
set(sources
src/bitstream.c
src/block1.h
src/block2.h
src/block3.h
src/decode1d.c
src/decode1f.c
src/decode1i.c
src/decode1l.c
src/decode2d.c
src/decode2f.c
src/decode2i.c
src/decode2l.c
src/decode3d.c
src/decode3f.c
src/decode3i.c
src/decode3l.c
src/encode1d.c
src/encode1f.c
src/encode1i.c
src/encode1l.c
src/encode2d.c
src/encode2f.c
src/encode2i.c
src/encode2l.c
src/encode3d.c
src/encode3f.c
src/encode3i.c
src/encode3l.c
src/traitsd.h
src/traitsf.h
src/zfp.c)
vtk_add_library(vtkzfp ${zfp_sources})
include(GenerateExportHeader)
generate_export_header(vtkzfp
EXPORT_MACRO_NAME VTKZFP_EXPORT)
vtk_add_library(vtkzfp ${sources})
target_include_directories(vtkzfp
PRIVATE
"${CMAKE_CURRENT_SOURCE_DIR}/include")
if (NOT WIN32 OR CYGWIN)
target_link_libraries(vtkzfp m)
target_link_libraries(vtkzfp
PRIVATE
m)
endif ()
if (WIN32)
target_compile_definitions(vtkzfp
PRIVATE
ZFP_SOURCE)
endif ()
if (NOT VTK_INSTALL_NO_DEVELOPMENT)
install(
DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/inc/"
install(DIRECTORY
"${CMAKE_CURRENT_SOURCE_DIR}/include"
DESTINATION "${VTK_INSTALL_INCLUDE_DIR}/vtkzfp"
COMPONENT Development)
endif ()
COMPONENT Development)
endif()
Copyright (c) 2014-2016, Lawrence Livermore National Security, LLC.
Copyright (c) 2014-2017, Lawrence Livermore National Security, LLC.
Produced at the Lawrence Livermore National Laboratory.
Written by Peter Lindstrom.
LLNL-CODE-663824.
......
This diff is collapsed.
......@@ -7,6 +7,5 @@ symbols to avoid conflicts with other copies of the library within a single
process.
* Add attributes to pass commit checks within VTK.
* Mangle all exported symbols to have a `vtkzfp_` prefix.
* Add a CMake build system to the project.
* Export symbols for Windows support.
* Mangle all exported symbols to have a `vtkzfp_` prefix.
ZFP
===
INTRODUCTION
------------
zfp is an open source C/C++ library for compressed numerical arrays that
support high throughput read and write random access. zfp also supports
streaming compression of integer and floating-point data, e.g., for
applications that read and write large data sets to and from disk.
zfp was developed at Lawrence Livermore National Laboratory and is loosely
based on the algorithm described in the following paper:
Peter Lindstrom
"Fixed-Rate Compressed Floating-Point Arrays"
IEEE Transactions on Visualization and Computer Graphics
20(12):2674-2683, December 2014
doi:10.1109/TVCG.2014.2346458
zfp was originally designed for floating-point arrays only, but has been
extended to also support integer data, and could for instance be used to
compress images and quantized volumetric data. To achieve high compression
ratios, zfp uses lossy but optionally error-bounded compression. Although
bit-for-bit lossless compression of floating-point data is not always
possible, zfp is usually accurate to within machine epsilon in near-lossless
mode.
zfp works best for 2D and 3D arrays that exhibit spatial correlation, such as
continuous fields from physics simulations, images, regularly sampled terrain
surfaces, etc. Although zfp also provides a 1D array class that can be used
for 1D signals such as audio, or even unstructured floating-point streams,
the compression scheme has not been well optimized for this use case, and
rate and quality may not be competitive with floating-point compressors
designed specifically for 1D streams.
zfp is freely available as open source under a BSD license, as outlined in
the file 'LICENSE'. For more information on zfp and comparisons with other
compressors, please see the zfp
[website](https://computation.llnl.gov/projects/floating-point-compression).
For questions, comments, requests, and bug reports, please contact
[Peter Lindstrom](mailto:pl@llnl.gov).
DOCUMENTATION
-------------
Full
[documentation](http://zfp.readthedocs.io/en/release0.5.2/)
is available online via Read the Docs. A
[PDF](http://readthedocs.org/projects/zfp/downloads/pdf/release0.5.2/)
version is also available.
INSTALLATION
------------
zfp consists of three distinct parts: a compression library written in C;
a set of C++ header files that implement compressed arrays; and a set of
C and C++ examples. The main compression codec is written in C and should
conform to both the ISO C89 and C99 standards. The C++ array classes are
implemented entirely in header files and can be included as is, but since
they call the compression library, applications must link with libzfp.
On Linux, macOS, and MinGW, zfp is easiest compiled using gcc and gmake.
CMake support is also available, e.g. for Windows builds. See below for
instructions on GNU and CMake builds.
zfp has successfully been built and tested using these compilers:
gcc versions 4.4.7, 4.7.2, 4.8.2, 4.9.2, 5.4.1, 6.3.0
icc versions 12.0.5, 12.1.5, 15.0.4, 16.0.1, 17.0.0, 18.0.0
clang version 3.6.0
xlc version 12.1
MinGW version 5.3.0
Visual Studio versions 14.0 (2015), 14.1 (2017)
NOTE: zfp requires 64-bit compiler and operating system support.
## GNU builds
To compile zfp using gcc, type
make
from this directory. This builds libzfp as a static library as well as
utilities and example programs. To optionally create a shared library,
type
make shared
and set LD_LIBRARY_PATH to point to ./lib. To test the compressor, type
make test
If the compilation or regression tests fail, it is possible that some of
the macros in the file 'Config' have to be adjusted. Also, the tests may
fail due to minute differences in the computed floating-point fields
being compressed (as indicated by checksum errors). It is surprisingly
difficult to portably generate a floating-point array that agrees
bit-for-bit across platforms. If most tests succeed and the failures
result in byte sizes and error values reasonably close to the expected
values, then it is likely that the compressor is working correctly.
## CMake builds
To build zfp using CMake on Linux or macOS, start a Unix shell and type
mkdir build
cd build
cmake ..
make
To also build the examples, replace the cmake line with
cmake -DBUILD_EXAMPLES=ON ..
To build zfp using Visual Studio on Windows, start an MSBuild shell and type
mkdir build
cd build
cmake ..
msbuild /p:Configuration=Release zfp.sln
msbuild /p:Configuration=Debug zfp.sln
This builds zfp in both debug and release mode. See the instructions for
Linux on how to change the cmake line to also build the example programs.
zfp 0.5.0, February 29, 2016
- Modified CODEC to more efficiently encode blocks whose values are all
zero or are smaller in magnitude than the absolute error tolerance.
This allows representing "empty" blocks using only one bit each. This
version is not backwards compatible with prior zfp versions.
- Changed behavior of zfp_compress and zfp_decompress to not automatically
rewind the bit stream. This makes it easier to concatenate multiple
compressed bit streams, e.g. when compressing vector fields or multiple