SST crashes with LibFabric
Created by: khuck
ADIOS2 was configured on a 36-core Linux workstation. The loaded modules, the CMake command, and its output were:
Currently Loaded Modules:
1) gcc/8.1 2) mpi/openmpi-4.0.1_gcc-8.1 3) cmake/3.15.1 4) python/3.6.8
++ which mpicc
++ which mpic++
++ which mpif90
+ cmake -DCMAKE_C_COMPILER=/packages/openmpi/4.0.1-gcc8.1/bin/mpicc -DCMAKE_CXX_COMPILER=/packages/openmpi/4.0.1-gcc8.1/bin/mpic++ -DCMAKE_Fortran_COMPILER=/packages/openmpi/4.0.1-gcc8.1/bin/mpif90 -DADIOS2_USE_Python=ON -DCMAKE_INSTALL_PREFIX=/home/users/khuck/src/ADIOS2/install_mpi -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
-- The C compiler identification is GNU 8.1.0
-- The CXX compiler identification is GNU 8.1.0
-- Check for working C compiler: /packages/openmpi/4.0.1-gcc8.1/bin/mpicc
-- Check for working C compiler: /packages/openmpi/4.0.1-gcc8.1/bin/mpicc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /packages/openmpi/4.0.1-gcc8.1/bin/mpic++
-- Check for working CXX compiler: /packages/openmpi/4.0.1-gcc8.1/bin/mpic++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Could NOT find Blosc (missing: BLOSC_LIBRARY BLOSC_INCLUDE_DIR)
-- Could NOT find BZip2 (missing: BZIP2_LIBRARIES BZIP2_INCLUDE_DIR)
-- Could NOT find ZFP (missing: ZFP_LIBRARY ZFP_INCLUDE_DIR)
-- Could NOT find SZ (missing: SZ_LIBRARY ZLIB_LIBRARY ZSTD_LIBRARY SZ_INCLUDE_DIR)
-- Could NOT find MGARD (missing: MGARD_LIBRARY ZLIB_LIBRARY MGARD_INCLUDE_DIR)
-- Found ZLIB: /usr/lib64/libz.so (found version "1.2.7")
-- Could NOT find PNG: Found unsuitable version "1.5.13", but required is at least "1.6.0" (found /usr/lib64/libpng.so)
-- The Fortran compiler identification is GNU 8.1.0
-- Check for working Fortran compiler: /packages/openmpi/4.0.1-gcc8.1/bin/mpif90
-- Check for working Fortran compiler: /packages/openmpi/4.0.1-gcc8.1/bin/mpif90 -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /packages/openmpi/4.0.1-gcc8.1/bin/mpif90 supports Fortran 90
-- Checking whether /packages/openmpi/4.0.1-gcc8.1/bin/mpif90 supports Fortran 90 -- yes
-- Found MPI_C: /packages/openmpi/4.0.1-gcc8.1/bin/mpicc (found version "3.1")
-- Found MPI_CXX: /packages/openmpi/4.0.1-gcc8.1/bin/mpic++ (found version "3.1")
-- Found MPI_Fortran: /packages/openmpi/4.0.1-gcc8.1/bin/mpif90 (found version "3.1")
-- Found MPI: TRUE (found version "3.1") found components: C Fortran CXX
-- Found ZeroMQ: /usr/lib64/libzmq.so (found suitable version "4.1.4", minimum required is "4.1")
-- Could NOT find HDF5 (missing: HDF5_LIBRARIES HDF5_INCLUDE_DIRS C) (found version "")
-- Found PythonInterp: /packages/python/3.6.8/bin/python3 (found version "3.6.8")
-- Found PythonLibs: /packages/python/3.6.8/lib/libpython3.6m.so (found version "3.6.8")
-- Found PythonModule_numpy: /packages/python/3.6.8/lib/python3.6/site-packages/numpy
-- Found PythonModule_mpi4py: /home/users/khuck/.local/lib/python3.6/site-packages/mpi4py
-- Found PythonFull: /packages/python/3.6.8/bin/python3 found components: Interp Libs numpy mpi4py
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.27.1")
-- Checking for module 'libfabric'
-- Found libfabric, version 1.6.1
-- Found LIBFABRIC: /usr/lib64/libfabric.so (Required is at least version "1.6")
-- Checking for module 'cray-drc'
-- No package 'cray-drc' found
-- Could NOT find CrayDRC (missing: CrayDRC_LIBRARIES)
-- Looking for shmget
-- Looking for shmget - found
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- ADIOS2 ThirdParty: Configuring KWSys
-- Checking whether header cstdio is available
-- Checking whether header cstdio is available - yes
-- Checking for Large File Support
-- Checking for Large File Support - yes
-- Checking whether C++ compiler has 'long long'
-- Checking whether C++ compiler has 'long long' - yes
-- Checking whether C++ compiler has '__int64'
-- Checking whether C++ compiler has '__int64' - no
-- Checking whether wstring is available
-- Checking whether wstring is available - yes
-- Checking whether C compiler has ptrdiff_t in stddef.h
-- Checking whether C compiler has ptrdiff_t in stddef.h - yes
-- Checking whether C compiler has ssize_t in unistd.h
-- Checking whether C compiler has ssize_t in unistd.h - yes
-- Checking whether CXX compiler has setenv
-- Checking whether CXX compiler has setenv - yes
-- Checking whether CXX compiler has unsetenv
-- Checking whether CXX compiler has unsetenv - yes
-- Checking whether CXX compiler has environ in stdlib.h
-- Checking whether CXX compiler has environ in stdlib.h - no
-- Checking whether CXX compiler has utimes
-- Checking whether CXX compiler has utimes - yes
-- Checking whether CXX compiler has utimensat
-- Checking whether CXX compiler has utimensat - yes
-- Checking whether CXX compiler struct stat has st_mtim member
-- Checking whether CXX compiler struct stat has st_mtim member - yes
-- Checking whether CXX compiler struct stat has st_mtimespec member
-- Checking whether CXX compiler struct stat has st_mtimespec member - no
-- Checking whether <ext/stdio_filebuf.h> is available
-- Checking whether <ext/stdio_filebuf.h> is available - yes
-- ADIOS2 ThirdParty: Configuring GTest
-- ADIOS2 ThirdParty: Configuring pybind11
-- Found PythonLibs: /packages/python/3.6.8/lib/libpython3.6m.so
-- pybind11 v2.2.4
-- ADIOS2 ThirdParty: Configuring pugixml
-- ADIOS2 ThirdParty: Configuring nlohmann_json
-- ADIOS2 ThirdParty: Configuring atl
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of double
-- Check size of double - done
-- Check size of float
-- Check size of float - done
-- Check size of int
-- Check size of int - done
-- Check size of short
-- Check size of short - done
-- Looking for include file malloc.h
-- Looking for include file malloc.h - found
-- Looking for include file unistd.h
-- Looking for include file unistd.h - found
-- Looking for include file stdlib.h
-- Looking for include file stdlib.h - found
-- Looking for include file string.h
-- Looking for include file string.h - found
-- Looking for include file sys/time.h
-- Looking for include file sys/time.h - found
-- Looking for include file windows.h
-- Looking for include file windows.h - not found
-- Looking for fork
-- Looking for fork - found
-- Found atl: /home/users/khuck/src/ADIOS2/build/thirdparty/atl/atl/atl-config.cmake (found version "2.2.1")
-- ADIOS2 ThirdParty: Configuring dill
-- Check size of void*
-- Check size of void* - done
-- Check size of long
-- Check size of long - done
-- Check if the system is big endian
-- Searching 16 bit integer
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Using unsigned short
-- Check if the system is big endian - little endian
-- Checking for module 'libffi'
-- Found libffi, version 3.0.13
-- Found LibFFI: -lffi
-- Enabling emulation
-- Looking for include file stdarg.h
-- Looking for include file stdarg.h - found
-- Looking for include file memory.h
-- Looking for include file memory.h - found
-- Found dill: /home/users/khuck/src/ADIOS2/build/thirdparty/dill/dill/dill-config.cmake (found version "2.4.0")
-- ADIOS2 ThirdParty: Configuring ffs
-- Check size of off_t
-- Check size of off_t - done
-- Check size of long double
-- Check size of long double - done
-- Check size of long long
-- Check size of long long - done
-- Check size of size_t
-- Check size of size_t - done
-- Looking for socket
-- Looking for socket - found
-- Found BISON: /usr/bin/bison (found version "3.0.4")
-- Found FLEX: /usr/bin/flex (found version "2.5.37")
-- Found dill: /home/users/khuck/src/ADIOS2/build/thirdparty/dill/dill/dill-config.cmake (found suitable version "2.4.0", minimum required is "2.3.1")
-- Found atl: /home/users/khuck/src/ADIOS2/build/thirdparty/atl/atl/atl-config.cmake (found suitable version "2.2.1", minimum required is "2.2.1")
-- Looking for netdb.h
-- Looking for netdb.h - found
-- Looking for sockLib.h
-- Looking for sockLib.h - not found
-- Looking for sys/select.h
-- Looking for sys/select.h - found
-- Looking for sys/socket.h
-- Looking for sys/socket.h - found
-- Looking for sys/times.h
-- Looking for sys/times.h - found
-- Looking for sys/uio.h
-- Looking for sys/uio.h - found
-- Looking for sys/un.h
-- Looking for sys/un.h - found
-- Looking for winsock.h
-- Looking for winsock.h - not found
-- Looking for strtof
-- Looking for strtof - found
-- Looking for strtod
-- Looking for strtod - found
-- Looking for strtold
-- Looking for strtold - found
-- Looking for getdomainname
-- Looking for getdomainname - found
-- Check size of struct iovec
-- Check size of struct iovec - done
-- Performing Test HAS_IOV_BASE_IOVEC
-- Performing Test HAS_IOV_BASE_IOVEC - Success
-- Found atl: /home/users/khuck/src/ADIOS2/build/thirdparty/atl/atl/atl-config.cmake (found version "2.2.1")
-- Found ffs: /home/users/khuck/src/ADIOS2/build/thirdparty/ffs/ffs/ffs-config.cmake (found version "1.6.0")
-- ADIOS2 ThirdParty: Configuring enet
-- Looking for getaddrinfo
-- Looking for getaddrinfo - found
-- Looking for getnameinfo
-- Looking for getnameinfo - found
-- Looking for gethostbyaddr_r
-- Looking for gethostbyaddr_r - found
-- Looking for gethostbyname_r
-- Looking for gethostbyname_r - found
-- Looking for poll
-- Looking for poll - found
-- Looking for fcntl
-- Looking for fcntl - found
-- Looking for inet_pton
-- Looking for inet_pton - found
-- Looking for inet_ntop
-- Looking for inet_ntop - found
-- Performing Test HAS_MSGHDR_FLAGS
-- Performing Test HAS_MSGHDR_FLAGS - Success
-- Performing Test HAS_SOCKLEN_T
-- Performing Test HAS_SOCKLEN_T - Success
-- Found enet: /home/users/khuck/src/ADIOS2/build/thirdparty/enet/enet/enet-config.cmake (found version "1.3.14")
-- ADIOS2 ThirdParty: Configuring EVPath
-- Performing Test HAVE_MATH
-- Performing Test HAVE_MATH - Failed
-- Performing Test HAVE_LIBM_MATH
-- Performing Test HAVE_LIBM_MATH - Success
-- Found atl: /home/users/khuck/src/ADIOS2/build/thirdparty/atl/atl/atl-config.cmake (found suitable version "2.2.1", minimum required is "2.2.1")
-- Found atl: /home/users/khuck/src/ADIOS2/build/thirdparty/atl/atl/atl-config.cmake (found version "2.2.1")
-- Found ffs: /home/users/khuck/src/ADIOS2/build/thirdparty/ffs/ffs/ffs-config.cmake (found suitable version "1.6.0", minimum required is "1.5.1")
-- Could NOT find nvml (missing: NVML_INCLUDE_DIR)
-- Looking for clock_gettime
-- Looking for clock_gettime - found
-- Found enet: /home/users/khuck/src/ADIOS2/build/thirdparty/enet/enet/enet-config.cmake (found suitable version "1.3.14", minimum required is "1.3.13")
-- - Udt4 library was not found. This is not a fatal error, just that the Udt4 transport will not be built.
-- Found LIBFABRIC: /usr/lib64/libfabric.so
-- Looking for ibv_create_qp
-- Looking for ibv_create_qp - not found
-- Looking for ibv_create_qp in ibverbs
-- Looking for ibv_create_qp in ibverbs - found
-- Found IBVERBS: ibverbs
-- Could NOT find nnti (missing: NNTI_INCLUDE_DIR NNTI_trios_nnti_LIBRARY NNTI_trios_support_LIBRARY)
-- Looking for hostlib.h
-- Looking for hostlib.h - not found
-- Looking for sys/sockio.h
-- Looking for sys/sockio.h - not found
-- Performing Test HAVE_FDS_BITS
-- Performing Test HAVE_FDS_BITS - Failed
-- Looking for writev
-- Looking for writev - found
-- Looking for uname
-- Looking for uname - found
-- Looking for getloadavg
-- Looking for getloadavg - found
-- Looking for gettimeofday
-- Looking for gettimeofday - found
-- Looking for getifaddrs
-- Looking for getifaddrs - found
-- Found atl: /home/users/khuck/src/ADIOS2/build/thirdparty/atl/atl/atl-config.cmake (found suitable version "2.2.1", minimum required is "2.2.1")
-- Found atl: /home/users/khuck/src/ADIOS2/build/thirdparty/atl/atl/atl-config.cmake (found version "2.2.1")
-- Found ffs: /home/users/khuck/src/ADIOS2/build/thirdparty/ffs/ffs/ffs-config.cmake (found suitable version "1.6.0", minimum required is "1.6.0")
-- Found EVPath: /home/users/khuck/src/ADIOS2/build/thirdparty/EVPath/EVPath/EVPathConfigCommon.cmake (found version "4.4.0")
-- Found atl: /home/users/khuck/src/ADIOS2/build/thirdparty/atl/atl/atl-config.cmake (found suitable version "2.2.1", minimum required is "2.2.1")
-- Found atl: /home/users/khuck/src/ADIOS2/build/thirdparty/atl/atl/atl-config.cmake (found version "2.2.1")
-- Looking for rdma/fi_ext_gni.h
-- Looking for rdma/fi_ext_gni.h - not found
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- LTO enabled
-- Detecting Fortran/C Interface
-- Detecting Fortran/C Interface - Found GLOBAL and MODULE mangling
-- Verifying Fortran/CXX Compiler Compatibility
-- Verifying Fortran/CXX Compiler Compatibility - Success
-- Found MPI: TRUE (found version "3.1") found components: C
ADIOS2 build configuration:
ADIOS Version: 2.4.0
C++ Compiler : GNU 8.1.0
/packages/openmpi/4.0.1-gcc8.1/bin/mpic++
Fortran Compiler : GNU 8.1.0
/packages/openmpi/4.0.1-gcc8.1/bin/mpif90
Installation prefix: /home/users/khuck/src/ADIOS2/install_mpi
bin: bin
lib: lib64
include: include
cmake: lib64/cmake/adios2
python: lib64/python3.6/site-packages
Features:
Library Type: shared
Build Type: RelWithDebInfo
Testing: ON
Build Options:
Blosc : OFF
BZip2 : OFF
ZFP : OFF
SZ : OFF
MGARD : OFF
PNG : OFF
MPI : ON
DataMan : ON
SSC : ON
SST : ON
ZeroMQ : ON
HDF5 : OFF
Python : ON
Fortran : ON
SysVShMem: ON
Profiling: ON
Endian_Reverse: OFF
RDMA Transport for Staging: Available
-- Configuring done
-- Generating done
-- Build files have been written to: /home/users/khuck/src/ADIOS2/build
Running the heatTransfer example on this workstation with SST crashes as shown below (the same or a very similar crash occurs without the --mca arguments):
mpirun --mca btl_openib_allow_ib true --mca btl_openib_warn_default_gid_prefix 0 -n 16 ./heatSimulation sim.bp 4 4 64 64 100 10 : -n 4 ./heatAnalysis sim.bp analysis.bp 2 2
Process decomposition : 4 x 4
Array size per process : 64 x 64
Number of output steps : 100
Iterations per step : 10
Using SST engine for input
Using BP4 engine for output
[delphi:53491] *** Process received signal ***
[delphi:53491] Signal: Segmentation fault (11)
[delphi:53491] Signal code: Address not mapped (1)
[delphi:53491] Failing at address: 0x10
[delphi:53491] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x7f2fc755c5d0]
[delphi:53491] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7f2fc6d00080]
[delphi:53491] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x171ec)[0x7f2fc6d001ec]
[delphi:53491] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstWriterOpen+0xc6)[0x7f2fc6cf6ba6]
[delphi:53491] [ 4] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstWriterC2ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x167)[0x7f2fc8a14827]
[delphi:53491] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x5f3)[0x7f2fc871b003]
[delphi:53491] [ 6] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7f2fc8a7c3d3]
[delphi:53491] [ 7] ./heatSimulation[0x40fb36]
[delphi:53491] [ 8] ./heatSimulation[0x40b4ef]
[delphi:53491] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f2fc71a23d5]
[delphi:53491] [10] ./heatSimulation[0x40b71f]
[delphi:53491] *** End of error message ***
[delphi:53489] *** Process received signal ***
[delphi:53489] Signal: Segmentation fault (11)
[delphi:53489] Signal code: Address not mapped (1)
[delphi:53489] Failing at address: 0x10
[delphi:53489] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x7f115d8045d0]
[delphi:53489] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7f115cfa8080]
[delphi:53489] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x171ec)[0x7f115cfa81ec]
[delphi:53489] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstWriterOpen+0xc6)[0x7f115cf9eba6]
[delphi:53489] [ 4] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstWriterC2ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x167)[0x7f115ecbc827]
[delphi:53489] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x5f3)[0x7f115e9c3003]
[delphi:53489] [ 6] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7f115ed243d3]
[delphi:53489] [ 7] ./heatSimulation[0x40fb36]
[delphi:53489] [ 8] ./heatSimulation[0x40b4ef]
[delphi:53489] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f115d44a3d5]
[delphi:53489] [10] ./heatSimulation[0x40b71f]
[delphi:53489] *** End of error message ***
[delphi:53482] *** Process received signal ***
[delphi:53482] Signal: Segmentation fault (11)
[delphi:53482] Signal code: Address not mapped (1)
[delphi:53482] Failing at address: 0x10
[delphi:53482] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x7f63ceddf5d0]
[delphi:53482] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7f63ce583080]
[delphi:53482] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x171ec)[0x7f63ce5831ec]
[delphi:53482] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstWriterOpen+0xc6)[0x7f63ce579ba6]
[delphi:53482] [ 4] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstWriterC2ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x167)[0x7f63d0297827]
[delphi:53482] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x5f3)[0x7f63cff9e003]
[delphi:53482] [ 6] [delphi:53486] *** Process received signal ***
[delphi:53486] Signal: Segmentation fault (11)
[delphi:53486] Signal code: Address not mapped (1)
[delphi:53486] Failing at address: 0x10
/storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7f63d02ff3d3]
[delphi:53482] [ 7] ./heatSimulation[0x40fb36]
[delphi:53482] [ 8] ./heatSimulation[0x40b4ef]
[delphi:53482] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f63cea253d5]
[delphi:53482] [10] ./heatSimulation[0x40b71f]
[delphi:53482] *** End of error message ***
[delphi:53484] *** Process received signal ***
[delphi:53484] Signal: Segmentation fault (11)
[delphi:53484] Signal code: Address not mapped (1)
[delphi:53484] Failing at address: 0x10
[delphi:53486] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x7fe2f7e8b5d0]
[delphi:53486] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7fe2f762f080]
[delphi:53486] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x171ec)[0x7fe2f762f1ec]
[delphi:53486] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstWriterOpen+0xc6)[0x7fe2f7625ba6]
[delphi:53486] [ 4] [delphi:53487] *** Process received signal ***
[delphi:53487] Signal: Segmentation fault (11)
[delphi:53487] Signal code: Address not mapped (1)
[delphi:53487] Failing at address: 0x10
[delphi:53487] [ 0] [delphi:53496] *** Process received signal ***
[delphi:53496] Signal: Segmentation fault (11)
[delphi:53496] Signal code: Address not mapped (1)
[delphi:53496] Failing at address: 0x10
[delphi:53496] [ 0] [delphi:53498] *** Process received signal ***
[delphi:53498] Signal: Segmentation fault (11)
[delphi:53498] Signal code: Address not mapped (1)
[delphi:53498] Failing at address: 0x10
[delphi:53484] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x7fa4407215d0]
[delphi:53484] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7fa43fec5080]
[delphi:53484] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x171ec)[0x7fa43fec51ec]
[delphi:53484] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstWriterOpen+0xc6)[0x7fa43febbba6]
[delphi:53484] [ 4] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstWriterC2ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x167)[0x7fa441bd9827]
[delphi:53484] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x5f3)[0x7fa4418e0003]
[delphi:53484] [ 6] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7fa441c413d3]
[delphi:53484] [ 7] ./heatSimulation[0x40fb36]
[delphi:53484] [ 8] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstWriterC2ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x167)[0x7fe2f9343827]
[delphi:53486] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x5f3)[0x7fe2f904a003]
[delphi:53486] [ 6] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7fe2f93ab3d3]
[delphi:53486] [ 7] ./heatSimulation[0x40fb36]
[delphi:53486] [ 8] ./heatSimulation[0x40b4ef]
[delphi:53486] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fe2f7ad13d5]
[delphi:53486] [10] ./heatSimulation[0x40b71f]
[delphi:53486] *** End of error message ***
/lib64/libpthread.so.0(+0xf5d0)[0x7efcb0ff15d0]
[delphi:53487] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7efcb0795080]
[delphi:53487] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x171ec)[0x7efcb07951ec]
[delphi:53487] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstWriterOpen+0xc6)[0x7efcb078bba6]
[delphi:53487] [ 4] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstWriterC2ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x167)[0x7efcb24a9827]
[delphi:53487] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x5f3)[0x7efcb21b0003]
[delphi:53487] [ 6] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7efcb25113d3]
[delphi:53487] [ 7] ./heatSimulation[0x40fb36]
[delphi:53487] [ 8] ./heatSimulation[0x40b4ef]
[delphi:53487] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7efcb0c373d5]
[delphi:53487] [10] ./heatSimulation[0x40b71f]
[delphi:53487] *** End of error message ***
/lib64/libpthread.so.0(+0xf5d0)[0x7fa144f785d0]
[delphi:53496] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7fa14471c080]
[delphi:53496] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x171ec)[0x7fa14471c1ec]
[delphi:53496] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstWriterOpen+0xc6)[0x7fa144712ba6]
[delphi:53496] [ 4] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstWriterC2ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x167)[0x7fa146430827]
[delphi:53496] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x5f3)[0x7fa146137003]
[delphi:53496] [ 6] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7fa1464983d3]
[delphi:53496] [ 7] ./heatSimulation[0x40fb36]
[delphi:53496] [ 8] ./heatSimulation[0x40b4ef]
[delphi:53496] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fa144bbe3d5]
[delphi:53496] [10] [delphi:53498] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x7f124180d5d0]
[delphi:53498] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7f1240fb1080]
[delphi:53498] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x172ff)[0x7f1240fb12ff]
[delphi:53498] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstReaderOpen+0xaa)[0x7f1240fa49fa]
[delphi:53498] [ 4] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstReaderC1ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x161)[0x7f1242cb2be1]
[delphi:53498] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x71e)[0x7f12429cc12e]
[delphi:53498] [ 6] ./heatSimulation[0x40b71f]
[delphi:53496] *** End of error message ***
./heatSimulation[0x40b4ef]
[delphi:53484] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fa4403673d5]
[delphi:53484] [10] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7f1242d2d3d3]
[delphi:53498] [ 7] ./heatAnalysis[0x40a827]
[delphi:53498] [ 8] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f12414533d5]
[delphi:53498] [ 9] ./heatAnalysis[0x40ba5f]
[delphi:53498] *** End of error message ***
./heatSimulation[0x40b71f]
[delphi:53484] *** End of error message ***
[delphi:53499] *** Process received signal ***
[delphi:53499] Signal: Segmentation fault (11)
[delphi:53499] Signal code: Address not mapped (1)
[delphi:53499] Failing at address: 0x10
[delphi:53499] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x7f1b289265d0]
[delphi:53499] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7f1b280ca080]
[delphi:53499] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x172ff)[0x7f1b280ca2ff]
[delphi:53499] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstReaderOpen+0xaa)[0x7f1b280bd9fa]
[delphi:53499] [ 4] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstReaderC1ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x161)[0x7f1b29dcbbe1]
[delphi:53499] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x71e)[0x7f1b29ae512e]
[delphi:53499] [ 6] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7f1b29e463d3]
[delphi:53499] [ 7] ./heatAnalysis[0x40a827]
[delphi:53499] [ 8] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f1b2856c3d5]
[delphi:53499] [ 9] ./heatAnalysis[0x40ba5f]
[delphi:53499] *** End of error message ***
[delphi:53485] *** Process received signal ***
[delphi:53485] Signal: Segmentation fault (11)
[delphi:53485] Signal code: Address not mapped (1)
[delphi:53485] Failing at address: 0x10
[delphi:53485] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x7fc7d7b995d0]
[delphi:53485] [ 1] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x17080)[0x7fc7d733d080]
[delphi:53485] [ 2] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(+0x171ec)[0x7fc7d733d1ec]
[delphi:53485] [ 3] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/../lib64/libadios2_sst.so.2(SstWriterOpen+0xc6)[0x7fc7d7333ba6]
[delphi:53485] [ 4] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core6engine9SstWriterC2ERNS0_2IOERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x167)[0x7fc7d9051827]
[delphi:53485] [ 5] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios24core2IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0x5f3)[0x7fc7d8d58003]
[delphi:53485] [ 6] /storage/users/khuck/src/ADIOS2/install_mpi/lib64/libadios2.so.2(_ZN6adios22IO4OpenERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_4ModeEP19ompi_communicator_t+0xe3)[0x7fc7d90b93d3]
[delphi:53485] [ 7] ./heatSimulation[0x40fb36]
[delphi:53485] [ 8] ./heatSimulation[0x40b4ef]
[delphi:53485] [ 9] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fc7d77df3d5]
[delphi:53485] [10] ./heatSimulation[0x40b71f]
[delphi:53485] *** End of error message ***
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 7 with PID 0 on node delphi exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
The gdb backtrace from the core file of one of the ranks:
[khuck@delphi cpp]$ gdb ./heatSimulation core.56907
ImportError: No module named site
[khuck@delphi cpp]$ module unload python
[khuck@delphi cpp]$ gdb ./heatSimulation core.56907
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /storage/users/khuck/src/adiosvm/Tutorial/heat2d/cpp/heatSimulation...done.
[New LWP 56907]
[New LWP 56929]
[New LWP 57462]
[New LWP 56941]
[New LWP 57851]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./heatSimulation sim.bp 4 4 64 64 100 10'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007fef071c3080 in fi_ep_bind (flags=0, bfid=0x157b120, ep=0x0) at /usr/include/rdma/fi_endpoint.h:168
168 return ep->fid.ops->bind(&ep->fid, bfid, flags);
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.172-2.el7.x86_64 elfutils-libs-0.172-2.el7.x86_64 infinipath-psm-3.3-26_g604758e_open.2.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_6.x86_64 libattr-2.4.46-13.el7.x86_64 libcap-2.22-9.el7.x86_64 libcom_err-1.42.9-13.el7.x86_64 libfabric-1.6.1-2.el7.x86_64 libffi-3.0.13-18.el7.x86_64 libibverbs-17.2-3.el7.x86_64 libnl3-3.2.28-4.el7.x86_64 libpciaccess-0.14-1.el7.x86_64 libpsm2-10.3.58-1.el7.x86_64 librdmacm-17.2-3.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libsodium13-1.0.5-1.el7.x86_64 libuuid-2.23.2-59.el7_6.1.x86_64 numactl-libs-2.0.9-7.el7.x86_64 openpgm-5.2.122-2.el7.x86_64 pcre-8.32-17.el7.x86_64 systemd-libs-219-62.el7.x86_64 zeromq-4.1.4-5.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0 0x00007fef071c3080 in fi_ep_bind (flags=0, bfid=0x157b120, ep=0x0) at /usr/include/rdma/fi_endpoint.h:168
#1 init_fabric (fabric=0x1578c40, Params=<optimized out>)
at /home/users/khuck/src/ADIOS2/source/adios2/toolkit/sst/dp/rdma_dp.c:198
#2 0x00007fef071c31ec in RdmaInitWriter (Svcs=0x7fef073cbac0 <Svcs>, CP_Stream=0x155c420, Params=0x155c2f8)
at /home/users/khuck/src/ADIOS2/source/adios2/toolkit/sst/dp/rdma_dp.c:558
#3 0x00007fef071b9ba6 in SstWriterOpen (Name=Name@entry=0x155c3e0 "sim.bp", Params=Params@entry=0x155c2f8,
comm=comm@entry=0x155a710) at /home/users/khuck/src/ADIOS2/source/adios2/toolkit/sst/cp/cp_writer.c:1124
#4 0x00007fef08ed7827 in adios2::core::engine::SstWriter::SstWriter(adios2::core::IO&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode, ompi_communicator_t*) ()
at /home/users/khuck/src/ADIOS2/source/adios2/engine/sst/SstWriter.cpp:35
#5 0x00007fef08bde003 in construct<adios2::core::engine::SstWriter, adios2::core::IO&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode const&, ompi_communicator_t*&>
(this=<optimized out>, __p=0x155c230) at /storage/packages/gcc/8.1/include/c++/8.1.0/new:169
#6 construct<adios2::core::engine::SstWriter, adios2::core::IO&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode const&, ompi_communicator_t*&> (__a=...,
__p=0x155c230) at /storage/packages/gcc/8.1/include/c++/8.1.0/bits/alloc_traits.h:475
#7 _Sp_counted_ptr_inplace<adios2::core::IO&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode const&, ompi_communicator_t*&> (__a=..., this=0x155c220)
at /storage/packages/gcc/8.1/include/c++/8.1.0/bits/shared_ptr_base.h:549
#8 __shared_count<adios2::core::engine::SstWriter, std::allocator<adios2::core::engine::SstWriter>, adios2::core::IO&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode const&, ompi_communicator_t*&> (__a=..., this=<optimized out>)
at /storage/packages/gcc/8.1/include/c++/8.1.0/bits/shared_ptr_base.h:662
#9 __shared_ptr<std::allocator<adios2::core::engine::SstWriter>, adios2::core::IO&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode const&, ompi_communicator_t*&> (
__a=..., __tag=..., this=<optimized out>)
at /storage/packages/gcc/8.1/include/c++/8.1.0/bits/shared_ptr_base.h:1328
#10 shared_ptr<std::allocator<adios2::core::engine::SstWriter>, adios2::core::IO&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode const&, ompi_communicator_t*&> (
__a=..., __tag=..., this=<optimized out>)
at /storage/packages/gcc/8.1/include/c++/8.1.0/bits/shared_ptr.h:360
#11 allocate_shared<adios2::core::engine::SstWriter, std::allocator<adios2::core::engine::SstWriter>, adios2::core::IO&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode const&, ompi_communicator_t*&> (__a=...)
at /storage/packages/gcc/8.1/include/c++/8.1.0/bits/shared_ptr.h:707
#12 make_shared<adios2::core::engine::SstWriter, adios2::core::IO&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode const&, ompi_communicator_t*&> ()
at /storage/packages/gcc/8.1/include/c++/8.1.0/bits/shared_ptr.h:723
#13 adios2::core::IO::Open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode, ompi_communicator_t*) () at /home/users/khuck/src/ADIOS2/source/adios2/core/IO.cpp:567
#14 0x00007fef08f3f3d3 in adios2::IO::Open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, adios2::Mode, ompi_communicator_t*) ()
at /home/users/khuck/src/ADIOS2/bindings/CXX11/adios2/cxx11/IO.cpp:112
#15 0x000000000040fb36 in IO::IO(Settings const&, ompi_communicator_t*) () at simulation/IO_adios2.cpp:79
#16 0x000000000040b4ef in main () at simulation/heatSimulation.cpp:78
#17 0x00007fef076653d5 in __libc_start_main (main=0x40b320 <main>, argc=8, argv=0x7ffe942a12e8,
init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe942a12d8)
at ../csu/libc-start.c:266
#18 0x000000000040b71f in _start () at simulation/IO_adios2.cpp:118
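
The key detail in the backtrace above is frame #0: `fi_ep_bind` is called with `ep=0x0`, and line 168 of `fi_endpoint.h` dereferences `ep` immediately, which is consistent with the segfault. For long traces like this one, a small script can pick out frames that receive a null pointer argument. The following is only a rough triage sketch, not part of ADIOS2 or gdb: the function name `null_pointer_args` and the regular expressions are my own assumptions about how gdb formats frame lines, and it only inspects frames whose header and argument list sit on one line (template-heavy C++ frames are skipped).

```python
import re

# Assumed shape of a gdb frame header, e.g.
#   #0 0x00007fef071c3080 in fi_ep_bind (flags=0, bfid=0x157b120, ep=0x0) at ...
#   #1 init_fabric (fabric=0x1578c40, Params=<optimized out>)
FRAME_RE = re.compile(
    r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([\w:~]+)\s*\((.*?)\)"
)

# An argument printed as 0x0 (or 0x000...), not followed by more hex digits.
NULL_ARG_RE = re.compile(r"(\w+)=0x0+(?![0-9a-fA-Fx])")


def null_pointer_args(backtrace):
    """Return (frame_number, function, argument) for every 0x0 argument."""
    hits = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # continuation line or a frame this sketch cannot parse
        frame = int(match.group(1))
        func = match.group(2)
        args = match.group(3)
        for name in NULL_ARG_RE.findall(args):
            hits.append((frame, func, name))
    return hits


# First three frames copied from the backtrace in this report.
SAMPLE = """\
#0 0x00007fef071c3080 in fi_ep_bind (flags=0, bfid=0x157b120, ep=0x0) at /usr/include/rdma/fi_endpoint.h:168
#1 init_fabric (fabric=0x1578c40, Params=<optimized out>)
#2 0x00007fef071c31ec in RdmaInitWriter (Svcs=0x7fef073cbac0 <Svcs>, CP_Stream=0x155c420, Params=0x155c2f8)
"""

if __name__ == "__main__":
    for frame, func, name in null_pointer_args(SAMPLE):
        print(f"frame #{frame}: {func} received {name}=0x0")
```

On the three sample frames this reports only `ep` in `fi_ep_bind`, which points at `init_fabric` (`rdma_dp.c:198`) as the place where the endpoint handle was still null when it was bound.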