SST: RDMA data plane issues on InfiniBand
Created by: keichi
I found some issues in the RDMA data plane on InfiniBand.
RDMA DP fails to detect InfiniBand network interfaces
The RDMA DP checks whether the name of an interface's underlying fabric provider matches one of a set of predefined values (verbs, gni and psm2) to decide whether the interface supports RDMA, unless a specific interface is forced via FABRIC_IFACE or DataInterface. However, the provider name can also be something like verbs;ofi_rxm or verbs;ofi_rxd, where ofi_rxm and ofi_rxd are utility fabric providers that are layered on top of base providers (see the libfabric docs about prov_name). In this case, the RDMA DP ignores that fabric interface.
$ jsrun -n 1 $PROJWORK/csc143/keichi/libfabric-install/bin/fi_info -p verbs -t FI_EP_RDM
provider: verbs;ofi_rxm
fabric: IB-0xfe80000000000000
domain: mlx5_0
version: 1.0
type: FI_EP_RDM
protocol: FI_PROTO_RXM
provider: verbs;ofi_rxd
fabric: IB-0xfe80000000000000
domain: mlx5_1-dgram
version: 1.0
type: FI_EP_RDM
protocol: FI_PROTO_RXD
provider: verbs;ofi_rxd
fabric: IB-0xfe80000000000000
domain: mlx5_3-dgram
version: 1.0
type: FI_EP_RDM
protocol: FI_PROTO_RXD
provider: verbs;ofi_rxd
fabric: IB-0xfe80000000000000
domain: mlx5_0-dgram
version: 1.0
type: FI_EP_RDM
protocol: FI_PROTO_RXD
provider: verbs;ofi_rxd
fabric: IB-0xfe80000000000000
domain: mlx5_2-dgram
version: 1.0
type: FI_EP_RDM
protocol: FI_PROTO_RXD
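One possible way to make the check tolerate layered providers is to split the provider name on ';' and compare each component against the known base providers. The sketch below is only an illustration under that assumption: provider_supports_rdma and rdma_providers are hypothetical names, not the ones used in the SST source, and the list of base providers is taken from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Base providers that the description above lists as RDMA-capable. */
const char *const rdma_providers[] = {"verbs", "gni", "psm2"};

/*
 * Hypothetical helper: return true if any ';'-separated component of
 * prov_name exactly equals one of the known base providers, so that
 * "verbs;ofi_rxm" is accepted as well as plain "verbs".
 */
bool provider_supports_rdma(const char *prov_name)
{
    const size_t count = sizeof(rdma_providers) / sizeof(rdma_providers[0]);

    if (prov_name == NULL)
        return false;

    const char *p = prov_name;
    while (*p != '\0') {
        /* Length of the current component, up to the next ';' or the end. */
        size_t len = strcspn(p, ";");

        for (size_t i = 0; i < count; i++) {
            if (strlen(rdma_providers[i]) == len &&
                strncmp(p, rdma_providers[i], len) == 0)
                return true;
        }

        p += len;
        if (*p == ';')
            p++; /* skip the separator and continue with the next component */
    }
    return false;
}
```

With this helper, verbs, verbs;ofi_rxm and verbs;ofi_rxd would all be accepted, while a name with no matching component would still be rejected. Whether every layered combination should actually be accepted (for example the datagram-based ofi_rxd domains) is a separate decision that this sketch does not address.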
RDMA DP hangs on InfiniBand
The RDMA DP reader hangs inside RdmaWaitForCompletion() while synchronously waiting for RMA read requests to complete. This is because the RDMA DP does not explicitly request automatic progress (FI_PROGRESS_AUTO). If the provider chooses manual progress (FI_PROGRESS_MANUAL), the application needs to call fi_cq_read() or a similar function periodically to progress the communication. Currently, only the reader side calls fi_cq_read() in the RDMA DP. Although all fabric providers are required to support automatic progress, some (including the verbs provider) choose manual progress by default. RMA reads won't complete on these fabric providers unless we also call fi_cq_read() on the writer.
The issues listed above are confirmed on both Summit and JAXA JSS2, using the latest libfabric release (v1.7.2).