AndiH/Makefile

## README.md

      
MVAPICH2-GDR Multi-Node MPI Bug


Software version: MVAPICH2-GDR 2.3.3 (with latest Allreduce fix)
Submitter: Andreas Herten (Jülich Supercomputing Center (JSC), Forschungszentrum Jülich)
System: JUWELS Supercomputer at JSC
InfiniBand OFED version: 4.6

Update, 27 Jan 2020: See section "Env Variable: MV2_USE_RDMA_CM=0" at the end

Short Description

A simple MPI program crashes when using multiple nodes.
Files in this repository are provided to reproduce the behavior.

Description

Following the fix for the previous MPI_Allreduce() bug, we still cannot launch a program on two nodes. The problem already occurs for a basic MPI skeleton consisting only of MPI_Init() and MPI_Finalize().
The attached program mpi-init.cu reproduces the behavior. It essentially consists of:
std::cout << "Begin." << std::endl;
MPI_Init(&argc,&argv);

std::cout << "End." << std::endl;
MPI_Finalize();
Please compile it with make.
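Because the Makefile calls `nvcc` and `mpic++` and reads `CUDA_HOME` from the environment, a quick check before running `make` can help rule out build-environment problems. The following is only a minimal sketch; the function name `check_build_env` is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of the repository): verifies that the
# tools and the variable referenced by the Makefile are available.
check_build_env() {
    missing=0
    for tool in nvcc mpic++; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    if [ -z "${CUDA_HOME:-}" ]; then
        echo "CUDA_HOME is not set" >&2
        missing=1
    fi
    return "$missing"
}
```

It could be used as `check_build_env && make`, so that the build only starts when both compilers and `CUDA_HOME` are present.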
When running mpi-init.exe on one node, everything works as intended:
➜ srun --nodes 1 --ntasks-per-node 1 ./mpi-init.exe
Begin.
End.

… but when launching the executable on two nodes, a crash occurs:
➜ srun --nodes 2 --ntasks-per-node 1 ./mpi-init.exe
Begin.
Begin.
INTERNAL ERROR: invalid error code ffffffff (Ring Index out of range) in rdma_cm_get_local_ip:1556
[jwc09n006.adm09.juwels.fzj.de:mpi_rank_1][error_sighandler] Caught error: Segmentation fault (signal 11)
srun: error: jwc09n006: task 1: Segmentation fault
srun: error: jwc09n003: task 0: Terminated
srun: Force Terminated job step 2090291.5
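
In the runs above, every rank prints "Begin." and, if it gets past MPI_Init(), "End.". A run can therefore be classified by comparing the two counts. The following is a minimal sketch of such a check; the helper name `check_output` is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of the repository): reads the program's
# output on stdin and compares how many ranks printed "Begin." and "End.".
check_output() {
    output=$(cat)
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    begins=$(printf '%s\n' "$output" | grep -c '^Begin\.$' || true)
    ends=$(printf '%s\n' "$output" | grep -c '^End\.$' || true)
    if [ "$begins" -gt 0 ] && [ "$begins" -eq "$ends" ]; then
        echo "ok: $ends of $begins ranks reached the end"
    else
        echo "incomplete: $ends of $begins ranks reached the end"
        return 1
    fi
}
```

For example, `srun --nodes 2 --ntasks-per-node 1 ./mpi-init.exe 2>&1 | check_output` would report an incomplete run for the crash shown above, since no rank prints "End.".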

Notes

As before, this problem only occurs on our JUWELS system. On JURECA, with OFED 4.7, the program works as expected.
We intend to upgrade the OFED stack on JUWELS next week to match that of JURECA. If you think the problem is related to the OFED stack, we can postpone further debugging until after the upgrade.

Env Variable: MV2_USE_RDMA_CM=0

Setting MV2_USE_RDMA_CM=0 fixes the issue. Without the variable, the two-node run crashes as before; with it, both ranks run to completion:
➜ srun --nodes 2 ./mpi-init.exe
Begin.
Begin.
INTERNAL ERROR: invalid error code ffffffff (Ring Index out of range) in rdma_cm_get_local_ip:1556
[jwc09n012.adm09.juwels.fzj.de:mpi_rank_1][error_sighandler] Caught error: Segmentation fault (signal 11)
srun: error: jwc09n012: task 1: Segmentation fault
srun: error: jwc09n009: task 0: Terminated
srun: Force Terminated job step 2095679.0

➜ MV2_USE_RDMA_CM=0 srun --nodes 2 ./mpi-init.exe
Begin.
Begin.
End.
End.
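
Until the underlying problem is resolved, the variable has to be set for every launch. One way to avoid forgetting it is a small wrapper, sketched below; the function name `without_rdma_cm` is hypothetical and not part of this repository.

```shell
# Hypothetical wrapper (not part of the repository): runs the given
# command with MV2_USE_RDMA_CM=0 set only for that command.
without_rdma_cm() {
    env MV2_USE_RDMA_CM=0 "$@"
}
```

A launch would then read `without_rdma_cm srun --nodes 2 ./mpi-init.exe`, which is equivalent to the prefixed call shown above and leaves the calling shell's environment unchanged.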


## Makefile
MPICXX = mpic++
NVCC = nvcc

FLAGS =
MPIFLAGS = -Wall -I$$CUDA_HOME/include/ -L$$CUDA_HOME/lib64/ -lcudart

.PHONY: all

all: mpi-init.exe

%.o: %.cu Makefile
	$(NVCC) $(FLAGS) -c -o $@ $<

%.exe: %.o
	$(MPICXX) $(FLAGS) $(MPIFLAGS) -o $@ $<

.PHONY: clean
clean:
	rm -f *.exe *.o

## mpi-init.cu
#include <iostream>
#include <mpi.h>

int main(int argc, char** argv) {

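    std::cout << "Begin." << std::endl;
    // In the failing two-node run, "End." is never printed,
    // so the crash appears to happen inside MPI_Init().
    MPI_Init(&argc,&argv);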

    std::cout << "End." << std::endl;
    MPI_Finalize();

    return 0;
}