Hello,
I recently installed a test system that uses multirail InfiniBand. It consists of two fat nodes, each with four Xeon E5-4650 CPUs. Each CPU has a ConnectX-2 card directly attached, and both nodes are connected to an IS5025 switch. The latest MLNX-OFED is installed and working. Is there a way to force MXM to use all four cards?
If I run, for example, the osu_alltoall benchmark with
/usr/mpi/gcc/openmpi-1.8.4/bin/mpirun --mca btl self,openib -n 64 --hostfile test /usr/mpi/gcc/openmpi-1.8.4/tests/osu-micro-benchmarks-4.4/osu_alltoall
I get a lot of warnings like this:
[1421687772.785621] [linux-3e34:12178:0] ib_dev.c:405 MXM WARN Skipping IB device 'mlx4_2' - up to 2 devices are supported
[1421687772.785640] [linux-3e34:12178:0] ib_dev.c:405 MXM WARN Skipping IB device 'mlx4_1' - up to 2 devices are supported
[1421687772.785647] [linux-3e34:12178:0] ib_dev.c:405 MXM WARN Skipping IB device 'mlx4_0' - up to 2 devices are supported
# OSU MPI All-to-All Personalized Exchange Latency Test v4.4
# Size Avg Latency(us)
1 66.02
2 64.99
4 66.64
8 76.15
16 81.32
32 88.54
64 137.83
128 186.70
256 294.64
512 558.72
1024 1287.07
2048 2418.95
4096 3637.48
8192 5647.53
16384 9947.06
32768 19036.50
65536 38769.77
131072 71470.19
262144 141088.41
524288 294086.71
1048576 600280.88
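With 64 ranks the same warning is printed many times, so it can help to collapse the output to the set of devices being skipped. The following is only a minimal sketch, assuming the warning format shown above (other MXM versions may word it differently); the helper name `skipped_devices` is hypothetical and the sample lines are the three warnings from this run.

```python
import re
from collections import Counter

# Pattern for the MXM warning as it appears in this run (assumed format).
SKIP_RE = re.compile(r"MXM\s+WARN\s+Skipping IB device '([^']+)'")


def skipped_devices(lines):
    """Count how often each IB device is reported as skipped."""
    counts = Counter()
    for line in lines:
        match = SKIP_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The three warning lines quoted in this post.
sample = [
    "[1421687772.785621] [linux-3e34:12178:0] ib_dev.c:405 MXM WARN "
    "Skipping IB device 'mlx4_2' - up to 2 devices are supported",
    "[1421687772.785640] [linux-3e34:12178:0] ib_dev.c:405 MXM WARN "
    "Skipping IB device 'mlx4_1' - up to 2 devices are supported",
    "[1421687772.785647] [linux-3e34:12178:0] ib_dev.c:405 MXM WARN "
    "Skipping IB device 'mlx4_0' - up to 2 devices are supported",
]

counts = skipped_devices(sample)
for device in sorted(counts):
    print(device, counts[device])
```

Feeding the full mpirun output (for example, read from a saved log file) instead of `sample` would show whether every rank skips the same devices.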
When I disable MXM via --mca mtl ^mxm, the warnings disappear and the latency drops dramatically, from 66.02 us to 37.48 us at 1 byte and from about 600 ms to about 124 ms at 1 MiB:
# Size Avg Latency(us)
1 37.48
2 37.12
4 38.24
8 39.59
16 50.07
32 47.93
64 53.22
128 77.66
256 116.47
512 214.17
1024 335.79
2048 594.70
4096 1045.58
8192 1334.22
16384 2972.10
32768 4990.44
65536 9215.92
131072 16271.00
262144 31121.92
524288 61814.72
1048576 124195.41
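To compare the two runs size by size, the per-size latency ratio can be computed directly from the pasted benchmark output. This is a rough sketch rather than a definitive tool: the function names `parse_osu` and `slowdown` are hypothetical, and only a few rows from the two tables above are embedded to keep it short.

```python
def parse_osu(text):
    """Parse 'size latency' rows of osu_alltoall output into {size: latency_us}."""
    result = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # header/comment lines
        fields = line.replace("|", " ").split()
        if len(fields) != 2:
            continue  # not a data row
        result[int(fields[0])] = float(fields[1])
    return result


def slowdown(with_mxm, without_mxm):
    """Return {size: latency_with / latency_without} for sizes present in both runs."""
    return {
        size: with_mxm[size] / without_mxm[size]
        for size in sorted(with_mxm)
        if size in without_mxm
    }


# A subset of rows copied from the two runs in this post.
with_mxm_text = """# Size Avg Latency(us)
1 66.02
1024 1287.07
1048576 600280.88
"""
without_mxm_text = """# Size Avg Latency(us)
1 37.48
1024 335.79
1048576 124195.41
"""

ratios = slowdown(parse_osu(with_mxm_text), parse_osu(without_mxm_text))
for size, ratio in ratios.items():
    print(f"{size:>8} bytes: {ratio:.2f}x higher latency with MXM")
```

Pasting the complete tables into the two strings would give the ratio for every message size, which makes it easier to see where the gap between the two configurations widens.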
I would be thankful for any suggestions!
Kind regards,
Tobias