Thanks, that's a good point... I guess the OpenMPI stack thinks it has 2
physical transports and, while trying to load-share across them, runs into
connectivity problems.
Do you think we can have both RoCE and IB active on the same set of hosts?
Some groups here would like to use RoCE, but of course MPI/IB is the
communication method of choice.
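For context, this is roughly how I imagine we would pin MPI traffic to the IB
port so the two transports stay apart (the mlx4_0:1 device/port and the job
size are just placeholders, nothing we have verified on our setup):

    # restrict OpenMPI to the openib BTL, and to port 1 of the first HCA only
    mpirun --mca btl openib,sm,self \
           --mca btl_openib_if_include mlx4_0:1 \
           -np 16 ./our_app

Would something along those lines be the recommended way to keep RoCE and IB
traffic from interfering with each other?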
Actually, here is a question about the selection of routes between end-points:
in the Fat-Tree topology we have multiple alternative paths connecting each
pair of end-points (X, Y). Who determines the specific route communication
will take between two specific end-points (A, B)? Is it the MPI stack itself
at IB connection establishment time, or does it consult the SM? And can
re-routing (or selection of an alternative to the initial route) take place at
the request of, say, the MPI stack, or does the SM have to be consulted or
adjust its own routing tables?
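So far the only way I know to inspect what the SM has programmed is with the
standard diagnostics, along these lines (the LIDs below are just example
values, not from our fabric):

    # show the path currently taken from source LID 23 to destination LID 47
    ibtracert 23 47

    # dump the unicast forwarding table of the switch at LID 5
    ibroute 5

Is that the right level to be looking at, or does the answer really live in
the path records the MPI stack queries at connection setup?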
Finally, we are using the OFED that came with RHEL 6.3 (1.5.4, I think?) for
various, mostly non-technical, reasons. Do you have any concrete argument in
favor of deploying Mellanox's own latest OFED on that Linux distribution?
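(In case the exact versions matter, this is how I have been checking what the
distro's inbox stack gives us; package names as they appear on RHEL 6.3:)

    # query the versions of the RDMA stack shipped by the distribution
    rpm -q rdma libibverbs librdmacm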
Thanks!
Michael