I would check the BIOS settings on the third system. Could you go through sections 2.4.2-2.4.5 of the performance tuning guide?
Do all the systems have the same architecture? Is the IB adapter in the same motherboard slot on every system?
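If it helps with comparing the slots, here is a rough Python sketch that checks whether the adapter's PCIe link negotiated the full speed and width it advertises. It only parses text in the format printed by `lspci -vv` (the `LnkCap` and `LnkSta` lines). The function names and the sample text are made up for illustration and do not come from any of your systems.

```python
import re

# Matches the "LnkCap:" and "LnkSta:" lines of `lspci -vv` output.
# "LnkCap2:" / "LnkSta2:" are skipped because the colon must follow directly.
LINK_RE = re.compile(r"(LnkCap|LnkSta):.*?Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_link(lspci_text):
    """Return {'LnkCap': (speed_gt_s, width), 'LnkSta': (speed_gt_s, width)}."""
    links = {}
    for match in LINK_RE.finditer(lspci_text):
        field, speed, width = match.groups()
        # Keep the first occurrence of each field (one device per input).
        links.setdefault(field, (float(speed), int(width)))
    return links


def link_is_degraded(lspci_text):
    """True if the negotiated link (LnkSta) is below the capability (LnkCap)."""
    links = parse_link(lspci_text)
    cap_speed, cap_width = links["LnkCap"]
    sta_speed, sta_width = links["LnkSta"]
    return sta_speed < cap_speed or sta_width < cap_width


# Illustrative sample only: a card capable of 8GT/s x8 that trained at 5GT/s x8.
SAMPLE = (
    "\t\tLnkCap:\tPort #8, Speed 8GT/s, Width x8, ASPM L0s\n"
    "\t\tLnkSta:\tSpeed 5GT/s, Width x8, TrErr- Train- SlotClk+\n"
)

if __name__ == "__main__":
    print(parse_link(SAMPLE))
    print("degraded:", link_is_degraded(SAMPLE))
```

On each node you would feed it the output of `lspci -vv -s <address of the HCA>` (run as root so the link fields are shown) and compare the results across the systems.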
Maybe also try unplugging the cables from the switch, starting opensm on one of the nodes, and testing a direct connection (to see whether the switch port might be the problem).
The MT4099 is the ConnectX-3 VPI HCA, and I believe it should get better latency.