Greetings,
I've recently installed OFED 2.2.1 drivers for SLES 11 SP2 (x86_64) on a system previously running OFED 1.5.3. The device is described as "ConnectX-3 VPI adapter card; single-port QSFP; FDR10 IB (40Gb/s) and 10GigE; PCIe3.0x8 8GT/s;RoHS R6".
The installation completed successfully, and I can confirm the self-test passed with no issues:
# /usr/bin/hca_self_test.ofed
---- Performing Adapter Device Self Test ----
Number of CAs Detected ................. 1
PCI Device Check ....................... PASS
Kernel Arch ............................ x86_64
Host Driver Version .................... MLNX_OFED_LINUX-2.2-1.0.1 (OFED-2.2-1.0.0): 3.0.51-0.7.9-default
Host Driver RPM Check .................. PASS
Firmware on CA #0 VPI .................. v2.31.5050
Firmware Check on CA #0 (VPI) .......... PASS
Host Driver Initialization ............. PASS
Number of CA Ports Active .............. 0
Kernel Syslog Check .................... PASS
Node GUID on CA #0 (VPI) ............... NA
However, after rebooting the server, a number of kernel modules won't load automatically, and restarting or stopping openibd fails:
# service openibd restart
Unloading ib_addr [FAILED]
ERROR: Module ib_addr is in use by ib_uverbs,ib_core
# service openibd stop
Unloading ib_addr [FAILED]
ERROR: Module ib_addr is in use by ib_uverbs,ib_core
If I attempt to get device info, I get:
# /usr/bin/ibv_devinfo
No IB devices found
It looks like not all kernel modules are loaded at boot time; in particular, mlx4_core is missing:
# lsmod | grep -i ib
ib_ipoib 131809 0
ib_cm 53475 1 ib_ipoib
ib_sa 43351 2 ib_ipoib,ib_cm
ib_uverbs 56998 0
ib_umad 18232 0
ib_mad 56201 3 ib_cm,ib_sa,ib_umad
ib_core 121849 6 ib_ipoib,ib_cm,ib_sa,ib_uverbs,ib_umad,ib_mad
ib_addr 14606 2 ib_uverbs,ib_core
compat 22052 8 ib_ipoib,ib_cm,ib_sa,ib_uverbs,ib_umad,ib_mad,ib_core,ib_addr
ipv6_lib 340431 68 ib_ipoib,bridge,ib_addr,ipv6
libsas 77150 1 isci
scsi_transport_sas 40465 2 isci,libsas
libahci 34841 1 ahci
libata 228894 3 libsas,ahci,libahci
scsi_mod 231620 11 sg,isci,sd_mod,libsas,scsi_transport_sas,scsi_dh_rdac,scsi_dh_alua,scsi_dh_emc,scsi_dh_hp_sw,scsi_dh,libata
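Grepping for "ib" only lists mlx4_core when another listed module (such as mlx4_ib) references it, so an exact-name check is probably more reliable. As a rough sketch, something like the following could report which expected modules are absent. The function name missing_modules and the MODULES_FILE override are my own illustration, not part of OFED; by default it reads /proc/modules, which is the data lsmod prints.

```shell
#!/bin/sh
# Sketch: print each named module that is NOT currently loaded.
# MODULES_FILE defaults to /proc/modules (the data behind lsmod) and can be
# pointed at another file for testing. The function name is hypothetical.
missing_modules() {
    modules_file="${MODULES_FILE:-/proc/modules}"
    for mod in "$@"; do
        # The first field of each line is the module name; match it exactly.
        if ! awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
            echo "$mod"
        fi
    done
}
```

It could be called as: missing_modules mlx4_core mlx4_ib mlx4_en (the module list here is only my guess at what a ConnectX-3 card needs).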
If I attempt to load mlx4_core manually, I get:
# modprobe mlx4_core
FATAL: Error inserting mlx4_core (/lib/modules/3.0.51-0.7.9-default/updates/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko): Unknown symbol in module, or unknown parameter (see dmesg)
FATAL: Error running install command for mlx4_core
Finally, dmesg shows:
# dmesg | grep mlx4_core
[ 25.391619] mlx4_core: Unknown parameter `pfctx'
[ 25.572530] mlx4_core: Unknown parameter `pfctx'
[ 25.650660] mlx4_core: Unknown parameter `pfctx'
[ 25.708439] mlx4_core: Unknown parameter `pfctx'
[ 25.775047] mlx4_core: Unknown parameter `pfctx'
[13893.321476] mlx4_core: Unknown parameter `pfctx'
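The "Unknown parameter" message suggests that something is passing a pfctx option to mlx4_core which this build of the module does not accept. My guess (an assumption, not confirmed) is a leftover options line from the 1.5.3 install somewhere in the modprobe configuration. A minimal sketch to look for it follows; the function name find_module_option is made up for illustration, and the paths to search are passed in by the caller.

```shell
#!/bin/sh
# Sketch: list config lines that mention both a module and a parameter.
# Usage: find_module_option MODULE PARAM PATH...
# The function name is hypothetical; the paths are whatever the caller passes.
find_module_option() {
    module="$1"
    param="$2"
    shift 2
    # -r: recurse into directories, -n: show line numbers,
    # -H: always print the file name, -s: stay quiet about missing paths.
    grep -rnHs -- "$module" "$@" | grep -- "$param"
}
```

On this system I would try: find_module_option mlx4_core pfctx /etc/modprobe.d /etc/modprobe.conf /etc/modprobe.conf.local (the usual modprobe locations as far as I know, they may differ elsewhere). Running modinfo -p mlx4_core should then list the parameters the installed module does accept, for comparison.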
Perhaps I need to add kernel support with more options?
John