Channel: Mellanox Interconnect Community: Message List

XenServer 6.1 and IPoIB


Hi all!

 

I'm having trouble getting IPoIB to work in a XenServer environment.

 

Hardware:

 

IBM BladeCenter H with VOLTAIRE 40 GB (QDR) INFINIBAND SWITCH MODULE

IBM HS22 blades with Mellanox 40Gb/s QDR Infiniband Expansion Card (CFFh)

 

Software:

 

XenServer 6.1 with MLNX_OFED_LINUX-2.1-1.0.0-xenserver6.x-i686.iso installed.

 

Some outputs from Blade3:

 

[root@blade3 ~]# ibstat
CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 2
        Firmware version: 2.9.1000
        Hardware version: b0
        Node GUID: 0x0002c9030028b394
        System image GUID: 0x0002c9030028b397
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 5
                LMC: 0
                SM lid: 1
                Capability mask: 0x02510868
                Port GUID: 0x0002c9030028b395
                Link layer: InfiniBand
        Port 2:
                State: Down
                Physical state: Polling
                Rate: 10
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x02510868
                Port GUID: 0x0002c9030028b396
                Link layer: InfiniBand

[root@blade3 ~]# ibdev2netdev
mlx4_0 port 1 ==> ib0 (Up)
mlx4_0 port 2 ==> ib1 (Down)

 

 

Some outputs from Blade4:

 

[root@blade4 ~]# ibstat
CA 'mlx4_0'
        CA type: MT26428
        Number of ports: 2
        Firmware version: 2.9.1000
        Hardware version: b0
        Node GUID: 0x0002c9030028b38c
        System image GUID: 0x0002c9030028b38f
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 1
                LMC: 0
                SM lid: 1
                Capability mask: 0x02510868
                Port GUID: 0x0002c9030028b38d
                Link layer: InfiniBand
        Port 2:
                State: Down
                Physical state: Polling
                Rate: 10
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x02510868
                Port GUID: 0x0002c9030028b38e
                Link layer: InfiniBand

[root@blade4 ~]# ibdev2netdev
mlx4_0 port 1 ==> ib0 (Up)
mlx4_0 port 2 ==> ib1 (Down)

 

The problem is that I can't start opensm; it hangs after starting:

 

[root@blade4 ~]# cat /var/log/opensm.log
Jan 17 09:37:50 890720 [B75958D0] 0x03 -> OpenSM 4.0.5.MLNX20131217.d8345a7
Jan 17 09:37:50 890775 [B75958D0] 0x80 -> OpenSM 4.0.5.MLNX20131217.d8345a7
Jan 17 09:37:50 891449 [B75958D0] 0x02 -> osm_vendor_init: 1000 pending umads specified
Jan 17 09:37:50 905260 [B75958D0] 0x80 -> Entering DISCOVERING state
Jan 17 09:37:50 905359 [B75958D0] 0x02 -> osm_vendor_bind: Mgmt class 0x81 binding to port GUID 0x2c9030028b38d
Jan 17 09:37:50 939076 [B75958D0] 0x02 -> osm_vendor_bind: Mgmt class 0x03 binding to port GUID 0x2c9030028b38d
Jan 17 09:37:50 939124 [B75958D0] 0x02 -> osm_vendor_bind: Mgmt class 0x04 binding to port GUID 0x2c9030028b38d
Jan 17 09:37:50 939169 [B75958D0] 0x02 -> osm_vendor_bind: Mgmt class 0x21 binding to port GUID 0x2c9030028b38d
Jan 17 09:37:50 939212 [B75958D0] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0002c9030028b38d
Jan 17 09:37:50 946927 [B3D8DB90] 0x80 -> Entering MASTER state
Jan 17 09:37:50 946952 [B3D8DB90] 0x01 -> osm_prtn_make_partitions: Partition configuration /etc/opensm/partitions.conf is not accessible (No such file or directory)
Jan 17 09:37:50 947868 [B3D8DB90] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
Jan 17 09:37:50 949698 [B3D8DB90] 0x02 -> SUBNET UP
Jan 17 09:37:51 621411 [B6D93B90] 0x01 -> log_trap_info: Received Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) Producer:1 (Channel Adapter) from LID:1 TID:0x0000000000000016
Jan 17 09:37:51 621445 [B6D93B90] 0x02 -> trap_rcv_process_request: Trap 144 Node description update
Jan 17 09:37:51 621463 [B6D93B90] 0x02 -> log_notice: Reporting Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) from LID:1 GID:fe80::2:c903:28:b38d
Jan 17 09:37:52 940296 [B6592B90] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:401b:ffff::ffff:ffff
Jan 17 09:37:52 940710 [B258AB90] 0x01 -> log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
                        SubnGetResp(SwitchInfo), attr_mod 0x0, TID 0x12d4
                        Initial path: 0,1 Return path: 0,28
Jan 17 09:37:52 940985 [B6592B90] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff28:b38d
Jan 17 09:37:52 941240 [B258AB90] 0x01 -> log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
                        SubnGetResp(SwitchInfo), attr_mod 0x0, TID 0x12d7
                        Initial path: 0,1 Return path: 0,28
Jan 17 09:37:52 941521 [B6D93B90] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:401b:ffff::1
Jan 17 09:37:52 941752 [B258AB90] 0x01 -> log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
                        SubnGetResp(SwitchInfo), attr_mod 0x0, TID 0x12d8
                        Initial path: 0,1 Return path: 0,28
Jan 17 09:37:52 942026 [B6592B90] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1
Jan 17 09:37:52 942249 [B258AB90] 0x01 -> log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
                        SubnGetResp(SwitchInfo), attr_mod 0x0, TID 0x12d9
                        Initial path: 0,1 Return path: 0,28
Jan 17 09:37:52 949767 [B258AB90] 0x01 -> log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
                        SubnGetResp(SwitchInfo), attr_mod 0x0, TID 0x12da
                        Initial path: 0,1 Return path: 0,28
Jan 17 09:37:52 950519 [B5590B90] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:1 GID:ff12:601b:ffff::1:ff28:b395
Jan 17 09:37:52 950979 [B258AB90] 0x01 -> log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
                        SubnGetResp(SwitchInfo), attr_mod 0x0, TID 0x12de
                        Initial path: 0,1 Return path: 0,28
Jan 17 09:37:52 952236 [B258AB90] 0x01 -> log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
                        SubnGetResp(SwitchInfo), attr_mod 0x0, TID 0x12e2
                        Initial path: 0,1 Return path: 0,28
Jan 17 09:37:54 530472 [B75958D0] 0x80 -> Exiting SM
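
In case it helps anyone read the log above more quickly, here is a minimal Python sketch that tallies the "Received MAD with error status" entries and lists the SM state changes. The function name summarize_opensm_log and the two regular expressions are my own and are not part of OpenSM; they assume the log line layout shown above, and the sample text is just four lines copied from that log.

```python
import re
from collections import Counter

# Patterns assume the line layout shown in the log above (hypothetical helpers,
# not part of OpenSM).
ERR_RE = re.compile(r"ERR (\w+): Received MAD with error status = (0x[0-9A-Fa-f]+)")
STATE_RE = re.compile(r"-> (Entering \w+ state|SUBNET UP|Exiting SM)")


def summarize_opensm_log(text):
    """Return (error counts keyed by (code, status), ordered SM state events)."""
    errors = Counter()
    events = []
    for line in text.splitlines():
        err = ERR_RE.search(line)
        if err:
            errors[(err.group(1), err.group(2))] += 1
            continue
        state = STATE_RE.search(line)
        if state:
            events.append(state.group(1))
    return errors, events


# Four lines copied from the log above, used only as sample input.
SAMPLE = """\
Jan 17 09:37:50 946927 [B3D8DB90] 0x80 -> Entering MASTER state
Jan 17 09:37:50 949698 [B3D8DB90] 0x02 -> SUBNET UP
Jan 17 09:37:52 940710 [B258AB90] 0x01 -> log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
Jan 17 09:37:54 530472 [B75958D0] 0x80 -> Exiting SM
"""

errors, events = summarize_opensm_log(SAMPLE)
for (code, status), count in errors.items():
    print(f"ERR {code} status {status}: {count} occurrence(s)")
print("SM events:", events)
```

To run it against the real file, read /var/log/opensm.log into a string and pass that to summarize_opensm_log instead of SAMPLE.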

 

 

How can I fix it? What's wrong?

