Channel: Mellanox Interconnect Community: Message List

ESX 5.1.0 IPoIB NFS datastore freezes


Hello,

 

Hopefully this is the right place to provide some information about a VMware IPoIB datastore freeze. We are testing a new VMware ESX 5.1.0 setup. Unfortunately we have only 3 ConnectX (gen 1) cards left; the ConnectX-2 cards are in our production ESX 4.1 environment. We know that this setup is not officially supported right now, but I want to make sure that the error will not occur when we upgrade the production machines.

 

The hardware is:

Fujitsu RX300 S6 (Dual Intel X5670)

ConnectX MT25418 Firmware 2.9.1000

ESX 5.1.0 1117900

Mellanox driver 1.8.1

 

When copying data between VMs, the adapter suddenly freezes and the datastore is "lost". The vmkernel log then shows endless repetitions of the lines below:

 

2013-07-22T17:22:48.775Z cpu10:8202)<3>vmnic_ib1:ipoib_send:504: found skb where it does not belongtx_head = 3827020, tx_tail =3827020

2013-07-22T17:22:48.775Z cpu10:8202)<3>vmnic_ib1:ipoib_send:505: netif_queue_stopped = 0

2013-07-22T17:22:48.775Z cpu10:8202)Backtrace for current CPU #10, worldID=8202, ebp=0x41220029af68

2013-07-22T17:22:48.776Z cpu10:8202)0x41220029af68:[0x41802a310d59]ipoib_send@<None>#<None>+0x5d4 stack: 0xffffff, 0x0, 0x412410d4c948,

2013-07-22T17:22:48.777Z cpu10:8202)0x41220029b018:[0x41802a310d59]ipoib_send@<None>#<None>+0x5d4 stack: 0x41220029b088, 0x418029e0a55b

2013-07-22T17:22:48.777Z cpu10:8202)0x41220029b148:[0x41802a317160]ipoib_mcast_send@<None>#<None>+0xf7 stack: 0x41220029b188, 0x418029d

2013-07-22T17:22:48.778Z cpu10:8202)0x41220029b238:[0x41802a31dabf]ipoib_start_xmit@<None>#<None>+0x396 stack: 0x41220029b598, 0x412200

2013-07-22T17:22:48.778Z cpu10:8202)0x41220029b398:[0x41802a31ac3b]vmipoib_start_xmit@<None>#<None>+0x49a stack: 0x41000be0b880, 0x839e

2013-07-22T17:22:48.779Z cpu10:8202)0x41220029b468:[0x41802a16d8f0]DevStartTxImmediate@com.vmware.driverAPI#9.2+0x137 stack: 0x41220029

2013-07-22T17:22:48.779Z cpu10:8202)0x41220029b4d8:[0x418029d3470e]UplinkDevTransmit@vmkernel#nover+0x295 stack: 0x10787a40, 0x41220029

2013-07-22T17:22:48.780Z cpu10:8202)0x41220029b558:[0x418029dabbaa]NetSchedFIFORunLocked@vmkernel#nover+0x1a5 stack: 0xc0bd95300, 0x0,

2013-07-22T17:22:48.781Z cpu10:8202)0x41220029b5e8:[0x418029dabf57]NetSchedFIFOInput@vmkernel#nover+0x24e stack: 0x41220029b638, 0x4180

2013-07-22T17:22:48.781Z cpu10:8202)0x41220029b698:[0x418029dab0b2]NetSchedInput@vmkernel#nover+0x191 stack: 0x41220029b748, 0x41000bd9

2013-07-22T17:22:48.782Z cpu10:8202)0x41220029b738:[0x418029d3ced0]IOChain_Resume@vmkernel#nover+0x247 stack: 0x41220029b798, 0x418029d

2013-07-22T17:22:48.782Z cpu10:8202)0x41220029b788:[0x418029d2c0e4]PortOutput@vmkernel#nover+0xe3 stack: 0x41220029b808, 0x41802a216a2a

2013-07-22T17:22:48.783Z cpu10:8202)0x41220029b808:[0x41802a2254c8]TeamES_Output@<None>#<None>+0x16b stack: 0x0, 0x418029cc3879, 0x4122

2013-07-22T17:22:48.784Z cpu10:8202)0x41220029ba08:[0x41802a218047]EtherswitchPortDispatch@<None>#<None>+0x142a stack: 0xffffffff000000

2013-07-22T17:22:48.784Z cpu10:8202)0x41220029ba78:[0x418029d2b2c7]Port_InputResume@vmkernel#nover+0x146 stack: 0x410001553540, 0x41220

2013-07-22T17:22:48.785Z cpu10:8202)0x41220029baa8:[0x41802a3b95cb]TcpipTxDispatch@<None>#<None>+0x9a stack: 0x7c1f45, 0x41220029bad8,

2013-07-22T17:22:48.785Z cpu10:8202)0x41220029bb28:[0x41802a3ba118]TcpipDispatch@<None>#<None>+0x1c7 stack: 0x246, 0x41220029bb70, 0x41

2013-07-22T17:22:48.786Z cpu10:8202)0x41220029bca8:[0x418029d0b245]WorldletProcessQueue@vmkernel#nover+0x4b0 stack: 0x41220029bd58, 0xb

2013-07-22T17:22:48.786Z cpu10:8202)0x41220029bce8:[0x418029d0b895]WorldletBHHandler@vmkernel#nover+0x60 stack: 0x100000000000001, 0x41

2013-07-22T17:22:48.786Z cpu10:8202)0x41220029bd68:[0x418029c2083a]BH_Check@vmkernel#nover+0x185 stack: 0x41220029be68, 0x41220029be08,

2013-07-22T17:22:48.787Z cpu10:8202)0x41220029be68:[0x418029dbc9bc]CpuSchedIdleLoopInt@vmkernel#nover+0x13b stack: 0x41220029be98, 0x41

2013-07-22T17:22:48.787Z cpu10:8202)0x41220029be78:[0x418029dc66de]CpuSched_IdleLoop@vmkernel#nover+0x15 stack: 0xa, 0x14, 0x41220029bf

2013-07-22T17:22:48.787Z cpu10:8202)0x41220029be98:[0x418029c4f71e]Init_SlaveIdle@vmkernel#nover+0x49 stack: 0x0, 0x0, 0x0, 0x0, 0x0

2013-07-22T17:22:48.788Z cpu10:8202)0x41220029bfe8:[0x418029ee26a6]SMPSlaveIdle@vmkernel#nover+0x31d stack: 0x0, 0x0, 0x0, 0x0, 0x0
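To gauge how often the message repeats, and whether tx_head and tx_tail always carry the same value as they do in the excerpt above, the log can be summarized with a short script. This is only a minimal sketch: it assumes the message format is exactly as shown above (including the missing space before "tx_head"), and the function name `summarize_ipoib_errors` is hypothetical, not part of any Mellanox or VMware tool.

```python
import re
from collections import Counter

# Assumed format, taken from the ipoib_send line in the excerpt above.
PATTERN = re.compile(
    r"^(?P<ts>\S+)\s+cpu\d+:\d+\)<\d+>(?P<nic>\w+):ipoib_send:\d+: "
    r"found skb where it does not belong\s*"
    r"tx_head = (?P<head>\d+), tx_tail =\s*(?P<tail>\d+)"
)


def summarize_ipoib_errors(lines):
    """Count matching error lines per adapter and how many have head == tail."""
    per_nic = Counter()
    head_equals_tail = 0
    for line in lines:
        match = PATTERN.match(line.strip())
        if match is None:
            continue  # backtrace lines and unrelated messages are skipped
        per_nic[match.group("nic")] += 1
        if int(match.group("head")) == int(match.group("tail")):
            head_equals_tail += 1
    return per_nic, head_equals_tail
```

It could be fed the lines of a copy of the vmkernel log, for example `summarize_ipoib_errors(open(path))`, where `path` points to wherever the log was saved.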

 

 

Any help is appreciated.

 

Best regards.

 

Markus

