Channel: Mellanox Interconnect Community: Message List

ethtool self diagnostic failed

I installed a ConnectX-3 EN 40G Ethernet card, model CX313A.

 

The Linux kernel is: 2.6.32-5-amd64 #1 SMP Sat Jul 12 16:47:57 UTC 2014 x86_64 GNU/Linux

 

ethtool -i eth6

driver: mlx4_en

version: 2.4-1.0.0.1 (Feb 17 2015)

firmware-version: 2.33.5000

bus-info: 0000:08:00.0

 

I ran the ethtool self-diagnostic on the card:

ethtool -t eth6 offline

The test result is FAIL

The test extra info:

Interrupt Test     -5

Link Test     -12

Speed Test     -12

Register Test     0

Loopback Test     0
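
In case it helps anyone compare results, here is a minimal Python sketch that reads the self-test output above and flags the non-zero entries. It assumes the negative values are negated Linux errno codes (so -5 would be EIO and -12 would be ENOMEM); that is my guess and not something the ethtool output confirms. The names parse_selftest and describe are made up for this example.

```python
import errno
import re

# Self-test output copied from the post above.
SAMPLE = """\
The test result is FAIL
The test extra info:
Interrupt Test     -5
Link Test     -12
Speed Test     -12
Register Test     0
Loopback Test     0
"""

def parse_selftest(text):
    """Return {test name: integer result} for lines like 'Link Test   -12'."""
    results = {}
    for line in text.splitlines():
        m = re.match(r"^\s*(.+? Test)\s+(-?\d+)\s*$", line)
        if m:
            results[m.group(1)] = int(m.group(2))
    return results

def describe(value):
    """Label a result; the errno mapping is an assumption, hence the '?'."""
    if value == 0:
        return "pass"
    name = errno.errorcode.get(abs(value), "unknown")
    return f"fail ({name}?)"

for test, value in parse_selftest(SAMPLE).items():
    print(f"{test:16s} {value:4d}  {describe(value)}")
```

If the errno reading is right, the interrupt test failed with an I/O error, which would at least be consistent with the NOP command timeout in the dmesg log below.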

 

dmesg log:

 

[ 2837.440991] mlx4_core 0000:08:00.0: command NOP (0x31) timed out: in_param=0x0, in_mod=0x1f, op_mod=0x0, get_status err=0, status_reg=0x31004000, go_bit=0, t_bit=0, toggle=0x1

[ 2837.440999] mlx4_core 0000:08:00.0: mlx4_enter_error_state: device is going to be reset

[ 2837.948504] mlx4_core 0000:08:00.0: mlx4_enter_error_state: device was reset successfully

[ 2837.948608] mlx4_en 0000:08:00.0: Internal error detected, restarting device

[ 2837.948673] mlx4_core 0000:08:00.0: mlx4_enter_error_state: end

[ 2838.845428] mlx4_core 0000:08:00.0: Internal error mark was detected on device ffff881024660000

[ 2838.845680] mlx4_core 0000:08:00.0: mlx4_handle_error_state was started

[ 2838.845744] mlx4_handle_error_state: calling mlx4_restart_one

[ 2843.886584] mlx4_core 0000:08:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s

[ 2843.886586] mlx4_core 0000:08:00.0: PCIe link width is x8, device supports x8

[ 2845.487271] mlx4_core 0000:08:00.0: irq 82 for MSI/MSI-X

[ 2845.487273] mlx4_core 0000:08:00.0: irq 83 for MSI/MSI-X

[ 2845.487275] mlx4_core 0000:08:00.0: irq 84 for MSI/MSI-X

[ 2845.487277] mlx4_core 0000:08:00.0: irq 85 for MSI/MSI-X

[ 2845.487278] mlx4_core 0000:08:00.0: irq 86 for MSI/MSI-X

[ 2845.487280] mlx4_core 0000:08:00.0: irq 87 for MSI/MSI-X

[ 2845.487281] mlx4_core 0000:08:00.0: irq 88 for MSI/MSI-X

[ 2845.487283] mlx4_core 0000:08:00.0: irq 89 for MSI/MSI-X

[ 2845.487284] mlx4_core 0000:08:00.0: irq 90 for MSI/MSI-X

[ 2845.487286] mlx4_core 0000:08:00.0: irq 91 for MSI/MSI-X

[ 2845.487288] mlx4_core 0000:08:00.0: irq 92 for MSI/MSI-X

[ 2845.487289] mlx4_core 0000:08:00.0: irq 93 for MSI/MSI-X

[ 2845.487291] mlx4_core 0000:08:00.0: irq 94 for MSI/MSI-X

[ 2845.521456] mlx4_en 0000:08:00.0: Activating port:1

[ 2845.529510] mlx4_en: eth6: Using 8 TX rings

[ 2845.529512] mlx4_en: eth6: Using 8 RX rings

[ 2845.529657] mlx4_en: eth6: Initializing port

[ 2845.530099] mlx4_handle_error_state: mlx4_restart_one was ended, ret=0

[ 2845.530100] mlx4_handle_error_state end

 

One last thing: the kernel log also showed this message:

 

[594196.708563] mlx4_core 0000:08:00.0: Temperature Threshold was reached! Threshold: 105 celsius degrees; Current Temperature: 107
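
For anyone who wants to watch for the same warning, this is a small Python sketch that pulls the threshold and current temperature out of a line in that format. The regular expression is based only on the single message shown above, so treat the exact wording as an assumption; parse_temperature_event is a hypothetical helper name.

```python
import re

# The kernel log line from the post above.
LOG_LINE = ("[594196.708563] mlx4_core 0000:08:00.0: Temperature Threshold was "
            "reached! Threshold: 105 celsius degrees; Current Temperature: 107")

# Pattern derived from that one line; other driver versions may word it differently.
PATTERN = re.compile(
    r"Temperature Threshold was reached! "
    r"Threshold: (\d+) celsius degrees; Current Temperature: (\d+)"
)

def parse_temperature_event(line):
    """Return (threshold, current) in degrees Celsius, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))

event = parse_temperature_event(LOG_LINE)
if event is not None:
    threshold, current = event
    print(f"threshold={threshold} C, current={current} C, "
          f"over by {current - threshold} C")
```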

 

Is this a bad card, or is something else going on?

 

Thanks,


Chet

