
ConnectX-3 Pro VXLAN Performance Overhead


Hi,

 

I'm testing ConnectX-3 Pro with VXLAN offload in our lab. With a single-stream iperf test, we get ~34 Gbit/s for plain (non-VXLAN) traffic, but only ~28 Gbit/s with VXLAN encapsulation.
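For reference, this is a plain back-to-back iperf run; hostnames, addresses and interface names below are placeholders:

# receiver
iperf -s

# sender, single stream, 30 s
iperf -c <receiver-ip> -t 30

The VXLAN run uses the same commands over a vxlan device created with iproute2 (VNI, port and addresses are just examples):

ip link add vxlan0 type vxlan id 42 dev eth2 remote <peer-ip> dstport 4789
ip addr add 192.168.42.1/24 dev vxlan0
ip link set vxlan0 up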

 

In both cases, the bottleneck is the CPU on the receiving side. The top entries from a perf profile are:

 

Without VXLAN:

+   24.27%            iperf  [kernel.kallsyms]       [k] copy_user_enhanced_fast_string
+    6.49%            iperf  [kernel.kallsyms]       [k] mlx4_en_process_rx_cq
+    5.34%            iperf  [kernel.kallsyms]       [k] tcp_gro_receive
+    3.43%            iperf  [kernel.kallsyms]       [k] dev_gro_receive
+    3.28%            iperf  [kernel.kallsyms]       [k] mlx4_en_complete_rx_desc
+    3.05%            iperf  [kernel.kallsyms]       [k] memcpy
+    2.88%            iperf  [kernel.kallsyms]       [k] inet_gro_receive

 

With VXLAN:

+   20.06%            iperf  [kernel.kallsyms]       [k] copy_user_enhanced_fast_string
+    6.04%            iperf  [kernel.kallsyms]       [k] mlx4_en_process_rx_cq
+    5.43%            iperf  [kernel.kallsyms]       [k] inet_gro_receive
+    3.29%            iperf  [kernel.kallsyms]       [k] dev_gro_receive
+    3.24%            iperf  [kernel.kallsyms]       [k] tcp_gro_receive
+    3.08%            iperf  [kernel.kallsyms]       [k] skb_gro_receive
+    3.02%            iperf  [kernel.kallsyms]       [k] memcpy
+    2.85%            iperf  [kernel.kallsyms]       [k] mlx4_en_complete_rx_desc
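Both profiles were collected system-wide on the receiver while iperf was running, roughly:

perf record -a -g -- sleep 20
perf report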

 

This is CentOS 6.5 with kernel 3.15.0 and firmware 2.31.5050.
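Offload state and firmware were checked with ethtool (eth2 is a placeholder for the 40GbE port):

ethtool -i eth2                     # driver and firmware version
ethtool -k eth2 | grep udp_tnl      # tx-udp_tnl-segmentation: on

The VXLAN offload itself is enabled through device-managed flow steering, i.e. the usual mlx4_core option in /etc/modprobe.d/ (per the Mellanox documentation):

options mlx4_core log_num_mgm_entry_size=-1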

 

We're certainly happy with 28 Gbit/s, but I'm wondering whether there are plans to improve this to the point where VXLAN adds no additional CPU overhead at all, or whether there is any tuning I can do toward that goal?

 

- Thorvald

