Hi,
We have a cluster of four Dell R730 systems with X540-AT2 10GBit dual-port cards. The servers are equipped with Dell 1.92TB SSDs in RAID 5.
Between the hosts we have Cisco 3172TQ 10GBit copper switches with jumbo frame support.
We had initially configured mode 4 bonding on the PVE machines and have since switched to mode 0 bonding, which is an improvement over the original mode 4 setup. We also have an MTU of 9000 set on the physical interfaces and the bond, and the VM bridge has the matching MTU setting.
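For reference, the relevant part of /etc/network/interfaces on the hosts looks roughly like this (interface and bridge names plus the address are just placeholders here; the VLAN-tagged bridges follow the same pattern):
Code:
auto bond0
iface bond0 inet manual
        bond-slaves eno1 eno2
        bond-mode balance-rr
        bond-miimon 100
        mtu 9000

auto vmbr0
iface vmbr0 inet static
        address 172.19.x.x/24
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        mtu 9000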
Besides that, we applied the values from https://darksideclouds.wordpress.com/2016/10/10/tuning-10gb-nics-highway-to-hell/, which we found referenced here in the forum.
Right now we get quite adequate performance between the PVE hosts themselves.
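The host-to-host test is a plain iperf3 run between two PVE nodes, roughly like this (the IP is a placeholder):
Code:
# on one PVE host
iperf3 -s

# on another PVE host
iperf3 -c <pve-host-ip> -t 10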
But between VMs (Ubuntu 20.04, 2 cores, 4GB RAM, all virtio drivers) we get horrible speeds.
Here is the config of one of the VMs:
Code:
agent: 1
boot: dcn
bootdisk: scsi0
cores: 2
description: host%3A gac32-pve01
ide2: none,media=cdrom
memory: 4096
name: gac33-tst01
net0: virtio=7E:36:4F:88:23:7F,bridge=vmbr1,tag=1233
net1: virtio=A6:B3:08:0F:53:2E,bridge=vmbr0,tag=1232
net4: virtio=16:B6:0F:A5:AD:02,bridge=vmbr0,tag=1235
numa: 0
onboot: 1
ostype: l26
scsi0: DATA0:vm-167-disk-0,cache=writeback,size=30G
scsihw: virtio-scsi-pci
smbios1: uuid=ce220f7d-f602-434d-899a-9dcdd26ad13b
sockets: 1
vmgenid: c5357b76-da86-4a4a-b803-fa3f8e5cb53a
Running iperf3 between two of these basic VMs, we get:
Code:
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 172.19.35.226, port 46614
[ 5] local 172.19.35.225 port 5201 connected to 172.19.35.226 port 46616
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 190 MBytes 1.59 Gbits/sec
[ 5] 1.00-2.00 sec 206 MBytes 1.73 Gbits/sec
[ 5] 2.00-3.00 sec 215 MBytes 1.80 Gbits/sec
[ 5] 3.00-4.00 sec 204 MBytes 1.71 Gbits/sec
[ 5] 4.00-5.00 sec 213 MBytes 1.78 Gbits/sec
[ 5] 5.00-6.01 sec 76.2 MBytes 631 Mbits/sec
[ 5] 6.01-7.00 sec 296 MBytes 2.51 Gbits/sec
[ 5] 7.00-8.00 sec 206 MBytes 1.73 Gbits/sec
[ 5] 8.00-9.00 sec 214 MBytes 1.79 Gbits/sec
[ 5] 9.00-10.00 sec 134 MBytes 1.12 Gbits/sec
[ 5] 10.00-10.01 sec 128 KBytes 302 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.01 sec 1.91 GBytes 1.64 Gbits/sec receiver
I've already upgraded to the latest ixgbe driver from Intel, and we set the following sysctl settings:
Code:
# Maximum receive socket buffer size
net.core.rmem_max = 134217728
# Maximum send socket buffer size
net.core.wmem_max = 134217728
# Minimum, initial and max TCP Receive buffer size in Bytes
net.ipv4.tcp_rmem = 4096 87380 134217728
# Minimum, initial and max buffer space allocated
net.ipv4.tcp_wmem = 4096 65536 134217728
# Maximum number of packets queued on the input side
net.core.netdev_max_backlog = 300000
# Auto tuning
net.ipv4.tcp_moderate_rcvbuf = 1
# Don't cache ssthresh from previous connection
net.ipv4.tcp_no_metrics_save = 1
# The Hamilton TCP (HighSpeed-TCP) algorithm is a packet loss based congestion control and is more aggressive pushing up to max bandwidth (total BDP) and favors hosts with lower TTL / VARTTL.
net.ipv4.tcp_congestion_control=htcp
# If you are using jumbo frames set this to avoid MTU black holes.
net.ipv4.tcp_mtu_probing = 1
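For completeness, this is roughly how we apply the settings and check which ixgbe driver version is actually loaded (the file name and interface name are just examples from our side):
Code:
# reload the tuning file
sysctl -p /etc/sysctl.d/99-10gbe-tuning.conf

# confirm driver and firmware version on one of the X540 ports
ethtool -i eno1
modinfo ixgbe | grep -i version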
We cannot really figure out where this big performance issue comes from and would be happy to get some thoughts from anyone who already has this running well.
Thanks,
Patrick