We successfully completed testing and moved our virtual routers into production this past weekend. The virtual routers are RouterOS CHR (Cloud Hosted Router) instances running as 64-bit Linux guests with VirtIO drivers.
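For context, the guest NICs are defined along these lines (a rough sketch of the relevant /etc/pve/qemu-server entries; the VM ID, MAC address and VLAN tag below are placeholders rather than our actual values):
Code:
# Hypothetical excerpt from /etc/pve/qemu-server/101.conf - VM ID, MAC and VLAN tag are placeholders
ostype: l26
net0: virtio=AA:BB:CC:DD:EE:01,bridge=vmbr0,tag=100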
It's now 17:00; network utilisation has dropped off and the associated packet loss has subsequently disappeared:
Code:
[davidh@zatjnb01-cr01] /interface> monitor-traffic aggregate
rx-packets-per-second: 123 601
rx-bits-per-second: 651.3Mbps
fp-rx-packets-per-second: 0
fp-rx-bits-per-second: 0bps
rx-drops-per-second: 0
rx-errors-per-second: 0
tx-packets-per-second: 123 111
tx-bits-per-second: 648.5Mbps
fp-tx-packets-per-second: 0
fp-tx-bits-per-second: 0bps
tx-drops-per-second: 0
tx-errors-per-second: 0
The problem results in around 2% packet loss at roughly 170,000 packets per second.
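If it's useful, the guest-reported loss can be cross-checked against host-side counters along these lines (the tap name assumes Proxmox's usual tap<vmid>i<net> naming; 101 is a placeholder VM ID):
Code:
# Per-interface packet and drop counters for the guest's tap device (name is a placeholder)
ip -s link show tap101i0

# Hardware drop/miss counters on the physical uplink behind vmbr0
ethtool -S eth0 | grep -iE 'drop|miss'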
We are running Proxmox 4.3-1, with all available commercial updates applied, on a 3-node cluster with Ceph.
Networking comprises 2 x Intel X540-AT2 10Gbps UTP NICs for virtual machine traffic and 2 x Intel 82599ES 10Gbps SFP+ NICs for Ceph replication. Both NIC pairs are configured as active/backup bonds to keep traffic from unnecessarily crossing the network stack (bond0 = eth0 & eth1 with eth0 as primary; bond1 = eth2 & eth3 with eth3 as primary).
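The bond state (mode, currently active slave and configured primary) can be confirmed from the kernel's bonding driver:
Code:
# Active-backup mode, active slave and primary should match the interfaces config below
cat /proc/net/bonding/bond0
cat /proc/net/bonding/bond1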
I witness the problem between two virtual routers on the same physical host and VLAN, which I presume means the traffic never flows to and from the physical switch port at all.
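That assumption is easy to sanity-check by confirming both guests' tap interfaces hang off the same vmbr0 bridge:
Code:
# Both CHR tap interfaces should be listed as members of vmbr0
brctl show vmbr0

# iproute2 equivalent
bridge link show | grep vmbr0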
/etc/network/interfaces
Code:
auto lo
iface lo inet loopback

auto bond0
iface bond0 inet manual
    slaves eth0,eth1
    bond_miimon 100
    bond_mode active-backup
    mtu 9216

auto bond1
iface bond1 inet static
    address 10.254.1.2
    netmask 255.255.255.0
    slaves eth2,eth3
    bond_miimon 100
    bond_mode active-backup
    mtu 9216

auto eth0
iface eth0 inet manual
    bond-master bond0
    bond-primary eth0
    mtu 9216

auto eth1
iface eth1 inet manual
    bond-master bond0
    mtu 9216

auto eth2
iface eth2 inet manual
    bond-master bond1
    mtu 9216

auto eth3
iface eth3 inet manual
    bond-master bond1
    bond-primary eth3
    mtu 9216

auto vmbr0
iface vmbr0 inet static
    address 198.19.17.66
    netmask 255.255.255.224
    gateway 198.19.17.65
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0
    mtu 9216
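Since the whole path is set for jumbo frames (MTU 9216), one quick check between nodes is a don't-fragment ping near the full frame size (10.254.1.3 is a placeholder for another node's bond1 address):
Code:
# 9188 byte payload = 9216 MTU - 20 byte IP header - 8 byte ICMP header; -M do sets the DF bit
ping -M do -s 9188 -c 4 10.254.1.3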
I read a post where someone experienced a similar limitation with VMware, where they could see that the guest's ring buffer wasn't sized adequately. Is this something I can check or tune in Proxmox?
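On the host side I can at least inspect and enlarge the physical NIC's ring buffers with ethtool, and the VM's net device can apparently take a queues= option for VirtIO multiqueue, but I'm not sure whether either of those corresponds to the guest ring buffer setting mentioned in that VMware post (the VM ID, MAC, VLAN tag and queue count below are placeholders):
Code:
# Current vs. maximum hardware ring sizes on the physical uplink
ethtool -g eth0

# Enlarge the RX/TX rings towards the hardware maximum (example values)
ethtool -G eth0 rx 4096 tx 4096

# Enable VirtIO multiqueue on the guest NIC (placeholder VM ID, MAC, tag and queue count;
# the existing MAC is re-specified so net0 isn't regenerated with a new one)
qm set 101 --net0 virtio=AA:BB:CC:DD:EE:01,bridge=vmbr0,tag=100,queues=4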