Loss + Consider r2q change

Ch@rlus · Oct 28, 2019

Hey guys!

We recently expanded one of our clusters with larger servers. These servers each handle about 100-150 KVM VMs and are running great, except for one detail: some VMs randomly suffer packet loss for 1-2 minutes, and then everything comes back online.

During these losses, we see some "HTB: quantum of class 10001 is big. Consider r2q change." messages.

While searching the forum, I saw that this message was linked to the VM shaping/qos (there is indeed one).

That said, the VM is very far from reaching its network limit, and disabling QoS does not solve anything.

For your information, the loss also appears during an "internal" ping (from the host that carries the VMs). In theory, this eliminates a problem with the network card or connection...

Have any of you ever had a similar problem? Could this be related to the number of VMs on the host?

Alwin · Dec 12, 2019

Ch@rlus said:
That said, the VM is very far from reaching its network limit, and disabling QoS does not solve anything.

Did you shut down and start the VM again, when you disabled the QoS?

Ch@rlus said:
For your information, the loss also appears during an "internal" ping (from the host that carries the VMs). In theory, this eliminates a problem with the network card or connection...

Not necessarily. Is the ping triggering the loss or just exhibiting the issue? There is still other traffic on that interface, right?

Ch@rlus said:
During these losses, we see some "HTB: quantum of class 10001 is big. Consider r2q change." messages.

AFAIU, it is a hint to fine-tune the QoS.
http://linux-ip.net/articles/Traffic-Control-HOWTO/classful-qdiscs.html
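For anyone wondering where the message comes from: as far as I understand the HTB scheduler, when a class has no explicit quantum the kernel derives one as the class rate in bytes per second divided by r2q (default 10), warns if the result falls outside 1000..200000, and then clamps the value and carries on. Below is a rough Python sketch of that arithmetic. The function names and the 1 Gbit/s example rate are only illustrative assumptions, not something taken from this cluster, and the bounds are the ones I know from sch_htb.c, so check them against your kernel.

```python
# Sketch of how HTB derives a class quantum when none is set explicitly.
# Assumption: bounds and default r2q match the kernel's sch_htb.c.
DEFAULT_R2Q = 10
MIN_QUANTUM = 1000      # below this the kernel warns "quantum ... is small"
MAX_QUANTUM = 200000    # above this the kernel warns "quantum ... is big"


def htb_quantum(rate_bytes_per_sec, r2q=DEFAULT_R2Q):
    """Return (effective_quantum, status); status is 'small', 'big' or 'ok'."""
    raw = rate_bytes_per_sec // r2q
    if raw < MIN_QUANTUM:
        return MIN_QUANTUM, "small"   # clamped up, warning logged
    if raw > MAX_QUANTUM:
        return MAX_QUANTUM, "big"     # clamped down, warning logged
    return raw, "ok"


def r2q_avoiding_big_warning(rate_bytes_per_sec):
    """Return an r2q large enough that rate / r2q stays within MAX_QUANTUM."""
    # Ceiling division; r2q must be at least 1.
    return max(1, -(-rate_bytes_per_sec // MAX_QUANTUM))


if __name__ == "__main__":
    rate = 125_000_000  # hypothetical class rate: 1 Gbit/s in bytes per second
    print(htb_quantum(rate))                  # default r2q
    better = r2q_avoiding_big_warning(rate)
    print(better, htb_quantum(rate, better))  # with an adjusted r2q
```

So a class with a high rate and the default r2q will trigger the "big" warning even when the VM is idle, which would fit with the message showing up on a VM that is nowhere near its limit. Setting an explicit quantum on the class, or a larger r2q on the qdisc, should silence it.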

Ch@rlus · Dec 12, 2019

Alwin said:
Did you shut down and start the VM again, when you disabled the QoS?

Not necessarily. Is the ping triggering the loss or just exhibiting the issue? There is still other traffic on that interface, right?

AFAIU, it is a hint to fine-tune the QoS.
http://linux-ip.net/articles/Traffic-Control-HOWTO/classful-qdiscs.html

Thanks for your reply.

We ended up changing the network cards to fix these issues. Our first "internal" ping tests were actually wrong: they went via the host's network card.

We switched to Intel X520s, which are much more stable.

EDIT: We had tried rebooting the VM after the QoS change.

Alwin · Dec 13, 2019

Ch@rlus said:
EDIT: We had tried rebooting the VM after the QoS change.

A reboot happens inside the VM. It needs to be a stop & start to get a new KVM process.

Proxmox VE 6.1 has this as a new feature ("reboot"), that also does a shutdown & start.

Ch@rlus · Dec 13, 2019

Yes, of course, it was a "hard" reboot (stop/start of the VM from within Proxmox).
