The package pve-qemu-kvm=6.2.0-8, which reverts the problematic change for QEMU's RBD driver, is now available on the pvetest repository.
You can test it by:
- Adding/Enabling the pvetest repository (can also be done in the Repositories panel in the UI; see the command sketch after this list).
- apt update && apt install pve-qemu-kvm
- Removing/disabling the pvetest repository again.
- VMs need to be stopped/started or migrated to pick up the new version.
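For the command-line route, here is a minimal sketch of those three repository steps, assuming PVE 7.x on Debian Bullseye; the sources.list.d file name is just an example:

# enable the pvetest repository temporarily (file name is an arbitrary example)
echo "deb http://download.proxmox.com/debian/pve bullseye pvetest" > /etc/apt/sources.list.d/pvetest.list
apt update && apt install pve-qemu-kvm
# disable the test repository again once the package is installed
rm /etc/apt/sources.list.d/pvetest.list
apt update

A quick sanity check afterwards is dpkg -l pve-qemu-kvm, which should show 6.2.0-8; remember that running VMs keep using the old binary until they are stopped/started or migrated.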
Done this yesterday, issue is gone!

Great to hear that ... thanks for that information.

Please try to pin the kernel to Linux 5.13.19-6-pve for now. It has helped with several of the problems that have come up with 7.2-4 over the last few days.
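One possible way to do the pinning from the shell, as a sketch; it assumes the 5.13.19-6 kernel package is still installed and a proxmox-boot-tool version that already provides the kernel pin/unpin subcommands (on older versions the entry can instead be selected in the boot menu or via GRUB_DEFAULT):

apt install pve-kernel-5.13.19-6-pve        # make sure the older kernel is installed
proxmox-boot-tool kernel list               # kernels known to the bootloader
proxmox-boot-tool kernel pin 5.13.19-6-pve  # boot this version by default
reboot
# to return to the newest installed kernel later:
proxmox-boot-tool kernel unpin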
Fabian_E said:
The package pve-qemu-kvm=6.2.0-8, which reverts the problematic change for QEMU's RBD driver, is now available on the pvetest repository.
You can test it by:
- Adding/Enabling the pvetest repository (can also be done in the Repositories panel in the UI).
- apt update && apt install pve-qemu-kvm
- Removing/disabling the pvetest repository again.
- VMs need to be stopped/started or migrated to pick up the new version.
After deploying this package across my cluster the issue is much improved, but I still had 3 workloads shut down overnight during a backup process (compared to 10-plus before updating the kvm package).
Interestingly, the hosts that powered off weren't actually included in the backup, but they all make use of RBD storage for data storage (video archive).
Any further ideas from the community on what I can try to solve this issue once and for all?
I saw someone in an earlier post referring to disabling the krbd flag in the disk config; is that correct?
View attachment 37324
Thanks in advance.
Interestingly, the hosts that powered off weren't actually included in the backup, but they all make use of RBD storage for data storage (video archive).

That sounds a bit odd. Anything in the kernel log / journalctl? It could even be an OOM kill (which would then be rather unrelated to any such bug and a sign of memory overcommitment, if it really was an OOM kill).

I saw someone in an earlier post referring to disabling the krbd flag in the disk config; is that correct?

Enabling KRBD, not disabling it. What can also help is enabling IO thread (and the VirtIO SCSI single controller, if you're using SCSI for the VM disks), and using the default No cache mode for the disks: other cache modes, e.g. writeback, while sometimes seemingly faster, can cause much more erratic IO and make the guest more sensitive to host memory pressure.
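To make the above concrete, a sketch of how those checks and settings could look on the command line; the VM ID 100, the storage name rbdpool and the disk name are made-up placeholders, not taken from this thread:

# look for OOM kills around the time a guest powered off
journalctl --since yesterday | grep -i -e "out of memory" -e "oom"
# enable KRBD on the RBD storage (running VMs need a full stop/start to switch)
pvesm set rbdpool --krbd 1
# VirtIO SCSI single controller, IO thread and the default "no cache" mode for a disk
qm set 100 --scsihw virtio-scsi-single
qm set 100 --scsi0 rbdpool:vm-100-disk-0,iothread=1,cache=none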
I bit the bullet and rolled back the firmware across the cluster.

Firmware or Kernel package? (Just to be sure I understand what helped you.)

Apologies, it's been a long week!
Is it a concern that I only restarted this one VM to use the newer version of pve-qemu-kvm? Restarting all other VMs is a big project which requires a lot of coordination.

Is this a cluster? If so, you could live-migrate the VMs to already-updated PVE hosts; that way they also get started with the new PVE-QEMU version without any interruption in the guests themselves.
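As a rough sketch of that approach (the VM ID 100 and the target node name pve2 are placeholders), a live migration plus a check of the QEMU version a guest is actually running could look like this; the running-qemu field is shown in the verbose status output on recent PVE releases:

# live-migrate VM 100 to a node that already runs the updated pve-qemu-kvm
qm migrate 100 pve2 --online
# verify which QEMU version the guest is now running under
qm status 100 --verbose | grep running-qemu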
That's a good idea. Thanks for the reply, and thanks for your feedback!