I think the problem lies in the combination of detect_zeroes and discard.
I changed one of my VMs' config files in /etc/pve... to "detect_zeroes=off,discard=on" and I haven't had any problems with that VM for several days.
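For reference, a sketch of what the changed disk line looks like in the VM config — the VMID, pool name, and disk name here are placeholders, and discard/detect_zeroes are per-disk options:

```
# /etc/pve/qemu-server/<vmid>.conf (hypothetical VMID and storage/pool names)
scsi0: ceph-pool:vm-100-disk-0,discard=on,detect_zeroes=off
```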
Can anybody else confirm this? My cluster isn't under much load right now...
Ok, I found out that my problem seems to be this bug https://bugzilla.proxmox.com/show_bug.cgi?id=2311
When a VM runs with Ceph/KRBD, it uses a block device for the VM's image. I haven't tried running Proxmox with ZFS, but my guess is that it also uses a block device for the VM image. If you...
Yes, I'm running the newest kernel now on all nodes.
Yes, I upgraded BIOS on all nodes.
I'm pretty sure it's the same bug as described in this thread https://forum.proxmox.com/threads/v...y-and-buffer-i-o-errors-since-qemu-3-0.55452/ and the bug reported here...
Any solutions to this? I started getting the same problems after upgrading from Proxmox 6.1 to 6.2. But I also migrated some VMs from other systems to my Proxmox system at the same time, so it could be that I didn't notice it earlier because the load on the system wasn't as high before that...
Yes, I started a thread about this last week. I'm running some old CentOS 5 VMs and a FreeBSD VM that are having problems.
I only have problems when I use Ceph with KRBD. What storage do you use?
I thought I had the latest BIOS, but a newer version was out. Anyway, I applied it along with the latest microcode updates, and the problem still persists.
Of course, I’m just trying to narrow down where the problem is.
Any other thoughts on what I can try?
Running VMs on Ceph with KRBD has been unstable since we upgraded to Proxmox 6.2.
We are running Proxmox on 5 nodes with 2 NVMe disks for Ceph in each node and around 50 VMs; we don't use LXC. We use KRBD for our Ceph storage pool and VirtIO SCSI as the VM disk controller.
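For context, KRBD is toggled on the storage definition rather than per VM. A minimal sketch of the relevant entry in /etc/pve/storage.cfg (the storage ID and pool name are placeholders, not our actual names):

```
# /etc/pve/storage.cfg (hypothetical storage ID and pool name)
rbd: ceph-vm
        pool ceph-vm
        content images
        krbd 1
```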
After upgrading to Proxmox 6.2...