Hi Community,
Although it is neither officially supported nor best practice, I know that many people (like me) run PBS in a VM on a Synology NAS to back up their homelabs. I had zero issues with this over the last 18 months.
Two days ago a backup session suddenly failed with a non-writable datastore. The datastore mount had fallen back to emergency-ro without any apparent reason. I ran e2fsck on the datastore, which found no issues, and rebooted the VM.
Yesterday evening, when I was already in bed, my monitoring raised alarms: PBS down, backup datastore mounts on the host down, Synology down. The entire DS923+ had crashed. It did not even react to a shutdown via the button, so I had to force it off. I rebooted, ran e2fsck on the datastore, remounted it, and started the failed backup session manually.
That backup session failed again after 3 minutes. The datastore was back in emergency-ro.
What was happening?
3 days ago I updated PBS. The update brought the new V7 kernel, which was then in use. This was the start of all the issues.
The crash only happens under a certain load. In my case, that is when all 3 hosts in my homelab are running a backup to PBS. This leads to DID_BAD_TARGET I/O errors in the PBS VM:
Code:
sd 1:0:1:0: [sdb] tag#X FAILED Result: hostbyte=DID_BAD_TARGET
I/O error, dev sdb, sector ...
EXT4-fs (sdb1): Remounting filesystem read-only
There are no disk errors etc. on the host (Synology) side. e2fsck on the virtual datastore disk never found errors. The problem is reproducible with every backup that causes significant load.
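To check whether a VM is hitting the same errors, the kernel log can be searched for the three signatures shown above. The following is only a minimal sketch: the function name `count_io_errors` is made up for illustration, and the patterns are taken directly from the log excerpt in this post, so other setups may log slightly different messages.

```shell
#!/bin/sh
# Hypothetical helper (not part of PBS): counts lines on stdin that match
# one of the three error signatures quoted in the post.
count_io_errors() {
    # grep -c prints the number of matching lines; -E enables alternation.
    # grep exits non-zero when nothing matches, so "|| true" keeps the
    # function usable under "set -e" while still printing 0.
    grep -c -E 'hostbyte=DID_BAD_TARGET|I/O error, dev |Remounting filesystem read-only' || true
}

# Typical use inside the PBS VM (kernel messages of the current boot):
#   journalctl -k -b | count_io_errors
#   dmesg | count_io_errors
```

A result greater than 0 after a failed backup would suggest the same failure pattern. It does not prove the same root cause.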
My Synology:
DS923+, DSM 7.3.2-86009 (Kernel 4.4)
Guest: PBS-VM with Kernel 7.0.2-4-pve
Storage: ext4 on virtio-scsi/blk, btrfs/SHR on the Synology host
Solution: pin the kernel to V6. In my case:
Code:
proxmox-boot-tool kernel pin 6.17.13-9-pve
Everything has been running well here since then.
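For anyone who wants to check first whether the workaround applies, here is a rough sketch. The helper name `is_kernel_7_or_newer` is hypothetical, and the version string in the comments is the one from my system, so pick a 6.x kernel that is actually installed on yours (`proxmox-boot-tool kernel list` shows them). The pin commands are left as comments so that nothing is changed by accident.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given kernel release string
# has a major version of 7 or higher.
is_kernel_7_or_newer() {
    major=${1%%.*}    # text before the first dot, e.g. "7" from "7.0.2-4-pve"
    # A non-numeric value makes the test fail, which counts as "not 7.x".
    [ "$major" -ge 7 ] 2>/dev/null
}

if is_kernel_7_or_newer "$(uname -r)"; then
    echo "Running kernel $(uname -r): consider pinning a 6.x kernel"
    # proxmox-boot-tool kernel list                # show installed kernels
    # proxmox-boot-tool kernel pin 6.17.13-9-pve   # version from this post
    # reboot
else
    echo "Running kernel $(uname -r): not a 7.x kernel"
fi
```

The pin can be removed again later with `proxmox-boot-tool kernel unpin`, for example once a fixed kernel is available.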
I guess the virtio/virtio-scsi behavior in Kernel 7 might have some compatibility issues with the QEMU/Kernel-4.4 in DSM.
EDIT: I don't think it's related only to DSM. I found this thread, which describes an identical issue on a PVE host, with everything pointing to the new io_uring in Kernel 7: https://forum.proxmox.com/threads/hung-on-restore-since-upgrade-to-kernel-7-proxmox-9.183717/
Take care,
Marco