Hi all,
We run a PVE server (no cluster yet) with a few Linux VMs and a separate Proxmox Backup Server. Both systems use ZFS for storage. The VMs are plain Debian Bookworm installations with ext4 filesystems.
Today we wanted to back up a VM to the PBS, but the backup got stuck at 2% and we interrupted it after some time. That alone would not be a big problem, BUT: the interruption seems to have corrupted the VM's internal filesystem, causing an application inside the guest to crash while the VM itself kept running. I remember seeing a timeout/crash message mentioning jbd2/sda1-8 in the VM's dmesg, but unfortunately I did not save the dmesg output before rebooting. It looks like none of this information was persisted after the crash, probably because the filesystem was stuck. The VM then failed to boot normally and dropped me into an initramfs shell, where I had to fsck the disk. After that, it booted normally.
While this is certainly bad and needs investigation (I'll probably try to reproduce the issue over the next few days), I wonder whether Proxmox always uses its own snapshotting mechanism for backups. With all VM storage on ZFS, it should be perfectly safe to create a ZFS snapshot and use that as the backup source, without involving the VM itself in any way. Does Proxmox make use of this, or can it be configured to do so?
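To illustrate what I mean, here is a rough manual sketch of the ZFS-level approach (the dataset path is only a guess based on my storage name, the PBS repository is a placeholder, and whether the snapshot shows up under /dev/zvol depends on the snapdev property). Since the snapshot is atomic, the backup could read from it without touching the running guest at all:
Code:
# take a crash-consistent ZFS snapshot of the VM's zvol (dataset name assumed)
zfs snapshot local-nvme-pool/vm-108-disk-0@manual-backup

# expose zvol snapshots as block devices (they are hidden by default)
zfs set snapdev=visible local-nvme-pool/vm-108-disk-0

# back up the snapshot device with proxmox-backup-client (repository is a placeholder)
proxmox-backup-client backup vm-108-disk-0.img:/dev/zvol/local-nvme-pool/vm-108-disk-0@manual-backup \
    --repository backup@pbs@pbs.example.com:datastore1

# remove the snapshot afterwards
zfs destroy local-nvme-pool/vm-108-disk-0@manual-backup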
Thanks!
Regards,
Philipp
Backup log:
Code:
INFO: Starting Backup of VM 108 (qemu)
INFO: Backup started at 2024-08-13 11:01:33
INFO: status = running
INFO: VM Name: storage-2-erp
INFO: include disk 'scsi0' 'local-nvme-pool:vm-108-disk-0' 200G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/108/2024-08-13T09:01:33Z'
INFO: started backup task '21fff6f2-85d9-4cf1-9143-6030a6adad7b'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO: 0% (1.1 GiB of 200.0 GiB) in 3s, read: 378.7 MiB/s, write: 352.0 MiB/s
INFO: 1% (2.0 GiB of 200.0 GiB) in 13s, read: 95.6 MiB/s, write: 94.0 MiB/s
ERROR: interrupted by signal
INFO: aborting backup job
INFO: resuming VM again
ERROR: Backup of VM 108 failed - interrupted by signal
INFO: Failed at 2024-08-13 12:55:14
ERROR: Backup job failed - interrupted by signal
INFO: notified via target `mail-to-root`
TASK ERROR: interrupted by signal
VM config:
Code:
boot: order=scsi0;ide2;net0
cores: 4
cpu: host,flags=+ibpb;+virt-ssbd;+amd-ssbd
ide2: local:iso/debian-12.1.0-amd64-DVD-1.iso,media=cdrom,size=3900480K
memory: 12288
meta: creation-qemu=8.0.2,ctime=1710258899
name: storage-2-erp
net0: virtio=B6:29:FC:24:DC:12,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-nvme-pool:vm-108-disk-0,iothread=1,size=200G
scsihw: virtio-scsi-single
smbios1: uuid=71a00ef3-0407-4dae-8e38-15c5e2e86973
sockets: 1
vmgenid: 727fbb5f-18c4-4ddf-88d3-cee864c5a5aa
Version:
Code:
proxmox-ve: 8.2.0 (running kernel: 6.2.16-3-pve)
pve-manager: 8.2.4 (running version: 8.2.4/faa83925c9641325)
proxmox-kernel-helper: 8.1.0
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.8: 6.8.8-2
proxmox-kernel-6.8.8-2-pve-signed: 6.8.8-2
proxmox-kernel-6.8.4-2-pve-signed: 6.8.4-2
proxmox-kernel-6.2.16-20-pve: 6.2.16-20
proxmox-kernel-6.2: 6.2.16-20
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.4
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.7
libpve-cluster-perl: 8.0.7
libpve-common-perl: 8.2.1
libpve-guest-common-perl: 5.1.3
libpve-http-server-perl: 5.1.0
libpve-network-perl: 0.9.8
libpve-rs-perl: 0.8.9
libpve-storage-perl: 8.2.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.2.7-1
proxmox-backup-file-restore: 3.2.7-1
proxmox-firewall: 0.4.2
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.2.3
pve-cluster: 8.0.7
pve-container: 5.1.12
pve-docs: 8.2.2
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.1
pve-firewall: 5.0.7
pve-firmware: 3.12-1
pve-ha-manager: 4.0.5
pve-i18n: 3.2.2
pve-qemu-kvm: 9.0.0-6
pve-xtermjs: 5.3.0-3
qemu-server: 8.2.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.4-pve1