Hello!
Over the past few weeks we've been setting up a Proxmox cluster to host our applications.
From the start we've had problems when backing up the VMs to an NFS target (a Synology NAS).
During a backup run, anywhere from zero to several VMs crash at seemingly random points in their disk backups.
The VM storage is set up using ZFS over iSCSI to a TrueNAS SCALE server.
The VMs all run an Ubuntu Server 22.04 cloud-init image.
I've posted the relevant logs below:
Code:
INFO: Backup started at 2023-02-15 03:44:19
INFO: status = running
INFO: VM Name: kube-master-01
INFO: include disk 'scsi0' 'slc-storage-01-ssd:vm-401-disk-0' 51404M
iscsiadm: No session found.
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating vzdump archive '/mnt/pve/RS-Brainworkz-Backup-No-Offsite/dump/vzdump-qemu-401-2023_02_15-03_44_19.vma.zst'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '512e4bb0-0410-4553-9f7d-89fe61aff9be'
INFO: resuming VM again
INFO: 0% (178.9 MiB of 50.2 GiB) in 3s, read: 59.6 MiB/s, write: 9.8 MiB/s
INFO: 1% (516.1 MiB of 50.2 GiB) in 9s, read: 56.2 MiB/s, write: 42.6 MiB/s
INFO: 2% (1.1 GiB of 50.2 GiB) in 19s, read: 56.3 MiB/s, write: 43.5 MiB/s
INFO: 3% (1.5 GiB of 50.2 GiB) in 28s, read: 56.3 MiB/s, write: 36.7 MiB/s
INFO: 4% (2.0 GiB of 50.2 GiB) in 37s, read: 56.2 MiB/s, write: 31.6 MiB/s
INFO: 5% (2.5 GiB of 50.2 GiB) in 46s, read: 56.2 MiB/s, write: 32.1 MiB/s
INFO: 6% (3.0 GiB of 50.2 GiB) in 55s, read: 56.4 MiB/s, write: 29.5 MiB/s
INFO: 7% (3.5 GiB of 50.2 GiB) in 1m 4s, read: 56.2 MiB/s, write: 21.3 KiB/s
INFO: 8% (4.0 GiB of 50.2 GiB) in 1m 13s, read: 56.2 MiB/s, write: 5.3 MiB/s
INFO: 9% (4.6 GiB of 50.2 GiB) in 1m 23s, read: 56.3 MiB/s, write: 18.7 MiB/s
INFO: 10% (5.1 GiB of 50.2 GiB) in 1m 32s, read: 56.3 MiB/s, write: 2.1 MiB/s
INFO: 11% (5.6 GiB of 50.2 GiB) in 1m 41s, read: 56.3 MiB/s, write: 1.7 MiB/s
INFO: 12% (6.1 GiB of 50.2 GiB) in 1m 51s, read: 50.6 MiB/s, write: 11.1 MiB/s
INFO: 13% (6.5 GiB of 50.2 GiB) in 2m, read: 56.2 MiB/s, write: 4.0 KiB/s
INFO: 14% (7.0 GiB of 50.2 GiB) in 2m 9s, read: 56.3 MiB/s, write: 0 B/s
INFO: 15% (7.5 GiB of 50.2 GiB) in 2m 18s, read: 56.2 MiB/s, write: 0 B/s
INFO: 16% (8.0 GiB of 50.2 GiB) in 2m 27s, read: 56.3 MiB/s, write: 0 B/s
INFO: 17% (8.6 GiB of 50.2 GiB) in 2m 37s, read: 56.2 MiB/s, write: 124.4 KiB/s
ERROR: VM 401 not running
INFO: aborting backup job
ERROR: VM 401 not running
INFO: resuming VM again
ERROR: Backup of VM 401 failed - VM 401 not running
INFO: Failed at 2023-02-15 03:47:09
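Since the crash point differs on every run, it may help to pull the progress lines out of each backup log and record how far the backup got before the first error. The following is only a minimal sketch: the names (`Progress`, `parse_line`, `last_progress_before_error`) are made up for illustration, and the regular expression is based purely on the log lines shown in this post, so other Proxmox versions may format them differently.

```python
import re
from typing import NamedTuple, Optional

# Binary unit multipliers as they appear in the log above (assumed complete).
UNIT_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Matches progress lines such as:
# INFO: 17% (8.6 GiB of 50.2 GiB) in 2m 37s, read: 56.2 MiB/s, write: 124.4 KiB/s
PROGRESS_RE = re.compile(
    r"^INFO:\s+(?P<pct>\d+)% \(.*?\) in (?P<elapsed>[^,]+), "
    r"read: (?P<read>[\d.]+) (?P<read_unit>(?:[KMGT]i)?B)/s, "
    r"write: (?P<write>[\d.]+) (?P<write_unit>(?:[KMGT]i)?B)/s\s*$"
)


class Progress(NamedTuple):
    percent: int
    elapsed_s: int
    read_bps: float
    write_bps: float


def parse_elapsed(text: str) -> int:
    """Convert an elapsed string like '3s', '2m' or '1m 4s' to seconds."""
    factors = {"h": 3600, "m": 60, "s": 1}
    return sum(int(n) * factors[u] for n, u in re.findall(r"(\d+)\s*([hms])", text))


def parse_line(line: str) -> Optional[Progress]:
    """Return a Progress record for a progress line, or None for any other line."""
    m = PROGRESS_RE.match(line.strip())
    if m is None:
        return None
    return Progress(
        percent=int(m["pct"]),
        elapsed_s=parse_elapsed(m["elapsed"]),
        read_bps=float(m["read"]) * UNIT_BYTES[m["read_unit"]],
        write_bps=float(m["write"]) * UNIT_BYTES[m["write_unit"]],
    )


def last_progress_before_error(log: str) -> Optional[Progress]:
    """Return the last progress record seen before the first ERROR line.

    Returns None if the log has no ERROR line, or no progress line before it.
    """
    last = None
    for line in log.splitlines():
        if line.strip().startswith("ERROR:"):
            return last
        record = parse_line(line)
        if record is not None:
            last = record
    return None


SAMPLE = """\
INFO: 16% (8.0 GiB of 50.2 GiB) in 2m 27s, read: 56.3 MiB/s, write: 0 B/s
INFO: 17% (8.6 GiB of 50.2 GiB) in 2m 37s, read: 56.2 MiB/s, write: 124.4 KiB/s
ERROR: VM 401 not running
"""

print(last_progress_before_error(SAMPLE))
```

Running this over the logs of several failed backups gives one (percent, elapsed seconds) pair per failure, which makes it easier to check whether the crashes really are spread randomly or cluster around a certain offset or duration.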
How long into the backup a VM crashes varies too. Sometimes it happens after less than ten seconds, and sometimes it takes minutes.
I've looked through the syslogs of the VMs around the time they crash, but I can't find any entries pointing to something going wrong (there are no related dmesg entries on the host either).
What steps could I take to diagnose these issues?
If any other information is required, please let me know!