VM backup stuck at 99%, no error, just hangs

Laynord · Aug 27, 2023

As mentioned in the title, the VM backup gets stuck at 99%. There is no error, it just hangs.
The backup disk is not full. I can stop and relaunch the backup, but it just doesn't work on some machines.

There is no message other than "Info: 99%", which has been on this screen for 2.5 h.

Lukas Wagner · Aug 28, 2023

Could you provide me with the output from pveversion -v? What kind of storage do you use as a backup target (e.g. Backup Server, NFS, CIFS, etc.)? What is the hardware configuration for that VM/CT? (/etc/pve/{qemu-server,lxc}/<vmid>.conf)

Do you see anything odd in the system logs?
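
For anyone following along, the details requested above can be gathered in one go. This is only a minimal sketch: collect_backup_diag is a made-up helper name (not a Proxmox tool), and it assumes it is run as root on the PVE node where pveversion, pvesm, qm and pct are available.

```shell
#!/bin/sh
# Sketch: gather the details requested above for one guest.
# collect_backup_diag is a hypothetical helper name, not part of Proxmox.
collect_backup_diag() {
    vmid="$1"
    if [ -z "$vmid" ]; then
        echo "usage: collect_backup_diag <vmid>" >&2
        return 1
    fi

    echo "== pveversion -v =="
    pveversion -v

    echo "== storage status =="
    pvesm status

    echo "== guest config =="
    # qm handles VMs, pct handles containers; try the VM first.
    qm config "$vmid" 2>/dev/null || pct config "$vmid"
}
```

Example: collect_backup_diag 113 > diag.txt. Check the file before posting it, since a guest config can contain things like a cipassword hash or IP addresses.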

Laynord · Aug 28, 2023

Lukas Wagner said:
Could you provide me with the output from pveversion -v? What kind of storage do you use as a backup target (e.g. Backup Server, NFS, CIFS, etc.)? What is the hardware configuration for that VM/CT? (/etc/pve/{qemu-server,lxc}/<vmid>.conf)

Do you see anything odd in the system logs?

The version is:

proxmox-ve: 7.4-1 (running kernel: 5.15.108-1-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-4
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.2
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1

The target is an NVMe drive in a USB adapter, added as a directory storage in the Datacenter section of Proxmox and set to hold backups.

And the VM config is:

#Client%3A #My Name  #(ID%3A 1)
#Email%3A #My Email#
#Service ID%3A 79
#Hostname%3A #VmName#
#Main IP%3A #VMIp#
#IP address allocation%3A
##VMIp#
#Product%3#Product name# (ID%3A 24)
boot: c
bootdisk: scsi0
cipassword: $5$pH8FPBfE$B5dkhxNr5ZHWn13LPFXnjGyJ1daX9WA3fUkwWgYh41A
ciuser: underple
cores: 7
cpu: kvm64
cpuunits: 1024
ide2: local-lvm:vm-113-cloudinit,media=cdrom
ipconfig0: ip=Static ip,gw=Static ip
kvm: 1
memory: 32768
meta: creation-qemu=7.0.0,ctime=1668200626
name: VmName
net0: e1000=82:10:2B:17:93:47,bridge=vmbr2
onboot: 1
scsi0: local-lvm:vm-113-disk-0,cache=none,format=raw,size=250G
scsihw: virtio-scsi-pci
smbios1: uuid=0e034933-722e-4424-852e-c3cd4ba42a90
sockets: 1
vcpus: 7
vga: std
vmgenid: a91e6fc5-97d6-40fa-983c-6ae590241263

And for this question: "Do you see anything odd in the system logs?"
I cannot see much, as I currently have a RAID failure and the log is spammed with drive errors every 4 seconds.
And for some reason I can't use nano's go-to-line feature,
so to find anything I have to scroll through 64K lines for 5 minutes to reach the bottom.

Lukas Wagner · Aug 28, 2023

Laynord said:
And for this question: "Do you see anything odd in the system logs?"
I cannot see much, as I currently have a RAID failure and the log is spammed with drive errors every 4 seconds.
And for some reason I can't use nano's go-to-line feature,
so to find anything I have to scroll through 64K lines for 5 minutes to reach the bottom.

journalctl -e to the rescue: that jumps straight to the end of the log (journalctl -r lists the entries newest first)

Also, you can limit journalctl's output using --since and --until, e.g. journalctl --since "1 hour ago"

Now, you say that you currently have issues with your drives? I could imagine that this could be the reason for the hanging backup job. What kind of failure is it exactly? Can you provide any log messages from that?
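
As a rough sketch of the --since / --until idea above, a small wrapper can print one time window without opening the pager. journal_window is a made-up helper name; the journalctl options it passes (--no-pager, --since, --until) are standard, and any extra arguments are handed through unchanged.

```shell
#!/bin/sh
# Sketch: print journal entries for one time window, no pager.
# journal_window is a hypothetical helper name.
# Usage: journal_window <since> <until> [extra journalctl options...]
journal_window() {
    if [ "$#" -lt 2 ]; then
        echo "usage: journal_window <since> <until> [journalctl options]" >&2
        return 1
    fi
    since="$1"
    until_ts="$2"
    shift 2
    # Remaining arguments (e.g. -p err) are passed to journalctl as-is.
    journalctl --no-pager --since "$since" --until "$until_ts" "$@"
}
```

Example: journal_window "1 hour ago" "now" -p err shows only entries of priority "err" or worse from the last hour, which may help when the log is very noisy.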

Laynord · Aug 28, 2023

The current failure is that one drive died, so I'm upgrading from spinning rust to SSDs.
And one drive has bad sectors, I think. I'm not sure I recall correctly, and I don't know how to fix it, but it didn't break anything until now (it wasn't broken when installed).
After grepping out the disk failures, this is what I have in /var/log/syslog (see linked file).
The last backup that hit 99% was started on the 27th at 17:08
and ended (stuck / stalled) a bit over 3 hours later.
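
For reference, the "grep ignoring the disk failures" step could look roughly like the sketch below. backup_lines is a made-up helper name, and the noise pattern is an assumption you have to supply yourself, since the exact text of the drive errors is not shown in this thread. The second grep keeps lines mentioning vzdump (the PVE backup tool) or "backup".

```shell
#!/bin/sh
# Sketch: drop noisy lines from a log file, keep backup-related ones.
# backup_lines is a hypothetical helper name.
# Usage: backup_lines <noise-regex> [logfile]
backup_lines() {
    noise="$1"
    logfile="${2:-/var/log/syslog}"
    if [ -z "$noise" ]; then
        # An empty regex would match (and therefore drop) every line.
        echo "usage: backup_lines <noise-regex> [logfile]" >&2
        return 1
    fi
    grep -v -E -- "$noise" "$logfile" | grep -E -- 'vzdump|backup'
}
```

Example: backup_lines 'text-of-the-drive-error' /var/log/syslog, where the first argument is whatever string the repeating drive errors have in common.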

Laynord · Aug 28, 2023

Other backups work though, by the way.

Lukas Wagner said:
journalctl -e to the rescue: that jumps straight to the end of the log (journalctl -r lists the entries newest first)

Also, you can limit journalctl's output using --since and --until, e.g. journalctl --since "1 hour ago"

Now, you say that you currently have issues with your drives? I could imagine that this could be the reason for the hanging backup job. What kind of failure is it exactly? Can you provide any log messages from that?

Just some didn't, and for those I had to do manual exports, like exporting a pfSense config and others.

Lukas Wagner · Aug 29, 2023

Hmmm, looking at the logs it really looks like that was caused by the failing RAID array. At least I don't really have any other explanation.
It seems that the messages from the RAID controller stopped at a certain time (after a reboot, to be precise). Did you change anything, e.g. swap disks?
