VM backup stuck at 99%, no error, just hangs

Laynord

New Member
Apr 5, 2023
As mentioned in the title, the VM backup gets stuck at 99%: no error, it just hangs.
The backup disk is not full. I can stop and relaunch the backup, but it just doesn't work on some machines. (Screenshot: 1693169573124.png)
There is no message other than "Info: 99%" for 2.5 hours on this screen.
 
Could you provide me with the output from pveversion -v? What kind of storage do you use as a backup target (e.g. Backup Server, NFS, CIFS, etc.)? What is the hardware configuration for that VM/CT (/etc/pve/{qemu-server,lxc}/<vmid>.conf)?

Do you see anything odd in the system logs?
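For anyone gathering the same details, the requests above could be sketched as a few shell commands. This is a minimal sketch, not an official procedure: guest_conf_path is a hypothetical helper, the VMID 113 is only an example, and the paths assume the usual Proxmox VE layout in which VM configs live under /etc/pve/qemu-server and container configs under /etc/pve/lxc.

```shell
# Hypothetical helper: print the config file path for a guest.
# Assumes the usual Proxmox VE layout (qemu-server/ for VMs, lxc/ for containers).
guest_conf_path() {
  kind="$1"   # "vm" or "ct"
  vmid="$2"
  case "$kind" in
    vm) printf '/etc/pve/qemu-server/%s.conf\n' "$vmid" ;;
    ct) printf '/etc/pve/lxc/%s.conf\n' "$vmid" ;;
    *)  echo "usage: guest_conf_path vm|ct <vmid>" >&2; return 1 ;;
  esac
}

# Only collect data on a host that actually has the PVE tools installed.
if command -v pveversion >/dev/null 2>&1; then
  pveversion -v                      # package versions
  cat /etc/pve/storage.cfg           # configured storages, including the backup target
  cat "$(guest_conf_path vm 113)"    # 113 is an example VMID
fi
```

Before posting a config publicly, it is worth checking it for password hashes, names, and IP addresses.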
 
The version is:
proxmox-ve: 7.4-1 (running kernel: 5.15.108-1-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-4
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.15.102-1-pve: 5.15.102-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.15.30-2-pve: 5.15.30-3
ceph-fuse: 15.2.16-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.2
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve1
The target is an NVMe drive in a USB adapter, added as a directory storage in the Datacenter section of Proxmox and set to hold backups.

And the VM config output is:

#Client%3A #My Name #(ID%3A 1) #Email%3A #My Email# #Service ID%3A 79 #Hostname%3A #VmName# #Main IP%3A #VMIp# #IP address allocation%3A ##VMIp# #Product%3#Product name# (ID%3A 24)
boot: c
bootdisk: scsi0
cipassword: $5$pH8FPBfE$B5dkhxNr5ZHWn13LPFXnjGyJ1daX9WA3fUkwWgYh41A
ciuser: underple
cores: 7
cpu: kvm64
cpuunits: 1024
ide2: local-lvm:vm-113-cloudinit,media=cdrom
ipconfig0: ip=Static ip,gw=Static ip
kvm: 1
memory: 32768
meta: creation-qemu=7.0.0,ctime=1668200626
name: VmName
net0: e1000=82:10:2B:17:93:47,bridge=vmbr2
onboot: 1
scsi0: local-lvm:vm-113-disk-0,cache=none,format=raw,size=250G
scsihw: virtio-scsi-pci
smbios1: uuid=0e034933-722e-4424-852e-c3cd4ba42a90
sockets: 1
vcpus: 7
vga: std
vmgenid: a91e6fc5-97d6-40fa-983c-6ae590241263

As for the question "Do you see anything odd in the system logs?":
I cannot see much, as I currently have a RAID failure and the logs are spammed every 4 seconds with drive errors.
And for some reason I can't use nano's go-to-line feature, so to find anything I have to scroll through 64K lines for 5 minutes to reach the bottom.
 
journalctl -e to the rescue - it opens the journal at the end, so you see the most recent entries right away :)

Also, you can limit journalctl's output using --since and --until, e.g. journalctl --since "1 hour ago"
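As a rough illustration of the --since/--until suggestion, here is a minimal sketch of a wrapper that prints only the entries inside a time window. journal_window is a hypothetical name, and the time strings in the example calls are placeholders rather than the actual times of the failed job.

```shell
# Hypothetical wrapper: show journal entries between two points in time.
# --since, --until and --no-pager are standard journalctl options.
journal_window() {
  journalctl --no-pager --since "$1" --until "$2"
}

# Example calls (left commented out so the snippet has no side effects):
# journal_window "1 hour ago" "now"
# journal_window "yesterday" "today" | grep -i vzdump
```

Piping the result through grep, as in the second example, is one way to narrow the window down to backup-related lines; whether "vzdump" appears in the relevant entries on a given host is an assumption.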

Now, you say that you currently have issues with your drives? I could imagine that this could be the reason for the hanging backup job. What kind of failure is it exactly? Can you provide any log messages from that?
 
The current failure is that one drive died, so I'm upgrading from spinning rust to SSDs.
Another drive has bad sectors, I think. I'm not sure I recall correctly, and I don't know how to fix it, but it hadn't broken anything until now (it wasn't faulty when installed).
Using grep to filter out the disk failures, this is what I get from /var/log/syslog (see linked file).
The last backup that reached 99% was started on the 27th at 17:08
and stalled a little over 3 hours later.
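The exact grep used above is not shown in the thread. A minimal sketch of that kind of filtering could look like the following, where drop_noise is a hypothetical helper and the pattern in the example is an assumed stand-in for whatever the drive errors actually look like on this host.

```shell
# Hypothetical helper: read log lines on stdin and drop those matching
# the extended regular expression given as the first argument.
drop_noise() {
  grep -v -E -- "$1"
}

# Example (commented out; the pattern is an assumption, not taken from the logs):
# drop_noise 'I/O error|medium error' < /var/log/syslog | tail -n 200
```

Filtering before paging this way avoids scrolling through the repeated error lines by hand.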
 

Other backups work though, by the way.
Just some didn't, and I had to do some manual ones, like exporting a pfSense config and others.
 
Hmmm, looking at the logs it really looks like that was caused by the failing RAID array. At least I don't really have any other explanation.
It seems like the messages from the RAID controller stopped at a certain time (after a reboot, to be precise) - did you change anything, e.g. swap disks?
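One possible way to probe the failing-array theory, offered only as a sketch and not as something done in this thread, is to check whether any process is stuck in uninterruptible sleep (state D in ps), which usually means it is waiting on I/O. blocked_only is a hypothetical helper that filters ps output on the STAT column.

```shell
# Hypothetical helper: keep only lines whose second column (STAT) starts
# with "D", i.e. processes in uninterruptible sleep. The header line is
# dropped because "STAT" does not start with "D".
blocked_only() {
  awk '$2 ~ /^D/'
}

# Example (commented out so the snippet has no side effects):
# ps -eo pid,stat,comm | blocked_only
```

If a backup or VM process stays in that state while a job sits at 99%, that would be consistent with the storage being the culprit, though it is not proof on its own.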
 
