Backup hangs at 99%

muekno

I am relatively new to Proxmox but have successfully set up a PVE 8.1.1 machine and migrated 10 VMs from ESXi.
Then I set up a separate Proxmox Backup Server; the backup store is mirrored ZFS.
I had also successfully made some test backups, one of them 500 GB, in snapshot mode, and everything was fine.
Now I made a backup of a 12 GB Debian 11 Linux VM, using the same method as the others, and it is stuck at 99%. The backup volume has about 400 GB of free space.
The 99% were done in 3:44 minutes; now it has been stuck for nearly an hour.
Forum answers to this problem did not help, and neither did a Google search.

Regards
Rainer
 
Hi,
please share the output of pveversion -v, qm config <ID> replacing <ID> with the actual value and the full backup task log from both Proxmox VE and PBS side.
 
Code:
root@pverh:~# pveversion -v
proxmox-ve: 8.1.0 (running kernel: 6.5.13-5-pve)
pve-manager: 8.1.10 (running version: 8.1.10/4b06efb5db453f29)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.5.13-5-pve-signed: 6.5.13-5
proxmox-kernel-6.5: 6.5.13-5
proxmox-kernel-6.5.11-4-pve-signed: 6.5.11-4
ceph-fuse: 17.2.7-pve1
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx8
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-4
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.0
libproxmox-backup-qemu0: 1.4.1
libproxmox-rs-perl: 0.3.3
libpve-access-control: 8.1.3
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.5
libpve-cluster-perl: 8.0.5
libpve-common-perl: 8.1.1
libpve-guest-common-perl: 5.0.6
libpve-http-server-perl: 5.0.6
libpve-network-perl: 0.9.6
libpve-rs-perl: 0.8.8
libpve-storage-perl: 8.1.5
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.4.0-3
proxmox-backup-client: 3.1.5-1
proxmox-backup-file-restore: 3.1.5-1
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.2.3
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.5
proxmox-widget-toolkit: 4.1.5
pve-cluster: 8.0.5
pve-container: 5.0.9
pve-docs: 8.1.5
pve-edk2-firmware: 4.2023.08-4
pve-firewall: 5.0.3
pve-firmware: 3.11-1
pve-ha-manager: 4.0.3
pve-i18n: 3.2.1
pve-qemu-kvm: 8.1.5-5
pve-xtermjs: 5.3.0-3
qemu-server: 8.1.1
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.3-pve2
root@pverh:~#

Code:
root@pverh:~# qm config 505
agent: 1
balloon: 4096
boot: order=scsi0
cores: 2
hotplug: disk,network
lock: backup
memory: 5120

Where do I find the full backup logs?
 
You can double click the VM/CT 505 - Backup task in the bottom part of the UI in Proxmox VE. And on the backup server, the tasks can be found in the upper right corner
 
Code:
root@pverh:~# qm config 505
agent: 1
balloon: 4096
boot: order=scsi0
cores: 2
hotplug: disk,network
lock: backup
memory: 5120
This seems to be incomplete; the definition for the scsi0 disk is missing, for example.
 
Sorry
Code:
root@pverh:~# qm config 505
agent: 1
balloon: 4096
boot: order=scsi0
cores: 2
hotplug: disk,network
lock: backup
memory: 5120
name: mail
net0: virtio=BC:24:11:81:0D:BD,bridge=vmbr0
onboot: 1
ostype: l26
parent: before_debian_12
scsi0: VM_Space:vm-505-disk-0,size=12G
scsihw: virtio-scsi-pci
smbios1: uuid=5d7206ad-aa98-4c6e-a61c-12b7639cd2ca
startup: order=4,up=30
vmgenid: 930b8893-7349-49c1-8cbd-51fdcf76b6c3
vmstatestorage: VM_Space
root@pverh:~#

Backup Server
Code:
2024-04-24T14:15:57+02:00: starting new backup on datastore 'Backup' from ::ffff:172.16.1.80: "vm/505/2024-04-24T12:15:56Z"
2024-04-24T14:15:57+02:00: GET /previous: 400 Bad Request: no valid previous backup
2024-04-24T14:15:57+02:00: created new fixed index 1 ("vm/505/2024-04-24T12:15:56Z/drive-scsi0.img.fidx")
2024-04-24T14:15:57+02:00: add blob "/backup-pool/vm/505/2024-04-24T12:15:56Z/qemu-server.conf.blob" (329 bytes, comp: 329)

PVE Server
Code:
INFO: starting new backup job: vzdump 505 --notes-template '{{guestname}}' --notification-mode auto --node pverh --remove 0 --mode snapshot --storage backup
INFO: Starting Backup of VM 505 (qemu)
INFO: Backup started at 2024-04-24 14:15:56
INFO: status = running
INFO: VM Name: mail
INFO: include disk 'scsi0' 'VM_Space:vm-505-disk-0' 12G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating Proxmox Backup Server archive 'vm/505/2024-04-24T12:15:56Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '9e3cdad5-4e25-4057-9415-7720a72a98ef'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO:   2% (324.0 MiB of 12.0 GiB) in 3s, read: 108.0 MiB/s, write: 82.7 MiB/s
INFO:   4% (540.0 MiB of 12.0 GiB) in 6s, read: 72.0 MiB/s, write: 72.0 MiB/s
INFO:   6% (760.0 MiB of 12.0 GiB) in 9s, read: 73.3 MiB/s, write: 73.3 MiB/s
INFO:   7% (980.0 MiB of 12.0 GiB) in 12s, read: 73.3 MiB/s, write: 73.3 MiB/s
INFO:   9% (1.2 GiB of 12.0 GiB) in 15s, read: 76.0 MiB/s, write: 76.0 MiB/s
INFO:  11% (1.4 GiB of 12.0 GiB) in 18s, read: 76.0 MiB/s, write: 74.7 MiB/s
INFO:  13% (1.6 GiB of 12.0 GiB) in 21s, read: 81.3 MiB/s, write: 80.0 MiB/s
INFO:  15% (1.9 GiB of 12.0 GiB) in 24s, read: 74.7 MiB/s, write: 74.7 MiB/s
INFO:  17% (2.1 GiB of 12.0 GiB) in 27s, read: 73.3 MiB/s, write: 73.3 MiB/s
INFO:  19% (2.3 GiB of 12.0 GiB) in 30s, read: 86.7 MiB/s, write: 68.0 MiB/s
INFO:  21% (2.5 GiB of 12.0 GiB) in 33s, read: 70.7 MiB/s, write: 68.0 MiB/s
INFO:  22% (2.7 GiB of 12.0 GiB) in 36s, read: 66.7 MiB/s, write: 66.7 MiB/s
INFO:  24% (2.9 GiB of 12.0 GiB) in 39s, read: 62.7 MiB/s, write: 62.7 MiB/s
INFO:  25% (3.1 GiB of 12.0 GiB) in 42s, read: 68.0 MiB/s, write: 64.0 MiB/s
INFO:  27% (3.3 GiB of 12.0 GiB) in 45s, read: 68.0 MiB/s, write: 66.7 MiB/s
INFO:  29% (3.5 GiB of 12.0 GiB) in 48s, read: 77.3 MiB/s, write: 70.7 MiB/s
INFO:  31% (3.7 GiB of 12.0 GiB) in 51s, read: 66.7 MiB/s, write: 49.3 MiB/s
INFO:  32% (3.8 GiB of 12.0 GiB) in 54s, read: 37.3 MiB/s, write: 37.3 MiB/s
INFO:  33% (4.0 GiB of 12.0 GiB) in 59s, read: 31.2 MiB/s, write: 28.0 MiB/s
INFO:  35% (4.3 GiB of 12.0 GiB) in 1m 2s, read: 100.0 MiB/s, write: 81.3 MiB/s
INFO:  37% (4.5 GiB of 12.0 GiB) in 1m 5s, read: 82.7 MiB/s, write: 82.7 MiB/s
INFO:  39% (4.8 GiB of 12.0 GiB) in 1m 8s, read: 82.7 MiB/s, write: 82.7 MiB/s
INFO:  41% (5.0 GiB of 12.0 GiB) in 1m 11s, read: 62.7 MiB/s, write: 54.7 MiB/s
INFO:  42% (5.1 GiB of 12.0 GiB) in 1m 16s, read: 28.0 MiB/s, write: 26.4 MiB/s
INFO:  43% (5.2 GiB of 12.0 GiB) in 1m 19s, read: 40.0 MiB/s, write: 40.0 MiB/s
INFO:  44% (5.3 GiB of 12.0 GiB) in 1m 22s, read: 29.3 MiB/s, write: 29.3 MiB/s
INFO:  45% (5.4 GiB of 12.0 GiB) in 1m 30s, read: 19.5 MiB/s, write: 19.5 MiB/s
INFO:  46% (5.5 GiB of 12.0 GiB) in 1m 33s, read: 28.0 MiB/s, write: 26.7 MiB/s
INFO:  47% (5.7 GiB of 12.0 GiB) in 1m 43s, read: 15.6 MiB/s, write: 14.8 MiB/s
INFO:  48% (5.8 GiB of 12.0 GiB) in 1m 47s, read: 25.0 MiB/s, write: 24.0 MiB/s
INFO:  49% (5.9 GiB of 12.0 GiB) in 1m 54s, read: 14.9 MiB/s, write: 12.0 MiB/s
INFO:  50% (6.1 GiB of 12.0 GiB) in 1m 59s, read: 38.4 MiB/s, write: 33.6 MiB/s
INFO:  51% (6.2 GiB of 12.0 GiB) in 2m 2s, read: 56.0 MiB/s, write: 40.0 MiB/s
INFO:  54% (6.5 GiB of 12.0 GiB) in 2m 5s, read: 90.7 MiB/s, write: 90.7 MiB/s
INFO:  55% (6.7 GiB of 12.0 GiB) in 2m 8s, read: 68.0 MiB/s, write: 68.0 MiB/s
INFO:  57% (6.9 GiB of 12.0 GiB) in 2m 11s, read: 57.3 MiB/s, write: 57.3 MiB/s
INFO:  58% (7.0 GiB of 12.0 GiB) in 2m 14s, read: 32.0 MiB/s, write: 32.0 MiB/s
INFO:  59% (7.1 GiB of 12.0 GiB) in 2m 19s, read: 32.0 MiB/s, write: 30.4 MiB/s
INFO:  60% (7.3 GiB of 12.0 GiB) in 2m 22s, read: 54.7 MiB/s, write: 46.7 MiB/s
INFO:  61% (7.4 GiB of 12.0 GiB) in 2m 25s, read: 38.7 MiB/s, write: 32.0 MiB/s
INFO:  63% (7.6 GiB of 12.0 GiB) in 2m 28s, read: 66.7 MiB/s, write: 57.3 MiB/s
INFO:  64% (7.7 GiB of 12.0 GiB) in 2m 34s, read: 24.0 MiB/s, write: 24.0 MiB/s
INFO:  65% (7.9 GiB of 12.0 GiB) in 2m 37s, read: 57.3 MiB/s, write: 56.0 MiB/s
INFO:  66% (8.0 GiB of 12.0 GiB) in 2m 40s, read: 24.0 MiB/s, write: 24.0 MiB/s
INFO:  67% (8.1 GiB of 12.0 GiB) in 2m 43s, read: 58.7 MiB/s, write: 50.7 MiB/s
INFO:  69% (8.4 GiB of 12.0 GiB) in 2m 46s, read: 85.3 MiB/s, write: 85.3 MiB/s
INFO:  70% (8.5 GiB of 12.0 GiB) in 2m 49s, read: 44.0 MiB/s, write: 44.0 MiB/s
INFO:  72% (8.6 GiB of 12.0 GiB) in 2m 52s, read: 44.0 MiB/s, write: 42.7 MiB/s
INFO:  73% (8.8 GiB of 12.0 GiB) in 2m 56s, read: 46.0 MiB/s, write: 44.0 MiB/s
INFO:  75% (9.1 GiB of 12.0 GiB) in 2m 59s, read: 81.3 MiB/s, write: 68.0 MiB/s
INFO:  77% (9.3 GiB of 12.0 GiB) in 3m 2s, read: 70.7 MiB/s, write: 70.7 MiB/s
INFO:  78% (9.5 GiB of 12.0 GiB) in 3m 5s, read: 70.7 MiB/s, write: 70.7 MiB/s
INFO:  80% (9.7 GiB of 12.0 GiB) in 3m 8s, read: 74.7 MiB/s, write: 74.7 MiB/s
INFO:  82% (9.9 GiB of 12.0 GiB) in 3m 11s, read: 70.7 MiB/s, write: 70.7 MiB/s
INFO:  83% (10.1 GiB of 12.0 GiB) in 3m 14s, read: 60.0 MiB/s, write: 53.3 MiB/s
INFO:  85% (10.3 GiB of 12.0 GiB) in 3m 17s, read: 81.3 MiB/s, write: 80.0 MiB/s
INFO:  87% (10.5 GiB of 12.0 GiB) in 3m 20s, read: 73.3 MiB/s, write: 73.3 MiB/s
INFO:  88% (10.6 GiB of 12.0 GiB) in 3m 23s, read: 32.0 MiB/s, write: 30.7 MiB/s
INFO:  90% (10.8 GiB of 12.0 GiB) in 3m 26s, read: 64.0 MiB/s, write: 53.3 MiB/s
INFO:  91% (11.0 GiB of 12.0 GiB) in 3m 29s, read: 72.0 MiB/s, write: 68.0 MiB/s
INFO:  93% (11.2 GiB of 12.0 GiB) in 3m 32s, read: 77.3 MiB/s, write: 61.3 MiB/s
INFO:  95% (11.5 GiB of 12.0 GiB) in 3m 35s, read: 72.0 MiB/s, write: 72.0 MiB/s
INFO:  97% (11.7 GiB of 12.0 GiB) in 3m 38s, read: 74.7 MiB/s, write: 74.7 MiB/s
INFO:  98% (11.9 GiB of 12.0 GiB) in 3m 41s, read: 68.0 MiB/s, write: 68.0 MiB/s
INFO:  99% (12.0 GiB of 12.0 GiB) in 3m 44s, read: 38.7 MiB/s, write: 38.7 MiB/s
 
Can you check the output of ps faxl for the lines surrounding the kvm --id 505 ... process. What kind of storage type is VM_Space? Are there any errors in the system logs/journal? Is the VM itself still responsive (i.e. can you interact with the guest system)?
 
"Can you check the output of ps faxl for the lines surrounding the kvm --id 505 ... process" please explain how to do

VM_Space is mirrored ZFS; all the successful tests were on the same VM_Space.
The VM works fine; it's my mail relay, and your notifications are forwarded by this VM.
Where can I find the system log? As I wrote, I am quite new to Proxmox, but not to Linux, and I have been working with ESXi for more than 20 years.
 
"Can you check the output of ps faxl for the lines surrounding the kvm --id 505 ... process" please explain how to do
The command ps faxl can be run in the CLI and the lines surrounding the process kvm --id 505 ... might be interesting. You can also just write ps faxl | grep 505 to get only lines matching 505
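A small sketch of that filtering step: a plain grep 505 also matches the grep process itself and any unrelated number that contains 505. The helper name vm_lines below is hypothetical (it is not a Proxmox tool), and the patterns assume the command-line shapes that the kvm process and the vzdump task show in a ps listing.

```shell
# Hypothetical helper: reduce a `ps faxl` listing (read from stdin) to the
# QEMU process and the vzdump task that belong to one VM ID.
vm_lines() {
    vmid="$1"
    # "kvm -id 505 " or "kvm --id 505 " matches the VM process;
    # "[:]vzdump:505:" matches the backup task UPID. The bracket keeps this
    # grep from matching its own command line in the ps output.
    grep -E -- "(kvm --?id ${vmid}( |\$)|[:]vzdump:${vmid}:)"
}

# Usage on the PVE node (not run here):
#   ps faxl | vm_lines 505
```

The state and wait-channel columns of the remaining lines are the interesting part when a task appears to hang.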
VM_Space is mirrored ZFS; all the successful tests were on the same VM_Space.
The VM works fine; it's my mail relay, and your notifications are forwarded by this VM.
Where can I find the system log? As I wrote, I am quite new to Proxmox, but not to Linux, and I have been working with ESXi for more than 20 years.
It can be found in the UI under [Your node] > System > System Log or you can run journalctl -b to see the system journal for the current boot in the CLI.
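Since the journal for a whole boot can be long, here is a minimal sketch for narrowing it down. The helper name suspicious_lines and its keyword list are assumptions for illustration only; a problem does not necessarily contain the word "error" (the kernel's hung-task warning, for instance, reads "blocked for more than ... seconds").

```shell
# Hypothetical helper: keep only journal lines (read from stdin) containing
# keywords that often accompany a hang. The list is illustrative, not exhaustive.
suspicious_lines() {
    grep -iE 'error|fail|timeout|timed out|blocked for more than'
}

# Usage on the PVE node (not run here):
#   journalctl -b --no-pager | suspicious_lines
# journalctl can also filter by priority instead of by keyword:
#   journalctl -b -p err --no-pager
```

The priority filter shows messages logged at level err or more severe, which complements the keyword search.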
 
If you want to gather some debug information, you can install debugger and debug symbols with apt install libproxmox-backup-qemu0-dbgsym pve-qemu-kvm-dbgsym gdb and then run gdb --batch --ex 't a a bt' -p $(cat /var/run/qemu-server/505.pid)
 
Code:
root@pverh:~# ps faxl | grep 505
0     0  354676  334394  20   0   6332  2176 pipe_r S+   pts/0      0:00          \_ grep 505
7     0    1936       1  20   0 7710128 4681092 do_sys Sl ?        83:17 /usr/bin/kvm -id 505 -name mail,debug-threads=on -no-shutdown -chardev socket,id=qmp,path=/var/run/qemu-server/505.qmp,server=on,wait=off -mon chardev=qmp,mode=control -chardev socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5 -mon chardev=qmp-event,mode=control -pidfile /var/run/qemu-server/505.pid -daemonize -smbios type=1,uuid=5d7206ad-aa98-4c6e-a61c-12b7639cd2ca -smp 2,sockets=1,cores=2,maxcpus=2 -nodefaults -boot menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg -vnc unix:/var/run/qemu-server/505.vnc,password=on -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep -m 5120 -device pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e -device pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f -device vmgenid,guid=930b8893-7349-49c1-8cbd-51fdcf76b6c3 -device piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=tablet,bus=uhci.0,port=1 -device VGA,id=vga,bus=pci.0,addr=0x2 -chardev socket,path=/var/run/qemu-server/505.qga,server=on,wait=off,id=qga0 -device virtio-serial,id=qga0,bus=pci.0,addr=0x8 -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3,free-page-reporting=on -iscsi initiator-name=iqn.1993-08.org.debian:01:cd254f85485e -device virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5 -drive file=/dev/zvol/VMs/vm-505-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=io_uring,detect-zeroes=on -device scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100 -netdev type=tap,id=net0,ifname=tap505i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on -device virtio-net-pci,mac=BC:24:11:81:0D:BD,netdev=net0,bus=pci.0,addr=0x12,id=net0,rx_queue_size=1024,tx_queue_size=256 -machine type=pc+pve0
5     0  320877       1  20   0 251996 148760 hrtime Ss  ?          0:16 task UPID:pverh:0004E56D:00A2E5A1:6628F7FC:vzdump:505:root@pam:
root@pverh:~#


The system log is quite long, but I cannot find anything that looks like an error. I assume a line with an error would have the word error in it.


Thanks for your help
 
Hello Fiona,

Should that be a solution? Installing a debugger? The backup still hangs. It is no problem to stop and discard it and start it anew, but I want to learn about Proxmox.
Concerning the subscription: I am just a one-man show, an IT consultant specialized in OpenText GroupWise, SUSE SLES, OpenText OES Server, etc., and I have quite good knowledge of VMware. My customers mostly use VMware with vSphere and are not amused by Broadcom's new VMware policies. So I have been asked to evaluate Proxmox as a replacement. The best way to evaluate is to work in a production environment, so I moved my production and test VMs from ESXi to Proxmox. That way I can recommend it to my customers and support them in the migration. For me, a subscription has to be paid for from my work. I am sure my customers will subscribe if the decision goes towards Proxmox.

By the way, I have been in the business for about 40 years.

Kind Regards
Rainer
 
Should that be a solution? Installing a debugger? The backup still hangs.
It's not a solution, but it might lead to one. We can hope to get useful debug information while the issue is present.
It is no problem to stop and discard it and start it anew, but I want to learn about Proxmox.
It would be nice if you could gather the debug information first :) If there is an actual bug to be fixed it could help other users too. Although I haven't seen any other reports about such an issue recently, so it doesn't seem too wide-spread at least.
 
OK, no problem, but give me a little bit of time. If I install the debugger, would that work for the current process? As I wrote, I made some (5) test backups in the same way before, all running fine. This backup was done before upgrading the VM from Debian 11 to Debian 12.

Kind Regards
Rainer
 
Hello Fiona,

I made a snapshot of the 505 VM and then the backup. Could that be a problem?

Regards
Rainer
 
Now it is becoming really critical.
1. I have a backup of my 507 VM (600 GB), and after that I took a snapshot, which seemed successful.
Now, before restoring the backup, I wanted to go back to the snapshot to save time, but the rollback does not seem to end.

2. I had gotten the 505 VM (12 GB) running in a stable state, so I tried to stop the hanging backup. No success, so I did a shutdown via the shell
and tried to roll back to the snapshot (taken before the backup), but this rollback will not end either. That may have been a mistake.

In my earlier tests, rollbacks of such small (12 GB) VMs did not take longer than a few minutes.

Remember, I have not installed the debugger yet.

The 505 VM can be recreated, as it is just a mail relay and secondary DNS, so it contains no valuable data.

I really need help.
 
After rebooting PVE and the backup server, all machines came up. The PVE needed more than half an hour to start, but then it came up and started most VMs, even the 505.
ZFS shows no errors. Then I started the restore of the 507 from backup. This morning the restore window was stuck at 75%; I think it lost the connection, as a refresh did not help. After opening a new GUI, the log at the bottom shows restore OK at 4:24 this morning.
But all the VMs that came up green last night are now showing a yellow "!", which means VM hardware error. The PVE server is a professional Fujitsu server with 2 Xeon CPUs, 40 GB RAM and enough free disk space, so resources shouldn't be a problem.
Everything worked fine until the problem with the stuck backup started yesterday.

Any hint what to do?

Regards
Rainer
 
I found that the 1 TB ZFS disk is full; it seems that before the restore of the 507 (600 GB) the VM's old disk was not deleted. I am trying to get space by deleting test VMs that are no longer needed and moving others to free space on local-lvm.
 
I removed unused disks 0 and 1 from the 507 VM and now have enough free space again.
I got most VMs up and running. There is still a problem starting the 507 VM (600 GB of valuable data). I hope the backup is OK so I can restore again if I don't get the last restore up.
 
If I install the debugger, would that work for the current process?
Yes, as long as the version of the debug symbols is the same version as the running binary.
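A rough way to check that condition is to compare the installed package versions. The helper name check_dbgsym below is hypothetical. Note the limitation: it compares installed packages only, and a VM that was started before a package upgrade still runs the older binary, so the VM's start time matters as well.

```shell
# Hypothetical helper: read "package<TAB>version" lines from stdin and report
# every *-dbgsym package whose version differs from its base package.
check_dbgsym() {
    awk -F'\t' '
        { ver[$1] = $2 }
        END {
            status = 0
            for (pkg in ver) {
                if (pkg !~ /-dbgsym$/) continue
                base = pkg
                sub(/-dbgsym$/, "", base)
                if (!(base in ver) || ver[base] != ver[pkg]) {
                    printf "version mismatch: %s vs %s\n", base, pkg
                    status = 1
                }
            }
            exit status
        }'
}

# Usage on the PVE node (not run here):
#   dpkg-query -W -f '${Package}\t${Version}\n' \
#       pve-qemu-kvm pve-qemu-kvm-dbgsym \
#       libproxmox-backup-qemu0 libproxmox-backup-qemu0-dbgsym | check_dbgsym
```

No output and exit status 0 mean that the installed versions agree.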

Regarding the other issues you had, if the ZFS is full that could very well have been causing trouble. And for ZFS, being 90% full can already be problematic.
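To keep an eye on this, the pool fill level can be checked from the CLI. The sketch below assumes the tab-separated output of zpool list -H -o name,capacity; the helper name warn_full_pools is hypothetical, and its default limit of 90 simply mirrors the figure mentioned above.

```shell
# Hypothetical helper: read "pool<TAB>capacity" lines from stdin and print
# every pool whose fill level is at or above the limit (default 90 percent).
warn_full_pools() {
    limit="${1:-90}"
    awk -F'\t' -v limit="$limit" '
        {
            cap = $2
            sub(/%$/, "", cap)   # drop the trailing percent sign if present
            if (cap + 0 >= limit + 0) printf "%s is %s%% full\n", $1, cap
        }'
}

# Usage on the PVE node (not run here):
#   zpool list -H -o name,capacity | warn_full_pools 90
```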

There is still a problem starting the 507 VM (600 GB of valuable data). I hope the backup is OK so I can restore again if I don't get the last restore up.
What is the error you get when attempting to start the VM (you can double click the start task in the bottom panel in the UI to see the log)?
 
