[SOLVED] Backup hangs at 99%

Hello Fiona,

I started a new restore of 507 just a few minutes ago. Looking at the free space of VM_Space, the space used by 507 was released, and I hope that once the restore completes in a few hours the VM will start up fine.

Do you have any idea why I had disks 0, 1, and 2 of VM 507 after the restore tonight? Three times 600 GB is too much for 1 TB, even with compression and deduplication. Disks 0 and 1 are marked as unused under Hardware, so I deleted them and had enough space to start the rest. My nerves are shot.

Kind regards
Rainer
 
Do you have any idea why I had disks 0, 1, and 2 of VM 507 after the restore tonight? Three times 600 GB is too much for 1 TB, even with compression and deduplication. Disks 0 and 1 are marked as unused under Hardware, so I deleted them and had enough space to start the rest. My nerves are shot.
I can't tell from here; that would require seeing the whole task history logs. One possibility is that you changed where the drive is attached (e.g. from scsi0 to scsi1) and then restored a backup; in that case the original disk is kept, as it's not considered to be "the same" disk as in the backup. Another is failed migration attempts, where cleanup of the migrated disk doesn't work in some edge cases.
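Leftover disks show up as unusedN entries in the VM configuration, so you can double-check and clean them up from the CLI as well. A rough sketch for recent PVE versions (adjust the VMID and indices to your case):

Code:
# leftover disks appear as unused0, unused1, ... in the config
qm config 507
# detach such an entry and destroy the underlying volume (irreversible!)
qm disk unlink 507 --idlist unused0 --force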
 
Hello Fiona,

what's urgent for me is to get my system completely up and running. My 500 GB VM is restored and seems OK,
and my internal mail system seems OK too.

But my mail gateway, the 505 VM, seems to have problems since its disk ran full. That was the VM that started the backup problem. Perhaps it's the existing snapshot; I have now done a rollback to that snapshot. The cluster log at the bottom says OK, but I wonder why the snapshot screen still shows the snapshot and the "You are here" entry.

From my earlier tests I know the snapshot used to be gone afterwards. This VM is also my internal DNS.

Regards
Rainer

P.S.: If a backup or restore is running, the VMs react quite slowly, while the summary of the PVE server shows a CPU load of about 25% and 15 GB of free RAM; the LAN is 1 Gigabit.
 
But my mail gateway, the 505 VM, seems to have problems since its disk ran full. That was the VM that started the backup problem. Perhaps it's the existing snapshot; I have now done a rollback to that snapshot. The cluster log at the bottom says OK, but I wonder why the snapshot screen still shows the snapshot and the "You are here" entry.
A snapshot is not automatically removed after you roll back to it. That way you can roll back again, should you need to. Since the current state immediately becomes slightly different from the snapshot, there will always be a separate "You are here" entry.
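If you do not need the snapshot anymore after the rollback, you can remove it explicitly. A minimal sketch (the snapshot name is a placeholder):

Code:
# show the snapshot tree, including the 'You are here' state
qm listsnapshot 505
# delete a snapshot by name once it is no longer needed
qm delsnapshot 505 <snapname>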
 
Hello Fiona,


as I remember from my tests some weeks ago, the snapshots were gone afterwards, but it's fine as it is.
As far as I have tested, everything is running fine again; there are still some certificate issues, but I'll figure those out.

Even though we haven't found the reason for the stuck backup (did you check whether it might be because a snapshot was active?), I will declare this post as SOLVED.

Thank you once more for your help and patience. I learned a lot, and that is a good thing.

Best Regards
Rainer
 
Hello Fiona,

sorry to reopen this thread. In the meantime I updated the 505 VM, among others, from Debian 11 to 12 without any problem, and everything is running fine. Now I tried to back up all my VMs after the upgrade. The first three could be backed up (suspend mode) without any problem.
The 505 VM got stuck at 99% again. Fortunately the VM is still running and working. Next, I stopped the backup successfully. Then I wanted to delete the faulty backup, which was not possible:

Code:
TASK ERROR: proxmox-backup-client failed: Error: unable to acquire lock on snapshot directory "/backup-pool/vm/505/2024-04-27T04:58:01Z" - possibly running or in use

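(For reference: to see whether a process still holds a lock on that snapshot directory, one could check on the host where the datastore lives; assuming lsof is available there:)

Code:
# list processes with open files under the locked snapshot directory
lsof +D /backup-pool/vm/505/2024-04-27T04:58:01Z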
Here is the backup log

Code:
INFO: starting new backup job: vzdump 505 --remove 0 --node pverh --notification-mode auto --mode suspend --storage backup --notes-template '{{guestname}}'
INFO: Starting Backup of VM 505 (qemu)
INFO: Backup started at 2024-04-27 06:58:01
INFO: status = running
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: VM Name: mail
INFO: include disk 'scsi0' 'VM_Space:vm-505-disk-0' 12G
INFO: suspending guest
INFO: creating Proxmox Backup Server archive 'vm/505/2024-04-27T04:58:01Z'
INFO: skipping guest-agent 'fs-freeze', agent configured but not running?
INFO: started backup task '7ec0a7ef-814f-46c4-b6cd-1de0d1cb8841'
INFO: resuming VM again after 5 seconds
INFO: scsi0: dirty-bitmap status: created new
INFO:   2% (280.0 MiB of 12.0 GiB) in 3s, read: 93.3 MiB/s, write: 68.0 MiB/s
INFO:   4% (508.0 MiB of 12.0 GiB) in 6s, read: 76.0 MiB/s, write: 76.0 MiB/s
INFO:   5% (724.0 MiB of 12.0 GiB) in 9s, read: 72.0 MiB/s, write: 72.0 MiB/s
INFO:   7% (948.0 MiB of 12.0 GiB) in 12s, read: 74.7 MiB/s, write: 74.7 MiB/s
INFO:   9% (1.2 GiB of 12.0 GiB) in 15s, read: 78.7 MiB/s, write: 78.7 MiB/s
INFO:  11% (1.4 GiB of 12.0 GiB) in 18s, read: 72.0 MiB/s, write: 70.7 MiB/s
INFO:  13% (1.6 GiB of 12.0 GiB) in 21s, read: 80.0 MiB/s, write: 80.0 MiB/s
INFO:  15% (1.8 GiB of 12.0 GiB) in 24s, read: 78.7 MiB/s, write: 77.3 MiB/s
INFO:  17% (2.0 GiB of 12.0 GiB) in 27s, read: 72.0 MiB/s, write: 72.0 MiB/s
INFO:  19% (2.3 GiB of 12.0 GiB) in 30s, read: 93.3 MiB/s, write: 74.7 MiB/s
INFO:  20% (2.5 GiB of 12.0 GiB) in 33s, read: 66.7 MiB/s, write: 66.7 MiB/s
INFO:  22% (2.7 GiB of 12.0 GiB) in 36s, read: 66.7 MiB/s, write: 66.7 MiB/s
INFO:  23% (2.9 GiB of 12.0 GiB) in 39s, read: 53.3 MiB/s, write: 53.3 MiB/s
INFO:  25% (3.1 GiB of 12.0 GiB) in 42s, read: 66.7 MiB/s, write: 66.7 MiB/s
INFO:  27% (3.3 GiB of 12.0 GiB) in 45s, read: 68.0 MiB/s, write: 68.0 MiB/s
INFO:  28% (3.4 GiB of 12.0 GiB) in 48s, read: 61.3 MiB/s, write: 61.3 MiB/s
INFO:  29% (3.6 GiB of 12.0 GiB) in 51s, read: 49.3 MiB/s, write: 41.3 MiB/s
INFO:  31% (3.8 GiB of 12.0 GiB) in 54s, read: 57.3 MiB/s, write: 54.7 MiB/s
INFO:  32% (3.9 GiB of 12.0 GiB) in 58s, read: 28.0 MiB/s, write: 27.0 MiB/s
INFO:  33% (4.0 GiB of 12.0 GiB) in 1m 1s, read: 42.7 MiB/s, write: 38.7 MiB/s
INFO:  34% (4.1 GiB of 12.0 GiB) in 1m 4s, read: 50.7 MiB/s, write: 36.0 MiB/s
INFO:  35% (4.2 GiB of 12.0 GiB) in 1m 7s, read: 26.7 MiB/s, write: 20.0 MiB/s
INFO:  36% (4.3 GiB of 12.0 GiB) in 1m 12s, read: 23.2 MiB/s, write: 23.2 MiB/s
INFO:  38% (4.6 GiB of 12.0 GiB) in 1m 15s, read: 82.7 MiB/s, write: 82.7 MiB/s
INFO:  39% (4.8 GiB of 12.0 GiB) in 1m 19s, read: 49.0 MiB/s, write: 49.0 MiB/s
INFO:  40% (4.9 GiB of 12.0 GiB) in 1m 22s, read: 50.7 MiB/s, write: 50.7 MiB/s
INFO:  41% (5.0 GiB of 12.0 GiB) in 1m 25s, read: 33.3 MiB/s, write: 30.7 MiB/s
INFO:  42% (5.1 GiB of 12.0 GiB) in 1m 28s, read: 33.3 MiB/s, write: 29.3 MiB/s
INFO:  43% (5.2 GiB of 12.0 GiB) in 1m 31s, read: 36.0 MiB/s, write: 29.3 MiB/s
INFO:  44% (5.3 GiB of 12.0 GiB) in 1m 35s, read: 19.0 MiB/s, write: 16.0 MiB/s
INFO:  45% (5.4 GiB of 12.0 GiB) in 1m 39s, read: 36.0 MiB/s, write: 31.0 MiB/s
INFO:  46% (5.6 GiB of 12.0 GiB) in 1m 44s, read: 26.4 MiB/s, write: 18.4 MiB/s
INFO:  47% (5.7 GiB of 12.0 GiB) in 1m 50s, read: 20.7 MiB/s, write: 20.0 MiB/s
INFO:  48% (5.8 GiB of 12.0 GiB) in 1m 53s, read: 45.3 MiB/s, write: 45.3 MiB/s
INFO:  49% (5.9 GiB of 12.0 GiB) in 1m 58s, read: 19.2 MiB/s, write: 19.2 MiB/s
INFO:  50% (6.0 GiB of 12.0 GiB) in 2m 1s, read: 41.3 MiB/s, write: 41.3 MiB/s
INFO:  51% (6.2 GiB of 12.0 GiB) in 2m 4s, read: 54.7 MiB/s, write: 33.3 MiB/s
INFO:  52% (6.4 GiB of 12.0 GiB) in 2m 7s, read: 60.0 MiB/s, write: 60.0 MiB/s
INFO:  54% (6.5 GiB of 12.0 GiB) in 2m 10s, read: 46.7 MiB/s, write: 46.7 MiB/s
INFO:  55% (6.6 GiB of 12.0 GiB) in 2m 13s, read: 49.3 MiB/s, write: 49.3 MiB/s
INFO:  56% (6.8 GiB of 12.0 GiB) in 2m 16s, read: 53.3 MiB/s, write: 53.3 MiB/s
INFO:  57% (6.9 GiB of 12.0 GiB) in 2m 19s, read: 46.7 MiB/s, write: 46.7 MiB/s
INFO:  58% (7.0 GiB of 12.0 GiB) in 2m 22s, read: 10.7 MiB/s, write: 10.7 MiB/s
INFO:  59% (7.1 GiB of 12.0 GiB) in 2m 25s, read: 41.3 MiB/s, write: 41.3 MiB/s
INFO:  60% (7.2 GiB of 12.0 GiB) in 2m 29s, read: 41.0 MiB/s, write: 37.0 MiB/s
INFO:  61% (7.4 GiB of 12.0 GiB) in 2m 32s, read: 46.7 MiB/s, write: 37.3 MiB/s
INFO:  62% (7.4 GiB of 12.0 GiB) in 2m 35s, read: 22.7 MiB/s, write: 20.0 MiB/s
INFO:  63% (7.6 GiB of 12.0 GiB) in 2m 38s, read: 54.7 MiB/s, write: 42.7 MiB/s
INFO:  64% (7.7 GiB of 12.0 GiB) in 2m 42s, read: 20.0 MiB/s, write: 15.0 MiB/s
INFO:  65% (7.8 GiB of 12.0 GiB) in 2m 47s, read: 32.8 MiB/s, write: 24.8 MiB/s
INFO:  66% (7.9 GiB of 12.0 GiB) in 2m 50s, read: 26.7 MiB/s, write: 24.0 MiB/s
INFO:  67% (8.1 GiB of 12.0 GiB) in 2m 55s, read: 28.0 MiB/s, write: 25.6 MiB/s
INFO:  68% (8.2 GiB of 12.0 GiB) in 2m 58s, read: 44.0 MiB/s, write: 36.0 MiB/s
INFO:  69% (8.3 GiB of 12.0 GiB) in 3m 1s, read: 49.3 MiB/s, write: 49.3 MiB/s
INFO:  71% (8.5 GiB of 12.0 GiB) in 3m 4s, read: 69.3 MiB/s, write: 69.3 MiB/s
INFO:  72% (8.7 GiB of 12.0 GiB) in 3m 7s, read: 41.3 MiB/s, write: 41.3 MiB/s
INFO:  73% (8.8 GiB of 12.0 GiB) in 3m 10s, read: 46.7 MiB/s, write: 46.7 MiB/s
INFO:  74% (8.9 GiB of 12.0 GiB) in 3m 14s, read: 23.0 MiB/s, write: 22.0 MiB/s
INFO:  75% (9.1 GiB of 12.0 GiB) in 3m 17s, read: 61.3 MiB/s, write: 57.3 MiB/s
INFO:  76% (9.2 GiB of 12.0 GiB) in 3m 20s, read: 54.7 MiB/s, write: 46.7 MiB/s
INFO:  78% (9.4 GiB of 12.0 GiB) in 3m 23s, read: 70.7 MiB/s, write: 65.3 MiB/s
INFO:  80% (9.6 GiB of 12.0 GiB) in 3m 26s, read: 70.7 MiB/s, write: 62.7 MiB/s
INFO:  81% (9.8 GiB of 12.0 GiB) in 3m 29s, read: 69.3 MiB/s, write: 69.3 MiB/s
INFO:  83% (10.0 GiB of 12.0 GiB) in 3m 32s, read: 65.3 MiB/s, write: 50.7 MiB/s
INFO:  85% (10.2 GiB of 12.0 GiB) in 3m 35s, read: 61.3 MiB/s, write: 60.0 MiB/s
INFO:  86% (10.4 GiB of 12.0 GiB) in 3m 38s, read: 74.7 MiB/s, write: 74.7 MiB/s
INFO:  88% (10.6 GiB of 12.0 GiB) in 3m 41s, read: 56.0 MiB/s, write: 54.7 MiB/s
INFO:  89% (10.7 GiB of 12.0 GiB) in 3m 46s, read: 31.2 MiB/s, write: 28.0 MiB/s
INFO:  91% (11.0 GiB of 12.0 GiB) in 3m 49s, read: 77.3 MiB/s, write: 77.3 MiB/s
INFO:  92% (11.1 GiB of 12.0 GiB) in 3m 52s, read: 60.0 MiB/s, write: 58.7 MiB/s
INFO:  94% (11.3 GiB of 12.0 GiB) in 3m 55s, read: 48.0 MiB/s, write: 48.0 MiB/s
INFO:  95% (11.5 GiB of 12.0 GiB) in 3m 58s, read: 77.3 MiB/s, write: 74.7 MiB/s
INFO:  97% (11.8 GiB of 12.0 GiB) in 4m 1s, read: 81.3 MiB/s, write: 81.3 MiB/s
INFO:  99% (12.0 GiB of 12.0 GiB) in 4m 4s, read: 81.3 MiB/s, write: 81.3 MiB/s
ERROR: interrupted by signal
INFO: aborting backup job

The check that the VM was still running was done via an SSH connection.
Now, after trying to connect via the console, I cannot access the VM any more:
Code:
VM 505 qmp command 'set_password' failed - unable to connect to VM 505 qmp socket - timeout after 51 retries
TASK ERROR: Failed to run vncproxy.

No SSH connection is possible, and the VM shows a disk icon.

That's my mail gateway, so this is critical.

What can I do to unlock that VM?

Regards

Rainer

P.S.

after qm unlock and finally qm stop I got the VM down. Starting it again, it runs fine.
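For the record, that was roughly this sequence:

Code:
qm unlock 505   # clear the stale backup lock
qm stop 505     # hard-stop the hung VM
qm start 505    # afterwards it booted fine again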

Do I have to live with never being able to back up that VM?
 
Should it happen again, please try to obtain the debug trace with GDB while the VM/backup is stuck. If we are lucky, the cause of the issue can be seen there.
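A sketch of how to capture such a trace (this is the command used in the follow-up post below; QEMU debug symbols may need to be installed first):

Code:
# print backtraces of all threads of the stuck VM process (replace 505 with the VMID)
gdb --batch --ex 't a a bt' -p $(cat /var/run/qemu-server/505.pid)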
 
Hello Fiona,

sorry for the delay, but I have been very busy testing Proxmox and successfully migrating a customer's environment from ESXi to Proxmox.

So now I have time again for my production system. The backup problem with the 505 VM still exists, and now with the 502 VM too; all other 8 VMs have no problem. I installed the debug packages, and below is the output of the recommended command.
Code:
root@pverh:~# gdb --batch --ex 't a a bt' -p $(cat /var/run/qemu-server/502.pid)
[New LWP 2655757]
[New LWP 2655778]
[New LWP 2655779]
[New LWP 2655780]
[New LWP 2655784]
[New LWP 3686088]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x000071a36c955256 in __ppoll (fds=0x565582738080, nfds=79, timeout=<optimized out>, timeout@entry=0x7fff4863f4d0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:42
42      ../sysdeps/unix/sysv/linux/ppoll.c: No such file or directory.

Thread 7 (Thread 0x71a369eb34c0 (LWP 3686088) "iou-wrk-2655756"):
#0  0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x0

Thread 6 (Thread 0x71a2db4006c0 (LWP 2655784) "vnc_worker"):
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x565582b85bf8) at ./nptl/futex-internal.c:57
#1  __futex_abstimed_wait_common (futex_word=futex_word@entry=0x565582b85bf8, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0, cancel=cancel@entry=true) at ./nptl/futex-internal.c:87
#2  0x000071a36c8deefb in __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x565582b85bf8, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at ./nptl/futex-internal.c:139
#3  0x000071a36c8e1558 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x565582b85c08, cond=0x565582b85bd0) at ./nptl/pthread_cond_wait.c:503
#4  ___pthread_cond_wait (cond=cond@entry=0x565582b85bd0, mutex=mutex@entry=0x565582b85c08) at ./nptl/pthread_cond_wait.c:618
#5  0x000056557fba6deb in qemu_cond_wait_impl (cond=0x565582b85bd0, mutex=0x565582b85c08, file=0x56557fc6bcf4 "../ui/vnc-jobs.c", line=248) at ../util/qemu-thread-posix.c:225
#6  0x000056557f632f2b in vnc_worker_thread_loop (queue=queue@entry=0x565582b85bd0) at ../ui/vnc-jobs.c:248
#7  0x000056557f633bc8 in vnc_worker_thread (arg=arg@entry=0x565582b85bd0) at ../ui/vnc-jobs.c:362
#8  0x000056557fba62d8 in qemu_thread_start (args=0x5655826fee80) at ../util/qemu-thread-posix.c:541
#9  0x000071a36c8e2134 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#10 0x000071a36c9627dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 5 (Thread 0x71a363e006c0 (LWP 2655780) "CPU 1/KVM"):
#0  __GI___ioctl (fd=26, request=request@entry=44672) at ../sysdeps/unix/sysv/linux/ioctl.c:36
#1  0x000056557fa0c6cf in kvm_vcpu_ioctl (cpu=cpu@entry=0x5655826f82f0, type=type@entry=44672) at ../accel/kvm/kvm-all.c:3179
#2  0x000056557fa0cba5 in kvm_cpu_exec (cpu=cpu@entry=0x5655826f82f0) at ../accel/kvm/kvm-all.c:2991
#3  0x000056557fa0e08d in kvm_vcpu_thread_fn (arg=arg@entry=0x5655826f82f0) at ../accel/kvm/kvm-accel-ops.c:51
#4  0x000056557fba62d8 in qemu_thread_start (args=0x565582701330) at ../util/qemu-thread-posix.c:541
#5  0x000071a36c8e2134 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#6  0x000071a36c9627dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 4 (Thread 0x71a3690006c0 (LWP 2655779) "CPU 0/KVM"):
#0  __GI___ioctl (fd=24, request=request@entry=44672) at ../sysdeps/unix/sysv/linux/ioctl.c:36
#1  0x000056557fa0c6cf in kvm_vcpu_ioctl (cpu=cpu@entry=0x5655826c7690, type=type@entry=44672) at ../accel/kvm/kvm-all.c:3179
#2  0x000056557fa0cba5 in kvm_cpu_exec (cpu=cpu@entry=0x5655826c7690) at ../accel/kvm/kvm-all.c:2991
#3  0x000056557fa0e08d in kvm_vcpu_thread_fn (arg=arg@entry=0x5655826c7690) at ../accel/kvm/kvm-accel-ops.c:51
#4  0x000056557fba62d8 in qemu_thread_start (args=0x5655825ec190) at ../util/qemu-thread-posix.c:541
#5  0x000071a36c8e2134 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#6  0x000071a36c9627dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 3 (Thread 0x71a369eb34c0 (LWP 2655778) "vhost-2655756"):
#0  0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x0

Thread 2 (Thread 0x71a369a006c0 (LWP 2655757) "call_rcu"):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x000056557fba745a in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at ./include/qemu/futex.h:29
#2  qemu_event_wait (ev=ev@entry=0x5655804fac68 <rcu_call_ready_event>) at ../util/qemu-thread-posix.c:464
#3  0x000056557fbb0d62 in call_rcu_thread (opaque=opaque@entry=0x0) at ../util/rcu.c:278
#4  0x000056557fba62d8 in qemu_thread_start (args=0x565582373df0) at ../util/qemu-thread-posix.c:541
#5  0x000071a36c8e2134 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#6  0x000071a36c9627dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 1 (Thread 0x71a369eb34c0 (LWP 2655756) "kvm"):
#0  0x000071a36c955256 in __ppoll (fds=0x565582738080, nfds=79, timeout=<optimized out>, timeout@entry=0x7fff4863f4d0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:42
#1  0x000056557fbbc55e in ppoll (__ss=0x0, __timeout=0x7fff4863f4d0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/poll2.h:64
#2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=timeout@entry=618385549) at ../util/qemu-timer.c:351
#3  0x000056557fbb9e4e in os_host_main_loop_wait (timeout=618385549) at ../util/main-loop.c:308
#4  main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:592
#5  0x000056557f816aa7 in qemu_main_loop () at ../softmmu/runstate.c:732
#6  0x000056557fa16f46 in qemu_default_main () at ../softmmu/main.c:37
#7  0x000071a36c88024a in __libc_start_call_main (main=main@entry=0x56557f607480 <main>, argc=argc@entry=64, argv=argv@entry=0x7fff4863f6e8) at ../sysdeps/nptl/libc_start_call_main.h:58
#8  0x000071a36c880305 in __libc_start_main_impl (main=0x56557f607480 <main>, argc=64, argv=0x7fff4863f6e8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff4863f6d8) at ../csu/libc-start.c:360
#9  0x000056557f6090a1 in _start ()
[Inferior 1 (process 2655756) detached]
root@pverh:~#

That was done after trying a local backup that is stuck at: INFO: 99% (12.0 GiB of 12.0 GiB) in 3m 11s, read: 519.6 MiB/s, write: 11.7 MiB/s

Regards
Rainer
 
Unfortunately, the backtrace does not look special. However, what is interesting is that there are no backup threads anymore, so seemingly the backup already finished on the QEMU side. What does qm status 502 --verbose show while it's stuck?
 
Code:
root@pverh:~# qm status 502 --verbose
balloon: 1073741824
balloon_min: 1073741824
ballooninfo:
        actual: 1073741824
        free_mem: 145195008
        last_update: 1716888808
        major_page_faults: 3045
        max_mem: 2147483648
        mem_swapped_in: 1888256
        mem_swapped_out: 174514176
        minor_page_faults: 130932762
        total_mem: 988774400
blockstat:
        scsi0:
                account_failed: 1
                account_invalid: 1
                failed_flush_operations: 0
                failed_rd_operations: 0
                failed_unmap_operations: 0
                failed_wr_operations: 0
                failed_zone_append_operations: 0
                flush_operations: 47542
                flush_total_time_ns: 2208013036959
                idle_time_ns: 22463384470
                invalid_flush_operations: 0
                invalid_rd_operations: 0
                invalid_unmap_operations: 0
                invalid_wr_operations: 0
                invalid_zone_append_operations: 0
                rd_bytes: 570470912
                rd_merged: 0
                rd_operations: 17083
                rd_total_time_ns: 6257453433
                timed_stats:
                unmap_bytes: 0
                unmap_merged: 0
                unmap_operations: 0
                unmap_total_time_ns: 0
                wr_bytes: 10349387776
                wr_highest_offset: 7298482176
                wr_merged: 0
                wr_operations: 738425
                wr_total_time_ns: 1707252689325
                zone_append_bytes: 0
                zone_append_merged: 0
                zone_append_operations: 0
                zone_append_total_time_ns: 0
cpus: 2
disk: 0
diskread: 570470912
diskwrite: 10349387776
freemem: 145195008
lock: backup
maxdisk: 12884901888
maxmem: 2147483648
mem: 843579392
name: isp-master
netin: 294899091
netout: 142968903
nics:
        tap502i0:
                netin: 294899091
                netout: 142968903
pid: 2655756
proxmox-support:
        backup-fleecing: 1
        backup-max-workers: 1
        pbs-dirty-bitmap: 1
        pbs-dirty-bitmap-migration: 1
        pbs-dirty-bitmap-savevm: 1
        pbs-library-version: 1.4.1 (UNKNOWN)
        pbs-masterkey: 1
        query-bitmap-info: 1
qmpstatus: running
running-machine: pc-i440fx-8.1+pve0
running-qemu: 8.1.5
shares: 1000
status: running
uptime: 240303
vmid: 502
root@pverh:~#

By the way, my PVE has the latest updates.
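(For reference, the installed versions can be listed like this:)

Code:
# shows pve-manager, kernel and related package versions
pveversion -v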


Kind Regards

Rainer


P.S. VM 502 still shows as locked.
 
Additional information:
The VM itself works without any problem. I have done an fsck too, which did not find or fix any problem; the filesystem of the VM is ext4.
The VM last received dist-upgrades last Saturday.

MK
 
Hello Fiona,

I have not heard anything from you since I sent the status information.
After the latest updates, including some updates for backup, I tried a backup again; it is still stuck at 99%.
As the two servers (out of ten in total) are not so easy to set up again (one is a mail relay, the other holds a lot of configuration for the whole system), not having a backup of them is critical, too.

Can I give you any further information to help find the problem?

Kind Regards

Rainer
 
What does zpool status -v show?
Does the following complete without errors?
Code:
pvesm path VM_Space:vm-505-disk-0
qemu-img dd if=/path/from/previous/command/vm-505-disk-0 > /dev/null
Please also share the output of qemu-img info /path/from/previous/command/vm-505-disk-0 --output=json

While the backup is stuck, please share the output of
Code:
echo '{"execute": "qmp_capabilities"}{"execute": "query-backup"}' | socat - /var/run/qemu-server/505.qmp | jq
You probably need to run apt install socat jq first.
 
Code:
 zpool status -v
  pool: VMs
 state: ONLINE
  scan: scrub canceled on Mon May 13 13:29:09 2024
config:

    NAME        STATE     READ WRITE CKSUM
    VMs         ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        sdb     ONLINE       0     0     0
        sdc     ONLINE       0     0     0

errors: No known data errors

I tried a backup of the second problem VM, 502, this morning and it got stuck; I have not tried VM 505 since my last post.
I did the dd on the 505 VM and it went well, so I tried a backup, which finished without errors; don't ask me why.

After having cleaned up the 502 VM (unlock, delete the 1 B sized backup), I started the dd on the 502 VM; it has been running for nearly half an hour now for a size of 12 GiB and does not finish. Even the 505 VM took longer than I expected for 12 GiB to /dev/null, but after about 15 minutes it finished.
502 is still copying. I will wait another 15 minutes and then stop it, to start a new backup, so that I can do your last check.
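For reference, the read test from the earlier reply, made concrete (the zvol path is an assumption based on the pool name from zpool status; use whatever pvesm path actually prints):

Code:
pvesm path VM_Space:vm-502-disk-0
# assuming it prints a zvol path such as /dev/zvol/VMs/vm-502-disk-0:
qemu-img dd if=/dev/zvol/VMs/vm-502-disk-0 > /dev/null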


Kind Regards

Rainer
 
I aborted the dd and tried a new backup of VM 502; I got:
Code:
INFO: starting new backup job: vzdump 502 --notes-template '{{guestname}} test 2' --node pverh --notification-mode auto --storage backup --mode snapshot --remove 0
INFO: Starting Backup of VM 502 (qemu)
INFO: Backup started at 2024-06-07 15:20:58
INFO: status = running
INFO: VM Name: isp-master-172-16-1-240
INFO: include disk 'scsi0' 'VM_Space:vm-502-disk-0' 12G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating Proxmox Backup Server archive 'vm/502/2024-06-07T13:20:58Z'
INFO: skipping guest-agent 'fs-freeze', agent configured but not running?
ERROR: VM 502 qmp command 'backup' failed - got timeout
INFO: aborting backup job

But aborting the job does not finish.

Code:
root@pverh:~# echo '{"execute": "qmp_capabilities"}{"execute": "query-backup"}' | socat - /var/run/qemu-server/502.qmp | jq

This does not finish either.


Backup log from this morning for VM 502:
Code:
INFO: starting new backup job: vzdump 502 --storage backup --notification-mode legacy-sendmail --node pverh --notes-template '{{guestname}} test' --remove 0 --mode snapshot --mailto mk@muekno.de
INFO: Starting Backup of VM 502 (qemu)
INFO: Backup started at 2024-06-07 07:39:24
INFO: status = running
INFO: VM Name: isp-master-172-16-1-240
INFO: include disk 'scsi0' 'VM_Space:vm-502-disk-0' 12G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: snapshots found (not included into backup)
INFO: creating Proxmox Backup Server archive 'vm/502/2024-06-07T05:39:24Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '342682a1-f9cd-4f51-8054-e53bca133d1a'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: created new
INFO:   1% (244.0 MiB of 12.0 GiB) in 3s, read: 81.3 MiB/s, write: 74.7 MiB/s
INFO:   3% (396.0 MiB of 12.0 GiB) in 6s, read: 50.7 MiB/s, write: 50.7 MiB/s
INFO:   4% (508.0 MiB of 12.0 GiB) in 10s, read: 28.0 MiB/s, write: 28.0 MiB/s
INFO:   5% (684.0 MiB of 12.0 GiB) in 13s, read: 58.7 MiB/s, write: 58.7 MiB/s
INFO:   6% (820.0 MiB of 12.0 GiB) in 16s, read: 45.3 MiB/s, write: 45.3 MiB/s
INFO:   8% (1.0 GiB of 12.0 GiB) in 19s, read: 70.7 MiB/s, write: 70.7 MiB/s
INFO:  10% (1.2 GiB of 12.0 GiB) in 22s, read: 72.0 MiB/s, write: 72.0 MiB/s
INFO:  11% (1.4 GiB of 12.0 GiB) in 25s, read: 70.7 MiB/s, write: 69.3 MiB/s
INFO:  12% (1.5 GiB of 12.0 GiB) in 28s, read: 34.7 MiB/s, write: 34.7 MiB/s
INFO:  13% (1.7 GiB of 12.0 GiB) in 31s, read: 46.7 MiB/s, write: 46.7 MiB/s
INFO:  15% (1.8 GiB of 12.0 GiB) in 34s, read: 49.3 MiB/s, write: 49.3 MiB/s
INFO:  16% (1.9 GiB of 12.0 GiB) in 39s, read: 27.2 MiB/s, write: 27.2 MiB/s
INFO:  17% (2.1 GiB of 12.0 GiB) in 43s, read: 28.0 MiB/s, write: 23.0 MiB/s
INFO:  18% (2.3 GiB of 12.0 GiB) in 46s, read: 70.7 MiB/s, write: 70.7 MiB/s
INFO:  20% (2.4 GiB of 12.0 GiB) in 49s, read: 49.3 MiB/s, write: 49.3 MiB/s
INFO:  21% (2.6 GiB of 12.0 GiB) in 53s, read: 39.0 MiB/s, write: 39.0 MiB/s
INFO:  22% (2.7 GiB of 12.0 GiB) in 56s, read: 50.7 MiB/s, write: 50.7 MiB/s
INFO:  23% (2.8 GiB of 12.0 GiB) in 59s, read: 26.7 MiB/s, write: 26.7 MiB/s
INFO:  25% (3.0 GiB of 12.0 GiB) in 1m 2s, read: 80.0 MiB/s, write: 80.0 MiB/s
INFO:  26% (3.2 GiB of 12.0 GiB) in 1m 5s, read: 72.0 MiB/s, write: 72.0 MiB/s
INFO:  28% (3.4 GiB of 12.0 GiB) in 1m 8s, read: 56.0 MiB/s, write: 56.0 MiB/s
INFO:  29% (3.5 GiB of 12.0 GiB) in 1m 11s, read: 53.3 MiB/s, write: 53.3 MiB/s
INFO:  30% (3.6 GiB of 12.0 GiB) in 1m 15s, read: 24.0 MiB/s, write: 23.0 MiB/s
INFO:  31% (3.7 GiB of 12.0 GiB) in 1m 19s, read: 21.0 MiB/s, write: 21.0 MiB/s
INFO:  32% (3.9 GiB of 12.0 GiB) in 1m 23s, read: 34.0 MiB/s, write: 34.0 MiB/s
INFO:  33% (4.0 GiB of 12.0 GiB) in 1m 29s, read: 18.7 MiB/s, write: 18.7 MiB/s
INFO:  34% (4.1 GiB of 12.0 GiB) in 1m 35s, read: 20.7 MiB/s, write: 17.3 MiB/s
INFO:  35% (4.3 GiB of 12.0 GiB) in 1m 38s, read: 68.0 MiB/s, write: 66.7 MiB/s
INFO:  37% (4.5 GiB of 12.0 GiB) in 1m 41s, read: 58.7 MiB/s, write: 58.7 MiB/s
INFO:  38% (4.7 GiB of 12.0 GiB) in 1m 44s, read: 70.7 MiB/s, write: 70.7 MiB/s
INFO:  39% (4.7 GiB of 12.0 GiB) in 1m 47s, read: 25.3 MiB/s, write: 25.3 MiB/s
INFO:  40% (4.8 GiB of 12.0 GiB) in 1m 50s, read: 25.3 MiB/s, write: 24.0 MiB/s
INFO:  41% (4.9 GiB of 12.0 GiB) in 1m 53s, read: 37.3 MiB/s, write: 37.3 MiB/s
INFO:  42% (5.1 GiB of 12.0 GiB) in 1m 57s, read: 37.0 MiB/s, write: 37.0 MiB/s
INFO:  43% (5.2 GiB of 12.0 GiB) in 2m 1s, read: 31.0 MiB/s, write: 31.0 MiB/s
INFO:  44% (5.3 GiB of 12.0 GiB) in 2m 5s, read: 29.0 MiB/s, write: 29.0 MiB/s
INFO:  45% (5.4 GiB of 12.0 GiB) in 2m 11s, read: 24.7 MiB/s, write: 24.0 MiB/s
INFO:  46% (5.5 GiB of 12.0 GiB) in 2m 14s, read: 29.3 MiB/s, write: 29.3 MiB/s
INFO:  47% (5.6 GiB of 12.0 GiB) in 2m 17s, read: 40.0 MiB/s, write: 40.0 MiB/s
INFO:  48% (5.8 GiB of 12.0 GiB) in 2m 20s, read: 45.3 MiB/s, write: 45.3 MiB/s
INFO:  49% (5.9 GiB of 12.0 GiB) in 2m 23s, read: 49.3 MiB/s, write: 49.3 MiB/s
INFO:  50% (6.0 GiB of 12.0 GiB) in 2m 26s, read: 25.3 MiB/s, write: 21.3 MiB/s
INFO:  51% (6.2 GiB of 12.0 GiB) in 2m 29s, read: 57.3 MiB/s, write: 53.3 MiB/s
INFO:  52% (6.3 GiB of 12.0 GiB) in 2m 32s, read: 33.3 MiB/s, write: 33.3 MiB/s
INFO:  53% (6.4 GiB of 12.0 GiB) in 2m 35s, read: 38.7 MiB/s, write: 33.3 MiB/s
INFO:  54% (6.5 GiB of 12.0 GiB) in 2m 39s, read: 26.0 MiB/s, write: 25.0 MiB/s
INFO:  55% (6.6 GiB of 12.0 GiB) in 2m 43s, read: 36.0 MiB/s, write: 36.0 MiB/s
INFO:  56% (6.8 GiB of 12.0 GiB) in 2m 46s, read: 53.3 MiB/s, write: 53.3 MiB/s
INFO:  57% (6.9 GiB of 12.0 GiB) in 2m 49s, read: 49.3 MiB/s, write: 49.3 MiB/s
INFO:  59% (7.1 GiB of 12.0 GiB) in 2m 52s, read: 57.3 MiB/s, write: 57.3 MiB/s
INFO:  60% (7.3 GiB of 12.0 GiB) in 2m 55s, read: 56.0 MiB/s, write: 49.3 MiB/s
INFO:  61% (7.3 GiB of 12.0 GiB) in 2m 58s, read: 32.0 MiB/s, write: 21.3 MiB/s
INFO:  62% (7.4 GiB of 12.0 GiB) in 3m 3s, read: 20.8 MiB/s, write: 20.0 MiB/s
INFO:  63% (7.6 GiB of 12.0 GiB) in 3m 7s, read: 32.0 MiB/s, write: 31.0 MiB/s
INFO:  64% (7.7 GiB of 12.0 GiB) in 3m 10s, read: 37.3 MiB/s, write: 28.0 MiB/s
INFO:  65% (7.8 GiB of 12.0 GiB) in 3m 13s, read: 40.0 MiB/s, write: 28.0 MiB/s
INFO:  66% (7.9 GiB of 12.0 GiB) in 3m 18s, read: 24.8 MiB/s, write: 23.2 MiB/s
INFO:  68% (8.2 GiB of 12.0 GiB) in 3m 21s, read: 90.7 MiB/s, write: 37.3 MiB/s
INFO:  70% (8.5 GiB of 12.0 GiB) in 3m 24s, read: 97.3 MiB/s, write: 36.0 MiB/s
INFO:  72% (8.7 GiB of 12.0 GiB) in 3m 27s, read: 86.7 MiB/s, write: 36.0 MiB/s
INFO:  73% (8.8 GiB of 12.0 GiB) in 3m 30s, read: 34.7 MiB/s, write: 33.3 MiB/s
INFO:  75% (9.0 GiB of 12.0 GiB) in 3m 33s, read: 68.0 MiB/s, write: 28.0 MiB/s
INFO:  76% (9.1 GiB of 12.0 GiB) in 3m 36s, read: 32.0 MiB/s, write: 2.7 MiB/s
INFO:  77% (9.2 GiB of 12.0 GiB) in 3m 40s, read: 31.0 MiB/s, write: 4.0 MiB/s
INFO:  78% (9.4 GiB of 12.0 GiB) in 3m 43s, read: 50.7 MiB/s, write: 17.3 MiB/s
INFO:  79% (9.5 GiB of 12.0 GiB) in 3m 46s, read: 49.3 MiB/s, write: 18.7 MiB/s
INFO:  81% (9.7 GiB of 12.0 GiB) in 3m 49s, read: 64.0 MiB/s, write: 28.0 MiB/s
INFO:  82% (9.9 GiB of 12.0 GiB) in 3m 52s, read: 53.3 MiB/s, write: 8.0 MiB/s
INFO:  89% (10.8 GiB of 12.0 GiB) in 3m 56s, read: 229.0 MiB/s, write: 27.0 MiB/s
INFO:  99% (11.9 GiB of 12.0 GiB) in 3m 59s, read: 401.3 MiB/s, write: 118.7 MiB/s
ERROR: VM 502 qmp command 'query-backup' failed - got timeout
INFO: aborting backup job
ERROR: VM 502 qmp command 'backup-cancel' failed - interrupted by signal
INFO: resuming VM again
What might be of interest: this VM has 2 snapshots that I can see under the VM's disks, but not under Snapshots when the VM is selected.
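One way to cross-check that from the host might be to compare what the VM configuration knows against what ZFS actually has (assuming the 502 disk lives on the VMs pool shown in the zpool status output above):

Code:
# snapshots known to the VM configuration
qm listsnapshot 502
# snapshots actually present on the ZFS pool for this disk
zfs list -t snapshot -r VMs | grep vm-502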

Kind Regards


Rainer
 
Hello Fiona,

after being unable to do a new backup of VM 502, I shut PVE down completely, powered it off, restarted with power on, and each VM came up fine. Then I restarted the backup of VM 502 and, what a wonder, it ran through completely to the end.

Maybe the latest patches had something to do with it (I had not rebooted PVE, as there was no notice to do so), or maybe it was something else.
I have been in this business since 1983, and I learned: "If something works and you do not have an obvious explanation, do not think about it or search for it; it is a waste of time."

Many, many thanks for your time and for helping me. Have a nice weekend.

Kind Regards

Rainer

If I do not hear from you by Monday evening, I will declare the case SOLVED.

mk
 
My disks are enterprise-grade HDDs, and my server is an enterprise-grade Fujitsu server.
I have 10 VMs on that server; some are Debian 12 with the latest patches, some are SUSE SLES 15 SP5, some (including the 2 problem VMs) were imported from ESXi, and some were set up fresh.

The problem had existed for weeks; no backup mode helped, not even with the VM powered off.

Recently there was an update to PVE including a new backup client; that's why I tried again.

My ZFS is fine, running on separate disks, no hardware RAID.

Why did the reboot help? I think applying only the latest update without a reboot did not replace everything that was loaded, while a reboot, which reloads everything, does. So I think there was a problem in the code that only came up under special conditions and is now fixed in the latest update.
 
The problem had existed for weeks; no backup mode helped, not even with the VM powered off.
Sorry for my English; can you rephrase this?

BTW, the VMware filesystem (VMFS) used by ESXi isn't comparable to the ZFS beast.
LVM-thin (used with the default ext4 PVE install) would be.
ZFS slows the whole thing down (when poorly sized).
 
VMFS does not conflict with Proxmox, whatever filesystem you use on Proxmox. What is relevant is the guest's filesystem, and the guest filesystem is independent of the filesystem you choose for Proxmox itself. What is important is that you use a filesystem for Proxmox that supports snapshots, like LVM or ZFS. That said, ZFS is the better filesystem, as it is easy to handle, extremely stable and failure tolerant, expandable, and so on.

BTW, I do not understand at all what you are trying to say with your posts.
 
