[SOLVED] Extremely slow backup

muekno

Member
Dec 15, 2023
I have a VM (Debian Linux) with a 600 GB disk that I use as an archive. There have been no changes for at least the last 4 days. An automatic backup is configured at 3:00, retaining the last 3.
The backups from 14.10, 15.10, 16.10 and before ran fine, taking 10 to 15 seconds.
This morning at about 6:00 I noticed that the mail from the backup was missing. I checked and saw it was still running, not even half done. I stopped it.
Then I started it by hand, and it was slow again. Unexpectedly, there is a high netin rate, a high disk write rate and a high IO delay, while the PBS CPU does almost nothing (0.2% load).
During a backup I would have expected a high disk read rate and a high netout rate.
I have another 10 VMs in 2 backup jobs at 1:00 and 2:00. They ran fine, mostly in less than 20 seconds, and up to 2 minutes for the VMs with more modified data. The archive VM behaved the same way since it was created about 15 days ago.

What can be wrong?
 
The log from today's attempt:
Code:
vzdump 900 --node pve-rh --mode snapshot --mailto mk@muekno.de --remove 0 --notes-template '{{guestname}}, {{node}}, {{vmid}}' --storage PBSc --notification-mode auto


900: 2024-10-17 11:59:53 INFO: Starting Backup of VM 900 (qemu)
900: 2024-10-17 11:59:53 INFO: status = running
900: 2024-10-17 11:59:53 INFO: VM Name: archive-172-16-1-80
900: 2024-10-17 11:59:53 INFO: include disk 'scsi0' 'VM-USB-Store:vm-900-disk-0' 600G
900: 2024-10-17 11:59:53 INFO: backup mode: snapshot
900: 2024-10-17 11:59:53 INFO: ionice priority: 7
900: 2024-10-17 11:59:53 INFO: creating Proxmox Backup Server archive 'vm/900/2024-10-17T09:59:53Z'
900: 2024-10-17 11:59:53 INFO: issuing guest-agent 'fs-freeze' command
900: 2024-10-17 11:59:54 INFO: issuing guest-agent 'fs-thaw' command
900: 2024-10-17 11:59:54 INFO: started backup task '11a7b072-185c-43fb-b81d-65e574beb0b8'
900: 2024-10-17 11:59:54 INFO: resuming VM again
900: 2024-10-17 11:59:54 INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared
900: 2024-10-17 11:59:57 INFO:   0% (3.0 GiB of 600.0 GiB) in 3s, read: 1.0 GiB/s, write: 2.7 MiB/s
900: 2024-10-17 12:03:23 INFO:   1% (6.1 GiB of 600.0 GiB) in 3m 29s, read: 15.5 MiB/s, write: 318.1 KiB/s
900: 2024-10-17 12:10:44 INFO:   2% (12.1 GiB of 600.0 GiB) in 10m 50s, read: 14.0 MiB/s, write: 18.6 KiB/s
900: 2024-10-17 12:13:16 ERROR: interrupted by signal
900: 2024-10-17 12:13:16 INFO: aborting backup job
900: 2024-10-17 12:13:20 INFO: resuming VM again
900: 2024-10-17 12:13:20 ERROR: Backup of VM 900 failed - interrupted by signal

Yesterday's last good backup:
Code:
vzdump 900 --prune-backups 'keep-last=3' --fleecing 0 --mode snapshot --storage PBSc --mailto mk@muekno.de --quiet 1 --notes-template '{{guestname}}' --mailnotification always


900: 2024-10-16 03:00:00 INFO: Starting Backup of VM 900 (qemu)
900: 2024-10-16 03:00:00 INFO: status = running
900: 2024-10-16 03:00:00 INFO: VM Name: archive-172-16-1-80
900: 2024-10-16 03:00:00 INFO: include disk 'scsi0' 'VM-USB-Store:vm-900-disk-0' 600G
900: 2024-10-16 03:00:00 INFO: backup mode: snapshot
900: 2024-10-16 03:00:00 INFO: ionice priority: 7
900: 2024-10-16 03:00:00 INFO: creating Proxmox Backup Server archive 'vm/900/2024-10-16T01:00:00Z'
900: 2024-10-16 03:00:00 INFO: issuing guest-agent 'fs-freeze' command
900: 2024-10-16 03:00:01 INFO: issuing guest-agent 'fs-thaw' command
900: 2024-10-16 03:00:03 INFO: started backup task '2125234f-c36a-41f0-a30f-23486ee6b573'
900: 2024-10-16 03:00:03 INFO: resuming VM again
900: 2024-10-16 03:00:03 INFO: scsi0: dirty-bitmap status: OK (336.0 MiB of 600.0 GiB dirty)
900: 2024-10-16 03:00:03 INFO: using fast incremental mode (dirty-bitmap), 336.0 MiB dirty of 600.0 GiB total
900: 2024-10-16 03:00:06 INFO:  34% (116.0 MiB of 336.0 MiB) in 3s, read: 38.7 MiB/s, write: 36.0 MiB/s
900: 2024-10-16 03:00:09 INFO:  42% (144.0 MiB of 336.0 MiB) in 6s, read: 9.3 MiB/s, write: 9.3 MiB/s
900: 2024-10-16 03:00:12 INFO:  52% (176.0 MiB of 336.0 MiB) in 9s, read: 10.7 MiB/s, write: 10.7 MiB/s
900: 2024-10-16 03:00:15 INFO:  59% (200.0 MiB of 336.0 MiB) in 12s, read: 8.0 MiB/s, write: 8.0 MiB/s
900: 2024-10-16 03:00:18 INFO:  97% (328.0 MiB of 336.0 MiB) in 15s, read: 42.7 MiB/s, write: 42.7 MiB/s
900: 2024-10-16 03:00:21 INFO: 100% (336.0 MiB of 336.0 MiB) in 18s, read: 2.7 MiB/s, write: 2.7 MiB/s
900: 2024-10-16 03:00:21 INFO: Waiting for server to finish backup validation...
900: 2024-10-16 03:00:23 INFO: backup was done incrementally, reused 599.68 GiB (99%)
900: 2024-10-16 03:00:23 INFO: transferred 336.00 MiB in 20 seconds (16.8 MiB/s)
900: 2024-10-16 03:00:23 INFO: adding notes to backup
900: 2024-10-16 03:00:23 INFO: prune older backups with retention: keep-last=3
900: 2024-10-16 03:00:23 INFO: running 'proxmox-backup-client prune' for 'vm/900'
900: 2024-10-16 03:00:24 INFO: pruned 1 backup(s) not covered by keep-retention policy
900: 2024-10-16 03:00:24 INFO: Finished Backup of VM 900 (00:00:24)
 
What else I found out: I have 10 productive VMs. Some are Debian Bookworm, some are SUSE SLES 15 SP5.
3 of them (one of them is the problem VM) use VirtIO SCSI single, most of the others use VirtIO SCSI, and one still uses VMware PVSCSI.

Comparing the disk write of the VMs, the VirtIO SCSI single ones typically have higher disk write than the others, especially the problem VM, which is doing nothing. top shows 99.9 to 100% idle. It is just a simple Debian text installation with no services and no clients, used only to archive data from time to time. Only Samba is installed so that I can access it from my Mac. The Mac has no connection to the VM except when I need it.
 
900: 2024-10-17 11:59:54 INFO: scsi0: dirty-bitmap status: existing bitmap was invalid and has been cleared

The "slow" backup could not use the disk's dirty bitmap for some reason (maybe the VM was stopped/started?). The VM's disks will have to be read in full, compressing each chunk and sending its hash to PBS. A chunk is only uploaded if it isn't already on the PBS, and given that this VM has very few changes to its disk(s), very little network traffic will flow. A new backup is needed for the dirty bitmap to be recreated.
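
To illustrate why a full read can still produce almost no upload traffic, here is a minimal Python sketch of hash-based chunk deduplication. It is not the actual PBS client: the function name `backup_full_read`, the in-memory `known_digests` set standing in for the server's chunk store, and the tiny 1 KiB chunk size are all assumptions made for the example, and compression is left out.

```python
import hashlib


def backup_full_read(disk: bytes, known_digests: set, chunk_size: int) -> tuple:
    """Read every chunk, hash it, and 'upload' only chunks the server lacks.

    Returns (bytes_read, bytes_uploaded). `known_digests` is a hypothetical
    stand-in for the chunks already stored on the backup server.
    """
    bytes_read = 0
    bytes_uploaded = 0
    for offset in range(0, len(disk), chunk_size):
        chunk = disk[offset:offset + chunk_size]
        bytes_read += len(chunk)  # the whole disk is read, no matter what
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in known_digests:
            known_digests.add(digest)  # only unknown chunks are transferred
            bytes_uploaded += len(chunk)
    return bytes_read, bytes_uploaded


CHUNK = 1024  # toy chunk size, chosen only to keep the example small
server_chunks = set()

# Eight chunks with distinct content: the first backup uploads everything.
disk = b"".join(bytes([i]) * CHUNK for i in range(8))
first = backup_full_read(disk, server_chunks, CHUNK)

# Change a single chunk, then back up again without a dirty bitmap.
disk = disk[:3 * CHUNK] + b"\xff" * CHUNK + disk[4 * CHUNK:]
second = backup_full_read(disk, server_chunks, CHUNK)

print(first)   # (8192, 8192)
print(second)  # (8192, 1024)
```

In the second run all 8192 bytes are read and hashed, but only the one changed chunk is uploaded, which matches the pattern of slow reads with hardly any writes in the log.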
 
Does that mean I have to wait until a backup (the one running right now) has finished?

I tried to open the log, but it is still completely empty. The status says it has been running for over 3 hours now, as it started as scheduled at 3:00.
There is no data on the server that I do not also have on a desktop, so even if I destroy the VM, nothing will be lost except some time for setting up the VM again. That is quickly done: just a Debian text install, Samba and some copy time.
As there is no data change, I could restore the last good backup. What do you think, is that better than waiting for the backup until the end of time?
 
the first backup of a VM and every backup after the VM has been stopped will be "slow" like that. there are some other factors that can also affect the bitmap (like the last snapshot on the PBS side being corrupt, or switching encryption mode or key, or ..).
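
A toy Python model of this behaviour may help. It is only a sketch under simplifying assumptions: the class `VmDisk` and its methods are made up for illustration, and the real bitmap is maintained by QEMU, not by code like this. The model only captures the rule stated above: without a valid bitmap every block must be read, and stopping the VM discards the bitmap.

```python
class VmDisk:
    """Hypothetical model of a VM disk with a dirty bitmap."""

    def __init__(self, n_blocks: int):
        self.n_blocks = n_blocks
        self.dirty = None  # None means: no valid bitmap exists

    def write(self, block: int) -> None:
        # Writes are only tracked while a valid bitmap exists.
        if self.dirty is not None:
            self.dirty.add(block)

    def stop_and_start(self) -> None:
        # In this model the bitmap does not survive a VM stop.
        self.dirty = None

    def backup(self) -> int:
        """Return how many blocks the backup has to read."""
        if self.dirty is None:
            to_read = self.n_blocks   # full read, the "slow" case
        else:
            to_read = len(self.dirty)  # fast incremental mode
        self.dirty = set()             # a finished backup creates a fresh bitmap
        return to_read


disk = VmDisk(n_blocks=1000)
print(disk.backup())  # 1000: first backup reads everything

disk.write(5)
disk.write(7)
disk.write(5)
print(disk.backup())  # 2: only the two changed blocks are read

disk.stop_and_start()
print(disk.backup())  # 1000: bitmap lost, full read again
```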
 
Hello Fabian.

thanks for the response. While testing and building up the PVE and PBS, first on my HP and Fujitsu servers and then moving to my less power-consuming hardware, I restarted the PVE and the PBS quite often. And when updates to the PVE or PBS required a reboot, I never saw such a slow backup. Even the first backup of a 600 GB VM (I have a second one besides this archive VM) took a maximum of about 4 hours. This one has now been running for more than 7 hours, nearly double the time of the first backup of this VM.
I never saw a significantly longer backup of any other VM, even if I moved it to another datastore or had to restart the VM because an upgrade needed a reboot, which my three SUSE SLES VMs need nearly weekly.
I get the logs of the daily backups by mail, so I have a good overview of the backups and the time they need.

Regards
Rainer
 
the only thing that counts is whether the VM has been running (live migration is okay as well) since the last snapshot was made. whether you reboot the PVE node or the PBS system does not matter.

you can see in your log:

Code:
900: 2024-10-17 12:03:23 INFO:   1% (6.1 GiB of 600.0 GiB) in 3m 29s, read: 15.5 MiB/s, write: 318.1 KiB/s

that the read speed is just very low, and it has to read all the 600GB to determine that almost none of it has to be backed up.
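
As a rough back-of-the-envelope check, the following Python snippet estimates how long one full read of the 600 GiB disk takes at the read rates shown in the log. It assumes the rate stays constant for the whole run, which a real backup will not guarantee; the helper name `full_read_hours` is made up for this example.

```python
def full_read_hours(disk_gib: float, read_mib_per_s: float) -> float:
    """Hours needed to read the whole disk once at a constant rate."""
    total_mib = disk_gib * 1024
    return total_mib / read_mib_per_s / 3600


# Read rates taken from the 1% and 2% progress lines of the slow backup.
for rate in (15.5, 14.0):
    print(f"{rate} MiB/s -> {full_read_hours(600, rate):.1f} h")
# 15.5 MiB/s -> 11.0 h
# 14.0 MiB/s -> 12.2 h
```

At these rates a full pass over the disk takes roughly 11 to 12 hours, so a backup that is still running after 7 hours is in line with the observed read speed.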
 
