Virtual machine slows down during backup

Maksimus

Member
May 16, 2022
While the VM is being backed up, it starts to slow down. In some cases this leads to the virtual machine hanging completely.
The VM is running Windows Server 2019.
The problem was noticed on PVE with Linux 5.15.107-2-pve #1 SMP PVE 5.15.107-2 (2023-05-10T09:10Z) and Linux 6.5.11-7-pve (2023-12-05T09:44Z) 8.1.4.
 
Hi,
please share the task log for the backup job in question as well as the VM config (qm config <VMID> --current). What storage backend is used as the backup target? Note that VM backups in snapshot and suspend mode use copy-before-write: when the VM writes to a block that has not been backed up yet, the old content of that block is written to the backup target before the new data is written to the VM disk, in order to keep the backup consistent. This comes, however, with the limitation of being bottlenecked by the backup target.
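To illustrate why the backup target becomes the bottleneck, here is a minimal toy model of copy-before-write in Python. It is only a sketch under stated assumptions: the class name, the block granularity, and the latency figures (20 ms per target write, 1 ms per local write) are made up for illustration and are not taken from QEMU or PVE.

```python
from dataclasses import dataclass, field


@dataclass
class CopyBeforeWriteBackup:
    """Toy model of a VM disk while a backup is running (hypothetical)."""
    disk: list                # current guest-visible block contents
    target_write_ms: float    # assumed latency of one block write to the backup target
    disk_write_ms: float      # assumed latency of one block write to the VM disk
    backup: dict = field(default_factory=dict)  # block index -> content saved in the backup

    def guest_write(self, index, data):
        """Apply a guest write and return the latency the guest observes (ms)."""
        latency = 0.0
        if index not in self.backup:
            # The old content has not been backed up yet, so it must reach
            # the backup target before the block may be overwritten.
            self.backup[index] = self.disk[index]
            latency += self.target_write_ms
        self.disk[index] = data
        latency += self.disk_write_ms
        return latency


vm = CopyBeforeWriteBackup(disk=["a", "b", "c"], target_write_ms=20.0, disk_write_ms=1.0)
first = vm.guest_write(1, "B")    # block 1 not yet saved: pays the target latency
second = vm.guest_write(1, "B2")  # block 1 already saved: local latency only
print(first, second)              # 21.0 1.0
print(vm.backup)                  # {1: 'b'}
```

In this model the first write to each not-yet-saved block is as slow as the backup target, which is the effect a slow HDD-backed target would have on the guest.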
 
The backup log from the day the virtual machine crashed was not saved. Attached is the log of the latest backup.
The PVE server is installed on an NVMe disk; PBS is installed on HDDs.

Target Server (PBS) Linux 6.5.11-7-pve (2023-12-05T09:44Z)

Code:
agent: 1,fstrim_cloned_disks=1
balloon: 0
boot: order=virtio0
cores: 8
cpu: host
hotplug: disk,network,usb,memory,cpu
machine: pc-q35-8.1
memory: 18432
meta: creation-qemu=8.1.2,ctime=1701860077
name: IT
net0: virtio=BC:24:11:1D:C9:05,bridge=vmbr0,firewall=1
numa: 1
onboot: 1
ostype: win10
scsihw: virtio-scsi-single
smbios1: uuid=963ca8f6-06a7-487f-8da3-a41443f2ed4d
sockets: 1
tags: pbs36-4
virtio0: local-zfs:vm-9023-disk-0,format=raw,size=90G
virtio1: local-zfs:vm-9023-disk-1,backup=0,format=raw,size=76G
virtio2: local-zfs:vm-9023-disk-2,backup=0,format=raw,size=40G
virtio3: local-zfs:vm-9023-disk-3,format=raw,size=20G
vmgenid: e1d9bb30-91b0-4521-8e36-402ce2f8598a

Do I understand you correctly?
That when a backup starts, everything that changes in the VM while the backup is in progress should also be added to the current backup?
 

Attachments

  • task-Host402-vzdump-2024-02-06T18_00_03Z.log
    7.2 KB · Views: 1
Do I understand you correctly?
That when a backup starts, everything that changes in the VM while the backup is in progress should also be added to the current backup?
No, the other way around: when the VM wants to write new data to the VM disk while the backup is in progress, the current data is written to the backup first, before the data block on the disk is updated. That keeps the backup state consistent.

A possible workaround is to back up to a local PBS instance and only then pull the backups to a remote PBS instance via a remote sync job. That is, until backup fleecing, which is currently a work in progress (see [0]), improves backup performance by writing changes to a fleecing image first.

[0] https://lists.proxmox.com/pipermail/pve-devel/2024-January/061470.html
 
I read everything your colleagues wrote in the link. Do I understand correctly that it is planned to create a mechanism that, when starting a backup, will create a temporary disk where the data will be written while the backup is in progress? That is, something like a snapshot in pure kvm (where all the data is written to a separate file in the snapshot and then merged into the main image).
 
Well, the disk is used more like a buffer, not like a snapshot. In an ideal case it will not be needed, in the worst case the fleecing image will however grow to the full size of the disk.
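The buffer behaviour described here can be sketched with a small toy model. This is an illustration only, under the assumption that old block contents are parked in a local fleecing image and later drained to the backup; the class and method names are hypothetical, and the actual patches may work differently.

```python
class FleecingBackup:
    """Toy model: old block contents are buffered locally during a backup."""

    def __init__(self, disk):
        self.disk = list(disk)  # current guest-visible block contents
        self.fleecing = {}      # block index -> old content parked locally
        self.backup = {}        # block index -> content already in the backup

    def guest_write(self, index, data):
        # Preserve the old content locally instead of waiting for the target.
        if index not in self.backup and index not in self.fleecing:
            self.fleecing[index] = self.disk[index]
        self.disk[index] = data

    def backup_block(self, index):
        if index in self.backup:
            return
        if index in self.fleecing:
            # Drain the buffered old content and free the buffer space.
            self.backup[index] = self.fleecing.pop(index)
        else:
            self.backup[index] = self.disk[index]


# Ideal case: the guest writes nothing, so the buffer is never used.
idle = FleecingBackup(["a", "b", "c"])
for i in range(3):
    idle.backup_block(i)

# Worst case: the guest overwrites every block before it is backed up,
# so the buffer grows to the full size of the disk.
busy = FleecingBackup(["a", "b", "c"])
for i in range(3):
    busy.guest_write(i, "new")
peak_busy = len(busy.fleecing)
for i in range(3):
    busy.backup_block(i)

print(len(idle.fleecing), peak_busy)  # 0 3
print(busy.backup)                    # {0: 'a', 1: 'b', 2: 'c'}
```

In both cases the backup ends up with the block contents from the moment the backup started; only the amount of local buffer space differs.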
 
Approximately when can we expect these changes? Or is there perhaps an updated version in the test repository already?
 
Unfortunately I cannot give you an ETA; the patches are still open for discussion. As previously suggested, you might want to check whether a local PBS instance solves your performance issues, as we did not fully establish that the PBS target is indeed your performance bottleneck. Could you maybe explain what workloads are running within the VM and how you determine the slowdown? Do you have some system logs from within the VM? Also, do you see any unexpected load, CPU, or IO values on the host while the backup is running?

On a side note: for PBS it is recommended to use a dedicated host with SSDs, in order to limit failure modes and get the desired performance, especially for the random IO intensive jobs such as chunk integrity verification and garbage collection. For further details please see https://pbs.proxmox.com/docs/installation.html#recommended-server-system-requirements
 
Could you maybe explain what workloads are running within the VM and how you determine the slowdown?
The VM is a Windows Server 2019 terminal server on which 20-30 people work simultaneously. It runs an accounting program that works with an MS SQL database, plus Google Chrome and Office. The database is placed on a separate disk which is not included in the backup.

The slowdown is visible to the naked eye: what normally opens in a fraction of a second (Excel) can take 2-3 seconds to open during a backup, and all other operations are delayed as well.

Do you have some system logs from within the VM?
Which logs should I provide?

CPU and I/O load on the PVE side is not recorded; see the attached Screenshot_142.
CPU and I/O load on the PBS side: see the attached Screenshot_143 and Screenshot_144. We use 2 PBS instances with 4 HDDs in each. Backups are spread evenly from 20:00 to 01:00; during this time 80 VMs are backed up.
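As a rough back-of-the-envelope check of that schedule, the following sketch computes the average time slot per VM. It assumes, purely for illustration, that the 80 VMs are split evenly between the two PBS instances and that each instance handles its VMs one after another; neither assumption is stated above.

```python
from datetime import datetime, timedelta

# Figures from the post: backup window 20:00 to 01:00, 80 VMs, 2 PBS instances.
start = datetime.strptime("20:00", "%H:%M")
end = datetime.strptime("01:00", "%H:%M")
if end <= start:
    end += timedelta(days=1)  # the window crosses midnight

window_minutes = (end - start).total_seconds() / 60
total_vms = 80
pbs_instances = 2

# Assumption: even split and strictly sequential backups per PBS instance.
vms_per_pbs = total_vms // pbs_instances
minutes_per_vm = window_minutes / vms_per_pbs

print(window_minutes)  # 300.0
print(vms_per_pbs)     # 40
print(minutes_per_vm)  # 7.5
```

Under these assumptions each VM gets on average 7.5 minutes on its PBS instance; if backups overlap instead, several jobs compete for the same HDDs at once.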

Are there any recommendations on the file system for PBS disks, the size of these disks, or any other parameters?
 

Attachments

  • Screenshot_142.png (150.4 KB)
  • Screenshot_143.png (418.3 KB)
  • Screenshot_144.png (149.2 KB)
We use 2 PBS with 4 HDD disks in each.
Seems like you have a lot of IO delay on the PBS side. What hard disks are you using, and in what storage configuration? It seems like the storage might not be up to the task.
Are there any recommendations on the file system for PBS disks, the size of these disks, or any other parameters?
We recommend internal enterprise-class SSDs, which provide enough IOPS for all the random IO operations PBS requires, especially during verification and garbage collection tasks. For the filesystem, I suggest sticking to either ZFS for RAID configurations with multiple disks, or ext4/xfs, as these are the officially supported ones.
 
Seems like you have a lot of IO delay on the PBS side. What hard disks are you using, and in what storage configuration? It seems like the storage might not be up to the task.
Currently our PBS is a virtual machine. The disks in PBS use ZFS, and the PVE host underneath also uses ZFS. Compression and checksums are disabled on ZFS.
Disks: HP MB6000GEBTP, 6 TB, 7.2K RPM, SATA 6 Gbps, LFF 3.5 inch
We recommend internal enterprise-class SSDs, which provide enough IOPS for all the random IO operations PBS requires, especially during verification and garbage collection tasks. For the filesystem, I suggest sticking to either ZFS for RAID configurations with multiple disks, or ext4/xfs, as these are the officially supported ones.
Which is better, ext4 or xfs? We are already thinking about enterprise-class SSDs, but so far we only have HDDs.
 
