[SOLVED] Windows VM freezes in the backup process

Aug 3, 2023
We're having a problem with a VM freezing while the backup process is running. The VM in question runs Windows Server 2016 with IIS. We have more than 500 VMs running various operating systems, and none of the others shows this symptom.

Below are the VM configuration and the PVE host versions:

Code:
agent: 1
boot: order=scsi0;ide2
ciuser: administrador
cores: 8
cpu: host
ide2: none,media=cdrom
kvm: 1
machine: pc-q35-8.0
memory: 24576
meta: creation-qemu=7.2.0,ctime=1696525080
net0: virtio=AE:0F:B8:85:D4:4C,bridge=vmbr0,tag=144
numa: 0
ostype: win10
scsi0: stor01-vms:vm-100-disk-0,cache=writeback,discard=on,iothread=1,size=1300G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=d7df60d3-6c6b-4ed9-bbd2-8cd79ab62cdf
sockets: 1
vga: std
vmgenid: 25447e49-c538-480d-b08f-15c000058ae4


Code:
proxmox-ve: 8.0.2 (running kernel: 6.2.16-15-pve)
pve-manager: 8.0.4 (running version: 8.0.4/d258a813cfa6b390)
pve-kernel-6.2: 8.0.5
proxmox-kernel-helper: 8.0.3
proxmox-kernel-6.2.16-15-pve: 6.2.16-15
proxmox-kernel-6.2: 6.2.16-15
pve-kernel-6.2.16-3-pve: 6.2.16-3
ceph: 17.2.6-pve1+3
ceph-fuse: 17.2.6-pve1+3
corosync: 3.1.7-pve3
criu: 3.17.1-2
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx4
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-4
libknet1: 1.25-pve1
libproxmox-acme-perl: 1.4.6
libproxmox-backup-qemu0: 1.4.0
libproxmox-rs-perl: 0.3.0
libpve-access-control: 8.0.4
libpve-apiclient-perl: 3.3.1
libpve-common-perl: 8.0.7
libpve-guest-common-perl: 5.0.3
libpve-http-server-perl: 5.0.4
libpve-network-perl: 0.8.1
libpve-rs-perl: 0.8.4
libpve-storage-perl: 8.0.2
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 5.0.2-4
lxcfs: 5.0.3-pve3
novnc-pve: 1.4.0-2
proxmox-backup-client: 3.0.2-1
proxmox-backup-file-restore: 3.0.2-1
proxmox-kernel-helper: 8.0.3
proxmox-mail-forward: 0.2.0
proxmox-mini-journalreader: 1.4.0
proxmox-widget-toolkit: 4.0.9
pve-cluster: 8.0.2
pve-container: 5.0.4
pve-docs: 8.0.5
pve-edk2-firmware: 3.20230228-4
pve-firewall: 5.0.3
pve-firmware: 3.8-2
pve-ha-manager: 4.0.2
pve-i18n: 3.0.7
pve-qemu-kvm: 8.0.2-6
pve-xtermjs: 4.16.0-3
qemu-server: 8.0.7
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.1.13-pve1

All backups go to a PBS host. The backup job completes without errors, but while it runs the VM freezes and its services stop responding; as soon as the backup finishes or is stopped, the VM recovers instantly. I have already tried the following, without success:

Back up to a non-PBS repository;
Change the machine type to q35;
Use the VirtIO SCSI single controller;
Disable the QEMU Guest Agent.
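
To line up the freeze with the backup task log, it can help to record exactly when the guest stops answering. The following is a minimal sketch, not something used in this thread: the `probe` and `watch` helpers and the host and port you pass in are hypothetical. It assumes a plain TCP connect is a good enough liveness signal; a successful connect only shows that the guest's network stack answers, so an actual HTTP request would be needed to test IIS itself.

```python
import socket
import time
from datetime import datetime
from typing import Optional


def probe(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in seconds, or None if the connect fails or times out."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # covers refused connections and timeouts
        return None


def watch(host: str, port: int, interval: float = 1.0, rounds: int = 60) -> None:
    """Print one timestamped line per probe so stalls can be matched to the backup log."""
    for _ in range(rounds):
        latency = probe(host, port)
        status = "no response" if latency is None else f"{latency * 1000:.1f} ms"
        print(f"{datetime.now().isoformat(timespec='seconds')} {host}:{port} {status}")
        time.sleep(interval)
```

Running something like `watch("192.0.2.10", 443)` (a placeholder address) from another machine while the backup job runs would give a simple timeline of when the VM stops and resumes responding.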

This VM was migrated from a VMware cluster using 'qm importdisk', just like other VMs that work normally. The only difference is its relatively large disk: 1.3T. The Proxmox cluster consists of 4 Dell R640 hosts with Ceph and high-performance SSDs.
 
In the Application log:

Code:
Volume Shadow Copy Service error: Unexpected error querying for the IVssWriterCallback interface.  hr = 0x80070005, Access is denied.
. This is often caused by incorrect security settings in either the writer or requestor process.

Operation:
   Gathering Writer Data

Context:
   Writer Class Id: {e8132975-6f93-4464-a53e-1050253ae220}
   Writer Name: System Writer
   Writer Instance ID: {ec5c286d-ff27-480f-aca2-753bc6e6b28c}

In the System log:

Code:
The IO operation at logical block address 0x20275630 for Disk 0 (PDO name: \Device\0000001e) was retried.
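
To see whether retried IOs cluster in one region of the disk, the logical block address in such an event can be converted to a byte offset. This is a small sketch under one loud assumption: the guest sees 512-byte logical sectors (the thread does not state the sector size). The helper name `retried_io_offset` is made up for illustration.

```python
import re

SECTOR_BYTES = 512  # assumption: logical sector size reported to the guest


def retried_io_offset(message: str, sector_bytes: int = SECTOR_BYTES):
    """Extract the LBA from a 'was retried' disk event and return (lba, byte_offset)."""
    match = re.search(r"logical block address (0x[0-9a-fA-F]+)", message)
    if match is None:
        return None
    lba = int(match.group(1), 16)
    return lba, lba * sector_bytes


event = (r"The IO operation at logical block address 0x20275630 "
         r"for Disk 0 (PDO name: \Device\0000001e) was retried.")
lba, offset = retried_io_offset(event)
print(f"LBA {lba} is about {offset / 2**30:.1f} GiB into the disk")
```

With 512-byte sectors this address lies roughly 257 GiB into the disk, well inside the 1300G volume. Retries spread across many different offsets would point toward general IO pressure rather than one bad region.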
 
Have you installed all drivers from the VirtIO ISO in the VM? Or maybe there are still unknown devices in the device manager? Were you able to find anything in the Event Viewer on the VM?
 
1.) allow local permission for Network Service in Windows Component Service/COM-Security/Access Permission
2.) run a full offline chkdsk on the related disk.
3.) update the AHCI-driver
 
@sb-jw I've installed all the VirtIO drivers, version 0.1.240, and the QEMU Guest Agent is running too. The only unknown device in the device manager is the 'HID Button over Interrupt Driver' (ACPI\VEN_ACPI&DEV_0010), but this seems to be common in Windows Server 2016; even a clean installation with VirtIO drivers shows it as unknown.

The Event Viewer showed some VSS errors, so I followed the suggestion of @ITT:
Code:
1.) allow local permission for Network Service in Windows Component Service/COM-Security/Access Permission

After that, a different error appeared:

Code:
Cryptographic Services failed while processing the OnIdentity() call in the System Writer Object.

Details:
AddLegacyDriverFiles: Microsoft Link-Layer Discovery Protocol binary image could not be backed up.

System error:
Access was denied.

I managed to resolve that error by following this Microsoft procedure: https://learn.microsoft.com/en-us/t...p-and-storage/event-id-513-vss-windows-server

Now the Event Viewer shows no errors while the backup runs, and I can see VSS taking action, but the VM still freezes.

Code:
2.) run a full offline chkdsk on the related disk.

I had already run this with the disk offline, with no change.

Code:
3.) update the AHCI-driver

This type of driver is usually supplied by the motherboard manufacturer, so how would it be updated in a VM, especially one whose disk is attached via SCSI? I couldn't find a driver to update.
 
After much investigation, I found a scheduled task that ran a script every minute. The script changed the permissions on a directory containing a large number of files, which was already hurting the VM's performance; during the backup process it caused heavy IO contention.
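
For anyone chasing a similar problem, one way to spot such a task is to export the task list inside the guest with `schtasks /query /fo CSV /v` and filter it for short repeat intervals. The sketch below is an illustration, not the method used in this thread. It assumes an English-locale export with the `TaskName`, `Task To Run` and `Repeat: Every` columns and interval text of the form `0 Hour(s), 1 Minute(s)`; the helper names and sample rows are hypothetical.

```python
import csv
import io
import re


def minute_interval(repeat_every: str):
    """Convert text such as '0 Hour(s), 1 Minute(s)' to minutes, or None if it does not match."""
    match = re.search(r"(\d+)\s*Hour\(s\),\s*(\d+)\s*Minute\(s\)", repeat_every)
    if match is None:
        return None
    return int(match.group(1)) * 60 + int(match.group(2))


def frequent_tasks(csv_text: str, max_minutes: int = 5):
    """Return (task name, command, interval) for tasks repeating at most every max_minutes."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        interval = minute_interval(row.get("Repeat: Every", "") or "")
        if interval is not None and 0 < interval <= max_minutes:
            hits.append((row.get("TaskName"), row.get("Task To Run"), interval))
    return hits


# Hypothetical sample rows, trimmed to the three columns used above.
sample = (
    '"TaskName","Task To Run","Repeat: Every"\n'
    '"\\FixPermissions","C:\\Scripts\\fix-acl.cmd","0 Hour(s), 1 Minute(s)"\n'
    '"\\NightlyCleanup","C:\\Scripts\\cleanup.cmd","Disabled"\n'
)
for name, command, minutes in frequent_tasks(sample):
    print(f"{name} runs {command} every {minutes} minute(s)")
```

Rows whose interval text does not match the assumed format are skipped, so the result is only a starting list to review by hand.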
 