Issue with backup: Windows 2008 R2 guest becomes unresponsive after backup

Mar 2, 2020
Hi,

I have an issue with a Windows 2008 R2 server running on PVE 6.x:

*** THIS ISSUE APPEARED AFTER UPGRADING PROXMOX v5 TO PROXMOX v6 ***

Here is the information concerning the host:

proxmox-ve: 6.1-2 (running kernel: 5.3.13-1-pve)
pve-manager: 6.1-7 (running version: 6.1-7/13e58d5e)
pve-kernel-5.3: 6.1-3
pve-kernel-helper: 6.1-3
pve-kernel-4.15: 5.4-11
pve-kernel-5.3.13-3-pve: 5.3.13-3
pve-kernel-5.3.13-1-pve: 5.3.13-1
pve-kernel-5.3.10-1-pve: 5.3.10-1
pve-kernel-4.15.18-23-pve: 4.15.18-51
pve-kernel-4.15.18-10-pve: 4.15.18-32
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.14-pve1
libpve-access-control: 6.0-6
libpve-apiclient-perl: 3.0-2
libpve-common-perl: 6.0-11
libpve-guest-common-perl: 3.0-3
libpve-http-server-perl: 3.0-4
libpve-storage-perl: 6.1-4
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 3.2.1-1
lxcfs: 3.0.3-pve60
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-3
pve-cluster: 6.1-4
pve-container: 3.0-19
pve-docs: 6.1-4
pve-edk2-firmware: 2.20191127-1
pve-firewall: 4.0-10
pve-firmware: 3.0-4
pve-ha-manager: 3.0-8
pve-i18n: 2.0-4
pve-qemu-kvm: 4.1.1-2
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-5
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1

Here is the configuration of the involved VM for the issue:

root@pegasus:~# more /etc/pve/qemu-server/100.conf
agent: 1
balloon: 40960
bootdisk: virtio0
cores: 4
cpu: host
ide0: ISOs:iso/virtio-win-0.1.173.iso,media=cdrom,size=385062K
memory: 51200
name: SBSAB
net0: e1000=26:52:11:19:37:26,bridge=vmbr0
numa: 0
onboot: 1
ostype: win7
scsihw: virtio-scsi-single
smbios1: uuid=a5ae47d4-bd51-464a-8664-b98275699763
sockets: 2
vga: memory=16
virtio0: local-lvm:vm-100-disk-0,cache=writeback,size=220G
virtio1: local-lvm:vm-100-disk-1,cache=writeback,size=400G
virtio2: local-lvm:vm-100-disk-2,cache=writeback,size=76G
virtio3: local-lvm:vm-100-disk-3,cache=writeback,size=126G
virtio4: local-lvm:vm-100-disk-4,cache=writeback,size=106G
virtio5: local-lvm:vm-100-disk-5,cache=writeback,size=76G
virtio6: local-lvm:vm-100-disk-6,cache=writeback,size=70G
virtio8: local-lvm:vm-100-disk-7,cache=writeback,size=280G
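
As a quick cross-check of the configuration above, here is a minimal Python sketch that sums the `size=` values of the virtio disks. It assumes the sizes are in GiB (the `G` suffix) and only looks at the `virtioN` lines pasted above; the names `CONFIG`, `DISK_RE` and `disk_sizes_gib` are made up for this illustration.

```python
import re

# Disk lines copied from the 100.conf shown above.
CONFIG = """\
virtio0: local-lvm:vm-100-disk-0,cache=writeback,size=220G
virtio1: local-lvm:vm-100-disk-1,cache=writeback,size=400G
virtio2: local-lvm:vm-100-disk-2,cache=writeback,size=76G
virtio3: local-lvm:vm-100-disk-3,cache=writeback,size=126G
virtio4: local-lvm:vm-100-disk-4,cache=writeback,size=106G
virtio5: local-lvm:vm-100-disk-5,cache=writeback,size=76G
virtio6: local-lvm:vm-100-disk-6,cache=writeback,size=70G
virtio8: local-lvm:vm-100-disk-7,cache=writeback,size=280G
"""

# Hypothetical helper pattern: captures the disk key and its size in G.
DISK_RE = re.compile(r"^(virtio\d+): .*?size=(\d+)G$", re.MULTILINE)


def disk_sizes_gib(config_text):
    """Map each virtio disk key to its configured size (assumed GiB)."""
    return {name: int(size) for name, size in DISK_RE.findall(config_text)}


sizes = disk_sizes_gib(CONFIG)
total_gib = sum(sizes.values())
total_bytes = total_gib * 1024**3  # assumption: G means GiB

print(f"{len(sizes)} disks, {total_gib} GiB, {total_bytes} bytes")
```

The resulting byte count can be compared with the total that vzdump reports in its status lines, which shows that the job reads every configured disk in full.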

Here is the description of the issue:

The guest OS is Windows SBS 2008 R2 (2011).

The backup runs in snapshot mode with the guest agent installed, because the VM has to stay online: it serves connections from many countries around the world 24/7. During the backup, the VM becomes unresponsive.
The backup target is a Synology NAS shared over NFS (mount point /mnt/pve/NAS/pegasus) on a 1 GbE connection (to be upgraded soon to a 10 GbE NIC).

We understand that the slow backup speed is a consequence of the NIC speed.

The problem is that once the backup has finished, the VM stays unresponsive. All connections from client computers (RDP, authentication, file sharing, or anything else) stop working.
When we try to open a session from the console in the Proxmox GUI, the login prompt works, but the server then hangs at "opening Desktop".
Nothing works anymore. The only option left is to stop the VM and start it again (hard reset).

We tried many things:

  • Update Proxmox packages
  • Update all the VirtIO drivers in the guest OS (including viostor)
  • Switch the drives from VirtIO SCSI single to VirtIO Block
  • Remove C: from the backup in case of a lock problem

Here is the log of the backup:

INFO: starting new backup job: vzdump 201 100 --compress lzo --quiet 1 --mailnotification always --mailto alert.backup@xxxxxxxx.com,admin.xxxxxx@xxxxxxx.com --mode snapshot --storage NAS
INFO: Starting Backup of VM 100 (qemu)
INFO: Backup started at 2020-03-01 22:00:02
INFO: status = running
INFO: update VM 100: -lock backup
INFO: VM Name: SBSAB
INFO: include disk 'virtio0' 'local-lvm:vm-100-disk-0' 220G
INFO: include disk 'virtio1' 'local-lvm:vm-100-disk-1' 400G
INFO: include disk 'virtio2' 'local-lvm:vm-100-disk-2' 76G
INFO: include disk 'virtio3' 'local-lvm:vm-100-disk-3' 126G
INFO: include disk 'virtio4' 'local-lvm:vm-100-disk-4' 106G
INFO: include disk 'virtio5' 'local-lvm:vm-100-disk-5' 76G
INFO: include disk 'virtio6' 'local-lvm:vm-100-disk-6' 70G
INFO: include disk 'virtio8' 'local-lvm:vm-100-disk-7' 280G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/NAS/dump/vzdump-qemu-100-2020_03_01-22_00_02.vma.lzo'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'ebaca0bb-8ae4-45c5-8fcc-864390d33b80'
INFO: status: 0% (950665216/1453846429696), sparse 0% (43991040), duration 3, read/write 316/302 MB/s
INFO: status: 1% (14573436928/1453846429696), sparse 0% (231915520), duration 236, read/write 58/57 MB/s
INFO: status: 2% (29124001792/1453846429696), sparse 0% (558915584), duration 444, read/write 69/68 MB/s
[.....]
INFO: status: 98% (1424931749888/1453846429696), sparse 37% (544851730432), duration 16046, read/write 312/0 MB/s
INFO: status: 99% (1439552110592/1453846429696), sparse 38% (559472091136), duration 16092, read/write 317/0 MB/s
INFO: status: 100% (1453846429696/1453846429696), sparse 39% (573766406144), duration 16137, read/write 317/0 MB/s
INFO: transferred 1453846 MB in 16137 seconds (90 MB/s)
INFO: archive file size: 593.23GB
INFO: delete old backup '/mnt/pve/NAS/dump/vzdump-qemu-100-2020_02_16-00_00_02.vma.lzo'
INFO: Finished Backup of VM 100 (04:29:17)
INFO: Backup finished at 2020-03-02 02:29:19
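
To sanity-check the rates in the log above, here is a minimal Python sketch that parses the first three status lines and recomputes the read rate between them, plus the overall average. It assumes vzdump counts 1 MB as 10^6 bytes (which is consistent with the "transferred 1453846 MB" line); `STATUS_RE`, `parse_status` and `interval_rate_mb_s` are names invented for this example.

```python
import re

# Matches vzdump progress lines of the form shown in the log above.
STATUS_RE = re.compile(
    r"INFO: status: (\d+)% \((\d+)/(\d+)\), sparse \d+% \((\d+)\), "
    r"duration (\d+), read/write (\d+)/(\d+) MB/s"
)


def parse_status(line):
    """Return the numeric fields of one status line, or None if it does not match."""
    m = STATUS_RE.match(line)
    if m is None:
        return None
    keys = ("percent", "done", "total", "sparse", "duration", "read", "write")
    return dict(zip(keys, map(int, m.groups())))


def interval_rate_mb_s(prev, cur):
    """Average read rate in MB/s (assumed 10**6 bytes) between two status lines."""
    return (cur["done"] - prev["done"]) / 1e6 / (cur["duration"] - prev["duration"])


LOG = [
    "INFO: status: 0% (950665216/1453846429696), sparse 0% (43991040), duration 3, read/write 316/302 MB/s",
    "INFO: status: 1% (14573436928/1453846429696), sparse 0% (231915520), duration 236, read/write 58/57 MB/s",
    "INFO: status: 2% (29124001792/1453846429696), sparse 0% (558915584), duration 444, read/write 69/68 MB/s",
]

samples = [s for s in map(parse_status, LOG) if s is not None]
for prev, cur in zip(samples, samples[1:]):
    rate = interval_rate_mb_s(prev, cur)
    print(f"{prev['percent']}% -> {cur['percent']}%: {rate:.1f} MB/s read")

# Overall average from the final totals in the log.
overall = 1453846429696 / 1e6 / 16137
print(f"overall: {overall:.1f} MB/s")
```

For comparison, a 1 Gbit/s link carries at most 125 MB/s before protocol overhead, so the write rates reported while real data is being sent stay well below the theoretical limit of the link.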

Screenshot of the `vssadmin list providers` command output: Capture d’écran 2020-03-02 à 17.03.54.png

We are available to provide any further information you may need.

Thank you for your help,

Laurent
 

tim

Proxmox Staff Member
Oct 1, 2018
As a quick fix, I would suggest using CIFS or another storage (e.g. local) and then moving the backup to your backup share. There are several different problems that we are currently investigating, but in general a slow NFS connection will slow down your VM during the backup. Depending on what is going on in your VM during backups, this can lead to significant performance problems for your users.

https://bugzilla.proxmox.com/show_bug.cgi?id=2554
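
The "back up locally, then move" workaround could be scripted roughly as follows. This is only a sketch under stated assumptions: it reuses the vzdump options from the job in the original post but targets the `local` storage, it assumes that storage allows backups and has enough free space for the archive, and the helper names (`build_vzdump_cmd`, `newest_archive`, `backup_then_move`) are invented for this example. The dump directories are passed in by the caller, since they depend on the storage configuration.

```python
import glob
import os
import shutil
import subprocess


def build_vzdump_cmd(vmid, storage="local"):
    """Same options as the job in the log, but writing to a local storage."""
    return ["vzdump", str(vmid), "--mode", "snapshot",
            "--compress", "lzo", "--storage", storage]


def newest_archive(dump_dir, vmid):
    """Return the most recently modified vzdump archive for this VM, or None."""
    pattern = os.path.join(dump_dir, f"vzdump-qemu-{vmid}-*.vma.lzo")
    archives = glob.glob(pattern)
    return max(archives, key=os.path.getmtime) if archives else None


def backup_then_move(vmid, local_dump_dir, share_dump_dir):
    """Run the backup against local storage, then move the archive to the share.

    The VM is only involved in the first (local, fast) step; the slow copy
    to the NFS share happens afterwards, outside the backup job.
    """
    subprocess.run(build_vzdump_cmd(vmid), check=True)
    archive = newest_archive(local_dump_dir, vmid)
    if archive is None:
        raise FileNotFoundError(f"no archive for VM {vmid} in {local_dump_dir}")
    return shutil.move(archive, share_dump_dir)
```

On the host this might be called as `backup_then_move(100, "/var/lib/vz/dump", "/mnt/pve/NAS/dump")`, where the first path is assumed to be the default dump directory of the `local` storage and the second is the NAS dump directory seen in the backup log.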
 
