VM Backup Failure - Repeatable

RussellNS

Member
Feb 17, 2021
I'm experiencing a repeatable failure when trying to back up a VM template. I'm using Proxmox v6.3-3, with the latest updates applied this morning.

I'm performing these steps:

  1. Create a VM, any VM OS, any VM configuration (in my case: Win2019, OVMF/UEFI).
  2. Backup the VM (succeeds 100% of the time)
  3. Convert the VM to a template.
  4. Backup the VM template (succeeds 100%).
  5. Create Notes for the VM Template.
    1. VM Template -> Summary -> Notes -> Gear
    2. Type 'foo' and click OK.
  6. Backup the VM template (fails 100% of the time).
  7. Delete the note that was created.
  8. Backup the VM template (fails 100%).
  9. Clone VM from template.
  10. Backup cloned VM (succeeds 100%).
When I get a backup failure, this is the output:

Code:
INFO: starting new backup job: vzdump 102 --compress zstd --mode stop --remove 0 --node proxmox --storage Tier3-NAS
INFO: Starting Backup of VM 102 (qemu)
INFO: Backup started at 2021-02-17 13:46:28
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: Win2019-2102-Std-Template
INFO: include disk 'scsi0' 'Tier1-NVMe:102/base-102-disk-1.qcow2' 16G
INFO: include disk 'efidisk0' 'Tier1-NVMe:102/base-102-disk-0.qcow2' 128K
INFO: creating vzdump archive '/vm/nas/dump/vzdump-qemu-102-2021_02_17-13_46_28.vma.zst'
INFO: starting template backup
INFO: /usr/bin/vma create -v -c /vm/nas/dump/vzdump-qemu-102-2021_02_17-13_46_28.tmp/qemu-server.conf exec:zstd --rsyncable --threads=1 > /vm/nas/dump/vzdump-qemu-102-2021_02_17-13_46_28.vma.dat drive-efidisk0=/vm/nvme/images/102/base-102-disk-0.qcow2 drive-scsi0=/vm/nvme/images/102/base-102-disk-1.qcow2
INFO: vma: vma_writer_register_stream 'drive-scsi0' failed
ERROR: Backup of VM 102 failed - command '/usr/bin/vma create -v -c /vm/nas/dump/vzdump-qemu-102-2021_02_17-13_46_28.tmp/qemu-server.conf 'exec:zstd --rsyncable --threads=1 > /vm/nas/dump/vzdump-qemu-102-2021_02_17-13_46_28.vma.dat' 'drive-efidisk0=/vm/nvme/images/102/base-102-disk-0.qcow2' 'drive-scsi0=/vm/nvme/images/102/base-102-disk-1.qcow2'' failed: got signal 5
INFO: Failed at 2021-02-17 13:46:28
INFO: Backup job finished with errors
TASK ERROR: job errors
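
For anyone reading a log like the one above, here is a minimal sketch of pulling the key facts out of it: the VM ID, the signal number, and the message printed by `vma`. The function name `summarize_failure` and the shortened sample log are made up for illustration (the command line is abbreviated with `...`), and the regular expressions only assume the line shapes visible in this thread. On Linux, signal 5 is SIGTRAP.

```python
import re
import signal

# Shortened copy of the failing task log from this thread.
# The vma command line is abbreviated with "..." for readability.
SAMPLE_LOG = """\
INFO: starting template backup
INFO: vma: vma_writer_register_stream 'drive-scsi0' failed
ERROR: Backup of VM 102 failed - command '/usr/bin/vma create ...' failed: got signal 5
INFO: Failed at 2021-02-17 13:46:28
TASK ERROR: job errors
"""


def summarize_failure(log_text):
    """Return (vmid, signal_number, signal_name, vma_messages), or None if no failure line is found."""
    match = re.search(
        r"^ERROR: Backup of VM (\d+) failed - .*got signal (\d+)$",
        log_text,
        re.MULTILINE,
    )
    if match is None:
        return None
    vmid = int(match.group(1))
    signum = int(match.group(2))
    try:
        # Map the number to its symbolic name on the platform running this script.
        name = signal.Signals(signum).name
    except ValueError:
        name = "unknown"
    # Collect whatever the vma tool itself logged before it died.
    vma_messages = re.findall(r"^INFO: vma: (.*)$", log_text, re.MULTILINE)
    return vmid, signum, name, vma_messages


result = summarize_failure(SAMPLE_LOG)
print(result)
```

The signal name only says how the process died; the `vma:` line just before the error is usually the more useful clue, here pointing at the `drive-scsi0` stream.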

I don't know why adding notes to a VM template would cause its backup to fail, or what changes behind the scenes such that removing the note doesn't fix the issue. I have found that I can create a clone from the template, and that clone will back up fine. I'm a bit lost.

Any help or direction would be appreciated. Thank you in advance.
 
hi,

i've tried with the steps you've provided, same setup with win2019 and ovmf uefi, however it seems to all work fine here.

could you please post:
* pveversion -v
* qm config VMID
* and the backup logs from tasklog
 
I thought I was performing due diligence by distilling the issue down to its basic steps. However, step 1, "Create a VM", actually includes multiple pre-steps (multiple snapshots and a clone to test). These extra steps let me roll back to one of several points during the Windows sysprep process to make changes before cloning a new VM to use as a template.

The only other technical complexity in the process that I can think of at this time is that the VM resides on NVMe storage, and the backup is going to NAS storage. I didn't think that mattered, so I didn't include it in the original steps.

I'll work a bit harder at modifying the original steps to recreate the issue 100% of the time.

In the meantime, here's the information you requested:

pveversion -v
Code:
root@proxmox:~# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.1.0-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: residual config
ifupdown2: 3.0.0-1+pve3
libjs-extjs: 6.0.1-10
libknet1: 1.20-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-3
libpve-guest-common-perl: 3.1-4
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-6
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.6-2
lxcfs: 4.0.6-pve1
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.8-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-5
pve-cluster: 6.2-1
pve-container: 3.3-3
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.2-1
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-8
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-5
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1

qm config VMID
Code:
agent: 1
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 1
description: VM Template for Windows Server 2019 Standard.  Includes%3A%0A  *  Updates for Feb, 2021 (2102)%0A  *  Google Chrome%0A  *  Notepad++%0A  *  7-Zip%0A  *  Java 8 (32-bit & 64-bit)%0A  *  Acrobat Reader
efidisk0: Tier1-NVMe:102/base-102-disk-0.qcow2,size=128K
ide2: none,media=cdrom
machine: q35
memory: 2048
name: Win2019-2102-Std-Template
net0: virtio=FE:4D:C3:5E:A2:96,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: Tier1-NVMe:102/base-102-disk-1.qcow2,cache=writeback,discard=on,size=16G
scsihw: virtio-scsi-pci
smbios1: uuid=9970b979-5b8c-4e36-89f2-72714429224b
sockets: 1
template: 1
vga: virtio
vmgenid: a28cab09-e275-4475-a3f8-68646f8ebdcc
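
A side note on the `description` line above: the Notes text appears to be percent-encoded in the `qm config` output (`%3A` for a colon, `%0A` for a newline). A minimal sketch of decoding it with Python's standard library, assuming plain percent-encoding is all that is applied:

```python
from urllib.parse import unquote

# The description value copied from the qm config output in this post.
raw = (
    "VM Template for Windows Server 2019 Standard.  Includes%3A%0A"
    "  *  Updates for Feb, 2021 (2102)%0A"
    "  *  Google Chrome%0A"
    "  *  Notepad++%0A"
    "  *  7-Zip%0A"
    "  *  Java 8 (32-bit & 64-bit)%0A"
    "  *  Acrobat Reader"
)

# unquote() turns %XX escapes back into characters; it leaves '+' and '&' alone.
decoded = unquote(raw)
print(decoded)
```

This only makes the notes readable as they were typed in the GUI; it does not change anything in the VM configuration.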

backup logs from tasklog
  • I think I'm doing this correctly - using 'cat /var/log/pve/tasks/UPID:proxmox:####:####:####:vzdump:102:acct@pam:'
Code:
INFO: starting new backup job: vzdump 102 --compress zstd --storage Tier3-NAS --node proxmox --remove 0 --mode stop
INFO: Starting Backup of VM 102 (qemu)
INFO: Backup started at 2021-02-18 08:38:44
INFO: status = stopped
INFO: backup mode: stop
INFO: ionice priority: 7
INFO: VM Name: Win2019-2102-Std-Template
INFO: include disk 'scsi0' 'Tier1-NVMe:102/base-102-disk-1.qcow2' 16G
INFO: include disk 'efidisk0' 'Tier1-NVMe:102/base-102-disk-0.qcow2' 128K
INFO: creating vzdump archive '/vm/nas/dump/vzdump-qemu-102-2021_02_18-08_38_44.vma.zst'
INFO: starting template backup
INFO: /usr/bin/vma create -v -c /vm/nas/dump/vzdump-qemu-102-2021_02_18-08_38_44.tmp/qemu-server.conf exec:zstd --rsyncable --threads=1 > /vm/nas/dump/vzdump-qemu-102-2021_02_18-08_38_44.vma.dat drive-efidisk0=/vm/nvme/images/102/base-102-disk-0.qcow2 drive-scsi0=/vm/nvme/images/102/base-102-disk-1.qcow2
INFO: vma: vma_writer_register_stream 'drive-scsi0' failed
ERROR: Backup of VM 102 failed - command '/usr/bin/vma create -v -c /vm/nas/dump/vzdump-qemu-102-2021_02_18-08_38_44.tmp/qemu-server.conf 'exec:zstd --rsyncable --threads=1 > /vm/nas/dump/vzdump-qemu-102-2021_02_18-08_38_44.vma.dat' 'drive-efidisk0=/vm/nvme/images/102/base-102-disk-0.qcow2' 'drive-scsi0=/vm/nvme/images/102/base-102-disk-1.qcow2'' failed: got signal 5
INFO: Failed at 2021-02-18 08:38:44
INFO: Backup job finished with errors
TASK ERROR: job errors
 
Ok, so it turns out that the VM template notes were a red herring. It just so happened that I had two templates that were on the edge of this issue, and updating the notes was enough to trigger whatever the root cause is.

So I tried to recreate a set of steps where I can replicate the issue 100% of the time from scratch. Here's what I'm doing:

  1. Create a VM with these settings:
    • General
      • Name: Your VM Name
      • Start at boot: Unchecked
      • Defaults for the rest
    • OS
      • Your OS ISO
      • Match Guest OS Settings
    • System
      • SCSI Controller: VirtIO SCSI
      • BIOS: OVMF (UEFI)
      • Add EFI Disk: Checked
      • Storage: Same as VM
      • Format: qcow2
      • Machine: q35
      • Defaults for the rest
    • Hard Disk
      • Bus/Device: SCSI:0
      • Storage: Same as EFI Disk
      • Format: qcow2
      • Backup: Checked
      • Defaults for the rest
    • CPU
      • All Defaults
    • Memory
      • All Defaults
    • Network
      • All Defaults
    • Confirm
      • Start after created: Unchecked
  2. Immediately Backup the VM (succeeds 100%).
    • No need to install the OS, or to even turn the VM on at all.
    • Backup now button
      • Mode: Stop
      • Compression: ZSTD
  3. Convert the VM to template.
  4. Backup the VM template (fails 100%, always with the error 'signal 5').
    • Backup now button
      • Mode: Stop
      • Compression: ZSTD
I've tried that with Win2019 and CentOS 8.2. The OS doesn't seem to matter. It doesn't even seem to matter if the VM is ever powered on.

Also, I've tried this exact same thing using SeaBIOS, and it never fails. Step 4 succeeds 100% of the time when the VM BIOS is set to SeaBIOS. It only ever seems to fail for me in this configuration, using UEFI and VirtIO SCSI. I haven't tried other storage controllers with the UEFI BIOS to see whether they succeed the way SeaBIOS does. But for me, this seems like a pretty small set of steps that reliably reproduces the issue I'm seeing.
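
The GUI steps above could probably be scripted for a quicker test. The following is only a dry-run sketch: it builds and prints the command lines without running them. The `vzdump` options are the ones shown in the task log; the `qm create` options are what I believe the CLI equivalents of the wizard settings to be, so check them against `man qm` first. The VM ID 9000, the name `backup-repro`, and the function `repro_commands` are hypothetical.

```python
import shlex


def repro_commands(vmid, vm_storage, backup_storage, bios="ovmf"):
    """Build (but do not run) the create / backup / template / backup sequence."""
    create = [
        "qm", "create", str(vmid),
        "--name", "backup-repro",          # hypothetical VM name
        "--bios", bios,
        "--machine", "q35",
        "--scsihw", "virtio-scsi-pci",
        "--scsi0", f"{vm_storage}:16,format=qcow2",
    ]
    if bios == "ovmf":
        # An EFI disk is only added for the UEFI case, as in the wizard steps.
        create += ["--efidisk0", f"{vm_storage}:1,format=qcow2"]
    backup = [
        "vzdump", str(vmid),
        "--mode", "stop",
        "--compress", "zstd",
        "--storage", backup_storage,
    ]
    template = ["qm", "template", str(vmid)]
    # Step 2 (backup the VM) and step 4 (backup the template) use the same command.
    return [create, list(backup), template, list(backup)]


commands = repro_commands(9000, "Tier1-NVMe", "Tier3-NAS")
for cmd in commands:
    print(shlex.join(cmd))
```

Calling `repro_commands(9000, "Tier1-NVMe", "Tier3-NAS", bios="seabios")` gives the SeaBIOS variant for comparison, without the EFI disk.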

Lastly, I forgot to specifically say thank you to @oguz. Thank you very much for any and all input.
 
ISSUE FIXED!

Ok, so final update, I guess. I've recently updated my Proxmox server node to Proxmox v6.3-4, and that seems to have fixed the issue. I can now back up the very same VM templates that had previously failed under Proxmox v6.3-3.

Thank you to @oguz, and the entire Proxmox team!
 
