Windows server VM - restore won't boot

I have a Windows Server 2016 VM running on Proxmox VE 6.2-12 which was backed up to PBS successfully using "snapshot" mode. There were no errors during the original backup or the incremental backups. PBS is running on a dedicated server and the datastore is in a directory on a hardware RAID volume (not ZFS). All backups for this VM verify successfully. A test restore completes successfully; however, when booting the VM I see the blue Windows logo and the spinning dots, but then all I get is a black screen. I've tried all the usual repairs (disk repair, sfc /scannow, rebuilding the BCD, MBR, etc.), but I'm not able to get the VM to work by any means other than reinstalling Windows.

The VM was stopped and started after the last qemu-server update and that didn't make a difference.

On Proxmox VE, the following proxmox-backup-client versions have been used to back up this VM:
0.8.21-1
0.9.0-2
0.9.1-1

On PBS, the following proxmox-backup-server versions have been used to back up this VM:
0.8.16-1
0.9.0-2
0.9.1-1

No combination of these client and server versions creates a backup that I can restore and run successfully.

I did a vzdump backup in snapshot mode to an NFS volume. I was able to restore that backup and run it successfully without any problems.

I tried backing up to PBS using a separate, new datastore but had the same "black screen" boot failure.

There seems to be a difference in how this VM is being backed up to PBS vs. by vzdump to an NFS share.

Proxmox VE server:
Code:
proxmox-ve: 6.2-2 (running kernel: 5.4.65-1-pve)
pve-manager: 6.2-12 (running version: 6.2-12/b287dd27)
pve-kernel-5.4: 6.2-7
pve-kernel-helper: 6.2-7
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.60-1-pve: 5.4.60-2
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 14.2.11-pve1
ceph-fuse: 14.2.11-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: not correctly installed
ifupdown2: 3.0.0-1+pve3
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.2-9
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 0.9.1-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.3-1
pve-cluster: 6.2-1
pve-container: 3.2-2
pve-docs: 6.2-6
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-1
pve-qemu-kvm: 5.1.0-3
pve-xtermjs: 4.7.0-2
qemu-server: 6.2-15
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.4-pve2

PBS server:
Code:
proxmox backup server: 0.9-1 (running kernel: 5.4.65-1-pve)
proxmox-backup: 1.0-4
proxmox-backup-client: 0.9.1-1
proxmox-backup-server: 0.9.1-1
proxmox-widget-toolkit: 2.3-2

There appears to be a problem with PBS making either a corrupt backup or a corrupt restore. The VM fails to boot properly from a PBS backup/restore but boots and runs properly from a vzdump to NFS backup and restore. Let me know if there are specific troubleshooting steps or information that I can provide to help figure out this PBS backup bug.
 
I am surprised to not have any input from the Proxmox staff yet, and I'm concerned that PBS will make unreliable and unbootable Windows backups in the future. Does anyone have suggestions for things to do or look at?
 
Works here.

Any special VM config? There must be something different on your side.
 
I have plenty of Windows VMs that I can back up using PBS and restore and they run fine. One (so far) that won't work after restoring when using PBS, but works fine any time (and I've tried both multiple times) I do a vzdump backup to regular backup storage (NFS, in my case).

So for this one VM, there's a difference in how a PBS backup or restore is (not) working vs using vzdump "the old way".
 
Can you post the config of that particular VM (qm config ID)?
 
Code:
root@proxmox01:~# qm config 142
agent: 1
balloon: 2048
bootdisk: scsi0
cores: 2
ide2: none,media=cdrom
memory: 8192
name: TestVM
net0: virtio=XX:XX:XX:XX:XX:XX,bridge=vmbr0,firewall=1,tag=17
numa: 0
onboot: 1
ostype: win10
protection: 1
scsi0: ssd-pool:vm-142-disk-0,cache=writeback,discard=on,size=100G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=04cb5f6f-e4f1-4f94-a562-6ddfdf465296
sockets: 1
vmgenid: 5dbfd998-6da6-41da-8e42-b9cc8381fd4f
 
I did new backups of the same VM142 over the weekend. The VM has been rebooted since the last backup/restore attempts, but the results are the same. All backups to PBS verify successfully.

Backup of VM to PBS:
Code:
Detailed backup logs:

vzdump 142 --mode snapshot --node proxmox01 --remove 0 --storage pbs01-store1


142: 2020-10-24 14:12:29 INFO: Starting Backup of VM 142 (qemu)
142: 2020-10-24 14:12:29 INFO: status = running
142: 2020-10-24 14:12:29 INFO: VM Name: TestVM
142: 2020-10-24 14:12:29 INFO: include disk 'scsi0' 'ssd-pool:vm-142-disk-0' 100G
142: 2020-10-24 14:12:29 INFO: backup mode: snapshot
142: 2020-10-24 14:12:29 INFO: ionice priority: 7
142: 2020-10-24 14:12:29 INFO: creating Proxmox Backup Server archive 'vm/142/2020-10-24T18:12:29Z'
142: 2020-10-24 14:12:29 INFO: issuing guest-agent 'fs-freeze' command
142: 2020-10-24 14:12:34 INFO: enabling encryption
142: 2020-10-24 14:12:34 INFO: issuing guest-agent 'fs-thaw' command
142: 2020-10-24 14:12:36 INFO: started backup task 'cb7b4475-a06c-4329-a54e-d5f6e5ce48cf'
142: 2020-10-24 14:12:36 INFO: resuming VM again
142: 2020-10-24 14:12:36 INFO: scsi0: dirty-bitmap status: OK (10.8 GiB of 100.0 GiB dirty)
142: 2020-10-24 14:12:36 INFO: using fast incremental mode (dirty-bitmap), 10.8 GiB dirty of 100.0 GiB total
142: 2020-10-24 14:12:39 INFO:   2% (320.0 MiB of 10.8 GiB) in  3s, read: 106.7 MiB/s, write: 93.3 MiB/s
[progress info deleted for brevity]
142: 2020-10-24 14:14:55 INFO: 100% (10.8 GiB of 10.8 GiB) in  2m 19s, read: 12.0 MiB/s, write: 12.0 MiB/s
142: 2020-10-24 14:14:55 INFO: backup is sparse: 828.00 MiB (7%) total zero data
142: 2020-10-24 14:14:55 INFO: backup was done incrementally, reused 92.27 GiB (92%)
142: 2020-10-24 14:14:55 INFO: transferred 10.83 GiB in 139 seconds (79.8 MiB/s)
142: 2020-10-24 14:14:55 INFO: Finished Backup of VM 142 (00:02:26)

And the restore from PBS to a new VM ID:
Code:
new volume ID is 'hdd-pool:vm-199-disk-0'
restore proxmox backup image: /usr/bin/pbs-restore --repository pvebackup@pbs@10.22.44.119:store1 vm/142/2020-10-24T18:12:29Z drive-scsi0.img.fidx 'rbd:hdd-pool/vm-199-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/hdd-pool.keyring' --verbose --format raw --keyfile /etc/pve/priv/storage/pbs01-store1.enc --skip-zero
connecting to repository 'pvebackup@pbs@10.22.44.119:store1'
open block backend for target 'rbd:hdd-pool/vm-199-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/hdd-pool.keyring'
starting to restore snapshot 'vm/142/2020-10-24T18:12:29Z'
download and verify backup index
progress 1% (read 1073741824 bytes, zeroes = 13% (146800640 bytes), duration 27 sec)
[progress info deleted for brevity]
progress 100% (read 107374182400 bytes, zeroes = 36% (39304822784 bytes), duration 1666 sec)
restore image complete (bytes=107374182400, duration=1666.82s, speed=61.43MB/s)
rescan volumes...
TASK OK

Here is that VM's config: (Note: the net0 link is down so that booting doesn't interfere with the original VM)
Code:
root@proxmox01:~# qm config 199
agent: 1
balloon: 2048
bootdisk: scsi0
cores: 2
ide2: none,media=cdrom
memory: 8192
name: TestVM-TEST1
net0: virtio=XX:XX:XX:XX:XX:XX,bridge=vmbr0,link_down=1,tag=17
numa: 0
ostype: win10
protection: 1
scsi0: hdd-pool:vm-199-disk-0,cache=writeback,discard=on,size=100G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=04cb5f6f-e4f1-4f94-a562-6ddfdf465296
sockets: 1
vmgenid: 2e2f0eda-6ca0-4494-9bb1-e92d714866c2

Booting that VM above (VM199) shows the initial Windows logo and spinning dots, but then a blank screen.

Below is a backup of the same VM to NFS storage:
Code:
Detailed backup logs:

vzdump 142 --compress zstd --mode snapshot --storage Synology --node proxmox01 --remove 0


142: 2020-10-24 13:35:51 INFO: Starting Backup of VM 142 (qemu)
142: 2020-10-24 13:35:51 INFO: status = running
142: 2020-10-24 13:35:51 INFO: VM Name: TestVM
142: 2020-10-24 13:35:51 INFO: include disk 'scsi0' 'ssd-pool:vm-142-disk-0' 100G
142: 2020-10-24 13:35:51 INFO: backup mode: snapshot
142: 2020-10-24 13:35:51 INFO: ionice priority: 7
142: 2020-10-24 13:35:51 INFO: creating vzdump archive '/mnt/pve/Synology/dump/vzdump-qemu-142-2020_10_24-13_35_51.vma.zst'
142: 2020-10-24 13:35:51 INFO: issuing guest-agent 'fs-freeze' command
142: 2020-10-24 13:35:57 INFO: issuing guest-agent 'fs-thaw' command
142: 2020-10-24 13:35:59 INFO: started backup task 'b2172c2a-b21c-4f3a-9605-6c89616c80db'
142: 2020-10-24 13:35:59 INFO: resuming VM again
142: 2020-10-24 13:36:02 INFO:   0% (493.0 MiB of 100.0 GiB) in  3s, read: 164.3 MiB/s, write: 116.3 MiB/s
142: 2020-10-24 13:36:08 INFO:   1% (1.1 GiB of 100.0 GiB) in  9s, read: 97.3 MiB/s, write: 89.1 MiB/s
[progress info deleted for brevity]
142: 2020-10-24 13:49:46 INFO: 100% (100.0 GiB of 100.0 GiB) in 13m 47s, read: 165.3 MiB/s, write: 51.0 MiB/s
142: 2020-10-24 13:49:46 INFO: backup is sparse: 42.24 GiB (42%) total zero data
142: 2020-10-24 13:49:46 INFO: transferred 100.00 GiB in 827 seconds (123.8 MiB/s)
142: 2020-10-24 13:49:57 INFO: archive file size: 42.87GB
142: 2020-10-24 13:49:57 INFO: Finished Backup of VM 142 (00:14:06)

And the restore of that backup from NFS as VM200:
Code:
applying read rate limit: 20480
restore vma archive: cstream -t 20971520 -- /mnt/pve/Synology/dump/vzdump-qemu-142-2020_10_24-13_35_51.vma.zst | zstd -q -d -c - | vma extract -v -r /var/tmp/vzdumptmp743423.fifo - /var/tmp/vzdumptmp743423
CFG: size: 449 name: qemu-server.conf
DEV: dev_id=1 size: 107374182400 devname: drive-scsi0
CTIME: Sat Oct 24 13:35:57 2020
new volume ID is 'hdd-pool:vm-200-disk-0'
map 'drive-scsi0' to 'rbd:hdd-pool/vm-200-disk-0:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/hdd-pool.keyring' (write zeros = 0)
progress 1% (read 1073741824 bytes, duration 28 sec)
[progress info deleted for brevity]
progress 100% (read 107374182400 bytes, duration 2194 sec)
total bytes read 107374182400, sparse bytes 45354340352 (42.2%)
space reduction due to 4K zero blocks 0.225%
rescan volumes...
TASK OK

Here is that VM's config: (Note: the net0 link is down so that booting doesn't interfere with the original VM)
Code:
root@proxmox01:~# qm config 200
agent: 1
balloon: 2048
bootdisk: scsi0
cores: 2
ide2: none,media=cdrom
memory: 8192
name: TestVM-TEST2
net0: virtio=XX:XX:XX:XX:XX:XX,bridge=vmbr0,link_down=1,tag=17
numa: 0
ostype: win10
protection: 1
scsi0: hdd-pool:vm-200-disk-0,cache=writeback,discard=on,size=100G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=04cb5f6f-e4f1-4f94-a562-6ddfdf465296
sockets: 1
vmgenid: 3c1c7664-a4ec-4537-8152-dca3f96bdd2c

Booting VM200 is successful - the VM fully loads, I get to the Windows login screen, and can log in.
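
The two restore logs report different amounts of zero data for the same 100 GiB disk (36% for PBS, 42.2% for vma). A quick arithmetic check on the figures copied from the logs is sketched below. It assumes that PBS stores VM images in fixed 4 MiB chunks; the constant `CHUNK` reflects that assumption and is not taken from the logs.

```python
# Figures copied from the final lines of the two restore logs.
DISK_BYTES = 107374182400        # 100 GiB, reported by both restores
PBS_ZERO_BYTES = 39304822784     # pbs-restore: "zeroes = 36% (39304822784 bytes)"
VMA_SPARSE_BYTES = 45354340352   # vma extract: "sparse bytes 45354340352 (42.2%)"

CHUNK = 4 * 1024 * 1024          # assumed PBS fixed chunk size (4 MiB)
BLOCK_4K = 4096                  # vma reports savings in "4K zero blocks"


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def is_multiple(value, unit):
    """True if value is an exact multiple of unit."""
    return value % unit == 0


print(f"PBS zero data: {percent(PBS_ZERO_BYTES, DISK_BYTES):.1f}%")
print(f"vma zero data: {percent(VMA_SPARSE_BYTES, DISK_BYTES):.1f}%")
print("PBS count is a multiple of 4 MiB:", is_multiple(PBS_ZERO_BYTES, CHUNK))
print("vma count is a multiple of 4 MiB:", is_multiple(VMA_SPARSE_BYTES, CHUNK))
print("vma count is a multiple of 4 KiB:", is_multiple(VMA_SPARSE_BYTES, BLOCK_4K))
print("difference in bytes:", VMA_SPARSE_BYTES - PBS_ZERO_BYTES)
```

The PBS figure is an exact multiple of 4 MiB, while the vma figure is only a multiple of 4 KiB. That is consistent with the two tools detecting zeros at different granularity, so the gap between 36% and 42.2% may not by itself indicate missing or corrupted data.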
 
I updated the available packages on a PVE host and physical PBS server yesterday and did another backup/restore test.

On PVE, these were the package changes:
Code:
2020-11-03 17:20:21 upgrade tzdata:all 2020a-0+deb10u1 2020d-0+deb10u1
2020-11-03 17:20:28 upgrade libldap-common:all 2.4.47+dfsg-3+deb10u2 2.4.47+dfsg-3+deb10u3
2020-11-03 17:20:29 upgrade libldap-2.4-2:amd64 2.4.47+dfsg-3+deb10u2 2.4.47+dfsg-3+deb10u3
2020-11-03 17:20:29 upgrade libproxmox-backup-qemu0:amd64 0.7.0-1 0.7.1-1
2020-11-03 17:20:29 upgrade proxmox-backup-client:amd64 0.9.1-1 0.9.4-1
2020-11-03 17:20:30 upgrade proxmox-widget-toolkit:all 2.3-1 2.3-6
2020-11-03 17:20:30 upgrade pve-i18n:all 2.2-1 2.2-2
2020-11-03 17:20:30 upgrade pve-qemu-kvm:amd64 5.1.0-3 5.1.0-4
2020-11-03 17:20:38 upgrade qemu-server:amd64 6.2-15 6.2-18
2020-11-03 17:20:40 upgrade pve-manager:amd64 6.2-12 6.2-15

On PBS, these were the package changes:
Code:
2020-11-03 10:15:04 upgrade libldap-common:all 2.4.47+dfsg-3+deb10u2 2.4.47+dfsg-3+deb10u3
2020-11-03 10:15:05 upgrade libldap-2.4-2:amd64 2.4.47+dfsg-3+deb10u2 2.4.47+dfsg-3+deb10u3
2020-11-03 10:15:05 upgrade proxmox-backup-client:amd64 0.9.1-1 0.9.4-1
2020-11-03 10:15:05 upgrade proxmox-backup-docs:all 0.9.1-1 0.9.4-1
2020-11-03 10:15:06 upgrade proxmox-widget-toolkit:all 2.3-2 2.3-6
2020-11-03 10:15:06 upgrade proxmox-backup-server:amd64 0.9.1-1 0.9.4-2

After the package updates, I migrated the VM that I've been testing with to another PVE host and then back to the PVE host with the updated packages. I then did a snapshot backup of the VM and restored that backup to a new VM ID. The results are the same: no errors backing up or restoring, but the restored VM is corrupt and shows a black screen after the Windows logo and spinning dots.
 
There were additional package updates on the PBS server available this morning, so I updated those:

Code:
2020-11-04 09:03:32 upgrade proxmox-backup-client:amd64 0.9.4-1 0.9.5-1
2020-11-04 09:03:33 upgrade proxmox-backup-docs:all 0.9.4-1 0.9.5-1
2020-11-04 09:03:33 upgrade proxmox-widget-toolkit:all 2.3-6 2.3-8
2020-11-04 09:03:33 upgrade proxmox-backup-server:amd64 0.9.4-2 0.9.5-1

I downloaded proxmox-widget-toolkit_2.3-8_all.deb and proxmox-backup-client_0.9.5-1_amd64.deb from the pvetest repo and installed them on the PVE host.

I did another snapshot backup to PBS:
Code:
INFO: starting new backup job: vzdump 142 --storage pbs01-store1 --remove 0 --mode snapshot --node proxmox01
INFO: Starting Backup of VM 142 (qemu)
INFO: Backup started at 2020-11-04 09:12:26
INFO: status = running
INFO: VM Name: TestVM
INFO: include disk 'scsi0' 'ssd-pool:vm-142-disk-0' 100G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/142/2020-11-04T14:12:26Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: enabling encryption
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'a463eae0-b5ef-47f1-bf76-ca0142827898'
INFO: resuming VM again
INFO: scsi0: dirty-bitmap status: OK (3.0 GiB of 100.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 3.0 GiB dirty of 100.0 GiB total
INFO:  10% (316.0 MiB of 3.0 GiB) in  3s, read: 105.3 MiB/s, write: 104.0 MiB/s
INFO:  16% (508.0 MiB of 3.0 GiB) in  6s, read: 64.0 MiB/s, write: 61.3 MiB/s
INFO:  23% (704.0 MiB of 3.0 GiB) in  9s, read: 65.3 MiB/s, write: 61.3 MiB/s
INFO:  29% (892.0 MiB of 3.0 GiB) in 12s, read: 62.7 MiB/s, write: 60.0 MiB/s
INFO:  34% (1.0 GiB of 3.0 GiB) in 15s, read: 58.7 MiB/s, write: 58.7 MiB/s
INFO:  40% (1.2 GiB of 3.0 GiB) in 18s, read: 61.3 MiB/s, write: 60.0 MiB/s
INFO:  47% (1.4 GiB of 3.0 GiB) in 21s, read: 66.7 MiB/s, write: 64.0 MiB/s
INFO:  53% (1.6 GiB of 3.0 GiB) in 24s, read: 65.3 MiB/s, write: 62.7 MiB/s
INFO:  60% (1.8 GiB of 3.0 GiB) in 27s, read: 65.3 MiB/s, write: 64.0 MiB/s
INFO:  66% (2.0 GiB of 3.0 GiB) in 30s, read: 60.0 MiB/s, write: 54.7 MiB/s
INFO:  72% (2.2 GiB of 3.0 GiB) in 33s, read: 64.0 MiB/s, write: 60.0 MiB/s
INFO:  78% (2.4 GiB of 3.0 GiB) in 36s, read: 65.3 MiB/s, write: 61.3 MiB/s
INFO:  85% (2.6 GiB of 3.0 GiB) in 39s, read: 70.7 MiB/s, write: 64.0 MiB/s
INFO:  91% (2.7 GiB of 3.0 GiB) in 42s, read: 60.0 MiB/s, write: 58.7 MiB/s
INFO:  97% (2.9 GiB of 3.0 GiB) in 45s, read: 58.7 MiB/s, write: 58.7 MiB/s
INFO: 100% (3.0 GiB of 3.0 GiB) in 47s, read: 40.0 MiB/s, write: 40.0 MiB/s
INFO: backup is sparse: 4.00 MiB (0%) total zero data
INFO: backup was done incrementally, reused 97.13 GiB (97%)
INFO: transferred 2.99 GiB in 47 seconds (65.1 MiB/s)
INFO: Finished Backup of VM 142 (00:00:53)
INFO: Backup finished at 2020-11-04 09:13:19
INFO: Backup job finished successfully
TASK OK

I did a restore from that backup. Still corrupt/broken (no difference from the past failed backup/restore using PBS).

Again I have to reiterate that this is an issue with PBS based on the successful vzdump-to-NFS-storage backups vs the failed backup/restore using PBS.
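
One way to narrow this down would be to compare a PBS-restored disk and a vzdump-restored disk chunk by chunk and see where they differ. The sketch below is only an illustration under several assumptions: both disks have been made readable as raw files or block devices (for example with `rbd export` or `rbd map`), the 4 MiB comparison size matches the assumed PBS chunk size, and the function name `diff_chunks` is hypothetical. Note that two snapshot backups of a running VM taken at different times will legitimately differ, so the cleanest comparison would use backups taken with both methods while the VM is shut down.

```python
import sys

CHUNK = 4 * 1024 * 1024  # comparison size; assumed to match PBS's fixed chunk size


def all_zero(data):
    """True if the buffer contains only zero bytes (an empty buffer counts as zero)."""
    return data.count(0) == len(data)


def diff_chunks(path_a, path_b, chunk_size=CHUNK):
    """Yield (offset, a_is_zero, b_is_zero) for every chunk that differs."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both images fully read
            if a != b:
                yield offset, all_zero(a), all_zero(b)
            offset += max(len(a), len(b))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 diffchunks.py <image-from-pbs> <image-from-vzdump>
    for offset, a_zero, b_zero in diff_chunks(sys.argv[1], sys.argv[2]):
        note = ""
        if a_zero and not b_zero:
            note = "  (first image is all zeros here)"
        elif b_zero and not a_zero:
            note = "  (second image is all zeros here)"
        print(f"differs at offset {offset}{note}")
```

Chunks that are all zeros in one image but contain data in the other would be the most interesting ones to look at, since the PBS restore ran with `--skip-zero`.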
 
Today's backup/restore test was with the latest updates for PVE:
Code:
2020-11-05 12:00:52 upgrade libpve-common-perl:all 6.2-2 6.2-4
2020-11-05 12:00:52 upgrade proxmox-backup-client:amd64 0.9.5-1 0.9.6-1
2020-11-05 12:00:53 upgrade pve-qemu-kvm:amd64 5.1.0-4 5.1.0-5
2020-11-05 12:01:00 upgrade qemu-server:amd64 6.2-18 6.2-19

...and latest updates for PBS:
Code:
2020-11-05 11:59:51 upgrade proxmox-backup-client:amd64 0.9.5-1 0.9.6-1
2020-11-05 11:59:51 upgrade proxmox-backup-docs:all 0.9.5-1 0.9.6-1
2020-11-05 11:59:52 upgrade proxmox-backup-server:amd64 0.9.5-1 0.9.6-1

Same issue as before. No errors on backup or restore but the restored VM is corrupt (black screen).
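
With this many test rounds, it can help to keep a compact record of which backup-related package versions each round used. The sketch below filters `upgrade` lines in the format posted above (presumably taken from /var/log/dpkg.log) down to the packages that plausibly take part in backup and restore. The `RELEVANT` prefix list and the function name are assumptions for illustration.

```python
import re

# Matches lines such as:
#   2020-11-05 12:00:52 upgrade proxmox-backup-client:amd64 0.9.5-1 0.9.6-1
UPGRADE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2} upgrade ([^:\s]+):\S+ (\S+) (\S+)$"
)

# Package name prefixes assumed to be relevant to backup/restore; adjust as needed.
RELEVANT = ("proxmox-backup-", "libproxmox-backup-qemu", "pve-qemu-kvm", "qemu-server")


def backup_related_upgrades(lines):
    """Return (date, package, old_version, new_version) for relevant upgrade lines."""
    rows = []
    for line in lines:
        m = UPGRADE_RE.match(line.strip())
        if m and m.group(2).startswith(RELEVANT):
            rows.append(m.groups())
    return rows


# Sample input: the PVE package changes from the 2020-11-05 test.
SAMPLE = """\
2020-11-05 12:00:52 upgrade libpve-common-perl:all 6.2-2 6.2-4
2020-11-05 12:00:52 upgrade proxmox-backup-client:amd64 0.9.5-1 0.9.6-1
2020-11-05 12:00:53 upgrade pve-qemu-kvm:amd64 5.1.0-4 5.1.0-5
2020-11-05 12:01:00 upgrade qemu-server:amd64 6.2-18 6.2-19
""".splitlines()

for date, pkg, old, new in backup_related_upgrades(SAMPLE):
    print(f"{date}  {pkg:<24} {old} -> {new}")
```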
 
What happens if you disable the Windows firewall in your original VM and test whether its IP is pingable? Then back up with PBS and restore to a different VM. Disable the NIC on the original VM, copy the MAC address from the original VM to the new VM's NIC, and start the new VM. Is the restored VM pingable, so that only Windows Explorer has issues?
 
