Restore from PVE results in corrupted VM image

dsg

New Member
Jan 20, 2021
Hi,

I'm trying out the Proxmox Backup Server in a testing environment. Both PVE and PBS are on the no-subscription repositories, using the latest packages.

When doing a VM backup and restore through the PVE interface, the restored VM images are always corrupted. Restoring the same VM image through the PBS interface gives a working copy.

Steps to reproduce:
* Pick a VM with data on its virtual drives
* Shut the VM down
* Dump the VM images directly from Ceph (the export commands are sketched after these steps):
root@ht-virt03:~/foo# sha256sum vm-104-disk-*
6222a996226769bdb2b6f290e65360ca2b7e4c541704aea2363b9f697b50700d vm-104-disk-0
ddeff1b1fde12f624c9ed7dcc3e96fb508bd7fc6e10c388280dde69ea838b593 vm-104-disk-1

* Using the PVE interface, go to the VM page -> Backup and press "Backup now"
* Still in the PVE interface, select the new backup, press "Restore", and wait for it to finish
* Dump the VM images directly from Ceph again and compare:

root@ht-virt03:~/foo/restore# sha256sum vm-104-disk-*
5d1008ead8ede560769684908196c162b2071edd6f137ac353d0acad0ec8f992 vm-104-disk-0
2923599ef66bee8ed2a03bfde2b965f4726aedc31a17e90d819b080525c5f666 vm-104-disk-1

root@ht-virt03:~/foo/restore# cmp ../vm-104-disk-0 vm-104-disk-0
../vm-104-disk-0 vm-104-disk-0 differ: byte 4697620481, line 16655800
root@ht-virt03:~/foo/restore# cmp ../vm-104-disk-1 vm-104-disk-1
../vm-104-disk-1 vm-104-disk-1 differ: byte 30408705, line 4

Checksums don't match.

* Go to the PBS web UI, find the backup, and press the download button:
user@web-management:~/Downloads$ sha256sum drive-virtio1.img
ddeff1b1fde12f624c9ed7dcc3e96fb508bd7fc6e10c388280dde69ea838b593 drive-virtio1.img

Checksum matches the original image.
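
For reference, the raw images above were dumped with `rbd export`. A minimal sketch of that step, assuming the Ceph pool has the same name as the PVE storage ("virt"):
Code:
# dump both disks to local files for checksumming
# pool name "virt" is an assumption here, taken from the storage ID in the VM config
rbd export virt/vm-104-disk-0 vm-104-disk-0
rbd export virt/vm-104-disk-1 vm-104-disk-1
sha256sum vm-104-disk-*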

Is this a known bug?
 
No.

Could you please include
Code:
pveversion -v
qm config XXX

and details about your Ceph setup?
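
For the Ceph side, output along these lines would help (just the usual status and pool commands):
Code:
ceph -s
ceph osd pool ls detail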
 
No problem.

This is a 3-node PVE cluster with Ceph. We are running a replicated pool with size=3, min_size=2. We have not had any issues with Ceph, and doing an `rbd export` of the VM image repeatedly returns the same data. Whatever the issue is, I don't think it's Ceph-related.
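
The repeatability check was simply exporting the same image twice and comparing checksums; a rough sketch (again assuming the pool is named "virt"):
Code:
# export the same RBD image twice; the checksums come back identical every time
rbd export virt/vm-104-disk-0 dump-a
rbd export virt/vm-104-disk-0 dump-b
sha256sum dump-a dump-b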

The requested details are below; let me know if there is any more information I can provide to help diagnose this.

Code:
root@ht-virt03:/home/ansible# pveversion -v
proxmox-ve: 6.3-1 (running kernel: 5.4.78-2-pve)
pve-manager: 6.3-3 (running version: 6.3-3/eee5f901)
pve-kernel-5.4: 6.3-3
pve-kernel-helper: 6.3-3
pve-kernel-5.4.78-2-pve: 5.4.78-2
pve-kernel-5.4.65-1-pve: 5.4.65-1
pve-kernel-5.4.60-1-pve: 5.4.60-2
pve-kernel-5.4.34-1-pve: 5.4.34-2
ceph: 14.2.16-pve1
ceph-fuse: 14.2.16-pve1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.7
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.1-3
libpve-common-perl: 6.3-2
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.1-1
libpve-storage-perl: 6.3-3
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.6-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-2
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-2
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1

Code:
root@ht-virt03:/home/ansible# qm config 104
acpi: 1
autostart: 0
balloon: 0
boot: order=virtio0;ide2
cores: 2
cpu: SandyBridge-IBRS
cpuunits: 1000
description: Test machine for Dav%C3%AD%C3%B0 Steinn Geirsson.%0ANot used for anything important.
ide2: cephfs:iso/deploy-dev-dsg01.isnic.is.iso,media=cdrom
kvm: 1
memory: 1536
name: dev-dsg01.isnic.is
net0: virtio=36:0D:5F:8A:07:1F,bridge=vmbr0,tag=4
net1: virtio=02:9D:33:33:2B:54,bridge=vmbr0,tag=12
onboot: 1
ostype: l26
scsihw: virtio-scsi-pci
smbios1: uuid=035fab17-8737-4326-8f7f-46c35d7cace8
sockets: 1
tablet: 0
template: 0
vga: std
virtio0: virt:vm-104-disk-0,size=10G
virtio1: virt:vm-104-disk-1,size=2G
vmgenid: f3b92fc7-9dfb-4a80-8b29-4d46b4cb54fe
 
Are the checksums correct if you restore to a non-Ceph storage?
 
Yes, surprisingly it works when restoring to NFS; the checksums match the original:
Code:
root@ht-virt03:/mnt/pve/nfs/images/104# sha256sum *
6222a996226769bdb2b6f290e65360ca2b7e4c541704aea2363b9f697b50700d  vm-104-disk-0.raw
ddeff1b1fde12f624c9ed7dcc3e96fb508bd7fc6e10c388280dde69ea838b593  vm-104-disk-1.raw
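
The NFS restore was done through the GUI; the CLI equivalent would be roughly the following, where "nfs" is the NFS storage ID from this setup, "pbs" is a placeholder for the PBS storage ID, and <snapshot> stands for the backup's timestamped name:
Code:
# restore the PBS backup onto the NFS storage instead of Ceph
# --force 1 allows overwriting the existing VM 104
qmrestore pbs:backup/vm/104/<snapshot> 104 --storage nfs --force 1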