"File restore" in PVE GUI is not listing content of disk

proxmox-ve: 7.1-1 (running kernel: 5.13.19-1-pve)
pve-manager: 7.1-6 (running version: 7.1-6/4e61e21c)
pve-kernel-5.13: 7.1-4
pve-kernel-helper: 7.1-4
pve-kernel-5.11: 7.0-10
pve-kernel-5.13.19-1-pve: 5.13.19-3
pve-kernel-5.11.22-7-pve: 5.11.22-12
pve-kernel-5.11.22-5-pve: 5.11.22-10
pve-kernel-4.15: 5.4-6
pve-kernel-4.15.18-18-pve: 4.15.18-44
pve-kernel-4.15.18-12-pve: 4.15.18-36
ceph: 16.2.6-pve2
ceph-fuse: 16.2.6-pve2
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-3
libpve-storage-perl: 7.0-15
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.9-4
lxcfs: 4.0.8-pve2
novnc-pve: 1.2.0-3
openvswitch-switch: 2.15.0+ds1-2
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.2-1
proxmox-widget-toolkit: 3.4-3
pve-cluster: 7.1-2
pve-container: 4.1-2
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-3
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-2
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3
 
Thanks! If you can reproduce this with some non-production VM, could you check whether the issue is still triggered when you use cache=none for the disks, or when using a different storage rather than Ceph?
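For anyone following along: the cache mode can be changed per disk in the GUI (VM -> Hardware -> Hard Disk) or on the CLI. A minimal sketch, assuming VM ID 100 with a disk vm-100-disk-0 on an RBD storage called ceph-vm (all three are placeholders); the change only takes effect after a full stop/start of the VM:

# switch the first SCSI disk to cache=none (VM ID, disk and storage name are examples)
qm set 100 --scsi0 ceph-vm:vm-100-disk-0,cache=none
qm stop 100
qm start 100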
 
Hi,
A new backup of an always-"failing" VM, with "no cache" on its Btrfs disks, looks OK: the disk content is shown.
Ceph is our main storage and will continue to be so; no other storage is available.
/Bengt
 
One more thing, if it's possible for you to test: what is the behaviour with 'writeback', but with the storage configured to use the kernel RBD driver instead of librbd (e.g., set 'krbd 1' in storage.cfg before starting the VM in question, then revert it so that other guests are not affected)?
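For reference, a minimal sketch of what the relevant RBD entry in /etc/pve/storage.cfg might look like (storage name, pool and monitor addresses are placeholders); only the krbd line needs toggling, and the VM has to be fully stopped and started again for it to take effect:

rbd: ceph-vm
        content images
        krbd 1
        monhost 10.0.0.1 10.0.0.2 10.0.0.3
        pool vm-pool
        username admin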
 
Thanks. So that basically narrows it down to the (lib)rbd write cache being the culprit :-/
 
Hmm, judging by the test results it certainly looks like it.
I currently do backups like this (a rough command-line sketch follows below):
1. fsfreeze
2. Create a Ceph snapshot
3. fsthaw

The Ceph snapshot is mountable with no problems, either directly (map the snapshot and mount it) or via a Ceph image export/import.
Any thoughts?
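To illustrate the above: the pool, image, snapshot and VM ID below are placeholders, and the freeze/thaw can be done either inside the guest with fsfreeze or, as sketched here, via the QEMU guest agent.

# freeze the guest filesystems while the snapshot is taken
qm guest cmd 100 fsfreeze-freeze
rbd snap create vm-pool/vm-100-disk-0@backup-snap
qm guest cmd 100 fsfreeze-thaw

# verification: map the snapshot read-only on a node and mount it
# (the rbd device path may differ, e.g. /dev/rbd1)
rbd map vm-pool/vm-100-disk-0@backup-snap --read-only
mount -o ro /dev/rbd0 /mnt/restore-test

# or export the snapshot to an image file instead of mapping it
rbd export vm-pool/vm-100-disk-0@backup-snap /tmp/vm-100-disk-0.img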
 
If you take a Ceph snapshot, that happens on another layer (below the librbd cache), so it might not be affected by a cache-coherency problem. I'd suggest switching to krbd until we find out more!
 
We are attempting to reproduce the issue, and if we can, we'll try to fix it and/or report it to the appropriate upstream (QEMU or Ceph). I'll update the thread once we have more information.
 
So far we haven't been able to reproduce this issue with Btrfs inside the VM and RBD as storage on the PVE side. Could you try updating to the just-released PVE 7.2 (it comes with a new QEMU release ;)) and, if the issue persists, give us as much detail about the failing VMs as possible? Also, if you have similar VMs but only some are affected (often) while others never are, please try to give details on both variants; maybe we can figure out a root cause or common factor.
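In case it helps, a rough sketch of the upgrade and of collecting the details mentioned above (VM ID 100 is a placeholder; make sure the correct PVE 7.x repositories are configured before upgrading):

# upgrade the node to the latest PVE 7.2 packages
apt update
apt dist-upgrade

# afterwards, collect version and VM details for an affected guest
pveversion -v
qm config 100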
 
