Backup of a VM with disk passthrough corrupts ZFS pool inside VM

felix.steinbeis

New Member
Jan 24, 2023
Hello all,

On a PVE host (proxmox-ve: 7.3-1, running kernel: 5.15.102-1-pve) I have PBS and TrueNAS Scale running as VMs. In the TrueNAS VM there is a ZFS mirror pool; the two hard disks are passed through as whole disks.

The backup via PBS does not work. Access to the two disks of the ZFS mirror becomes extremely slow, and in the end the ZFS pool in TrueNAS was corrupted. There were two warnings in TrueNAS: slow I/O, followed by a degraded ZFS pool. The disks are new and show no SMART errors.

Does PBS work with a ZFS mirror inside a VM via disk passthrough? What is the right way to back up the ZFS pool of a VM with disk passthrough? Or is this a problem like here: https://forum.proxmox.com/threads/super-slow-speed-when-backup-nas.123588/

Thanks for your help.

Many greetings
Felix
 
did you pass the disks for the ZFS through to both the PBS and the TrueNAS Scale VMs? if yes, that can never work, since neither is aware that the other one is modifying the disks
 
did you pass the disks for the ZFS through to both the PBS and the TrueNAS Scale VMs? if yes, that can never work, since neither is aware that the other one is modifying the disks
No. The disks are only passed through to the TrueNAS VM. The backup target disk in PBS is a different disk, passed through only to the PBS VM.
 
what do the VM configs look like? (qm config ID)?
 
what do the VM configs look like? (qm config ID)?

TrueNAS Scale 22.12.1
Code:
agent: 1
boot: order=scsi0;ide2;net0
cores: 2
ide2: none,media=cdrom
machine: q35
memory: 16384
meta: creation-qemu=7.2.0,ctime=1679350590
name: truenas
net0: virtio=46:FB:7A:83:88:C6,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-zfs:vm-100-disk-0,iothread=1,size=32G,ssd=1
scsi1: /dev/disk/by-id/ata-KINGSTON_SEDC500M1920G_50026B7271E9BECF,serial=50026B7271E9BECF,size=1875374424K,ssd=1
scsi2: /dev/disk/by-id/ata-KINGSTON_SEDC500M1920G_50026B7271E9BFB3,serial=50026B7271E9BFB3,size=1875374424K,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=bf733b46-1cfe-4e6c-b314-9632a2236894
sockets: 1
startup: order=5
vmgenid: fb71086c-adb9-463d-b161-f79189f0c699

PBS 2.3-3 (latest updates)
Code:
agent: 1
boot: order=scsi0;ide2;net0
cores: 2
ide2: none,media=cdrom
machine: q35
memory: 6144
meta: creation-qemu=7.1.0,ctime=1678319314
name: pbs
net0: virtio=8A:46:D8:24:8F:14,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-zfs:vm-106-disk-0,iothread=1,size=16G,ssd=1
scsi1: /dev/disk/by-id/nvme-WD_Red_SN700_1000GB_22481C800028,serial=22481C800028,size=976762584K,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=59d8e955-4fa6-4517-8b4c-b61ad30cd0ed
sockets: 1
startup: order=7
vmgenid: 9604abb9-1517-4785-ad81-da83ea3130b9
 
ok, the configs look ok so far. is there anything of interest in the guest or host syslog/journal?
 
ok, the configs look ok so far. is there anything of interest in the guest or host syslog/journal?

Hi.

I just checked the logs. There are no errors or warnings on the PVE host or in the PBS VM. In the TrueNAS VM there is only the warning that I/O is slow and, at the end, the error that the pool is corrupted, with no indication of what caused or triggered it.

Should PBS/vzdump recognize that it is a ZFS mirror pool in TrueNAS (created via TrueNAS) and only read the data from the pool, rather than "raw" from the two disks?

Best regards,
Felix
 
Hi.

I just checked the logs. There are no errors or warnings on the PVE host or in the PBS VM. In the TrueNAS VM there is only the warning that I/O is slow and, at the end, the error that the pool is corrupted, with no indication of what caused or triggered it.

Should PBS/vzdump recognize that it is a ZFS mirror pool in TrueNAS (created via TrueNAS) and only read the data from the pool, rather than "raw" from the two disks?

Best regards,
Felix
i feel like i am missing something here, what does the PBS have to do with the zpool inside the TrueNAS?

AFAIU the storage is set up like this:

host:
local-zfs -> both pbs/truenas root disks are on there

pbs:
scsi1: /dev/disk/by-id/nvme-WD_Red_SN700_1000GB_22481C800028,serial=22481C800028,size=976762584K,ssd=1 -> datastore ?

truenas:
scsi1: /dev/disk/by-id/ata-KINGSTON_SEDC500M1920G_50026B7271E9BECF,serial=50026B7271E9BECF,size=1875374424K,ssd=1
scsi2: /dev/disk/by-id/ata-KINGSTON_SEDC500M1920G_50026B7271E9BFB3,serial=50026B7271E9BFB3,size=1875374424K,ssd=1
-> the truenas zpool


what does the zpool in truenas have to do with the pbs? (is it mounted there via nfs/smb?)
 
Hi,
Hi.

I just checked the logs. There are no errors or warnings on the PVE host or in the PBS VM. In the TrueNAS VM there is only the warning that I/O is slow and, at the end, the error that the pool is corrupted, with no indication of what caused or triggered it.
if I/O inside the VM is slow during backup, I'd try reducing the number of workers for the backup (job), see here.
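As a sketch of what that could look like (assuming the max-workers performance option is what is meant; the value and the storage name are only examples):
Code:
# /etc/vzdump.conf on the PVE host - limit the number of parallel backup workers
performance: max-workers=1

# or per invocation, e.g. for VM 100 (<pbs-storage> is a placeholder)
vzdump 100 --storage <pbs-storage> --performance max-workers=1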

Should PBS/vzdump recognize that it is a ZFS mirror pool in TrueNAS (created via TrueNAS) and only read the data from the pool, rather than "raw" from the two disks?
No, there is no mechanism to recognize how the data inside the guest is organized. The backup is done via QEMU's block layer.
 
i feel like i am missing something here, what does the PBS have to do with the zpool inside the TrueNAS?
Probably nothing directly.

AFAIU the storage is set up like this:

host:
local-zfs -> both pbs/truenas root disks are on there
Correct

pbs:
scsi1: /dev/disk/by-id/nvme-WD_Red_SN700_1000GB_22481C800028,serial=22481C800028,size=976762584K,ssd=1 -> datastore ?
Correct

truenas:
scsi1: /dev/disk/by-id/ata-KINGSTON_SEDC500M1920G_50026B7271E9BECF,serial=50026B7271E9BECF,size=1875374424K,ssd=1
scsi2: /dev/disk/by-id/ata-KINGSTON_SEDC500M1920G_50026B7271E9BFB3,serial=50026B7271E9BFB3,size=1875374424K,ssd=1
-> the truenas zpool
Correct

what does the zpool in truenas have to do with the pbs? (is it mounted there via nfs/smb?)
Nothing

I was hoping that the backup (vzdump) would back up the TrueNAS VM including all disks, but be smart enough to back up the ZFS mirror and not the two disks in raw format. I wanted to have the TrueNAS backup accessible via the PVE host and also use single file restore if necessary.
 
Hi,

if I/O inside the VM is slow during backup, I'd try reducing the number of workers for the backup (job), see here.


No, there is no mechanism to recognize how the data inside the guest is organized. The backup is done via QEMU's block layer.
Thank you. I'll try it out.

No, there is no mechanism to recognize how the data inside the guest is organized. The backup is done via QEMU's block layer.
I was hoping that the backup (vzdump) would back up the TrueNAS VM including all disks, but be smart enough to back up the ZFS mirror and not the two disks in "raw format". I wanted to have the TrueNAS backup accessible via the PVE host and also use single file restore if necessary.

So what is the best way to ensure that the two disks of the TrueNAS mirror pool do not both need to be read, and how can I back them up?

Should I only include the system drive in the backup and back up the files from the TrueNAS zpool separately? For example, with the Proxmox Backup Client on the TrueNAS VM, backing up to the PBS datastore.

What is the best practice and your recommendation?

Best regards,
Felix
 
I was hoping that the backup (vzdump) would back up the TrueNAS VM including all disks, but be smart enough to back up the ZFS mirror and not the two disks in "raw format". I wanted to have the TrueNAS backup accessible via the PVE host and also use single file restore if necessary.

So what is the best way to ensure that the two disks of the TrueNAS mirror pool do not both need to be read, and how can I back them up?

Should I only include the system drive in the backup and back up the files from the TrueNAS zpool separately? For example, with the Proxmox Backup Client on the TrueNAS VM, backing up to the PBS datastore.

What is the best practice and your recommendation?
Not sure what the best way is, but yes, you could try to only back up the system disk from "outside" and run proxmox-backup-client within the VM to create a file-level backup of the ZFS. But note that restoring the VM will be more complicated that way.
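A rough sketch of that approach (the pool mountpoint, repository and datastore are placeholders, adjust them to your setup):
Code:
# on the PVE host: exclude the passed-through disks from the VM backup,
# so only scsi0 (the system disk) is backed up - append ",backup=0" to the
# scsi1/scsi2 lines in /etc/pve/qemu-server/100.conf:
scsi1: /dev/disk/by-id/ata-KINGSTON_SEDC500M1920G_50026B7271E9BECF,serial=50026B7271E9BECF,size=1875374424K,ssd=1,backup=0
scsi2: /dev/disk/by-id/ata-KINGSTON_SEDC500M1920G_50026B7271E9BFB3,serial=50026B7271E9BFB3,size=1875374424K,ssd=1,backup=0

# inside the TrueNAS VM: file-level backup of the pool's data to the PBS datastore
# (repository format is user@realm@host:datastore; /mnt/tank is a placeholder mountpoint)
proxmox-backup-client backup tank.pxar:/mnt/tank --repository root@pam@<pbs-host>:<datastore>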
 
I was hoping that the backup (vzdump) would back up the TrueNAS VM including all disks,
How did you set that up?
but be smart enough to back up the ZFS mirror and not the two disks in "raw format".
That would only be possible if the PVE host mounted that ZFS pool during the backup, which is a very bad idea while the guest is running.

So what is the best way to ensure that the two disks of the TrueNAS mirror pool do not both need to be read, and how can I back them up?

Should I only include the system drive in the backup and back up the files from the TrueNAS zpool separately? For example, with the Proxmox Backup Client on the TrueNAS VM, backing up to the PBS datastore.
Either that, or do a cold backup while the VM is shut down (NOT just suspended) so the pool can be accessed on the host without breaking stuff.
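A rough sketch of the cold-backup variant (pool name, altroot path and repository are placeholders; the read-only import is only safe while the VM stays shut down):
Code:
# shut down the TrueNAS VM and wait until it is really off
qm shutdown 100

# import the pool read-only on the PVE host (-f may be needed because the pool
# was last used inside the VM; 'tank' and the altroot path are placeholders)
zpool import -o readonly=on -R /mnt/coldbackup -f tank

# file-level backup of the pool contents from the host to PBS
proxmox-backup-client backup tank.pxar:/mnt/coldbackup/tank --repository root@pam@<pbs-host>:<datastore>

# export the pool again before starting the VM
zpool export tank
qm start 100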
 
