Backups of LXC with new 8.3 mount option "discard" fail

Today I noticed that after I changed my LXC mount options to "lazytime, discard", my nightly backups fail:

Code:
INFO: starting new backup job: vzdump 1010 --storage pbs.xyz --notification-mode auto --remove 0 --notes-template 'Daily {{guestname}}' --node kaiju --mode snapshot
INFO: Starting Backup of VM 1010 (lxc)
INFO: Backup started at 2024-11-27 21:23:21
INFO: status = running
INFO: CT Name: uwe
INFO: including mount point rootfs ('/') in backup
INFO: found old vzdump snapshot (force removal)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
filesystem 'rpool/data/subvol-1010-disk-0@vzdump' cannot be mounted due to invalid option 'discard'.
Use the '-s' option to ignore the bad mount option.
umount: /mnt/vzsnap0/: not mounted.
command 'umount -l -d /mnt/vzsnap0/' failed: exit code 32
ERROR: Backup of VM 1010 failed - command 'mount -o ro -o discard,lazytime -t zfs rpool/data/subvol-1010-disk-0@vzdump /mnt/vzsnap0//' failed: exit code 1
INFO: Failed at 2024-11-27 21:23:21
INFO: Backup job finished with errors
TASK ERROR: job errors

I cannot see why this should not be possible. I understand that the snapshot will obviously be mounted read-only, but for exactly that reason the original mount options could (and probably should) simply be ignored when mounting the backup snapshot.
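
The same failure can be reproduced by hand, e.g. like this (just a sketch; the snapshot name "manualtest" and the mount point /mnt/test are made up, the dataset name is taken from the log above):

Code:
zfs snapshot rpool/data/subvol-1010-disk-0@manualtest
mkdir -p /mnt/test
# fails just like the vzdump snapshot mount above:
mount -o ro -o discard,lazytime -t zfs rpool/data/subvol-1010-disk-0@manualtest /mnt/test
# works once the container mount options are left out:
mount -o ro -t zfs rpool/data/subvol-1010-disk-0@manualtest /mnt/test
umount /mnt/test
zfs destroy rpool/data/subvol-1010-disk-0@manualtest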

There is a bug report open on this, even with a proposed patch, but the latter does not work for me.
 
Could you provide further details on the container's config and how the patch was applied?

I could reproduce the same error as described in the bug report with an unpatched pve-container, using a container whose root disk/mount points have the discard and lazytime mount options and backing that CT up to a PBS instance running proxmox-backup-server version 8.2.10-1. With a patched pve-container the bug is resolved, as the change makes sure that the invalid mount options are stripped during backups.
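
For reference, the reproduction is roughly the following (a sketch; CT ID, volume and size follow the config from this thread, the PBS storage name is a placeholder):

Code:
# add the mount options to the root disk (values are only examples)
pct set 1010 --rootfs 'local-zfs:subvol-1010-disk-0,size=4G,mountoptions=discard;lazytime'
# run a snapshot-mode backup to the PBS storage (placeholder name)
vzdump 1010 --storage <pbs-storage> --mode snapshot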
 
I did this on the PVE (as root):

Code:
cd /root
# quote the here-doc delimiter so the shell does not expand the $-variables inside the patch
cat >patch <<'EOF'
---
 src/PVE/LXC.pm        | 8 ++++++--
 src/PVE/LXC/Config.pm | 6 ++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
index e78e365..d01fafc 100644
--- a/src/PVE/LXC.pm
+++ b/src/PVE/LXC.pm
@@ -1865,11 +1865,16 @@ sub __mountpoint_mount {

     die "unknown snapshot path for '$volid'" if !$storage && defined($snapname);

+    my $readonly = $mountpoint->{ro};
     my $optlist = [];

     if (my $mountopts = $mountpoint->{mountoptions}) {
        my @opts = split(/;/, $mountpoint->{mountoptions});
-       push @$optlist, grep { PVE::LXC::Config::is_valid_mount_option($_) } @opts;
+       if ($readonly || defined($snapname)) {
+           push @$optlist, grep { PVE::LXC::Config::is_valid_ro_mount_option($_) } @opts;
+       } else {
+           push @$optlist, grep { PVE::LXC::Config::is_valid_mount_option($_) } @opts;
+       }
     }

     my $acl = $mountpoint->{acl};
@@ -1880,7 +1885,6 @@ sub __mountpoint_mount {
     }

     my $optstring = join(',', @$optlist);
-    my $readonly = $mountpoint->{ro};

     my @extra_opts;
     @extra_opts = ('-o', $optstring) if $optstring;
diff --git a/src/PVE/LXC/Config.pm b/src/PVE/LXC/Config.pm
index 5cc37f7..0740e8c 100644
--- a/src/PVE/LXC/Config.pm
+++ b/src/PVE/LXC/Config.pm
@@ -312,12 +312,18 @@ cfs_register_file('/lxc/', \&parse_pct_config, \&write_pct_config);


 my $valid_mount_option_re = qr/(discard|lazytime|noatime|nodev|noexec|nosuid)/;
+my $valid_ro_mount_option_re = qr/(nodev|noexec|nosuid)/;

 sub is_valid_mount_option {
     my ($option) = @_;
     return $option =~ $valid_mount_option_re;
 }

+sub is_valid_ro_mount_option {
+    my ($option) = @_;
+    return $option =~ $valid_ro_mount_option_re;
+}
+
 my $rootfs_desc = {
     volume => {
        type => 'string',
--

EOF

cd /usr/share/perl5/
patch -p2 < /root/patch

I verified that /usr/share/perl5/PVE/LXC.pm and /usr/share/perl5/PVE/LXC/Config.pm were indeed changed.
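For example, a quick check like this shows the new helper in both files (just a sketch):

Code:
grep -n is_valid_ro_mount_option /usr/share/perl5/PVE/LXC.pm /usr/share/perl5/PVE/LXC/Config.pm
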
I first tried a backup after "systemctl restart pveproxy", to no avail. Then I retried the backup of the same LXC container with "discard, lazytime" mount options, which yielded the same error message as before.

This all goes to a PBS running in a VM, but that should not matter.

Here is the LXC configuration:

Code:
#Uwes LXC
#
#Debian 12
#
#VLAN0010
#
#
#```
#/etc/systemd/network/eth0.network:
#[Match]
#Name = eth0
#
#[Network]
#Description = Interface eth0 autoconfigured by PVE
#IPv6AcceptRA = true
#IPv6PrivacyExtensions = true
#```
arch: amd64
cores: 2
features: nesting=1
hostname: uwe
memory: 2048
nameserver: 192.168.177.1
net0: name=eth0,bridge=vlanbridge,firewall=1,gw6=fe80::1,hwaddr=BC:24:11:EE:55:BB,ip=dhcp,ip6=2a01:4444:3333:ffff::192.168.177.10/64,tag=10,type=veth
onboot: 1
ostype: debian
parent: vzdump
protection: 1
rootfs: local-zfs:subvol-1010-disk-0,size=4G,mountoptions=lazytime;discard
searchdomain: xyz.de
swap: 512
tags: autostart;vlan
unprivileged: 1

[autodaily241122000103]
#cv4pve-autosnap
arch: amd64
cores: 2
features: nesting=1
hostname: uwe
memory: 2048
nameserver: 192.168.177.1
net0: name=eth0,bridge=vlanbridge,firewall=1,gw6=fe80::1,hwaddr=BC:24:11:EE:55:BB,ip=dhcp,ip6=2a01:4444:3333:ffff::192.168.177.10/64,tag=10,type=veth
onboot: 1
ostype: debian
parent: autoweekly241117000102
protection: 1
rootfs: local-zfs:subvol-1010-disk-0,size=4G
searchdomain: xyz.de
snaptime: 1732230063
swap: 512
tags: autostart;vlan
unprivileged: 1
 
I've recreated the exact same container (also with one snapshot in which the mount options were not yet applied to the rootfs, while the current container config does have them) and I could reproduce the issue. After applying the patch manually and restarting pvedaemon, it worked as expected.

I first tried a backup after "systemctl restart pveproxy", to no avail. Then I retried the backup of the same LXC container with "discard, lazytime" mount options, which yielded the same error message as before.
The pveproxy service is used to proxy API calls between different nodes, e.g. when stopping a VM on node1 from the web GUI of node3. For the patched code to be used when a new backup to the PBS instance is started, you must restart pvedaemon with systemctl restart pvedaemon.
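
So after a manual edit of the Perl modules, something along these lines should be enough (checking the status afterwards is optional):

Code:
systemctl restart pvedaemon
systemctl status pvedaemon --no-pager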
 
Thanks for getting back here!

Yes, if the service is not restarted, it keeps running the old (compiled) Perl code that is still in memory. It has to be restarted for the change to take effect; such a restart is also triggered when the package is reinstalled or upgraded.
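
A quick sanity check (just an example) is to compare the modification time of the edited module with the time pvedaemon was last started:

Code:
stat -c '%y %n' /usr/share/perl5/PVE/LXC.pm
systemctl show pvedaemon --property=ActiveEnterTimestamp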
 
I have this bug with the latest versions of today's pve packages as well.
All I did was add the mount options "discard, lazytime" to the root disks of all my LXCs; now the backups fail.


This is the relevant part of the backup log:

Code:
INFO: create storage snapshot 'vzdump'
filesystem 'rpool/Containers/subvol-103-disk-0@vzdump' cannot be mounted due to invalid option 'discard'.
Use the '-s' option to ignore the bad mount option.
umount: /mnt/vzsnap0/: not mounted.
command 'umount -l -d /mnt/vzsnap0/' failed: exit code 32
ERROR: Backup of VM 103 failed - command 'mount -o ro -o discard,lazytime -t zfs rpool/Containers/subvol-103-disk-0@vzdump /mnt/vzsnap0//' failed: exit code 1
INFO: Failed at 2025-02-22 09:17:12
 
This should be pushed up higher. This bug not only causes backups to be missing(!), it corrupts the config files of LXCs/VMs as well.
 
This should be pushed up higher. This bug not only causes backups to be missing(!), it corrupts the config files of LXCs/VMs as well.
Can you elaborate a bit more, please? How does this corrupt config files - of both LXCs and/or VMs, presumably?

I am having this issue as well (no corruption, just no backups) with a CT that I created today. Meanwhile, on the same host, another CT (created earlier from the same Debian 12 template) also has lazytime+discard, and that one backs up normally, which is even stranger.
 
I have the exact same errors, and backups fail with the options discard and noatime enabled. Today is 2025-03-12 and this has been going on since November 2024. Is the patch going to be pushed in an official update? Or, at the very least, can we have a relatively straightforward temporary solution for this? (A possible stop-gap is sketched below the log.)

Code:
INFO: starting new backup job: vzdump 101 --notes-template '{{guestname}}' --compress zstd --remove 0 --mode snapshot --notification-mode auto --storage network-storage --node proxmox
INFO: filesystem type on dumpdir is 'cifs' -using /var/tmp/vzdumptmp1138068_101 for temporary files
INFO: Starting Backup of VM 101 (lxc)
INFO: Backup started at 2025-03-12 11:48:03
INFO: status = running
INFO: CT Name: zerotier-lxc
INFO: including mount point rootfs ('/') in backup
INFO: found old vzdump snapshot (force removal)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
filesystem 'rpool/data/subvol-101-disk-0@vzdump' cannot be mounted due to invalid option 'discard'.
Use the '-s' option to ignore the bad mount option.
umount: /mnt/vzsnap0/: not mounted.
command 'umount -l -d /mnt/vzsnap0/' failed: exit code 32
ERROR: Backup of VM 101 failed - command 'mount -o ro -o noatime,discard -t zfs rpool/data/subvol-101-disk-0@vzdump /mnt/vzsnap0//' failed: exit code 1
INFO: Failed at 2025-03-12 11:48:03
INFO: Backup job finished with errors
TASK ERROR: job errors
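
Until a fixed pve-container package is released, removing the mount options from the rootfs line again works around the error, e.g. (a sketch; the storage, volume and size are placeholders and must match the existing entry in /etc/pve/lxc/101.conf):

Code:
# re-specify the rootfs without mountoptions (placeholders, use your actual values)
pct set 101 --rootfs '<storage>:subvol-101-disk-0,size=<size>'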
 