Backups of LXC with new 8.3 mount option "discard" fail

meyergru

Today I noticed that after I changed my LXC mount options to "lazytime, discard", my nightly backups fail:

Code:
INFO: starting new backup job: vzdump 1010 --storage pbs.xyz --notification-mode auto --remove 0 --notes-template 'Daily {{guestname}}' --node kaiju --mode snapshot
INFO: Starting Backup of VM 1010 (lxc)
INFO: Backup started at 2024-11-27 21:23:21
INFO: status = running
INFO: CT Name: uwe
INFO: including mount point rootfs ('/') in backup
INFO: found old vzdump snapshot (force removal)
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
filesystem 'rpool/data/subvol-1010-disk-0@vzdump' cannot be mounted due to invalid option 'discard'.
Use the '-s' option to ignore the bad mount option.
umount: /mnt/vzsnap0/: not mounted.
command 'umount -l -d /mnt/vzsnap0/' failed: exit code 32
ERROR: Backup of VM 1010 failed - command 'mount -o ro -o discard,lazytime -t zfs rpool/data/subvol-1010-disk-0@vzdump /mnt/vzsnap0//' failed: exit code 1
INFO: Failed at 2024-11-27 21:23:21
INFO: Backup job finished with errors
TASK ERROR: job errors

I cannot see why this should not be possible. The snapshot will obviously be mounted read-only, but precisely for that reason the "discard" mount option could (and probably should) simply be dropped when mounting the backup snapshot.

There is an open bug report on this, including a proposed patch, but the patch does not work for me.
 
Could you provide further details on the container's config and how the patch was applied?

I could reproduce the error described in the bug report with an unpatched pve-container: a container whose root disk/mount points use the discard and lazytime mount options fails to back up to a PBS instance running proxmox-backup-server version 8.2.10-1. With a patched pve-container the bug is resolved, since the change makes sure that the invalid mount options are stripped away during backups.
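The effect of the patch can be sketched in plain shell: the configured options are split on ';' and filtered against a whitelist, and for a read-only snapshot mount only a reduced whitelist survives. The regexes below are taken from the patch; the pipeline itself is just an illustration, not Proxmox code.

```shell
# Illustration only: filtering ';'-separated mount options the way the
# patched pve-container does, using the two whitelists from Config.pm.
opts="discard;lazytime;nodev"

# Regular (read-write) mount: full whitelist ($valid_mount_option_re)
echo "$opts" | tr ';' '\n' | grep -E '^(discard|lazytime|noatime|nodev|noexec|nosuid)$'

# Read-only snapshot mount: reduced whitelist ($valid_ro_mount_option_re);
# 'discard' and 'lazytime' are dropped, so the ZFS snapshot mounts cleanly.
echo "$opts" | tr ';' '\n' | grep -E '^(nodev|noexec|nosuid)$'
```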
 
I did this on the PVE (as root):

Code:
cd /root
cat >patch <<'EOF'
---
 src/PVE/LXC.pm        | 8 ++++++--
 src/PVE/LXC/Config.pm | 6 ++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
index e78e365..d01fafc 100644
--- a/src/PVE/LXC.pm
+++ b/src/PVE/LXC.pm
@@ -1865,11 +1865,16 @@ sub __mountpoint_mount {

     die "unknown snapshot path for '$volid'" if !$storage && defined($snapname);

+    my $readonly = $mountpoint->{ro};
     my $optlist = [];

     if (my $mountopts = $mountpoint->{mountoptions}) {
        my @opts = split(/;/, $mountpoint->{mountoptions});
-       push @$optlist, grep { PVE::LXC::Config::is_valid_mount_option($_) } @opts;
+       if ($readonly || defined($snapname)) {
+           push @$optlist, grep { PVE::LXC::Config::is_valid_ro_mount_option($_) } @opts;
+       } else {
+           push @$optlist, grep { PVE::LXC::Config::is_valid_mount_option($_) } @opts;
+       }
     }

     my $acl = $mountpoint->{acl};
@@ -1880,7 +1885,6 @@ sub __mountpoint_mount {
     }

     my $optstring = join(',', @$optlist);
-    my $readonly = $mountpoint->{ro};

     my @extra_opts;
     @extra_opts = ('-o', $optstring) if $optstring;
diff --git a/src/PVE/LXC/Config.pm b/src/PVE/LXC/Config.pm
index 5cc37f7..0740e8c 100644
--- a/src/PVE/LXC/Config.pm
+++ b/src/PVE/LXC/Config.pm
@@ -312,12 +312,18 @@ cfs_register_file('/lxc/', \&parse_pct_config, \&write_pct_config);


 my $valid_mount_option_re = qr/(discard|lazytime|noatime|nodev|noexec|nosuid)/;
+my $valid_ro_mount_option_re = qr/(nodev|noexec|nosuid)/;

 sub is_valid_mount_option {
     my ($option) = @_;
     return $option =~ $valid_mount_option_re;
 }

+sub is_valid_ro_mount_option {
+    my ($option) = @_;
+    return $option =~ $valid_ro_mount_option_re;
+}
+
 my $rootfs_desc = {
     volume => {
        type => 'string',
--

EOF

cd /usr/share/perl5/
patch -p2 < /root/patch

I verified that /usr/share/perl5/PVE/LXC.pm and /usr/share/perl5/PVE/LXC/Config.pm were indeed changed.
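One pitfall when embedding a patch in a heredoc: the delimiter should be quoted (<<'EOF'), otherwise the shell expands $variables inside the document and mangles Perl code such as $optlist or $mountpoint before the patch is even applied. A minimal sketch of the difference:

```shell
# With an unquoted delimiter the shell expands variables inside the heredoc;
# with a quoted delimiter the contents pass through literally, which is what
# a patch containing Perl '$' variables needs.
name="world"

cat <<EOF
unquoted: hello $name
EOF

cat <<'EOF'
quoted: hello $name
EOF
```

The first heredoc prints "unquoted: hello world", the second prints "quoted: hello $name" verbatim.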
I first tried a backup after "systemctl restart pveproxy", to no avail. I then retried the backup of the same LXC container with the "discard, lazytime" mount options, which yielded the same error message as before.

This is all to a PBS on a VM, but that should not matter.

Here is the LXC configuration:

Code:
#Uwes LXC
#
#Debian 12
#
#VLAN0010
#
#
#```
#/etc/systemd/network/eth0.network:
#[Match]
#Name = eth0
#
#[Network]
#Description = Interface eth0 autoconfigured by PVE
#IPv6AcceptRA = true
#IPv6PrivacyExtensions = true
#```
arch: amd64
cores: 2
features: nesting=1
hostname: uwe
memory: 2048
nameserver: 192.168.177.1
net0: name=eth0,bridge=vlanbridge,firewall=1,gw6=fe80::1,hwaddr=BC:24:11:EE:55:BB,ip=dhcp,ip6=2a01:4444:3333:ffff::192.168.177.10/64,tag=10,type=veth
onboot: 1
ostype: debian
parent: vzdump
protection: 1
rootfs: local-zfs:subvol-1010-disk-0,size=4G,mountoptions=lazytime;discard
searchdomain: xyz.de
swap: 512
tags: autostart;vlan
unprivileged: 1

[autodaily241122000103]
#cv4pve-autosnap
arch: amd64
cores: 2
features: nesting=1
hostname: uwe
memory: 2048
nameserver: 192.168.177.1
net0: name=eth0,bridge=vlanbridge,firewall=1,gw6=fe80::1,hwaddr=BC:24:11:EE:55:BB,ip=dhcp,ip6=2a01:4444:3333:ffff::192.168.177.10/64,tag=10,type=veth
onboot: 1
ostype: debian
parent: autoweekly241117000102
protection: 1
rootfs: local-zfs:subvol-1010-disk-0,size=4G
searchdomain: xyz.de
snaptime: 1732230063
swap: 512
tags: autostart;vlan
unprivileged: 1
 
I've recreated the exact same container (also with one snapshot in which the mount options were not yet applied to the rootfs, while the current container config does have them) and could reproduce the issue. After applying the patch manually and restarting pvedaemon, the backup worked as expected.

I first tried a backup after "systemctl restart pveproxy", to no avail. I then retried the backup of the same LXC container with the "discard, lazytime" mount options, which yielded the same error message as before.
The pveproxy service is used to proxy API calls between different nodes, e.g. when stopping a VM from node1 in the WebGUI of node3. For the patched code to be used when a new backup to the PBS instance is started, you must restart the pvedaemon with systemctl restart pvedaemon.
 
Thanks for getting back here!

Yes, if the service is not restarted, it keeps running the old (compiled) Perl code that is still in memory. The daemon has to be restarted for the change to take effect; such a restart is also triggered when the package is reinstalled.
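This behaviour can be illustrated with a plain shell analogy (illustration only, not Proxmox code): a running process keeps the definitions it loaded at start, and only re-loading them, like restarting pvedaemon, picks up changes on disk.

```shell
# Illustration: an already-loaded definition stays in memory even after
# the file on disk changes; re-sourcing is the analogue of a daemon restart.
lib=$(mktemp)

echo 'greet() { echo old; }' > "$lib"
. "$lib"          # load the "code" once, like a daemon starting up

echo 'greet() { echo new; }' > "$lib"
greet             # still prints "old": the in-memory copy is unchanged

. "$lib"          # reload, analogous to 'systemctl restart pvedaemon'
greet             # now prints "new"

rm -f "$lib"
```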
 
