[SOLVED] CT stuck on "create storage snapshot 'vzdump'" when doing a snapshot backup

bootsie123

Hi everyone! During one of my automated weekly backup jobs, I noticed that one of the new containers I set up this week makes the job hang when doing a snapshot backup. It gets stuck at INFO: create storage snapshot 'vzdump'. Running a snapshot backup of that container separately gives the same result.

Any ideas on what might be causing this? Thanks!

EDIT:

Looks to be an issue with FUSE enabled on the CT. See this post and this other post on the subject


Backup Logs:
Code:
INFO: starting new backup job: vzdump 106 --storage prox_backup --remove 0 --notes-template '{{guestname}}' --node vmworld --mode snapshot --compress zstd
INFO: Starting Backup of VM 106 (lxc)
INFO: Backup started at 2022-11-13 22:14:54
INFO: status = running
INFO: CT Name: Authentik
INFO: including mount point rootfs ('/') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'

CT Config (Ubuntu 22.04):
Code:
arch: amd64
cores: 4
features: fuse=1
hostname: Authentik
memory: 4096
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.1.1,hwaddr=0A:5C:A6:FC:C2:34,ip=192.168.1.84/24,ip6=dhcp,type=veth
onboot: 1
ostype: ubuntu
rootfs: local-zfs:basevol-134-disk-0/subvol-106-disk-0,size=16G
swap: 0
lxc.apparmor.profile: unconfined
lxc.cgroup2.devices.allow: a
lxc.cap.drop:

PVE Versions:
Code:
proxmox-ve: 7.2-1 (running kernel: 5.15.64-1-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-5.15: 7.2-13
pve-kernel-helper: 7.2-13
pve-kernel-5.4: 6.4-18
pve-kernel-5.3: 6.1-6
pve-kernel-5.15.64-1-pve: 5.15.64-1
pve-kernel-5.15.53-1-pve: 5.15.53-1
pve-kernel-5.15.39-4-pve: 5.15.39-4
pve-kernel-5.15.39-1-pve: 5.15.39-1
pve-kernel-5.15.35-3-pve: 5.15.35-6
pve-kernel-5.4.189-2-pve: 5.4.189-2
pve-kernel-5.4.189-1-pve: 5.4.189-1
pve-kernel-4.15: 5.4-14
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-4.15.18-26-pve: 4.15.18-54
pve-kernel-4.15.18-9-pve: 4.15.18-30
ceph-fuse: 14.2.21-1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-3
libpve-guest-common-perl: 4.1-4
libpve-http-server-perl: 4.1-4
libpve-network-perl: 0.7.1
libpve-storage-perl: 7.2-10
libqb0: 1.0.5-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.7-1
proxmox-backup-file-restore: 2.2.7-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-3
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-6
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.6-pve1
 
Hi,

Do all containers show the same behavior, or only CT 106?

Do you see anything interesting in the syslog during the backup job?
 
Good question! It seems to only be CT 106. The only thing I'm seeing in syslog during the backup job is this:

Code:
Nov 14 09:18:32 vmworld pvedaemon[222298]: INFO: starting new backup job: vzdump 106 --compress zstd --mode snapshot --node vmworld --notes-template '{{guestname}}' --remove 0 --storage prox_backup
Nov 14 09:18:32 vmworld pvedaemon[222298]: INFO: Starting Backup of VM 106 (lxc)
 
Hello,

Can you provide us with the configuration of another CT that backs up without any issue (pct config <CTID>), so we can compare the configs?
 
Sure!

Code:
arch: amd64
cores: 2
hostname: Heimdall
memory: 2048
net0: name=eth0,bridge=vmbr0,gw=192.168.1.1,hwaddr=16:77:88:68:9D:6F,ip=192.168.1.221/24,type=veth
onboot: 1
ostype: ubuntu
rootfs: local-zfs:subvol-100-disk-0,size=32G
swap: 512
lxc.apparmor.profile: unconfined
lxc.cgroup2.devices.allow: a
lxc.cap.drop:

Just checked the configs of all of my CTs, and after a bit of digging I'm pretty sure this has to do with having FUSE enabled on the container. This also seems to be a known issue, with other posts on the forums talking about it. Initially, I had it enabled because of issues I ran into with Certbot.

Anyways, thanks for pointing me in the right direction! I probably should have started with that first.
 
Is there a solution or workaround for this without disabling FUSE? I just noticed that backups are broken for all containers that use this feature.
Is there an open issue for this with LXC or the kernel?

I'm running Docker inside of LXC on ZFS, which requires fuse-overlayfs in order to work.
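
To see which containers on a node have the feature set (and so might be affected), one could scan the container configs on the host. This is only a rough sketch: it assumes the configs live under /etc/pve/lxc, as on a standard Proxmox VE node, and the function name list_fuse_cts is made up for illustration.

```shell
#!/bin/sh
# Print the IDs of containers whose current config enables the fuse feature.
# The config directory is an assumption; pass another one as the first argument.
list_fuse_cts() {
    conf_dir="${1:-/etc/pve/lxc}"
    for conf in "$conf_dir"/*.conf; do
        [ -e "$conf" ] || continue
        # Stop at the first [section] header so that settings stored with
        # saved snapshots are ignored; only the live config is inspected.
        if awk '/^\[/ { exit }
                /^features:/ && /(^|[ ,])fuse=1/ { found = 1 }
                END { exit !found }' "$conf"; then
            basename "$conf" .conf
        fi
    done
}
```

Calling list_fuse_cts with no argument on the host prints one CT ID per line. Having fuse=1 set does not by itself mean a FUSE mount is active inside the container.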
 
To clarify the issue of snapshots on LXC containers with the FUSE feature enabled:

If your LXC container is unprivileged (so your options read: Unprivileged container = Yes) and you have the FUSE feature enabled (fuse=1), then snapshot backup will work.

The problem arises when you set your container to privileged (Unprivileged container = No) and then apply settings such as
Code:
lxc.cap.drop:
lxc.cgroup2.devices.allow: c 189:* rwm
lxc.mount.entry: /dev/bus/usb/004 dev/bus/usb/004 none bind,optional,create=dir
This causes the snapshot backup process to remain stuck on the LXC container.
 
Hi,
To clarify the issue of snapshots on LXC containers with the FUSE feature enabled:

If your LXC container is unprivileged (so your options read: Unprivileged container = Yes) and you have the FUSE feature enabled (fuse=1), then snapshot backup will work.
did you actually have an active FUSE mount within the container when you made the backup?

The problem arises when you set your container to privileged (Unprivileged container = No) and then apply settings such as
Code:
lxc.cap.drop:
lxc.cgroup2.devices.allow: c 189:* rwm
lxc.mount.entry: /dev/bus/usb/004 dev/bus/usb/004 none bind,optional,create=dir
This causes the snapshot backup process to remain stuck on the LXC container.
Sounds like a different issue than the one with FUSE.
 
Hi,

did you actually have an active FUSE mount within the container when you made the backup?
Not sure what you mean by active FUSE mount? Let me know how to check and I'll happily do it.

This LXC runs Docker and has containers for MQTT, MySQL, PostgreSQL, Redis, and SurrealDB.
[Screenshots: 1683056400907.png, 1683057277834.png]

I have the fuse-overlayfs binary [screenshot: 1683056923635.png], which Docker uses as a storage driver [screenshot: 1683056987706.png].

I haven't had issues with snapshot backups or restoring.

The LXC that runs as privileged and uses lxc config entries to pass through the USB port always gets stuck on a snapshot backup ¯\_(ツ)_/¯
 

Not sure what you mean by active FUSE mount? Let me know how to check and I'll happily do it.
Well, you could set the fuse feature without actually having a FUSE mount point inside the LXC ;) But I guess from what you say below, you did have some (can be checked with mount | grep fuse).
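
For anyone who wants a slightly more targeted check than mount | grep fuse, the mount table can be filtered by filesystem type. A minimal sketch, to be run inside the container; the function name list_fuse_mounts is hypothetical, and the optional file argument exists only so the filter can be tried on a saved copy of the table.

```shell
#!/bin/sh
# List mounts whose filesystem type starts with "fuse"
# (for example fuse.fuse-overlayfs or fuseblk).
# Reads a table in /proc/mounts format; defaults to the current one.
list_fuse_mounts() {
    mounts_file="${1:-/proc/self/mounts}"
    # Field 2 is the mount point, field 3 the filesystem type.
    # fusectl is the FUSE control filesystem, not a FUSE mount itself.
    awk '$3 ~ /^fuse/ && $3 != "fusectl" { print $3, $2 }' "$mounts_file"
}
```

Running list_fuse_mounts without arguments prints one "type mountpoint" pair per line. Entries of type fuse.lxcfs, if present, most likely come from LXCFS on the host rather than from anything mounted by software inside the container.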

This LXC runs docker and has containers for MQTT, MySQL, PostgreSQL, Redis, and SurrealDB.
It's highly recommended to use Docker in a VM instead. Search the forum for problems people had running docker in LXC.

I haven't had issues with snapshot backups or restoring.
Other people did have the issue upon backup, but maybe the situation in the freezer subsystem improved a little. Or maybe some configurations are not affected.

The LXC that runs as privileged and uses lxc to passthrough the USB port always gets stuck on a snapshot backup ¯\_(ツ)_/¯
You could try to exclude the path where the USB is mounted; there is an --exclude-path option for vzdump. Maybe that helps, otherwise you might need to use a different backup mode. Of course Proxmox VE cannot snapshot the USB port ;)
 
Well, you could set the fuse feature without actually having a FUSE mount point inside the LXC ;) But I guess from what you say below, you did have some (can be checked with mount | grep fuse).
[Screenshot: 1683139174907.png]


It's highly recommended to use Docker in a VM instead. Search the forum for problems people had running docker in LXC.
When I was first reading up on using Docker in an LXC, a lot of posts mentioned reduced performance and increased disk usage because Docker could not use overlay2. fuse-overlayfs solves that problem, and even though it introduces some performance hit, this is a home server so I don't really care.


You could try to exclude the path where the USB is mounted, there is an --exclude-path option for vzdump. Maybe that helps, otherwise you might need to use a different backup mode. Of course Proxmox VE cannot snapshot the USB port ;)
I tried giving --exclude-path a shot and unfortunately it still doesn't work.
Code:
vzdump 107 --exclude-path /dev/bus/* --mode snapshot

I'm honestly not trying to solve this problem and was just sharing my experience and findings for anyone else who finds this post. I'm happy enough with using stop on this LXC and snapshot for everything else.
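
In case it helps anyone, the split I settled on (stop mode for the one container, snapshot mode for everything else) could look roughly like the sketch below. The run helper and the RUN variable are made up for this example: by default the commands are only printed, and nothing is executed unless RUN=1 is set on a Proxmox VE node. Storage and compression options are left out and would need to be added for a real job.

```shell
#!/bin/sh
# Dry run by default: print each command; set RUN=1 to actually execute it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

STUCK_CT=107   # the container that hangs in snapshot mode

# Back up the problematic container in stop mode.
run vzdump "$STUCK_CT" --mode stop

# Back up all other guests in snapshot mode, skipping the stuck one.
run vzdump --all 1 --exclude "$STUCK_CT" --mode snapshot
```

Stop mode means downtime for that container while the backup runs, which is the price for avoiding the hang.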

--

I really do think the problem is using a privileged LXC container running FUSE + Docker. I have two privileged LXCs and both have USB passthrough.

Privileged, without Docker or fuse-overlayfs, with successful snapshots:
Code:
arch: amd64
cores: 1
hostname: zigbee2mqtt
memory: 256
nameserver: 192.168.0.254 192.168.0.1
net0: name=eth0,bridge=vmbr0,hwaddr=2a:70:53:73:06:8b,ip=dhcp,type=veth
onboot: 1
ostype: debian
rootfs: local-lvm:vm-112-disk-0,size=2G
swap: 256
lxc.cap.drop:
lxc.cgroup2.devices.allow: c 188:* rwm
lxc.mount.entry: /dev/serial/by-id/usb-ITead_Sonoff_Zigbee_3.0_USB_Dongle_Plus_5c05cfb94a19ec1190ab39cc47486eb0-if00-port0 dev/serial/by-id/usb-ITead_Sonoff_Zigbee_3.0_USB_Dongle_Plus_5c05cfb94a19ec1190ab39cc47486eb0-if00-port0 none bind,optional,create=file

Privileged, with Docker and fuse-overlayfs, with stuck snapshots:
Code:
arch: amd64
cores: 3
features: fuse=1,nesting=1
hostname: Frigate
memory: 1024
nameserver: 192.168.0.254 192.168.0.1
net0: name=eth0,bridge=vmbr0,firewall=1,hwaddr=3a:a4:3b:03:1c:56,ip=dhcp,type=veth
onboot: 1
ostype: debian
rootfs: local-lvm:vm-107-disk-0,mountoptions=noatime,size=4128M
swap: 256
lxc.cap.drop:
lxc.cgroup2.devices.allow: c 189:* rwm
lxc.mount.entry: /dev/bus/usb/004 dev/bus/usb/004 none bind,optional,create=dir
 
