problem with hanging vzdump jobs

alexskysilk

I have a problem with a specific container hanging on vzdump jobs. It has a unionfs-mounted file system, and as a result the vzdump task hangs. The consequence is that no other backup jobs on that node can proceed until that task is cancelled manually.

1. Is there a way to control (exclude) directories or filesystems from a vzdump job?
2. Is there a way to force timeout of a vzdump task?

I really like this tool (vzdump) but this issue makes it undependable.
 
@1: Just exclude the VMs which are on the storage you want to back up?
@2: If, for example, your NFS backup storage is not available, then by design of NFS the system waits until it is available again...
 

All true, but I think you misunderstand my issue. I don't want to exclude the container from backup; I want to exclude the non-native file system mounted INSIDE the container via unionfs or similar.

The consequence of attempting a vzdump backup on such a container is that the backup doesn't complete, and no other backup jobs can proceed.
 
What storage do you use? When I use local ZFS, it will exclude my mount points automatically.
 

This particular case has a user mounting his gdrive via unionfs. I am curious, how are you mounting zfs inside containers?

What I am trying to accomplish is twofold: a vzdump switch similar to the --one-file-system rsync switch, and a method to autokill hung vzdump tasks.
 

Like so:

mp0: /tank/media,mp=/mnt/media
mp1: /tank/timemachine,mp=/mnt/timemachine
mp2: /tank/downloads,mp=/mnt/downloads
mp3: /tank/backup,mp=/mnt/backup

This is inside /etc/pve/lxc/100.conf

You can do it in the GUI as well under mount point.
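The same mount points can also be added from the CLI with `pct set`; a sketch, reusing the storage path and container ID from the example above:

```
# add a bind mount point to CT 100 (equivalent to the mp0 line above)
pct set 100 -mp0 /tank/media,mp=/mnt/media
```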
 
That makes sense, although it has nothing to do with ZFS (those are bind mounts, or nullfs mounts). I believe vzdump ignores /mnt by default, although I have it explicitly stated in vzdump.conf just in case.
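For reference, such an exclusion can also be given per invocation, roughly like this (the path is a made-up example using the same shell-glob style as vzdump's built-in default excludes; note that excluding a path does not help once lxc-freeze itself hangs on the mount):

```
# exclude a directory inside the container from a suspend-mode backup
# (shell glob syntax; path and storage name are examples)
vzdump 100 --exclude-path '/mnt/gdrive/?*' --storage local
```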

What if the mounts are not directed from outside the container? What if the user decided to make the mounts himself and grafted them somewhere other than /mnt?
 
I have other containers that mount to /download, /media, etc., and they also work just fine.
 

Again, you're mounting using bind mounts, which don't cause any issue for lxc-freeze the way userspace mounts do. Userspace mounts such as FUSE and unionfs cause lxc-freeze to hang, which in turn hangs the backup. I can manually ignore file system locations, but that would mean knowing where they are ahead of time, which is not an available option for userspace-mounted file systems (I don't have any way to control WHERE users mount stuff...)
 

See the note about FUSE mounts at https://pve.proxmox.com/pve-docs/pve-admin-guide.html#pct_container_storage . Freezing the LXC cgroup is all or nothing; you cannot exclude certain paths from it. We need the freeze to ensure consistency (we don't have a dirty-bitmap-like mechanism available as we do for VMs). Either use stop backup mode, don't allow FUSE mounts in your containers, or establish them on the host and bind-mount them into the container.
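A stop-mode backup as suggested above would look roughly like this (the storage name is an example):

```
# stop mode: shut the container down, back it up, start it again --
# no lxc-freeze involved, so a FUSE mount cannot hang the job
vzdump 100 --mode stop --storage local
```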
 
Fabian, I get that. Can we have a graceful failure for the situation where lxc-freeze is unable to quiesce a file system with this condition? At the very least it would prevent such a situation from blocking backups of other containers.
 

If you enable FUSE for a container (it does not work out of the box, after all), make sure to only use stop-mode backups for that container. Alternatively, you could probably cook up a hook script that aborts backup tasks (e.g., at the backup-start hook point) for containers which have an active FUSE mount, giving you your fail-safe exit without affecting other, non-FUSE containers.
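A minimal sketch of such a hook script, assuming the standard vzdump hook interface (phase, mode, and VMID passed as arguments, VMTYPE exported in the environment) and that `lxc-info -p` reports the container's init PID; the script path and detection logic are illustrative only:

```shell
#!/bin/bash
# vzdump hook script sketch: abort suspend/snapshot-mode container backups
# when the container has an active FUSE mount, so a stuck lxc-freeze can't
# block the rest of the backup job. Hypothetical wiring: add
# "script: /usr/local/bin/fuse-guard.sh" to /etc/vzdump.conf.

# Succeed (exit 0) if the given mounts table lists a FUSE filesystem
# (fstype "fuse" or "fuse.something", e.g. fuse.unionfs).
has_fuse_mount() {
    grep -qE '[[:space:]]fuse(\.[^[:space:]]+)?[[:space:]]' "$1"
}

phase="$1"; mode="$2"; vmid="$3"

if [ "$phase" = "backup-start" ] && [ "$VMTYPE" = "lxc" ] && [ "$mode" != "stop" ]; then
    # lxc-info -p -H prints the container's init PID; that process's mount
    # table shows everything mounted inside the container.
    pid=$(lxc-info -n "$vmid" -p -H 2>/dev/null)
    if [ -n "$pid" ] && has_fuse_mount "/proc/$pid/mounts"; then
        echo "CT $vmid has an active FUSE mount, refusing $mode-mode backup" >&2
        exit 1  # a non-zero exit makes vzdump abort this backup
    fi
fi
```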
 
