NVMe-backed ZFS containers don't mount automatically after boot

twp

New Member
Sep 19, 2019
TL;DR I need to run "zfs mount -O -a" to start my containers after a reboot, and I don't know why.

---------------------

I'm brand new to Proxmox and to ZFS, so I apologize in advance if I'm missing something obvious here. But I've run out of forum & Google searches and seem to have hit a brick wall in terms of properly solving this on my own.

My server is using 2x SSD drives for the OS, and 2x NVMe drives (via PCIe) for the images. ZFS Raid 1 for both. Running 6.0-4. Fully up to date.

The issue I ran into immediately after installing Proxmox was that when my server booted, it would fail because the NVMe-backed pool was not mounted in time. At least that's what I gathered. Either way, I worked around it by setting the following in /etc/default/zfs:

Code:
ZFS_INITRD_PRE_MOUNTROOT_SLEEP='3'
ZFS_INITRD_POST_MODPROBE_SLEEP='3'
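(One note, in case it trips anyone else up: my understanding is that these ZFS_INITRD_* values end up inside the initramfs, so after editing /etc/default/zfs the image has to be rebuilt for the change to take effect.)

Code:
update-initramfs -u -k all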

With those changes, Proxmox booted fine. However, I now run into an issue of containers failing to start:

Job for pve-container@100.service failed because the control process exited with error code.
See "systemctl status pve-container@100.service" and "journalctl -xe" for details. TASK ERROR: command 'systemctl start pve-container@100' failed: exit code 1

If I run the systemctl command:
root@hfx1:~# systemctl status pve-container@100.service
pve-container@100.service - PVE LXC Container: 100
Loaded: loaded (/lib/systemd/system/pve-container@.service; static; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2019-09-18 09:49:00 ADT; 12s ago
Docs: man:lxc-start
man:lxc
man:pct
Process: 30095 ExecStart=/usr/bin/lxc-start -n 100 (code=exited, status=1/FAILURE)

Sep 18 09:48:59 hfx1 systemd[1]: Starting PVE LXC Container: 100...
Sep 18 09:49:00 hfx1 lxc-start[30095]: lxc-start: 100: lxccontainer.c: wait_on_daemonized_start: 856 No such fil
Sep 18 09:49:00 hfx1 lxc-start[30095]: lxc-start: 100: tools/lxc_start.c: main: 330 The container failed to star
Sep 18 09:49:00 hfx1 lxc-start[30095]: lxc-start: 100: tools/lxc_start.c: main: 333 To get more details, run the
Sep 18 09:49:00 hfx1 lxc-start[30095]: lxc-start: 100: tools/lxc_start.c: main: 336 Additional information can b
Sep 18 09:49:00 hfx1 systemd[1]: pve-container@100.service: Control process exited, code=exited, status=1/FAILUR
Sep 18 09:49:00 hfx1 systemd[1]: pve-container@100.service: Failed with result 'exit-code'.
Sep 18 09:49:00 hfx1 systemd[1]: Failed to start PVE LXC Container: 100.

These same errors occur if I try to manually start the containers from the control panel. They also occur if I add a GRUB boot delay of 10 seconds, which I saw suggested online.

However, if I manually run "zfs mount -O -a", I can then start the containers successfully.
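For context on what that does (as I read the zfs man page): -a mounts all available ZFS file systems, and -O performs an overlay mount, i.e. it allows mounting on top of a mountpoint directory that is not empty.

Code:
zfs mount -O -a   # mount everything, allowing overlay mounts
zfs mount         # with no arguments, list the ZFS file systems currently mounted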

So I have a "solution" but to be honest I don't really understand why it works, and what the proper fix is for this ongoing. I feel like there is probably a better solution to resolve this the proper way.

Any ideas or suggestions would be appreciated!
 
Misc system info for reference:
root@hfx1:~# cat /etc/pve/lxc/100.conf
arch: amd64
cores: 1
hostname: mm-slave
memory: 4000
net0: name=eth0,bridge=vmbr0,firewall=1,gw=192.168.2.1,hwaddr=42:C8:89:B2:3D:D7,ip=192.168.2.100/24,ip6=dhcp,type=veth
onboot: 1
ostype: centos
rootfs: images:subvol-100-disk-0,size=26G
swap: 4000
unprivileged: 1

root@hfx1:~# zpool status
pool: images
state: ONLINE
scan: none requested
config:

NAME STATE READ WRITE CKSUM
images ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
nvme-eui.1906708018550001001b448b44b6f15b ONLINE 0 0 0
nvme-eui.19117c8025750001001b448b44eb3b31 ONLINE 0 0 0

errors: No known data errors

pool: rpool
state: ONLINE
scan: none requested
config:

NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-INTEL_SSDSC2KW256G8_PHLA807002E7256CGN-part3 ONLINE 0 0 0
ata-INTEL_SSDSC2KW256G8_PHLA806401HZ256CGN-part3 ONLINE 0 0 0

errors: No known data errors

root@hfx1:~# cat /etc/pve/storage.cfg
dir: local
path /var/lib/vz
content vztmpl,iso,backup

zfspool: local-zfs
pool rpool/data
content rootdir,images
sparse 1

zfspool: images
pool images
content rootdir,images
nodes hfx1

root@hfx1:~# cat /tmp/100.log
#root@hfx1:~# lxc-start -n 100 -l debug -o /tmp/100.log
lxc-start 100 20190918203838.317 INFO confile - confile.c:set_config_idmaps:1673 - Read uid map: type u nsid 0 hostid 100000 range 65536
lxc-start 100 20190918203838.317 INFO confile - confile.c:set_config_idmaps:1673 - Read uid map: type g nsid 0 hostid 100000 range 65536
lxc-start 100 20190918203838.318 INFO lxccontainer - lxccontainer.c:do_lxcapi_start:984 - Set process title to [lxc monitor] /var/lib/lxc 100
lxc-start 100 20190918203838.319 INFO lsm - lsm/lsm.c:lsm_init:50 - LSM security driver AppArmor
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:759 - Processing "reject_force_umount # comment this to allow umount -f; not recommended"
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:do_resolve_add_rule:505 - Set seccomp rule to reject force umounts
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for reject_force_umount action 0(kill)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:do_resolve_add_rule:505 - Set seccomp rule to reject force umounts
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for reject_force_umount action 0(kill)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:do_resolve_add_rule:505 - Set seccomp rule to reject force umounts
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for reject_force_umount action 0(kill)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:do_resolve_add_rule:505 - Set seccomp rule to reject force umounts
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for reject_force_umount action 0(kill)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:759 - Processing "[all]"
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:759 - Processing "kexec_load errno 1"
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for kexec_load action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for kexec_load action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for kexec_load action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for kexec_load action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:759 - Processing "open_by_handle_at errno 1"
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for open_by_handle_at action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for open_by_handle_at action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for open_by_handle_at action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for open_by_handle_at action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:759 - Processing "init_module errno 1"
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for init_module action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for init_module action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for init_module action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for init_module action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:759 - Processing "finit_module errno 1"
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for finit_module action 327681(errno)
lxc-start 100 20190918203838.320 INFO seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for finit_module action 327681(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for finit_module action 327681(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for finit_module action 327681(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:759 - Processing "delete_module errno 1"
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for delete_module action 327681(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for delete_module action 327681(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for delete_module action 327681(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for delete_module action 327681(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:759 - Processing "keyctl errno 38"
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:937 - Added native rule for arch 0 for keyctl action 327718(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:946 - Added compat rule for arch 1073741827 for keyctl action 327718(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:956 - Added compat rule for arch 1073741886 for keyctl action 327718(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:966 - Added native rule for arch -1073741762 for keyctl action 327718(errno)
lxc-start 100 20190918203838.321 INFO seccomp - seccomp.c:parse_config_v2:970 - Merging compat seccomp contexts into main context
lxc-start 100 20190918203838.321 INFO conf - conf.c:run_script_argv:356 - Executing script "/usr/share/lxc/hooks/lxc-pve-prestart-hook" for container "100", config section "lxc"
lxc-start 100 20190918203839.232 DEBUG conf - conf.c:run_buffer:326 - Script exec /usr/share/lxc/hooks/lxc-pve-prestart-hook 100 lxc pre-start with output: cannot open directory //images/subvol-100-disk-0: No such file or directory

lxc-start 100 20190918203839.237 ERROR conf - conf.c:run_buffer:335 - Script exited with status 2
lxc-start 100 20190918203839.237 ERROR start - start.c:lxc_init:861 - Failed to run lxc.hook.pre-start for container "100"
lxc-start 100 20190918203839.237 ERROR start - start.c:__lxc_start:1944 - Failed to initialize container "100"
lxc-start 100 20190918203839.238 DEBUG lxccontainer - lxccontainer.c:wait_on_daemonized_start:853 - First child 30996 exited
lxc-start 100 20190918203839.238 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:856 - No such file or directory - Failed to receive the container state
lxc-start 100 20190918203839.238 ERROR lxc_start - tools/lxc_start.c:main:330 - The container failed to start
lxc-start 100 20190918203839.238 ERROR lxc_start - tools/lxc_start.c:main:333 - To get more details, run the container in foreground mode
lxc-start 100 20190918203839.238 ERROR lxc_start - tools/lxc_start.c:main:336 - Additional information can be obtained by setting the --logfile and --logpriority options
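The prestart hook error above ("cannot open directory //images/subvol-100-disk-0") suggests the dataset simply isn't mounted at the point the container starts. A quick way to check that after a fresh boot, using the dataset name from this setup:

Code:
zfs get mounted,mountpoint images/subvol-100-disk-0
ls /images/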
 
Try setting this, see man zfs:
Code:
     overlay=off|on
       Allow mounting on a busy directory or a directory which already contains files or directories. This is the default
       mount behavior for Linux file systems.  For consistency with OpenZFS on other platforms overlay mounts are off by
       default. Set to on to enable overlay mounts.

# so :
zfs set overlay=on  <zpool name>
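# applied to the pool names in this thread, that would be for example:
zfs set overlay=on images
zfs get overlay images   # verify the property took effect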
 
Thanks @RobFantini - unfortunately the same issue occurs on boot.

My configuration:
Code:
root@hfx1:~# zfs list
NAME                       USED  AVAIL     REFER  MOUNTPOINT
images                    15.3G   884G      104K  /images
images/subvol-100-disk-0  15.2G  10.8G     15.2G  /images/subvol-100-disk-0
rpool                     6.60G   210G      104K  /rpool
rpool/ROOT                6.60G   210G       96K  /rpool/ROOT
rpool/ROOT/pve-1          6.60G   210G     6.60G  /
rpool/data                  96K   210G       96K  /rpool/data


root@hfx1:~# zfs list -o overlay
OVERLAY
     on
     on
    off
    off
    off
    off
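# including the dataset name makes that output easier to read:
zfs list -o name,overlay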
 
We recently started using a lot of NVMe drives, for Ceph and some single-drive ext4. I have seen some warnings in syslog and dmesg.

Check:
Code:
dmesg|grep -i nvme

grep -i nvme  /var/log/syslog

grep -i  zpool /var/log/syslog

grep -i images /var/log/syslog
# and other strings you can think of. There may be some clues about the cause of the problem.
 
Thanks for the reply, @RobFantini. Looking at the logs, the only things that stand out to me are:
root@hfx1:~# grep -i zpool /var/log/syslog
Sep 20 11:21:44 hfx1 zpool[1790]: no pools available to import

root@hfx1:~# grep -i images /var/log/syslog
Sep 20 11:21:53 hfx1 kernel: [ 44.730921] audit: type=1400 audit(1568989313.677:18): apparmor="DENIED" operation="mount" info="failed flags match" error=-13 profile="/usr/bin/lxc-start" name="/images/" pid=3372 comm="mount.zfs" fstype="zfs" srcname="images" flags="rw, strictatime"
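The pool import and the dataset mounting each have their own systemd units, so their logs for the current boot can also be checked directly (assuming the standard units shipped with ZFS on Proxmox):

Code:
systemctl status zfs-import-cache.service zfs-mount.service
journalctl -b -u zfs-import-cache.service -u zfs-mount.service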
 
The line
Code:
Sep 20 11:21:44 hfx1 zpool[1790]: no pools available to import

Check the lines before and after it: was that from system startup or from a CLI attempt?
 
@RobFantini, here it is from a system startup:

Sep 24 20:26:49 hfx1 systemd[1]: Starting Import ZFS pools by cache file...
Sep 24 20:26:49 hfx1 kernel: [ 0.031489] ACPI: MCFG 0x00000000BD334A10 00003C (v01 DELL PE_SC$
Sep 24 20:26:49 hfx1 kernel: [ 0.031491] ACPI: WD__ 0x00000000BD334A50 000134 (v01 DELL PE_SC$
Sep 24 20:26:49 hfx1 kernel: [ 0.031493] ACPI: SLIC 0x00000000BD334B88 000024 (v01 DELL PE_SC$
Sep 24 20:26:49 hfx1 systemd[1]: Mounting /mnt/disk5...
Sep 24 20:26:49 hfx1 kernel: [ 0.031495] ACPI: ERST 0x00000000BD3241EC 000270 (v01 DELL PE_SC$
Sep 24 20:26:49 hfx1 kernel: [ 0.031496] ACPI: HEST 0x00000000BD32445C 000620 (v01 DELL PE_SC$
Sep 24 20:26:49 hfx1 kernel: [ 0.031498] ACPI: BERT 0x00000000BD32402C 000030 (v01 DELL PE_SC$
Sep 24 20:26:49 hfx1 kernel: [ 0.031500] ACPI: EINJ 0x00000000BD32405C 000190 (v01 DELL PE_SC$
Sep 24 20:26:49 hfx1 kernel: [ 0.031502] ACPI: TCPA 0x00000000BD3352E4 000064 (v02 DELL PE_SC$
Sep 24 20:26:49 hfx1 systemd[1]: Mounting /mnt/disk1...
Sep 24 20:26:49 hfx1 kernel: [ 0.031504] ACPI: PC__ 0x00000000BD335274 00006E (v01 DELL PE_SC$
Sep 24 20:26:49 hfx1 kernel: [ 0.031506] ACPI: SRAT 0x00000000BD334DB0 0004C0 (v01 DELL PE_SC$
Sep 24 20:26:49 hfx1 kernel: [ 0.031508] ACPI: SSDT 0x00000000BD338000 00A2B4 (v01 INTEL PPM R$
Sep 24 20:26:49 hfx1 kernel: [ 0.031518] ACPI: Local APIC address 0xfee00000
Sep 24 20:26:49 hfx1 zpool[1723]: no pools available to import
Sep 24 20:26:49 hfx1 kernel: [ 0.031559] SRAT: PXM 1 -> APIC 0x02 -> Node 0
Sep 24 20:26:49 hfx1 kernel: [ 0.031559] SRAT: PXM 2 -> APIC 0x22 -> Node 1
Sep 24 20:26:49 hfx1 systemd[1]: Started Import ZFS pools by cache file.
Sep 24 20:26:49 hfx1 kernel: [ 0.031560] SRAT: PXM 1 -> APIC 0x04 -> Node 0
Sep 24 20:26:49 hfx1 kernel: [ 0.031560] SRAT: PXM 2 -> APIC 0x24 -> Node 1
Sep 24 20:26:49 hfx1 kernel: [ 0.031561] SRAT: PXM 1 -> APIC 0x06 -> Node 0
Sep 24 20:26:49 hfx1 kernel: [ 0.031561] SRAT: PXM 2 -> APIC 0x26 -> Node 1
Sep 24 20:26:49 hfx1 kernel: [ 0.031561] SRAT: PXM 1 -> APIC 0x08 -> Node 0
Sep 24 20:26:49 hfx1 kernel: [ 0.031562] SRAT: PXM 2 -> APIC 0x28 -> Node 1
Sep 24 20:26:49 hfx1 systemd[1]: Reached target ZFS pool import target.
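Since the "Import ZFS pools by cache file" unit reports no pools to import, what the cache file actually contains can also be dumped directly (assuming the default /etc/zfs/zpool.cache location):

Code:
zpool get cachefile             # cachefile property for every imported pool
zdb -C -U /etc/zfs/zpool.cache  # show the pool configurations stored in the cache file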
 
So at boot the system fails to import the zpool, but you can do so after boot. Sounds like a bug somewhere related to the NVMe drivers, or quirks of the particular NVMe model.

I'd try using /etc/rc.local to do zfs mount -O -a. Are you familiar with how to use rc.local with systemd? If not, search for how to do so; I did it once and do not remember the details.
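For anyone landing here later, a minimal sketch of the usual setup on Debian/Proxmox (the stock rc-local.service only runs /etc/rc.local if the file exists and is executable):

Code:
# create /etc/rc.local with the commands to run at boot, then:
chmod +x /etc/rc.local
systemctl daemon-reload
systemctl status rc-local.service   # should now show the unit as loaded for the next boot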
 
Thanks - that's what I ended up having to do. I still feel like I must be doing something wrong, but in case anyone else runs into this:

I configured the following in /etc/rc.local (setup guide):

Code:
#!/bin/bash
# give the NVMe devices a little extra time to settle
sleep 15
# mount all ZFS file systems, overlaying non-empty mountpoints
zfs mount -O -a
# then start all containers (their normal onboot start failed earlier in boot)
for container in $(lxc-ls); do
    pct start $container
done
exit 0

The containers now start up after a reboot.
 
I can confirm having the same problem; I will implement the mentioned workaround for now...
 
@twp I am using two Samsung PM983 960GB drives in a mirrored ZFS setup.

PS: your workaround works, thanks for that!
 
Glad to be of help.

To not have it happen again:

* make sure to set the cachefile option when creating a new zpool (see the sketch below)
* since you usually reboot when there's a new kernel, and the kernel's postinst scripts update the initramfs, this should work out automatically
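A minimal sketch of what that looks like, assuming the pool name from this thread and the default cache file location:

Code:
# point the pool at the standard cache file (this can also be passed as -o cachefile=... to zpool create)
zpool set cachefile=/etc/zfs/zpool.cache images
zpool get cachefile images       # verify
# rebuild the initramfs so the updated cache file is included
update-initramfs -u -k all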
 
How do I set the cachefile option for a new zpool?

Does Proxmox take care of that when the zpool is created from the GUI?
 
