TL;DR I need to run "zfs mount -O -a" to start my containers after a reboot, and I don't know why.
---------------------
I'm brand new to Proxmox and to ZFS, so I apologize in advance if I'm missing something obvious here. But I've run out of forum & Google searches and seem to have hit a brick wall in terms of properly solving this on my own.
My server uses 2x SSDs for the OS and 2x NVMe drives (via PCIe) for the images, each pair in a ZFS RAID 1 mirror. Running Proxmox VE 6.0-4, fully up to date.
The issue I ran into immediately after installing Proxmox was that when my server booted, it would fail because the NVMe drives did not mount in time. At least that's what I gathered. Either way, I worked around it by setting the following in /etc/default/zfs:
ZFS_INITRD_PRE_MOUNTROOT_SLEEP='3'
ZFS_INITRD_POST_MODPROBE_SLEEP='3'
With those changes, Proxmox booted fine. However, I now run into an issue where the containers fail to start:
Job for pve-container@100.service failed because the control process exited with error code.
See "systemctl status pve-container@100.service" and "journalctl -xe" for details. TASK ERROR: command 'systemctl start pve-container@100' failed: exit code 1
If I run the systemctl command:
root@hfx1:~# systemctl status pve-container@100.service
● pve-container@100.service - PVE LXC Container: 100
Loaded: loaded (/lib/systemd/system/pve-container@.service; static; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2019-09-18 09:49:00 ADT; 12s ago
Docs: man:lxc-start
man:lxc
man:pct
Process: 30095 ExecStart=/usr/bin/lxc-start -n 100 (code=exited, status=1/FAILURE)
Sep 18 09:48:59 hfx1 systemd[1]: Starting PVE LXC Container: 100...
Sep 18 09:49:00 hfx1 lxc-start[30095]: lxc-start: 100: lxccontainer.c: wait_on_daemonized_start: 856 No such fil
Sep 18 09:49:00 hfx1 lxc-start[30095]: lxc-start: 100: tools/lxc_start.c: main: 330 The container failed to star
Sep 18 09:49:00 hfx1 lxc-start[30095]: lxc-start: 100: tools/lxc_start.c: main: 333 To get more details, run the
Sep 18 09:49:00 hfx1 lxc-start[30095]: lxc-start: 100: tools/lxc_start.c: main: 336 Additional information can b
Sep 18 09:49:00 hfx1 systemd[1]: pve-container@100.service: Control process exited, code=exited, status=1/FAILUR
Sep 18 09:49:00 hfx1 systemd[1]: pve-container@100.service: Failed with result 'exit-code'.
Sep 18 09:49:00 hfx1 systemd[1]: Failed to start PVE LXC Container: 100.
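In case it helps, the journal lines above are cut off; the lxc-start hint that got truncated is pointing at running the container in the foreground with debug logging. These are standard lxc-start(1) flags (the log path is just my choice):

```shell
# Run container 100 in the foreground with debug logging, writing the
# full messages to a file (-F foreground, -l log priority, -o log file):
lxc-start -n 100 -F -l DEBUG -o /tmp/lxc-100.log

# Full, untruncated journal for the failed service:
journalctl -u pve-container@100.service --no-pager
```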
These same errors occur if I try to start the containers manually from the control panel. They also occur if I add a GRUB boot delay of 10 seconds, which I saw suggested online.
However, if I manually run "zfs mount -O -a", I can then start the containers successfully.
So I have a "solution", but to be honest I don't really understand why it works, or what the proper long-term fix is. I feel like there must be a cleaner way to resolve this.
Any ideas or suggestions would be appreciated!