I have Proxmox 5.0-32 installed on a Dell R710 with a single RAIDZ1 array implemented across six 2TB disks. This array is shared between VMs, containers, and the Proxmox installation itself. I understand this configuration isn't ideal, but it is stable and performant enough for my needs - assuming that I can power cycle it like any other machine.
Any time I reboot after issuing an "apt-get dist-upgrade", some of the ZFS datasets apparently fail to mount at boot. This is unlike the other related issues I have seen here: the pool itself imports successfully, but these strangely named child filesystems are not mounted. Below is an example of the output after being dropped into BusyBox:
Code:
Command: mount -o zfsutil -t zfs rpool/ROOT/pve-1/07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f /root//07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f
Message: filesystem 'rpool/ROOT/pve-1/07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f' cannot be mounted at '/root//07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f' due to canonicalization error 2.
mount: mounting rpool/ROOT/pve-1/07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f on /root//07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f failed: No such file or directory
Error: 2
Failed to mount rpool/ROOT/pve-1/07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f on /root//07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f.
Manually mount the filesystem and exit.
BusyBox v1.22.1 (Debian 1:1.22.0-19+b3) built-in shell (ash)
Enter 'help' for a list of built-in commands.
/bin/sh: can't access tty: job control turned off
/ #
If I attempt to manually run the command, I get the exact same output:
Code:
filesystem 'rpool/ROOT/pve-1/07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f' cannot be mounted at '/root//07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f' due to canonicalization error 2.
mount: mounting rpool/ROOT/pve-1/07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f on /root//07b3e44b4ec33fc185144a325971911c45547f051a1f95a39314afdab787862f failed: No such file or directory
I get no output whatsoever if I run "mkdir /root//<string-goes-here>" before running mount (error 2 looks like ENOENT, i.e. the target directory doesn't exist), but all that accomplishes is mounting an additional, useless filesystem under /root.
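For clarity, this is roughly the workaround sequence at the BusyBox prompt, using the same placeholder as above for the dataset name:
Code:
# at the BusyBox (initramfs) prompt; <string-goes-here> is the dataset name from the error
mkdir /root//<string-goes-here>
mount -o zfsutil -t zfs rpool/ROOT/pve-1/<string-goes-here> /root//<string-goes-here>
# the mount now succeeds, but the filesystem that appears under /root isn't anything useful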
Output of "zpool status":
Code:
/ # zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sda2    ONLINE       0     0     0
            sdb2    ONLINE       0     0     0
            sdc2    ONLINE       0     0     0
            sdd2    ONLINE       0     0     0
            sde2    ONLINE       0     0     0
            sdf2    ONLINE       0     0     0

errors: No known data errors
If I issue the "exit" command from here, the same message is displayed again, but with the hash string changed to reflect the next entry from "zfs list". If I issue the exit command 42 times in a row, the machine finally boots fully and properly. This behavior persists through subsequent reboots, so I currently have to type exit[enter] 42 times after each reboot before Proxmox comes up, whether or not I have run a dist-upgrade.
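As far as I can tell, each "exit" just makes the initramfs retry the next dataset, so the repeated exits should be roughly equivalent to pre-creating the missing directories and mounting everything in one go from the BusyBox prompt. This is only a sketch of what I mean; the /root/<dataset-basename> target is my assumption based on the error messages above, and I haven't verified it is the right mapping:
Code:
# sketch only: pre-create each missing mountpoint and mount the dataset, mirroring
# what the initramfs script tries to do one dataset at a time
for fs in $(zfs list -H -o name -r rpool/ROOT/pve-1 | tail -n +2); do
    dir="/root/${fs#rpool/ROOT/pve-1/}"   # e.g. /root/07b3e44b...  (my assumption)
    mkdir -p "$dir"
    mount -o zfsutil -t zfs "$fs" "$dir"
done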
I've found some related resources, but I'm not able to post the links here.
I have attempted to implement all of them, with no success whatsoever. I can modify system files within /root, and I can run utilities like "update-grub" and "update-initramfs" after working through the exit procedure and getting out of BusyBox, but the problem persists. "zpool import rpool" just errors out, reporting that a pool with that name already exists, which matches the "zpool status" output above.
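For completeness, this is roughly the sequence I've been running once the system is finally up (I may not have the exact flags right from memory):
Code:
# run from the fully booted system, after working through the exit prompts
update-initramfs -u    # rebuild the initramfs
update-grub            # regenerate the GRUB configuration
zpool import rpool     # errors out: a pool with that name already exists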
The simplest solution I have found is to freshly reinstall Proxmox and recreate the ZFS pool. I have reinstalled to address this issue twice, and am now stuck at BusyBox a third time after narrowing down the issue as much as possible. I am new to ZFS and Proxmox and am definitely out of my depth.
Is there something I should try next or any more information I can provide?