Updates break zfs

lweidig

Active Member
Oct 20, 2011
Sheboygan, WI
Ok, so I have a ZFS RAID1 pool for boot on two SSDs and an 8-drive ZFS RAID10 pool for storage. It was working fine, I could reboot the server... until today's updates, which brought a new kernel, GRUB updates, ZFS updates... Now when I reboot, my 8-drive pool does not get mounted, because Proxmox creates the dump, images, private and template folders inside the still-empty mountpoint before the storage is mounted. This is not the first time we have had this issue. Is there something that can be done to fix this permanently? It is very annoying!

So:

Code:
service pveproxy stop
service pvestatd stop
service pvedaemon stop
cd /Storage
# Delete all of the empty directories
rm -rf *
cd /
service zfs-mount start
service pvedaemon start
service pvestatd start
service pveproxy start

And now we are good until next reboot.
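For anyone else hitting this, here is a slightly safer variant of the same cleanup (just a sketch, assuming the pool mounts at /Storage): removing only empty directories with rmdir instead of rm -rf means nothing is lost if real data has already ended up under the mountpoint.

Code:
service pveproxy stop
service pvestatd stop
service pvedaemon stop
# Remove only the empty directories PVE created under the mountpoint, deepest first
find /Storage -mindepth 1 -depth -type d -empty -exec rmdir {} \;
service zfs-mount start
service pvedaemon start
service pvestatd start
service pveproxy start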
 
What I have always done to get my ZFS pools mounted in Proxmox is to set the mountpoint to legacy.

zfs set mountpoint=legacy poolname/datasetname

Then you can mount ZFS datasets through fstab like any other filesystem, and they will mount onto the Proxmox directories that already exist.

example fstab line:

zpoolname/dataset /mountpoint zfs defaults 0 0
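
Putting those two pieces together, a minimal sketch of the whole workflow (the dataset name Storage/data and the mountpoint /Storage are just placeholders based on this thread, adjust to your pool):

Code:
# switch the dataset to legacy mounting so ZFS no longer auto-mounts it
zfs set mountpoint=legacy Storage/data

# register it in /etc/fstab so it is mounted with the other local filesystems
echo 'Storage/data /Storage zfs defaults 0 0' >> /etc/fstab

# mount it right away, no reboot needed
mount /Storage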
 
Hi,
what kernel are you using?
What controller are you using for the disks?
Which repository are you using?
 
Linux version 2.6.32-39-pve (root@lola) (gcc version 4.7.2 (Debian 4.7.2-5) ) #1 SMP Fri May 8 11:27:35 CEST 2015

The two built-in controllers on the SuperMicro X10DRW-iT board (GREAT BOARD BTW - 2x 10G Ethernet):
SATA controller: Intel Corporation C610/X99 series chipset sSATA Controller [AHCI mode] (rev 05)
SATA controller: Intel Corporation C610/X99 series chipset 6-Port SATA Controller [AHCI mode] (rev 05)
The two ZFS boot SSDs are on the first controller, along with 2 of the drives from the 8-drive pool. The other 6 are on the second controller (yes, after building it I realize it would likely have been better to put boot on the 6-port controller and split the data load, but it is too late, at least for now).

Enterprise repository:

# pveversion -v
proxmox-ve-2.6.32: 3.4-156 (running kernel: 2.6.32-39-pve)
pve-manager: 3.4-6 (running version: 3.4-6/102d4547)
pve-kernel-2.6.32-39-pve: 2.6.32-156
pve-kernel-2.6.32-37-pve: 2.6.32-150
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.7-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.10-2
pve-cluster: 3.0-17
qemu-server: 3.4-6
pve-firmware: 1.1-4
libpve-common-perl: 3.0-24
libpve-access-control: 3.0-16
libpve-storage-perl: 3.0-33
pve-libspice-server1: 0.12.4-3
vncterm: 1.1-8
vzctl: 4.0-1pve6
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 2.2-10
ksm-control-daemon: 1.1-1
glusterfs-client: 3.5.2-1
 
Yes, you would think that would do it, but looking at /etc/rc2.d shows:

-rw-r--r-- 1 root root 677 Jul 14 2013 README
lrwxrwxrwx 1 root root 18 Feb 18 23:41 S01bootlogs -> ../init.d/bootlogs
lrwxrwxrwx 1 root root 14 Feb 18 23:41 S01motd -> ../init.d/motd
lrwxrwxrwx 1 root root 19 Feb 18 23:41 S01zfs-mount -> ../init.d/zfs-mount
lrwxrwxrwx 1 root root 17 May 14 09:44 S14rpcbind -> ../init.d/rpcbind
lrwxrwxrwx 1 root root 20 May 14 09:45 S15nfs-common -> ../init.d/nfs-common
lrwxrwxrwx 1 root root 18 May 14 09:45 S17ksmtuned -> ../init.d/ksmtuned
lrwxrwxrwx 1 root root 19 May 14 09:45 S17rrdcached -> ../init.d/rrdcached
lrwxrwxrwx 1 root root 17 May 14 09:45 S17rsyslog -> ../init.d/rsyslog
lrwxrwxrwx 1 root root 18 May 14 09:45 S17vzeventd -> ../init.d/vzeventd
lrwxrwxrwx 1 root root 13 May 19 16:23 S17zed -> ../init.d/zed
lrwxrwxrwx 1 root root 19 May 14 09:45 S17zfs-share -> ../init.d/zfs-share
lrwxrwxrwx 1 root root 13 May 14 09:45 S18atd -> ../init.d/atd
lrwxrwxrwx 1 root root 14 May 15 10:23 S18dbus -> ../init.d/dbus
lrwxrwxrwx 1 root root 17 May 14 15:57 S18ipmievd -> ../init.d/ipmievd
lrwxrwxrwx 1 root root 13 May 14 09:45 S18ntp -> ../init.d/ntp
lrwxrwxrwx 1 root root 17 May 14 09:45 S18postfix -> ../init.d/postfix
lrwxrwxrwx 1 root root 21 May 14 09:45 S18pve-cluster -> ../init.d/pve-cluster
lrwxrwxrwx 1 root root 15 May 14 09:45 S18rsync -> ../init.d/rsync
lrwxrwxrwx 1 root root 13 May 14 09:45 S18ssh -> ../init.d/ssh
lrwxrwxrwx 1 root root 14 May 14 09:45 S19cman -> ../init.d/cman
lrwxrwxrwx 1 root root 14 May 14 09:45 S19cron -> ../init.d/cron
lrwxrwxrwx 1 root root 15 May 15 10:23 S19saned -> ../init.d/saned
lrwxrwxrwx 1 root root 14 May 14 09:45 S20clvm -> ../init.d/clvm
lrwxrwxrwx 1 root root 19 May 14 09:46 S21pvedaemon -> ../init.d/pvedaemon
lrwxrwxrwx 1 root root 22 May 14 09:45 S21pvefw-logger -> ../init.d/pvefw-logger
lrwxrwxrwx 1 root root 21 May 14 09:45 S21qemu-server -> ../init.d/qemu-server
lrwxrwxrwx 1 root root 12 May 14 09:45 S21vz -> ../init.d/vz
lrwxrwxrwx 1 root root 22 May 14 09:45 S22pve-firewall -> ../init.d/pve-firewall
lrwxrwxrwx 1 root root 18 May 14 09:46 S22pveproxy -> ../init.d/pveproxy
lrwxrwxrwx 1 root root 18 May 14 09:46 S22pvestatd -> ../init.d/pvestatd
lrwxrwxrwx 1 root root 19 May 14 09:45 S22rgmanager -> ../init.d/rgmanager
lrwxrwxrwx 1 root root 21 May 14 09:46 S23pve-manager -> ../init.d/pve-manager
lrwxrwxrwx 1 root root 20 May 14 09:46 S23spiceproxy -> ../init.d/spiceproxy
lrwxrwxrwx 1 root root 19 May 14 09:46 S24pvebanner -> ../init.d/pvebanner
lrwxrwxrwx 1 root root 18 May 14 09:46 S24rc.local -> ../init.d/rc.local
lrwxrwxrwx 1 root root 19 May 14 09:46 S24rmnologin -> ../init.d/rmnologin
lrwxrwxrwx 1 root root 23 May 14 09:46 S24stop-bootlogd -> ../init.d/stop-bootlogd


In theory zfs-mount (S01) should run well before the PVE services (S21 and later), but something is going wrong.
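
One way to narrow it down (a sketch only, using the standard Debian sysvinit paths) is to compare what the scripts declare as dependencies against the ordering insserv actually generated:

Code:
# what zfs-mount and the PVE services declare in their LSB headers
grep -E 'Required-Start|Should-Start' /etc/init.d/zfs-mount /etc/init.d/pve-cluster /etc/init.d/pvedaemon

# the start ordering insserv computed from those headers
cat /etc/init.d/.depend.start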
 
Yes, of course I checked the logs, but this all happens early in the boot process. Nothing I see in the logs / dmesg offers any insight.
 
My /var/log/boot:

Code:
Wed Mar 18 22:10:16 2015: Setting parameters of disc: (none).
Wed Mar 18 22:10:16 2015: Setting up LVM Volume Groups...done.
Wed Mar 18 22:10:16 2015: Activating swap...done.
Wed Mar 18 22:10:16 2015: Checking root file system...fsck from util-linux 2.20.1
Wed Mar 18 22:10:16 2015: /dev/mapper/pve-root: clean, 81476/655360 files, 1065694/2621440 blocks
Wed Mar 18 22:10:16 2015: done.
...............
Wed Mar 18 22:10:16 2015: Mounting local filesystems...done.
Wed Mar 18 22:10:16 2015: Activating swapfile swap...done.
Wed Mar 18 22:10:16 2015: Cleaning up temporary files....
Wed Mar 18 22:10:16 2015: Regulating system clock...done.
...............
Wed Mar 18 22:10:45 2015: INIT: Entering runlevel: 2
Wed Mar 18 22:10:45 2015: Checking if zfs userspace tools present.
Wed Mar 18 22:10:46 2015: Importing ZFS pools.
Wed Mar 18 22:10:47 2015: Mounting ZFS filesystems not yet mounted.
Wed Mar 18 22:10:54 2015: Mounting volumes registered in fstab: .
...............
Wed Mar 18 22:11:03 2015: Starting pve cluster filesystem : pve-cluster.
...............
Wed Mar 18 22:11:17 2015: clvmd: cluster not configured.
Wed Mar 18 22:11:17 2015: Starting Sheepdog Server : sheepdog.
Wed Mar 18 22:11:17 2015: Starting PVE Daemon: pvedaemon.
Wed Mar 18 22:11:17 2015: Starting PVE firewall logger: pvefw-logger.
Wed Mar 18 22:11:17 2015: Starting OpenVZ: ..done
Wed Mar 18 22:11:18 2015: Bringing up interface venet0: ..done
Wed Mar 18 22:11:18 2015: Starting Proxmox VE firewall: pve-firewall.
Wed Mar 18 22:11:18 2015: Starting PVE API Proxy Server: pveproxy.
Wed Mar 18 22:11:18 2015: Starting PVE Status Daemon: pvestatd.
Wed Mar 18 22:11:19 2015: Starting VMs and Containers
Wed Mar 18 22:11:19 2015: Starting PVE SPICE Proxy Server: spiceproxy.
Wed Mar 18 22:11:20 2015: Starting rpcbind daemon...Already running..
Wed Mar 18 22:11:20 2015: Starting NFS common utilities: statd idmapd.
Wed Mar 18 22:11:20 2015: Exporting directories for NFS kernel daemon....
Wed Mar 18 22:11:20 2015: Starting NFS kernel daemon: nfsd mountd.
Wed Mar 18 22:11:20 2015: Checking if zfs userspace tools present.
Wed Mar 18 22:11:21 2015: Sharing ZFS filesystems.
...............
 
I stand corrected, I just looked at the wrong files I guess. Thanks Nemesiz for the added post. Here is my /var/log/boot, showing that even though my init.d ordering says zfs-mount should come first, it did not:

Code:
Tue May 19 16:31:50 2015: Starting pve cluster filesystem : pve-cluster.
Tue May 19 16:31:50 2015: clvmd: cluster not configured.
Tue May 19 16:31:50 2015: Starting periodic command scheduler: cron.
Tue May 19 16:31:50 2015: Starting PVE firewall logger: pvefw-logger.
Tue May 19 16:31:51 2015: Starting OpenVZ: ..done
Tue May 19 16:31:51 2015: Bringing up interface venet0: ..done
Tue May 19 16:31:51 2015: Starting Proxmox VE firewall: pve-firewall.
Tue May 19 16:31:52 2015: Starting PVE Daemon: pvedaemon.
Tue May 19 16:31:52 2015: Starting PVE Status Daemon: pvestatd.
Tue May 19 16:31:52 2015: Starting PVE API Proxy Server: pveproxy.
Tue May 19 16:31:53 2015: Starting PVE SPICE Proxy Server: spiceproxy.
Tue May 19 16:31:54 2015: Starting VMs and Containers
Tue May 19 16:31:54 2015: Starting VM 100
Tue May 19 16:31:54 2015: trying to aquire lock... OK
Tue May 19 16:31:54 2015: Starting VM 100 failed: status
Tue May 19 16:31:54 2015: Checking if zfs userspace tools present.
Tue May 19 16:31:55 2015: Importing ZFS pool Storage (using /dev/disk/by-vdev)Importing ZFS pool Storage (using /dev/disk/by-id).
Tue May 19 16:31:59 2015: Mounting ZFS filesystems not yet mountedcannot mount '/Storage': directory is not empty
Tue May 19 16:31:59 2015:  failed!
Tue May 19 16:31:59 2015: Checking if zfs userspace tools present.
Tue May 19 16:32:01 2015: Sharing ZFS filesystems.

I wonder why it happens so late in the boot process on this setup.
 
Hi,
does
Code:
insserv -v
change anything?

Udo
 
Tried that and got:

Code:
# insserv -v
insserv: creating .depend.boot
insserv: creating .depend.start
insserv: creating .depend.stop

Restarted and nothing is different. Not sure why it is not following the order. I might try placing dependencies on the zfs-mount service in the PVE services. The problem is that I suspect every update will overwrite this and break it again.
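
For the record, here is a sketch of what that dependency change might look like: edit the LSB header of /etc/init.d/pve-cluster so it requires zfs-mount, then let insserv recompute the links. The header below is illustrative only, the lines in the shipped script may differ; the actual change is just appending zfs-mount to Required-Start. And as said, the next pve-cluster package update can simply overwrite it again.

Code:
### BEGIN INIT INFO
# Provides:          pve-cluster
# Required-Start:    $remote_fs $network $syslog zfs-mount
# Required-Stop:     $remote_fs $network $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
### END INIT INFO

Then regenerate the boot ordering with:

Code:
insserv -v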
 
It just happened to me with "pve-manager/3.4-11/6502936f (running kernel: 2.6.32-41-pve)".

Do we have a permanent, clean solution to this problem?

Or is it a Proxmox (or Debian) bug that we have to live with?
 
I personally was never able to get a fix for this problem, other than running the commands listed in the first post after every restart. HOWEVER, this node was part of a cluster and I was able to wipe it completely; I am now fully patched and have not experienced the issue since. Not sure if you have that luxury or not. It is pretty annoying without a good workaround though.
 
How about this option?

editing "/etc/rc.local"
adding "zfs mount -O -a"

It works well on a single machine, but now you make me wonder if this is a good patch when the machines are in a cluster!?

I'm also not very excited to blindly overlay, unless Proxmox can certify this is okay.

I'm just trying to move my v2 cluster to Proxmox v3 and it is not easy.
I'm not looking forward to v4 :( .. not that it is not exciting, but there are other tasks on my todo list.
 
