[SOLVED] zfs mount problems after filling filesystem

naudster

I had the unfortunate scenario of completely filling the root ZFS filesystem after a rogue process (my fault) filled /tmp with 22GB of junk. I noticed this within about 30 minutes and removed the file in /tmp. As expected, various processes had hit "no space left on device" errors, so I thought it best to reboot the server. However, upon reboot I was dropped to the recovery shell because the zpool wasn't found. The error message contained:
Code:
cannot import 'rpool': no such pool or dataset.
I ran "zfs import rpool" which seemed to work fine and showed both disks in my mirror as ONLINE. I exited the recovery shell and boot continued, but things weren't right and containers failed to start.

I rebooted again and this time rpool wasn't missing at boot. However, once booted I can see that one disk is now unavailable:


Code:
# zpool status

  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
  scan: scrub repaired 0B in 0h8m with 0 errors on Sun Dec 24 04:52:41 2017
config:

        NAME                                                     STATE     READ WRITE CKSUM
        rpool                                                    DEGRADED     0     0     0
          mirror-0                                               DEGRADED     0     0     0
            15895554979573075729                                 UNAVAIL      0     0     0  was /dev/sde2
            ata-Samsung_SSD_850_EVO_250GB_S3NYNF0J886394T-part2  ONLINE       0     0     0

errors: No known data errors

That's almost expected - it was an old, early-generation SSD, and completely filling it would have stressed the hell out of it. Thankfully I'd paired it with a brand new disk.

Obviously I'll replace the failed disk, but my immediate issue seems to be that the ZFS filesystems other than root aren't mounted, i.e. /rpool, /rpool/data and the specific container subvols under them. This is why the containers aren't starting: their images are just empty directories.

I'm not sure what the proper procedure is for making the system mount these on startup, as it did before, and would appreciate any help.
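For reference, the checks I've been poking at by hand are just the stock zfs/systemd commands (a sketch, nothing exotic):
Code:
# see where each dataset should mount and whether it currently is
zfs get -r canmount,mountpoint,mounted rpool
# the unit that normally runs 'zfs mount -a' at boot
systemctl status zfs-mount.service
# try mounting everything by hand to surface the real error
zfs mount -a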

Some more info:

Code:
# pveversion  -v
proxmox-ve: 5.1-28 (running kernel: 4.13.8-2-pve)
pve-manager: 5.1-36 (running version: 5.1-36/131401db)
pve-kernel-4.13.4-1-pve: 4.13.4-26
pve-kernel-4.13.8-2-pve: 4.13.8-28
pve-kernel-4.10.17-4-pve: 4.10.17-24
pve-kernel-4.10.17-2-pve: 4.10.17-20
pve-kernel-4.10.15-1-pve: 4.10.15-15
pve-kernel-4.10.17-3-pve: 4.10.17-23
pve-kernel-4.10.17-1-pve: 4.10.17-18
libpve-http-server-perl: 2.0-6
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-15
qemu-server: 5.0-17
pve-firmware: 2.0-3
libpve-common-perl: 5.0-20
libpve-guest-common-perl: 2.0-13
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-16
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-2
pve-docs: 5.1-12
pve-qemu-kvm: 2.9.1-2
pve-container: 2.0-17
pve-firewall: 3.0-3
pve-ha-manager: 2.0-3
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.0-2
lxcfs: 2.0.7-pve4
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.3-pve1~bpo9
Code:
# mount | grep zfs
rpool/ROOT/pve-1 on / type zfs (rw,relatime,xattr,noacl)
rpool/ROOT on /rpool/ROOT type zfs (rw,noatime,xattr,noacl)
Code:
# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
rpool                         36.5G  21.1G    96K  /rpool
rpool/ROOT                    15.9G  21.1G    96K  /rpool/ROOT
rpool/ROOT/pve-1              15.9G  21.1G  10.6G  /
rpool/data                    16.3G  21.1G   316K  /rpool/data
rpool/data/appconfig          1.77G  21.1G   946M  /rpool/data/appconfig
rpool/data/subvol-100-disk-1  2.50G  6.95G  1.05G  /rpool/data/subvol-100-disk-1
rpool/data/subvol-102-disk-1  3.66G  21.1G  1.42G  /rpool/data/subvol-102-disk-1
rpool/data/subvol-103-disk-1   500M  7.55G   463M  /rpool/data/subvol-103-disk-1
rpool/data/vm-101-disk-1      4.42G  21.1G  3.94G  -
rpool/data/vm-101-state-Base   320M  21.1G   320M  -
rpool/data/vm-104-disk-1      3.17G  21.1G  3.05G  -
rpool/swap                    4.25G  21.8G  3.59G  -

Note that I've previously created an extra zfs subvolume, appconfig. I've also got multiple snapshots of each subvol should I need to rollback.
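If it does come to that, rolling back is just the standard snapshot workflow (the dataset name below is one of mine; the snapshot name is a placeholder):
Code:
# list the snapshots available for a container subvol
zfs list -t snapshot -r rpool/data/subvol-100-disk-1
# roll the dataset back to a chosen snapshot (discards anything newer on that dataset)
zfs rollback rpool/data/subvol-100-disk-1@<snapshot-name>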
 
Actually, the problem was rather straightforward:
Code:
# systemctl status zfs-mount.service
● zfs-mount.service - Mount ZFS filesystems
   Loaded: loaded (/lib/systemd/system/zfs-mount.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sun 2017-12-31 08:59:46 AEDT; 7h ago
     Docs: man:zfs(8)
  Process: 1809 ExecStart=/sbin/zfs mount -a (code=exited, status=1/FAILURE)
 Main PID: 1809 (code=exited, status=1/FAILURE)
      CPU: 37ms

Dec 31 08:59:46 pve zfs[1809]: cannot mount '/rpool': directory is not empty
Dec 31 08:59:46 pve zfs[1809]: cannot mount '/rpool/data': directory is not empty
Dec 31 08:59:46 pve zfs[1809]: cannot mount '/rpool/data/appconfig': directory is not empty
Dec 31 08:59:46 pve zfs[1809]: cannot mount '/rpool/data/subvol-100-disk-1': directory is not empty
Dec 31 08:59:46 pve zfs[1809]: cannot mount '/rpool/data/subvol-102-disk-1': directory is not empty
Dec 31 08:59:46 pve zfs[1809]: cannot mount '/rpool/data/subvol-103-disk-1': directory is not empty
Dec 31 08:59:46 pve systemd[1]: zfs-mount.service: Main process exited, code=exited, status=1/FAILURE
Dec 31 08:59:46 pve systemd[1]: Failed to start Mount ZFS filesystems.
Dec 31 08:59:46 pve systemd[1]: zfs-mount.service: Unit entered failed state.
Dec 31 08:59:46 pve systemd[1]: zfs-mount.service: Failed with result 'exit-code'.

Some process had recreated all the directory structure (mount points) under /rpool. I rebooted into emergency mode and removed everything under /rpool, and the system booted without issue.
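Roughly what that looked like, for anyone following along (check with mount first and don't delete inside a dataset that is actually mounted):
Code:
# boot with 'systemd.unit=emergency.target' appended to the kernel line in GRUB (press 'e' on the boot entry)
mount | grep zfs            # note which datasets are mounted; here only / (rpool/ROOT/pve-1) was
ls -la /rpool /rpool/data   # the blocking entries were plain, empty directories
rm -rf /rpool/*             # remove the stale mount-point tree
zfs mount -a                # now mounts cleanly
reboot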

As a bonus, my dead disk pulled off a Lazarus!
Code:
# zpool status
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
  scan: resilvered 32.1M in 0h0m with 0 errors on Sun Dec 31 16:36:22 2017
config:

        NAME                                                     STATE     READ WRITE CKSUM
        rpool                                                    ONLINE       0     0     0
          mirror-0                                               ONLINE       0     0     0
            sdf2                                                 ONLINE       0     0     9
            ata-Samsung_SSD_850_EVO_250GB_S3NYNF0J886394T-part2  ONLINE       0     0     0

errors: No known data errors

Sure, a few checksum errors, but that's no problem for me - I run ZFS! I look forward to another 10 years of service from this trusty disk. ;) (Kidding, of course - replacement drive has been ordered!)
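For the record, clearing the logged errors (and swapping in the new drive when it arrives) is just the standard zpool procedure - the device names below are from my pool or placeholders:
Code:
# clear the checksum error counters once you're happy the device is still usable
zpool clear rpool sdf2
# when the replacement arrives (on a Proxmox root mirror the new disk also needs a
# matching partition layout and boot loader before this step)
zpool replace rpool sdf2 /dev/disk/by-id/<new-disk>-part2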

So, marking as solved, but... I plan to change /tmp and /var/tmp over to tmpfs mounts so this particular problem won't bite me again. Any reason why this isn't the Proxmox default?
 

tmpfs eats RAM, of which there is usually less than disk space on a hypervisor node. systemd takes care of cleaning up /tmp on reboots. If you want to and know about the consequences, switching to tmpfs for /tmp is very easy (hint: check out /usr/share/systemd/tmp.mount).
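For anyone acting on that hint, the whole switch is roughly the following (the copied unit can be edited first, e.g. to add a tmpfs size= limit to its Options line):
Code:
cp /usr/share/systemd/tmp.mount /etc/systemd/system/
systemctl enable tmp.mount
systemctl start tmp.mount   # or just reboot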
 
Some process had recreated all the directory structure (mount points) under /rpool. I rebooted into emergency mode and removed everything under /rpool, and the system booted without issue.

Can somebody describe step by step how to do it? I cannot boot into emergency mode with the Proxmox install USB...

unable to find boot disk automatically
 
