[SOLVED] Today's Kernel / Firmware Update has really messed up my boxes!

sirebral

Member
Feb 12, 2022
Oregon, USA
Hi All,



EDIT: Here's the fix. It's clean for most setups; things like Home Assistant OS may or may not work, still finding out. Ultimately, you'll need to create a file in /etc/docker/ called daemon.json. Inside the file put:

Code:
{
    "storage-driver": "vfs"
}

This will restore your old config, and you should be good to go. This only applies if you didn't previously explicitly set the storage driver in Docker's config. Previously Docker detected VFS and used it without issue; if no driver is specified in daemon.json, it now incorrectly picks overlay2, which is not a functional configuration for Docker nested inside LXC on Proxmox.
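Spelled out as a minimal sketch (run inside the affected LXC guest, not on the Proxmox host; /etc/docker/daemon.json is Docker's default config path):

```shell
# Minimal sketch of the fix, assuming a stock Docker install in the guest.
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<'EOF'
{
  "storage-driver": "vfs"
}
EOF

# Restart the daemon and confirm the driver actually changed:
systemctl restart docker
docker info --format '{{.Driver}}'   # should print: vfs
```

Note that switching storage drivers normally makes existing images invisible (they live under the old driver's directory); in this case the images were originally created under vfs, which is why everything reappears after the switch.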



I updated my 2 hosts (not clustered) and my backup box earlier today with the firmware on the free release repo. I've had nothing but trouble since.

1. I run Docker inside LXC containers. All the containers disappeared after the upgrade. I was able to reload a few, as I have the compose files, yet many just won't work anymore. Looks like some sort of permissions error. This happens on 2 physical hosts. Here's an example of the error; it happens on MOST docker containers, though not all, when I try to use compose to bring them back.

"ApplyLayer exit status 1 stdout: stderr: unlinkat /tmp/patch/etc/s6/init: invalid argument"

On the LXC guest, this happens while running docker-compose; it dies before any deployment occurs. Looks to me like something is up with AppArmor/permissions, and there are several threads that point to this, yet no actual resolutions.

I also applied the same patches on my PBS guest, after which nothing could talk to the host anymore, all timeouts. Also, the box would no longer reboot, I had to hard power it down to restore. Luckily, that was virtual, and I was able to restore it. It now works again. It was showing hung process errors, unfortunately, I don't have these errors available as I restored.

Errors on the physical host, pretty generic for backups:

"pve pvestatd[9652]: backup_server: error fetching datastores - 500 Can't connect to 172.16.25.99:8007"

A few other errors of note in the syslog:

"Dec 23 01:41:13 pve kernel: [ 5220.738979] overlayfs: upper fs does not support RENAME_WHITEOUT.
Dec 23 01:41:13 pve kernel: [ 5220.755935] overlayfs: fs on '/var/lib/docker/overlay2/l/4DSKVQHYOJVZ4CNXH6MTZFDSRP' does not support file handles, falling back to xino=off."

"Dec 23 01:32:37 pve systemd[2453747]: /usr/lib/environment.d/99-environment.conf:3: invalid variable name "export PBS_REPOSITORY", ignoring."


So far I have tried pinning the previous kernel, no difference. I also tried the v6 kernel just for kicks... also no difference, and since it's advised not to use it at the moment, I'm not going to pursue that as a solution. I've also run update-grub and update-initramfs just to be sure everything was kosher; both ran with no issues, yet didn't fix anything.
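For anyone wanting to repeat the same rollback test, the previous kernel can be pinned non-interactively with proxmox-boot-tool on PVE 7.x (the version shown here matches the package list below; adjust to whatever `kernel list` reports):

```shell
# List kernels known to the boot loader:
proxmox-boot-tool kernel list

# Pin the previous kernel so it boots by default, then reboot:
proxmox-boot-tool kernel pin 5.15.74-1-pve
reboot

# Once done testing, remove the pin to return to the newest kernel:
proxmox-boot-tool kernel unpin
```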

My boxes are all 100% ZFS, and there was a ZFS update included, hence the mention. I haven't done a ton of troubleshooting on Proxmox/Linux, still learning, so please let me know if I can provide anything else. I'm hoping for a quick fix, as I'm guessing I'm not alone with this latest update. Any help would be appreciated, even if it's rolling back. I back up my physical hosts, yet I'm a little concerned about restoring them when some things are working properly (VMs and containers without nested Docker), and I don't want to mess things up further. If there are particular things I can restore, that'd be an option. I also looked into rolling back the updates, yet couldn't find a great way to do that. Thanks all!

I've also been spammed with this error for quite some time, every minute in syslog. This was there before the patching, yet not sure why:

"Dec 23 01:19:53 pve lxcfs[8066]: utils.c: 324: read_file_fuse: Write to cache was truncated"

Example of errors inside one of the LXCs running Docker that has the issue; note they are all unprivileged. Again, pointing to permission issues.

Dec 22 13:50:23 notify rsyslogd[109]: imklog: cannot open kernel log (/proc/kmsg): Permission denied.
Dec 22 13:50:23 notify rsyslogd[109]: activation of module imklog failed [v8.2112.0 try https://www.rsyslog.com/e/2145 ]
Dec 22 13:50:25 notify ebpf.plugin[342]: PROCFILE: Cannot open file '/proc/442/status'
Dec 22 13:50:25 notify ebpf.plugin[342]: Cannot open /proc/442/status


Installed host software packages, the same versions on both malfunctioning boxes:

Code:
proxmox-ve: 7.3-1 (running kernel: 5.15.83-1-pve)
pve-manager: 7.3-4 (running version: 7.3-4/d69b70d4)
pve-kernel-5.15: 7.3-1
pve-kernel-helper: 7.3-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
pve-kernel-5.15.74-1-pve: 5.15.74-1
ceph-fuse: 14.2.21-1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.3
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.3-1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.3-1
libpve-guest-common-perl: 4.2-3
libpve-http-server-perl: 4.1-5
libpve-storage-perl: 7.3-1
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
openvswitch-switch: 2.15.0+ds1-2+deb11u1
proxmox-backup-client: 2.3.1-1
proxmox-backup-file-restore: 2.3.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.0-1
proxmox-widget-toolkit: 3.5.3
pve-cluster: 7.3-1
pve-container: 4.4-2
pve-docs: 7.3-1
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-7
pve-firmware: 3.6-2
pve-ha-manager: 3.5.1
pve-i18n: 2.8-1
pve-qemu-kvm: 7.1.0-4
pve-xtermjs: 4.16.0-1
qemu-server: 7.3-2
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+2
vncterm: 1.7-1
zfsutils-linux: 2.1.7-pve1
 
Hi All,

I updated my 2 hosts (not clustered) and my backup box earlier today with the firmware on the free release repo. I've had nothing but trouble since.

1. I run Docker inside LXC containers. All the containers disappeared after the upgrade. I was able to reload a few, as I have the compose files, yet many just won't work anymore. Looks like some sort of permissions error. This happens on 2 physical hosts. Here's an example of the error; it happens on MOST docker containers, though not all, when I try to use compose to bring them back.

"ApplyLayer exit status 1 stdout: stderr: unlinkat /tmp/patch/etc/s6/init: invalid argument"

On the LXC guest, this happens while running docker-compose; it dies before any deployment occurs. Looks to me like something is up with AppArmor/permissions, and there are several threads that point to this, yet no actual resolutions.

docker inside LXC is strongly discouraged for a reason. you need to provide more detail (container & storage config, how you installed docker, full logs of host and container, and all error messages) if you want help with this kind of issue.

I also applied the same patches on my PBS guest, after which nothing could talk to the host anymore, all timeouts. Also, the box would no longer reboot, I had to hard power it down to restore. Luckily, that was virtual, and I was able to restore it. It now works again. It was showing hung process errors, unfortunately, I don't have these errors available as I restored.

no errors, no option to help..

Errors on the physical host, pretty generic for backups:

"pve pvestatd[9652]: backup_server: error fetching datastores - 500 Can't connect to 172.16.25.99:8007"

well, that's how pvestatd logs storage errors (additionally, the GUI will mark the storage as "status unknown" as a result, and any actions using it will error out; e.g., a backup job should notify you about such an error)

A few other errors of note in the syslog:

"Dec 23 01:41:13 pve kernel: [ 5220.738979] overlayfs: upper fs does not support RENAME_WHITEOUT.
Dec 23 01:41:13 pve kernel: [ 5220.755935] overlayfs: fs on '/var/lib/docker/overlay2/l/4DSKVQHYOJVZ4CNXH6MTZFDSRP' does not support file handles, falling back to xino=off."

yeah, this is the kernel/overlayfs complaining that ZFS doesn't have full overlay support..

"Dec 23 01:32:37 pve systemd[2453747]: /usr/lib/environment.d/99-environment.conf:3: invalid variable name "export PBS_REPOSITORY", ignoring."

this file is custom

So far I have tried pinning the previous kernel, no difference. I also tried the v6 kernel just for kicks... also no difference, and since it's advised not to use it at the moment, I'm not going to pursue that as a solution. I've also run update-grub and update-initramfs just to be sure everything was kosher; both ran with no issues, yet didn't fix anything.

if booting the previous kernel doesn't help, then it can't be the kernel upgrade (the kernel package includes the ZFS kernel module; the userspace zfs* packages don't do that much on their own). what other packages did you upgrade at the same time? (/var/log/apt/history.log) did you upgrade the packages/software inside the containers as well?
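The relevant entries can be pulled out of the apt history log like this (log path is the Debian default; older transactions get rotated into compressed files):

```shell
# Show the most recent apt transactions, including old/new versions:
grep -E 'Start-Date|Install:|Upgrade:|End-Date' /var/log/apt/history.log | tail -n 8

# Search rotated logs for earlier upgrades:
zgrep -E 'Start-Date|Upgrade:' /var/log/apt/history.log.1.gz | tail -n 4
```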

My boxes are all 100% ZFS, and there was a ZFS update included, hence the mention. I haven't done a ton of troubleshooting on Proxmox/Linux, still learning, so please let me know if I can provide anything else. I'm hoping for a quick fix, as I'm guessing I'm not alone with this latest update. Any help would be appreciated, even if it's rolling back. I back up my physical hosts, yet I'm a little concerned about restoring them when some things are working properly (VMs and containers without nested Docker), and I don't want to mess things up further. If there are particular things I can restore, that'd be an option. I also looked into rolling back the updates, yet couldn't find a great way to do that. Thanks all!

I've also been spammed with this error for quite some time, every minute in syslog. This was there before the patching, yet not sure why:

"Dec 23 01:19:53 pve lxcfs[8066]: utils.c: 324: read_file_fuse: Write to cache was truncated"

Example of errors inside one of the LXCs running Docker that has the issue; note they are all unprivileged. Again, pointing to permission issues.

Dec 22 13:50:23 notify rsyslogd[109]: imklog: cannot open kernel log (/proc/kmsg): Permission denied.
Dec 22 13:50:23 notify rsyslogd[109]: activation of module imklog failed [v8.2112.0 try https://www.rsyslog.com/e/2145 ]

normal in a container

Dec 22 13:50:25 notify ebpf.plugin[342]: PROCFILE: Cannot open file '/proc/442/status'
Dec 22 13:50:25 notify ebpf.plugin[342]: Cannot open /proc/442/status
some sort of netdata plugin? not sure, but probably not affecting docker, just not working correctly in a container?
Installed host software packages, the same versions on both malfunctioning boxes:

[package list identical to the one posted above, snipped]
 
Didn't upgrade the containers themselves today, here's what was upgraded on the hosts. Everything worked before, not so much after.

Code:
Install: pve-kernel-5.15.83-1-pve:amd64 (5.15.83-1, automatic), pve-headers-5.15.83-1-pve:amd64 (5.15.83-1, automatic)
Upgrade: pve-firmware:amd64 (3.6-1, 3.6-2), zfs-zed:amd64 (2.1.6-pve1, 2.1.7-pve1), zfs-initramfs:amd64 (2.1.6-pve1, 2.1.7-pve1), spl:amd64 (2.1.6-pve1, 2.1.7-pve1), libnvpair3linux:amd64 (2.1.6-pve1, 2.1.7-pve1), libuutil3linux:amd64 (2.1.6-pve1, 2.1.7-pve1), libzpool5linux:amd64 (2.1.6-pve1, 2.1.7-pve1), netdata:amd64 (1.37.0-71-nightly, 1.37.0-78-nightly), qemu-server:amd64 (7.3-1, 7.3-2), libpve-access-control:amd64 (7.2-5, 7.3-1), pve-manager:amd64 (7.3-3, 7.3-4), pve-kernel-5.15:amd64 (7.2-14, 7.3-1), libzfs4linux:amd64 (2.1.6-pve1, 2.1.7-pve1), pve-headers-5.15:amd64 (7.2-14, 7.3-1), zfsutils-linux:amd64 (2.1.6-pve1, 2.1.7-pve1)
End-Date: 2022-12-22  12:59:45

I'd roll it all back, yet when I try, apt says I can't, as doing so would uninstall Proxmox....
 
please verify that booting the previous kernel version really does not solve your issue!

furthermore:

you need to provide more detail (container & storage config, how you installed docker, full logs of host and container and all error message) if you want help with such kind of issues.

a basic install of
- current PVE packages with the 5.15.83-1 kernel
- debian 11 container from stock template
- docker.io package installed

seems to work..
 
docker inside LXC is strongly discouraged for a reason. you need to provide more detail (container & storage config, how you installed docker, full logs of host and container, and all error messages) if you want help with this kind of issue.

I've been running this way for a few years without issues, so as much as it may be discouraged, it hasn't been a problem before today. To answer your questions:

Here's an example of a container that loads without any issues.

YAML:
version: "3.1"
services:
  audiobookshelf:
    image: advplyr/audiobookshelf:latest
    environment:
      - AUDIOBOOKSHELF_UID=0
      - AUDIOBOOKSHELF_GID=0
    ports:
      - 13378:80
    volumes:
      - /hybrid/media/library/audiobooks:/audiobooks
      - /hybrid/media/library/podcasts:/podcasts
      - audiobookshelf:/config
      - /hybrid/media/temp/audiobookshelf:/metadata

volumes:
  audiobookshelf:

And one that does not, with the error: "Failed to register layer: ApplyLayer exit status 1 stdout: stderr: unlinkat /etc/ssl/certs: invalid argument". They all have similar errors, referencing stdout/stderr and then "invalid argument" or "permission denied", yet similar nonetheless.


YAML:
version: '2'
services:
  cloudflare-ddns:
    image: oznu/cloudflare-ddns:latest
    restart: always
    environment:
      - API_KEY=<REDACTED>
      - ZONE=<REDACTED>
      - SUBDOMAIN=cotb
      - PROXIED=false


Full text of the compose output:


Code:
failed to pull images of the stack: cloudflare-ddns Pulling df20fa9351a1 Pulling fs layer 63be11c42f3c Pulling fs layer bdca5d7e1d13 Pulling fs layer 1ab4fa5f9d8e Pulling fs layer a180b4b854a6 Pulling fs layer a180b4b854a6 Waiting 1ab4fa5f9d8e Waiting df20fa9351a1 Downloading [> ] 28.75kB/2.798MB bdca5d7e1d13 Downloading [> ] 36.88kB/3.683MB 63be11c42f3c Downloading [> ] 17.8kB/1.725MB 63be11c42f3c Verifying Checksum 63be11c42f3c Download complete df20fa9351a1 Verifying Checksum df20fa9351a1 Download complete df20fa9351a1 Extracting [> ] 32.77kB/2.798MB bdca5d7e1d13 Verifying Checksum bdca5d7e1d13 Download complete df20fa9351a1 Extracting [======> ] 360.4kB/2.798MB df20fa9351a1 Extracting [==================================================>] 2.798MB/2.798MB df20fa9351a1 Pull complete 63be11c42f3c Extracting [> ] 32.77kB/1.725MB 1ab4fa5f9d8e Downloading [> ] 65.54kB/6.337MB a180b4b854a6 Downloading [==============> ] 758B/2.667kB a180b4b854a6 Downloading [==================================================>] 2.667kB/2.667kB a180b4b854a6 Verifying Checksum a180b4b854a6 Download complete 63be11c42f3c Extracting [==================================================>] 1.725MB/1.725MB 63be11c42f3c Extracting [==================================================>] 1.725MB/1.725MB 1ab4fa5f9d8e Verifying Checksum 1ab4fa5f9d8e Download complete 63be11c42f3c Pull complete bdca5d7e1d13 Extracting [> ] 65.54kB/3.683MB bdca5d7e1d13 Extracting [====================> ] 1.507MB/3.683MB bdca5d7e1d13 Extracting [==============================> ] 2.228MB/3.683MB bdca5d7e1d13 Extracting [==================================================>] 3.683MB/3.683MB failed to register layer: ApplyLayer exit status 1 stdout: stderr: unlinkat /etc/ssl/certs: invalid argument

Here's the LXC config for the guest above:

Code:
arch: amd64
cores: 16
features: keyctl=1,nesting=1
hostname: docker.cotb.local.lan
memory: 20480
mp0: /hybrid,mp=/hybrid,mountoptions=noatime,size=0T
net0: name=eth0,bridge=vmbr0,gw=172.16.25.1,hwaddr=1A:75:2D:BD:BF:81,ip=172.16.25.55/24,type=veth
onboot: 1
ostype: ubuntu
parent: vzdump
protection: 1
rootfs: flash:subvol-105-disk-0,mountoptions=noatime,size=206G
startup: order=1,up=30,down=120
swap: 512
unprivileged: 1
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net dev/net none bind,create=dir

Storage, as mentioned, is all ZFS. The guests run on a flash array. Docker was installed using the Debian apt repo. I also tried to remediate using the Docker repo, yet had the same issues.

I've attached journalctl -xe for both the host and the guest.

Lots of errors here. Please let me know what other logs you'd like, and I'll post them straight away.

Thanks!
 

Attachments

  • guestjournal.txt
    212.1 KB · Views: 4
  • hostjournal.txt
    88.1 KB · Views: 3
Looking over the guest journal, I cleaned up a few things, checkmk mainly; we can ignore the "directory not found" errors as well, cleaned those up too, old configs.
 
can you try the following on your setup while monitoring the host journalctl:
- install a new debian 11 container using our standard template and start it
- update it to the current package versions
- follow https://docs.docker.com/engine/install/debian/
- run docker run hello-world
- run docker run -it ubuntu bash

report the output of the first thing that fails, including any journalctl errors printed while it was attempted. all of the above works without issues for a container on ZFS for me (the overlay warning is still printed, but like I said, ZFS simply doesn't support overlayfs 100% yet)..
 
On it. Just FYI, the guests that are having trouble with many of my compose YAMLs can run hello-world without any issues. I'll go through the process regardless.
 
Can you see if you can get this to run with Compose? For me, it won't even complete the build; it errors out before it can.

Code:
version: '2'
services:
  cloudflare-ddns:
    image: oznu/cloudflare-ddns:latest
    restart: always
    environment:
      - API_KEY=<REDACTED>
      - ZONE=<REDACTED>
      - SUBDOMAIN=cotb
      - PROXIED=false
 
yes, this fails as well with

Code:
 docker compose create
[+] Running 2/6
 ⠋ cloudflare-ddns Pulling                                                                                                                                                             3.0s
   ⠿ df20fa9351a1 Pull complete                                                                                                                                                        0.6s
   ⠿ 63be11c42f3c Pull complete                                                                                                                                                        0.6s
   ⠼ bdca5d7e1d13 Extracting      [==================================================>]  3.683MB/3.683MB                                                                               1.4s
   ⠼ 1ab4fa5f9d8e Download complete                                                                                                                                                    1.4s
   ⠼ a180b4b854a6 Download complete                                                                                                                                                    1.4s
failed to register layer: ApplyLayer exit status 1 stdout:  stderr: unlinkat /etc/ssl/certs: invalid argument

but with no other errors anywhere I'd need to know more about docker to debug this ;)

what I can tell you though is that rebooting into the previous kernel (5.15.74-1-pve) makes docker fail to start at all because of missing overlay support => did you maybe configure docker to not use overlay, but now with the new kernel partial overlay support for ZFS is there so that configuration doesn't apply anymore, yet overlay is still broken?
 
with the storage driver switched to "vfs", docker works on both the old and new kernel for me (even after removing /var/lib/docker to force it to pull/recreate everything):

Code:
~/test# docker compose create
[+] Running 6/6
 ⠿ cloudflare-ddns Pulled                                                                                                                                                              2.3s
   ⠿ df20fa9351a1 Pull complete                                                                                                                                                        0.6s
   ⠿ 63be11c42f3c Pull complete                                                                                                                                                        0.6s
   ⠿ bdca5d7e1d13 Pull complete                                                                                                                                                        0.9s
   ⠿ 1ab4fa5f9d8e Pull complete                                                                                                                                                        1.2s
   ⠿ a180b4b854a6 Pull complete                                                                                                                                                        1.3s
[+] Running 2/2
 ⠿ Network test_default              Created                                                                                                                                           0.0s
 ⠿ Container test-cloudflare-ddns-1  Created
 
That's interesting; that's the same compose YAML I've been using for over a year with no problems. That being the case, any idea why it would stop working after this new kernel update? I have about 30 containers that work together to form my media setup, and since I can no longer load them, I'm in a tough position.

I'm not sure that I follow:

did you maybe configure docker to not use overlay, but now with the new kernel partial overlay support for ZFS is there so that configuration doesn't apply anymore, yet overlay is still broken?

Here's what my docker that's failing tells me about its config:

Code:
Client:
 Context:    default
 Debug Mode: false

Server:
 Containers: 5
  Running: 3
  Paused: 0
  Stopped: 2
 Images: 5
 Server Version: 20.10.12
 Storage Driver: overlay2
  Backing Filesystem: zfs
  Supports d_type: true
  Native Overlay Diff: false
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version:
 runc version:
 init version:
 Security Options:
  apparmor
  seccomp
   Profile: default
  cgroupns
 Kernel Version: 5.15.83-1-pve
 Operating System: Ubuntu 22.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 20GiB
 Name: docker
 ID: SUNN:YGTJ:JOKE:KSJP:CMJZ:3FWX:JWFY:7HDA:NUGB:UVMC:SAME:BTQ4
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 
"Storage Driver: overlay2"

;) that one doesn't work for me at all when using ZFS with the previous kernel, as in, the docker daemon doesn't even start then without overriding the storage driver..
 
I assume this is why it just showed up....

From the docker filesystem docs.

"The Docker Engine has a prioritized list of which storage driver to use if no storage driver is explicitly configured, assuming that the storage driver meets the prerequisites, and automatically selects a compatible storage driver."
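That auto-selection behavior is roughly this (an illustrative sketch of the documented priority logic, not Docker's real code; the driver list and support checks here are assumptions):

```shell
# Illustrative sketch of Docker's prioritized storage-driver selection,
# per the docs quoted above. Not Docker's actual implementation.
select_driver() {
    configured="$1"; shift
    supported=" $* "
    if [ -n "$configured" ]; then
        # An explicit daemon.json setting always wins (or errors out).
        case "$supported" in
            *" $configured "*) echo "$configured"; return 0 ;;
        esac
        echo "error: configured driver '$configured' not usable" >&2
        return 1
    fi
    # Otherwise take the first supported driver in priority order.
    for driver in overlay2 fuse-overlayfs btrfs zfs vfs; do
        case "$supported" in
            *" $driver "*) echo "$driver"; return 0 ;;
        esac
    done
}

# New kernel: overlay2 *looks* supported on ZFS, so auto-detection picks it:
select_driver "" overlay2 vfs    # prints: overlay2
# Explicit configuration bypasses the detection entirely:
select_driver vfs overlay2 vfs   # prints: vfs
```

This matches what happened in the thread: the new kernel made overlay2 pass Docker's detection, so the auto-selected driver silently changed from vfs to overlay2.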
 
Looks like the ZFS driver takes a dedicated pool, so that's a no-go. VFS is called out as "do not use" since it causes poor performance. On the previous kernel, what would it have used? That was performant enough for my needs.
 
ok, I changed to vfs and everything came back. This will probably break most docker installs that are running on Proxmox in LXC, so this will hopefully be helpful for people.

If anyone can chime in: is there any way to use a more performant driver with the nested config? I'm about to redo my disks, so I'll probably just set up Docker on ZFS on bare metal, as that seems to be the better solution, yet I'm wondering if there's a better option until I have time to do my rebuild...
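For the bare-metal route mentioned above: the native `zfs` storage driver actually wants `/var/lib/docker` on its own ZFS dataset, not necessarily a whole dedicated pool. A hedged sketch, assuming a pool named `rpool` (adjust to your layout):

```shell
# Stop docker and move any existing data aside first:
systemctl stop docker
mv /var/lib/docker /var/lib/docker.bak

# Create a dedicated dataset mounted at Docker's default root dir:
zfs create -o mountpoint=/var/lib/docker rpool/docker

# Tell docker to use the native zfs driver:
cat > /etc/docker/daemon.json <<'EOF'
{
  "storage-driver": "zfs"
}
EOF
systemctl start docker
docker info --format '{{.Driver}}'   # should print: zfs
```

This only applies to Docker running directly on the host; inside an unprivileged LXC the container can't manage datasets, which is why vfs remains the workaround there.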

Thanks for the assist, Fabian. You may want to spread this info, as there will likely be several others with the same issue, given that Docker defaults to overlay2 even though it's not actually functional in this setup.

Cheers!
 
Note: if you run PBS this way, it won't work with the overlay file system either. So, my suggestion at the moment: stick with VFS. Or, if you're so inclined, move your Docker to the host server and avoid all of this fun! :)
 
VFS can bring performance to a standstill under the right circumstances. Docker does not recommend using it.
 
