Help Moving Storage to ZFS // Docker not working!

phrankme

Member
Sep 1, 2020
Hi all!

I extended my Proxmox setup to a cluster with three nodes. And (!) I added SSDs for ZFS. For all my LXCs and VMs I moved the root disks from local-lvm to my ZFS pool (data). Everything works well, including replication!

Except DOCKER! :( I can move my docker volume in Proxmox to NFS or any other local disk and it works. If I move it to ZFS I get this error:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

DOCKER is simply not running! What exactly am I missing here??


Detailed information below

Bash:
root@reactorlab-3:~# pct config 104
arch: amd64
cores: 4
description: Docker for IoT #20 only%3A%0A- homebridge%0A
features: keyctl=1,nesting=1
hostname: vaultboy
memory: 4048
net0: name=eth0,bridge=vmbr0,firewall=1,gw=10.20.0.1,hwaddr=0E:1B:F7:F4:83:42,ip=10.20.0.111/24,ip6=dhcp,tag=20,type=veth
onboot: 1
ostype: ubuntu
rootfs: local-lvm:vm-104-disk-0,size=64G
swap: 4048
unprivileged: 1
root@reactorlab-3:~#

Bash:
root@reactorlab-3:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content images,iso,vztmpl,snippets
        shared 0

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

nfs: Proxmox
        export /volume1/proxmox
        path /mnt/pve/Proxmox
        server 10.10.0.6
        content vztmpl,snippets,iso,backup,rootdir,images
        prune-backups keep-last=1

nfs: Backup
        export /volume1/_backup
        path /mnt/pve/Backup
        server 10.10.0.6
        content backup
        prune-backups keep-last=2

zfspool: data
        pool data
        content rootdir,images
        mountpoint /data
        nodes reactorlab-2,reactorlab-3
        sparse 0

root@reactorlab-3:~#

Bash:
root@reactorlab-3:~# zpool status
  pool: data
 state: ONLINE
config:

        NAME                                         STATE     READ WRITE CKSUM
        data                                         ONLINE       0     0     0
          mirror-0                                   ONLINE       0     0     0
            ata-WDC_WDS100T1R0A-...  ONLINE       0     0     0
            ata-WDC_WDS100T1R0A-...  ONLINE       0     0     0

errors: No known data errors

Bash:
root@reactorlab-3:~# zfs list
NAME                     USED  AVAIL     REFER  MOUNTPOINT
data                    1.91G   897G      104K  /data
data/subvol-100-disk-0  1.91G  6.10G     1.90G  /data/subvol-100-disk-0

Bash:
root@reactorlab-3:~# mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=16103268k,nr_inodes=4025817,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,size=3227392k,mode=755,inode64)
/dev/mapper/pve-root on / type ext4 (rw,relatime,errors=remount-ro)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=27827)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
/dev/nvme0n1p2 on /boot/efi type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
lxcfs on /var/lib/lxcfs type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
/dev/fuse on /etc/pve type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other)
10.10.0.6:/volume1/_backup on /mnt/pve/Backup type nfs4 (rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.62,local_lock=none,addr=10.10.0.6)
10.10.0.6:/volume1/proxmox on /mnt/pve/Proxmox type nfs4 (rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.62,local_lock=none,addr=10.10.0.6)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
data on /data type zfs (rw,xattr,noacl)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=3227388k,nr_inodes=806847,mode=700,inode64)
data/subvol-100-disk-0 on /data/subvol-100-disk-0 type zfs (rw,xattr,posixacl)

Bash:
root@reactorlab-3:~# zfs get all zpool
cannot open 'zpool': dataset does not exist
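
The error above is consistent with the zfs list output earlier in this post: the only datasets shown are data and data/subvol-100-disk-0, and nothing is named zpool. A minimal sketch of a check that a name exists before querying it; the helper name dataset_exists is made up for illustration and it only reads names from standard input:

```shell
# dataset_exists NAME: read dataset names on stdin (one per line, in the form
# printed by `zfs list -H -o name`) and succeed only if NAME is among them.
# The helper itself is hypothetical; it does not call zfs.
dataset_exists() {
    # -F fixed string, -x whole-line match, -q quiet (exit status only)
    grep -Fxq -- "$1"
}
```

On the node it could be used as: zfs list -H -o name | dataset_exists data && zfs get all data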
 
I have the same issue: I backed up a container from a node using LVM and restored it onto a node with ZFS storage.

Bash:
root@docker-portainer:~# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Docker Buildx (Docker Inc., v0.9.1-docker)
  compose: Docker Compose (Docker Inc., v2.12.2)
  scan: Docker Scan (Docker Inc., v0.21.0)

Server:
ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
errors pretty printing info

Bash:
root@docker-portainer:~# docker ps
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Where it was restored to:
Bash:
root@thin:~# pct config 1311
arch: amd64
cores: 2
features: nesting=1
hostname: docker-portainer
lock: backup
memory: 1500
net0: name=eth0,bridge=vmbr1,firewall=1,hwaddr=0A:74:0D:67:86:36,ip=dhcp,tag=30,type=veth
onboot: 0
ostype: debian
rootfs: zfs-on-thin:subvol-1311-disk-0,size=30G
swap: 512
unprivileged: 1


Where it came from:
Bash:
root@hpnote:~# pct config 1311
arch: amd64
cores: 2
features: nesting=1
hostname: docker-portainer
memory: 1500
net0: name=eth0,bridge=vmbr1,firewall=1,hwaddr=0A:74:0D:67:86:36,ip=dhcp,tag=30,type=veth
onboot: 1
ostype: debian
rootfs: local:1311/vm-1311-disk-0.raw,size=30G
swap: 512
unprivileged: 1

Bash:
root@thin:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content vztmpl,backup,iso

lvmthin: local-lvm
        thinpool data
        vgname pve
        content rootdir,images

zfspool: zfs-on-thin
        pool zfs-on-thin
        content rootdir,images
        mountpoint /zfs-on-thin
        nodes thin
        sparse 1

nfs: backup
        export /mnt/HD/HD_b2/xfer/proxmox
        path /mnt/pve/backup
        server 192.168.1.16
        content vztmpl,backup,snippets,iso
        prune-backups keep-all=1

Bash:
root@thin:~# zpool status
  pool: zfs-on-thin
 state: ONLINE
config:

        NAME                                    STATE     READ WRITE CKSUM
        zfs-on-thin                             ONLINE       0     0     0
          ata-LITEON_CV3-8D256-HP_0027321000SS  ONLINE       0     0     0

errors: No known data errors

The Docker daemon failed to start and is not running:

Code:
root@docker-portainer:~# systemctl status docker
* docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sun 2022-11-27 23:20:14 UTC; 1min 18s ago
TriggeredBy: * docker.socket
       Docs: https://docs.docker.com
    Process: 325 ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock (code=exited, status=1/FAILURE)
   Main PID: 325 (code=exited, status=1/FAILURE)
        CPU: 126ms

Nov 27 23:20:12 docker-portainer dockerd[325]: time="2022-11-27T23:20:12.842303343Z" level=error msg="[graphdriver] prior storage driver overlay2 failed: driver not supported"
Nov 27 23:20:12 docker-portainer dockerd[325]: failed to start daemon: error initializing graphdriver: driver not supported
Nov 27 23:20:12 docker-portainer systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Nov 27 23:20:12 docker-portainer systemd[1]: docker.service: Failed with result 'exit-code'.
Nov 27 23:20:12 docker-portainer systemd[1]: Failed to start Docker Application Container Engine.
Nov 27 23:20:14 docker-portainer systemd[1]: docker.service: Scheduled restart job, restart counter is at 3.
Nov 27 23:20:14 docker-portainer systemd[1]: Stopped Docker Application Container Engine.
Nov 27 23:20:14 docker-portainer systemd[1]: docker.service: Start request repeated too quickly.
Nov 27 23:20:14 docker-portainer systemd[1]: docker.service: Failed with result 'exit-code'.
Nov 27 23:20:14 docker-portainer systemd[1]: Failed to start Docker Application Container Engine.
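
The "prior storage driver overlay2 failed" line suggests that Docker found data from the earlier overlay2 setup in its data directory and tried to reuse that driver on the new storage. A minimal sketch for listing which driver directories are present, assuming the default data root /var/lib/docker and that each driver keeps its data in a subdirectory named after itself; the function name list_prior_drivers and the selection of driver names checked are illustrative choices, not taken from this thread:

```shell
# list_prior_drivers [ROOT]: print the names of storage-driver directories
# that exist under ROOT (default: /var/lib/docker, Docker's usual data root).
# The list of driver names below is an assumption and not exhaustive.
list_prior_drivers() {
    root="${1:-/var/lib/docker}"
    for driver in overlay2 fuse-overlayfs btrfs zfs vfs; do
        if [ -d "$root/$driver" ]; then
            echo "$driver"
        fi
    done
}
```

Run inside the container as list_prior_drivers; an output of overlay2 would match the log above.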
 
You have to tell docker to use its zfs storage driver.

In the file /etc/docker/daemon.json, add this:

{ "storage-driver": "zfs" }

Then restart the docker daemon.

See here for more info.
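
The steps above could be scripted roughly as follows. This is only a sketch: the function name write_storage_driver is made up, the optional path argument exists only so it can be tried against a scratch file, and it deliberately refuses to touch an existing config instead of merging into it.

```shell
# write_storage_driver [PATH]: create Docker's daemon config with the zfs
# storage driver selected. PATH defaults to /etc/docker/daemon.json.
write_storage_driver() {
    target="${1:-/etc/docker/daemon.json}"
    # Do not clobber an existing config; merge the key by hand in that case.
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing $target" >&2
        return 1
    fi
    mkdir -p "$(dirname "$target")"
    printf '{ "storage-driver": "zfs" }\n' > "$target"
}
```

Usage would be: write_storage_driver && systemctl restart docker. Whether the daemon then starts still depends on the zfs driver being usable in that environment.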
 
That is only required if you do not have a ZFS dataset mounted at /var/lib/docker. A dedicated dataset is the best way to get the benefits of ZFS and keep everything encapsulated. In addition to the "normal" Docker ZFS storage driver, I also recommend installing the ZFS volume driver for Docker, which creates each volume directly as its own ZFS dataset instead of as a volume ON your Docker ZFS dataset in /var/lib/docker/volumes.
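
To see whether /var/lib/docker actually sits on ZFS, a quick check along these lines may help. It is a sketch that assumes GNU coreutils stat (which reports the filesystem type name, "zfs" for ZFS); the function names backing_fs and is_zfs_backed are made up for illustration.

```shell
# backing_fs [PATH]: print the type of the filesystem that holds PATH
# (default: /var/lib/docker), as reported by GNU stat.
backing_fs() {
    stat -f -c %T "${1:-/var/lib/docker}"
}

# is_zfs_backed [PATH]: succeed only if PATH lives on a ZFS filesystem.
is_zfs_backed() {
    [ "$(backing_fs "$1")" = "zfs" ]
}
```

For example: is_zfs_backed && echo "Docker data root is on ZFS"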
 
There is no daemon.json in that folder, so I created one and I'm still unable to start docker.

As docker with proxmox migration to zfs is obviously flaky/broken I will have to painstakingly rebuild the whole VM.

Thanks anyway for your input.

JayS
 
As docker with proxmox migration to zfs is obviously flaky/broken I will have to painstakingly rebuild the whole VM.
That has nothing to do with PVE, just Docker.

The easiest way is to start over:
  • remove everything below /var/lib/docker (this will delete EVERYTHING, so make backups if necessary)
  • create a new ZFS dataset and set the mountpoint to /var/lib/docker
  • start Docker and enjoy Docker on ZFS
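
The three steps above can be sketched as a small dry-run script. Assumptions, none of which come from this thread: the dataset name data/docker is only an example, the script is run where both the zfs tools and Docker's systemd units are available, and GNU find is installed. By default it only prints the commands; nothing is executed unless APPLY=1 is set.

```shell
# Example values; DATASET is a hypothetical name, adjust to your pool.
DATASET="${DATASET:-data/docker}"
DOCKER_ROOT="${DOCKER_ROOT:-/var/lib/docker}"

# run CMD...: print the command, and execute it only when APPLY=1.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

fresh_docker_on_zfs() {
    # Stop Docker (and its socket unit, which would otherwise re-trigger it).
    run systemctl stop docker.socket docker.service
    # DESTRUCTIVE: removes everything below the Docker data root.
    run find "$DOCKER_ROOT" -mindepth 1 -delete
    # Create a new dataset mounted at the Docker data root.
    run zfs create -o mountpoint="$DOCKER_ROOT" "$DATASET"
    run systemctl start docker
}
```

Calling fresh_docker_on_zfs prints the four commands for review; APPLY=1 fresh_docker_on_zfs would run them, so make backups first as noted above.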
 
