PVE 6.3 - Clean Install - Failed to start Import ZFS pool rpool

Jota V.

I've installed PVE 6.3 on a Dell R620 with this disk config:

- 2 TB SATA 7k OS disk as ext4 with LVM, with 0 GB swap (we have 256 GB RAM) and 0 GB for VMs
- 2 TB SATA 7k disk, unused, wiped with the LSI utility and "sgdisk -Z"
- 4 x 2 TB enterprise SSDs (brand new, never used)

After installing and updating packages, I created a ZFS pool before adding the node to our cluster.

From the GUI I created a ZFS RAID 10 pool using the four SSDs, compression on, named "rpool", without adding it as storage.
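
For reference, the GUI step corresponds roughly to the following CLI sketch; the device paths below are placeholders, not our actual disk IDs.

Code:
# Rough CLI equivalent of the GUI "ZFS RAID10" creation (device paths are
# placeholders; the GUI defaults to ashift=12 and uses /dev/disk/by-id paths).
zpool create -o ashift=12 -O compression=on rpool \
    mirror /dev/disk/by-id/ata-SSD_1 /dev/disk/by-id/ata-SSD_2 \
    mirror /dev/disk/by-id/ata-SSD_3 /dev/disk/by-id/ata-SSD_4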

From console:

Code:
zfs set compression=on rpool
zfs create rpool/data
zfs set atime=off rpool/data
zfs set xattr=sa rpool/data
zfs set dnodesize=auto rpool/data
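
The properties can be verified afterwards, for example:

Code:
# Check that the properties were applied to the dataset.
zfs get compression,atime,xattr,dnodesize rpool/data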

Then I added the node to the cluster. All our LXC containers and VMs are on rpool/data, and we can migrate them to the new server. Fine!

When the server boots we see "Failed to start Import ZFS pool rpool", but rpool is actually imported correctly.

The result of the systemctl checks:

Code:
root@vcloud06:~# systemctl list-units --failed
  UNIT                     LOAD   ACTIVE SUB    DESCRIPTION
● zfs-import@rpool.service loaded failed failed Import ZFS pool rpool


LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.


1 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

Code:
# systemctl status zfs-import@rpool.service
● zfs-import@rpool.service - Import ZFS pool rpool
   Loaded: loaded (/lib/systemd/system/zfs-import@.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2020-12-28 16:39:42 CET; 1min 38s ago
     Docs: man:zpool(8)
  Process: 940 ExecStart=/sbin/zpool import -N -d /dev/disk/by-id -o cachefile=none rpool (code=exited, status=1/FAILURE)
Main PID: 940 (code=exited, status=1/FAILURE)

Dec 28 16:39:41 vcloud06 systemd[1]: Starting Import ZFS pool rpool...
Dec 28 16:39:42 vcloud06 zpool[940]: cannot import 'rpool': pool already exists
Dec 28 16:39:42 vcloud06 systemd[1]: zfs-import@rpool.service: Main process exited, code=exited, status=1/FAILURE
Dec 28 16:39:42 vcloud06 systemd[1]: zfs-import@rpool.service: Failed with result 'exit-code'.
Dec 28 16:39:42 vcloud06 systemd[1]: Failed to start Import ZFS pool rpool.
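
For completeness, here is how the other ZFS import units can be listed (assuming the standard units shipped with zfsutils-linux):

Code:
# List the ZFS import units and the state of the cache-based import.
systemctl list-unit-files 'zfs-import*'
systemctl status zfs-import-cache.service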

Code:
root@vcloud06:~# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
rpool  3.47T  1.05M  3.47T        -         -     0%     0%  1.00x    ONLINE  -

Code:
root@vcloud06:~# zpool status
pool: rpool
state: ONLINE
  scan: none requested
config:


    NAME                                             STATE     READ WRITE CKSUM
    rpool                                            ONLINE       0     0     0
      mirror-0                                       ONLINE       0     0     0
        ata-KINGSTON_SEDC500M1920G_50026B76839BE9D4  ONLINE       0     0     0
        ata-KINGSTON_SEDC500M1920G_50026B76839BF05C  ONLINE       0     0     0
      mirror-1                                       ONLINE       0     0     0
        ata-KINGSTON_SEDC500M1920G_50026B76839BF3A2  ONLINE       0     0     0
        ata-KINGSTON_SEDC500M1920G_50026B76839BEA6B  ONLINE       0     0     0


errors: No known data errors

Code:
root@vcloud06:~# zfs list
NAME         USED  AVAIL     REFER  MOUNTPOINT
rpool        996K  3.36T       96K  /rpool
rpool/data    96K  3.36T       96K  /rpool/data

My versions (output of pveversion -v):

Code:
proxmox-ve: 6.3-1 (running kernel: 5.4.73-1-pve)
pve-manager: 6.3-2 (running version: 6.3-2/22f57405)
pve-kernel-5.4: 6.3-1
pve-kernel-helper: 6.3-1
pve-kernel-5.4.73-1-pve: 5.4.73-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.4-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.16-pve1
libproxmox-acme-perl: 1.0.5
libproxmox-backup-qemu0: 1.0.2-1
libpve-access-control: 6.1-3
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.2-6
libpve-guest-common-perl: 3.1-3
libpve-http-server-perl: 3.0-6
libpve-storage-perl: 6.3-1
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 4.0.3-1
lxcfs: 4.0.3-pve3
novnc-pve: 1.1.0-1
proxmox-backup-client: 1.0.5-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.4-3
pve-cluster: 6.2-1
pve-container: 3.3-1
pve-docs: 6.3-1
pve-edk2-firmware: 2.20200531-1
pve-firewall: 4.1-3
pve-firmware: 3.1-3
pve-ha-manager: 3.1-1
pve-i18n: 2.2-2
pve-qemu-kvm: 5.1.0-7
pve-xtermjs: 4.7.0-3
qemu-server: 6.3-1
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-2
zfsutils-linux: 0.8.5-pve1
 
Dec 28 16:39:42 vcloud06 zpool[940]: cannot import 'rpool': pool already exists
Are you using shared storage?
Is rpool accessible multiple times?
The message speaks for itself IMHO...
 
The pool rpool exists on every node of the cluster with the same name. It's not shared between nodes; it's local storage.
 
There is also an option to use ZFS over iSCSI, hence my question.
So the question is: why does the system try to import the pool twice?
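
One way to check (a sketch, assuming the default cache file location) is whether the pool is also recorded in /etc/zfs/zpool.cache, which zfs-import-cache.service imports in parallel with the per-pool zfs-import@rpool instance:

Code:
# Dump the pool configurations stored in the cache file; if rpool shows up
# here, two import paths are racing for the same pool at boot.
zdb -C -U /etc/zfs/zpool.cache | grep -E "^[[:space:]]*name:"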
 
Well, that wasn't really obvious for me from your first post. Apologies.
You could try recreating the zpool cache file and see if that helps. Maybe for whatever reason the pool was put in there twice?
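
Something along these lines (an untested sketch; adjust the pool name if yours differs):

Code:
# Point the pool at the standard cache file so the cache entry is rewritten
# from the current pool configuration.
zpool set cachefile=/etc/zfs/zpool.cache rpool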
How many hosts are in your cluster?
 
Don't worry :)

Yes, I've deleted the pool, run "sgdisk -Z" on the pool disks, deleted /etc/zfs/zpool.cache, rebooted, and created the pool again, and I get the same error. Another node installed today shows the same error, but that node only has two SATA 7k disks: one ext4 for PVE 6.3 and one single-disk (RAID-0) ZFS pool.

We have a cluster with five nodes (PVE 6.1-6.2) and we have added these two nodes with this problem.

All nodes have a similar config: one rotational SATA disk or SSD for PVE 6.x (ext4 with LVM) and four rotational disks or SSDs for the ZFS pool.
 
That's odd.
Have you tried another pool name to see if the issue moves with it?
 
Sorry, but I don't get what you want to tell me ;)
Does it work with a different pool name?
 
If I create a pool with a different name, I'll need to add another ZFS storage to the cluster; then replication will fail and we won't be able to migrate.
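
For context, the storage entry in /etc/pve/storage.cfg references the pool name and has to exist with the same storage ID on every node for replication and migration; the entry looks roughly like this (the storage ID "vm-zfs" is just an example):

Code:
# Illustrative /etc/pve/storage.cfg entry; "vm-zfs" is an example ID.
zfspool: vm-zfs
        pool rpool/data
        content images,rootdir
        sparse 1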
 
I've done the same test on virtual machines:

PVE 6.3 clean install: same error. Updated to the latest packages from the no-subscription repos and still got the same error.
PVE 6.2 clean install: works OK.
PVE 6.2 upgraded to PVE 6.3 with the no-subscription repos: no error!

We'll roll back to PVE 6.2 and report this.
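
If it turns out that only the redundant per-pool import unit fails while the cache-based import already brings rpool up, a possible workaround (an assumption on our side, not an official fix) would be to disable that instance:

Code:
# Possible workaround (untested): if zfs-import-cache.service already imports
# rpool, the failing per-pool instance can be disabled. Re-enable it if the
# cache-based import is ever turned off.
systemctl disable zfs-import@rpool.service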
 
