ZFS: raid disks

Could /sys/block/sd[x]/device be used as a workaround?

that's not a block device, so no?

nothing besides the device nodes directly in /dev works there (the symlinks all exist and point to the right devices, but zfs refuses to work with them...)
 
this is not to disagree with any other suggestion,

for stretch I've found WWNs are the easiest way of tracking which drive is which on a zfs or ceph setup.
Code:
ls -l /dev/disk/by-id/ | grep -v part | grep wwn

then
Code:
zpool create -f -o ashift=12 tank mirror wwn-0x55cd2e40xxxxxxxx wwn-0x55cd2e4088888888

is there a better way to ID drives for a mirror?
 
there is also by-path, or you can do your own IDs (either via udev or via vdev_id).
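
A minimal sketch of the vdev_id route (the alias names and by-path targets below are only examples): put the mappings in /etc/zfs/vdev_id.conf
Code:
alias slot0 /dev/disk/by-path/pci-0000:03:00.0-sas-phy0-lun-0
alias slot1 /dev/disk/by-path/pci-0000:03:00.0-sas-phy1-lun-0

then let udev regenerate the links and use them for the pool
Code:
udevadm trigger
ls -l /dev/disk/by-vdev/
# after 'zpool export tank', re-import the pool using the aliases:
zpool import -d /dev/disk/by-vdev/ tank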
 
You can safely do it from initrd:
- reboot, and when grub displays the menu, press 'e' on the first line
- edit the linux line so it has break=mount at the end and then press ctrl-x
- when you get the prompt, do 'modprobe zfs'
- now do a 'zpool import -d /dev/disk/by-id/ rpool'
- verify status with 'zpool status' - you should see device ids in place of the standard device names
- ctrl-d to continue booting

From now on you should see your pool based on device ids.
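
Condensed, those steps boil down to running this at the initramfs prompt (pool name rpool as above):
Code:
modprobe zfs
zpool import -d /dev/disk/by-id/ rpool
zpool status    # the vdevs should now show the by-id names
# ctrl-d to continue booting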

Followed these steps but the pool still shows up with sd* names

EDIT: please ignore, my mistake, it is working properly now
 
I'm trying to simulate a disk failure, with ZFS using by-path names instead of sd*

I've set "autoreplace=on" on rpool, then I removed one of the working SSDs. ZFS properly complained about this, but now I'm unable to replace the disk:

Code:
# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 0h0m with 0 errors on Thu Nov  9 11:35:01 2017
config:

    NAME                                       STATE     READ WRITE CKSUM
    rpool                                      DEGRADED     0     0     6
     raidz2-0                                 DEGRADED     0     0    12
       pci-0000:03:00.0-sas-phy0-lun-0-part2  ONLINE       0   137     0
       pci-0000:03:00.0-sas-phy1-lun-0-part2  ONLINE       0     0     3
       pci-0000:03:00.0-sas-phy2-lun-0-part2  OFFLINE      0     0     1
       pci-0000:03:00.0-sas-phy3-lun-0-part2  ONLINE       0     0     0

errors: No known data errors



# sgdisk --replicate /dev/disk/by-path/pci-0000\:03\:00.0-sas-phy3-lun-0 /dev/disk/by-path/pci-0000\:03\:00.0-sas-phy2-lun-0
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you
run partprobe(8) or kpartx(8)
The operation has completed successfully.



# sgdisk --randomize-guids /dev/disk/by-path/pci-0000\:03\:00.0-sas-phy2-lun-0
The operation has completed successfully.



# zpool replace rpool /dev/disk/by-path/pci-0000\:03\:00.0-sas-phy2-lun-0
invalid vdev specification
use '-f' to override the following errors:
/dev/disk/by-path/pci-0000:03:00.0-sas-phy2-lun-0-part1 is part of active pool 'rpool'



# zpool replace -f rpool /dev/disk/by-path/pci-0000\:03\:00.0-sas-phy2-lun-0
invalid vdev specification
the following errors must be manually repaired:
/dev/disk/by-path/pci-0000:03:00.0-sas-phy2-lun-0-part1 is part of active pool 'rpool'


# zpool offline rpool /dev/disk/by-path/pci-0000\:03\:00.0-sas-phy2-lun-0-part1 
cannot offline /dev/disk/by-path/pci-0000:03:00.0-sas-phy2-lun-0-part1: no such device in pool

WTF? How can I replace a member disk?
 
have you tried using "zpool online rpool /dev/..."? it seems you are trying to replace an offlined disk with itself. replace is for replacing with a NEW disk.
 
Code:
# zpool online rpool /dev/disk/by-path/pci-0000\:03\:00.0-sas-phy2-lun-0
cannot online /dev/disk/by-path/pci-0000:03:00.0-sas-phy2-lun-0: no such device in pool

the pool is made of "-part2" devices (why are you creating partitions and not using whole disks?). I've offlined the disk, cleared its partition table with "dd" to simulate a new disk, and now I'm trying to put it back online, but "-part2" is not seen by the kernel, maybe due to the "sgdisk" warning.
I took the sgdisk command directly from your wiki, but PVE includes neither partprobe nor kpartx (you should ship parted by default).

I've installed parted and run partprobe, but the new partitioning scheme is still not recognized because the disk seems to be in use, even though I've already offlined it!

EDIT:
anyway, is the wiki correct? Every time I have to replace a disk, do I have to run sgdisk and the rest of the procedure (like we need to do with mdadm)? ZFS should be able to handle all of that automatically...
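
(Side note: util-linux, which is part of the base system, should also be able to ask the kernel to re-read the table without installing parted; the device name below is just an example, and it only works if nothing still holds the disk.)
Code:
blockdev --rereadpt /dev/sdc
# or, per partition:
partx -u /dev/sdc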
 
you replicated from the offline disk to an in-use one, hence the kernel keeps the old table in use.

I don't really know what you are doing, and without complete logs of the commands you are attempting, I can't really tell..

if you need to replace a failed disk with a new one, you just issue "zpool replace POOL OLD NEW" and it works. I think the errors you are encountering are because you are messing up when "clearing" the "fake failed" disk.
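
A minimal sketch of that flow with the by-path name from the status output above, assuming the new disk sits at the same phy2 path and already carries a partition table with a -part2 (e.g. after the sgdisk copy from the wiki; for a boot pool the bootloader steps are still needed on top):
Code:
# single-argument form: zpool assumes the new device sits at the old vdev's path
zpool replace rpool pci-0000:03:00.0-sas-phy2-lun-0-part2
zpool status rpool    # the vdev should resilver onto the new disk

# if the replacement shows up at a different path, name old and new explicitly
# (phy4 here is purely hypothetical):
zpool replace rpool pci-0000:03:00.0-sas-phy2-lun-0-part2 /dev/disk/by-path/pci-0000:03:00.0-sas-phy4-lun-0-part2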
 
What I did was move from using /dev/sd* to /dev/disk/by-path/*

When importing with by-path, ZFS imported via partition #2 and not the whole disk, thus (maybe I'm wrong), when simulating a failure, I offlined partition #2 (as shown in zpool status), then tried to clear and remove the whole disk. Obviously this fails, because ZFS is still using partitions #1 and #9 (both created by ZFS itself).

I know that usually the only needed command is "zpool replace POOL OLD NEW", but the PVE wiki says that some more commands are required (I don't know why, usually ZFS takes care of the partitioning)

Tomorrow I'll start from scratch again.
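
(For the retry, this is roughly the order the wiki procedure seems to intend, using the same by-path names as above with phy3 as the healthy source and phy2 as the new disk; a sketch, not a verified recipe.)
Code:
# copy the partition table FROM the healthy disk TO the new one, then give it new GUIDs
sgdisk --replicate=/dev/disk/by-path/pci-0000:03:00.0-sas-phy2-lun-0 \
       /dev/disk/by-path/pci-0000:03:00.0-sas-phy3-lun-0
sgdisk --randomize-guids /dev/disk/by-path/pci-0000:03:00.0-sas-phy2-lun-0
partprobe /dev/disk/by-path/pci-0000:03:00.0-sas-phy2-lun-0

# then put the freshly created -part2 back into the pool, as in the reply above
zpool replace rpool pci-0000:03:00.0-sas-phy2-lun-0-part2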
 
You can safely do it from initrd:
- reboot, and when grub displays the menu, press 'e' on the first line
- edit the linux line so it has break=mount at the end and then press ctrl-x
- when you get the prompt, do 'modprobe zfs'
- now do a 'zpool import -d /dev/disk/by-id/ rpool'
- verify status with 'zpool status' - you should see device ids in place of the standard device names
- ctrl-d to continue booting

From now on you should see your pool based on device ids.

Hi folks, I followed this but used vdev_id.conf, and it imported just the partition, not the whole disk. It works, but I'd like to know if that's OK and whether there is a way to import the disk instead of the partition.

Code:
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:02:22 with 0 errors on Sun Nov 10 00:26:23 2019
config:

        NAME           STATE     READ WRITE CKSUM
        rpool          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            d00-part3  ONLINE       0     0     0
            d01-part3  ONLINE       0     0     0

errors: No known data errors
Code:
alias d00 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0
alias d01 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0
alias d02 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0
alias d03 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:3:0
alias d04 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:4:0
alias d05 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:5:0
alias d06 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:6:0
alias d07 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:7:0
alias d08 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:8:0
alias d09 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:9:0
alias d10 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:10:0
alias d11 /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:11:0
 
Hi folks, I followed this but used vdev_id.conf, and it imported just the partition, not the whole disk. It works, but I'd like to know if that's OK and whether there is a way to import the disk instead of the partition.

It is OK! Even if you create a pool using whole disks, behind the scenes ZFS will create some partitions and allocate one partition from each disk (the biggest one) to the pool!
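
You can see this on any pool member with something like the following (the device name is just an example):
Code:
lsblk /dev/sda
sgdisk -p /dev/sda    # prints the GPT that the installer / ZFS put on the disk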
 
Great, thanks.

I would really like to see the whole disk as it was before and end up with d00 and d01, but I've tried it in different ways and I always get the same result, d00-part3 and d01-part3 :/
 
It's normal, because part1 and part2 are used for the EFI system partition and boot, respectively (if I remember correctly). And it's better to have a partition table anyway, to avoid "accidents" with bare disks.
 
