[SOLVED] Replace Disk on ZFS pool --> I did not look through

Hello,

I set up a machine today with the new Proxmox installer and built the system with ZFS RAID1. Now I wanted to simulate a hard disk failure, so I swapped out one disk:

Code:
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
	invalid.  Sufficient replicas exist for the pool to continue
	functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: none requested
config:


	NAME        STATE     READ WRITE CKSUM
	rpool       DEGRADED     0     0     0
	  mirror-0  DEGRADED     0     0     0
	    sda3    ONLINE       0     0     0
	    sdb3    UNAVAIL      0   116     0  corrupted data
So the first thing was to put a new disk into the server. Done... but nothing happened; the system could not see the disk. I googled around and found this:

Code:
echo "0 0 0" >/sys/class/scsi_host/host3/scan

This scans every port. But I think there is a special command for ZFS, I hope so. After that I could see the disk with fdisk:

Code:
Disk /dev/sdb: 250.1 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders, total 488397168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xbe36be36


   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *        2048      718847      358400    7  HPFS/NTFS/exFAT
/dev/sdb2          718848   488394751   243837952    7  HPFS/NTFS/exFAT
OK, so I wanted to replace the disk. First try:
Code:
zpool replace -f rpool /dev/sdb3 /dev/sdb3
cannot open '/dev/sdb3': No such device or address
cannot replace /dev/sdb3 with /dev/sdb3: one or more devices is currently unavailable
Second try:
Code:
zpool replace -f rpool /dev/sdb3 /dev/sdb
Make sure to wait until resilver is done before rebooting.

First status:
Code:
pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Feb  2 20:04:37 2015
    371M scanned out of 639M at 11.2M/s, 0h0m to go
    370M resilvered, 58.05% done
config:


	NAME             STATE     READ WRITE CKSUM
	rpool            DEGRADED     0     0     0
	  mirror-0       DEGRADED     0     0     0
	    sda3         ONLINE       0     0     0
	    replacing-1  UNAVAIL      0     0     0
	      sdb3       UNAVAIL      0   116     0  corrupted data
	      sdb        ONLINE       0     0     0  (resilvering)

After some minutes:
Code:
pool: rpool
 state: ONLINE
  scan: resilvered 639M in 0h0m with 0 errors on Mon Feb  2 20:05:31 2015
config:


	NAME        STATE     READ WRITE CKSUM
	rpool       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sda3    ONLINE       0     0     0
	    sdb     ONLINE       0     0     0


errors: No known data errors
Is this really OK? What about the EFI and boot partitions that parted lists?
Code:
Model: ATA ST380817AS (scsi)
Disk /dev/sda: 80.0GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt


Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot
 3      136MB   80.0GB  79.9GB  zfs          PVE-ZFS-Partition




Model: ATA ST3250820NS (scsi)
Disk /dev/sdb: 250GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt


Number  Start   End    Size    File system  Name  Flags
 1      1049kB  250GB  250GB   zfs          zfs
 9      250GB   250GB  8389kB




Model: Unknown (unknown)
Disk /dev/zd0: 4295MB
Sector size (logical/physical): 512B/4096B
Partition Table: loop


Number  Start  End     Size    File system     Flags
 1      0.00B  4295MB  4295MB  linux-swap(v1)
So I think the system can only boot from the first disk.

Thanks and best regards.
 
Hi,
The problem is that the boot and EFI partitions are not part of the ZFS pool, so they will not be restored by zpool. In this case you have to partition the disk manually and copy the boot and EFI partitions before you run the zpool replace.
Also, you should use /dev/disk/by-id/<device> instead of /dev/sd*.
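To illustrate the by-id naming, here is a minimal sketch of such a replacement; the by-id names below are made-up placeholders, not taken from this thread:

Code:
# list the stable names of the installed disks
ls -l /dev/disk/by-id/ | grep -v part

# replace the failed mirror member with partition 3 of the new disk
# (the by-id name is a hypothetical example)
zpool replace rpool sdb3 /dev/disk/by-id/ata-MODEL_SERIAL-part3

# watch the resilver
zpool status rpool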
 
How does that work on FreeBSD or Solaris? Do they have scripts to partition a new disk correctly?
I think we should have a GUI to add/replace disks. Then we can generate the correct partition
tables automatically.
 
So the first thing was to put a new disk into the server. Done... but nothing happened; the system could not see the disk. I googled around and found this:

Code:
echo "0 0 0" >/sys/class/scsi_host/host3/scan

What did you expect to happen? Did you look at the logs? Maybe check ls /dev/sd*?

OK, so I wanted to replace the disk. First try:
Code:
zpool replace -f rpool /dev/sdb3 /dev/sdb3
cannot open '/dev/sdb3': No such device or address
cannot replace /dev/sdb3 with /dev/sdb3: one or more devices is currently unavailable

You cannot replace a device with itself under the same name.
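If the old device node no longer exists, one option (just a sketch, not what was run here) is to address the failed member by its GUID and point ZFS at the new disk under a stable name:

Code:
# show pool members by GUID instead of device name
zpool status -g rpool

# replace the failed member, identified by its GUID, with the new disk
# (GUID and by-id name are hypothetical examples)
zpool replace rpool 1234567890123456789 /dev/disk/by-id/ata-MODEL_SERIAL-part3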

First status:
Code:
pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Feb  2 20:04:37 2015
    371M scanned out of 639M at 11.2M/s, 0h0m to go
    370M resilvered, 58.05% done
config:


    NAME             STATE     READ WRITE CKSUM
    rpool            DEGRADED     0     0     0
      mirror-0       DEGRADED     0     0     0
        sda3         ONLINE       0     0     0
        replacing-1  UNAVAIL      0     0     0
          sdb3       UNAVAIL      0   116     0  corrupted data
          sdb        ONLINE       0     0     0  (resilvering)

Instead of replacing with a partition of the same size, you handed ZFS the whole disk.

Is this really OK? What about the EFI and boot partitions that parted lists?

ZFS now uses the whole sdb disk, so there is nothing else left on it.

So I think the system can only boot from the first disk.

The system will always boot from the first disk, unless you use a real RAID card.


ZFS does not care about MBR, GPT and such things. ZFS can use a partition, or a whole disk without any partition table.
 
The system will always boot from the first disk, unless you use a real RAID card.

But our installer creates GRUB boot and EFI partitions on each (bootable) drive, so if one disk fails the system can still boot from the other disk. So it is best to use the same partition layout when you replace a disk.
 
Good evening and thanks for the answers :)

I'll set it up once again and test a disk failure. So if I understand correctly, the best approach is to save the boot and EFI partition layout after the setup, so I can copy it onto the new disk later; the rest is for ZFS. Does it matter to ZFS whether I use /dev/sdf or /dev/sdfX?

I had to use this:

echo "0 0 0" >/sys/class/scsi_host/host3/scan
because the disk was not detected automatically. Not in the logs, not at all. On my Gentoo desktop this works, so maybe some packages are not available.

Also, you should use /dev/disk/by-id/<device> instead of /dev/sd*.

I will test it in the next session with the right path. And yes, a GUI would be nice, so that no mistakes can happen.
 
Does it matter to ZFS whether I use /dev/sdf or /dev/sdfX?

ZFS recommends using the entire disk, so it gets full control of the drive, for best performance. But you can use ZFS on a partition too. If you point it at /dev/sd[a-z] you give it the entire disk; /dev/sd[a-z][number] refers to a partition on that disk.
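A small sketch of the difference, reusing the pool from this thread (which of the two you actually want depends on whether the disk also carries boot partitions):

Code:
# whole disk: ZFS writes its own GPT (one big data partition plus a small part9)
zpool replace rpool sdb3 /dev/sdb

# single partition: the partition table you created (boot/EFI/ZFS) is kept
zpool replace rpool sdb3 /dev/sdb3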
 
IMHO, that is a strange recommendation, because you need a boot and/or EFI partition for booting.

The problem comes from the Linux boot loaders: they do not support booting from ZFS, so it is impossible to use the entire disk for both boot and a ZFS root. But if you do not need multiple partitions for different purposes, it is fine to have one partition for boot and another for ZFS. For the ZFS disks I have set the noop I/O scheduler.
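For reference, the noop scheduler is set per disk through sysfs (not persistent across reboots unless you add a udev rule or kernel parameter); the device name is just an example:

Code:
# check the available and currently active scheduler for the disk
cat /sys/block/sdb/queue/scheduler

# switch this disk to the noop elevator
echo noop > /sys/block/sdb/queue/scheduler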

In my desktop PC I use a first ext3 partition for /boot (GRUB + ZFS module support) and a second partition for ZFS.

In my server I use a single SSD as the Linux root (Proxmox default installation) and the other disks for ZFS pools.
 
OK, this sounds nice too. One SSD, or two in a small hardware RAID, for the system, and the rest on separate disks. It would also be easier to replace a disk that ZFS is running on.
 
IIRC, triggering a rescan is done by writing "- - -" to sysfs, for example: echo "- - -" >/sys/class/scsi_host/host3/scan
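If you are not sure which host adapter the new disk hangs off, a common trick (a sketch, not from this thread) is simply to rescan every host:

Code:
# rescan all SCSI/SATA hosts for newly hot-plugged disks
for h in /sys/class/scsi_host/host*; do
    echo "- - -" > "$h/scan"
done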

Aside from more complex solutions like booting from ZFS on Linux hosts, always provide:
- whole disks to the pool
- symbolic disk names, to ease disk identification and to avoid naming problems like the above, e.g. /dev/disk/by-id/ata-XXX without -partX

ZFS will automatically create a properly aligned partition on the disk, and that partition gets added to the pool; you can see the partitions once the pool is created. Setting the alignment with ashift=12 is always a good idea: it helps avoid performance problems on "advanced format" disks and many SSDs. If the disk is not AF, you lose a negligible amount of space most of the time (it varies case by case, so do some tests if you store huge numbers of files).

I personally never use a ZFS pool for booting on Linux; I tend to use a small SSD or HDD mirror for the OS and boot, which is a lot easier to work with. I've been using ZFS successfully for more than 6-7 years for various jobs, mainly large backup pools. Accumulated expertise and Linux support for virtualisation is sparse at best, but it's rapidly improving; I think the PVE team made the right step in that direction.
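A minimal sketch of creating a pool along those lines (whole disks, by-id names, ashift=12); the pool name and device names are hypothetical:

Code:
# mirror of two whole disks, addressed by stable names, forced to 4K alignment
zpool create -o ashift=12 tank mirror \
    /dev/disk/by-id/ata-MODEL_SERIAL1 \
    /dev/disk/by-id/ata-MODEL_SERIAL2

zpool status tank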
 
The problem comes from the Linux boot loaders: they do not support booting from ZFS

This is not true. We use the latest GRUB and boot directly from ZFS; there is no extra boot partition. But you do need an extra partition for GRUB and for EFI.
 
Hello

I was about to move 6 servers configured with Proxmox 3.3 softraid over Debian to the datacenter today, but could not resist ZFS and Proxmox 3.4.
But first I had to figure out how to replace a failed disk.

I removed one disk and replaced it with a blank one.

Then I copied the partition table from the good disk sda to the blank disk sdb (-R replicates sda's table onto sdb, -G then randomizes the disk and partition GUIDs):
sgdisk -R /dev/sdb /dev/sda
sgdisk -G /dev/sdb

(parted) select /dev/sdb
Using /dev/sdb
(parted) p
Model: ATA WDC WD2000FYYZ-0 (scsi)
Disk /dev/sdb: 2000398934016B
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start       End             Size            File system  Name                  Flags
 1      1048576B    2097151B        1048576B                     Grub-Boot-Partition   bios_grub
 2      2097152B    136314879B      134217728B      fat32        EFI-System-Partition  boot, esp
 3      136314880B  2000397885439B  2000261570560B  zfs          PVE-ZFS-Partition

OK, the partitions are copied.
Now copy the data from /dev/sda1 to /dev/sdb1 and from /dev/sda2 to /dev/sdb2:

dd if=/dev/sda1 of=/dev/sdb1 bs=512
dd if=/dev/sda2 of=/dev/sdb2 bs=512

Then, with a few simple zpool commands, I attached sdb3 to the pool and waited a few minutes for the resilver.
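The exact commands are not shown; a plausible sketch of that step, assuming the pool and partition names from earlier in the thread, would be:

Code:
# put the freshly partitioned sdb3 back into the degraded mirror
zpool replace rpool sdb3 /dev/sdb3

# then watch the resilver finish
zpool status rpool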

Just in case, I also did:
grub-install /dev/sda
grub-install /dev/sdb
update-grub

Then I removed disk sda and rebooted the server.
The server booted from sdb without a problem.
 
Hello,

I tested again and again. Why? I am building a new server. I tested with a USB 3.0 stick, but that is not a really good idea: http://forum.proxmox.com/threads/21390-Install-proxmox-on-an-USB-3-0-stick-gt-Is-this-a-good-idea The other option was to use two SSDs in RAID1 only for the system: http://forum.proxmox.com/threads/21401-Does-Proxmox-works-with-this-controllers I searched the internet, but there are no usable SATA-only controllers; they all have RAID... strange.

Why these things/questions? Because I don't know what the best way really is with ZFS.

Are there really performance issues when you use ZFS on partitions and not the whole disk?

Best regards.
 
Look for an HBA controller. As for ZFS, you can use a partition and you do not lose performance if your other partition is used for booting or something similar.
 
So, I tested it with a 4-disk RAID10. The installation with the Proxmox disc works fine. Then I had a look with parted and saw that GRUB partitions were generated only on the first two disks. I tested this by booting from disks 3 and 4, and they can't boot. If I'm not wrong... GRUB should be on every disk...

Code:
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   10.7GB  10.6GB  zfs          PVE-ZFS-Partition


Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdb: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 


Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   10.7GB  10.6GB  zfs          PVE-ZFS-Partition




Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdc: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 


Number  Start   End     Size    File system  Name  Flags
 1      1049kB  10.7GB  10.7GB  zfs          zfs
 9      10.7GB  10.7GB  8389kB




Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdd: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 


Number  Start   End     Size    File system  Name  Flags
 1      1049kB  10.7GB  10.7GB  zfs          zfs
 9      10.7GB  10.7GB  8389kB




Model: Unknown (unknown)
Disk /dev/zd0: 1208MB
Sector size (logical/physical): 512B/4096B
Partition Table: loop
Disk Flags: 


Number  Start  End     Size    File system     Flags
 1      0.00B  1208MB  1208MB  linux-swap(v1)

Code:
  pool: rpool
 state: ONLINE
  scan: none requested
config:


	NAME        STATE     READ WRITE CKSUM
	rpool       ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sda3    ONLINE       0     0     0
	    sdb3    ONLINE       0     0     0
	  mirror-1  ONLINE       0     0     0
	    sdd     ONLINE       0     0     0
	    sdc     ONLINE       0     0     0


errors: No known data errors



Thanks for the reply.
 
I also tested this with RAIDZ2 and 6 disks, and it worked fine! Strange.

Code:
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 


Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   10.7GB  10.6GB  zfs          PVE-ZFS-Partition




Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdb: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 


Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   10.7GB  10.6GB  zfs          PVE-ZFS-Partition




Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdc: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 


Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   10.7GB  10.6GB  zfs          PVE-ZFS-Partition




Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdd: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 


Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   10.7GB  10.6GB  zfs          PVE-ZFS-Partition




Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sde: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 


Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   10.7GB  10.6GB  zfs          PVE-ZFS-Partition




Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sdf: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 


Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  2097kB  1049kB               Grub-Boot-Partition   bios_grub
 2      2097kB  136MB   134MB   fat32        EFI-System-Partition  boot, esp
 3      136MB   10.7GB  10.6GB  zfs          PVE-ZFS-Partition




Model: Unknown (unknown)
Disk /dev/zd0: 1208MB
Sector size (logical/physical): 512B/4096B
Partition Table: loop
Disk Flags: 


Number  Start  End     Size    File system     Flags
 1      0.00B  1208MB  1208MB  linux-swap(v1)

Code:
pool: rpool
 state: ONLINE
  scan: none requested
config:


	NAME        STATE     READ WRITE CKSUM
	rpool       ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    sda3    ONLINE       0     0     0
	    sdb3    ONLINE       0     0     0
	    sdc3    ONLINE       0     0     0
	    sdd3    ONLINE       0     0     0
	    sde3    ONLINE       0     0     0
	    sdf3    ONLINE       0     0     0


errors: No known data errors
 
