HDD failure: how to resilver

chalan

Hello, I have 2 HDDs in a ZFS RAID1 (mirror) in Proxmox. One HDD failed and I replaced it with a new one. Now I have:

Code:
root@pve-klenova:~# fdisk -l
Disk /dev/sda: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 1665524D-C367-4A45-828F-AAACB25FDEFD

Device          Start        End    Sectors  Size Type
/dev/sda1          34       2047       2014 1007K BIOS boot
/dev/sda2        2048 7814020749 7814018702  3,7T Solaris /usr & Apple ZFS
/dev/sda9  7814020750 7814037134      16385    8M Solaris reserved 1


Disk /dev/sdb: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/zd0: 8 GiB, 8589934592 bytes, 16777216 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes


Disk /dev/zd16: 10 GiB, 10737418240 bytes, 20971520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x644c7a42

Device      Boot    Start      End  Sectors  Size Id Type
/dev/zd16p1 *        2048 19922943 19920896  9,5G 83 Linux
/dev/zd16p2      19924990 20969471  1044482  510M  5 Extended
/dev/zd16p5      19924992 20969471  1044480  510M 82 Linux swap / Solaris

Partition 2 does not start on physical sector boundary.


Disk /dev/zd32: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: dos
Disk identifier: 0x294acf2e

Device      Boot Start       End   Sectors  Size Id Type
/dev/zd32p1       2048 209715199 209713152  100G 83 Linux

Code:
root@pve-klenova:~# ls -lh /dev/disk/by-id/
total 0
lrwxrwxrwx 1 root root  9 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K3GN7LYL -> ../../sda
lrwxrwxrwx 1 root root 10 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K3GN7LYL-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K3GN7LYL-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K3GN7LYL-part9 -> ../../sda9
lrwxrwxrwx 1 root root  9 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K7JVZ45T -> ../../sdb
lrwxrwxrwx 1 root root  9 dec  5 10:51 wwn-0x5000cca25cc933fe -> ../../sda
lrwxrwxrwx 1 root root 10 dec  5 10:51 wwn-0x5000cca25cc933fe-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 dec  5 10:51 wwn-0x5000cca25cc933fe-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 dec  5 10:51 wwn-0x5000cca25cc933fe-part9 -> ../../sda9
lrwxrwxrwx 1 root root  9 dec  5 10:51 wwn-0x5000cca269e871c7 -> ../../sdb

and

Code:
root@pve-klenova:~# zpool status -v
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
    invalid.  Sufficient replicas exist for the pool to continue
    functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: resilvered 144G in 2h10m with 0 errors on Wed Sep 19 20:45:26 2018
config:

    NAME                              STATE     READ WRITE CKSUM
    rpool                             DEGRADED     0     0     0
      mirror-0                        DEGRADED     0     0     0
        wwn-0x5000cca25cc933fe-part2  ONLINE       0     0     0
        12706416511818272176          UNAVAIL      0     0     0  was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2

errors: No known data errors



/dev/sdb is the new one. How do I add it safely to the zpool and resilver? I have important data on sda and don't want to make any mistake. Please help me with step-by-step instructions.
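
Since the pool refers to its members by `wwn-...` names rather than by `sdX`, it helps to map the new disk to its stable name first. The following is only a minimal sketch: it parses a few sample lines copied from the `ls -lh /dev/disk/by-id/` listing above, and `by_id_listing` and `wwn_for_disk` are hypothetical helper names, not existing tools.

```shell
#!/bin/sh
# Sample lines copied from the `ls -lh /dev/disk/by-id/` output above.
# On a live system, pipe the real command output in instead of this heredoc.
by_id_listing() {
cat <<'EOF'
lrwxrwxrwx 1 root root  9 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K7JVZ45T -> ../../sdb
lrwxrwxrwx 1 root root  9 dec  5 10:51 wwn-0x5000cca25cc933fe -> ../../sda
lrwxrwxrwx 1 root root 10 dec  5 10:51 wwn-0x5000cca25cc933fe-part2 -> ../../sda2
lrwxrwxrwx 1 root root  9 dec  5 10:51 wwn-0x5000cca269e871c7 -> ../../sdb
EOF
}

# Hypothetical helper: print the wwn-* link name that points at a whole disk.
# $1 = kernel device name, e.g. "sdb"
wwn_for_disk() {
    awk -v target="../../$1" \
        '$NF == target && $(NF-2) ~ /^wwn-/ { print $(NF-2) }'
}

by_id_listing | wwn_for_disk sdb
```

For the listing above this prints `wwn-0x5000cca269e871c7`, which is the name of the new, still unpartitioned disk.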
 
OK, I have done the following:

1.) zpool offline rpool /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
2.) From the WebUI, Servername -> Disks -> Initialize Disk with GPT (/dev/sdb)
3.) sgdisk --replicate=/dev/sdb /dev/sda
4.) sgdisk --randomize-guids /dev/sdb
5.) grub-install /dev/sdb
6.) zpool replace rpool /dev/disk/by-id/wwn-0x5000cca269e871c7-part2

Everything worked, but step 6 ended with:

cannot replace /dev/disk/by-id/wwn-0x5000cca269e871c7-part2 with /dev/disk/by-id/wwn-0x5000cca269e871c7-part2: no such device in pool

So what can I do now? Please help.
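
A likely reading of that error: `zpool replace` takes the old device first and the new device second, and when only one device is given it is used for both, so the new disk's name is looked up as an existing pool member and not found. The sketch below shows one way to find the identifier of the missing member. It parses sample lines copied from the `zpool status -v` output above, and `pool_status` and `missing_members` are hypothetical helper names.

```shell
#!/bin/sh
# Sample lines copied from the `zpool status -v` output above.
# On a live system, pipe `zpool status -v rpool` in instead of this heredoc.
pool_status() {
cat <<'EOF'
    NAME                              STATE     READ WRITE CKSUM
    rpool                             DEGRADED     0     0     0
      mirror-0                        DEGRADED     0     0     0
        wwn-0x5000cca25cc933fe-part2  ONLINE       0     0     0
        12706416511818272176          UNAVAIL      0     0     0  was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
EOF
}

# Hypothetical helper: print the name (or numeric GUID) of every member
# whose state column shows it is no longer usable.
missing_members() {
    awk '$2 == "UNAVAIL" || $2 == "OFFLINE" || $2 == "FAULTED" || $2 == "REMOVED" { print $1 }'
}

pool_status | missing_members
```

Here it prints `12706416511818272176`, the identifier that has to be given as the old device.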
 
OK, I have now run:

zpool replace rpool 12706416511818272176 /dev/disk/by-id/wwn-0x5000cca269e871c7-part2

Now I have:

Code:
root@pve-klenova:~# zpool status -v
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
   continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Dec  5 12:18:08 2018
   33M scanned out of 1,09T at 1,83M/s, 172h30m to go
   32,6M resilvered, 0,00% done
config:

   NAME                                STATE     READ WRITE CKSUM
   rpool                               DEGRADED     0     0     0
     mirror-0                          DEGRADED     0     0     0
       wwn-0x5000cca25cc933fe-part2    ONLINE       0     0     0
       replacing-1                     DEGRADED  1004     0     0
         12706416511818272176          OFFLINE      0     0     0  was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
         wwn-0x5000cca269e871c7-part2  ONLINE       0     0  1004  (resilvering)

errors: No known data errors

Is it OK?
 
Is this normal? 54,6G scanned out of 1,09T at 14,2M/s, 21h10m to go? These are SATA3 disks in SATA3 ports with SATA3 cables, and all VMs are stopped.
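
As a rough check, the "to go" figure is simply the remaining data divided by the current rate. A minimal sketch of that arithmetic follows, assuming the T, G and M suffixes are binary units (TiB, GiB, MiB) and the rate stays constant; `resilver_eta_hours` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: estimate the remaining resilver time in hours.
# $1 = total data in TiB, $2 = already scanned in GiB, $3 = rate in MiB/s
resilver_eta_hours() {
    # LC_ALL=C keeps awk's decimal separator a dot regardless of system locale.
    LC_ALL=C awk -v total_tib="$1" -v scanned_gib="$2" -v rate="$3" 'BEGIN {
        remaining_mib = total_tib * 1024 * 1024 - scanned_gib * 1024
        printf "%.1f\n", remaining_mib / rate / 3600
    }'
}

# Values from the status line quoted above (decimal commas written as dots).
resilver_eta_hours 1.09 54.6 14.2
```

This prints `21.3`, close to the reported 21h10m. The estimate only reflects the momentary rate, so it shifts as the rate changes.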
 
If you use very slow disks, disk operations are very slow.
 
Do you think WD Gold 4TB drives are very slow disks?

At my workplace there are mainly NVMe SSDs, so yes, from my point of view these are slow disks.
 
What do the permanent errors mean? And how do I get rid of this:

replacing-1 DEGRADED 1008 0 0
12706416511818272176 OFFLINE 0 0 0 was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2

Code:
root@pve-klenova:~# zpool status -v
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: resilvered 1,09T in 7h33m with 2 errors on Wed Dec  5 19:51:37 2018
config:

    NAME                                STATE     READ WRITE CKSUM
    rpool                               DEGRADED     0     0     2
      mirror-0                          DEGRADED     0     0     4
        wwn-0x5000cca25cc933fe-part2    ONLINE       0     0     4
        replacing-1                     DEGRADED  1008     0     0
          12706416511818272176          OFFLINE      0     0     0  was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
          wwn-0x5000cca269e871c7-part2  ONLINE       0     0  1008

errors: Permanent errors have been detected in the following files:

        //var/lib/vz/images/200/vm-200-disk-2.qcow2
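
One commonly suggested cleanup sequence for this state is sketched below. It is not taken from this thread and is an assumption about what applies here: it presumes the damaged file listed above has already been restored from backup or deleted. The script is a dry run that only prints the commands, and `cleanup_plan` is a hypothetical helper name.

```shell
#!/bin/sh
# DRY RUN: this only prints the commands, it does not touch the pool.
# Assumption: the damaged file has already been restored or deleted.
POOL="rpool"
OLD_GUID="12706416511818272176"   # old member, taken from the status output above

cleanup_plan() {
    echo "zpool scrub $POOL"               # re-verify all data after fixing the file
    echo "zpool status -v $POOL"           # check whether errors are still reported
    echo "zpool detach $POOL $OLD_GUID"    # drop the old member if replacing-1 remains
    echo "zpool clear $POOL"               # reset the READ/WRITE/CKSUM counters
}

cleanup_plan
```

Checksum errors showing up on both mirror members can also point at cabling, controller or memory problems, so it may be worth checking the hardware before running any of these for real.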
 
Can somebody please help me with the degraded pool? See above. I followed these steps and still have a degraded pool:

1.) zpool offline rpool /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
2.) From the WebUI, Servername -> Disks -> Initialize Disk with GPT (/dev/sdb)
3.) sgdisk --replicate=/dev/sdb /dev/sda
4.) sgdisk --randomize-guids /dev/sdb
5.) grub-install /dev/sdb
6.) zpool replace rpool 12706416511818272176 /dev/disk/by-id/wwn-0x5000cca269e871c7-part
 