HDD failure: how to resilver

Discussion in 'Proxmox VE: Installation and configuration' started by chalan, Dec 5, 2018.

  1. chalan

    chalan Member

    Joined:
    Mar 16, 2015
    Messages:
    112
    Likes Received:
    0
    Hello, I have 2 HDDs in a ZFS RAID1 (mirror) in Proxmox. One HDD failed and I replaced it with a new one. Now I have:

    Code:
    root@pve-klenova:~# fdisk -l
    Disk /dev/sda: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disklabel type: gpt
    Disk identifier: 1665524D-C367-4A45-828F-AAACB25FDEFD
    
    Device          Start        End    Sectors  Size Type
    /dev/sda1          34       2047       2014 1007K BIOS boot
    /dev/sda2        2048 7814020749 7814018702  3,7T Solaris /usr & Apple ZFS
    /dev/sda9  7814020750 7814037134      16385    8M Solaris reserved 1
    
    
    Disk /dev/sdb: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    
    
    Disk /dev/zd0: 8 GiB, 8589934592 bytes, 16777216 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 4096 bytes
    I/O size (minimum/optimal): 4096 bytes / 4096 bytes
    
    
    Disk /dev/zd16: 10 GiB, 10737418240 bytes, 20971520 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 8192 bytes
    I/O size (minimum/optimal): 8192 bytes / 8192 bytes
    Disklabel type: dos
    Disk identifier: 0x644c7a42
    
    Device      Boot    Start      End  Sectors  Size Id Type
    /dev/zd16p1 *        2048 19922943 19920896  9,5G 83 Linux
    /dev/zd16p2      19924990 20969471  1044482  510M  5 Extended
    /dev/zd16p5      19924992 20969471  1044480  510M 82 Linux swap / Solaris
    
    Partition 2 does not start on physical sector boundary.
    
    
    Disk /dev/zd32: 100 GiB, 107374182400 bytes, 209715200 sectors
    Units: sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 8192 bytes
    I/O size (minimum/optimal): 8192 bytes / 8192 bytes
    Disklabel type: dos
    Disk identifier: 0x294acf2e
    
    Device      Boot Start       End   Sectors  Size Id Type
    /dev/zd32p1       2048 209715199 209713152  100G 83 Linux
    
    Code:
    root@pve-klenova:~# ls -lh /dev/disk/by-id/
    total 0
    lrwxrwxrwx 1 root root  9 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K3GN7LYL -> ../../sda
    lrwxrwxrwx 1 root root 10 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K3GN7LYL-part1 -> ../../sda1
    lrwxrwxrwx 1 root root 10 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K3GN7LYL-part2 -> ../../sda2
    lrwxrwxrwx 1 root root 10 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K3GN7LYL-part9 -> ../../sda9
    lrwxrwxrwx 1 root root  9 dec  5 10:51 ata-WDC_WD4002FYYZ-01B7CB1_K7JVZ45T -> ../../sdb
    lrwxrwxrwx 1 root root  9 dec  5 10:51 wwn-0x5000cca25cc933fe -> ../../sda
    lrwxrwxrwx 1 root root 10 dec  5 10:51 wwn-0x5000cca25cc933fe-part1 -> ../../sda1
    lrwxrwxrwx 1 root root 10 dec  5 10:51 wwn-0x5000cca25cc933fe-part2 -> ../../sda2
    lrwxrwxrwx 1 root root 10 dec  5 10:51 wwn-0x5000cca25cc933fe-part9 -> ../../sda9
    lrwxrwxrwx 1 root root  9 dec  5 10:51 wwn-0x5000cca269e871c7 -> ../../sdb
    
    and

    Code:
    root@pve-klenova:~# zpool status -v
      pool: rpool
     state: DEGRADED
    status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
    action: Replace the device using 'zpool replace'.
       see: http://zfsonlinux.org/msg/ZFS-8000-4J
      scan: resilvered 144G in 2h10m with 0 errors on Wed Sep 19 20:45:26 2018
    config:
    
        NAME                              STATE     READ WRITE CKSUM
        rpool                             DEGRADED     0     0     0
          mirror-0                        DEGRADED     0     0     0
            wwn-0x5000cca25cc933fe-part2  ONLINE       0     0     0
            12706416511818272176          UNAVAIL      0     0     0  was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
    
    errors: No known data errors
    


    /dev/sdb is the new one. How do I add it safely to the zpool and resilver? I have important data on sda and don't want to make any mistakes. Please help me with step-by-step instructions.
     
    #1 chalan, Dec 5, 2018
    Last edited: Dec 5, 2018
  2. chalan

    OK, I have done the following:

    1.) zpool offline rpool /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
    2.) From the WebUI, Servername -> Disks -> Initialize Disk with GPT (/dev/sdb)
    3.) sgdisk --replicate=/dev/sdb /dev/sda
    4.) sgdisk --randomize-guids /dev/sdb
    5.) grub-install /dev/sdb
    6.) zpool replace rpool /dev/disk/by-id/wwn-0x5000cca269e871c7-part2

    Everything worked, but step 6 ended with:

    cannot replace /dev/disk/by-id/wwn-0x5000cca269e871c7-part2 with /dev/disk/by-id/wwn-0x5000cca269e871c7-part2: no such device in pool

    What can I do now? Please help.
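
A note on why step 6 fails: with a single device argument, `zpool replace` treats that device as the existing pool member to be replaced in place, and the new disk has never been a member of `rpool`. The two-argument form names the old member first (here the numeric GUID that `zpool status` shows for the UNAVAIL entry) and the new device second. The following is a minimal sketch under those assumptions, not a tested procedure: it reads `zpool status` text, picks out the GUID of the first UNAVAIL or OFFLINE member, and only prints the resulting command. The helper names `find_missing_guid` and `print_replace_cmd` are hypothetical.

```shell
#!/bin/sh
# Sketch only: parse `zpool status` text and print (not run) a replace command.

# Print the first UNAVAIL/OFFLINE member whose name is purely numeric (a vdev GUID).
find_missing_guid() {
    awk '($2 == "UNAVAIL" || $2 == "OFFLINE") && $1 ~ /^[0-9]+$/ { print $1; exit }'
}

# Print the two-argument replace command: pool, old member, new device.
print_replace_cmd() {
    pool=$1; old=$2; new=$3
    echo "zpool replace $pool $old $new"
}

# Sample input copied from the status output earlier in this thread.
sample_status='    rpool                             DEGRADED     0     0     0
      mirror-0                        DEGRADED     0     0     0
        wwn-0x5000cca25cc933fe-part2  ONLINE       0     0     0
        12706416511818272176          UNAVAIL      0     0     0  was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2'

guid=$(printf '%s\n' "$sample_status" | find_missing_guid)
print_replace_cmd rpool "$guid" /dev/disk/by-id/wwn-0x5000cca269e871c7-part2
```

On a live system the sample text would be replaced by `zpool status rpool | find_missing_guid`. Printing the command instead of running it leaves room to compare both device names against `ls -lh /dev/disk/by-id/` before anything is written to a disk.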
     
  3. chalan

    OK, I have now run

    zpool replace rpool 12706416511818272176 /dev/disk/by-id/wwn-0x5000cca269e871c7-part2

    and now I have

    Code:
    root@pve-klenova:~# zpool status -v
      pool: rpool
     state: DEGRADED
    status: One or more devices is currently being resilvered.  The pool will
       continue to function, possibly in a degraded state.
    action: Wait for the resilver to complete.
      scan: resilver in progress since Wed Dec  5 12:18:08 2018
       33M scanned out of 1,09T at 1,83M/s, 172h30m to go
       32,6M resilvered, 0,00% done
    config:
    
       NAME                                STATE     READ WRITE CKSUM
       rpool                               DEGRADED     0     0     0
         mirror-0                          DEGRADED     0     0     0
           wwn-0x5000cca25cc933fe-part2    ONLINE       0     0     0
           replacing-1                     DEGRADED  1004     0     0
             12706416511818272176          OFFLINE      0     0     0  was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
             wwn-0x5000cca269e871c7-part2  ONLINE       0     0  1004  (resilvering)
    
    errors: No known data errors
    
    
    Is this OK?
     
  4. chalan

    Is this normal? 54,6G scanned out of 1,09T at 14,2M/s, 21h10m to go. These are SATA3 disks on SATA3 ports with SATA3 cables, and all VMs are stopped.
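
For context on the "to go" figure: it is essentially the remaining data divided by the current scan rate, so it swings widely while the rate is still changing. The first status in this thread shows 1,83M/s and 172h30m, this one shows 14,2M/s and 21h10m, and the final status later in the thread reports 1,09T resilvered in 7h33m. The sketch below redoes that arithmetic. It assumes that `T`, `G` and `M` in the output are binary units (TiB, GiB, MiB); `resilver_eta` is a hypothetical helper, not a ZFS tool.

```shell
#!/bin/sh
# Sketch only: estimate remaining resilver time from total size, amount scanned and current rate.

# Arguments: total MiB, scanned MiB, rate in MiB/s. Prints hours and minutes, e.g. "1h00m".
resilver_eta() {
    awk -v total="$1" -v scanned="$2" -v rate="$3" 'BEGIN {
        secs = (total - scanned) / rate
        h = int(secs / 3600)
        m = int((secs - h * 3600) / 60)
        printf "%dh%02dm\n", h, m
    }'
}

# Values from the post above: 1,09 TiB total, 54,6 GiB scanned, 14,2 MiB/s.
# 1.09 * 1024 * 1024 = 1142947.84 MiB; 54.6 * 1024 = 55910.4 MiB.
resilver_eta 1142947.84 55910.4 14.2
```

The result lands within a few minutes of the 21h10m that `zpool status` printed; the small difference comes from the rounded values shown in the status line.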
     
  5. tom

    tom Proxmox Staff Member
    Staff Member

    Joined:
    Aug 29, 2006
    Messages:
    13,172
    Likes Received:
    352
    If you use very slow disks, disk operations are very slow.
     
  6. chalan

    Do you think WD Gold 4TB drives are very slow disks?
     
  7. tom

    At my workplace we mainly use NVMe SSDs, so yes, from my point of view these are slow disks.
     
  8. chalan

    What do the permanent errors mean? And how do I get rid of this:

    replacing-1 DEGRADED 1008 0 0
    12706416511818272176 OFFLINE 0 0 0 was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2

    Code:
    root@pve-klenova:~# zpool status -v
      pool: rpool
     state: DEGRADED
    status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
    action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
       see: http://zfsonlinux.org/msg/ZFS-8000-8A
      scan: resilvered 1,09T in 7h33m with 2 errors on Wed Dec  5 19:51:37 2018
    config:
    
        NAME                                STATE     READ WRITE CKSUM
        rpool                               DEGRADED     0     0     2
          mirror-0                          DEGRADED     0     0     4
            wwn-0x5000cca25cc933fe-part2    ONLINE       0     0     4
            replacing-1                     DEGRADED  1008     0     0
              12706416511818272176          OFFLINE      0     0     0  was /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
              wwn-0x5000cca269e871c7-part2  ONLINE       0     0  1008
    
    errors: Permanent errors have been detected in the following files:
    
            //var/lib/vz/images/200/vm-200-disk-2.qcow2
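
Reading the output above: the CKSUM count of 4 on the surviving disk (`wwn-0x5000cca25cc933fe-part2`) means blocks read from it failed checksum verification, and with the other mirror half missing there was no second copy to repair them from. That is why the affected file is listed as a permanent error. The sketch below, a hypothetical helper named `list_damaged_files`, only extracts the file list from `zpool status -v` text so it can be compared against backups.

```shell
#!/bin/sh
# Sketch only: list the files that `zpool status -v` reports under "Permanent errors".

list_damaged_files() {
    awk '
        # Start collecting after the "Permanent errors" header line.
        /^[ \t]*errors: Permanent errors/ { in_list = 1; next }
        # Every non-empty line after the header is one path; strip its indentation.
        in_list && NF > 0 { sub(/^[ \t]+/, ""); print }
    '
}

# Sample input copied from the status output above.
sample='errors: Permanent errors have been detected in the following files:

        //var/lib/vz/images/200/vm-200-disk-2.qcow2'

printf '%s\n' "$sample" | list_damaged_files
```

The status text itself recommends restoring the file, or the whole pool, from backup. As a hedged pointer rather than an answer given in this thread: once the file has been restored or removed, `zpool scrub rpool` and `zpool clear rpool` are the standard commands for re-checking the pool and resetting the error counters. Whether the leftover `replacing-1` entry then disappears on its own or needs `zpool detach rpool 12706416511818272176` is not confirmed here and should be checked against the ZFS documentation first.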
    
     
  9. chalan

    Can somebody please help me with the degraded pool (see above)? I followed these steps and still have a degraded pool:

    1.) zpool offline rpool /dev/disk/by-id/wwn-0x5000cca269c4bd82-part2
    2.) From the WebUI, Servername -> Disks -> Initialize Disk with GPT (/dev/sdb)
    3.) sgdisk --replicate=/dev/sdb /dev/sda
    4.) sgdisk --randomize-guids /dev/sdb
    5.) grub-install /dev/sdb
    6.) zpool replace rpool 12706416511818272176 /dev/disk/by-id/wwn-0x5000cca269e871c7-part
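
The command-line steps above can be sketched as a dry run that prints each command instead of executing it, which makes it easier to check the direction of the partition-table copy before touching the disk that still holds the data. This is an illustration under assumptions, not a verified procedure: `print_replace_plan` is a hypothetical helper, and the device names in the example call are the ones that appear in this thread.

```shell
#!/bin/sh
# Sketch only: print the replacement sequence from this thread instead of running it.

# Arguments: healthy disk, new disk, pool, old member (GUID or path), new ZFS partition.
print_replace_plan() {
    healthy=$1; new=$2; pool=$3; old_member=$4; new_part=$5
    if [ "$healthy" = "$new" ]; then
        echo "refusing: healthy and new disk are the same device" >&2
        return 1
    fi
    # sgdisk --replicate copies the partition table of the LAST argument onto the
    # device named in the option, so the healthy disk must come last.
    echo "sgdisk --replicate=$new $healthy"
    # Give the copied table its own disk and partition GUIDs.
    echo "sgdisk --randomize-guids $new"
    # Install the boot loader on the new disk (BIOS/GRUB setup, as in the thread).
    echo "grub-install $new"
    # Two-argument form: old member first, new partition second.
    echo "zpool replace $pool $old_member $new_part"
}

print_replace_plan /dev/sda /dev/sdb rpool 12706416511818272176 \
    /dev/disk/by-id/wwn-0x5000cca269e871c7-part2
```

The guard against identical source and target is there because swapping the two arguments of `sgdisk --replicate` would overwrite the partition table of the only disk that still holds the data.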
     