PBS freeze, where to start to debug? (ZFS scrub)

Maybe the batch has been dropped or something?
Did you run a long SMART self-test or a complete read with dd? A performance test with fio?
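
For reference, the checks mentioned above could look roughly like the sketch below. It only prints the commands so they can be reviewed before being run as root. `/dev/sdX` is a placeholder device name and `disk_checks` is a hypothetical helper, neither comes from this thread. The block size and the 60 second fio runtime are arbitrary assumptions. All of the commands only read from the disk.

```shell
#!/bin/sh
# Print (do not run) read-only health checks for one disk.
# /dev/sdX is a placeholder; replace it with the real device before
# running any of the printed commands as root.
disk_checks() {
    dev="$1"
    cat <<EOF
smartctl -t long $dev
smartctl -l selftest $dev
dd if=$dev of=/dev/null bs=1M status=progress
fio --name=seqread --filename=$dev --readonly --rw=read --bs=1M --direct=1 --ioengine=libaio --runtime=60 --time_based
EOF
}

disk_checks "${1:-/dev/sdX}"
```

The first command starts the long self-test, the second shows its result once it has finished, the dd line reads the whole disk once, and the fio line measures sequential read throughput.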
 
Thanks, mow.
Yes, Hetzner did the hardware test (pretty deep, on everything). It lasted almost 2 days due to the disk size (16 TB), and everything was fine.

No, I did not run dd or fio.


Today we are really giving the server back. I am almost sure Hetzner will do something with it before handing it to somebody else.


Code:
# zpool status
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub in progress since Wed Feb 22 17:01:02 2023
    9.41T scanned at 8.31M/s, 5.17T issued at 4.57M/s, 12.7T total
    6.12M repaired, 40.64% done, no estimated completion time
config:

    NAME                                             STATE     READ WRITE CKSUM
    rpool                                            ONLINE       0     0     0
      raidz2-0                                       ONLINE       0     0     0
        ata-TOSHIBA_MG08ACA16TEY_X1D0A0S0FVNG-part3  ONLINE       0     0    29  (repairing)
        ata-TOSHIBA_MG08ACA16TEY_X1E0A009FVNG-part3  ONLINE       0     0    22  (repairing)
        ata-TOSHIBA_MG08ACA16TEY_X1D0A0FPFVNG-part3  ONLINE       0     0    25  (repairing)
        ata-TOSHIBA_MG08ACA16TEY_X1D0A0UWFVNG-part3  ONLINE       0     0    22  (repairing)

errors: No known data errors


So, 40% done after ... 15 days! (With huge I/O, only on disk writes.)
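
As a rough sanity check, the figures in the scan line can be turned into elapsed and remaining time. This is only a sketch: it assumes the "T" and "M" units are binary (TiB and MiB/s) and that the average issue rate of 4.57M/s stays constant, which a real scrub will not guarantee. `days_at_rate` is a hypothetical helper name.

```shell
#!/bin/sh
# Estimate scrub timing from the figures printed by `zpool status`.
# Assumptions: binary units (TiB, MiB/s) and a constant average issue rate.

# days_at_rate TIB MIB_PER_S -> days needed to process TIB at that rate
days_at_rate() {
    awk -v tib="$1" -v rate="$2" 'BEGIN {
        seconds = tib * 1024 * 1024 / rate   # TiB -> MiB, then divide by MiB/s
        printf "%.1f\n", seconds / 86400     # seconds -> days
    }'
}

issued=5.17   # TiB issued so far
total=12.7    # TiB to scrub in total
rate=4.57     # MiB/s average issue rate

remaining=$(awk -v t="$total" -v i="$issued" 'BEGIN { print t - i }')

echo "elapsed:   $(days_at_rate "$issued" "$rate") days"
echo "remaining: $(days_at_rate "$remaining" "$rate") days"
```

With these numbers it prints about 13.7 days elapsed, which roughly matches the scrub start date, and about 20.0 days still to go at the same rate.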

So we are giving up and removing it today.
 