Hi guys.
I run PBS with ZFS on a pool named rpool that contains four 4 TB drives. /dev/sdb was failing and throwing tons of errors.
I ran "ls -a /dev/disk/by-id/"; here is the part of the output for the failing disk, serial number K4KJ220L:
Code:
ata-HGST_HUS726040ALA610_K4KJ220L lvm-pv-uuid-hQy3ob-CysI-64G2-4oF9-1Awg-KHJz-Vk44Ko wwn-0x5000cca244c909cf-part1
ata-HGST_HUS726040ALA610_K4KJ220L-part1 nvme-eui.e8238fa6bf530001001b448b46ae5183 wwn-0x5000cca244c909cf-part9
ata-HGST_HUS726040ALA610_K4KJ220L-part9 nvme-eui.e8238fa6bf530001001b448b46ae5183-part1 wwn-0x5000cca25ccb252d
I ran "zpool offline rpool /dev/sdb", had the disk swapped in the datacenter, and I'm now trying to replace the old disk with the new one in the pool.
The first step was to find the new disk's name (its serial number is K3H88TRL):
Code:
root@pbs104:~# ls -a /dev/disk/by-id/
. ata-HGST_HUS726040ALA610_N8GMWBLY-part9 nvme-WDC_CL_SN720_SDAQNTW-512G-2000_2008B7800452-part3
.. dm-name-pbs-root wwn-0x5000cca244c7153e
ata-HGST_HUS726040ALA610_K3GTJ1RL dm-name-pbs-swap wwn-0x5000cca244c7153e-part1
ata-HGST_HUS726040ALA610_K3GTJ1RL-part1 dm-uuid-LVM-JBgr5pORNQBEXbR0ngPr9lvGBJz93hYl1to1krFai7RxqZDcgqh0MNRPzak47KQX wwn-0x5000cca244c7153e-part9
ata-HGST_HUS726040ALA610_K3GTJ1RL-part9 dm-uuid-LVM-JBgr5pORNQBEXbR0ngPr9lvGBJz93hYlqNbyzKSP7YZlkFI00Ln3TjcHlY6tHeWG wwn-0x5000cca244c909cf
ata-HGST_HUS726040ALA610_K3H88TRL lvm-pv-uuid-hQy3ob-CysI-64G2-4oF9-1Awg-KHJz-Vk44Ko wwn-0x5000cca244c909cf-part1
ata-HGST_HUS726040ALA610_K3H88TRL-part1 nvme-eui.e8238fa6bf530001001b448b46ae5183 wwn-0x5000cca244c909cf-part9
ata-HGST_HUS726040ALA610_K3H88TRL-part9 nvme-eui.e8238fa6bf530001001b448b46ae5183-part1 wwn-0x5000cca25ccb252d
ata-HGST_HUS726040ALA610_N8GHL0WY nvme-eui.e8238fa6bf530001001b448b46ae5183-part2 wwn-0x5000cca25ccb252d-part1
ata-HGST_HUS726040ALA610_N8GHL0WY-part1 nvme-eui.e8238fa6bf530001001b448b46ae5183-part3 wwn-0x5000cca25ccb252d-part9
ata-HGST_HUS726040ALA610_N8GHL0WY-part9 nvme-WDC_CL_SN720_SDAQNTW-512G-2000_2008B7800452 wwn-0x5000cca25cd1db7f
ata-HGST_HUS726040ALA610_N8GMWBLY nvme-WDC_CL_SN720_SDAQNTW-512G-2000_2008B7800452-part1 wwn-0x5000cca25cd1db7f-part1
ata-HGST_HUS726040ALA610_N8GMWBLY-part1 nvme-WDC_CL_SN720_SDAQNTW-512G-2000_2008B7800452-part2 wwn-0x5000cca25cd1db7f-part9
But when I run the replace command, I get this error saying the device does not exist in the pool:
Code:
root@pbs104:~# zpool replace -f rpool /dev/disk/by-id/ata-HGST_HUS726040ALA610_K4KJ220L /dev/disk/by-id/ata-HGST_HUS726040ALA610_K3H88TRL
cannot replace /dev/disk/by-id/ata-HGST_HUS726040ALA610_K4KJ220L with /dev/disk/by-id/ata-HGST_HUS726040ALA610_K3H88TRL: no such device in pool
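My guess is that the pool only knows the old disk under the short name it shows in its config (sdb), not under its by-id path. Here is a minimal sketch to check which names the pool actually lists. The helper name pool_has_vdev is made up for illustration, and it only looks at the first column of whatever "zpool status" text it is given:

```shell
# Hypothetical helper (not part of ZFS): reads `zpool status` text on stdin
# and succeeds only if NAME appears as a vdev name in the first column.
pool_has_vdev() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit(found ? 0 : 1) }'
}

# Intended use on the live system (assumes the pool is called rpool):
#   zpool status rpool | pool_has_vdev sdb && echo "pool lists sdb"
#   zpool status rpool | pool_has_vdev ata-HGST_HUS726040ALA610_K4KJ220L \
#     || echo "pool does not list the by-id name"
```

As far as I know, "zpool status -P rpool" prints the full device paths, which should show the same thing without any helper.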
So I tried this, and it started resilvering:
Code:
zpool replace -f rpool /dev/sdb /dev/disk/by-id/ata-HGST_HUS726040ALA610_K3H88TRL
Now if I ask for a status I get:
Code:
root@pbs104:~# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Feb 14 14:30:02 2023
        825G scanned at 5.09G/s, 36.3G issued at 229M/s, 10.1T total
        8.08G resilvered, 0.35% done, 12:50:02 to go
config:

        NAME                                     STATE     READ WRITE CKSUM
        rpool                                    DEGRADED     0     0     0
          raidz1-0                               DEGRADED     0     0     0
            sda                                  ONLINE       0     0     0
            replacing-1                          DEGRADED     0     0     0
              sdb                                FAULTED      0     0     0  corrupted data
              ata-HGST_HUS726040ALA610_K3H88TRL  ONLINE       0     0     0  (resilvering)
            sdc                                  ONLINE       0     0     0
            sdd                                  ONLINE       0     0     0

errors: No known data errors
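To keep an eye on this while it runs, here is a small sketch that pulls the resilver percentage out of the status text and reports whether a replacing-N entry is still present. The function name resilver_summary is my own invention, and the parsing assumes the output keeps the wording shown above ("... resilvered, N% done ..."):

```shell
# Hypothetical helper (not part of ZFS): reads `zpool status` text on stdin,
# prints the resilver percentage and whether a temporary replacing-N vdev
# is still listed in the config.
resilver_summary() {
  awk '
    /resilvered,/ { for (i = 1; i <= NF; i++) if ($i ~ /%$/) pct = $i }
    $1 ~ /^replacing-[0-9]+$/ { replacing = 1 }
    END {
      printf "progress=%s replacing=%s\n", (pct == "" ? "n/a" : pct), (replacing ? "yes" : "no")
    }
  '
}

# Intended use on the live system (assumes the pool is called rpool):
#   zpool status rpool | resilver_summary
```

If I understand the ZFS behaviour correctly, replacing=no after the resilver would mean the temporary replacing-1 entry and the old sdb under it are gone.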
Is that normal? Was that the right command to use? Will the FAULTED drive disappear once the (very long) resilvering finishes?
Thank you.