Here you go, thank you for the prompt reply
Code:
zpool status
pool: local-zfs
state: ONLINE
scan: resilvered 14.7M in 00:00:02 with 0 errors on Tue Oct 8 11:38:58 2024
config:
NAME                                               STATE     READ WRITE CKSUM
local-zfs                                          ONLINE       0     0     0
  raidz1-0                                         ONLINE       0     0     0
    ata-SPCC_Solid_State_Disk_AA230715S301KG05744  ONLINE       0     0     0
    ata-SPCC_Solid_State_Disk_AA230715S301KG05622  ONLINE       0     0     0
    ata-SPCC_Solid_State_Disk_AA230715S301KG05621  ONLINE       0     0     0
    ata-SPCC_Solid_State_Disk_AA230715S301KG05795  ONLINE       0     0     0
errors: No known data errors
Code:
zpool list
NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
local-zfs  3.62T   399G  3.24T        -         -    16%    10%  1.00x  ONLINE  -
All the disks are the same brand and size. The S/N printed on the disk is different from what is shown above. How do I know which one is bad and needs to be replaced?
NVM - Figured it out. I used CrystalDiskInfo to find that out.
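For anyone who wants to do this from the Proxmox shell instead of a Windows tool: smartctl (from the smartmontools package) can print the serial number that is physically printed on the drive label, which you can then match against the ata-* names in zpool status. A quick sketch, assuming smartmontools is installed:
Code:
# show identity info (model + serial) for one disk; repeat for each /dev/sdX
smartctl -i /dev/sda | grep -i serial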
I also figured out how to swap the disk out, thanks to ChatGPT:
Step 1: Identify the Failed Disk
Check the ZFS Pool Status: First, verify the current status of your ZFS pool and identify the failed or degraded disk by running the following command:
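Code:
zpool status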
Look for lines indicating a degraded or faulted status like this:
Code:
pool: mypool
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
action: Replace the faulted device.
NAME          STATE     READ WRITE CKSUM
mypool        DEGRADED     0     0     0
  raidz2-0    DEGRADED     0     0     0
    sda       FAULTED      0     0     0  too many errors
    sdb       ONLINE       0     0     0
This output shows that sda is faulty and needs to be replaced.
Step 2: Find the UUID of the Failed Disk
In Proxmox, ZFS disks are usually referenced by stable IDs (often called UUIDs here) under /dev/disk/by-id, which include the model and serial number, rather than by device names like /dev/sda; this avoids problems when devices are renamed after a reboot.
Find the UUID of the failed disk: Use the following command to check the UUIDs of the disks in your ZFS pool:
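Code:
# list the persistent disk IDs and the device each one points to
ls -l /dev/disk/by-id/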
You'll see a list of symbolic links that map the disk IDs (UUIDs) to device names:
Code:
ata-WDC_WD10EFRX-68PJCN0_WD-WCC4J1X9V7X0 -> ../../sda
ata-WDC_WD10EFRX-68PJCN0_WD-WCC4J1X9V7X0-part1 -> ../../sda1
Identify which UUID corresponds to the failed disk (sda in our example).
Offline the faulty disk: Before physically replacing the disk, take it offline in the ZFS pool:
Code:
zpool offline mypool /dev/disk/by-id/{UUID-of-failed-disk}
Replace {UUID-of-failed-disk} with the actual UUID of the failed disk.
Step 3: Physically Replace the Failed Disk
Shut down the server, or use hot-swap functionality if your hardware supports it, so that the failed disk can be removed safely.
Replace the failed disk with the new disk.
Boot the server back up if it was shut down.
Step 4: Identify the New Disk
Scan for the new disk: After replacing the disk, check the newly attached disk's UUID using the following command:
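Code:
# the new disk shows up as an entry that was not there before
ls -l /dev/disk/by-id/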
Look for the new disk's identifier, which should differ from the old one. These by-id names typically include the manufacturer, model, and serial number of the disk, such as ata-WDC_WD10EFRX.
Ensure the system recognizes the new disk: You can also verify that the system has detected the new disk by running:
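For example:
Code:
# quick overview of all block devices the kernel currently sees
lsblk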
Step 5: Attach the New Disk to the ZFS Pool
Replace the old disk with the new disk in the ZFS pool using the new disk's UUID. Run the following command:
Code:
zpool replace mypool /dev/disk/by-id/{UUID-of-failed-disk} /dev/disk/by-id/{UUID-of-new-disk}
Replace {UUID-of-failed-disk} with the UUID of the failed disk, and {UUID-of-new-disk} with the UUID of the newly installed disk.
Rebuild the ZFS pool: ZFS will now resilver the pool, reconstructing the failed disk's data onto the new disk. You can monitor the progress by running:
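Code:
zpool status mypool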
The output should show the pool in a resilvering state:
Code:
pool: mypool
state: DEGRADED
status: One or more devices is currently being resilvered.
action: Wait for the resilver process to complete.
NAME               STATE     READ WRITE CKSUM
mypool             DEGRADED     0     0     0
  raidz2-0         DEGRADED     0     0     0
    replacing-0    DEGRADED     0     0     0
      sda          FAULTED      0     0     0  too many errors
      sdc          ONLINE       0     0     0  (resilvering)
    sdb            ONLINE       0     0     0
Wait for resilvering to complete: The resilvering process can take some time, depending on the size of your pool and the performance of your system.
Step 6: Verify the Disk Replacement
After the resilvering is complete, verify that the ZFS pool is now in a healthy state by running:
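Code:
zpool status mypool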
The output should show the pool state as ONLINE, with no known data errors and the old disk no longer listed in the configuration.
Step 7: Clean Up
Once resilvering is complete, zpool replace automatically detaches the old device from the pool, so there is normally nothing left to remove (zpool remove does not work on raidz member disks anyway). If old read/write/checksum error counters are still shown for the pool, you can reset them with:
Code:
zpool clear mypool
Now the failed disk has been successfully replaced, and your ZFS pool should be back to normal operation!